They are genuinely different, and your before_model placement is correct for what you’re doing. The distinction is semantic, not arbitrary.
HumanInTheLoopMiddleware interrupts in after_model because it needs to see the AIMessage.tool_calls the model just emitted. The interrupt payload contains those concrete tool call args, the human reviews them, and the resume value approves, edits, or rejects those specific calls before they reach the tool node. It is tool-call-level approval: review what the model decided to do.
Your before_model interrupt is session-level approval: enforce that the agent cannot begin its next reasoning turn until the protocol says it may. By the time before_model fires, the previous tool results are already in state. The FSM knows the agent just received search results and is in the “presented” state. Suspending at before_model means the agent cannot form the next action at all, and therefore cannot autonomously book, until the human resumes. That is stricter and more correct for your use case.
If you had put the interrupt in after_model instead, you’d fire after the model’s text reply (after it says “here are your results”), which loses one full reasoning turn. The model could have emitted a booking tool call in that turn before you intercepted it. before_model closes that window entirely.
One concrete consequence to document: because before_model fires at the start of the next model turn, there is a one-turn lag between the triggering event (FSM entering “presented”) and the interrupt. Any tool calls emitted in the turn that drove the FSM into “presented” will already have executed before the human is asked. This is by design; the search happens, results are delivered, then the gate fires. Worth making explicit in your docs so users don’t expect the gate to intercept the search itself.
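To make that ordering concrete, here is a toy simulation in plain Python. It is not LangGraph or LangChain code: the `simulate` function, the tool names, and the rule that the FSM enters “presented” as soon as the search tool has run are all assumptions made up for illustration.

```python
def simulate(turns: list[list[str]]) -> list[str]:
    """Replay scripted model turns and log what happens, in order.

    Each entry in `turns` is the list of tool names the (scripted) model
    would emit in that turn. Hypothetical names throughout.
    """
    log: list[str] = []
    fsm_state = "idle"
    for tool_calls in turns:
        # Stand-in for the before_model hook: gate the next reasoning turn.
        if fsm_state == "presented":
            log.append("before_model: interrupt, waiting for human")
            break
        log.append(f"model: emits {tool_calls}")
        for name in tool_calls:
            # Tools from this turn run before the next before_model check.
            log.append(f"tool: {name} executed")
            if name == "search_flights":
                # Assumption: search results move the FSM to "presented".
                fsm_state = "presented"
    return log


if __name__ == "__main__":
    for line in simulate([["search_flights"], ["book_flight"]]):
        print(line)
```

In this sketch the search executes in the first turn, the gate fires at the start of the second, and the scripted `book_flight` call is never emitted.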
It works, but it has one practical weakness: event_label is a string, so a typo or an integration that returns an unrecognized label silently fails to match any transition. You end up in an FSM dead-end with no diagnostic. HumanInTheLoopMiddleware avoids this with a typed discriminated union:
Decision = ApproveDecision | EditDecision | RejectDecision | RespondDecision
# each has a Literal `type` field
The more idiomatic approach would mirror that pattern. Instead of a raw event_label string in the resume payload, define a typed choice type per branching state:
from typing import Any, Literal, NotRequired, TypedDict  # NotRequired: Python 3.11+

class ProtocolResume(TypedDict):
    event_label: Literal["approve", "modify", "deny"]
    # any branch-specific payload, e.g. modified args for "modify"
    payload: NotRequired[dict[str, Any]]
Then in your FSM’s interrupt handler, match on resume["event_label"] and raise ProtocolViolationError immediately if the value is not in the set of valid labels for that source state. This gives you the same string-based dispatch you already have, but with early failure and a clear error rather than a silent dead-end.
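A minimal sketch of that guard, assuming a per-state transition table. The `TRANSITIONS` mapping, the target state names, and the `next_state` helper are hypothetical; only ProtocolViolationError and the three labels come from the discussion above.

```python
from collections.abc import Mapping
from typing import Any


class ProtocolViolationError(Exception):
    """Raised when a resume value is not valid for the current FSM state."""


# Hypothetical table: source state -> {event label -> target state}.
TRANSITIONS: dict[str, dict[str, str]] = {
    "presented": {
        "approve": "booking",
        "modify": "searching",
        "deny": "cancelled",
    },
}


def next_state(source_state: str, resume: Any) -> str:
    """Return the target state, or fail loudly on an unknown label."""
    allowed = TRANSITIONS.get(source_state, {})
    label = resume.get("event_label") if isinstance(resume, Mapping) else None
    # Reject non-string labels, typos, and labels not valid for this state.
    if not isinstance(label, str) or label not in allowed:
        raise ProtocolViolationError(
            f"invalid event_label {label!r} in state {source_state!r}; "
            f"expected one of {sorted(allowed)}"
        )
    return allowed[label]
```

With this in place, a typo such as "aprove" raises at the moment of resume and names the valid labels, rather than leaving the FSM stuck with no diagnostic.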
The deeper LangGraph-idiomatic answer is to look at how Command(goto=...) interacts with graph routing. If your branching human choices lead to meaningfully different graph paths (e.g., “modify” loops back to the search node, while “deny” goes to a cancellation node), you could model each branch as a separate Command(goto="<node>") returned from before_model, skipping FSM states entirely for those branches. But this only makes sense if the branches are graph-level divergences. If all three outcomes (“approve”, “modify”, “deny”) stay within the same agent loop and only differ in FSM state, your event_label dispatch is the right level. Just add the typed guard.
One other thing worth noting: if ProtocolState.fsm_trace already records all transitions, you get an audit log of every interrupt and its resolution for free, which is exactly the evidence you’d want in a payment/booking incident review. That’s a good default to have built in.
Again, this is my subjective analysis and the community can disagree! But if I was able to provide you with some useful feedback, I would appreciate a heart.