The agent ran. The logs kept scrolling. After roughly 40 seconds of watching the same node names cycle through the output, it became clear the graph was not progressing — it was orbiting. Same state, same conditional branch, same node, repeating until the process either crashed or hit a recursion wall.
The first assumption was that the conditional logic was evaluating incorrectly — some comparison returning a wrong value, maybe a type mismatch on the state key. That assumption cost a significant amount of debugging time, because it was wrong. The conditional logic was evaluating exactly as written. The routing function was returning a node name on every path, including the path that was supposed to stop the graph. There was no END return. There was no iteration ceiling. The graph had no exit — only routes.

What the Broken Version Actually Looked Like
Here is the pattern that causes the loop. The conditional function receives state, checks a condition, and returns a node name in both branches:
from langgraph.graph import StateGraph, END
from typing import TypedDict


class AgentState(TypedDict):
    messages: list
    iteration_count: int
    should_stop: bool


def route_after_generate(state: AgentState) -> str:
    # BUG: returns node name even when exit criteria is met
    if state["should_stop"]:
        return "generate"  # <-- wrong: this keeps looping
    return "generate"


workflow = StateGraph(AgentState)
workflow.add_node("generate", generate_node)
workflow.add_conditional_edges(
    "generate",
    route_after_generate,
    {
        "generate": "generate",
        # END is never in this map
    }
)
workflow.set_entry_point("generate")
graph = workflow.compile()
The route_after_generate function returns "generate" regardless of should_stop. Both branches of the if-statement point to the same node. The edge map has no END key. LangGraph keeps executing because it has been given no instruction to stop — only an instruction to continue.
When this graph runs, the recursion limit eventually fires: GraphRecursionError: Recursion limit of 25 reached without hitting a stop condition. That error is actually useful — it confirms the exit path is missing, not that the logic is wrong.
The Cascade That Makes It Hard to Catch
The subtle part is that the conditional function looks correct on inspection. It checks the state. It has an if-statement. The routing map exists. Nothing visually signals that END is absent — especially when you're reading the function in isolation rather than checking what keys are registered in the edge map.
The chain failure follows this path: the exit condition evaluates correctly inside the function → the function returns a node name anyway → the edge map has no END key to route to → the graph treats the result as a valid continuation → the same node fires again → state increments → the condition evaluates correctly again → same result → loop continues until the recursion limit terminates the run with an error that mentions recursion, not a missing exit path.
Most debugging effort goes toward the wrong layer. The recursion error points at execution depth, so the instinct is to raise recursion_limit in the config or add a counter to the state. Neither fixes it. A higher limit just means the loop runs longer before the same crash, and a counter changes nothing unless the routing function reads it and returns END.
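Why a higher limit does not help can be seen in a toy model of the control flow. The sketch below is not LangGraph: run_toy_graph, StepLimitReached, steps_before_crash, and the END stand-in are hypothetical names that only mimic a node, a routing function, an edge map, and a step ceiling.

```python
END = "__end__"  # stand-in for langgraph.graph.END in this toy model


class StepLimitReached(RuntimeError):
    """Toy counterpart of a recursion-limit error."""


def generate_node(state: dict) -> dict:
    count = state["iteration_count"] + 1
    return {"iteration_count": count, "should_stop": count >= 3}


def broken_router(state: dict) -> str:
    # Mirrors the bug: both branches return the same node name
    if state["should_stop"]:
        return "generate"
    return "generate"


def run_toy_graph(router, edge_map: dict, limit: int) -> dict:
    """Run the node, route, and repeat until END or the step limit."""
    state = {"iteration_count": 0, "should_stop": False}
    for _ in range(limit):
        state = generate_node(state)
        if edge_map[router(state)] == END:
            return state
    raise StepLimitReached(f"limit of {limit} reached", state)


def steps_before_crash(limit: int) -> int:
    """Return how many times the node ran before the limit fired."""
    try:
        run_toy_graph(broken_router, {"generate": "generate"}, limit)
    except StepLimitReached as exc:
        return exc.args[1]["iteration_count"]
    return -1  # only reached if the router found an exit


print(steps_before_crash(25))   # the node ran 25 times, then crashed
print(steps_before_crash(100))  # a higher limit only delays the same crash
```

In this model the stop flag is already true after the third call, yet the run still ends in the limit error, because nothing the router returns maps to an exit.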
Fixed Version: Explicit END Return and a Configured Iteration Ceiling
Two changes are required. The conditional function must return END when the exit condition is met, and that value must be registered in the edge map. Additionally, a recursion_limit in the run config acts as a hard ceiling for any case where state logic fails to set the stop flag:
from langgraph.graph import StateGraph, END
from typing import TypedDict


class AgentState(TypedDict):
    messages: list
    iteration_count: int
    should_stop: bool


def generate_node(state: AgentState) -> AgentState:
    # Simulate work and increment counter
    updated_count = state["iteration_count"] + 1
    stop = updated_count >= 3  # exit after 3 iterations
    return {
        **state,
        "iteration_count": updated_count,
        "should_stop": stop,
        "messages": state["messages"] + [f"Step {updated_count}"]
    }


def route_after_generate(state: AgentState) -> str:
    if state["should_stop"]:
        return END  # explicit termination path
    return "generate"


workflow = StateGraph(AgentState)
workflow.add_node("generate", generate_node)
workflow.add_conditional_edges(
    "generate",
    route_after_generate,
    {
        "generate": "generate",
        END: END  # END must be registered in the map
    }
)
workflow.set_entry_point("generate")
graph = workflow.compile()

# Hard ceiling via config: stops a runaway loop after 10 steps
config = {"recursion_limit": 10}
initial_state = {
    "messages": [],
    "iteration_count": 0,
    "should_stop": False
}
result = graph.invoke(initial_state, config=config)
The routing function now returns END — not the string "END", but the imported constant from langgraph.graph. The edge map registers END: END so LangGraph knows that return value is a valid terminal path. The recursion_limit: 10 in config is a backup, not the primary exit — the state-based condition does the real work.
Per the LangGraph documentation for StateGraph.add_conditional_edges, the edge map translates the routing function's return values into node names, so every value the function can return must appear as a key. If END is returned but not registered, the graph has no destination for that value, and the run fails or keeps looping instead of terminating cleanly.
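One way to catch an unregistered return value before running the graph is to call the routing function directly on one sample state per branch and compare what it returns against the edge map's keys. This is a minimal sketch, not a LangGraph feature: unregistered_routes is a hypothetical helper, and END is a local stand-in so the example runs without the library installed.

```python
END = "__end__"  # stand-in for langgraph.graph.END


def route_after_generate(state: dict) -> str:
    if state["should_stop"]:
        return END
    return "generate"


def unregistered_routes(router, edge_map: dict, sample_states: list) -> set:
    """Return every value the router produced that is not a key in edge_map."""
    returned = {router(state) for state in sample_states}
    return returned - set(edge_map)


# One sample state per branch of the routing function
samples = [{"should_stop": False}, {"should_stop": True}]

broken_map = {"generate": "generate"}
fixed_map = {"generate": "generate", END: END}

print(unregistered_routes(route_after_generate, broken_map, samples))  # {'__end__'}
print(unregistered_routes(route_after_generate, fixed_map, samples))   # set()
```

The check only covers the branches the sample states exercise, so it is a quick guard during development rather than proof that every path is registered.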
Loop Flow vs. Proper Exit Flow
The structural difference between the broken and working versions comes down to which paths are registered. Here is how the two graphs behave:
Broken: generate → conditional function evaluates → returns "generate" on all paths → edge map routes back to generate → loop continues indefinitely → recursion limit fires → crash.
Fixed: generate → conditional function evaluates → returns "generate" if continuing, returns END if done → edge map routes to generate or terminates → graph exits cleanly when stop condition is reached.
The exit path is not optional. A conditional edge without a terminal return is not a conditional edge — it is a one-way loop with a routing label attached to it.
Debugging a Suspected Loop with interrupt_before and interrupt_after
When the graph is already built and the loop is not immediately obvious, LangGraph's checkpoint interrupt system lets you pause execution at specific nodes and inspect state before and after each transition. This is the practical debugging pattern:
from langgraph.checkpoint.memory import MemorySaver

checkpointer = MemorySaver()
graph = workflow.compile(
    checkpointer=checkpointer,
    interrupt_before=["generate"],  # pause before each generate call
)
thread_config = {
    "configurable": {"thread_id": "debug-run-01"},
    "recursion_limit": 10,
}

# Stream events; execution stops at the first interrupt, before generate runs
for event in graph.stream(initial_state, config=thread_config):
    print(event)

# Inspect the current state to see if should_stop is updating
current_state = graph.get_state(thread_config)
print("iteration_count:", current_state.values.get("iteration_count"))
print("should_stop:", current_state.values.get("should_stop"))

# Resume execution: passing None continues the paused run from its checkpoint
# until the next interrupt. Repeat the inspect and resume steps as needed.
graph.invoke(None, config=thread_config)
Using interrupt_before=["generate"] pauses the graph before each execution of the generate node. At each pause, graph.get_state() returns the current state snapshot, which lets you verify whether should_stop is being set correctly and whether iteration_count is incrementing. If should_stop never flips to True, the node function is not updating state correctly. If it does flip but the loop continues, the routing function is not reading it — or the edge map is missing the END key.
The real ROI of this debugging pattern is that it surfaces the exact step where state diverges from expectation, rather than forcing you to infer the failure from a recursion error message that only tells you the graph ran too long.
Note: interrupt_before and interrupt_after require a checkpointer, because a paused run has to be saved somewhere before it can be resumed. Without one, expect an error; some versions raise it at the compile step. MemorySaver is sufficient for local debugging; production workflows typically use a persistent checkpointer with a real storage backend.
Where This Fix Breaks Down
The pattern above handles the most common loop cause — a missing END return in the conditional function. It does not cover every failure mode.
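If the node function itself has a bug that prevents the stop flag from updating, then should_stop will never be set even though the routing function is correct. Typical examples are computing the flag but leaving it out of the returned dict, or comparing against a counter that is never incremented. Note that LangGraph merges the keys a node returns into the existing state, so omitted keys keep their previous values rather than being dropped; the failure is a flag that never changes, not a key that disappears. The graph will run until recursion_limit fires. In that case, the fix is in the node function, not the edge map.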
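Because a node is a plain function, its state update can be checked in isolation before the graph is involved at all. The sketch below reuses the logic of the fixed generate_node on a plain dict; trace_stop_flag is a hypothetical helper, and merging each returned update over the previous state is an assumption of this sketch that stands in for the framework applying the update.

```python
def generate_node(state: dict) -> dict:
    """Same logic as the fixed node above, on a plain dict."""
    updated_count = state["iteration_count"] + 1
    return {
        **state,
        "iteration_count": updated_count,
        "should_stop": updated_count >= 3,
        "messages": state["messages"] + [f"Step {updated_count}"],
    }


def trace_stop_flag(node, initial_state: dict, max_calls: int) -> list:
    """Call the node repeatedly and record should_stop after each call."""
    state, flags = dict(initial_state), []
    for _ in range(max_calls):
        # Merge the update over the previous state so partial updates also work
        state = {**state, **node(state)}
        flags.append(state.get("should_stop"))
    return flags


flags = trace_stop_flag(
    generate_node,
    {"messages": [], "iteration_count": 0, "should_stop": False},
    max_calls=4,
)
print(flags)  # [False, False, True, True]
```

A trace that stays False for every call points at the node function; a trace that flips to True while the graph still loops points at the routing function or the edge map.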
Multi-agent graphs with subgraphs have a separate problem: each subgraph has its own END, and the parent graph requires its own termination path. Fixing the conditional edge in a subgraph does not propagate a stop to the parent. Both layers need explicit exit conditions.
The recursion_limit in config also does not produce a clean failure — it raises GraphRecursionError, which may need to be caught and handled upstream if this graph is embedded in a larger system. A caught recursion error still leaves the workflow in an incomplete state unless there is explicit error handling for that case.
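A caller can make the incomplete run explicit by converting the error into a structured outcome. This is a minimal sketch under stated assumptions: run_guarded and RunOutcome are hypothetical names, RecursionCeilingHit is a stand-in exception so the example runs without the library, and with a real graph one would presumably pass graph.invoke (with its config bound) and the library's GraphRecursionError instead.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


class RecursionCeilingHit(RuntimeError):
    """Stand-in for the library's recursion error in this sketch."""


@dataclass
class RunOutcome:
    completed: bool
    state: Optional[Any] = None
    error: Optional[str] = None


def run_guarded(
    invoke: Callable[[dict], Any],
    initial_state: dict,
    ceiling_error: type = RecursionCeilingHit,
) -> RunOutcome:
    """Call invoke and convert a ceiling error into an explicit outcome."""
    try:
        return RunOutcome(completed=True, state=invoke(initial_state))
    except ceiling_error as exc:
        # The run did not finish; report that instead of a partial result
        return RunOutcome(completed=False, error=str(exc))


def looping_invoke(state: dict) -> dict:
    # Simulates a graph that never reaches its exit path
    raise RecursionCeilingHit("Recursion limit of 10 reached")


def finishing_invoke(state: dict) -> dict:
    # Simulates a graph that terminates normally
    return {**state, "should_stop": True}


print(run_guarded(looping_invoke, {"should_stop": False}).completed)    # False
print(run_guarded(finishing_invoke, {"should_stop": False}).completed)  # True
```

Returning an outcome object rather than swallowing the exception keeps the upstream system from treating a half-finished run as a result.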

Execution Checklist Before Running a Looping Graph
- Open the conditional routing function and verify that at least one return path returns END (the imported constant, not the string "END") when the exit condition evaluates to true.
- Open the add_conditional_edges call and confirm END: END appears as a key-value pair in the edge map. If it is missing, the return value from the routing function has no registered destination.
- Check the node function that updates state and confirm that the dict it returns actually includes the exit flag and the counter the flag depends on. LangGraph merges returned keys into the existing state, so a node that returns only {"should_stop": True} leaves the other keys as they were, but a node that never returns should_stop can never trigger the exit.
- Add "recursion_limit": 10 to the run config before the first test run. This limits damage from any loop that survives the routing fix and gives a faster failure signal during development.
- Compile the graph with interrupt_before=["your_looping_node"] and a MemorySaver checkpointer, then call graph.get_state() after each step to confirm the exit flag is updating as expected before removing the interrupt.
If should_stop is True in the state snapshot but the graph still routes back to the same node, the edge map is missing the terminal key. That is the only remaining cause once the node function and routing function are both correct.
Every conditional edge in a LangGraph workflow needs both a continuation path and a termination path — the graph will not infer one from the absence of the other. If the routing function can return a value that is not in the edge map, the behavior is undefined and the loop is a predictable consequence.