LangGraph State Not Persisting Between Nodes: Why It Disappears and How to Fix It

LangGraph state vanishing between nodes is usually a reducer problem, not a wiring problem. Fix it with Annotated reducers, a checkpointer such as MemorySaver or SqliteSaver, and a consistent thread_id in the config.

The workflow ran. The downstream node received an empty state dict where the accumulated messages should have been. No error. No warning. Just missing data flowing into a node that expected it.

The graph was wired correctly — edges connected, nodes returning values, compilation successful. The problem was invisible at the schema level and only surfaced when the second or third node in the chain produced garbage output because it had nothing to work with. By that point, debugging pointed at the wrong place: the node logic, the prompt, the model call. Not the state definition sitting quietly at the top of the file.

What LangGraph Actually Does With State Between Nodes

LangGraph does not pass a mutable object through the graph. At each node transition, it applies a reducer function to merge the node’s return value into the existing state. If no reducer is defined for a field, LangGraph uses a default overwrite strategy — the new value replaces whatever was there before.

For a single-node workflow, this is invisible. For a multi-node chain where node A builds a list that node B and node C need to extend, it is a silent data loss problem. Every node that returns a value for that field wipes the previous one. The state never accumulates. It just resets.

This is not a bug in LangGraph. It is the documented default. The expectation is that developers declare explicit reducers for any field that should accumulate rather than overwrite. Most developers building their first multi-node graph do not know this until something downstream breaks.

The Broken Version: State Without Reducers

Here is the state definition that causes the overwrite problem. It looks completely reasonable:

from typing import TypedDict, List
from langchain_core.messages import BaseMessage

class AgentState(TypedDict):
    messages: List[BaseMessage]
    context: str
    tool_results: List[str]

With this definition, every node that returns {"messages": [new_message]} replaces the entire messages list. Node A adds a human message. Node B adds an AI response. Node C receives only the AI response — the human message is gone. The conversation history that the agent needs to reason correctly has been overwritten at every step.

The graph wiring is fine. The node logic is fine. The problem lives entirely in the state schema, and nothing in the execution logs tells you that.
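
The overwrite can be reproduced without LangGraph at all. The sketch below is a simplified model, not LangGraph's implementation: it assumes a field without a reducer is merged the way a plain dict update would merge it. The node functions and message strings are hypothetical stand-ins, with plain strings in place of BaseMessage objects.

```python
# Simplified model of the default overwrite behavior (not LangGraph internals).
# Messages are plain strings so the example has no dependencies.

def node_a(state: dict) -> dict:
    """Hypothetical node that records the user's question."""
    return {"messages": ["human: What is the refund policy?"]}


def node_b(state: dict) -> dict:
    """Hypothetical node that records the model's answer."""
    return {"messages": ["ai: Refunds are available within 30 days."]}


def run_without_reducers(nodes, state: dict) -> dict:
    """Apply each node's partial update by replacing the fields it returns."""
    for node in nodes:
        state = {**state, **node(state)}
    return state


final_state = run_without_reducers(
    [node_a, node_b],
    {"messages": [], "context": "", "tool_results": []},
)
print(final_state["messages"])
```

Only the value returned by node_b is left in messages. Nothing raised and nothing was logged, which is the same shape as the failure described above.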

The Fixed Version: Annotated Reducers

The correct approach uses Python’s Annotated type to attach a reducer function directly to the field definition. LangGraph reads this annotation at runtime and applies the reducer instead of the default overwrite.

import operator
from typing import Annotated, TypedDict, List
from langchain_core.messages import BaseMessage

class AgentState(TypedDict):
    messages: Annotated[List[BaseMessage], operator.add]
    context: str
    tool_results: Annotated[List[str], operator.add]

Now when node A returns {"messages": [human_message]} and node B returns {"messages": [ai_response]}, LangGraph calls operator.add on the two lists and produces [human_message, ai_response]. Node C receives the full history. Fields without a reducer annotation — like context — still use the default overwrite behavior, which is correct for scalar values that should be replaced rather than accumulated.

The distinction matters at scale. A five-node graph where three fields accumulate and two fields overwrite needs this declared explicitly at the schema level, not managed manually inside each node’s return logic.
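
To make the mechanism concrete, here is a minimal sketch of how a reducer attached through Annotated can be discovered and applied, using only the standard library. It is an illustration under stated assumptions, not LangGraph's actual code: collect_reducers and apply_update are hypothetical helpers, and messages are plain strings.

```python
import operator
from typing import Annotated, List, TypedDict, get_type_hints


class AgentState(TypedDict):
    messages: Annotated[List[str], operator.add]
    context: str
    tool_results: Annotated[List[str], operator.add]


def collect_reducers(schema) -> dict:
    """Map each field name to the first callable in its Annotated metadata."""
    reducers = {}
    for name, hint in get_type_hints(schema, include_extras=True).items():
        for meta in getattr(hint, "__metadata__", ()):
            if callable(meta):
                reducers[name] = meta
                break
    return reducers


def apply_update(state: dict, update: dict, reducers: dict) -> dict:
    """Merge a node's partial update: reduce where declared, else overwrite."""
    new_state = dict(state)
    for key, value in update.items():
        if key in reducers:
            new_state[key] = reducers[key](state[key], value)
        else:
            new_state[key] = value
    return new_state


reducers = collect_reducers(AgentState)
state = {"messages": [], "context": "", "tool_results": []}
state = apply_update(state, {"messages": ["human: hi"], "context": "greeting"}, reducers)
state = apply_update(state, {"messages": ["ai: hello"], "context": "reply"}, reducers)
print(state["messages"])
print(state["context"])
```

The messages field accumulates both entries while context keeps only the latest value, because only the annotated fields carry a reducer.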

Custom Reducers for Non-List Fields

Not every accumulation pattern fits operator.add. For fields where you need deduplication, merging dicts, or priority-based updates, a custom reducer function works the same way:

import operator
from typing import Annotated, Any, Dict, TypedDict

def merge_dicts(left: Dict[str, Any], right: Dict[str, Any]) -> Dict[str, Any]:
    """Merge two dicts, with right taking precedence on key conflicts."""
    return {**left, **right}

def deduplicate_list(left: list, right: list) -> list:
    """Append new items in order, skipping anything already present.

    Items must be hashable, because membership is tracked in a set.
    """
    seen = set(left)
    result = list(left)
    for item in right:
        if item not in seen:
            seen.add(item)  # also skips duplicates inside `right` itself
            result.append(item)
    return result

class AgentState(TypedDict):
    messages: Annotated[list, operator.add]
    metadata: Annotated[Dict[str, Any], merge_dicts]
    visited_nodes: Annotated[list, deduplicate_list]
    current_step: str  # overwrites, so no reducer is needed

The reducer receives the existing field value as left and the incoming node return value as right. Whatever the function returns becomes the new field value. This gives precise control over every field’s accumulation behavior without any logic inside the nodes themselves.

Here is a comparison of the three common patterns:

Replace (no reducer): Use for scalar values like current_step, status, or context where only the latest value matters and history is irrelevant. Every node return overwrites the previous value.

Append (operator.add): Use for message histories, log lists, or tool result arrays where every addition must be preserved. Each node’s return value is concatenated to the existing list. Suitable when duplicates are acceptable or impossible.

Custom reducer: Use for dict merging, deduplication, priority resolution, or any logic that cannot be expressed as simple concatenation. The function runs on every node transition that touches the field.
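
Because a reducer is an ordinary two-argument function, it can be checked in isolation before it is attached to a state schema. The sketch below restates the two custom reducers so it runs on its own; the sample values are made up for illustration.

```python
from typing import Any, Dict


def merge_dicts(left: Dict[str, Any], right: Dict[str, Any]) -> Dict[str, Any]:
    """Merge two dicts, with right taking precedence on key conflicts."""
    return {**left, **right}


def deduplicate_list(left: list, right: list) -> list:
    """Append new hashable items in order, skipping anything already seen."""
    seen = set(left)
    result = list(left)
    for item in right:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


# `left` plays the role of the existing field value, `right` the node's return value.
merged = merge_dicts({"source": "web", "retries": 1}, {"retries": 2})
visited = deduplicate_list(["plan", "search"], ["search", "answer"])

print(merged)
print(visited)
```

Testing reducers this way keeps schema mistakes out of graph runs, where they are much harder to spot.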

Checkpoint Configuration for Cross-Run Persistence

Reducers fix state accumulation within a single graph run. They do not persist state across separate invocations or process restarts. For that, a checkpointer is required.

LangGraph’s MemorySaver keeps checkpoints in memory — useful for development and testing but gone when the process terminates. For production workflows that need state to survive restarts, use a persistent backend like SqliteSaver or a Postgres-backed checkpointer.

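import sqlite3

from langchain_core.messages import HumanMessage
from langgraph.checkpoint.memory import MemorySaver
# SqliteSaver ships in the separate langgraph-checkpoint-sqlite package
from langgraph.checkpoint.sqlite import SqliteSaver

# `workflow` is the StateGraph built earlier, with its nodes and edges already added.

# Development: in-memory checkpointer
memory_checkpointer = MemorySaver()

# Production: SQLite-backed persistence.
# Recent releases return a context manager from SqliteSaver.from_conn_string(),
# so this version passes an explicit connection instead.
conn = sqlite3.connect("./checkpoints.db", check_same_thread=False)
sqlite_checkpointer = SqliteSaver(conn)

# Compile with checkpointer
graph = workflow.compile(checkpointer=sqlite_checkpointer)

# Every invocation needs a thread_id to identify the conversation/run
config = {"configurable": {"thread_id": "user-session-42"}}

result = graph.invoke(
    {"messages": [HumanMessage(content="Start the workflow")]},
    config=config,
)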

The thread_id is not optional when using a checkpointer. Omitting it raises an error at invocation time saying that the checkpointer requires a thread_id under "configurable"; the exact wording varies by version. Each unique thread_id maintains its own independent state history, which is what enables separate user sessions, parallel workflow runs, or resumable long-running agents.

At roughly the point where a workflow spans multiple HTTP requests or survives a server restart, in-memory checkpointing stops being sufficient. Switch to SQLite or Postgres before that happens rather than after.
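
Since an inconsistent thread_id starts a fresh thread instead of resuming the old one, it can help to build the config in one place. The helper below is a hypothetical convenience, not part of LangGraph; the "user:session" ID format is an assumption chosen only for illustration.

```python
def thread_config(user_id: str, session_id: str) -> dict:
    """Build the config dict for a checkpointed run.

    The same (user_id, session_id) pair always yields the same thread_id,
    so repeated calls resume one thread instead of starting new ones.
    """
    if not user_id or not session_id:
        raise ValueError("user_id and session_id must be non-empty")
    return {"configurable": {"thread_id": f"{user_id}:{session_id}"}}


config_a = thread_config("user-42", "support-chat")
config_b = thread_config("user-42", "support-chat")
config_c = thread_config("user-43", "support-chat")

print(config_a)
print(config_a == config_b)
print(config_a == config_c)
```

The resulting dict is what would be passed as config to graph.invoke() or graph.stream().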

Debugging State With graph.get_state()

When state is behaving unexpectedly, the fastest diagnostic is inspecting the checkpoint directly rather than adding print statements inside nodes.

# Inspect current state after an invocation
config = {"configurable": {"thread_id": "user-session-42"}}
current_state = graph.get_state(config)

# Access the state values
print(current_state.values)

# See which node(s) would run next (empty once the run has finished)
print(current_state.next)

# Inspect full checkpoint history
state_history = list(graph.get_state_history(config))
for checkpoint in state_history:
    print(f"Step: {checkpoint.metadata.get('step')}")
    print(f"State: {checkpoint.values}")
    print("---")

graph.get_state() returns a StateSnapshot object. The .values attribute contains the actual field values at the last checkpoint. The .next attribute shows which node would execute next if the graph were resumed — useful for debugging interrupt-based workflows where the graph pauses mid-execution.

graph.get_state_history() returns every checkpoint in reverse chronological order. If a field is missing data that should have accumulated, iterating through the history will show exactly which step dropped the value — which almost always points back to a missing reducer on the field definition.
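
Scanning the history by eye gets tedious on long runs. The sketch below automates the check for list fields that should only ever grow by appending. It assumes each snapshot exposes a .values dict and a .metadata mapping with a "step" key, as in the snippet above; first_non_append_step is a hypothetical helper, and the snapshots here are stand-ins built with SimpleNamespace rather than real StateSnapshot objects.

```python
from types import SimpleNamespace


def first_non_append_step(history, field):
    """Return the step where `field` stopped being an extension of its
    previous value, or None if every checkpoint only appended.

    `history` is expected newest-first, so it is walked in reverse.
    """
    previous = []
    for snapshot in reversed(list(history)):
        current = list(snapshot.values.get(field) or [])
        if current[: len(previous)] != previous:
            return (snapshot.metadata or {}).get("step")
        previous = current
    return None


# Stand-in snapshots, newest first. At step 2 the earlier messages vanish.
fake_history = [
    SimpleNamespace(values={"messages": ["ai"]}, metadata={"step": 2}),
    SimpleNamespace(values={"messages": ["human", "ai-draft"]}, metadata={"step": 1}),
    SimpleNamespace(values={"messages": ["human"]}, metadata={"step": 0}),
]

dropped_at = first_non_append_step(fake_history, "messages")
print(dropped_at)
```

The prefix comparison catches an overwrite even when the list length stays the same. It will also flag fields whose reducer legitimately replaces entries, such as ID-based message updates, so it fits append-only fields best.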

Where This Breaks

Reducer conflicts with LangGraph’s built-in message handling. If you use MessagesState (the pre-built state class from langgraph.graph.message), it already has a reducer on the messages field that handles both appending and replacing by message ID. Redefining messages as Annotated[List[BaseMessage], operator.add] in a class that inherits from MessagesState typically swaps that built-in reducer for plain concatenation, so updates by message ID stop working and re-sent messages show up as duplicates. Check the inheritance chain before adding a custom reducer to messages.

MemorySaver does not survive process restart. Any workflow using MemorySaver in production will lose all state on redeploy. This is a common source of “state persistence” reports that are actually checkpointer configuration issues, not reducer issues. The symptom looks the same, with state missing where it was expected, but the cause is different and requires a different fix.

Custom reducers that raise exceptions are hard to trace. A reducer runs during LangGraph’s state-update step rather than inside your node, so an unhandled exception surfaces away from the node that produced the bad value, and how visible it is can depend on the execution mode and on how the calling code handles errors. Either way, the field does not receive the update you intended. Validate input types at the top of custom reducer functions and fail with a message that names the reducer and the offending value.
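
One way to follow that advice is a list reducer that tolerates None and rejects anything else with a descriptive error. This is a minimal sketch: safe_extend is a hypothetical name, and raising TypeError rather than logging and continuing is a design choice made here, not something LangGraph requires.

```python
from typing import List, Optional


def safe_extend(left: Optional[List[str]], right: Optional[List[str]]) -> List[str]:
    """List-append reducer that treats None as empty and rejects non-lists."""
    if left is None:
        left = []
    if right is None:
        right = []
    for name, value in (("left", left), ("right", right)):
        if not isinstance(value, list):
            raise TypeError(
                f"safe_extend expected a list for {name}, got {type(value).__name__}"
            )
    return left + right


print(safe_extend(None, ["first"]))
print(safe_extend(["first"], None))
print(safe_extend(["first"], ["second"]))
```

A node that returns a bare string instead of a one-item list then fails with an error that names the reducer, which is much easier to trace than a malformed field several steps later.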

Parallel node execution with shared state fields. When two nodes run in the same step and both return values for the same field, the reducer is called once per update, and your code should not rely on which update it sees first. For most accumulation use cases this is fine. For reducers with order-dependent logic, parallel branches can produce state that differs from what a sequential run would give. Use sequential edges for those cases, or write the reducer so that the result does not depend on order.
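
The difference between order-dependent and order-independent reducers is easy to see in plain Python. In the sketch below, apply_in_order is a hypothetical stand-in for the reducer being called once per parallel update, and the update values are invented for the example.

```python
def last_writer_wins(left: dict, right: dict) -> dict:
    """Order-dependent: on a key conflict, whichever update arrives last wins."""
    return {**left, **right}


def union(left: frozenset, right: frozenset) -> frozenset:
    """Order-independent: set union gives the same result in any order."""
    return left | right


def apply_in_order(reducer, initial, updates):
    """Fold updates into the state one at a time, as a reducer would be."""
    state = initial
    for update in updates:
        state = reducer(state, update)
    return state


from_search = {"source": "search"}
from_cache = {"source": "cache"}

order_1 = apply_in_order(last_writer_wins, {}, [from_search, from_cache])
order_2 = apply_in_order(last_writer_wins, {}, [from_cache, from_search])

union_1 = apply_in_order(union, frozenset(), [frozenset({"search"}), frozenset({"cache"})])
union_2 = apply_in_order(union, frozenset(), [frozenset({"cache"}), frozenset({"search"})])

print(order_1, order_2)
print(union_1 == union_2)
```

The dict merge ends up with a different source depending on arrival order, while the set union does not. Fields written by parallel branches are safer with the second kind of reducer.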

State Persistence Troubleshooting Checklist

  1. Open your state class definition and confirm that every list or dict field that should accumulate has an Annotated reducer. A bare List[str] without a reducer will overwrite on every node return — add Annotated[List[str], operator.add] to fix it.
  2. Check whether your state class inherits from MessagesState or defines its own messages field. If both are present, remove the duplicate definition and rely on the built-in reducer from MessagesState.
  3. Confirm that every graph.invoke() or graph.stream() call passes a config dict with {"configurable": {"thread_id": "some-unique-id"}} when a checkpointer is attached. A missing thread_id raises a runtime error; an inconsistent thread_id starts a new state thread rather than resuming the existing one.
  4. Run graph.get_state(config) immediately after an invocation and print current_state.values to verify the field values match expectations before blaming downstream node logic.
  5. If state disappears across process restarts, confirm the checkpointer is SqliteSaver or equivalent, not MemorySaver. Verify the database file path is writable and persists across deployments.
  6. For custom reducers, add a type check at the top of the function to confirm both left and right are the expected types before operating. If either is None, return the non-None value rather than raising an exception in the middle of a state update.
  7. Use graph.get_state_history(config) to iterate through all checkpoints and identify the exact step where a field stopped accumulating. The step number in checkpoint.metadata identifies the step in which the change happened; the return value of the node (or nodes) that ran in that step is where the reducer contract broke.

The real ROI here is not faster graph execution — it is that a correctly defined state schema eliminates an entire class of debugging sessions where the symptom points at node logic that was never broken in the first place.

If graph.get_state_history() shows a field’s earlier entries disappearing at step two, suspect a missing reducer first. Run that check before anything else.