Build Your First LangGraph Agent: State, Tools, and HITL in 14 Steps

Q: What is the difference between MemorySaver and PostgresSaver checkpointers?

MemorySaver stores the entire graph state in your process's memory. It is fast, requires no external dependencies, and is ideal for development, automated tests, and demos. The downside is that state is lost when the process restarts. PostgresSaver writes each checkpoint to a Postgres database, giving you durable persistence across restarts and the ability to resume interrupted threads hours or days later. The API is identical — you simply swap out the checkpointer when compiling the graph — which makes the migration from dev to production straightforward.

Why LangGraph? A Framework Comparison

Most agent tutorials show you a simple loop: call the LLM, check if it wants to use a tool, call the tool, call the LLM again, repeat until done. This pattern works perfectly well for single-session, single-task agents. It breaks the moment you need any of the following: conversation state that persists across HTTP requests, the ability to pause mid-execution and ask a human for approval before proceeding, or complex conditional routing where the next step depends on the output of the last.

A state-machine agent models execution as a directed graph rather than a flat loop. Nodes perform work; edges define transitions; a shared state object carries data between nodes. This architecture makes the agent's behaviour explicit, testable, and inspectable. You can visualise the graph, set breakpoints at specific nodes, replay historical runs from a saved checkpoint, and swap out individual nodes without touching the rest of the graph.

As of June 2026, LangGraph v1.2.0+ is the most widely adopted framework for this pattern in Python. Klarna uses it to power customer-service agents at scale. LinkedIn runs LangGraph agents for content moderation workflows. Uber and Replit have both published case studies describing LangGraph-based systems in production. The framework has the battle-tested checkpointing, the HITL primitives, and the community support that justify the learning investment.

That said, LangGraph is not the right tool for every job. The table below summarises where each framework excels as of June 2026:

Framework	Best for	Learning curve	HITL support	Model-agnostic	Production maturity
LangGraph	Stateful, HITL, complex routing	2–4 weeks	Native	Yes	High (Klarna, LinkedIn, Uber)
CrewAI	Role-based multi-agent workflows	1–2 weeks	Via hooks	Yes	Medium
PydanticAI	Type-safe, structured-output agents	1 week	Limited	Yes	Growing
Raw tool-calling	Simple single-step agents	Days	Manual	Yes	High (simple cases)

Choose LangGraph when you need stateful multi-turn conversations, native human-in-the-loop pausing, or production-grade persistence. Choose CrewAI when you are orchestrating multiple role-based agents (researcher, writer, critic) without needing fine-grained state control. Choose PydanticAI when your primary requirement is structured, type-validated output and your agent does not need complex routing. Use raw tool-calling when the task is a single-shot extraction or transformation.

This guide builds a research assistant agent in LangGraph. By the end, you will have a fully working graph with tool-calling, conversation memory, and a human approval step before any tool executes. You can list it as a project on your AITC Builder profile immediately after.

Prerequisites and Project Setup

You need Python 3.11 or later. Earlier versions will work for most of the code, but LangGraph's type annotations are tested against 3.11+ and some async patterns behave differently in 3.10.

Install the dependencies:

pip install langgraph langchain-anthropic langchain-core

Set your API key. The examples in this guide use Anthropic Claude, but you can substitute any LangChain-compatible chat model (see the FAQ at the bottom for details):

export ANTHROPIC_API_KEY="your-key-here"

What you will build across the 14 steps:

Steps 1–4: Define the agent state schema and the graph skeleton.
Steps 5–8: Add an LLM node, define a tool, wire up conditional routing.
Steps 9–11: Add MemorySaver checkpointing for persistent conversation memory.
Steps 12–14: Add a human-in-the-loop interrupt so the agent pauses for approval before calling any tool.

The finished agent is a simple research assistant that searches for information on demand, maintains conversation history across turns, and requires explicit human approval before executing any search. It is a minimal but complete production-ready pattern — not a toy.

Note on model names Code examples in this guide use claude-sonnet-4-6, the current Claude Sonnet model. Substitute whichever model you have access to. The LangGraph graph structure is entirely independent of model choice.

Steps 1–4: Defining State and the Graph Skeleton

Step 1: Define the AgentState TypedDict

Every LangGraph graph has a state object: a Python TypedDict that defines exactly what data flows between nodes. Choosing the right state schema upfront is the single most important architectural decision you will make. Underspecifying it — using a plain dict, or leaving out fields you will need later — is the root cause of most debugging headaches in LangGraph projects.

from typing import TypedDict, Annotated
from langchain_core.messages import BaseMessage
import operator

class AgentState(TypedDict):
    messages: Annotated[list[BaseMessage], operator.add]
    next: str

The messages field deserves explanation. The Annotated[list[BaseMessage], operator.add] syntax tells LangGraph how to merge updates from different nodes. When a node returns {"messages": [new_message]}, LangGraph appends that new message to the existing list rather than replacing it. Without the operator.add reducer, every node update would overwrite the entire message history, and you would lose the conversation context at every step.

The next field is a simple string that will carry routing decisions. Not all agents need this field — conditional edges can inspect the state directly — but it makes routing logic easier to read and test.

Step 2: Initialise the StateGraph

from langgraph.graph import StateGraph, END

graph = StateGraph(AgentState)

StateGraph is the core class. You pass it your state schema, and it validates that all node inputs and outputs conform to that schema at runtime. The END sentinel is a special constant representing the terminal state — returning it from a conditional edge means "stop processing".

Step 3: Add a placeholder node to verify the skeleton compiles

def placeholder_node(state: AgentState) -> dict:
    return {}

graph.add_node("agent", placeholder_node)
graph.set_entry_point("agent")

# Verify the skeleton compiles without errors
app = graph.compile()
print(app.get_graph().draw_ascii())

Always compile and test the skeleton before adding complexity. A graph that fails to compile at this stage has a structural problem — incorrect edge definitions, a missing entry point, or a node that is referenced but never added — and catching it here is far easier than catching it after you have added five more nodes.

Step 4: Understand the node contract

Every node in LangGraph follows the same contract: it receives the current AgentState as its single argument and returns a dictionary containing the fields to update. You do not return a complete new state — only the delta. LangGraph merges your returned dictionary into the existing state using the reducer functions you defined (like operator.add for messages).

This means a node that does not touch a particular field simply omits it from the return value. A node that processes messages but does not change the next field can return {"messages": [response]} without mentioning next at all. The existing value of next remains unchanged.

Steps 5–8: LLM Node, Tool Definitions, and Conditional Routing

Step 5: Define a tool

Tools are plain Python functions decorated with @tool. The docstring becomes the tool's description, which the LLM uses to decide when to call it. The type annotations become the tool's input schema. Write clear, specific docstrings — the LLM's ability to use your tools correctly depends almost entirely on the quality of these descriptions.

from langchain_core.tools import tool

@tool
def search(query: str) -> str:
    """Search the web for information about a topic.

    Use this tool when the user asks for current information,
    research on a topic, or factual data that may require lookup.
    Returns a summary of search results.
    """
    # Replace this stub with a real search implementation:
    # from langchain_community.tools import DuckDuckGoSearchRun
    # return DuckDuckGoSearchRun().run(query)
    return f"Search results for '{query}': [stub — replace with real search]"

tools = [search]

Step 6: Build the LLM node

from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-sonnet-4-6").bind_tools(tools)

def agent_node(state: AgentState) -> dict:
    response = llm.invoke(state["messages"])
    return {"messages": [response]}

bind_tools(tools) attaches the tool schemas to every LLM call. The model will then include tool call requests in its responses when it decides to use a tool. The agent_node function simply invokes the LLM with the current message history and appends the response to the messages list.

Step 7: Add the ToolNode

from langgraph.prebuilt import ToolNode

tool_node = ToolNode(tools)
graph.add_node("agent", agent_node)
graph.add_node("tools", tool_node)

ToolNode is a pre-built node that reads tool call requests from the last message, executes the corresponding Python functions, and returns the results as ToolMessage objects. You do not need to write the tool execution logic yourself — ToolNode handles error catching, result serialisation, and message formatting.

Step 8: Wire up conditional routing

from langchain_core.messages import AIMessage

def should_continue(state: AgentState) -> str:
    last_message = state["messages"][-1]
    # If the last message has tool calls, route to the tools node
    if isinstance(last_message, AIMessage) and last_message.tool_calls:
        return "tools"
    # Otherwise, the LLM has finished — end the graph
    return END

graph.add_conditional_edges("agent", should_continue)
graph.add_edge("tools", "agent")
graph.set_entry_point("agent")

The conditional edge is the routing mechanism. After every LLM response, should_continue inspects the last message. If it contains tool call requests, execution routes to the tools node. If not, the graph ends. The tools node always routes back to agent unconditionally, creating the tool-use loop.

This produces the classic ReAct (Reason + Act) pattern: the LLM reasons, decides to act (call a tool), the tool executes, the result returns to the LLM, which reasons again — cycling until it decides to stop.

Pro tip Add a step counter to your state and increment it in agent_node. Your should_continue function can return END if the counter exceeds a maximum (e.g., 15). This is your backstop against infinite loops during development before you have robust termination logic.

At this point you have a working agent. Compile it and run a quick test:

from langchain_core.messages import HumanMessage

app = graph.compile()

result = app.invoke({
    "messages": [HumanMessage(content="What is LangGraph?")],
    "next": ""
})

print(result["messages"][-1].content)

Steps 9–11: Memory and Checkpointing

Step 9: Add MemorySaver

Without a checkpointer, the agent has no memory. Every invocation starts fresh — the graph has no knowledge of previous turns. For a research assistant, that is a serious limitation: the user cannot refer back to earlier results, and the agent cannot build on prior context.

LangGraph solves this with checkpointers. A checkpointer serialises the entire graph state after every node execution and stores it under a thread ID. The next invocation with the same thread ID loads the saved state and continues from where it left off.

from langgraph.checkpoint.memory import MemorySaver

checkpointer = MemorySaver()
app = graph.compile(checkpointer=checkpointer)

Pro tip Start with MemorySaver in development — it is an in-memory store that works exactly like Postgres checkpointing. Swap in PostgresSaver before deploying to production. The API is identical; only the constructor changes.

Step 10: Use thread config for multi-turn conversation

from langchain_core.messages import HumanMessage

# Thread config ties this invocation to a specific conversation session
config = {"configurable": {"thread_id": "user-123-session-1"}}

# First turn
result = app.invoke(
    {"messages": [HumanMessage(content="Search for LangGraph tutorials")]},
    config
)
print(result["messages"][-1].content)

# Second turn — the agent remembers the first turn
result = app.invoke(
    {"messages": [HumanMessage(content="Summarise what you found")]},
    config
)
print(result["messages"][-1].content)

The thread_id is the key that namespaces conversation history. Use a stable, unique identifier per user session — a UUID generated at the start of the session, or a combination of user ID and session timestamp. Never use a user-provided string directly as the thread ID without sanitising it, as thread IDs with identical values will share state (see the pitfalls section).

Step 11: Inspect the saved state

# View the current state for a thread
state = app.get_state(config)
print(state.values["messages"])   # Full message history
print(state.next)                  # Next node(s) to execute (empty if graph finished)

The ability to inspect state at any point is one of LangGraph's most useful debugging features. You can load any historical thread, inspect what the agent saw and did at each step, and replay from any checkpoint. This is the foundation of LangGraph Studio's visual debugger, which displays the graph topology alongside the live state for any thread.

Steps 12–14: Human-in-the-Loop (Interrupt and Resume)

Human-in-the-loop (HITL) is the feature that separates production agents from demos. In most agentic workflows with real-world consequences — sending an email, executing a database write, making an API call that charges money — you want a human to approve the action before it happens. LangGraph supports this natively via the interrupt_before parameter.

Step 12: Compile with interrupt_before

from langgraph.checkpoint.memory import MemorySaver

checkpointer = MemorySaver()

# Recompile with interrupt_before — the graph will pause before executing "tools"
app = graph.compile(
    checkpointer=checkpointer,
    interrupt_before=["tools"]
)

interrupt_before=["tools"] tells LangGraph to pause execution immediately before the tools node runs — every time, for every thread. The graph saves its state at the interruption point and returns control to your application. The tool calls the LLM requested are visible in the saved state, ready for a human to review before approving or rejecting.

Step 13: First invocation — stream until the interrupt

from langchain_core.messages import HumanMessage

thread = {"configurable": {"thread_id": "review-session-1"}}

# Stream events until the interrupt fires
print("=== Agent responding... ===")
for event in app.stream(
    {"messages": [HumanMessage(content="Search for LangGraph tutorials")]},
    thread
):
    # Each event is a dict: {node_name: state_update}
    for node_name, update in event.items():
        if "messages" in update:
            last = update["messages"][-1]
            print(f"[{node_name}] {last.content or last.tool_calls}")

# Check where the graph paused
state = app.get_state(thread)
print(f"\nGraph paused. Next node(s): {state.next}")

# Show the pending tool calls for human review
if state.values["messages"]:
    last_msg = state.values["messages"][-1]
    if hasattr(last_msg, "tool_calls") and last_msg.tool_calls:
        for tc in last_msg.tool_calls:
            print(f"Pending tool call: {tc['name']}({tc['args']})")

When the LLM decides to call a tool, the graph pauses before executing it. The output above shows you exactly what tool call the LLM has requested. You can now present this to the human operator — in a web UI, a Slack message, an email — and wait for their decision.

Step 14: Resume after human approval

# Human has reviewed and approved — resume by passing None
print("\n=== Human approved. Resuming... ===")
for event in app.stream(None, thread):
    for node_name, update in event.items():
        if "messages" in update:
            last = update["messages"][-1]
            print(f"[{node_name}] {getattr(last, 'content', str(last))}")

# Final result
final_state = app.get_state(thread)
print("\n=== Final answer ===")
print(final_state.values["messages"][-1].content)

Warning Do not pass new messages when resuming — pass None to continue from the interrupted state. Passing a new HumanMessage starts a new turn, not a resume. Your new message would be appended to the history before the pending tool calls are executed, producing unexpected and hard-to-debug behaviour.

To reject the tool call (have the agent try a different approach), you can modify the state before resuming:

# Reject: overwrite the last message to remove the tool calls
from langchain_core.messages import HumanMessage, AIMessage

# Update the state — replace the pending AI message with a human veto
app.update_state(
    thread,
    {"messages": [HumanMessage(content="Do not search. Instead, answer from your training data.")]},
    as_node="agent"
)

# Resume — the graph will now re-enter the agent node with the updated state
for event in app.stream(None, thread):
    for node_name, update in event.items():
        if "messages" in update:
            print(f"[{node_name}] {update['messages'][-1].content}")

The combination of interrupt, state inspection, update_state, and resume gives you fine-grained control over every agent action. This is the production-ready HITL pattern used in real deployments where tools have side effects that cannot be undone.

Pro tip For async web applications, store the thread ID and the interrupt notification in your database when the graph pauses. The user can review and approve via a web endpoint that calls app.stream(None, thread) to resume — potentially hours later, across a completely different process. The Postgres checkpointer makes this trivially durable.

Built this agent? Put it on your profile.

AITC Builder profiles let you list projects with tech stack, repo link, and a description of the problem solved. Founding Builder spots — fewer than 500 — come with a permanent badge.

Add your project

Common Pitfalls and Debugging

Every LangGraph practitioner runs into the same set of problems. The table below maps each pitfall to its symptom and fix, drawn from patterns that appear repeatedly in the LangGraph GitHub issues and community forums as of June 2026:

Pitfall	Symptom	Fix
Infinite loop	Agent calls the same tool repeatedly; graph never reaches `END`	Add a step counter to state; return `END` when it exceeds a maximum. Also set `recursion_limit` in `graph.compile()` as a safety net.
State bloat	Memory usage grows unbounded over long sessions; slow checkpointing	Prune old messages periodically. Keep the last N messages plus the original system prompt. `trim_messages()` from `langchain_core.messages` provides a clean API for this.
Tool hallucination	LLM invents tool names that do not exist in the schema; `ToolNode` raises `KeyError`	Use strict schema validation in `@tool` decorators. List available tools explicitly in the system prompt. Consider adding an error-handling node that catches tool errors and feeds them back to the LLM.
Under-defined state	Conditional routing fails silently; graph takes wrong paths; hard to reproduce	Type your state explicitly with `TypedDict`. Log the full state after each node during development using a `print_state` node wired in with `add_edge`.
Thread ID collision	Separate user sessions bleed into each other; user A sees user B's conversation history	Generate thread IDs with `uuid.uuid4()` at session start. Never use user-provided strings (usernames, email addresses) directly as thread IDs without namespacing or hashing.

Debugging a LangGraph agent in practice requires more than reading tracebacks. Because execution is distributed across nodes, the failure point and the observable symptom are often several steps apart. The most reliable first move is to call app.get_state(config) immediately after each node executes and log the result. You can do this by wrapping your compiled graph in a thin harness that prints the state after every stream event — the event dictionary already tells you which node just ran, so you can correlate state snapshots with node execution without any additional instrumentation. Checking state.next at each point tells you what the graph intends to do next, which makes it immediately obvious when routing has gone wrong before the wrong node actually runs.

Two failure modes account for the majority of confusing LangGraph bugs. The first is a graph that terminates unexpectedly: the agent stops mid-task without reaching a meaningful answer. This is almost always caused by a missing edge or a conditional edge whose routing function returns END prematurely — often because the function checks the wrong field, or because an edge was never added for a particular node. The second mode is a graph that never terminates: the agent loops between the same two nodes indefinitely, racking up API costs and eventually hitting a timeout or rate limit. This is almost always a conditional edge whose function always returns the same non-END value — usually because the termination condition checks for something that the LLM's output never quite matches (a specific string, an empty tool call list, a field that is set to the wrong default). In both cases, logging the full state after each node — not just the last message — is the fastest path to the root cause.

LangGraph Studio is the most efficient tool for this class of debugging. Available for macOS, Windows, and Linux, it renders your graph topology as an interactive diagram and lets you click through any historical thread step by step. You can inspect the input and output state for each node, compare runs side by side, and modify state values in the UI before re-running from any checkpoint. For teams working on complex HITL workflows, the ability to replay an interrupted thread with a modified state — without writing a line of code — saves hours of iterative print-statement debugging.

Pro tip

Add an iteration_count: int field to your AgentState and increment it in your agent node. Then add a conditional edge that routes to END if state["iteration_count"] >= 10. This single safeguard prevents runaway loops during development and gives you a clear signal when your termination logic is broken.

Debugging with LangGraph Studio

LangGraph Studio is a desktop application (macOS, Windows, Linux) that provides a visual interface for inspecting and debugging LangGraph graphs. You point it at your graph definition file, and it renders the graph topology as an interactive diagram. Click any node to see its input state, output state, and timing. Click any thread to replay its execution step by step. You can also modify state values directly in the UI and re-run from any checkpoint — which is exceptionally useful for reproducing edge-case failures.

For teams debugging complex HITL workflows, Studio is the fastest path to understanding exactly where and why an agent's behaviour diverged from expectations. It is free for local development.

For production tracing across distributed systems, pair LangGraph with OpenTelemetry spans on every node. The full OTel instrumentation guide covers spans, tail-sampled traces, and cost attribution in depth.

Production Checklist and Profile CTA

Shipping a LangGraph agent to production involves choices across hosting, persistence, and monitoring. The table below summarises the main deployment options as of June 2026:

Option	Use case	Approx. cost	Notes
LangGraph Studio (local)	Development and demo	Free	Best for iteration; visualises graph state in real time
Docker + Cloud Run	Small production workloads	~£8–40/month	Auto-scales to zero; swap `MemorySaver` for `PostgresSaver` with a managed Postgres instance
LangGraph Cloud	Managed production with full HITL	Per-execution pricing	Fully managed persistence, concurrency, and HITL resumption; no infrastructure to operate
Self-hosted Postgres + FastAPI	Full control, enterprise requirements	Variable	Most flexible; most operational overhead; required if you have data residency constraints (IN or UK data sovereignty)

Before deploying, work through this checklist:

Replace MemorySaver with PostgresSaver and confirm checkpointing survives a process restart.
Add a recursion limit to graph.compile(recursion_limit=25) to catch runaway loops in production.
Add a step counter to your state and return END at a sensible maximum as a belt-and-braces measure.
Instrument every node with OpenTelemetry spans and attach user.id and session.id as span attributes for cost attribution.
Implement message trimming to bound state size for long-running sessions.
Generate thread IDs with uuid.uuid4(), never from user-provided input.
Test the full HITL flow end-to-end: trigger an interrupt, inspect the state, approve via resume, verify the tool executed correctly.
Load test with concurrent threads to verify your Postgres checkpointer handles concurrent writes without deadlocks.

Using this as a portfolio project

An agent you can describe concretely — what it does, what tools it uses, how it handles failure, what it would cost to run at 1,000 daily active users — is far more impressive in a portfolio than a list of frameworks you know. The pattern you built in this guide is a genuine production architecture. LangGraph is the framework that teams at Klarna and LinkedIn trust in production; being able to demonstrate you understand why matters.

Once you have the agent working, add it to your AITC profile as a project. List the tech stack (LangGraph, Anthropic Claude, Postgres), link to the repository, and write two sentences on the problem it solves. Founding Builder profiles — fewer than 500 spots — get a permanent badge and priority placement in the directory. If you are building in India or the UK, this is how the people hiring find you. See today's companion guide on making your AI projects discoverable.

Add your profile — claim your Founding Builder spot

Frequently Asked Questions

Do I need LangGraph or can I just use a simple tool-calling loop?

For single-step agents or simple chains, a direct tool-calling loop is lower overhead and perfectly adequate. LangGraph's value shows up when you need persistent conversation state across turns, conditional routing between multiple nodes, human-in-the-loop interrupts, or complex multi-step workflows where the LLM decides the next step at runtime. If your agent has fewer than three distinct states or never needs to pause for human review, start with raw tool-calling and migrate to LangGraph only when the complexity demands it.

What is the difference between MemorySaver and PostgresSaver checkpointers?

MemorySaver stores the entire graph state in your process's memory. It is fast, requires no external dependencies, and is ideal for development, automated tests, and demos. The downside is that state is lost when the process restarts. PostgresSaver writes each checkpoint to a Postgres database, giving you durable persistence across restarts and the ability to resume interrupted threads hours or days later. The API is identical — you simply swap out the checkpointer when compiling the graph — which makes the migration from development to production straightforward.

How do I prevent my LangGraph agent from looping indefinitely?

The most reliable approach is to add a step counter to your AgentState TypedDict and increment it in every node. Your conditional edge function checks the counter and returns END if it exceeds a maximum (typically 10–20 steps for most agents). You can also add a termination condition based on the content of the last message — for example, if the LLM emits a specific finish token or if the last tool call returned an empty result. graph.compile() also accepts a recursion_limit parameter that raises an error rather than running forever, which is a useful safety net during development.

Can I use LangGraph with models other than Anthropic Claude?

Yes. LangGraph is fully model-agnostic. The graph and state machinery is independent of any LLM. You replace ChatAnthropic with ChatOpenAI, ChatGoogleGenerativeAI, ChatOllama, or any other LangChain chat model. The only constraint is that your chosen model must support tool-calling (function calling) if you want to use ToolNode and bind_tools(). All major frontier models — GPT-4o, Gemini 1.5 Pro, Claude Sonnet 4.x — support tool-calling as of mid-2026. Open-weight models such as Llama 3.1 70B also support it via Ollama or vLLM.

How do I deploy a LangGraph agent to production?

The three main paths are: (1) Wrap the compiled graph in a FastAPI or LangServe endpoint and deploy to Docker on Cloud Run or any container host — this gives you full control over infrastructure at low cost; (2) Use LangGraph Cloud, the managed hosting option from LangChain, which handles HITL persistence, concurrency, and scaling without any infrastructure work; (3) Self-host with a Postgres checkpointer, a task queue such as Celery or Redis Streams for async runs, and a standard ASGI server. Most teams start with option 1 (Docker + Cloud Run) and migrate to option 2 when operational overhead outweighs the cost difference.

Ship the agent. List it on your AITC profile.

AI Tech Connect is the directory where Indian and UK AI Builders get found by the people hiring and collaborating. Add this agent as a project — tech stack, repo link, problem solved. Founding Builder profiles (fewer than 500 spots) get a permanent badge and priority placement.

Create your free profile Browse Builders

← Back to AI Tips