What is a HITL checkpoint in LangGraph v0.4?

A HITL (human-in-the-loop) checkpoint is a graph node that pauses agent execution and surfaces an Interrupt object to the calling application. The application can inspect the agent's current state, provide a human decision, and then resume execution with a single call. In v0.4, interrupts are automatically included in the .invoke() return value and the 'values' stream mode — no separate getState() call is needed.

Which persistence backends does LangGraph v0.4 support?

LangGraph v0.4 supports three checkpointer backends: AsyncPostgresSaver (recommended for production — durable, queryable audit trail), the Redis-based langgraph-checkpoint-redis package (fast, TTL-based, suitable for short-horizon agents), and MemorySaver (in-process, development only — state is lost on restart).

How do HITL checkpoints help with FCA or SEBI compliance?

FCA SYSC 9.1 requires that firms retain records of decisions made by automated systems, including the inputs that led to those decisions. SEBI's Research Analysts Regulations have similar audit requirements for algorithm-driven recommendations. LangGraph's persistent state — stored in Postgres — provides a queryable log of every checkpoint, the state at each pause, the human input supplied, and the downstream graph execution. This satisfies the 'explainable decision trail' requirement without custom audit middleware.

What changed from LangGraph v0.3 to v0.4?

The primary breaking change is the simplification of the Interrupt class from four fields (value, resumable, ns, when) to two fields (value, resumable). Interrupts are now automatically included in the .invoke() response and 'values' stream — previously you had to call getState() separately to detect a pending interrupt. Multi-interrupt resume is also new: you can resolve multiple parallel interrupts in a single Command call by mapping interrupt IDs to resume values.

When should I use LangGraph vs CrewAI or AutoGen?

LangGraph is the right choice when you need explicit state management, audit trails, human-in-the-loop approval steps, and rollback capability — typical in fintech, legal, and healthcare workflows. CrewAI is better suited to role-based multi-agent pipelines where agents collaborate conversationally with less need for structured state. AutoGen excels at code execution and research-style agent conversations. For regulated-industry production workloads in India or the UK, LangGraph's compliance affordances make it the stronger default.

LangGraph v0.4: HITL Checkpoints and State Persistence

What changed in v0.4

LangGraph v0.4 (released April 2026, per the LangGraph changelog) centres on a single theme: making human-in-the-loop interrupts first-class citizens of the framework rather than an advanced pattern requiring significant custom plumbing. Three changes drive this:

Auto-surfaced interrupts: Interrupt objects are now included automatically in the .invoke() return value (under the "__interrupt__" key in the Python API) and in the "values" stream mode. Previously, detecting a pending interrupt required a separate get_state() call (getState() in LangGraph.js) after .invoke() returned. That extra round-trip is gone.
Simplified Interrupt class: The Interrupt object is reduced from four fields (value, resumable, ns, when) to two (value, resumable). The result is less boilerplate and cleaner pattern-matching in application code.
Multi-interrupt resume: Graphs that run parallel branches can now surface multiple simultaneous interrupts and resume all of them in a single Command call that maps interrupt IDs to their respective resume values. Previously, each interrupt required a separate resume call.
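
Multi-interrupt resume comes down to building one mapping from interrupt ID to resume value. The sketch below is illustrative only: PendingInterrupt is a hypothetical stand-in for whatever the graph surfaces (it is not the LangGraph class), and decide is an assumed callback that represents the reviewer's decision. The resulting dictionary is the kind of value that would be handed to Command(resume=...).

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable


@dataclass(frozen=True)
class PendingInterrupt:
    """Hypothetical stand-in for an interrupt surfaced by a paused graph."""
    interrupt_id: str
    value: Any


def build_resume_map(
    pending: Iterable[PendingInterrupt],
    decide: Callable[[Any], str],
) -> dict[str, str]:
    """Map every pending interrupt ID to the decision made for its payload."""
    return {p.interrupt_id: decide(p.value) for p in pending}


# Two parallel branches paused at the same time (made-up sample data)
pending = [
    PendingInterrupt("int-1", {"check": "kyc", "risk_score": 0.41}),
    PendingInterrupt("int-2", {"check": "credit", "risk_score": 0.77}),
]

# Assumed decision rule, standing in for a human reviewer
resume_map = build_resume_map(
    pending,
    lambda payload: "approve" if payload["risk_score"] < 0.5 else "reject",
)
# resume_map would then be passed as Command(resume=resume_map)
```

Keeping the mapping step separate from the resume call makes it easy to unit-test the review logic without a running graph.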

How HITL checkpoints work: a loan-approval example

The most common production use case for HITL in regulated industries is an approval gate: an AI agent processes a request, reaches a point where human sign-off is required, pauses, and then continues once a human provides a decision. Here is the minimal LangGraph v0.4 pattern for a loan-approval node:

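import os
from typing import TypedDict

from langgraph.graph import StateGraph, END
from langgraph.types import interrupt, Command

# Placeholders (not part of LangGraph): supply your own implementations
DATABASE_URL = os.environ["DATABASE_URL"]

def calculate_risk(application_id: str) -> float:
    raise NotImplementedError("call your risk model here")

def disburse_loan(application_id: str) -> None:
    raise NotImplementedError("call your disbursement service here")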

class LoanState(TypedDict):
    application_id: str
    risk_score: float
    human_decision: str | None

def risk_assessment_node(state: LoanState):
    # AI calculates risk score
    score = calculate_risk(state["application_id"])
    return {"risk_score": score}

def human_review_node(state: LoanState):
    # Pause execution and surface the interrupt. On resume, this node
    # re-runs from the top and interrupt() returns the resume value.
    decision = interrupt({
        "application_id": state["application_id"],
        "risk_score": state["risk_score"],
        "message": "Risk score is borderline. Approve or reject?"
    })
    return {"human_decision": decision}

def disbursement_node(state: LoanState):
    if state["human_decision"] == "approve":
        disburse_loan(state["application_id"])
    return {}

builder = StateGraph(LoanState)
builder.add_node("risk_assessment", risk_assessment_node)
builder.add_node("human_review", human_review_node)
builder.add_node("disbursement", disbursement_node)
builder.set_entry_point("risk_assessment")
builder.add_edge("risk_assessment", "human_review")
builder.add_edge("human_review", "disbursement")
builder.add_edge("disbursement", END)

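# Use AsyncPostgresSaver for production: state persists across restarts.
# from_conn_string() returns an async context manager, so compile and use
# the graph inside the `async with` block (within an async function).
from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver

async with AsyncPostgresSaver.from_conn_string(DATABASE_URL) as checkpointer:
    await checkpointer.setup()  # create the checkpoint tables on first run
    # interrupt() inside human_review_node already pauses the graph,
    # so interrupt_before is not needed here.
    graph = builder.compile(checkpointer=checkpointer)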

In v0.4, when the graph hits human_review, the interrupt is returned directly in the invoke() response. Your application layer reads it, presents the decision to a human reviewer, and resumes with:

result = await graph.ainvoke(
    Command(resume="approve"),
    config={"configurable": {"thread_id": application_id}}
)
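
Before that resume call can happen, the application layer has to recognise that the run paused rather than finished. Here is a minimal sketch, assuming the Python result is a dict that carries pending interrupts under the "__interrupt__" key and that each interrupt exposes a .value attribute. SimpleNamespace objects stand in for real Interrupt instances so the snippet runs without LangGraph installed, and pending_reviews is a hypothetical helper name.

```python
from types import SimpleNamespace
from typing import Any


def pending_reviews(result: dict[str, Any]) -> list[Any]:
    """Return the payloads awaiting a human decision, or [] if the run finished."""
    return [item.value for item in result.get("__interrupt__", ())]


# Simulated result of a paused run (stand-in data, not real LangGraph output)
paused_result = {
    "application_id": "APP-1001",
    "risk_score": 0.58,
    "__interrupt__": (
        SimpleNamespace(value={
            "application_id": "APP-1001",
            "risk_score": 0.58,
            "message": "Risk score is borderline. Approve or reject?",
        }),
    ),
}

# Simulated result of a run that completed without pausing
finished_result = {
    "application_id": "APP-1001",
    "risk_score": 0.58,
    "human_decision": "approve",
}

to_review = pending_reviews(paused_result)
nothing_to_review = pending_reviews(finished_result)
```

An empty list means the graph ran to completion; a non-empty list is what the review UI would present to the human.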

Pro tip

Use a deterministic thread_id — such as the application ID — rather than a UUID. This lets you resume from any service instance without passing the thread ID through your message queue, and makes the Postgres audit trail trivially queryable by business identifier.
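
As a small illustration of the tip above, the config can be derived purely from the business identifier, so any service instance rebuilds the same thread_id. The helper name thread_config and the whitespace check are assumptions made for this sketch; only the {"configurable": {"thread_id": ...}} shape comes from the example earlier in this article.

```python
def thread_config(application_id: str) -> dict:
    """Build a run config whose thread_id is the business identifier itself."""
    app_id = application_id.strip()
    if not app_id:
        # An empty thread_id would make every application share one thread
        raise ValueError("application_id must be non-empty")
    return {"configurable": {"thread_id": app_id}}


# Two services computing the config independently arrive at the same thread
config_from_api = thread_config("APP-1001")
config_from_worker = thread_config(" APP-1001 ")
```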

Persistence backends: choosing the right one

The choice of checkpointer determines your durability, query capability, and recovery story:

Backend	Durability	Queryable?	TTL support	Best for
AsyncPostgresSaver	Durable (ACID)	Yes — full SQL	No (manual delete)	Production, regulated workloads, audit trails
Redis (langgraph-checkpoint-redis)	Configurable AOF/RDB	Limited	Yes	Short-horizon agents, high-throughput, low-latency
MemorySaver	None (in-process)	No	N/A	Development and testing only

For any regulated-industry deployment, such as fintech in India or FCA-supervised services in the UK, AsyncPostgresSaver is the only appropriate choice of the three. The Postgres-backed checkpoint tables store the graph state at every super-step (after each node in a sequential graph), so you can reconstruct every decision the agent made and every human input it received. This is the audit trail that compliance teams ask for and that custom logging middleware is often too brittle to provide reliably.
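
The selection logic in the table can be written down as a rule of thumb. This is only a sketch of the guidance in this section, not an official recommendation: the function name, its parameters, and the "production" environment label are assumptions, and the return values are simply the backend names used above.

```python
def choose_checkpointer(
    environment: str,
    needs_audit_trail: bool,
    wants_ttl: bool = False,
) -> str:
    """Pick a checkpointer backend name following the table's guidance."""
    if environment != "production":
        # In-process state is acceptable only for development and testing
        return "MemorySaver"
    if needs_audit_trail:
        # Regulated workloads need durable, queryable history
        return "AsyncPostgresSaver"
    if wants_ttl:
        # Short-horizon agents that benefit from automatic expiry
        return "langgraph-checkpoint-redis"
    # Default to the durable option when no other constraint applies
    return "AsyncPostgresSaver"


lending_backend = choose_checkpointer("production", needs_audit_trail=True)
chatbot_backend = choose_checkpointer(
    "production", needs_audit_trail=False, wants_ttl=True
)
local_backend = choose_checkpointer("development", needs_audit_trail=True)
```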

The compliance angle: FCA and SEBI

Both the UK Financial Conduct Authority and India's Securities and Exchange Board of India require firms to maintain records of automated decisions — including the inputs that drove those decisions — for periods ranging from three to seven years depending on the instrument and jurisdiction.

FCA SYSC 9.1 (record-keeping) requires investment firms to retain the data that enabled automated order generation and routing. SEBI's Research Analysts Regulations require records of algorithm-driven recommendations. In both cases, "we had logs" is insufficient if those logs cannot reconstruct the sequence of states and human approvals that led to a specific outcome.

LangGraph's AsyncPostgresSaver checkpointer stores:

The complete graph state at every node transition
The thread ID and run ID (correlatable with your application's transaction IDs)
The timestamp of each checkpoint
Any interrupt value (the question posed to the human reviewer)
The resume value (the human's decision)

This is sufficient to satisfy a compliance query of the form: "Show me every decision made by the AI on this application, what state it was in when it paused for human review, what the reviewer was shown, and what they decided." Previously, generating that reconstruction required custom event-sourcing infrastructure. With v0.4, it is a Postgres query.
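
To make the shape of such a compliance query concrete, here is a hedged sketch that uses an in-memory SQLite database as a stand-in for Postgres. The checkpoint_log table, its columns, and the sample rows are all invented for illustration and do not reflect the schema that AsyncPostgresSaver actually creates; the point is only that a decision trail is an ordered lookup by thread ID.

```python
import json
import sqlite3

# Stand-in store with a simplified, hypothetical schema
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE checkpoint_log (
           thread_id TEXT, step INTEGER, node TEXT,
           created_at TEXT, state_json TEXT)"""
)

# Made-up sample rows for two loan applications
sample_rows = [
    ("APP-1001", 1, "risk_assessment", "2026-05-01T09:00:00Z",
     json.dumps({"risk_score": 0.58, "human_decision": None})),
    ("APP-1001", 2, "human_review", "2026-05-01T09:42:10Z",
     json.dumps({"risk_score": 0.58, "human_decision": "approve"})),
    ("APP-2002", 1, "risk_assessment", "2026-05-01T10:00:00Z",
     json.dumps({"risk_score": 0.12, "human_decision": None})),
]
conn.executemany("INSERT INTO checkpoint_log VALUES (?, ?, ?, ?, ?)", sample_rows)


def decision_trail(db: sqlite3.Connection, thread_id: str) -> list[dict]:
    """Return every recorded step for one application, oldest first."""
    cursor = db.execute(
        "SELECT step, node, created_at, state_json FROM checkpoint_log "
        "WHERE thread_id = ? ORDER BY step",
        (thread_id,),
    )
    return [
        {"step": step, "node": node, "at": at, "state": json.loads(state)}
        for step, node, at, state in cursor
    ]


trail = decision_trail(conn, "APP-1001")
```

Because the thread ID is the application ID, the compliance lookup needs no join against a separate correlation table.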

Builder perspective

"We spent six weeks building a custom audit middleware for our lending workflow before LangGraph's persistence landed. If we were starting today, we would have skipped that entirely. The checkpoint table gives us everything the compliance team asked for, with timestamps and thread correlation for free."

— A Verified Builder · Bengaluru, IN

LangGraph v0.4 vs CrewAI vs AutoGen: a feature comparison

Feature	LangGraph v0.4	CrewAI	AutoGen
HITL checkpoints	First-class, built-in	Limited (callback hooks)	Via UserProxyAgent
Durable state persistence	Postgres, Redis, custom	No built-in	No built-in
Audit trail / rollback	Yes — checkpoint history	No	No
Parallel agent execution	Yes — parallel branches	Yes — crew tasks	Yes — nested chats
Learning curve	Higher (graph model)	Lower (role model)	Medium
Enterprise adoption	Strong (2026 growth)	Strong community	Strong research use

LangGraph's GitHub star count (~12,800 as of mid-2026) is lower than that of CrewAI or AutoGen, but its enterprise production adoption has grown substantially in 2026, driven by exactly these compliance affordances. Builders choosing a framework for regulated production workloads increasingly favour LangGraph's explicit state model over the higher-level abstractions in CrewAI.

Migrating from v0.3 to v0.4

The one breaking change to address before upgrading is the Interrupt class simplification. If your existing code accesses interrupt.ns or interrupt.when, those fields are gone. The migration is straightforward:

# v0.3 — interrupt had 4 fields
if interrupt.resumable and interrupt.when == "during":
    handle_interrupt(interrupt.value)

# v0.4 — interrupt has 2 fields (ns and when removed)
if interrupt.resumable:
    handle_interrupt(interrupt.value)

Also update any code that called get_state() (getState() in LangGraph.js) after invoke() to detect pending interrupts; the interrupt is now in the invoke() return value directly. The minimum Python version for v0.4 is 3.10.
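
If a codebase has to run against both versions during a staged rollout, one option is a small compatibility check that tolerates the missing fields. This is a sketch under the field layout described in this article; should_handle is a hypothetical helper, and SimpleNamespace objects stand in for real Interrupt instances so the snippet runs without LangGraph installed.

```python
from types import SimpleNamespace


def should_handle(interrupt) -> bool:
    """Decide whether to handle an interrupt, whichever shape it has."""
    if not getattr(interrupt, "resumable", False):
        return False
    # `when` exists only on the older four-field shape; treat its absence
    # as equivalent to the old "during" case
    when = getattr(interrupt, "when", None)
    return when in (None, "during")


# Stand-ins for the two shapes (illustrative values only)
old_style = SimpleNamespace(
    value="Approve or reject?", resumable=True, ns=["human_review"], when="during"
)
new_style = SimpleNamespace(value="Approve or reject?", resumable=True)
not_resumable = SimpleNamespace(value="Approve or reject?", resumable=False)

handle_old = should_handle(old_style)
handle_new = should_handle(new_style)
handle_blocked = should_handle(not_resumable)
```

Once every service is on v0.4, the getattr fallback can be dropped in favour of the plain interrupt.resumable check shown above.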

When to use LangGraph vs simpler alternatives

LangGraph is the right choice when your agent workflow has any of these characteristics:

Human approval gates — a human must sign off before the agent proceeds
Durable multi-step processes — the workflow spans hours or days and must survive restarts
Compliance audit requirements — you need a reconstruction of every state and decision
Rollback / replay — you need to rewind to a prior state and try a different path
Complex branching logic — conditional edges, parallel branches, error-recovery paths

If your workflow is a straightforward sequential chain with no interrupts and no durability requirement, LangChain Expression Language (LCEL) or a simple async pipeline is faster to build and easier to operate. See our coverage of the agent SDK landscape for a broader framework comparison, and our guide to calibrated beliefs in agentic orchestration for patterns that apply across frameworks.

LangGraph v0.4 makes human-in-the-loop approval and audit trails a framework feature, not a custom build
