What changed on 13 May 2026
Cline — the open-source coding agent best known as a VS Code extension and CLI assistant — has released @cline/sdk, an open-source agent runtime that any team can import and embed in their own product. The package is not a thin wrapper or a marketing repackage of the IDE extension. It is the same runtime that powers every Cline surface: the IDE extension, the new CLI, and the new terminal UI. Whatever Cline runs in production, you now run too.
Alongside the SDK, Cline shipped a rebuilt CLI on top of the same runtime, with three additions that change what a coding agent is allowed to be: agent teams (multi-agent collaboration in a single workspace), scheduled jobs (cron-style autonomous runs), and connectors (a typed integration surface for talking to external systems). The benchmark headline is also notable. Running on claude-opus-4.7, the new Cline CLI is reported at 74.2% on Terminal Bench 2.0 per Cline's own announcement, comfortably ahead of OpenCode running kimi-k2.6 at a Cline-reported 37.1% on the same harness. The full announcement was covered by TestingCatalog; the repository sits at github.com/cline/cline, and the product home is cline.bot.
The shorter version: agent runtimes used to mean either (a) glue you wrote yourself, or (b) a hosted SDK locked to one provider's surface. Cline has just put a third option on the table — a production-tested, provider-neutral runtime you can install with one npm command and ship in your own UI.
If you have spent the last six months gluing together LangGraph nodes, file-edit tools, and a shell wrapper to ship an internal coding bot, do the maths on switching cost before your next sprint. A runtime that ships with file diffing, shell execution, browser control and MCP support already wired in can save weeks of integration work, and the team behind it absorbs much of the bug surface for you.
What "agent teams + scheduled jobs + connectors" actually means
These three words — teams, jobs, connectors — are doing a lot of work in the announcement. Stripped of marketing, here is what each one buys a Builder shipping next quarter.
Agent teams
A single workspace can spawn multiple specialised agents — say, a planner, a code-writer, a tester, and a reviewer — each with its own system prompt and toolset, sharing a common file system and conversation log. This is the pattern most teams reinvent badly on their own. Having it in the runtime means cross-agent message routing, shared scratchpad memory, and turn ordering are no longer your problem. If you have already read our piece on multi-agent orchestration with LangGraph, you will recognise the design space — the difference is that Cline ships an opinionated default rather than a graph builder.
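To make the turn-ordering and shared-scratchpad idea concrete, here is a minimal sketch of one fixed-order pass over a team. Every type and function name in it (TeamAgent, runRound, makeStub) is hypothetical and chosen for illustration; it does not show the @cline/sdk team API, which the announcement does not spell out.

```typescript
// Hypothetical shapes for illustration only; these are not @cline/sdk types.
type Role = "planner" | "code-writer" | "tester" | "reviewer";

interface TeamMessage {
  from: Role;
  text: string;
}

interface TeamAgent {
  role: Role;
  systemPrompt: string;
  tools: string[];
  // A real agent would call a model here; this stub only reads the shared log.
  act(log: readonly TeamMessage[], scratchpad: Map<string, string>): string;
}

// Runs one fixed-order pass over the team, appending each reply to the shared log.
function runRound(
  members: TeamAgent[],
  log: TeamMessage[],
  scratchpad: Map<string, string>,
): TeamMessage[] {
  for (const agent of members) {
    const text = agent.act(log, scratchpad);
    log.push({ from: agent.role, text });
  }
  return log;
}

// Builds a stub agent that records how much of the log it saw on its turn.
const makeStub = (role: Role, tools: string[]): TeamAgent => ({
  role,
  systemPrompt: `You are the ${role}.`,
  tools,
  act: (log, scratchpad) => {
    scratchpad.set(role, String(log.length));
    return `${role} saw ${log.length} earlier message(s)`;
  },
});

const team: TeamAgent[] = [
  makeStub("planner", []),
  makeStub("code-writer", ["file-edit", "shell"]),
  makeStub("tester", ["shell"]),
  makeStub("reviewer", []),
];

const sharedScratchpad = new Map<string, string>();
const transcript = runRound(team, [], sharedScratchpad);
for (const m of transcript) console.log(`${m.from}: ${m.text}`);
```

Even in this toy form, the ordering policy and the shared state live in one place rather than inside each agent, which is the part teams tend to get wrong when they build it themselves.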
Scheduled jobs
Cron-style triggers that run an agent without a human in the loop. The obvious uses are sweep-style tasks: "every Monday at 09:00, scan our dependency manifest for security advisories and open a PR for any patch versions". The less obvious use is event-coupled — wake an agent when CI fails on main, when a flaky test exceeds a threshold, or when a customer-reported bug arrives in your tracker. Scheduled jobs are the seam where coding agents stop being assistants and start being workers.
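At the heart of any cron-style trigger is a "when does this fire next?" calculation. The sketch below works that out for the weekly Monday 09:00 example, in UTC for simplicity. The function nextWeeklyRun is a hypothetical helper, not part of Cline's scheduler, whose configuration format the announcement does not show.

```typescript
// Returns the next occurrence of a weekly slot strictly after `now`, in UTC.
// `weekday` follows Date.getUTCDay(): 0 = Sunday, 1 = Monday, and so on.
function nextWeeklyRun(now: Date, weekday: number, hour: number): Date {
  const next = new Date(Date.UTC(
    now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate(), hour, 0, 0, 0,
  ));
  const daysAhead = (weekday - now.getUTCDay() + 7) % 7;
  next.setUTCDate(next.getUTCDate() + daysAhead);
  // If the slot falls today but is not in the future, move on a full week.
  if (next.getTime() <= now.getTime()) next.setUTCDate(next.getUTCDate() + 7);
  return next;
}

// 13 May 2026 is a Wednesday, so the next Monday 09:00 slot is 18 May.
const announcedAt = new Date(Date.UTC(2026, 4, 13, 12, 0, 0));
console.log(nextWeeklyRun(announcedAt, 1, 9).toISOString()); // prints 2026-05-18T09:00:00.000Z
```

A production scheduler also has to handle time zones, missed runs after downtime, and overlapping executions, none of which this sketch attempts.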
Connectors
A typed integration surface for the systems an agent talks to outside its file system: GitHub, Jira, Linear, Sentry, Postgres, Slack, and so on. The connector model means tool-call schemas are first-class and version-pinned, rather than a swamp of ad-hoc JSON. For teams that have been wrestling with brittle tool calls — a problem we have written about in tool calling at scale — this is a meaningful uplift.
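As a rough picture of what "first-class and version-pinned" tool-call schemas can look like, the sketch below pairs a tool name and a pinned version with a parser that narrows an untyped payload. ToolSchema, openIssueV1, checkCall and the tool name tracker.openIssue are all hypothetical; the real connector interface may differ substantially.

```typescript
// Hypothetical connector shapes for illustration; not the @cline/sdk connector API.
type ParseResult<Input> =
  | { kind: "ok"; value: Input }
  | { kind: "error"; error: string };

interface ToolSchema<Input> {
  tool: string;
  version: string; // pinned exactly, e.g. "1.0.0"
  // Narrows an unknown payload to Input, or explains why it does not fit.
  parse(raw: unknown): ParseResult<Input>;
}

interface OpenIssueInput {
  repo: string;
  title: string;
}

const openIssueV1: ToolSchema<OpenIssueInput> = {
  tool: "tracker.openIssue",
  version: "1.0.0",
  parse(raw) {
    if (typeof raw !== "object" || raw === null) {
      return { kind: "error", error: "expected an object" };
    }
    const rec = raw as Record<string, unknown>;
    if (typeof rec.repo !== "string") {
      return { kind: "error", error: "repo must be a string" };
    }
    if (typeof rec.title !== "string" || rec.title.length === 0) {
      return { kind: "error", error: "title must be a non-empty string" };
    }
    return { kind: "ok", value: { repo: rec.repo, title: rec.title } };
  },
};

// Rejects calls whose tool name, pinned version, or payload does not match the schema.
function checkCall<Input>(
  schema: ToolSchema<Input>,
  call: { tool: string; version: string; input: unknown },
): string {
  if (call.tool !== schema.tool) return `unknown tool ${call.tool}`;
  if (call.version !== schema.version) {
    return `version mismatch: expected ${schema.version}, got ${call.version}`;
  }
  const parsed = schema.parse(call.input);
  return parsed.kind === "ok" ? "accepted" : `rejected: ${parsed.error}`;
}

console.log(checkCall(openIssueV1, {
  tool: "tracker.openIssue",
  version: "1.0.0",
  input: { repo: "acme/payments", title: "Health check fails" },
}));
```

The design choice worth noting is that a malformed or stale call fails at the boundary with a readable reason, instead of reaching the external system as ad-hoc JSON.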
The 74.2% Terminal Bench 2.0 score, decoded
Terminal Bench 2.0 measures whether a coding agent can complete realistic, end-to-end developer tasks inside a shell: install a dependency, fix a failing test, refactor a module, write a script that satisfies a spec. It is closer to "is this agent useful at my desk?" than older single-turn benchmarks like HumanEval. A 74.2% on Terminal Bench 2.0, as reported by Cline, puts the Cline CLI on claude-opus-4.7 in the leading band for production coding agents. The 37.1% figure for OpenCode running kimi-k2.6 is a useful reference point for open-weight stacks, but the two results differ in model as well as runtime, so the gap suggests rather than proves how much the runtime, and not just the model, contributes to end-to-end performance.
The Builder takeaway is uncomfortable for the "any open model on any wrapper" school of thought: the runtime matters as much as the model. A great model on a mediocre agent loop will still drift, misuse tools, and leave you debugging the same five failure modes you saw last week. Cline has spent two years tuning the loop. That work is now redistributable.
Cline SDK vs Claude Agent SDK vs OpenAI Agents SDK vs LangGraph
Picking an agent runtime in May 2026 is no longer a question of whether one exists — it is a question of which class of trade-off you accept. Here is how the four serious contenders line up.
| Runtime | Open source? | Provider lock-in | Hosted option | What it optimises for |
|---|---|---|---|---|
| Cline SDK | Yes (Apache-2.0) | None — BYOK across Anthropic, OpenAI, Google, OpenRouter, local | No (self-host) | IDE/CLI/TUI coding agents you ship in your own product |
| Claude Agent SDK | Partial (client OSS, runtime hosted) | Anthropic only | Yes — see our Managed Agents guide | Hosted, governed enterprise agents on Anthropic infra |
| OpenAI Agents SDK | Yes (client SDK) | OpenAI-first; bring-your-own-model is possible but second-class | Yes — Assistants / Responses API | OpenAI ecosystem integration and tool-use polish |
| LangGraph | Yes (MIT) | None — provider-neutral graph runtime | Yes — LangGraph Cloud | Composable agent graphs; full control over topology |
Read this table as a decision frame, not a leaderboard. If you are building a hosted enterprise agent and you want a single throat to choke for safety and governance, the Claude Agent SDK or OpenAI Agents SDK is a defensible default. If you want maximum topology control and you have the engineering time to spend, LangGraph wins on flexibility. The Cline SDK earns its place in the table because none of the others ship a production-tested IDE/CLI/TUI loop out of the box. For broader context on the agent-SDK race, see our deep dive on the agent-SDK wars.
A 60-second example: dispatch a coding agent
The SDK's surface is intentionally small — a runtime, a task descriptor, and a result stream. The pattern below is illustrative; check the package README for the canonical API before shipping.
```typescript
import { ClineAgent } from "@cline/sdk";

// Configure the runtime: provider credentials, target workspace, and allowed tools.
const agent = new ClineAgent({
  provider: "anthropic",
  model: "claude-opus-4-7",
  apiKey: process.env.ANTHROPIC_API_KEY,
  workspace: "/srv/repo/payments-service",
  tools: ["file-edit", "shell", "mcp"],
});

// Dispatch a single task with approvalMode set to "auto".
const run = await agent.dispatch({
  task: "Add a Postgres health check to the /healthz endpoint and a Vitest case for it.",
  approvalMode: "auto",
});

// Consume events as they arrive so a UI can surface tool calls, file changes, and messages.
for await (const event of run.stream()) {
  if (event.type === "tool-call") console.log("tool:", event.name, event.input);
  if (event.type === "file-change") console.log("changed:", event.path);
  if (event.type === "message") console.log(event.text);
}

// Wait for the final result of the run.
console.log("final:", await run.result());
```
That is the entire integration surface for the common case. You bring your provider credentials, point at a workspace, declare which tools the agent may use, and dispatch a task. Streamed events give you the hook for your own UI — render diffs, surface tool calls, prompt the user for approval, or wire to a webhook.
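One way to use the event stream in your own UI is to fold it into a small summary object that a panel or webhook can render. The sketch below assumes the three event shapes used in the illustrative example above (tool-call, file-change, message). Those shapes, and the AgentEvent, RunSummary and summarise names, are assumptions, not the SDK's published types.

```typescript
// Event shapes mirror the article's illustrative example; they are assumptions.
type AgentEvent =
  | { type: "tool-call"; name: string; input: unknown }
  | { type: "file-change"; path: string }
  | { type: "message"; text: string };

interface RunSummary {
  toolCalls: Record<string, number>; // how many times each tool was called
  changedFiles: string[];            // unique paths, in first-seen order
  lastMessage: string | null;
}

// Folds a list of events into a summary that a UI could render.
function summarise(events: readonly AgentEvent[]): RunSummary {
  const summary: RunSummary = { toolCalls: {}, changedFiles: [], lastMessage: null };
  for (const ev of events) {
    switch (ev.type) {
      case "tool-call":
        summary.toolCalls[ev.name] = (summary.toolCalls[ev.name] ?? 0) + 1;
        break;
      case "file-change":
        if (summary.changedFiles.indexOf(ev.path) === -1) summary.changedFiles.push(ev.path);
        break;
      case "message":
        summary.lastMessage = ev.text;
        break;
    }
  }
  return summary;
}

const sampleEvents: AgentEvent[] = [
  { type: "tool-call", name: "shell", input: "npm test" },
  { type: "file-change", path: "src/healthz.ts" },
  { type: "tool-call", name: "shell", input: "npm run lint" },
  { type: "file-change", path: "src/healthz.ts" },
  { type: "message", text: "Health check added." },
];
console.log(summarise(sampleEvents));
```

In a live integration the same fold would run incrementally inside the streaming loop rather than over a finished array.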
Two builders, two ways to ship this in the next 30 days
To make this concrete, here are two realistic uses we expect to see immediately — one Indian, one British. Both rely on the same property: the runtime is genuinely embeddable.
Bengaluru: a fintech compliance code reviewer
An Indian fintech with a 200-engineer organisation wraps @cline/sdk inside an internal CLI called compli. Every PR triggers a connector-driven run: the agent pulls the diff, cross-references it against the firm's RBI and DPDP-aligned coding standards (loaded as a system prompt), and posts a review on GitHub with concrete remediation suggestions. The agent team is two specialists — one for security-sensitive paths, one for personal-data flows — sharing a common scratchpad. Total engineering time to ship: a sprint, not a quarter.
Brighton: an Unreal Engine game-development assistant
A British indie studio embeds @cline/sdk inside its internal tooling for a small team working in Unreal Engine. The runtime drives a custom TUI tuned for C++ headers and Blueprints. Scheduled jobs run nightly to regenerate auto-bindings and open PRs for any mismatched signatures. The studio adds two connectors — one to its asset tracker, one to its level-design metadata service. The point is not novelty; it is that the team did not have to write a runtime to get there.
Where this slots into the open coding-agent landscape
Two other names matter when you are evaluating this space, and ignoring them would be dishonest.
Aider remains the most battle-tested free option — over 40,000 GitHub stars and roughly 4.1 million installs, Git-native with automatic commits per change, and a "bring your own model, zero markup" philosophy that has earned a loyal following. If your needs are personal productivity or a small team without ambition to ship a packaged product, Aider is still hard to beat.
Continue, meanwhile, has done something interesting in 2026 — it pivoted from being a coding assistant to being a CI quality-control platform, with open-source CLI agents that run on every pull request and enforce team-level rules. That is a different category of product from what Cline is now competing in, but worth flagging because builders comparing on outdated 2025 mental models will get the picture wrong.
Cline's SDK is therefore not a head-on competitor to either. It is the runtime layer that lets a builder ship a product in the space those two operate in, without first reinventing the wheel. Comparable closed-source surfaces (Claude Code's /autopilot mode, Cursor, Windsurf) give a useful frame: this is roughly what they ship, only the runtime is now open, embeddable, and provider-neutral.
What the SDK does not give you
An honest assessment: shipping an embedded agent product takes more than a runtime. The SDK does not, by itself, hand you any of the following, and your in-house product still needs them.
- Evaluation harness. Cline runs its own benches internally; you will need to build (or buy) one tuned to your task distribution. Without this you cannot meaningfully track regression when you swap models.
- Observability and tracing. The SDK emits events; you need a tracing pipeline to ingest them, plus retention, search and alerting. Most teams underestimate this until the first production incident.
- Safety guardrails specific to your domain. A coding agent in a regulated environment needs PII filters, secret-scanning on outputs, and policy checks on tool calls. Cline's own UI surfaces approvals; your product has to build the equivalent.
- Cost governance. Token spend on long-context runs can be eye-watering, even with a warm prompt cache. Without per-tenant budgets and rate limits, a single runaway task can drain a month's allowance in an afternoon.
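As an example of the guardrail work in the list above, a first pass at secret-scanning agent output can be a handful of regular expressions applied before text reaches a log or a user. This is a minimal sketch: the three patterns cover only a few widely documented credential formats, and SECRET_PATTERNS and redactSecrets are hypothetical names rather than anything the SDK provides.

```typescript
// Illustrative patterns only; production scanners cover far more credential formats.
const SECRET_PATTERNS: { label: string; pattern: RegExp }[] = [
  { label: "aws-access-key-id", pattern: /AKIA[0-9A-Z]{16}/g },
  { label: "github-token", pattern: /ghp_[A-Za-z0-9]{36}/g },
  { label: "private-key-header", pattern: /-----BEGIN [A-Z ]*PRIVATE KEY-----/g },
];

// Replaces every match with a labelled placeholder and reports what was found.
function redactSecrets(output: string): { text: string; findings: string[] } {
  const findings: string[] = [];
  let text = output;
  for (const { label, pattern } of SECRET_PATTERNS) {
    text = text.replace(pattern, () => {
      findings.push(label);
      return `[REDACTED:${label}]`;
    });
  }
  return { text, findings };
}

// The key below is a made-up string that only matches the format.
console.log(redactSecrets("export AWS_KEY=AKIAABCDEFGHIJKLMNOP"));
```

Pattern matching of this kind catches only known formats, so it complements policy checks on tool calls rather than replacing them.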
The SDK lowers the cost of building an agent product by a large multiple. It does not change the cost of running one safely. Treat the runtime as a great starting point, not a complete solution. Plan for evaluation, observability, guardrails and cost controls as first-class workstreams in your roadmap, not afterthoughts.
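For the cost-control workstream, the simplest useful building block is a per-tenant budget check that runs before each model call. The sketch below keeps spend in memory and uses a single monthly token limit. TenantBudget and its methods are hypothetical, and a real deployment would persist the counters and reset them on a billing cycle.

```typescript
// Hypothetical in-memory budget tracker; a real deployment would persist this state.
class TenantBudget {
  private spent = new Map<string, number>();
  private readonly monthlyLimitTokens: number;

  constructor(monthlyLimitTokens: number) {
    this.monthlyLimitTokens = monthlyLimitTokens;
  }

  // Records the spend and returns true only if the tenant stays within budget.
  tryCharge(tenant: string, tokens: number): boolean {
    const used = this.spent.get(tenant) ?? 0;
    if (used + tokens > this.monthlyLimitTokens) return false;
    this.spent.set(tenant, used + tokens);
    return true;
  }

  // Tokens the tenant may still spend this month.
  remaining(tenant: string): number {
    return this.monthlyLimitTokens - (this.spent.get(tenant) ?? 0);
  }
}

const budget = new TenantBudget(1000);
console.log(budget.tryCharge("tenant-a", 600)); // true
console.log(budget.tryCharge("tenant-a", 500)); // false: would exceed the limit
console.log(budget.remaining("tenant-a"));      // 400
```

Refusing the charge before the call, rather than reconciling afterwards, is what stops a runaway task from spending past the limit.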
The bigger picture: distribution layers are forming
Zoom out. In the past six months, the agent ecosystem has acquired three new layers it did not have before: a runtime layer (this announcement), a distribution layer (see GitHub's gh skill for a package-manager pattern), and a governance layer (hosted offerings from the major labs). For Indian and UK builders, the practical consequence is that you can now compose an agent product from off-the-shelf parts in a way that simply was not possible last year.
The new ceiling is no longer the runtime; it is the quality of your domain knowledge, the tightness of your evaluation loop, and the safety story you can put in front of a buyer. Those have always been the things that mattered. Cline's SDK release just makes them harder to hide behind a "we are still building the platform" excuse.
Final read
This is one of the more consequential open-source releases of the year for builders shipping coding agents. The runtime is real, production-tested, provider-neutral, and free. The benchmark numbers are credible. The companion CLI shows the runtime is not a toy. The trade-off is honest — you still own evaluation, observability, guardrails and cost control. For most Indian and UK teams that have been quietly stitching their own agent loop together since 2024, the right move this month is to spike the SDK against your real workload, measure the deltas, and decide whether to keep your bespoke loop or retire it. The opportunity cost of not doing that experiment is now larger than the cost of doing it.