AI Coding open-source · 9 Jun 2026 · 13 min read

Build Your First MCP Server: A 12-Step FastMCP Tutorial

MCP is now the universal agent tool protocol — adopted by Anthropic, OpenAI, Google, and Microsoft. This 12-step guide takes you from pip install fastmcp to a production-hardened server that works with Claude Desktop, Cursor, and GitHub Copilot simultaneously.

AI Tech Connect Editorial Published 9 Jun 2026

What MCP Is and Why Every AI Builder Needs It

Think of the Model Context Protocol as the USB standard for AI tools. Before USB, connecting a peripheral to a computer meant buying a device-specific driver, hoping the manufacturer had written one for your operating system, and resigning yourself to a reinstall if it broke. Before MCP, connecting an AI model to an external tool meant writing a bespoke integration for every model-tool pair: one connector for Claude and your database, a different one for GPT-4 and your database, a third for Gemini and your database.

MCP, created by Anthropic and released in November 2024, eliminates that matrix of integrations. You write one server, and any MCP-compatible client (Claude, Cursor, GitHub Copilot, OpenAI agents, Google AI Studio) can use it without modification. As of June 2026, the protocol has also been adopted by OpenAI, Microsoft, and Google, making it the de facto standard for agent tool connectivity across the industry.

The protocol has three primitives you need to understand before writing a single line of code:

Tools — callable functions the model can invoke. A tool takes typed inputs, performs an action (a web search, a database write, an API call), and returns a result. This is the most commonly used primitive.
Resources — content the model can read but not execute. Think of a resource as a document or data feed exposed at a URI. The model can request it to add context to a conversation.
Prompts — reusable, parameterised message templates. A prompt lets you package a sophisticated instruction set that the model or user can invoke by name, without repeating it in every conversation.

The right transport depends on where you are deploying:

Transport	Use case	Auth	Setup complexity
stdio	Local dev, Claude Desktop	None needed	Minutes
HTTP with TLS	Production web deployment	Bearer token / OAuth 2.1	Hours
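
One way to keep a single server file usable for both rows of the table is to pick the transport from the environment. The sketch below is plain Python and makes several assumptions: the variable names MCP_TRANSPORT, MCP_HOST, and MCP_PORT are hypothetical, TLS and authentication are assumed to be handled in front of the server, and passing the resulting dictionary to mcp.run(**kwargs) presumes your FastMCP version accepts transport, host, and port keyword arguments, which is worth checking against its documentation.

```python
import os
from typing import Mapping


def transport_kwargs(env: Mapping[str, str] = os.environ) -> dict:
    """Build keyword arguments for a server's run() call from environment variables.

    MCP_TRANSPORT, MCP_HOST and MCP_PORT are hypothetical names chosen for this sketch.
    """
    transport = env.get("MCP_TRANSPORT", "stdio").lower()
    if transport == "stdio":
        # Local development and Claude Desktop: no network settings needed
        return {"transport": "stdio"}
    if transport == "http":
        # Bind to localhost by default; TLS and auth are assumed to sit in front
        # of the server (reverse proxy or deployment platform)
        return {
            "transport": "http",
            "host": env.get("MCP_HOST", "127.0.0.1"),
            "port": int(env.get("MCP_PORT", "8000")),
        }
    raise ValueError(f"Unsupported MCP_TRANSPORT: {transport!r}")
```

With no variables set the function falls back to stdio, so the same file still works unchanged when a desktop client launches it.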

The practical payoff of the USB analogy is real: a single MCP server you build this week will work with Claude Desktop, Cursor, GitHub Copilot, and OpenAI agents simultaneously, today, with no changes to the server code when a new client adopts the protocol. That portability is what makes MCP worth learning now, at 0–2 years into your AI engineering career, rather than after the ecosystem has fragmented into incompatible proprietary APIs.

Steps 1–3: Install FastMCP and Initialise Your Server

FastMCP is a Python library that wraps the official mcp SDK (version 1.27.0 and above) and removes the boilerplate that would otherwise occupy the first hundred lines of every server file. It handles transport negotiation, capability advertising, JSON-RPC dispatch, and error serialisation. You write tool handlers; FastMCP handles everything else.

Step 1: Install

pip install fastmcp

This installs FastMCP and the official mcp Python SDK as a dependency. If you prefer to pin versions in a requirements.txt:

fastmcp>=2.0.0
mcp>=1.27.0

Step 2: Create and run the server

from fastmcp import FastMCP

# Create the server instance; the name identifies the server to clients
mcp = FastMCP("my-first-mcp-server")

# Start the server when the file is run directly
if __name__ == "__main__":
    mcp.run()  # stdio transport by default

Running python server.py at this point will start the server in stdio mode and wait for a client to connect. There are no tools or resources yet, but the server is fully protocol-compliant and will respond to capability negotiation from any MCP client.

FastMCP gives you several things the raw mcp SDK does not handle out of the box: automatic JSON schema generation from Python type annotations, sensible default error handling, a dev-friendly .run() entrypoint that selects stdio transport automatically, and a decorator API (@mcp.tool(), @mcp.resource(), @mcp.prompt()) that registers handlers without requiring you to subclass or configure a router. The raw SDK is useful when you need low-level control; FastMCP is the right starting point for almost everything else.

Step 3: Recommended project structure

my-mcp-server/
├── server.py          # Entry point — creates FastMCP instance, registers handlers
├── tools/
│   └── __init__.py    # Tool handler functions (imported into server.py)
├── requirements.txt
└── .env               # Secrets — never commit this file

Keeping tool handlers in a separate tools/ module pays off when the server grows beyond three or four tools. It also makes it straightforward to unit-test handlers in isolation without starting the MCP server process.
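
As a rough illustration of testing a handler in isolation, the sketch below defines a hypothetical word_count handler (not part of the tutorial's server) as a plain async function and exercises it with asyncio.run, with no MCP server or client involved. In a real project the function would live in tools/ and server.py would import it and apply the @mcp.tool() decorator.

```python
import asyncio


async def word_count(text: str) -> str:
    """Hypothetical handler that would live in tools/__init__.py."""
    if not text.strip():
        return "Error: text cannot be empty"
    return f"{len(text.split())} words"


def test_word_count() -> None:
    # asyncio.run drives the coroutine without any MCP server or client
    assert asyncio.run(word_count("one two three")) == "3 words"
    assert asyncio.run(word_count("   ")) == "Error: text cannot be empty"


if __name__ == "__main__":
    test_word_count()
    print("handler tests passed")
```

The same test function can be collected by pytest if you already use it; nothing in the sketch depends on a test framework.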

Steps 4–6: Declare and Implement Tools

Tools are the heart of an MCP server. They are what the language model invokes when a user asks it to do something that requires external access — searching the web, reading a file, calling an API, writing to a database. A well-designed tool is narrow in scope, clearly documented in its docstring, and defensive about its inputs.

Step 4: The `@mcp.tool()` decorator

Any function decorated with @mcp.tool() is automatically registered as an MCP tool; this tutorial uses async def throughout, for the reasons given in Step 6. FastMCP reads the function signature and type annotations to generate the JSON schema that the model uses to decide what arguments to pass.
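
To make the signature-to-schema idea concrete, here is a deliberately simplified sketch written with the standard library only. It is not FastMCP's implementation, which handles far more types; sketch_input_schema is a hypothetical helper that only illustrates how parameter names, annotations, and defaults can be turned into a JSON-Schema-like dictionary.

```python
import inspect
from typing import get_type_hints

# Rough mapping from Python annotations to JSON Schema type names
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_input_schema(func) -> dict:
    """Derive a simplified JSON-Schema-like dict from a function signature."""
    hints = get_type_hints(func)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch
        prop = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the caller must supply it
        else:
            prop["default"] = param.default
        properties[name] = prop
    return {"type": "object", "properties": properties, "required": required}


async def web_search(query: str, max_results: int = 5) -> str:
    """Stub with the same signature as the tutorial's search tool."""
    return ""


schema = sketch_input_schema(web_search)
print(schema)
```

The practical consequence is that accurate type annotations and sensible defaults are part of the tool's public interface, not just hints for your editor.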

Step 5: Input validation

Validate all inputs at the top of each handler and return a descriptive error string if validation fails. Avoid letting Python exceptions escape the handler; the warning after the example explains why.

Step 6: Always use async handlers

A FastMCP server runs its async handlers on a single asyncio event loop. A synchronous blocking call inside a handler (a slow HTTP request, a database query) stalls that loop, and with it every other request, until the call completes. Define tool handlers as async def and use await for any I/O. If you must call a synchronous library, wrap it with asyncio.to_thread().

from fastmcp import FastMCP

mcp = FastMCP("research-assistant")

@mcp.tool()
async def web_search(query: str, max_results: int = 5) -> str:
    """Search the web and return a summary of results.

    Args:
        query: The search query string
        max_results: Maximum number of results to return (1-10)
    """
    if not query.strip():
        return "Error: query cannot be empty"
    if not 1 <= max_results <= 10:
        return "Error: max_results must be between 1 and 10"

    # Replace with a real search API such as Brave Search or SerpAPI
    return f"Top {max_results} results for '{query}': [results here]"


@mcp.tool()
async def summarise_text(text: str, max_words: int = 100) -> str:
    """Summarise a block of text to the given word limit.

    Args:
        text: The text to summarise
        max_words: Maximum word count for the summary (default 100)
    """
    if not text.strip():
        return "Error: text cannot be empty"
    words = text.split()
    if len(words) <= max_words:
        return text
    return " ".join(words[:max_words]) + "..."


if __name__ == "__main__":
    mcp.run()

Warning

Return a descriptive error string from tool handlers instead of letting a Python exception escape. How an uncaught exception reaches the client depends on the FastMCP version and on the client: at best the model sees a terse, generic tool error, and at worst the user sees a cryptic failure. An error string you wrote yourself tells the model what went wrong, so it can retry with corrected inputs or explain the failure to the user.

Notice that both handlers are async def and both validate inputs before doing any work. The docstrings are not decoration — FastMCP includes them in the tool schema that is sent to the client, and the language model reads them to understand what the tool does and when to call it. A clear, accurate docstring is as important as correct code.

Steps 7–9: Resources and Prompt Templates

Resources and prompts are optional primitives, but they are worth understanding because they enable patterns that tools cannot support cleanly on their own.

Step 7: Static and dynamic resources

A resource is content that the model can request and read — a file, a database record, a rendered report. Resources are identified by URI. A static resource has a fixed URI; a dynamic resource uses a URI template with placeholders that the client fills in.

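import asyncio
import json
from pathlib import Path

@mcp.resource("docs://readme")
async def get_readme() -> str:
    """The project README as a resource."""
    # Read in a worker thread so file I/O does not block the event loop
    return await asyncio.to_thread(Path("README.md").read_text, encoding="utf-8")


@mcp.resource("data://users/{user_id}")
async def get_user(user_id: str) -> str:
    """Fetch user data by ID."""
    # Replace with a real database lookup
    # json.dumps escapes quotes and other special characters in user_id
    return json.dumps({"id": user_id, "name": "Example User"})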
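
To show what "the client fills in the placeholders" means in practice, the following sketch matches a concrete URI against a template and extracts the placeholder values. It is an illustration using the standard library, not FastMCP's routing code; match_uri_template is a hypothetical helper, and it assumes each placeholder stands for exactly one path segment.

```python
import re
from typing import Optional


def match_uri_template(template: str, uri: str) -> Optional[dict]:
    """Return the placeholder values if uri fits template, else None."""
    pattern = ""
    # Split on {name} placeholders, keeping the placeholders in the result
    for part in re.split(r"(\{\w+\})", template):
        if part.startswith("{") and part.endswith("}"):
            # A placeholder matches one path segment (anything except "/")
            pattern += f"(?P<{part[1:-1]}>[^/]+)"
        else:
            pattern += re.escape(part)
    match = re.fullmatch(pattern, uri)
    return match.groupdict() if match else None


print(match_uri_template("data://users/{user_id}", "data://users/42"))
```

The extracted values arrive as strings, which is why get_user declares user_id as str and would need to convert it before a numeric lookup.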

Step 8: Prompt templates

A prompt is a reusable, parameterised instruction set. It is useful in Claude Desktop workflows where a user wants to invoke a structured research or analysis process by name, without having to retype the full prompt structure every session.

@mcp.prompt()
def research_prompt(topic: str) -> str:
    """A structured research prompt template."""
    return f"""You are a research assistant. Analyse the following topic thoroughly:

Topic: {topic}

Please provide:
1. A brief overview (2-3 sentences)
2. Key facts and figures
3. Implications for AI builders in India and the UK
4. Three actionable next steps"""

Step 9: When to use each primitive

Most production MCP servers expose only tools. Resources are useful when you want to give the model access to large bodies of content (documentation, codebases, data files) that it reads rather than modifies. Prompts are most useful for power users who want reusable workflows in Claude Desktop. If you are building your first server, start with tools and add resources and prompts only when you have a specific need for them.

Builder note

Resources and prompts are not surfaced by all MCP clients. As of June 2026, tool support is universal across Claude Desktop, Cursor, and Copilot; resource and prompt support varies by client version. Design your core functionality as tools, and treat resources and prompts as enhancements for clients that support them.

Steps 10–12: Harden for Production

A server that works on your laptop is not automatically ready for production. Steps 10 through 12 address the three most common gaps between a working prototype and a server you can deploy with confidence: secret management, rate limiting, and structured logging.

Step 10: Secrets from environment variables, never from code

Any API key, database password, or bearer token that appears as a string literal in your source code will eventually be committed to version control and exposed. The pattern is straightforward: use os.environ.get() and fail fast at startup if a required variable is absent.
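
When a server needs more than one secret, it can help to check them all at once so that startup fails with a single complete message. The helper below is a minimal sketch of that idea; load_required_env is a hypothetical name, and DATABASE_URL in the usage note is only an example variable.

```python
import os
from typing import Iterable, Mapping


def load_required_env(names: Iterable[str], env: Mapping[str, str] = os.environ) -> dict:
    """Return the requested variables, or raise listing every one that is missing."""
    values = {name: env.get(name, "") for name in names}
    # Treat unset and empty variables the same way
    missing = sorted(name for name, value in values.items() if not value)
    if missing:
        # Report all missing names at once so startup fails with one clear message
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return values
```

Called at module level, for example load_required_env(["SEARCH_API_KEY", "DATABASE_URL"]), it stops the server before any client connects if either variable is absent.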

Step 11: Rate limiting to protect downstream APIs

Your MCP tools are callable by any model session that connects to your server. Without rate limiting, a runaway agent loop or a misconfigured retry budget can exhaust your downstream API quota in minutes. The listing after Step 12 uses a simple in-memory sliding-window limiter, which is adequate for single-process servers. For multi-process deployments, use Redis-backed rate limiting.
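
The full listing further down keeps a list of recent call timestamps per tool, which is a sliding-window approach. For comparison, a token bucket can be sketched as follows: it allows short bursts up to a fixed capacity and refills at a steady rate. The TokenBucket class and its parameter names are illustrative, not part of FastMCP, and the injectable clock exists only to make the behaviour easy to test.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `refill_per_sec` tokens per second."""

    def __init__(self, capacity: int, refill_per_sec: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._clock = clock  # injectable so tests can control time
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self) -> bool:
        """Consume one token if available; return False when the bucket is empty."""
        now = self._clock()
        # Add the tokens earned since the last call, capped at capacity
        earned = (now - self._last) * self.refill_per_sec
        self._tokens = min(self.capacity, self._tokens + earned)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

A handler would hold one bucket per tool and return an error string when allow() is False. Like the sliding-window version, this state lives in one process only.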

Step 12: Structured logging for debugging in production

Print statements disappear into the void in production. Structured logging with consistent field names gives you the ability to query, filter, and alert on your server's behaviour after deployment.

import os
import time
import logging
from collections import defaultdict
from fastmcp import FastMCP

mcp = FastMCP("production-mcp-server")

# Step 10: Environment variables, never hardcoded secrets
API_KEY = os.environ.get("SEARCH_API_KEY")
if not API_KEY:
    # Fail at import time so a missing key is caught at startup
    raise ValueError("SEARCH_API_KEY environment variable not set")

# Step 11: Rate limiting (simple in-memory sliding window)
_call_times: dict[str, list[float]] = defaultdict(list)

def rate_limit(tool_name: str, max_calls: int = 10, window_secs: int = 60) -> bool:
    """Return True if the call is within limits, False if the rate limit is exceeded."""
    now = time.monotonic()  # monotonic clock is unaffected by system clock changes
    # Keep only the calls that fall inside the rolling window
    recent = [t for t in _call_times[tool_name] if now - t < window_secs]
    _call_times[tool_name] = recent
    if len(recent) >= max_calls:
        return False
    recent.append(now)
    return True

@mcp.tool()
async def rate_limited_search(query: str) -> str:
    """Search the web with rate limiting (10 calls per 60 seconds).

    Args:
        query: The search query string
    """
    if not query.strip():
        return "Error: query cannot be empty"
    if not rate_limit("search"):
        return "Error: rate limit exceeded — please try again in 60 seconds"
    # Replace with a real search implementation
    return f"Results for '{query}': [search results here]"


# Step 12: Structured logging
# basicConfig logs to stderr, which keeps stdout free for the stdio transport
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
logger = logging.getLogger(__name__)

@mcp.tool()
async def logged_tool(input_text: str) -> str:
    """Process text with structured logging for production debugging.

    Args:
        input_text: The text to process
    """
    # Fields passed via extra= only appear if the formatter references them,
    # so the key=value pairs are written into the message itself
    logger.info("tool_called tool=logged_tool input_length=%d", len(input_text))
    if not input_text.strip():
        logger.warning("empty_input tool=logged_tool")
        return "Error: input_text cannot be empty"
    result = f"Processed: {input_text}"
    logger.info("tool_returned tool=logged_tool result_length=%d", len(result))
    return result


if __name__ == "__main__":
    mcp.run()

The rate limiter above stores call timestamps per tool name in a module-level dictionary. Because handlers run on a single-threaded asyncio event loop and rate_limit() contains no await, each call runs to completion without interleaving, so a single-process server has no race on that dictionary. The limit is shared by every session connected to that process. The if not API_KEY: raise ValueError(...) pattern at module level is intentional: it causes the server to fail immediately at startup with a clear error message, rather than failing silently on the first tool call that needs the key.
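
The format string in the listing produces plain text lines. If your log pipeline expects one JSON object per line, a custom formatter is one option. The sketch below uses only the standard library; JsonFormatter and the logger name mcp_server_example are illustrative choices, and the output field names are assumptions rather than any required schema.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    # Attributes every LogRecord has; anything else came from extra={...}
    _STANDARD = set(vars(logging.LogRecord("", 0, "", 0, "", (), None))) | {"message", "asctime"}

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Copy over the fields passed via extra=
        for key, value in vars(record).items():
            if key not in self._STANDARD:
                payload[key] = value
        return json.dumps(payload, default=str)


handler = logging.StreamHandler()  # defaults to stderr, leaving stdout for the stdio transport
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("mcp_server_example")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False  # avoid duplicate lines if the root logger is also configured

logger.info("Tool called", extra={"tool": "logged_tool", "input_length": 42})
```

With this formatter, values passed through extra= become top-level JSON keys, which is what makes them queryable in a log aggregator.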

Connect to Claude Desktop and Cursor

With your server running locally over stdio transport, connecting it to Claude Desktop and Cursor each requires a single JSON configuration change. Both clients read this configuration when they start, so restart Claude Desktop or reload the Cursor window after editing it.

Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json on macOS (or %APPDATA%\Claude\claude_desktop_config.json on Windows) and add your server to the mcpServers object:

{
  "mcpServers": {
    "my-research-assistant": {
      "command": "python",
      "args": ["/absolute/path/to/your/server.py"],
      "env": {
        "SEARCH_API_KEY": "your-api-key-here"
      }
    }
  }
}

Use the absolute path to your server.py — Claude Desktop spawns the server as a child process and does not inherit your shell's working directory. The env block injects environment variables into the server process; this is the correct way to pass API keys to a local stdio server without hardcoding them.
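
If you would rather generate the entry than type the paths by hand, a small script can do it. The sketch below is one possible approach, not an official tool: desktop_entry is a hypothetical helper, and using sys.executable as the command is a design choice that points Claude Desktop at the interpreter running the script (for example a virtual environment with fastmcp installed) rather than whichever python is first on PATH.

```python
import json
import sys
from pathlib import Path


def desktop_entry(server_script: str, env: dict) -> dict:
    """Build one mcpServers entry using absolute paths for interpreter and script."""
    return {
        # The interpreter running this script, e.g. a virtualenv's python
        "command": sys.executable,
        # resolve() turns a relative path into an absolute one
        "args": [str(Path(server_script).resolve())],
        "env": env,
    }


config = {
    "mcpServers": {
        "my-research-assistant": desktop_entry(
            "server.py", {"SEARCH_API_KEY": "your-api-key-here"}
        )
    }
}
print(json.dumps(config, indent=2))
```

Run it from the directory that contains server.py and merge the printed object into claude_desktop_config.json, replacing the placeholder key with your own.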

Cursor

Create a .cursor/mcp.json file in your project root:

{
  "mcpServers": {
    "research-assistant": {
      "command": "python",
      "args": ["server.py"],
      "env": {}
    }
  }
}

Cursor resolves relative paths from the project root, so server.py works if the file is in the root. For a server in a subdirectory, use the relative path: "args": ["mcp/server.py"].

After saving the configuration, restart Claude Desktop or reload the Cursor window. Your tools will appear in the model's tool list. To verify the connection is working, ask the model directly: "What tools do you have available?" — it should list the tool names from your server.

Pro tip

Test with Claude Desktop before wiring to production agents. The Desktop app gives you a visual confirmation that your tools loaded correctly: click the plug icon in the conversation view to see your connected MCP servers and the list of tools they expose. If your tools are not showing, check the Claude Desktop logs at ~/Library/Logs/Claude/ on macOS for connection errors from the server process.

For GitHub Copilot, the MCP configuration lives in your VS Code settings. Open the command palette, search for "MCP: Add Server", and follow the prompts; it writes the equivalent JSON to your VS Code settings.json under the mcp.servers key. Each server entry takes the same command, args, and env fields as the Cursor format, nested under servers rather than mcpServers. OpenAI's Responses API supports MCP servers over HTTP transport; for that integration path, see the MCP in production deployment guide.

Common Pitfalls

The following six mistakes account for the vast majority of issues builders encounter when moving from a working prototype to a stable MCP server. Most are invisible during local testing and only surface under real usage conditions.

Pitfall	Symptom	Fix
Blocking sync calls	MCP hangs on slow requests; the client connection times out	Use `async`/`await` throughout; wrap sync library calls with `asyncio.to_thread()`
Hardcoded secrets	Credentials in version control; security incident on first repo push	Always use `os.environ.get()` and a `.env` file — never commit `.env`
Raising Python exceptions	Model receives a terse or generic error; some clients drop the session	Return descriptive error strings from every tool handler
No input validation	Server crashes on unexpected input; protocol error sent to client	Validate all inputs at the top of each handler before any business logic
Missing pagination on resources	Large resources crash clients or exhaust memory	Implement cursor-based pagination for any resource larger than 10 KB
Ignoring protocol versioning	Tools disappear after an SDK update; protocol negotiation fails	Pin your `fastmcp` version in `requirements.txt`; test SDK upgrades in a separate environment before deploying

The blocking sync call pitfall is the one most builders encounter first. A common pattern is using the requests library for HTTP calls inside a tool handler. requests is synchronous and will block the entire asyncio event loop for the duration of the network call. Replace it with httpx (async-native) or wrap the call: result = await asyncio.to_thread(requests.get, url).
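
The effect of asyncio.to_thread can be seen without any network access. In the sketch below, blocking_fetch is a stand-in for a synchronous call such as requests.get, simulated with time.sleep, and heartbeat is a hypothetical coroutine that keeps ticking while the blocking work runs in a worker thread.

```python
import asyncio
import time


def blocking_fetch(url: str) -> str:
    """Stand-in for a synchronous client call such as requests.get(url)."""
    time.sleep(0.2)  # simulates network latency while holding the thread
    return f"response from {url}"


async def heartbeat(ticks: list) -> None:
    """Record a tick every 50 ms to show the event loop is still running."""
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(time.monotonic())


async def main() -> tuple:
    ticks: list = []
    # The blocking call runs in a worker thread while the heartbeat keeps ticking
    result, _ = await asyncio.gather(
        asyncio.to_thread(blocking_fetch, "https://example.com"),
        heartbeat(ticks),
    )
    return result, len(ticks)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

If blocking_fetch were called directly inside a coroutine instead, the heartbeat could not run until the sleep finished, which is the hang described in the table.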

Protocol versioning deserves a special mention in mid-2026 because the MCP 2026-07 release candidate is imminent. The RC introduces a stateless protocol core and a new extensions framework. FastMCP will ship a compatibility release, but servers pinned to older versions should be tested against the new client versions before the RC becomes the stable release. The MCP roadmap guide covers what is changing in the 2026-07 RC in detail.

Pro tip

Add a health tool that returns a dictionary with your server version, the current time, and the status of each downstream dependency (database reachable, API key present). It takes ten minutes to write and saves hours of debugging when something stops working in production. The model can call health at the start of a session to verify the server is correctly configured before attempting any substantive tool call.

Five MCP Server Ideas That Look Good on an AITC Profile

Building a working MCP server is the kind of concrete, demonstrable project that differentiates an AI builder's profile from a list of certifications. Each idea below solves a real problem, can be completed in a weekend, and is immediately useful to the people who will browse your profile. All five are open-source friendly — a public GitHub repo with the server code and a clear README is the profile artefact.

Company knowledge base search. Connect Claude or Cursor to your team's Notion workspace, Confluence instance, or a folder of Markdown files. Expose a search_docs(query) tool that performs keyword or semantic search and returns the top five matching passages. This is the most immediately useful MCP server for teams, and the one most likely to generate real usage data you can cite on your profile.
GitHub PR summariser. A tool that accepts a GitHub PR URL, fetches the diff and description using the GitHub API, and returns a structured summary: what changed, why, and what to look for in review. Useful for onboarding, async code review, and catching breaking changes. Demonstrates your ability to integrate with external APIs and handle potentially large inputs (long diffs) gracefully.
AI news aggregator. A set of tools that pull from RSS feeds (Hacker News, ArXiv abstracts, AI Tech Connect), deduplicate, and surface the three most relevant items for a given topic. Demonstrates feed parsing, relevance filtering, and the kind of information-gathering capability that makes AI assistants genuinely useful in a daily workflow.
Local file semantic search. Index a folder of documents (PDFs, Markdown files, code) using a local embedding model (sentence-transformers, Ollama) and expose a search_files(query) tool. No external API required — everything runs locally. This demonstrates understanding of the full RAG stack in a self-contained project. See the LangGraph tutorial for patterns that pair well with semantic search tools.
Personalised code reviewer. A tool that accepts a code snippet and a style guide (as a resource or hardcoded rules), and returns a structured review: style violations, potential bugs, and improvement suggestions. Pair it with a get_style_guide() resource that returns your team's conventions. This demonstrates the interplay between tools and resources in a realistic workflow.

Built one of these? Add it to your AI Tech Connect profile as a project. Include the tech stack (FastMCP, MCP, Python), a link to the repository, and a one-line description of what problem it solves. Browse existing Builder profiles to see how others frame their projects, and claim your own profile while early spots are open. Founding Builder profiles — fewer than 500 spots — carry a permanent badge and early access to inbound enquiries from companies hiring AI engineers across India and the UK.

For context on where MCP sits in a broader agent architecture decision, the agent stack decision guide published today walks through when to reach for MCP tools versus a full orchestration framework like LangGraph. The Agent SDK guide is a useful companion for understanding how MCP tools wire into Anthropic's managed agent harness.

The Microsoft MCP curriculum at github.com/microsoft/mcp-for-beginners is an excellent supplementary resource once you have your first server running — it covers enterprise deployment patterns that go beyond the scope of this tutorial.

Frequently asked

Do I need the raw mcp SDK or is FastMCP sufficient for production?

FastMCP is a thin layer on top of the official mcp Python SDK. It handles the protocol plumbing — transport negotiation, capability advertising, JSON-RPC dispatch — so you can focus on writing tool handlers. For the vast majority of MCP servers, FastMCP is entirely sufficient in production. The raw SDK is only necessary if you need to customise low-level protocol behaviour, such as implementing a non-standard transport or overriding capability negotiation. Start with FastMCP and drop down to the SDK only if you hit a specific ceiling.

Which AI tools and platforms support MCP as of mid-2026?

As of June 2026, MCP is supported by Anthropic, which released it in November 2024, and by OpenAI, Microsoft (Copilot and Azure AI Foundry), and Google (Gemini and Google AI Studio), which announced support in spring 2025. Client implementations exist in Claude Desktop, Cursor, GitHub Copilot, VS Code (via the MCP extension), and OpenAI's Responses API agent framework. Microsoft has also published an open MCP curriculum at github.com/microsoft/mcp-for-beginners. In practice, a single MCP server you write today will work across all of these platforms without modification.

How should I handle secrets and API keys in a FastMCP server?

Always load secrets from environment variables using os.environ.get(), never from hardcoded strings or config files that could be committed to version control. Use a .env file locally and add .env to your .gitignore. In production (HTTP transport), inject secrets via your deployment platform's secrets manager — AWS Secrets Manager, GCP Secret Manager, or Doppler. In Claude Desktop configuration, the env block in claude_desktop_config.json is the correct place to pass environment variables to your server process; the values in that file live on your local machine and are never transmitted to Anthropic.

Why should tool handlers return error strings rather than raise Python exceptions?

The MCP protocol uses JSON-RPC for communication between the client and server. When a Python exception escapes a tool handler, FastMCP and the underlying MCP SDK catch it and report an error to the client, but the message is whatever the exception happened to carry, and clients differ in how they surface it; some treat it as a hard failure. Returning a descriptive error string (for example, 'Error: query cannot be empty') keeps the outcome predictable and gives the language model the information it needs to retry with corrected inputs or explain the failure to the user. Reserve exceptions for genuinely unrecoverable conditions, such as a missing required secret at startup, where you want the server process itself to stop.

What is the MCP 2026 roadmap and how does it affect servers I build today?

The MCP 2026-07 release candidate introduces four key changes: a stateless protocol core (existing stateful servers continue to work via a compatibility layer), an extensions framework for vendor-specific capabilities, MCP Apps for server-rendered UI components that surface inside client applications, and hardened authorisation (OAuth 2.1 with PKCE is now the standard for HTTP-transport servers). Servers built with FastMCP today will continue to work after the RC because the core tool, resource, and prompt primitives are unchanged. The main migration task will be adopting the new authorisation model if you are running HTTP-transport servers that currently use ad-hoc bearer tokens.
