Building an MCP Server in 2026: Complete Tutorial + Koordex Case Study

TL;DR: Model Context Protocol (MCP) is the 2026 standard for exposing tools to LLMs. Build an MCP server in 50 lines of Python: define tool functions, decorate them, run the server. Claude, Cursor, and custom agents then discover and call your tools through a unified interface. This guide walks through the build, then shows how Internative uses MCP servers in Koordex (our AI operations layer) to expose enterprise systems safely to agentic AI.

If you're building anything agentic in 2026 — autonomous workflows, AI assistants over internal systems, custom Claude integrations — you'll write MCP servers. MCP went from "interesting Anthropic protocol" in late 2024 to "the way LLMs talk to tools" by mid-2026.

This guide is the complete tutorial. We'll build a working MCP server from zero, explain the architecture decisions you'll face, and walk through a real Koordex deployment that exposes 40+ enterprise tools to agents through one MCP layer.

By the end you'll have running code, the production patterns we use at scale, and a list of the most common mistakes to avoid.

What MCP Actually Is

Model Context Protocol (MCP) is an open standard that defines how LLMs discover, request, and receive results from external tools.

Before MCP (2023-2024):

Each LLM SDK had its own tool definition format
Switching from OpenAI to Anthropic required rewriting tool schemas
Tool implementations lived inside the AI application, not as services
Scaling past 5-10 tools became unmanageable

With MCP (2025+):

Tools defined once in MCP-compatible servers
Any MCP-compatible client (Claude, Cursor, custom agents) can use them
Standard discovery (list_tools), invocation (call_tool), result format
Tools are services — independently deployed, governed, audited

Think of MCP as "REST for LLM tools." Universal protocol, language-agnostic, deployment-flexible.
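On the wire, MCP messages are JSON-RPC 2.0, and discovery and invocation map to the `tools/list` and `tools/call` methods. The sketch below builds those two requests by hand to show their shape; `make_request` is a helper invented for this example, and in practice an SDK builds and sends these for you.

```python
import json
from itertools import count
from typing import Optional

_next_id = count(1)  # JSON-RPC request ids; any unique value works


def make_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope (hypothetical helper)."""
    message = {"jsonrpc": "2.0", "id": next(_next_id), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Discovery: ask the server which tools it offers.
list_request = make_request("tools/list")

# Invocation: call one tool by name with JSON arguments.
call_request = make_request(
    "tools/call",
    {"name": "get_weather", "arguments": {"city": "Istanbul"}},
)

print(json.dumps(list_request))
print(json.dumps(call_request))
```

The `list_tools` and `call_tool` names used in this guide are the Python SDK's handlers for these two methods.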

The Build — 50 Lines, Working Server

We'll build an MCP server that exposes 3 tools: get current weather, query a SQL database, send a Slack message.

Setup

```bash
pip install mcp
```

Server code: `weather_db_slack_server.py`

```python
import asyncio

from mcp.server import Server, NotificationOptions
from mcp.server.models import InitializationOptions
import mcp.server.stdio
import mcp.types as types

server = Server("internative-tools")


@server.list_tools()
async def handle_list_tools() -> list[types.Tool]:
    return [
        types.Tool(
            name="get_weather",
            description="Get current weather for a city",
            inputSchema={
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"]
            }
        ),
        types.Tool(
            name="query_db",
            description="Run a read-only SQL query against the analytics warehouse",
            inputSchema={
                "type": "object",
                "properties": {
                    "sql": {"type": "string", "description": "SELECT-only SQL"}
                },
                "required": ["sql"]
            }
        ),
        types.Tool(
            name="send_slack",
            description="Send a message to a Slack channel",
            inputSchema={
                "type": "object",
                "properties": {
                    "channel": {"type": "string"},
                    "message": {"type": "string"}
                },
                "required": ["channel", "message"]
            }
        ),
    ]


@server.call_tool()
async def handle_call_tool(name: str, arguments: dict) -> list[types.TextContent]:
    if name == "get_weather":
        # In real code: call OpenWeatherMap or similar
        result = f"Weather in {arguments['city']}: 18°C, partly cloudy"
    elif name == "query_db":
        # In real code: SQLAlchemy with read-only credentials + SQL validator
        result = f"Query result: [3 rows] {arguments['sql'][:60]}..."
    elif name == "send_slack":
        # In real code: Slack SDK with audit logging
        result = f"Sent to {arguments['channel']}: {arguments['message'][:50]}..."
    else:
        result = f"Unknown tool: {name}"
    return [types.TextContent(type="text", text=result)]


async def main():
    async with mcp.server.stdio.stdio_server() as (read_stream, write_stream):
        await server.run(
            read_stream,
            write_stream,
            InitializationOptions(
                server_name="internative-tools",
                server_version="0.1.0",
                capabilities=server.get_capabilities(
                    notification_options=NotificationOptions(),
                    experimental_capabilities={}
                )
            )
        )


if __name__ == "__main__":
    asyncio.run(main())
```

That's it. Working MCP server. Any MCP-compatible client can now:

Connect to the server
Call list_tools() to discover the 3 tools
Call call_tool("get_weather", {"city": "Istanbul"}) and get a response
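Those three steps can be sketched with the client half of the same Python SDK. This is a minimal example, assuming the `mcp` package is installed and the server file from above exists at the path you pass in; `discover_and_call` is a name chosen for this illustration.

```python
import asyncio


async def discover_and_call(server_path: str) -> str:
    """Spawn the server over stdio, list its tools, and call get_weather."""
    # Imported here so this file can be loaded without the SDK installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(command="python", args=[server_path])
    async with stdio_client(params) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()  # protocol handshake

            listing = await session.list_tools()
            print("Tools:", [tool.name for tool in listing.tools])

            result = await session.call_tool(
                "get_weather", arguments={"city": "Istanbul"}
            )
            # Content items are typed; the tutorial server returns text only.
            return result.content[0].text


# Usage (the path is a placeholder):
# print(asyncio.run(discover_and_call("/absolute/path/to/weather_db_slack_server.py")))
```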

Connect from Claude Desktop

In Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

```json
{
  "mcpServers": {
    "internative-tools": {
      "command": "python",
      "args": ["/absolute/path/to/weather_db_slack_server.py"]
    }
  }
}
```

Restart Claude Desktop. Your 3 tools now appear in any conversation — Claude can call them autonomously.

Connect from a custom agent (using LangGraph)

```python
import asyncio

from langchain_mcp_adapters.client import MultiServerMCPClient


async def main():
    client = MultiServerMCPClient({
        "internative-tools": {
            "command": "python",
            "args": ["/path/to/weather_db_slack_server.py"],
            "transport": "stdio"
        }
    })

    tools = await client.get_tools()
    # Now `tools` is a list of LangChain Tool objects you can bind to any LLM
    return tools


if __name__ == "__main__":
    # `await` only works inside a coroutine, so run the setup via asyncio
    asyncio.run(main())
```

The same MCP server works across Claude, Cursor, custom Python agents, custom Node agents — write once, use everywhere.

Architecture Decisions You'll Hit

Transport: stdio vs HTTP vs SSE

stdio (this tutorial): server runs as subprocess. Best for desktop integrations (Claude Desktop, Cursor) or co-located services.
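HTTP (called Streamable HTTP in the current spec): server is a web service. Best for centralized tool servers used by multiple agents/clients.
SSE (Server-Sent Events): streaming over HTTP. The standalone HTTP+SSE transport was deprecated in the 2025-03-26 spec revision; Streamable HTTP can still stream responses via SSE, which suits long-running tools that produce incremental output.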

For Koordex production we use HTTP transport because our tool servers run as Kubernetes deployments serving many agent instances.

Tool Granularity

Wrong: one MCP server with 50 unrelated tools. Right: 3-5 MCP servers, each grouped by domain (database tools, communication tools, file tools).

Why: easier permissions, easier audit, easier deployment lifecycle.

Permission Model

MCP itself doesn't enforce per-tool permissions: the server decides what to list, and the client application decides what to hand to the model. For production:

Wrap your MCP server in an auth layer (HTTP transport + JWT or mTLS)
Filter list_tools response per caller identity
Validate call_tool arguments against caller's permission set
Log every tool invocation for audit
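A minimal sketch of the filtering and argument checks, assuming the auth layer has already verified who the caller is. `Caller`, `visible_tools`, and `authorize_call` are names invented for this example and are not part of MCP or its SDKs.

```python
from dataclasses import dataclass, field

ALL_TOOLS = ["get_weather", "query_db", "send_slack"]


@dataclass
class Caller:
    """Identity resolved by the auth layer, e.g. from a verified JWT."""
    caller_id: str
    allowed_tools: frozenset
    # Optional constraints: tool name -> argument name -> allowed values
    allowed_args: dict = field(default_factory=dict)


def visible_tools(caller: Caller) -> list:
    """Filter the list_tools response down to what this caller may see."""
    return [name for name in ALL_TOOLS if name in caller.allowed_tools]


def authorize_call(caller: Caller, tool: str, arguments: dict) -> None:
    """Raise PermissionError unless the caller may make this exact call."""
    if tool not in caller.allowed_tools:
        raise PermissionError(f"{caller.caller_id} may not call {tool}")
    for arg_name, allowed in caller.allowed_args.get(tool, {}).items():
        if arguments.get(arg_name) not in allowed:
            raise PermissionError(
                f"{caller.caller_id} may not use {arg_name}={arguments.get(arg_name)!r}"
            )


# Example identity: may check weather and post to one Slack channel only.
support = Caller(
    caller_id="support-agent-1",
    allowed_tools=frozenset({"get_weather", "send_slack"}),
    allowed_args={"send_slack": {"channel": {"#support"}}},
)

print(visible_tools(support))
```

Checking again at call time matters even when discovery is filtered, because a caller can still send a `call_tool` request for a tool name it was never shown.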

Tool Implementation Standards

For every production tool:

Input validation: don't trust the LLM's argument formatting. Validate types, ranges, allowed values.
Idempotency: where possible, tool calls should be safe to retry. Add idempotency keys for mutating ops.
Timeouts: set per-tool timeouts. LLM-driven workflows can fire many tool calls in a row, and one hung call stalls the whole chain.
Rate limits: per-caller and per-tool rate limits prevent runaway agents from causing incidents.
Audit logging: who called what tool with what arguments and what was returned. JSON-line format.
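Three of these standards (input validation, rate limits, timeouts) fit in one small wrapper. The sketch below uses only the standard library; `validate_args`, `SlidingWindowLimiter`, and `guarded_call` are illustrative names, and the type-map `spec` is a simplification of real JSON Schema validation.

```python
import asyncio
import time
from collections import defaultdict, deque


def validate_args(arguments: dict, spec: dict) -> dict:
    """Check required keys and types; reject unexpected keys."""
    unknown = set(arguments) - set(spec)
    if unknown:
        raise ValueError(f"unexpected arguments: {sorted(unknown)}")
    for key, expected_type in spec.items():
        if key not in arguments:
            raise ValueError(f"missing argument: {key}")
        if not isinstance(arguments[key], expected_type):
            raise ValueError(f"{key} must be {expected_type.__name__}")
    return arguments


class SlidingWindowLimiter:
    """Allow at most max_calls per window_s seconds for each key."""

    def __init__(self, max_calls: int, window_s: float):
        self.max_calls = max_calls
        self.window_s = window_s
        self._calls = defaultdict(deque)  # key -> timestamps of recent calls

    def allow(self, key: str) -> bool:
        now = time.monotonic()
        recent = self._calls[key]
        while recent and now - recent[0] >= self.window_s:
            recent.popleft()  # drop timestamps that have left the window
        if len(recent) >= self.max_calls:
            return False
        recent.append(now)
        return True


async def guarded_call(tool_fn, arguments, spec, limiter, limit_key, timeout_s):
    """Validate, rate-limit, then run the tool with a hard timeout."""
    validate_args(arguments, spec)
    if not limiter.allow(limit_key):
        raise RuntimeError(f"rate limit exceeded for {limit_key}")
    # asyncio.wait_for cancels the tool coroutine if it overruns.
    return await asyncio.wait_for(tool_fn(**arguments), timeout=timeout_s)


async def get_weather(city: str) -> str:
    return f"Weather in {city}: 18°C, partly cloudy"  # stub, as in the tutorial
```

Using a key such as `"agent-1:get_weather"` gives a per-caller, per-tool limit; a second limiter keyed on the caller alone would cap total volume.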

Koordex Case Study — MCP at Production Scale

Koordex is Internative's AI operations layer. We use MCP servers to expose enterprise tools to autonomous agents safely.

The Setup

A typical Koordex deployment has 40+ tools exposed across 5-8 MCP servers:

Database server (10-15 read tools): SELECT against warehouse, dashboards, status checks
Communication server (5-8 tools): Slack post, email send, Teams message, ticket create
File system server (5-7 tools): read internal docs, write reports, list folders
Vendor APIs server (10-15 tools): Stripe queries, HubSpot lookups, Salesforce ops
Admin server (3-5 tools, restricted): user management, billing ops

Each server runs as a separate Kubernetes deployment. Agents discover tools via MCP list_tools filtered by their permission scope.

The Permission Pattern

```
Agent Identity → Policy Engine → Allowed Tools Filter → MCP Discovery
```

When an agent connects, our policy engine returns the subset of tools it can use. The agent sees only those in its list_tools response. Tool calls outside that scope are rejected at the gateway.

For a customer support agent: read tools + ticket create. Nothing more.
For an analyst agent: read tools + dashboard write. No other mutating ops.
For an admin agent: full access (with human approval gate on destructive ops).
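The three profiles can be expressed as a small policy table. This is an illustrative sketch, not Koordex's policy engine: the tool names, the `TOOL_KINDS` catalogue, and the `decide` function are all hypothetical.

```python
from enum import Enum


class Decision(Enum):
    ALLOWED = "allowed"
    DENIED = "denied"
    NEEDS_APPROVAL = "needs_approval"


# Hypothetical tool catalogue: tool name -> kind of operation.
TOOL_KINDS = {
    "query_db": "read",
    "create_ticket": "write",
    "update_dashboard": "write",
    "delete_user": "destructive",
}

# Role -> explicitly granted tools, mirroring the three profiles above.
ROLE_GRANTS = {
    "support": {"query_db", "create_ticket"},
    "analyst": {"query_db", "update_dashboard"},
    "admin": set(TOOL_KINDS),
}


def decide(role: str, tool: str) -> Decision:
    """Gateway check for a single tool call."""
    if tool not in ROLE_GRANTS.get(role, set()):
        return Decision.DENIED
    if TOOL_KINDS[tool] == "destructive":
        return Decision.NEEDS_APPROVAL  # human approval gate
    return Decision.ALLOWED


def discoverable_tools(role: str) -> list:
    """Tools that appear in this role's list_tools response."""
    return sorted(t for t in TOOL_KINDS if decide(role, t) is not Decision.DENIED)


print(discoverable_tools("support"))
```

Unknown roles get an empty grant set, so the default is deny.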

The Audit Pattern

Every tool call gets logged:

```json
{
  "timestamp": "2026-06-22T14:31:02Z",
  "agent_id": "support-agent-1",
  "session_id": "abc123",
  "tool": "query_db",
  "arguments": {"sql": "SELECT count(*) FROM tickets WHERE status='open'"},
  "result_summary": "Returned 1 row",
  "duration_ms": 145,
  "policy_decision": "allowed",
  "user_invoker": "begum@internative.net"
}
```

This log feeds into:

Cost tracking (per tool, per agent, per session)
Compliance audit trail
Performance debugging
Anomaly detection (agent making unusual tool sequences)
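One way to produce records in that shape is a wrapper that times the call and appends a JSON line whether the tool succeeds or fails. The sketch below is an assumption about how such a logger could look, not Koordex's implementation; `audited_call` is a made-up name, and an in-memory buffer stands in for the real log sink.

```python
import io
import json
import time
from datetime import datetime, timezone


def audited_call(log, tool_fn, *, agent_id, session_id, tool, arguments,
                 user_invoker, policy_decision="allowed"):
    """Run tool_fn(**arguments) and append one JSON line to `log`."""
    started = time.perf_counter()
    summary = ""
    try:
        result = tool_fn(**arguments)
        summary = str(result)[:80]  # store a summary, not the full payload
        return result
    except Exception as exc:
        summary = f"error: {type(exc).__name__}"
        raise
    finally:
        # Runs on success and on failure, so every call leaves a record.
        record = {
            "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "agent_id": agent_id,
            "session_id": session_id,
            "tool": tool,
            "arguments": arguments,
            "result_summary": summary,
            "duration_ms": round((time.perf_counter() - started) * 1000),
            "policy_decision": policy_decision,
            "user_invoker": user_invoker,
        }
        log.write(json.dumps(record) + "\n")


def query_db(sql: str) -> str:
    return "Returned 1 row"  # stub standing in for a real query


log = io.StringIO()  # stand-in for an append-only file or log shipper
audited_call(
    log, query_db,
    agent_id="support-agent-1", session_id="abc123", tool="query_db",
    arguments={"sql": "SELECT count(*) FROM tickets WHERE status='open'"},
    user_invoker="begum@internative.net",
)
print(log.getvalue(), end="")
```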

The Cost Pattern

MCP tool calls aren't free: every result is fed back into the model's context, so each call adds tokens on top of whatever the tool itself costs. Agentic workflows can chain 10-100 tool calls per user request. Without observability and caching, the LLM bill explodes.

Koordex caches:

Identical tool calls within a session (cache hit avoids re-invocation)
Read-only tool results across sessions for 5-60 min TTL
Tool argument fingerprints to detect redundant calls
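A minimal sketch of the fingerprint-plus-TTL idea, assuming it is applied only to read-only tools. `fingerprint` and `ToolResultCache` are names invented here; the hashing scheme (SHA-256 over canonical JSON) is one reasonable choice, not necessarily the one Koordex uses.

```python
import hashlib
import json
import time


def fingerprint(tool: str, arguments: dict) -> str:
    """Stable hash of a tool call: same tool and arguments give the same key."""
    canonical = json.dumps({"tool": tool, "arguments": arguments}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ToolResultCache:
    """In-memory TTL cache for read-only tool results."""

    def __init__(self, ttl_s: float, clock=time.monotonic):
        self.ttl_s = ttl_s
        self._clock = clock  # injectable so expiry can be tested
        self._entries = {}   # fingerprint -> (expires_at, result)
        self.hits = 0
        self.misses = 0

    def call(self, tool: str, arguments: dict, tool_fn):
        key = fingerprint(tool, arguments)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]  # fresh result: skip the tool entirely
        self.misses += 1
        result = tool_fn(**arguments)
        self._entries[key] = (now + self.ttl_s, result)
        return result


def get_weather(city: str) -> str:
    return f"Weather in {city}: 18°C, partly cloudy"  # stub, as in the tutorial


cache = ToolResultCache(ttl_s=300)  # 5 minutes, the low end of the range above
cache.call("get_weather", {"city": "Istanbul"}, get_weather)
cache.call("get_weather", {"city": "Istanbul"}, get_weather)  # served from cache
print(cache.hits, cache.misses)
```

Mutating tools such as `send_slack` should bypass a cache like this, since skipping the call would also skip the side effect.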

Real Koordex deployment: caching cut tool call volume 40-60% on customer support workloads.

For our cost engineering deep-dive see LLM Cost Optimization: 7 Patterns.

When NOT to Use MCP

Single-purpose agent with 1-3 tools: just write the tool calls directly, MCP is overkill.
Tool calls need sub-50ms latency end-to-end: MCP transport overhead adds 20-100ms.
Tools are fully internal to a Python process: function calls beat protocol overhead.
You're on a legacy stack that can't run Python 3.10+ or Node 18+: the official Python and TypeScript MCP SDKs assume modern runtimes.

For everything else (multi-agent systems, cross-language teams, governance-heavy environments), MCP is the right abstraction.

The Three Most Common Mistakes

Mistake 1: Building tool calls directly without MCP "to save complexity." Saves a day. Costs 4-6 weeks at month 8 when you add a second agent or switch frameworks. MCP is an upfront investment that pays back fast.

Mistake 2: No permission layer. Default-open tool access is fine for prototypes. In production an agent without permission boundaries is one prompt injection away from sending emails it shouldn't.

Mistake 3: No audit log. When the agent does something surprising in production (and it will), you'll need to reconstruct what it did. Without a per-tool audit log, you're guessing.

6 Questions Before You Build

How many tools will agents call? <5 = skip MCP, do direct integration. 5-50 = single MCP server. 50+ = multiple servers grouped by domain.

How many distinct agents or clients? 1 = MCP is overhead. 2+ = MCP saves work as you add agents.

What's your transport need? Desktop (Claude/Cursor) = stdio. Production multi-agent = HTTP.

What's your permission model? Open (internal tooling only) = simple. Governed (multi-tenant or regulated) = wrap MCP in auth gateway from day 1.

What's your audit requirement? Light = log file. Heavy = structured logs → SIEM (Splunk, Datadog) + retention policy.

Who owns operational responsibility? Single team = simple. Cross-team = MCP servers per domain owned by domain teams.

Next Step

If you're building MCP servers for production or evaluating an AI operations layer, we run 30-minute architecture reviews. We can walk through your specific tool surface, permission needs, and audit requirements — and tell you honestly whether Koordex fits or whether a self-build is the right call.

Contact: team@internative.net or via internative.net.