Taking cricket-mcp from my desktop to the interwebs

Mihir Wagle · 6 min read
#mcp #cricket #docker

A few days ago I wrote about building a cricket statistics brain for Claude. 23 tools, 10 million deliveries, DuckDB, MCP. The whole thing ran locally through Claude Desktop.

It worked well. Too well. I kept wanting to use it from places that weren't my home. So I put it on the internet.

Here's what changed along the way.

Better tool descriptions fixed everything

The original version had a problem I didn't fully appreciate. Claude would sometimes pick the wrong tool. I had get_matchup, get_matchup_records, and get_batter_vs_team_bowling. Three tools, same query builder, slightly different use cases. Ask "who dismisses Warner the most?" and Claude might call the wrong one.

I needed better descriptions, so I rewrote them. Each one leads with a question ("How does this batter fare against this bowler?"), gives two or three example queries, and explicitly says what the tool is not for ("Not for career stats, use get_player_stats"). That last part turned out to be the most important bit. Telling the model what not to use a tool for is just as valuable as telling it what to use it for.
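The pattern is easy to sketch. This is an illustrative shape, not the actual description text from the repo; the tool and query names are placeholders:

```typescript
// Sketch of the description pattern: lead with the question the tool answers,
// give a couple of example queries, and say explicitly what it is NOT for.
const matchupDescription = [
  "How does this batter fare against this bowler?",
  "Example queries: 'Kohli vs Anderson in Tests', 'who dismisses Warner the most?'",
  "Not for career stats, use get_player_stats.",
].join(" ");

// The model sees this string when deciding which tool to call, so the
// anti-example steers it away from near-miss tools.
console.log(matchupDescription.includes("Not for career stats")); // true
```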

I also consolidated those three matchup tools into one. It detects the mode from the parameters. Provide both a batter and bowler name, you get a specific matchup. Provide a batter and an opposition but no bowler, it breaks down every bowler on that team. Provide just one name, it's a leaderboard. Same underlying query, one tool instead of three.
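The mode detection reduces to a small branch on which parameters are present. A minimal sketch, assuming illustrative parameter names rather than the tool's real schema:

```typescript
// Which parameters the client supplied determines the query mode.
type MatchupParams = { batter?: string; bowler?: string; opposition?: string };

type MatchupMode = "specific" | "vs_team" | "leaderboard";

function detectMode(p: MatchupParams): MatchupMode {
  if (p.batter && p.bowler) return "specific"; // one batter vs one bowler
  if (p.batter && p.opposition) return "vs_team"; // batter vs every bowler on that team
  return "leaderboard"; // just one name: rank all opponents
}

console.log(detectMode({ batter: "DA Warner", bowler: "R Ashwin" })); // "specific"
console.log(detectMode({ batter: "V Kohli", opposition: "England" })); // "vs_team"
console.log(detectMode({ bowler: "JM Anderson" })); // "leaderboard"
```

All three modes feed the same underlying query builder, so the consolidation costs nothing at query time.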

Tool count went from 28 down to 25. Accuracy went up noticeably.

Player enrichment

The original version only knew what Cricsheet told it. Player names and IDs. No batting style, no bowling style, no playing role. So you couldn't ask "How does Kohli do against left-arm pace?" because the system didn't know who bowled left-arm pace.

I found a dataset of 16,000 players with batting style, bowling style, and playing role. Built an enrichment pipeline that matches them to Cricsheet IDs and loads the metadata into DuckDB. Then added a get_style_matchup tool that breaks down a batter's stats by bowling style (pace vs spin, left-arm vs right-arm) or a bowler's stats by batting hand.

To make the style queries fast, I pre-compute two columns during enrichment: bowling_style_broad (Pace/Spin) and bowling_style_arm (Left-arm Pace, Right-arm Spin, etc.). No per-row CASE expressions at query time.
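The classification itself is simple string matching. This is a hypothetical sketch of the derivation step; the raw style labels and the keyword list are assumptions, since the real dataset's vocabulary may differ:

```typescript
// Hypothetical: derive the two pre-computed columns from a raw style string.
// The spin-keyword list is an assumption about the dataset's vocabulary.
function classifyBowlingStyle(raw: string): { broad: string; arm: string } {
  const isSpin = /spin|orthodox|legbreak|offbreak|chinaman|googly/i.test(raw);
  const broad = isSpin ? "Spin" : "Pace";
  const arm = /left/i.test(raw) ? `Left-arm ${broad}` : `Right-arm ${broad}`;
  return { broad, arm };
}

console.log(classifyBowlingStyle("Left-arm orthodox")); // { broad: "Spin", arm: "Left-arm Spin" }
console.log(classifyBowlingStyle("Right-arm fast-medium")); // { broad: "Pace", arm: "Right-arm Pace" }
```

Running this once per player at enrichment time is what lets the query path stay free of CASE expressions.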

Now you can ask "Rohit Sharma against left-arm spin in Tests" and get a real answer.

Going remote

cricket-mcp originally only spoke stdio. That's the local transport that Claude Desktop and VS Code use. Clone the repo, ingest the data, run it on your machine. Great for personal use, not great for sharing.

The MCP SDK has a transport called StreamableHTTPServerTransport that handles the protocol over HTTP. It manages SSE streams for server-to-client messages, accepts JSON-RPC over POST, and tracks sessions via headers. I wired it up to a node:http server, added CORS headers, and gave the CLI a --transport http --port 3000 option. The existing stdio mode is untouched, still the default.

The interesting bit was session management. Each client gets its own McpServer instance and transport, but they all share the same DuckDB connection. The workload is read-only queries so there's no contention.
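The bookkeeping looks roughly like this. A sketch of the idea only, with illustrative names rather than the MCP SDK's actual types: per-session state lives in a map, while the database handle is constructed once and shared:

```typescript
// Each HTTP session gets its own server/transport state; all sessions share
// one read-only database handle that is never closed per-session.
type Session = { id: string; createdAt: number };

class SessionRegistry {
  private sessions = new Map<string, Session>();

  // One shared DuckDB connection for every session (read-only workload).
  constructor(private readonly sharedDb: object) {}

  open(id: string): Session {
    const s = { id, createdAt: Date.now() };
    this.sessions.set(id, s);
    return s;
  }

  close(id: string): void {
    this.sessions.delete(id); // tear down session state, keep the shared DB
  }

  get count(): number {
    return this.sessions.size;
  }
}

const registry = new SessionRegistry({ /* shared read-only DuckDB connection */ });
registry.open("abc");
registry.open("def");
registry.close("abc");
console.log(registry.count); // 1
```

Because every query is a read, the shared connection needs no locking; if the server ever grew write paths, this design would need revisiting.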

Docker and deployment

I wanted deployment to be a single command. The Dockerfile is a three-stage build:

  1. Build: compile TypeScript
  2. Ingest: download all 21,000+ matches from Cricsheet, parse them, load into DuckDB, enrich player metadata
  3. Runtime: slim Node.js image with just the compiled code and the ~1 GB database file
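In Dockerfile terms, the three stages might look something like this. A hypothetical sketch: the stage names, file paths, base images, and the ingest command are all assumptions, not the repo's actual Dockerfile:

```dockerfile
# Stage 1: compile TypeScript
FROM node:22 AS build
WORKDIR /app
COPY package*.json tsconfig.json ./
RUN npm ci
COPY src ./src
RUN npm run build

# Stage 2: download and ingest Cricsheet data at build time
FROM build AS ingest
RUN npm run ingest   # produces the ~1 GB DuckDB file

# Stage 3: slim runtime with only the compiled code and the database
FROM node:22-slim
WORKDIR /app
COPY --from=build /app/dist ./dist
COPY --from=ingest /app/cricket.duckdb ./cricket.duckdb
EXPOSE 3000
CMD ["node", "dist/index.js", "--transport", "http", "--port", "3000"]
```

The key design choice is stage 2: baking the ingest into the build means the database file ships inside the image, at the cost of a larger image and a longer build.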

The ingest runs during docker build, so the image is completely self-contained. No external database, no startup downloads. docker run -p 3000:3000 cricket-mcp and it's serving.

I deployed it to Azure Container Apps. Push the image to Azure Container Registry, point Container Apps at it, get a URL. Added a CNAME for cricket.waglesworld.com, Azure provisioned a free SSL cert, done.

From "I want to host this" to a working endpoint took about an hour. Most of that was waiting for the Docker image to build. Ingesting 21,000 matches takes a few minutes even on cloud hardware.

Where it stands now

25 tools covering player stats, head-to-head matchups, batting/bowling records, phase analysis (powerplay/middle/death), situational stats (chasing vs setting), style matchups (pace vs spin), partnerships, venue stats, toss analysis, team form, tournament summaries, milestones, fielding, dismissal analysis, what-if scenarios, and more.

21,270 matches with ball-by-ball data. Tests back to December 2001, ODIs to June 2002, T20Is to February 2005.

12,879 enriched players with batting style, bowling style, and playing role.

Live at cricket.waglesworld.com/mcp. Any MCP client can connect over HTTP.

Still open source. Run it locally with npm run ingest, or deploy your own with docker build.

Try it

Now that it's hosted, you don't need to clone anything. You just point your MCP client at https://cricket.waglesworld.com/mcp. Here's how that works on each platform.

Claude Desktop

Go to Settings > Connectors, click Add custom connector, paste the URL. That's it. Available on Pro, Max, Team, and Enterprise plans.

If you prefer the config file approach, you'll need the mcp-remote proxy since Claude Desktop launches MCP servers as child processes over stdio:

{
  "mcpServers": {
    "cricket": {
      "command": "npx",
      "args": ["mcp-remote", "https://cricket.waglesworld.com/mcp"]
    }
  }
}

Claude Code

One command:

claude mcp add --transport http cricket-mcp https://cricket.waglesworld.com/mcp

ChatGPT

ChatGPT supports remote MCP servers in Developer Mode. Go to Settings > Connectors > Advanced > Developer Mode, add the URL. Then in a chat, click the + below the chatbox, hover over More, click Developer mode, and enable the cricket server.

Available on Pro, Plus, Business, Enterprise, and Education plans. Note that ChatGPT calls these "apps" now.

Gemini CLI

Add it to your Gemini CLI config (.gemini/settings.json):

{
  "mcpServers": {
    "cricket": {
      "httpUrl": "https://cricket.waglesworld.com/mcp"
    }
  }
}

The Gemini web app doesn't support custom MCP servers yet, but the CLI does. It handles both SSE and Streamable HTTP transports.

Or run it locally

If you'd rather keep everything on your machine, the original setup from the first post still works:

git clone https://github.com/mavaali/cricket-mcp.git
cd cricket-mcp && npm install
npm run ingest

Then add it to Claude Desktop config with the stdio transport. No internet required after the initial data download.
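For reference, the stdio entry in the Claude Desktop config looks roughly like this; the entry point path is a placeholder you'd swap for wherever you cloned the repo:

```json
{
  "mcpServers": {
    "cricket": {
      "command": "node",
      "args": ["/path/to/cricket-mcp/dist/index.js"]
    }
  }
}
```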

What's next

I want to build a web frontend. Right now the only way to interact with it is through an MCP client like Claude Desktop, which limits the audience. A chat interface on the website would let anyone ask cricket questions without installing anything.

I'm also thinking about automated data updates. A weekly cron that rebuilds the container image with the latest matches and redeploys. Cricsheet publishes updates regularly, the pipeline is already there.

PRs are still welcome.


If you missed the first post: I built a cricket statistics brain for Claude, and you can too

All data sourced from Cricsheet. Built with Claude, DuckDB, and an unhealthy obsession with cricket.

