mcp2cli turns any MCP server or OpenAPI spec into a CLI at runtime. The LLM discovers tools on demand:
mcp2cli --mcp https://mcp.example.com/sse --list # ~16 tokens/tool
mcp2cli --mcp https://mcp.example.com/sse create-task --help # ~120 tokens, once
mcp2cli --mcp https://mcp.example.com/sse create-task --title "Fix bug"
No codegen, no rebuild when the server changes. Works with any LLM — it's just a CLI the model shells out to. Also handles OpenAPI specs (JSON/YAML, local or remote) with the same interface.

Token savings are real, measured with cl100k_base: 96% for 30 tools over 15 turns, 99% for 120 tools over 25 turns.
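As a back-of-the-envelope check, the shape of the savings comes from not resending every tool schema each turn. All numbers below are my own illustrative assumptions, not the project's measured values:

```python
# Rough cost model (illustrative numbers, not mcp2cli's actual benchmark):
# MCP resends every tool's full schema each turn; the CLI approach sends a
# ~16-token listing per tool, plus a one-time ~120-token --help per tool used.
TOOLS, TURNS, TOOLS_USED = 30, 15, 3
SCHEMA_TOKENS, LIST_TOKENS, HELP_TOKENS = 500, 16, 120

mcp_total = TOOLS * SCHEMA_TOKENS * TURNS                           # schemas every turn
cli_total = TOOLS * LIST_TOKENS * TURNS + TOOLS_USED * HELP_TOKENS  # listing + help once

savings = 1 - cli_total / mcp_total
print(f"MCP: {mcp_total:,}  CLI: {cli_total:,}  savings: {savings:.0%}")
```

With these assumed numbers the model lands in the same ballpark as the 96% claim; the real measurement tokenizes actual schemas with cl100k_base.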
It also ships as an installable skill for AI coding agents (Claude Code, Cursor, Codex): `npx skills add knowsuchagency/mcp2cli --skill mcp2cli`
Inspired by Kagan Yilmaz's CLI vs MCP analysis and CLIHub.
- https://github.com/apify/mcpc
- https://github.com/chrishayuk/mcp-cli
- https://github.com/wong2/mcp-cli
- https://github.com/f/mcptools
- https://github.com/adhikasp/mcp-client-cli
- https://github.com/thellimist/clihub
- https://github.com/EstebanForge/mcp-cli-ent
- https://github.com/knowsuchagency/mcp2cli
- https://github.com/philschmid/mcp-cli
- https://github.com/steipete/mcporter
- https://github.com/mattzcarey/cloudflare-mcp
- https://github.com/assimelha/cmcp

It turns out everyone is having the same idea.
Here's the comparison table: https://github.com/apify/mcpc?tab=readme-ov-file#related-wor...
But why advertise it and try to make it into a product?
I was inspired by clihub (I credited them) but I also wanted 3 additional things.
1. OpenAPI support
2. Dynamic CLI generation: I don't want to recompile my CLI if the server changes.
3. An agent skill
We graded 201 MCP servers (3,991 tools, 512K tokens total). 97% have quality issues that waste tokens: descriptions that repeat the parameter name verbatim, markdown formatting inside tool descriptions, missing type info, descriptions starting with 'This tool...' or 'Allows you to...'. None of this helps the LLM; it just costs tokens.
`agent-friend fix server.json > fixed.json` reduces the token count by ~30% for most servers without changing functionality. The two approaches stack: fix the schema first, then serve via CLI if needed.
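A rough sketch of the kind of description cleanup being described (my own illustrative code, not agent-friend's actual implementation):

```python
import re

# Illustrative sketch only, not the real agent-friend logic: strip filler
# openings and markdown markup, and drop descriptions that merely restate
# the parameter name.
FILLER = re.compile(r"^(this tool\s+)?(allows you to\s+)?", re.IGNORECASE)

def clean_description(name: str, desc: str) -> str:
    desc = FILLER.sub("", desc, count=1).strip()
    desc = desc.replace("**", "").replace("`", "")   # markdown adds no signal
    # A description that just repeats the name carries zero information.
    if desc.lower().replace(" ", "_") == name.lower():
        return ""
    return desc

print(clean_description("create_task", "This tool allows you to create a **task**"))
print(repr(clean_description("user_id", "user id")))
```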
Next we'll wrap the CLIs into MCPs.
It's building on top of them, because MCP did address some issues (which arguably could've been solved better with CLIs to begin with - like adding proper help text to each command)... it just introduced new ones, too.
Some of which still won't be solved via switching back to CLI.
The obvious one being authentication and privileges.
By default, I want the LLM to be able to have full read only access. This is straightforward to solve with an MCP because the tools have specific names.
With CLI it's not as straightforward, because it'll start piping etc and the same CLI is often used both for write and read access.
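To make that concrete, here's a naive sketch of read-only gating for shell commands. The allowlist is hypothetical and this is nowhere near a real sandbox; it just shows why piping and shared read/write binaries make the CLI case harder:

```python
import shlex

# Naive read-only gate (illustrative, not a real sandbox): allow only known
# read verbs and refuse shell plumbing that could smuggle in a write.
# The allowlist below is a made-up example.
READ_ONLY = {("git", "log"), ("git", "diff"), ("kubectl", "get"), ("ls",)}

def is_read_only(command: str) -> bool:
    if any(ch in command for ch in "|>;&$`"):   # no pipes, redirects, subshells
        return False
    argv = shlex.split(command)
    return tuple(argv[:2]) in READ_ONLY or tuple(argv[:1]) in READ_ONLY

print(is_read_only("git log --oneline"))      # allowed: read verb
print(is_read_only("git push origin main"))   # blocked: same binary, write verb
print(is_read_only("ls > /etc/passwd"))       # blocked: redirection
```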
All solvable issues, but while I suspect CLIs are going to get a lot more traction over the next few months, it's still not the thing we'll settle on - unless the privileges situation can be solved without making me greenlight commands every 2 seconds (or ignoring their tendency to occasionally go batshit insane and randomly wipe things out while running in yolo mode).
Oh wait there's ssh. I guess it's because there's no way to tell AI agents what the tool does, or when to invoke it... Except that AI pretty much knows the syntax of all of the standard tools, even sed, jq, etc...
Yeah, ssh should've been the norm, but someone is getting promoted for inventing MCP
Tell me the hottest day in Paris in the
coming 7 days. You can find useful tools
at www.weatherforadventurers.com/tools
And then the tools url can simply return a list of urls in plain text like /tool/forecast?city=berlin&day=2026-03-09 (Returns highest temp and rain probability for the given day in the given city)
Which return the data in plain text.

What additional benefits does MCP bring to the table?
MCP can provide validation & verification of the request before making the API call. Giving the model a /tool/forecast URL doesn't prevent the model from deciding to instead explore what other tools might be available on the remote server instead, like deciding to try running /tool/imagegenerator or /tool/globalthermonuclearwar. MCP can gatekeep what the AI does, check that parameters are valid, etc.
Also, MCP can be used to do local computation, work with local files etc, things that web access wouldn't give you. CLI will work for some of those use cases too, but there is a maximum command line length limit, so you might struggle to write more than 8kB to a file when using the command line, for example. It can be easier to get MCP to work with binary files as well.
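Worth noting that the argv length limit only applies to command-line arguments; stdin has no such cap, so a CLI wrapper can stream large payloads over a pipe instead. A quick demonstration (assumes a POSIX system with `wc` on PATH):

```python
import subprocess

# argv is capped (ARG_MAX), but stdin is not: stream a payload far larger
# than any argument limit into a CLI tool over a pipe.
big_payload = "x" * 1_000_000

result = subprocess.run(
    ["wc", "-c"],                 # assumes POSIX wc is available
    input=big_payload, capture_output=True, text=True, check=True,
)
print(result.stdout.strip())      # bytes received on stdin
```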
I tend to think of local MCP servers like DLLs, except the function calls are over stdio and use tons of wasteful JSON instead of being a direct C-function call. But thinking of where you might use a DLL and where you might call out to a CLI can be a useful way of thinking about the difference.
You could restrict where it can go with domain allowlists but that has insufficient granularity. The same URL can serve a legitimate request or exfiltrate data depending on what's in the headers or payload: see https://embracethered.com/blog/posts/2025/claude-abusing-net...
So you need to restrict not only where the agent can reach, but what operations it can perform, with the host controlling credentials and parameters. That brings us to an MCP-like solution.
MCP is just a worse version of the above, allowing lots of data exfiltration and manipulation by the LLM.
Being able to have a verifiable input/output structure is key. I suppose you can do that with a regular http api call (json) but where do you document the openapi/schema stuff? Oh yeah...something like mcp.
I agree that mcp isn't as refined as it should be, but when used properly it's better than having it burn thru tokens by scraping around web content.
Not all services provide good token definition or access control, and often have API Key + CLI combo which can be quite dangerous in some cases.
With an MCP even these bad interfaces can be fixed up on my side.
As an aside: this is a cool idea but the prose in the readme and the above post seem to be fully generated, so who knows whether it is actually true.
Measure fidelity with exact diffs and embedding similarity, and include streaming behavior, schema-change resilience, and rate-limit fallbacks in the cases you care about. Check the repo for a runnable benchmark, archived fixtures captured with vcrpy or WireMock, and a clear test harness that reproduces the claimed 96 to 99 percent savings.
"We measured this. Not estimates — actual token counts using the cl100k_base tokenizer against real schemas, verified by an automated test suite."
It works by schematising the upstream, synchronising data locally, and adding a common query language, so the longer-term goals are more about avoiding API limits / escaping the confines of the MCP query feature set - i.e. token savings on reading the data itself (in many cases, savings can be upwards of thousands of times fewer tokens).
Looking forward to trying this out!
anthropic mentions MCPs eating up context and solutions here: https://www.anthropic.com/engineering/code-execution-with-mc...
I built one specifically for Cognition's DeepWiki (https://crates.io/crates/dw2md) -- but it's rather narrow. Something more general like this clearly has more utility.
I consider this a bug. I'm sure the chat clients will fix this soon enough.
Something like: on each turn, a subagent searches available MCP tools for anything relevant. Usually, nothing helpful will be found and the regular chat continues without any MCP context added.
I'll add to your comment that it isn't a bug of MCP itself. MCP doesn't specify what the LLM sees. It's a bug of the MCP client.
In my toy chatbot, I implement MCP as pseudo-python for the LLM, dropping typing info, and giving the tool infos as abruptly as possible, just a line - function_name(mandatory arg1 name, mandatory arg2 name): Description
(I don't recommend doing that, it's largely obsolete, my point is simply that you feed the LLM whatever you want, MCP doesn't mandate anything. tbh it doesn't even mandate that it feeds into a LLM, hence the MCP CLIs)
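The compact rendering described above might look something like this (my own sketch of the idea; the tool definition is made up):

```python
# Sketch of the one-line tool rendering described above: take a JSON-Schema
# style MCP tool definition and emit "name(required args): description",
# dropping optional args and all type info. The tool below is hypothetical.
def render_tool(tool: dict) -> str:
    required = tool.get("inputSchema", {}).get("required", [])
    return f'{tool["name"]}({", ".join(required)}): {tool.get("description", "")}'

tool = {
    "name": "create_task",
    "description": "Create a task in the tracker",
    "inputSchema": {
        "type": "object",
        "properties": {"title": {"type": "string"}, "due": {"type": "string"}},
        "required": ["title"],
    },
}
print(render_tool(tool))  # create_task(title): Create a task in the tracker
```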
I agree with the general idea that models are better trained to use popular CLI tools for things like directory navigation, but outside of ls and ps etc the difference isn't really there; new CLIs are just as confusing to the model as new MCPs.
> I consider this a bug. I'm sure the chat clients will fix this soon enough.
ANTHROP\C's Claudes manage/minimize/mitigate this reasonably.
The analogy I'd draw is database query planning: you don't load the entire schema into memory before every query, you resolve references on demand. Same principle here. Does the CLI maintain a tool cache between invocations, or does it re-fetch schemas each time?
One pattern we've been seeing internally is that once teams standardize API interactions through a single interface (or agent layer), debugging becomes both easier and harder.
Easier because there's a central abstraction, harder because failures become more opaque.
In production incidents we often end up tracing through multiple abstraction layers before finding the real root cause.
Curious if you've built anything into the CLI to help with observability or tracing when something fails.
If the service is using more tokens to produce the same output from the same query, but over a different protocol, then the service is a scam.
With a CLI, you avoid sending this context to the LLM and it progressively discovers only what is needed.
The input token costs come down because of using a CLI instead of MCP
But I do wonder about these tools whether they have tested that the quality of subsequent responses is the same.
This method was popularised by beads with a simple command “bd quickstart” to teach the basics to an agent. Think of this as an adaptive learning method for the agent.
I’ve not seen the details of mcp2cli, but let’s just say you had a mcp2cli wrapper over stripe, you can just tell the agent to run mcp2cli for stripe as a provider to learn how to use the rest of the APIs
Isn’t this somewhat misleading? Any system context is going to be added “per turn” because it’s included in the first turn.
Is any context removed on a turn by turn basis (aside from thinking?)
Essentially I've cloned thousands of MCP servers, used the readmes and the star rating to respond to the qdrant query (star ratings as a boost score have been an attack vector, yes I know, it's an incomplete product [1]), then presented it as a JSON response with "one-shots", which this author calls CLIs.
I think I became discouraged from working on it and moved on because my results weren't that great but search is hard and I shouldn't give up.
I'll get back on it seeing how good this tool is getting traction.
[1] There needs to be a legitimacy post-filter so that github user micr0s0ft or what-have-you doesn't go to the top - I'm sure there are some best-practice ways of doing this and I shouldn't invent my own (which would involve seeing if the repo appears on non-UGC sites I guess?!) but I haven't looked into it
You might as well directly create a CLI tool that works with the AI agents which does an API call to the service anyway.
So I don't see why a typical productivity app would build a CLI rather than an MCP. Am I missing anything?
I started a similar project in January, but nobody seemed interested in it at the time.
Looks like I'll get back on that.
https://github.com/day50-dev/infinite-mcp
Essentially
(1) start with the aggregator mcp repos: https://github.com/day50-dev/infinite-mcp/blob/main/gh-scrap... . pull all of them down.
(2) get the meta information to understand how fresh, maintained, and popular the projects are (https://github.com/day50-dev/infinite-mcp/blob/main/gh-get-m...)
(3) try to extract one-shot ways of loading it (npx/uvx etc) https://github.com/day50-dev/infinite-mcp/blob/main/gh-one-l...
(4) insert it into what I thought was qdrant but apparently I was still using chroma - I'll change that soon
(5) use a search endpoint and an mcp to search that https://github.com/day50-dev/infinite-mcp/blob/main/infinite...
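The retrieval in steps (4)-(5) boils down to semantic search over readmes with a popularity boost. A toy stand-in to show the shape - bag-of-words cosine instead of real embeddings, and the repos are made up:

```python
import math

# Toy stand-in for steps (4) and (5): "embed" readmes as bag-of-words
# vectors (a real system would use an embedding model plus chroma/qdrant)
# and rank by cosine similarity, boosted by log stars. Repos are made up.
def embed(text: str) -> dict:
    vec = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    dot = sum(v * b.get(w, 0) for w, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

REPOS = [
    {"name": "weather-mcp", "readme": "forecast weather temperature city", "stars": 120},
    {"name": "db-mcp", "readme": "postgres query database sql", "stars": 900},
]

def search(query: str) -> list:
    q = embed(query)
    scored = sorted(
        ((cosine(q, embed(r["readme"])) * math.log1p(r["stars"]), r["name"])
         for r in REPOS),
        reverse=True,
    )
    return [name for score, name in scored if score > 0]

print(search("weather forecast for a city"))
```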
The intention is to get this working better and then provide it as a free api and also post the entire qdrant database (or whatever is eventually used) for off-line use.
This will pair with something called a "credential file", which will be a [key, repo] pair. There's an attack vector if you don't pair them up. (You could have an MCP server for some niche thing, get on the aggregators, get fake stars, change the code to a fraudulent version of a popular MCP server, harvest real API keys from sloppy tooling and MitM.)
Anyway, we're talking about 1000s of documents at the most, maybe 10,000. So it's small enough to give away for free.
If you like this project, please tell me. Your encouragement means a lot to me!
I don't want to spend my time on things that nobody seems to be interested in.
Great implementation details, but what is the end goal? Ah ha, a readable readme (which itself is promising):
InfiniteMCP is an MCP server that acts as a universal gateway to thousands of other MCP servers. Instead of manually configuring each MCP server you want to use, InfiniteMCP lets Claude discover, understand, and use any MCP server on demand through natural language queries.
Think of it as an "MCP server of MCP servers" - a single connection that unlocks the entire MCP ecosystem.
So, yeah, that's interesting.
> and then provide it as a free api
Oh, oops, that just became a supply chain threat. Central registries outside of targets' control are grails, and the speculated implementation for secrets makes this a lovely injection path...
If you pursue this, work with someone like control-plane.io to blue/red team it, and make noise about that on your README with a link to their findings and your mitigations. And consider syncing up with folks like kusari.dev (see also SLSA and GUAC) to include a vulnerability rating on each MCP itself (their mapping is super fast, and an SBOM-scanned MCP directory would be a real value add).
If you want humans to spend time reading your prose, then spend time actually writing it.