What I Learned Building Co-op Translator MCP

I recently added a Model Context Protocol server to Co-op Translator.

Co-op Translator started as a CLI-first tool for translating Markdown files, Jupyter notebooks, and text inside images. The CLI is still the most natural interface when a person wants to translate a repository from a terminal.

But agents see tools differently.

When an agent such as Codex or Claude Code works with a project, it does not just run one big command and walk away. It reads context, chooses a smaller action, checks the result, and then decides the next step. That pushed me to rethink the shape of the Co-op Translator APIs before exposing them through MCP.

A CLI can hide complexity

A CLI command is allowed to do many things at once:

translate -l "ko" -md -nb -img

That command can discover files, translate content, choose output paths, rewrite links, update metadata, and clean up old translated assets. For a human, that is convenient.

For an agent-facing API, the same convenience can become ambiguity.

If one tool translates text, rewrites paths, writes files, and updates metadata at the same time, the agent has a harder time understanding which part of the workflow succeeded and which part should happen next.

So I separated the lower-level responsibilities:

translate_markdown_content(document, language_code, options)
rewrite_markdown_paths(content, source_path, target_path, policy)
run_translation(...)

One API translates Markdown content. Another rewrites links after the caller knows the source and target paths. A project-level API still orchestrates the full repository workflow, similar to the CLI.

That separation made the MCP server much easier to reason about.
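
To make the split concrete, here is a minimal sketch of two single-purpose functions. The names echo the APIs above, but the signatures and bodies are simplified assumptions for illustration (the `translate` callback and the link regex are hypothetical), not Co-op Translator's actual implementation.

```python
import posixpath
import re
from typing import Callable

# Hypothetical stand-ins: simplified signatures, not the real library code.

def translate_markdown_content(document: str, language_code: str,
                               translate: Callable[[str, str], str]) -> str:
    """Translate text only. No file access and no path rewriting."""
    return translate(document, language_code)

# Matches the "(target)" part of inline links and images without titles.
LINK_RE = re.compile(r"(\]\()([^)\s]+)(\))")

def rewrite_markdown_paths(content: str, source_path: str, target_path: str) -> str:
    """Re-point relative links so they still resolve from target_path."""
    source_dir = posixpath.dirname(source_path)
    target_dir = posixpath.dirname(target_path)

    def fix(match: re.Match) -> str:
        link = match.group(2)
        # Leave URLs, in-page anchors, and absolute paths untouched.
        if re.match(r"^([a-z][a-z0-9+.-]*:|#|/)", link, re.IGNORECASE):
            return match.group(0)
        resolved = posixpath.normpath(posixpath.join(source_dir, link))
        relative = posixpath.relpath(resolved, target_dir or ".")
        return match.group(1) + relative + match.group(3)

    return LINK_RE.sub(fix, content)
```

Because each function does one thing, an agent can translate first, inspect the text, and only then rewrite links once it knows where the file will live. For example, a link to `./setup.md` in `docs/guide.md` becomes `../../../docs/setup.md` when the target is `translations/ko/docs/guide.md`.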

MCP tools need safer defaults

An MCP server is not just a thin wrapper around existing functions.

MCP tools are model-controlled: the model inside the host application decides when to call them, so the defaults matter. A repository translation can create or update many files. That is not something an agent should do silently.

For that reason, repository translation through MCP starts as a dry run, and a write requires explicit confirmation. A dry-run request looks like this:

{
  "language_codes": "ko",
  "root_dir": ".",
  "markdown": true,
  "dry_run": true
}

This was one of the clearest lessons from the work: agent-facing APIs need guardrails as much as capabilities. The better default is the one that lets the agent inspect, explain, and ask before changing the workspace.
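
One way to express that guardrail in code is sketched below. It is an illustration under assumptions, not the server's real handler: the `confirm_write` parameter, the list-valued `language_codes`, and the `translations/<code>/` output layout are hypothetical, and the write step copies the source file as a stand-in for real translation.

```python
from dataclasses import dataclass, field
from pathlib import Path

@dataclass
class TranslationPlan:
    dry_run: bool
    planned_writes: list = field(default_factory=list)
    written: list = field(default_factory=list)

def translate_repository(root_dir, language_codes, dry_run=True, confirm_write=False):
    """Hypothetical tool handler: plan by default, write only on explicit opt-in."""
    if not dry_run and not confirm_write:
        raise PermissionError("writing requires dry_run=False and confirm_write=True")
    root = Path(root_dir)
    plan = TranslationPlan(dry_run=dry_run)
    # sorted() materializes the file list before any new files are created.
    for source in sorted(root.rglob("*.md")):
        relative = source.relative_to(root)
        if relative.parts[0] == "translations":
            continue  # skip output from earlier runs
        for code in language_codes:
            target = root / "translations" / code / relative
            plan.planned_writes.append(target.as_posix())
            if not dry_run:
                target.parent.mkdir(parents=True, exist_ok=True)
                # Placeholder: a real tool would write translated text here.
                target.write_text(source.read_text(encoding="utf-8"), encoding="utf-8")
                plan.written.append(target.as_posix())
    return plan
```

Calling `translate_repository(".", ["ko"])` only returns the list of planned writes, which the agent can show to the user before asking to proceed.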

Agent-assisted translation felt like the right fit

The most interesting part was agent-assisted translation.

In the provider-backed flow, Co-op Translator calls Azure OpenAI or OpenAI directly. That is useful for production pipelines and repeatable automation.

But in an MCP environment, the host agent already has a model. For Markdown and notebooks, Co-op Translator does not always need to call another LLM provider. It can prepare the work, protect the structure, and let the host agent translate the chunks.

The flow looks like this:

1. start_markdown_agent_translation
2. the host agent translates the returned chunks
3. finish_markdown_agent_translation

Co-op Translator handles chunking, code placeholder preservation, frontmatter reconstruction, notebook Markdown cell replacement, and post-translation normalization. The host agent performs the actual translation.
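
The start/finish pair can be sketched roughly as follows. This is a toy version under stated assumptions: the function names come from the flow above, but the session dictionary, the `@@CODE_n@@` placeholder format, and the blank-line chunking are hypothetical simplifications. It covers only fenced code protection, not frontmatter or notebooks.

```python
import re

FENCE = "`" * 3  # built at runtime so this example can sit inside a fenced block
FENCE_RE = re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL)

def start_markdown_agent_translation(markdown: str) -> dict:
    """Swap fenced code for placeholders, then split the rest into chunks."""
    code_blocks = []

    def stash(match: re.Match) -> str:
        code_blocks.append(match.group(0))
        return f"@@CODE_{len(code_blocks) - 1}@@"

    protected = FENCE_RE.sub(stash, markdown)
    # Naive chunking on blank lines; a real tool would also respect size limits.
    return {"chunks": protected.split("\n\n"), "code_blocks": code_blocks}

def finish_markdown_agent_translation(session: dict, translated_chunks: list) -> str:
    """Rejoin translated chunks and restore the original code blocks."""
    if len(translated_chunks) != len(session["chunks"]):
        raise ValueError("expected one translated chunk per source chunk")
    text = "\n\n".join(translated_chunks)
    for index, block in enumerate(session["code_blocks"]):
        placeholder = f"@@CODE_{index}@@"
        if placeholder not in text:
            raise ValueError(f"missing placeholder {placeholder}")
        text = text.replace(placeholder, block)
    return text
```

The host agent only ever sees the chunks. If it drops or alters a placeholder, the finish step fails loudly instead of returning Markdown with a missing code block.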

That means a user can ask:

Translate this Markdown file to Korean with Co-op Translator MCP.
Use agent-assisted mode.
Keep Markdown formatting, code blocks, and links intact.

The agent can call the MCP tools, translate the prepared chunks with its own model, and return reconstructed Markdown.

This felt like a better division of labor. The agent is good at language. Co-op Translator handles the workflow around that language work: preparing chunks, protecting structure, and reassembling the result.

Image translation is different

Markdown and notebooks can use the agent-assisted path because the content is already textual.

Images are different.

Image translation needs OCR, bounding boxes, and layout-aware rendering. The system has to know where text appears inside the image and how to draw translated text back into the visual layout.

So image translation remains provider-backed. It still needs Azure AI Vision for text extraction and layout information.

That distinction is important to document clearly. MCP support does not mean every workflow becomes keyless. It means each workflow can choose the right execution model.

The main lesson

The biggest lesson was simple:

Agent-facing APIs need clearer boundaries than human-facing CLIs.

A human-facing CLI should optimize for convenience. An agent-facing API should optimize for composability, predictability, and safe defaults.

Good MCP tools are not only about exposing functions. They are about exposing the right workflow surfaces so an agent can make progress without guessing.

That is what Co-op Translator MCP tries to do: keep the full CLI-style project workflow available, while also giving agents smaller tools for Markdown, notebooks, images, path rewriting, review, and configuration inspection.

It changed how I think about productizing developer tools for agents. The tool is no longer only something a person runs. It becomes part of another reasoning loop.