Integrations
Connect Cua Driver to your AI coding agent
cua-driver mcp is a stdio MCP server. Any agent that supports MCP can use it — no extra setup beyond adding it to the agent's MCP config.
Grant the required permissions before connecting any agent — on macOS that's Accessibility and Screen
Recording (Windows and Linux have no equivalent prompt). Run cua-driver call check_permissions to verify.
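For scripted setups, the check can be wrapped in a small preflight step. The following is a minimal sketch, assuming only the `cua-driver call check_permissions` command quoted above; the helper name `check_permissions` is hypothetical, and because the command's output format and exit-code semantics are not described here, both are returned unparsed.

```python
import shutil
import subprocess


def check_permissions(binary="cua-driver"):
    """Run `<binary> call check_permissions` and return (exit_code, output).

    Returns None when the binary is not on PATH.
    """
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run(
        [path, "call", "check_permissions"],
        capture_output=True,
        text=True,
        timeout=30,
    )
    # Assumption: nothing is known here about the output format or what a
    # non-zero exit code means, so both are handed back for inspection.
    return result.returncode, result.stdout + result.stderr


if __name__ == "__main__":
    outcome = check_permissions()
    if outcome is None:
        print("cua-driver not found on PATH")
    else:
        code, output = outcome
        print(f"exit code {code}")
        print(output)
```

Running it before registering the server with an agent makes a missing binary or missing permission visible early, rather than as a failed tool call inside the agent.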
Claude Code
Standard MCP registration:
claude mcp add --transport stdio cua-driver -- cua-driver mcp
Verify:
claude mcp list
# cua-driver: cua-driver mcp (stdio) - ✓ Connected
Claude Code computer-use compatibility mode
Claude Code vision/computer-use-style flows appear to use the presence of a screenshot tool as a cue for image-grounded operation. If you want that behavior, register the compatibility server instead:
claude mcp add --transport stdio cua-computer-use -- cua-driver mcp --claude-code-computer-use-compat
This mode still exposes the normal CuaDriver tools. The only changed tool is screenshot: it requires pid and window_id, captures that window only, and returns a window-local image coordinate frame. Start with launch_app or list_windows, then call screenshot with the target window.
For this Claude Code vision/computer-use-style path, use MCP rather than shelling out to the CLI. CLI screenshots can still capture windows, but they do not expose the mcp__cua-computer-use__screenshot tool name that Claude Code appears to use as the image-grounding cue.
This does not call Anthropic APIs or expose Anthropic's native computer-use API tool. It is a CuaDriver MCP compatibility mode for Claude Code.
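The call order described above (find a window first, then capture it) can be pictured as two MCP `tools/call` requests. This is a rough sketch of what a client would send over stdio, not something you normally write by hand: the `tool_call` helper is hypothetical, the `pid` and `window_id` values are placeholders, and calling `list_windows` with no arguments is an assumption.

```python
import json


def tool_call(request_id, name, arguments):
    """Build an MCP `tools/call` JSON-RPC 2.0 request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Step 1: discover windows. Passing no arguments is an assumption.
list_req = tool_call(1, "list_windows", {})

# Step 2: capture one window. The pid and window_id values are placeholders;
# real values would come from the list_windows (or launch_app) result.
shot_req = tool_call(2, "screenshot", {"pid": 1234, "window_id": 42})

for req in (list_req, shot_req):
    print(json.dumps(req))
```

Because the compatibility-mode screenshot returns a window-local coordinate frame, any coordinates the model derives from the image refer to that window rather than to the full screen.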
GitHub Copilot CLI
Add to ~/.copilot/mcp-config.json:
{
  "mcpServers": {
    "cua-driver": {
      "type": "local",
      "command": "cua-driver",
      "args": ["mcp"],
      "tools": ["*"]
    }
  }
}
Or interactively inside gh copilot chat:
/mcp add
Fill in: name=cua-driver, type=STDIO, command=cua-driver, args=mcp. Press Ctrl+S to save.
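If ~/.copilot/mcp-config.json already lists other servers, pasting the JSON above over it would drop them. A minimal sketch of a merge that keeps existing entries, assuming the file uses the mcpServers shape shown above; the helper name `add_cua_driver` is hypothetical.

```python
import json
from pathlib import Path

# Same entry as the JSON snippet for GitHub Copilot CLI.
CUA_ENTRY = {
    "type": "local",
    "command": "cua-driver",
    "args": ["mcp"],
    "tools": ["*"],
}


def add_cua_driver(config_path):
    """Add the cua-driver entry to an MCP config file, keeping other servers."""
    path = Path(config_path)
    text = path.read_text() if path.exists() else ""
    # A missing or empty file is treated as an empty config.
    config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})["cua-driver"] = CUA_ENTRY
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

It could be called as `add_cua_driver(Path.home() / ".copilot" / "mcp-config.json")`. An existing cua-driver entry is overwritten; every other server is left untouched.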
Codex (OpenAI)
codex mcp add cua-driver -- cua-driver mcp
Cursor
Generate the config snippet and paste it into ~/.cursor/mcp.json:
cua-driver mcp-config --client cursor
Gemini CLI
Add to ~/.gemini/settings.json:
{
  "mcp": {
    "servers": {
      "cua-driver": {
        "type": "stdio",
        "command": "cua-driver",
        "args": ["mcp"]
      }
    }
  }
}
Tools appear prefixed as mcp_cua-driver_*.
OpenCode
cua-driver mcp-config --client opencode
Paste the output into ~/.config/opencode/config.json (global) or opencode.json at the project root.
Always configure Cua Driver as an MCP server; never rely on the CLI fallback. If MCP is not
wired up, OpenCode calls cua-driver as a shell subprocess. The get_window_state response no
longer includes base64 by default, but on the CLI path the screenshot image block is still
silently dropped, so the model receives only the AX tree with no visual context. If you must
use the CLI path, pass --screenshot-out-file or the screenshot_out_file param to preserve the image.
Local vision models (Ollama)
If you are using a vision-capable model via Ollama, you must also declare its input modalities in config.json — otherwise OpenCode strips images before they reach the model:
{
  "mcp": {
    "cua-driver": {
      "type": "local",
      "command": ["/Users/you/.local/bin/cua-driver", "mcp"],
      "enabled": true
    }
  },
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "options": { "baseURL": "http://localhost:11434/v1" },
      "models": {
        "llama3.2-vision": {
          "modalities": {
            "input": ["text", "image"],
            "output": ["text"]
          }
        }
      }
    }
  }
}
The modalities field is required because OpenCode's @ai-sdk/openai-compatible provider defaults to text-only when no capabilities are declared. Without it, screenshots are replaced with an error string and never reach the model.
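A missing modalities block fails quietly, so it can help to check the config before a session. A minimal sketch, assuming the provider/models/modalities layout shown in the JSON above; the function name `models_without_image_input` and the `example-model` entry are hypothetical.

```python
def models_without_image_input(config):
    """Return "provider/model" ids whose declared input modalities lack "image"."""
    missing = []
    for provider_id, provider in config.get("provider", {}).items():
        for model_id, model in provider.get("models", {}).items():
            inputs = model.get("modalities", {}).get("input", [])
            if "image" not in inputs:
                missing.append(f"{provider_id}/{model_id}")
    return missing


config = {
    "provider": {
        "ollama": {
            "models": {
                "llama3.2-vision": {
                    "modalities": {"input": ["text", "image"], "output": ["text"]}
                },
                # Hypothetical entry with no modalities block.
                "example-model": {},
            }
        }
    }
}

print(models_without_image_input(config))  # ['ollama/example-model']
```

The check also flags models that really are text-only, so it is only meaningful for models you know to be vision-capable.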
Hermes (NousResearch)
cua-driver mcp-config --client hermes
Paste the output into ~/.hermes/config.yaml.
OpenClaw
cua-driver mcp-config --client openclaw
Antigravity CLI (and Antigravity IDE)
Google's Antigravity CLI — the agy binary that replaced Gemini CLI on May 19, 2026 — reads MCP server configs from ~/.gemini/config/mcp_config.json (Unix) or %USERPROFILE%\.gemini\config\mcp_config.json (Windows). The same file is read by Antigravity IDE.
cua-driver mcp-config --client antigravity
Output is a snippet to paste under the top-level mcpServers object in that file. If the file doesn't exist yet, create it with {"mcpServers": {}} first. Antigravity CLI has no in-shell mcp add subcommand; edit the JSON file directly, then restart agy to pick up the change.
--client gemini is accepted as a legacy alias and emits the same instructions — Antigravity inherited the ~/.gemini/ config tree from the old Gemini CLI install on purpose, so existing setups carry over without re-registration.
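Creating the file when it does not exist yet can be scripted. A minimal sketch, assuming the path given above; the helper names `antigravity_config_path` and `ensure_config` are hypothetical. `Path.home()` resolves to %USERPROFILE% on Windows, so one expression covers both platforms.

```python
import json
from pathlib import Path


def antigravity_config_path(home=None):
    """Return <home>/.gemini/config/mcp_config.json (home defaults to Path.home())."""
    base = Path(home) if home is not None else Path.home()
    return base / ".gemini" / "config" / "mcp_config.json"


def ensure_config(path):
    """Create the file with an empty mcpServers object unless it already exists.

    Returns True if the file was created, False if it was left alone.
    """
    path = Path(path)
    if path.exists():
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"mcpServers": {}}) + "\n")
    return True
```

Calling `ensure_config(antigravity_config_path())` leaves an existing file untouched, so it is safe to run before pasting the generated snippet under mcpServers.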
Other clients
For any client that accepts the standard mcpServers shape:
cua-driver mcp-config
Output:
{
  "mcpServers": {
    "cua-driver": {
      "command": "cua-driver",
      "args": ["mcp"]
    }
  }
}