Linux pre-release
Display server (X11 / Wayland), AT-SPI prerequisites, autostart, and headless workflows for the Linux pre-release backend.
The Linux backend is available as a pre-release target. It has Linux implementations under crates/platform-linux/, but it has not completed the same official release and platform validation pass as macOS and Windows. Use it for early testing, expect gaps, and report issues before relying on it in production workflows.
For the install one-liner itself, see Installation — the canonical script auto-detects Linux and routes to the pre-release Rust backend.
Display server: X11 vs Wayland vs XWayland
Cua Driver's window enumeration, input synthesis, and screen capture talk to your display server directly. The Linux pre-release backend targets X11 (including the XWayland compatibility server on Wayland sessions), and probes which one you're on at startup:
cua-driver doctor

The display server probe surfaces:
- DISPLAY set, WAYLAND_DISPLAY unset: pure X11 session. This is the expected path for the current pre-release backend.
- WAYLAND_DISPLAY set, DISPLAY set: Wayland session with XWayland. cua-driver talks to XWayland (treats the session as X11). Native-Wayland-only apps that bypass XWayland aren't visible to the tools; this is a known limitation.
- WAYLAND_DISPLAY set, DISPLAY unset: pure Wayland session, no XWayland. Not supported today. Run an X11 session, or install / enable XWayland.
- Neither set: headless. Window-driving tools will return errors. See Headless workflow for Xvfb.
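The four cases above can be reproduced by hand before running the driver. The following is a minimal sketch, not cua-driver's own probe: classify_display_session and its labels are hypothetical names chosen for illustration, and the function only inspects the two values passed to it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of cua-driver): classify a session from the
# values of DISPLAY and WAYLAND_DISPLAY, mirroring the four cases listed above.
classify_display_session() {
  local display="$1" wayland="$2"
  if [ -n "$display" ] && [ -z "$wayland" ]; then
    echo "x11"            # pure X11 session
  elif [ -n "$display" ] && [ -n "$wayland" ]; then
    echo "xwayland"       # Wayland session with XWayland available
  elif [ -z "$display" ] && [ -n "$wayland" ]; then
    echo "wayland-only"   # pure Wayland, not supported today
  else
    echo "headless"       # neither variable set
  fi
}
```

Calling classify_display_session "${DISPLAY:-}" "${WAYLAND_DISPLAY:-}" in the shell that will launch the driver prints one of the four labels. Treat cua-driver doctor as the authoritative check; this sketch only shows the decision logic.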
Wayland caveat. Wayland's security model deliberately isolates apps from each other's window
state and input, which is exactly the access cua-driver needs to do its job. The Rust port works around this by
talking to XWayland, which acts as an X11 proxy for Wayland clients. Apps that render natively
against Wayland (no XWayland) — most modern Firefox builds, GTK4 apps, etc. — won't show up in
list_windows and aren't clickable. If your target app is one of those, run an X11 session (most
distros let you pick X11 vs Wayland at the login screen).
AT-SPI prerequisite
The accessibility-tree tools (get_window_state, click with element_index, anything that walks UIA-equivalent structure) talk to the AT-SPI 2 D-Bus bus. AT-SPI is the Linux accessibility framework — same role as macOS's AX or Windows' UI Automation. Cua Driver expects the AT-SPI bus to be reachable on your session bus.
cua-driver probes this at startup via gdbus:
cua-driver doctor
# [ok ] AT-SPI bus: org.a11y.Bus reachable on session bus
# ...or, if not:
# [warn] AT-SPI bus: org.a11y.Bus not reachable
# install at-spi2-core (or equivalent) and ensure your desktop is
# running with accessibility enabled

If the probe fails, install the AT-SPI service and enable accessibility:
| Distro | Package | Enable |
|---|---|---|
| Ubuntu / Debian | at-spi2-core (usually pre-installed under GNOME) | gsettings set org.gnome.desktop.interface toolkit-accessibility true |
| Fedora | at-spi2-core | same gsettings command |
| Arch | at-spi2-core (extra) | gsettings set … if running GNOME; KDE has its own toggle |
| Alpine / minimal | at-spi2-core dbus dbus-x11 | run dbus-launch --exit-with-session from your X startup script |
You don't need accessibility enabled system-wide; it only needs to be on for the user session that's running Cua Driver. Most desktop environments toggle this automatically when an accessibility client connects.
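To check the AT-SPI bus by hand, one option is to ask the org.a11y.Bus service on the session bus for its address with gdbus. This is a sketch under assumptions: atspi_bus_reachable is a hypothetical helper, and the exact call cua-driver makes is not documented here, so the GetAddress request below is only one plausible way to test reachability.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not cua-driver's own probe): returns 0 when the
# org.a11y.Bus service answers on the session bus, 2 when gdbus is not
# installed, and a non-zero gdbus status otherwise.
atspi_bus_reachable() {
  command -v gdbus >/dev/null 2>&1 || return 2
  gdbus call --session \
    --dest org.a11y.Bus \
    --object-path /org/a11y/bus \
    --method org.a11y.Bus.GetAddress >/dev/null 2>&1
}

# Usage (not executed here):
#   if atspi_bus_reachable; then echo "AT-SPI bus reachable"; fi
```

Run it inside the same user session (and with the same DBUS_SESSION_BUS_ADDRESS) that cua-driver will use, since the result depends on which session bus the shell is attached to.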
Why AT-SPI and not raw X11 properties? Many Linux apps don't expose their UI structure through
X11 window manager hints — only through AT-SPI. Without AT-SPI, click would have to fall back to
raw pixel coordinates from screenshots (the same vision-only mode that works as a fallback). With
AT-SPI, Cua Driver can address elements by their accessible name / role / index, matching the
agent ergonomics on macOS (AX) and Windows (UIA).
Autostart on Linux
The cua-driver autostart verb family is Windows-only today. On Linux it currently returns:
cua-driver autostart is currently Windows-only. macOS users: see
libs/cua-driver/scripts/install-local.sh --autostart for the
LaunchAgent recipe. Linux users: same script registers a systemd
--user unit. A cross-platform impl is tracked as a follow-up.

The two working alternatives:
Option A — Use install-local.sh --autostart
The dev-loop install script registers a systemd user unit when invoked with --autostart. See libs/cua-driver/scripts/install-local.sh — built for local development off a git checkout, but the systemd unit it writes is the same shape you'd use in production.
Option B — Write your own systemd user unit
Save the following to ~/.config/systemd/user/cua-driver.service:
[Unit]
Description=cua-driver background daemon
# Wait for the graphical session (DISPLAY / WAYLAND_DISPLAY will be set).
After=graphical-session.target
PartOf=graphical-session.target
[Service]
Type=simple
ExecStart=%h/.local/bin/cua-driver serve
Restart=on-failure
RestartSec=2
[Install]
WantedBy=graphical-session.target

Then enable + start:
systemctl --user daemon-reload
systemctl --user enable --now cua-driver.service
systemctl --user status cua-driver.service # confirm
cua-driver status # confirm via daemon socket

On most distros you'll also want loginctl enable-linger $USER if you intend the daemon to keep running after the user logs out (e.g. CI runners that ssh in to drive cua-driver but never have an interactive session).
See the Autostart concept page for the cross-platform breakdown.
Headless workflow
Linux doesn't have Windows' Session 0 isolation — sshd spawns processes that inherit the user's environment cleanly, including DISPLAY / WAYLAND_DISPLAY when they're set. So the "SSH + daemon proxy" dance that's necessary on Windows (Running Cua Driver under SSH on Windows) is not needed on Linux: as long as a display server is running on your session, cua-driver serve over SSH connects to it.
For pure headless (no display server at all — typical CI runner), use Xvfb to provide a virtual X server:
# Install Xvfb (Ubuntu/Debian: xvfb; Fedora: xorg-x11-server-Xvfb; Arch: xorg-server-xvfb)
sudo apt-get install -y xvfb at-spi2-core dbus-x11
# Run Cua Driver under a virtual X server:
xvfb-run -a cua-driver serve
# Or, more explicitly, launch Xvfb yourself + point cua-driver at it:
Xvfb :99 -screen 0 1920x1080x24 &
export DISPLAY=:99
dbus-launch --exit-with-session cua-driver serve

This is a useful recipe for local CI experiments with the Linux pre-release backend.
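When Xvfb is started in the background, the driver can race the X server's startup. A small readiness loop is one way to avoid that. The sketch below assumes xdpyinfo is installed (commonly packaged as x11-utils on Debian-family distros); wait_for_display is a hypothetical helper name, not something cua-driver ships.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll an X display until it accepts connections.
# $1 = display name (e.g. :99), $2 = number of attempts (default 10),
# with a one-second pause after each failed attempt.
wait_for_display() {
  local display="$1" tries="${2:-10}" i=0
  while [ "$i" -lt "$tries" ]; do
    if xdpyinfo -display "$display" >/dev/null 2>&1; then
      return 0   # the server answered, so the display is ready
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1       # gave up after the requested number of attempts
}

# Usage (not executed here):
#   Xvfb :99 -screen 0 1920x1080x24 &
#   wait_for_display :99 && DISPLAY=:99 dbus-launch --exit-with-session cua-driver serve
```

xvfb-run already waits for the server before launching its command, so this loop is only relevant when Xvfb is launched manually.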
Tools that don't need a display. Even without DISPLAY set, the non-graphical tools
(list_apps, launch_app via PATH lookup, read_clipboard if a clipboard daemon is up) still
work. Only the desktop-touching tools (click, screenshot, list_windows, get_window_state)
require a display server. cua-driver doctor makes this explicit per-tool.
Distro-specific notes
The canonical install script (/bin/bash -c "$(curl -fsSL …/install.sh)") downloads a Linux x86_64 binary tarball from GitHub Releases, drops it into ~/.cua-driver/packages/releases/<v>-x86_64-unknown-linux-gnu/, and symlinks ~/.local/bin/cua-driver. Because Linux support is pre-release, expect distro-specific gaps and prerequisites.
The bit that varies is what accessibility / display tooling is pre-installed:
- Ubuntu / Debian: GNOME ships AT-SPI by default; at-spi2-core is usually present. KDE Plasma needs at-spi2-core installed manually.
- Fedora: GNOME ships AT-SPI; same as Ubuntu under GNOME. Wayland is the default session on Fedora 25+; if AT-SPI feels flaky, switch to "GNOME on Xorg" at the login screen.
- Arch / Manjaro: minimal install; you'll likely need to explicitly install at-spi2-core and ensure dbus-launch runs as part of your session startup.
- Alpine / busybox-style: slimmest of all; needs dbus, dbus-x11, at-spi2-core plus likely an explicit dbus-launch --exit-with-session wrapper.
If cua-driver doctor is happy after install, you have a reasonable starting point for testing the pre-release backend.
Per-tool Linux implementation matrix
See libs/cua-driver/rust/PARITY.md for the per-tool Linux implementation matrix. Treat it as implementation tracking, not an official support certification; the backend remains pre-release until Linux testing and release coverage are complete.
See also
- Installation: canonical one-liner that handles the Linux pre-release backend
- Autostart: concept page (Windows-only verb family; Linux uses systemd)
- Process model: Linux has none of the macOS-TCC or Windows-Session-0 weirdness; this page explains what those problems are and why Linux sidesteps them
- PARITY.md: per-tool platform implementation matrix