
Computer SDK API Reference

Python API reference for controlling virtual machines and computer interfaces

v0.5.12

pip install cua-computer

Cua Computer Interface for cross-platform computer control.

Classes

Class | Description
Computer | Computer is the main class for interacting with the computer.
VMProviderType | Enum of supported VM provider types.

Computer

Computer is the main class for interacting with the computer.

Constructor

Computer(self, display: Union[Display, Dict[str, int], str] = '1024x768', memory: str = '8GB', cpu: str = '4', os_type: OSType = 'macos', name: str = '', image: Optional[str] = None, shared_directories: Optional[List[str]] = None, use_host_computer_server: bool = False, verbosity: Union[int, LogLevel] = logging.INFO, telemetry_enabled: bool = True, provider_type: Union[str, VMProviderType] = VMProviderType.LUME, provider_port: Optional[int] = 7777, noVNC_port: Optional[int] = 8006, api_port: Optional[int] = None, host: str = 'localhost', api_host: Optional[str] = None, storage: Optional[str] = None, ephemeral: bool = False, api_key: Optional[str] = None, experiments: Optional[List[str]] = None, timeout: int = 100, run_opts: Optional[Dict[str, Any]] = None)

Attributes

Name | Type | Description
logger | Any |
image | Any |
host | Any |
provider_port | Any |
noVNC_port | Any |
api_port | Any |
api_host | Any |
os_type | Any |
provider_type | Any |
ephemeral | Any |
api_key | Any |
timeout | Any |
experiments | Any |
custom_run_opts | Any |
storage | Any |
shared_path | Any |
verbosity | Any |
vm_logger | Any |
interface_logger | Any |
config | Any |
shared_directories | Any |
use_host_computer_server | Any |
interface | Any | Get the computer interface for interacting with the VM.
tracing | ComputerTracing | Get the computer tracing instance for recording sessions.
telemetry_enabled | bool | Check if telemetry is enabled for this computer instance.

Methods

Computer.create_desktop_from_apps

def create_desktop_from_apps(self, apps)

Create a virtual desktop from a list of app names, returning a DioramaComputer that proxies Diorama.Interface but uses diorama_cmds via the computer interface.

Parameters:

Name | Type | Description
apps | list[str] | List of application names to include in the desktop.

Returns: DioramaComputer: A proxy object with the Diorama interface, but using diorama_cmds.

Computer.run

async def run(self) -> Optional[str]

Initialize the VM and computer interface.

Computer.disconnect

async def disconnect(self) -> None

Disconnect from the computer's WebSocket interface.

Computer.stop

async def stop(self) -> None

Disconnect from the computer's WebSocket interface and stop the computer.
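Because run() and stop() are both coroutines, they pair naturally as setup and teardown. The following is a minimal sketch of that pairing, not part of the SDK: `running` and `FakeComputer` are hypothetical names introduced here, and the fake only stands in for a real Computer so the snippet runs without a VM.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def running(computer):
    """Call run() on entry and always call stop() on exit."""
    await computer.run()
    try:
        yield computer
    finally:
        # Runs even if the body raises, so the VM is not left running.
        await computer.stop()


class FakeComputer:
    """Hypothetical stand-in that records lifecycle calls."""

    def __init__(self):
        self.calls = []

    async def run(self):
        self.calls.append("run")

    async def stop(self):
        self.calls.append("stop")


async def demo():
    fake = FakeComputer()
    async with running(fake):
        fake.calls.append("work")
    return fake.calls


calls = asyncio.run(demo())
```

With a real instance, the body of the `async with` block is where calls on `computer.interface` would go.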

Computer.start

async def start(self) -> None

Start the computer.

Computer.restart

async def restart(self) -> None

Restart the computer.

If using a VM provider that supports restart, this will issue a restart without tearing down the provider context, then reconnect the interface. Falls back to stop()+run() when a provider restart is not available.

Computer.get_ip

async def get_ip(self, max_retries: int = 15, retry_delay: int = 3) -> str

Get the IP address of the VM or localhost if using host computer server.

This method delegates to the provider's get_ip method, which waits indefinitely until the VM has a valid IP address.

Parameters:

Name | Type | Description
max_retries | int | Unused parameter, kept for backward compatibility
retry_delay | int | Delay between retries in seconds (default: 3)

Returns: IP address of the VM or localhost if using host computer server
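Since the call is described as waiting indefinitely, a caller may want to bound it. A minimal sketch, assuming only that get_ip() is a coroutine as documented; `get_ip_or_none` and `FakeComputer` are hypothetical names, and the address returned by the fake is made up for the demonstration.

```python
import asyncio
from typing import Optional


async def get_ip_or_none(computer, timeout_s: float) -> Optional[str]:
    """Return the IP address, or None if it is not available within timeout_s."""
    try:
        return await asyncio.wait_for(computer.get_ip(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None


class FakeComputer:
    """Hypothetical stand-in whose get_ip() resolves after a fixed delay."""

    def __init__(self, delay_s: float):
        self.delay_s = delay_s

    async def get_ip(self) -> str:
        await asyncio.sleep(self.delay_s)
        return "192.168.64.2"  # made-up address for the demo


fast = asyncio.run(get_ip_or_none(FakeComputer(0.0), timeout_s=1.0))
slow = asyncio.run(get_ip_or_none(FakeComputer(1.0), timeout_s=0.05))
```

`asyncio.wait_for` cancels the pending call when the timeout expires, so the caller regains control instead of blocking on a VM that never obtains an address.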

Computer.wait_vm_ready

async def wait_vm_ready(self) -> Optional[Dict[str, Any]]

Wait for VM to be ready with an IP address.

Returns: VM status information or None if using host computer server.

Computer.update

async def update(self, cpu: Optional[int] = None, memory: Optional[str] = None)

Update VM settings.

Computer.get_screenshot_size

def get_screenshot_size(self, screenshot: bytes) -> Dict[str, int]

Get the dimensions of a screenshot.

Parameters:

Name | Type | Description
screenshot | bytes | The screenshot bytes

Returns: Dict[str, int]: Dictionary containing 'width' and 'height' of the image
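For PNG data, the dimensions can be read from the image header without decoding any pixels. This is a sketch of that idea, not the SDK's implementation. It assumes the screenshot bytes are PNG, and `png_size` is a hypothetical helper.

```python
import struct
from typing import Dict

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> Dict[str, int]:
    """Read width and height from the IHDR chunk of a PNG byte string."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type "IHDR",
    # then width and height as big-endian unsigned 32-bit integers.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    width, height = struct.unpack(">II", data[16:24])
    return {"width": width, "height": height}


# Build just enough of a PNG header to exercise the parser.
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 1024, 768)
size = png_size(header)
```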

Computer.to_screen_coordinates

async def to_screen_coordinates(self, x: float, y: float) -> tuple[float, float]

Convert normalized coordinates to screen coordinates.

Parameters:

Name | Type | Description
x | float | X coordinate between 0 and 1
y | float | Y coordinate between 0 and 1

Returns: tuple[float, float]: Screen coordinates (x, y)

Computer.to_screenshot_coordinates

async def to_screenshot_coordinates(self, x: float, y: float) -> tuple[float, float]

Convert screen coordinates to screenshot coordinates.

Parameters:

Name | Type | Description
x | float | X coordinate in screen space
y | float | Y coordinate in screen space

Returns: tuple[float, float]: (x, y) coordinates in screenshot space
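The two conversions can be pictured as plain linear scaling. The sketch below assumes exactly that; whether the SDK applies any further adjustment is not stated here. `normalized_to_screen` and `screen_to_screenshot` are hypothetical helper names.

```python
from typing import Tuple


def normalized_to_screen(
    x: float, y: float, screen_w: int, screen_h: int
) -> Tuple[float, float]:
    """Scale normalized coordinates in [0, 1] up to screen pixels."""
    return (x * screen_w, y * screen_h)


def screen_to_screenshot(
    x: float,
    y: float,
    screen_size: Tuple[int, int],
    screenshot_size: Tuple[int, int],
) -> Tuple[float, float]:
    """Rescale a screen-space point into screenshot space."""
    scale_x = screenshot_size[0] / screen_size[0]
    scale_y = screenshot_size[1] / screen_size[1]
    return (x * scale_x, y * scale_y)


center = normalized_to_screen(0.5, 0.5, 1024, 768)
in_screenshot = screen_to_screenshot(512, 384, (1024, 768), (2048, 1536))
```

The second helper matters when a screenshot is captured at a different resolution than the logical screen, for example on a high-density display.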

Computer.playwright_exec

async def playwright_exec(self, command: str, params: Optional[Dict] = None) -> Dict[str, Any]

Execute a Playwright browser command.

Parameters:

Name | Type | Description
command | str | The browser command to execute (visit_url, click, type, scroll, web_search)
params | Optional[Dict] | Command parameters

Returns: Dict containing the command result

Example:

# Navigate to a URL
await computer.playwright_exec("visit_url", {"url": "https://example.com"})

# Click at coordinates
await computer.playwright_exec("click", {"x": 100, "y": 200})

# Type text
await computer.playwright_exec("type", {"text": "Hello, world!"})

# Scroll
await computer.playwright_exec("scroll", {"delta_x": 0, "delta_y": -100})

# Web search
await computer.playwright_exec("web_search", {"query": "computer use agent"})
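When several browser commands run in sequence, it can help to check the command names before sending anything. A minimal sketch under that assumption; `run_browser_steps` and `FakeComputer` are hypothetical, and the fake simply echoes its input rather than imitating the real result format.

```python
import asyncio
from typing import Any, Dict, List, Optional, Tuple

# Command names taken from the parameter description above.
KNOWN_COMMANDS = {"visit_url", "click", "type", "scroll", "web_search"}

Step = Tuple[str, Optional[Dict[str, Any]]]


async def run_browser_steps(computer, steps: List[Step]) -> List[Dict[str, Any]]:
    """Validate every command name up front, then run the steps in order."""
    for command, _ in steps:
        if command not in KNOWN_COMMANDS:
            raise ValueError(f"unknown browser command: {command!r}")
    results = []
    for command, params in steps:
        results.append(await computer.playwright_exec(command, params))
    return results


class FakeComputer:
    """Hypothetical stand-in that echoes each command back."""

    async def playwright_exec(self, command, params=None):
        return {"command": command, "params": params}


steps = [
    ("visit_url", {"url": "https://example.com"}),
    ("scroll", {"delta_x": 0, "delta_y": -100}),
]
results = asyncio.run(run_browser_steps(FakeComputer(), steps))
```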

Computer.venv_install

async def venv_install(self, venv_name: str, requirements: list[str])

Install packages in a UV project.

Parameters:

Name | Type | Description
venv_name | str | Name of the UV project
requirements | list[str] | List of package requirements to install

Returns: Tuple of (stdout, stderr) from the installation command

Computer.pip_install

async def pip_install(self, requirements: list[str])

Install packages using the system Python with UV (no venv).

Parameters:

Name | Type | Description
requirements | list[str] | List of package requirements to install globally/user site.

Returns: Tuple of (stdout, stderr) from the installation command

Computer.venv_cmd

async def venv_cmd(self, venv_name: str, command: str)

Execute a shell command in a UV project.

Parameters:

Name | Type | Description
venv_name | str | Name of the UV project
command | str | Shell command to execute in the UV project

Returns: Tuple of (stdout, stderr) from the command execution

Computer.venv_exec

async def venv_exec(self, venv_name: str, python_func, *args, **kwargs)

Execute a Python function in a virtual environment using source code extraction.

Parameters:

Name | Type | Description
venv_name | Any | Name of the virtual environment
python_func | Any | A callable function to execute
*args | Any | Positional arguments to pass to the function
**kwargs | Any | Keyword arguments to pass to the function

Returns: The result of the function execution, or raises any exception that occurred

Computer.venv_exec_background

async def venv_exec_background(self, venv_name: str, python_func, *args, requirements: Optional[List[str]] = None, **kwargs) -> int

Run the Python function in the venv in the background and return the PID.

Uses a short Python launcher that spawns a detached child and exits immediately.

Computer.python_exec

async def python_exec(self, python_func, *args, **kwargs)

Execute a Python function using the system Python (no venv).

Uses source extraction and base64 transport, mirroring venv_exec but without virtual environment activation.

Returns the function result or raises a reconstructed exception with remote traceback context appended.
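The combination of source extraction and base64 transport can be illustrated in isolation. The sketch below is not the SDK's wire format: the payload layout, `pack_call`, and `run_packed_call` are assumptions made for illustration, and the function source is given as a literal string rather than extracted from a live function.

```python
import base64
import json


def pack_call(source: str, func_name: str, args: list, kwargs: dict) -> str:
    """Bundle function source and call arguments into one base64 string."""
    payload = {"source": source, "func_name": func_name, "args": args, "kwargs": kwargs}
    return base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")


def run_packed_call(encoded: str):
    """Decode the bundle, define the function in a fresh namespace, and call it."""
    payload = json.loads(base64.b64decode(encoded).decode("utf-8"))
    namespace: dict = {}
    exec(payload["source"], namespace)  # only safe for source you trust
    func = namespace[payload["func_name"]]
    return func(*payload["args"], **payload["kwargs"])


ADD_SOURCE = "def add(a, b):\n    return a + b\n"
encoded = pack_call(ADD_SOURCE, "add", [2, 3], {})
result = run_packed_call(encoded)
```

Encoding the bundle as base64 keeps quotes and newlines in the source from interfering with whatever command line or message carries it to the remote interpreter.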

Computer.python_exec_background

async def python_exec_background(self, python_func, *args, requirements: Optional[List[str]] = None, **kwargs) -> int

Run a Python function with the system interpreter in the background and return the PID.

Uses a short Python launcher that spawns a detached child and exits immediately.

Computer.python_command

def python_command(self, requirements: Optional[List[str]] = None, venv_name: str = 'default', use_system_python: bool = False, background: bool = False) -> Callable[[Callable[P, R]], Callable[P, Awaitable[R]]]

Decorator to execute a Python function remotely in this Computer's venv.

This mirrors computer.helpers.sandboxed() but binds to this instance and optionally ensures required packages are installed before execution.

Parameters:

Name | Type | Description
requirements | Optional[List[str]] | Packages to install in the virtual environment.
venv_name | str | Name of the virtual environment to use.
use_system_python | bool | If True, use the system Python/pip instead of a venv.
background | bool | If True, run the function detached and return the child PID immediately.

Returns: A decorator that turns a local function into an async callable which runs remotely and returns the function's result.


VMProviderType

Inherits from: StrEnum

Enum of supported VM provider types.

Attributes

Name | Type | Description
LUME | Any |
LUMIER | Any |
CLOUD | Any |
CLOUDV2 | Any |
WINSANDBOX | Any |
DOCKER | Any |
UNKNOWN | Any |

tracing

Computer tracing functionality for recording sessions.

This module provides a Computer.tracing API inspired by Playwright's tracing functionality, allowing users to record computer interactions for debugging, training, and analysis.


ComputerTracing

Computer tracing class that records computer interactions and saves them to disk.

This class provides a flexible API for recording computer sessions with configurable options for what to record (screenshots, API calls, video, etc.).

Constructor

ComputerTracing(self, computer_instance)

Attributes

Name | Type | Description
is_tracing | bool | Check if tracing is currently active.

Methods

ComputerTracing.start

async def start(self, config: Optional[Dict[str, Any]] = None) -> None

Start tracing with the specified configuration.

Parameters:

Name | Type | Description
config | Optional[Dict[str, Any]] | Tracing configuration dict with the following options:

  • video: bool - Record video frames (default: False)
  • screenshots: bool - Record screenshots (default: True)
  • api_calls: bool - Record API calls and results (default: True)
  • accessibility_tree: bool - Record accessibility tree snapshots (default: False)
  • metadata: bool - Record custom metadata (default: True)
  • name: str - Custom trace name (default: auto-generated)
  • path: str - Custom trace directory path (default: auto-generated)
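A config dict for start() can be assembled from the documented defaults, with a check that catches misspelled option names early. This is a minimal sketch: `build_trace_config` is a hypothetical helper, and the default values are copied from the option descriptions above.

```python
from typing import Any, Dict

# Defaults as documented for ComputerTracing.start.
DEFAULT_TRACE_CONFIG: Dict[str, Any] = {
    "video": False,
    "screenshots": True,
    "api_calls": True,
    "accessibility_tree": False,
    "metadata": True,
}
# Options without a fixed default (auto-generated when omitted).
OPTIONAL_KEYS = {"name", "path"}


def build_trace_config(**overrides: Any) -> Dict[str, Any]:
    """Merge overrides into the documented defaults, rejecting unknown keys."""
    unknown = set(overrides) - set(DEFAULT_TRACE_CONFIG) - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"unknown tracing options: {sorted(unknown)}")
    return {**DEFAULT_TRACE_CONFIG, **overrides}


config = build_trace_config(video=True, name="login-flow")
```

The resulting dict could then be passed as the `config` argument, for example `await computer.tracing.start(config)`.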

ComputerTracing.stop

async def stop(self, options: Optional[Dict[str, Any]] = None) -> str

Stop tracing and save the trace data.

Parameters:

Name | Type | Description
options | Optional[Dict[str, Any]] | Stop options dict with the following options:

  • path: str - Custom output path for the trace archive
  • format: str - Output format ('zip' or 'dir', default: 'zip')

Returns: str: Path to the saved trace file or directory

ComputerTracing.record_api_call

async def record_api_call(self, method: str, args: Dict[str, Any], result: Any = None, error: Optional[Exception] = None) -> None

Record an API call event.

Parameters:

Name | Type | Description
method | str | The method name that was called
args | Dict[str, Any] | Arguments passed to the method
result | Any | Result returned by the method
error | Optional[Exception] | Exception raised by the method, if any
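A caller recording its own events might wrap each call so that either the result or the exception is captured. A minimal sketch, assuming only the record_api_call signature shown above; `call_and_record`, `FakeTracing`, and `double` are hypothetical names used for the demonstration.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


async def call_and_record(
    tracing,
    method: str,
    func: Callable[..., Awaitable[Any]],
    args: Dict[str, Any],
) -> Any:
    """Await func(**args) and record its outcome, re-raising any exception."""
    try:
        result = await func(**args)
    except Exception as exc:
        await tracing.record_api_call(method, args, error=exc)
        raise
    await tracing.record_api_call(method, args, result=result)
    return result


class FakeTracing:
    """Hypothetical stand-in that keeps recorded events in a list."""

    def __init__(self):
        self.events = []

    async def record_api_call(self, method, args, result=None, error=None):
        self.events.append((method, args, result, error))


async def double(value: int) -> int:
    return value * 2


async def demo():
    tracing = FakeTracing()
    result = await call_and_record(tracing, "double", double, {"value": 21})
    return result, tracing.events


result, events = asyncio.run(demo())
```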

ComputerTracing.record_accessibility_tree

async def record_accessibility_tree(self) -> None

Record the current accessibility tree if enabled.

ComputerTracing.add_metadata

async def add_metadata(self, key: str, value: Any) -> None

Add custom metadata to the trace.

Parameters:

Name | Type | Description
key | str | Metadata key
value | Any | Metadata value

models

Models for computer configuration.


BaseVMProvider

Inherits from: AsyncContextManager

Base interface for VM providers.

All VM provider implementations must implement this interface.

Attributes

Name | Type | Description
provider_type | VMProviderType | Get the provider type.

Methods

BaseVMProvider.get_vm

async def get_vm(self, name: str, storage: Optional[str] = None) -> Dict[str, Any]

Get VM information by name.

Parameters:

Name | Type | Description
name | str | Name of the VM to get information for
storage | Optional[str] | Optional storage path override. If provided, this will be used instead of the provider's default storage path.

Returns: Dictionary with VM information including status, IP address, etc.

BaseVMProvider.list_vms

async def list_vms(self) -> ListVMsResponse

List all available VMs.

Returns: ListVMsResponse: A list of minimal VM objects as defined in computer.providers.types.MinimalVM.

BaseVMProvider.run_vm

async def run_vm(self, image: str, name: str, run_opts: Dict[str, Any], storage: Optional[str] = None) -> Dict[str, Any]

Run a VM by name with the given options.

Parameters:

Name | Type | Description
image | str | Name/tag of the image to use
name | str | Name of the VM to run
run_opts | Dict[str, Any] | Dictionary of run options (memory, cpu, etc.)
storage | Optional[str] | Optional storage path override. If provided, this will be used instead of the provider's default storage path.

Returns: Dictionary with VM run status and information

BaseVMProvider.stop_vm

async def stop_vm(self, name: str, storage: Optional[str] = None) -> Dict[str, Any]

Stop a VM by name.

Parameters:

Name | Type | Description
name | str | Name of the VM to stop
storage | Optional[str] | Optional storage path override. If provided, this will be used instead of the provider's default storage path.

Returns: Dictionary with VM stop status and information

BaseVMProvider.restart_vm

async def restart_vm(self, name: str, storage: Optional[str] = None) -> Dict[str, Any]

Restart a VM by name.

Parameters:

Name | Type | Description
name | str | Name of the VM to restart
storage | Optional[str] | Optional storage path override. If provided, this will be used instead of the provider's default storage path.

Returns: Dictionary with VM restart status and information

BaseVMProvider.update_vm

async def update_vm(self, name: str, update_opts: Dict[str, Any], storage: Optional[str] = None) -> Dict[str, Any]

Update VM configuration.

Parameters:

Name | Type | Description
name | str | Name of the VM to update
update_opts | Dict[str, Any] | Dictionary of update options (memory, cpu, etc.)
storage | Optional[str] | Optional storage path override. If provided, this will be used instead of the provider's default storage path.

Returns: Dictionary with VM update status and information

BaseVMProvider.get_ip

async def get_ip(self, name: str, storage: Optional[str] = None, retry_delay: int = 2) -> str

Get the IP address of a VM, waiting indefinitely until it's available.

Parameters:

Name | Type | Description
name | str | Name of the VM to get the IP for
storage | Optional[str] | Optional storage path override. If provided, this will be used instead of the provider's default storage path.
retry_delay | int | Delay between retries in seconds (default: 2)

Returns: IP address of the VM when it becomes available


Display

Display configuration.

Constructor

Display(self, width: int, height: int) -> None

Attributes

Name | Type | Description
width | int |
height | int |

Image

VM image configuration.

Constructor

Image(self, image: str, tag: str, name: str) -> None

Attributes

Name | Type | Description
image | str |
tag | str |
name | str |

Computer

Computer configuration.

Constructor

Computer(self, image: str, tag: str, name: str, display: Display, memory: str, cpu: str, vm_provider: Optional[BaseVMProvider] = None) -> None

Attributes

Name | Type | Description
image | str |
tag | str |
name | str |
display | Display |
memory | str |
cpu | str |
vm_provider | Optional[BaseVMProvider] |

Methods

Computer.get_ip

async def get_ip(self) -> Optional[str]

Get the IP address of the VM.


diorama_computer


Key

Inherits from: Enum

Keyboard keys that can be used with press_key.

These key names follow a consistent cross-platform keyboard key naming convention.

Attributes

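Name | Type | Description
PAGE_DOWN | Any |
PAGE_UP | Any |
HOME | Any |
END | Any |
LEFT | Any |
RIGHT | Any |
UP | Any |
DOWN | Any |
RETURN | Any |
ENTER | Any |
ESCAPE | Any |
ESC | Any |
TAB | Any |
SPACE | Any |
BACKSPACE | Any |
DELETE | Any |
ALT | Any |
CTRL | Any |
SHIFT | Any |
WIN | Any |
COMMAND | Any |
OPTION | Any |
F1 | Any |
F2 | Any |
F3 | Any |
F4 | Any |
F5 | Any |
F6 | Any |
F7 | Any |
F8 | Any |
F9 | Any |
F10 | Any |
F11 | Any |
F12 | Any |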

Methods

Key.from_string

def from_string(cls, key: str) -> Key | str

Convert a string key name to a Key enum value.

Parameters:

Name | Type | Description
key | str | String key name to convert

Returns: Key enum value if the string matches a known key, otherwise returns the original string for single character keys
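The lookup-with-fallback behavior described above can be sketched with a small enum. `MiniKey` is a hypothetical three-member subset; its string values and the case-insensitive matching are assumptions for illustration, not the SDK's actual definitions.

```python
from enum import Enum
from typing import Union


class MiniKey(Enum):
    """Hypothetical subset of Key; the string values are assumed."""

    ENTER = "enter"
    ESCAPE = "escape"
    TAB = "tab"

    @classmethod
    def from_string(cls, key: str) -> Union["MiniKey", str]:
        """Return the matching member, or the original string if none matches."""
        normalized = key.strip().lower()
        for member in cls:
            if member.value == normalized:
                return member
        return key


named = MiniKey.from_string("Enter")
literal = MiniKey.from_string("a")
```

Returning the input unchanged lets single characters such as "a" pass straight through to a key-press call.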


DioramaComputer

A Computer-compatible proxy for Diorama that sends commands over the ComputerInterface.

Constructor

DioramaComputer(self, computer, apps)

Attributes

Name | Type | Description
computer | Any |
apps | Any |
interface | Any |

Methods

DioramaComputer.run

async def run(self)

Initialize and run the DioramaComputer if not already initialized.

Returns: self: The DioramaComputer instance


DioramaComputerInterface

Diorama Interface proxy that sends diorama_cmds via the Computer's interface.

Constructor

DioramaComputerInterface(self, computer, apps)

Attributes

Name | Type | Description
computer | Any |
apps | Any |

Methods

DioramaComputerInterface.screenshot

async def screenshot(self, as_bytes = True)

Take a screenshot of the diorama scene.

Parameters:

Name | Type | Description
as_bytes | bool | If True, return image as bytes; if False, return PIL Image object

Returns: bytes or PIL.Image: Screenshot data in the requested format

DioramaComputerInterface.get_screen_size

async def get_screen_size(self)

Get the dimensions of the diorama scene.

Returns: dict: Dictionary containing 'width' and 'height' keys with pixel dimensions

DioramaComputerInterface.move_cursor

async def move_cursor(self, x, y)

Move the cursor to the specified coordinates.

Parameters:

Name | Type | Description
x | int | X coordinate to move cursor to
y | int | Y coordinate to move cursor to

DioramaComputerInterface.left_click

async def left_click(self, x = None, y = None)

Perform a left mouse click at the specified coordinates or current cursor position.

Parameters:

Name | Type | Description
x | int, optional | X coordinate to click at. If None, clicks at current cursor position
y | int, optional | Y coordinate to click at. If None, clicks at current cursor position

DioramaComputerInterface.right_click

async def right_click(self, x = None, y = None)

Perform a right mouse click at the specified coordinates or current cursor position.

Parameters:

Name | Type | Description
x | int, optional | X coordinate to click at. If None, clicks at current cursor position
y | int, optional | Y coordinate to click at. If None, clicks at current cursor position

DioramaComputerInterface.double_click

async def double_click(self, x = None, y = None)

Perform a double mouse click at the specified coordinates or current cursor position.

Parameters:

Name | Type | Description
x | int, optional | X coordinate to double-click at. If None, clicks at current cursor position
y | int, optional | Y coordinate to double-click at. If None, clicks at current cursor position

DioramaComputerInterface.scroll_up

async def scroll_up(self, clicks = 1)

Scroll up by the specified number of clicks.

Parameters:

Name | Type | Description
clicks | int | Number of scroll clicks to perform upward. Defaults to 1

DioramaComputerInterface.scroll_down

async def scroll_down(self, clicks = 1)

Scroll down by the specified number of clicks.

Parameters:

Name | Type | Description
clicks | int | Number of scroll clicks to perform downward. Defaults to 1

DioramaComputerInterface.drag_to

async def drag_to(self, x, y, duration = 0.5)

Drag from the current cursor position to the specified coordinates.

Parameters:

Name | Type | Description
x | int | X coordinate to drag to
y | int | Y coordinate to drag to
duration | float | Duration of the drag operation in seconds. Defaults to 0.5

DioramaComputerInterface.get_cursor_position

async def get_cursor_position(self)

Get the current cursor position.

Returns: dict: Dictionary containing the current cursor coordinates

DioramaComputerInterface.type_text

async def type_text(self, text)

Type the specified text at the current cursor position.

Parameters:

Name | Type | Description
text | str | The text to type

DioramaComputerInterface.press_key

async def press_key(self, key)

Press a single key.

Parameters:

Name | Type | Description
key | Any | The key to press

DioramaComputerInterface.hotkey

async def hotkey(self, *keys)

Press multiple keys simultaneously as a hotkey combination.

Raises:

  • ValueError - If any key is not a Key enum or string type

DioramaComputerInterface.to_screen_coordinates

async def to_screen_coordinates(self, x, y)

Convert coordinates to screen coordinates.

Parameters:

Name | Type | Description
x | int | X coordinate to convert
y | int | Y coordinate to convert

Returns: dict: Dictionary containing the converted screen coordinates


helpers

Helper functions and decorators for the Computer module.


DependencyInfo

Inherits from: TypedDict

Attributes

Name | Type | Description
import_statements | List[str] |
definitions | List[tuple[str, Any]] |

set_default_computer

def set_default_computer(computer: Any) -> None

Set the default computer instance to be used by the remote decorator.

Parameters:

Name | Type | Description
computer | Any | The computer instance to use as default

sandboxed

def sandboxed(venv_name: str = 'default', computer: str = 'default', max_retries: int = 3) -> Callable[[Callable[P, R]], Callable[P, Awaitable[R]]]

Decorator that wraps a function to be executed remotely via computer.venv_exec.

The function is automatically analyzed for dependencies (imports, helper functions, constants, etc.) and reconstructed with all necessary code in the remote sandbox.

Parameters:

Name | Type | Description
venv_name | str | Name of the virtual environment to execute in
computer | Any | The computer instance to use, or "default" to use the globally set default
max_retries | int | Maximum number of retries for the remote execution
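The retry aspect of such a decorator can be sketched independently of remote execution. In this sketch `with_retries` is a hypothetical decorator, and `max_retries` is treated as the total number of attempts; whether the SDK counts the first attempt the same way is not stated here.

```python
import asyncio
import functools


def with_retries(max_retries: int = 3):
    """Return a decorator that re-awaits a coroutine function on failure."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(max_retries):
                try:
                    return await func(*args, **kwargs)
                except Exception as exc:
                    last_error = exc  # remember the failure and try again
            raise last_error

        return wrapper

    return decorator


attempts = {"count": 0}


@with_retries(max_retries=3)
async def flaky() -> str:
    """Fail twice, then succeed, to exercise the retry loop."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


outcome = asyncio.run(flaky())
```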

generate_source_code

def generate_source_code(func: FunctionType) -> str

Generate complete source code for a function with all dependencies.

Parameters:

Name | Type | Description
func | FunctionType | The function to generate source code for

Returns: Complete Python source code as a string


interface

Interface package for Computer SDK.


BaseComputerInterface

Inherits from: ABC

Base class for computer control interfaces.

Constructor

BaseComputerInterface(self, ip_address: str, username: str = 'lume', password: str = 'lume', api_key: Optional[str] = None, vm_name: Optional[str] = None)

Attributes

Name | Type | Description
ip_address | Any |
username | Any |
password | Any |
api_key | Any |
vm_name | Any |
logger | Any |
delay | float |

Methods

BaseComputerInterface.wait_for_ready

async def wait_for_ready(self, timeout: int = 60) -> None

Wait for interface to be ready.

Parameters:

Name | Type | Description
timeout | int | Maximum time to wait in seconds

Raises:

  • TimeoutError - If interface is not ready within timeout
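A readiness wait of this kind is typically a polling loop with a deadline. The following is a generic sketch of that pattern, not the SDK's implementation; `wait_until`, its `interval` parameter, and the readiness check in the demo are all hypothetical.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_until(
    check: Callable[[], Awaitable[bool]],
    timeout: float = 60,
    interval: float = 0.01,
) -> None:
    """Poll check() until it returns True, raising TimeoutError at the deadline."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        if await check():
            return
        if loop.time() >= deadline:
            raise TimeoutError(f"not ready within {timeout} seconds")
        await asyncio.sleep(interval)


async def demo() -> int:
    polls = {"count": 0}

    async def ready_on_third_poll() -> bool:
        polls["count"] += 1
        return polls["count"] >= 3

    await wait_until(ready_on_third_poll, timeout=1.0)
    return polls["count"]


poll_count = asyncio.run(demo())
```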

BaseComputerInterface.close

def close(self) -> None

Close the interface connection.

BaseComputerInterface.force_close

def force_close(self) -> None

Force close the interface connection.

By default, this just calls close(), but subclasses can override to provide more forceful cleanup.

BaseComputerInterface.mouse_down

async def mouse_down(self, x: Optional[int] = None, y: Optional[int] = None, button: MouseButton = 'left', delay: Optional[float] = None) -> None

Press and hold a mouse button.

Parameters:

Name | Type | Description
x | Optional[int] | X coordinate to press at. If None, uses current cursor position.
y | Optional[int] | Y coordinate to press at. If None, uses current cursor position.
button | MouseButton | Mouse button to press ('left', 'middle', 'right').
delay | Optional[float] | Optional delay in seconds after the action

BaseComputerInterface.mouse_up

async def mouse_up(self, x: Optional[int] = None, y: Optional[int] = None, button: MouseButton = 'left', delay: Optional[float] = None) -> None

Release a mouse button.

Parameters:

Name | Type | Description
x | Optional[int] | X coordinate to release at. If None, uses current cursor position.
y | Optional[int] | Y coordinate to release at. If None, uses current cursor position.
button | MouseButton | Mouse button to release ('left', 'middle', 'right').
delay | Optional[float] | Optional delay in seconds after the action
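mouse_down and mouse_up can be combined with move_cursor to build a manual drag. A minimal sketch, assuming only the documented signatures; `press_drag_release` and `FakeInterface` are hypothetical, and the fake records calls instead of driving a real pointer.

```python
import asyncio
from typing import Tuple


async def press_drag_release(
    interface, start: Tuple[int, int], end: Tuple[int, int], button: str = "left"
) -> None:
    """Press at start, move to end, and always release the button."""
    await interface.mouse_down(start[0], start[1], button=button)
    try:
        await interface.move_cursor(end[0], end[1])
    finally:
        # Release even if the move fails, so the button is not left held.
        await interface.mouse_up(end[0], end[1], button=button)


class FakeInterface:
    """Hypothetical stand-in that records pointer events."""

    def __init__(self):
        self.events = []

    async def mouse_down(self, x=None, y=None, button="left", delay=None):
        self.events.append(("mouse_down", x, y, button))

    async def move_cursor(self, x, y, delay=None):
        self.events.append(("move_cursor", x, y))

    async def mouse_up(self, x=None, y=None, button="left", delay=None):
        self.events.append(("mouse_up", x, y, button))


fake = FakeInterface()
asyncio.run(press_drag_release(fake, (10, 20), (200, 300)))
events = fake.events
```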

BaseComputerInterface.left_click

async def left_click(self, x: Optional[int] = None, y: Optional[int] = None, delay: Optional[float] = None) -> None

Perform a left mouse button click.

Parameters:

Name | Type | Description
x | Optional[int] | X coordinate to click at. If None, uses current cursor position.
y | Optional[int] | Y coordinate to click at. If None, uses current cursor position.
delay | Optional[float] | Optional delay in seconds after the action

BaseComputerInterface.right_click

async def right_click(self, x: Optional[int] = None, y: Optional[int] = None, delay: Optional[float] = None) -> None

Perform a right mouse button click.

Parameters:

Name | Type | Description
x | Optional[int] | X coordinate to click at. If None, uses current cursor position.
y | Optional[int] | Y coordinate to click at. If None, uses current cursor position.
delay | Optional[float] | Optional delay in seconds after the action

BaseComputerInterface.double_click

async def double_click(self, x: Optional[int] = None, y: Optional[int] = None, delay: Optional[float] = None) -> None

Perform a double left mouse button click.

Parameters:

Name | Type | Description
x | Optional[int] | X coordinate to double-click at. If None, uses current cursor position.
y | Optional[int] | Y coordinate to double-click at. If None, uses current cursor position.
delay | Optional[float] | Optional delay in seconds after the action

BaseComputerInterface.move_cursor

async def move_cursor(self, x: int, y: int, delay: Optional[float] = None) -> None

Move the cursor to the specified screen coordinates.

Parameters:

Name | Type | Description
x | int | X coordinate to move cursor to.
y | int | Y coordinate to move cursor to.
delay | Optional[float] | Optional delay in seconds after the action

BaseComputerInterface.drag_to

async def drag_to(self, x: int, y: int, button: str = 'left', duration: float = 0.5, delay: Optional[float] = None) -> None

Drag from current position to specified coordinates.

Parameters:

Name | Type | Description
x | int | The x coordinate to drag to
y | int | The y coordinate to drag to
button | str | The mouse button to use ('left', 'middle', 'right')
duration | float | How long the drag should take in seconds
delay | Optional[float] | Optional delay in seconds after the action

BaseComputerInterface.drag

async def drag(self, path: List[Tuple[int, int]], button: str = 'left', duration: float = 0.5, delay: Optional[float] = None) -> None

Drag the cursor along a path of coordinates.

Parameters:

Name | Type | Description
path | List[Tuple[int, int]] | List of (x, y) coordinate tuples defining the drag path
button | str | The mouse button to use ('left', 'middle', 'right')
duration | float | Total time in seconds that the drag operation should take
delay | Optional[float] | Optional delay in seconds after the action

BaseComputerInterface.key_down

async def key_down(self, key: str, delay: Optional[float] = None) -> None

Press and hold a key.

Parameters:

Name | Type | Description
key | str | The key to press and hold (e.g., 'a', 'shift', 'ctrl').
delay | Optional[float] | Optional delay in seconds after the action.

BaseComputerInterface.key_up

async def key_up(self, key: str, delay: Optional[float] = None) -> None

Release a previously pressed key.

Parameters:

Name | Type | Description
key | str | The key to release (e.g., 'a', 'shift', 'ctrl').
delay | Optional[float] | Optional delay in seconds after the action.
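key_down and key_up are easiest to keep balanced when wrapped in a context manager. A minimal sketch under that assumption; `held` and `FakeInterface` are hypothetical names, and the fake only records the order of calls.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def held(interface, key: str):
    """Hold a key for the duration of the block and always release it."""
    await interface.key_down(key)
    try:
        yield
    finally:
        await interface.key_up(key)


class FakeInterface:
    """Hypothetical stand-in that records keyboard events."""

    def __init__(self):
        self.events = []

    async def key_down(self, key, delay=None):
        self.events.append(("key_down", key))

    async def key_up(self, key, delay=None):
        self.events.append(("key_up", key))

    async def press_key(self, key, delay=None):
        self.events.append(("press_key", key))


async def demo():
    fake = FakeInterface()
    async with held(fake, "shift"):
        await fake.press_key("a")
    return fake.events


events = asyncio.run(demo())
```

The `finally` clause guards against a modifier key staying pressed if the body raises.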

BaseComputerInterface.type_text

async def type_text(self, text: str, delay: Optional[float] = None) -> None

Type the specified text string.

Parameters:

Name | Type | Description
text | str | The text string to type.
delay | Optional[float] | Optional delay in seconds after the action.

BaseComputerInterface.press_key

async def press_key(self, key: str, delay: Optional[float] = None) -> None

Press and release a single key.

Parameters:

Name | Type | Description
key | str | The key to press (e.g., 'a', 'enter', 'escape').
delay | Optional[float] | Optional delay in seconds after the action.

BaseComputerInterface.hotkey

async def hotkey(self, *keys: str, delay: Optional[float] = None) -> None

Press multiple keys simultaneously (keyboard shortcut).

Parameters:

Name | Type | Description
keys | str | The keys to press together as one shortcut.
delay | Optional[float] | Optional delay in seconds after the action.

BaseComputerInterface.scroll

async def scroll(self, x: int, y: int, delay: Optional[float] = None) -> None

Scroll the mouse wheel by specified amounts.

Parameters:

Name | Type | Description
x | int | Horizontal scroll amount (positive = right, negative = left).
y | int | Vertical scroll amount (positive = up, negative = down).
delay | Optional[float] | Optional delay in seconds after the action.

BaseComputerInterface.scroll_down

async def scroll_down(self, clicks: int = 1, delay: Optional[float] = None) -> None

Scroll down by the specified number of clicks.

Parameters:

Name | Type | Description
clicks | int | Number of scroll clicks to perform downward.
delay | Optional[float] | Optional delay in seconds after the action.

BaseComputerInterface.scroll_up

async def scroll_up(self, clicks: int = 1, delay: Optional[float] = None) -> None

Scroll up by the specified number of clicks.

Parameters:

Name | Type | Description
clicks | int | Number of scroll clicks to perform upward.
delay | Optional[float] | Optional delay in seconds after the action.

BaseComputerInterface.screenshot

async def screenshot(self) -> bytes

Take a screenshot.

Returns: Raw bytes of the screenshot image

BaseComputerInterface.get_screen_size

async def get_screen_size(self) -> Dict[str, int]

Get the screen dimensions.

Returns: Dict with 'width' and 'height' keys

BaseComputerInterface.get_cursor_position

async def get_cursor_position(self) -> Dict[str, int]

Get the current cursor position on screen.

Returns: Dict with 'x' and 'y' keys containing cursor coordinates.

BaseComputerInterface.copy_to_clipboard

async def copy_to_clipboard(self) -> str

Get the current clipboard content.

Returns: The text content currently stored in the clipboard.

BaseComputerInterface.set_clipboard

async def set_clipboard(self, text: str) -> None

Set the clipboard content to the specified text.

Parameters:

Name | Type | Description
text | str | The text to store in the clipboard.

BaseComputerInterface.file_exists

async def file_exists(self, path: str) -> bool

Check if a file exists at the specified path.

Parameters:

Name | Type | Description
path | str | The file path to check.

Returns: True if the file exists, False otherwise.

BaseComputerInterface.directory_exists

async def directory_exists(self, path: str) -> bool

Check if a directory exists at the specified path.

Parameters:

Name | Type | Description
path | str | The directory path to check.

Returns: True if the directory exists, False otherwise.

BaseComputerInterface.list_dir

async def list_dir(self, path: str) -> List[str]

List the contents of a directory.

Parameters:

Name | Type | Description
path | str | The directory path to list.

Returns: List of file and directory names in the specified directory.

BaseComputerInterface.read_text

async def read_text(self, path: str) -> str

Read the text contents of a file.

Parameters:

Name | Type | Description
path | str | The file path to read from.

Returns: The text content of the file.

BaseComputerInterface.write_text

async def write_text(self, path: str, content: str) -> None

Write text content to a file.

Parameters:

Name | Type | Description
path | str | The file path to write to.
content | str | The text content to write.

BaseComputerInterface.read_bytes

async def read_bytes(self, path: str, offset: int = 0, length: Optional[int] = None) -> bytes

Read file binary contents with optional seeking support.

Parameters:

Name | Type | Description
path | str | Path to the file
offset | int | Byte offset to start reading from (default: 0)
length | Optional[int] | Number of bytes to read (default: None for entire file)
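The offset and length parameters make it possible to fetch a large file in pieces. A minimal sketch, assuming the read_bytes and get_file_size signatures documented in this reference; `read_in_chunks`, `FakeInterface`, and the file path are hypothetical, with the fake serving bytes from memory.

```python
import asyncio


async def read_in_chunks(interface, path: str, chunk_size: int = 1024 * 1024) -> bytes:
    """Read a file piece by piece using offset and length."""
    total = await interface.get_file_size(path)
    parts = []
    offset = 0
    while offset < total:
        chunk = await interface.read_bytes(path, offset=offset, length=chunk_size)
        if not chunk:
            break  # file shrank or the read returned nothing; stop early
        parts.append(chunk)
        offset += len(chunk)
    return b"".join(parts)


class FakeInterface:
    """Hypothetical stand-in backed by an in-memory dict of files."""

    def __init__(self, files):
        self.files = files

    async def get_file_size(self, path):
        return len(self.files[path])

    async def read_bytes(self, path, offset=0, length=None):
        data = self.files[path]
        end = None if length is None else offset + length
        return data[offset:end]


payload = bytes(range(256)) * 10
fake = FakeInterface({"/tmp/blob.bin": payload})
copied = asyncio.run(read_in_chunks(fake, "/tmp/blob.bin", chunk_size=1000))
```

Advancing by `len(chunk)` rather than by `chunk_size` keeps the loop correct when a read returns fewer bytes than requested.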

BaseComputerInterface.write_bytes

async def write_bytes(self, path: str, content: bytes) -> None

Write binary content to a file.

Parameters:

Name | Type | Description
path | str | The file path to write to.
content | bytes | The binary content to write.

BaseComputerInterface.delete_file

async def delete_file(self, path: str) -> None

Delete a file at the specified path.

Parameters:

Name | Type | Description
path | str | The file path to delete.

BaseComputerInterface.create_dir

async def create_dir(self, path: str) -> None

Create a directory at the specified path.

Parameters:

Name | Type | Description
path | str | The directory path to create.

BaseComputerInterface.delete_dir

async def delete_dir(self, path: str) -> None

Delete a directory at the specified path.

Parameters:

Name | Type | Description
path | str | The directory path to delete.

BaseComputerInterface.get_file_size

async def get_file_size(self, path: str) -> int

Get the size of a file in bytes.

Parameters:

Name  Type  Description
path  str   The file path to get the size of.

Returns: The size of the file in bytes.
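Because read_bytes accepts an offset and a length, and get_file_size reports the total size, a large file can be fetched piece by piece. The sketch below illustrates that pattern under stated assumptions: `FakeBinaryInterface` is a hypothetical in-memory stand-in with the documented signatures, and the chunk size of 4 bytes is chosen only to keep the demo small.

```python
import asyncio
from typing import Dict, Optional


class FakeBinaryInterface:
    """Hypothetical stand-in mirroring the documented read_bytes and
    get_file_size signatures; data is served from memory, not from a VM."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self._files = files

    async def get_file_size(self, path: str) -> int:
        return len(self._files[path])

    async def read_bytes(
        self, path: str, offset: int = 0, length: Optional[int] = None
    ) -> bytes:
        data = self._files[path]
        end = None if length is None else offset + length
        return data[offset:end]


async def read_in_chunks(interface, path: str, chunk_size: int = 4) -> bytes:
    """Fetch a file in fixed-size pieces using offset/length reads."""
    total = await interface.get_file_size(path)
    parts = []
    offset = 0
    while offset < total:
        length = min(chunk_size, total - offset)
        parts.append(await interface.read_bytes(path, offset=offset, length=length))
        offset += length
    return b"".join(parts)


payload = bytes(range(10))
fake = FakeBinaryInterface({"/tmp/blob.bin": payload})
assembled = asyncio.run(read_in_chunks(fake, "/tmp/blob.bin"))
```

Sizing the loop from get_file_size avoids relying on how read_bytes behaves past the end of the file, which this reference does not specify.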

BaseComputerInterface.get_desktop_environment

async def get_desktop_environment(self) -> str

Get the current desktop environment.

Returns: The name of the current desktop environment.

BaseComputerInterface.set_wallpaper

async def set_wallpaper(self, path: str) -> None

Set the desktop wallpaper to the specified path.

Parameters:

Name  Type  Description
path  str   The file path to set as wallpaper.

BaseComputerInterface.open

async def open(self, target: str) -> None

Open a target using the system's default handler.

Typically opens files, folders, or URLs with the associated application.

Parameters:

Name    Type  Description
target  str   The file path, folder path, or URL to open.

BaseComputerInterface.launch

async def launch(self, app: str, args: List[str] | None = None) -> Optional[int]

Launch an application with optional arguments.

Parameters:

Name  Type              Description
app   str               The application executable or bundle identifier.
args  List[str] | None  Optional list of arguments to pass to the application.

Returns: Optional process ID (PID) of the launched application if available, otherwise None.

BaseComputerInterface.get_current_window_id

async def get_current_window_id(self) -> int | str

Get the identifier of the currently active/focused window.

Returns: A window identifier that can be used with other window management methods.

BaseComputerInterface.get_application_windows

async def get_application_windows(self, app: str) -> List[int | str]

Get all window identifiers for a specific application.

Parameters:

Name  Type  Description
app   str   The application name, executable, or identifier to query.

Returns: A list of window identifiers belonging to the specified application.
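A freshly launched application may not own a window immediately, so launch and get_application_windows are often paired with a short polling loop. This is a sketch under assumptions: `FakeWindowInterface` is a hypothetical stand-in whose window appears on the third query, the PID is made up, and the attempt count and delay are arbitrary.

```python
import asyncio
from typing import List, Optional, Union

WindowId = Union[int, str]


class FakeWindowInterface:
    """Hypothetical stand-in: the app's window shows up on the third query."""

    def __init__(self) -> None:
        self._queries = 0

    async def launch(self, app: str, args: Optional[List[str]] = None) -> Optional[int]:
        return 4242  # made-up PID; the documented call may also return None

    async def get_application_windows(self, app: str) -> List[WindowId]:
        self._queries += 1
        return ["win-1"] if self._queries >= 3 else []


async def launch_and_wait(interface, app: str, attempts: int = 10, delay: float = 0.01):
    """Launch an app, then poll until it owns at least one window."""
    pid = await interface.launch(app)
    for _ in range(attempts):
        windows = await interface.get_application_windows(app)
        if windows:
            return pid, windows
        await asyncio.sleep(delay)
    raise TimeoutError(f"{app!r} opened no window after {attempts} attempts")


pid, windows = asyncio.run(launch_and_wait(FakeWindowInterface(), "TextEdit"))
```

The returned identifiers can then be passed to the window management methods documented below.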

BaseComputerInterface.get_window_name

async def get_window_name(self, window_id: int | str) -> str

Get the title/name of a window.

Parameters:

Name       Type       Description
window_id  int | str  The window identifier.

Returns: The window's title or name string.

BaseComputerInterface.get_window_size

async def get_window_size(self, window_id: int | str) -> tuple[int, int]

Get the size of a window in pixels.

Parameters:

Name       Type       Description
window_id  int | str  The window identifier.

Returns: A tuple of (width, height) representing the window size in pixels.

BaseComputerInterface.get_window_position

async def get_window_position(self, window_id: int | str) -> tuple[int, int]

Get the screen position of a window.

Parameters:

Name       Type       Description
window_id  int | str  The window identifier.

Returns: A tuple of (x, y) representing the window's top-left corner in screen coordinates.

BaseComputerInterface.set_window_size

async def set_window_size(self, window_id: int | str, width: int, height: int) -> None

Set the size of a window in pixels.

Parameters:

Name       Type       Description
window_id  int | str  The window identifier.
width      int        Desired width in pixels.
height     int        Desired height in pixels.

BaseComputerInterface.set_window_position

async def set_window_position(self, window_id: int | str, x: int, y: int) -> None

Move a window to a specific position on the screen.

Parameters:

Name       Type       Description
window_id  int | str  The window identifier.
x          int        X coordinate for the window's top-left corner.
y          int        Y coordinate for the window's top-left corner.
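The size and position setters can be combined to tile a window. The sketch below snaps a window to the left half of the screen and reads the geometry back. It rests on assumptions: `FakeGeometryInterface` is a hypothetical stand-in that stores geometry in memory, `snap_to_left_half` is an illustrative helper, and the screen dimensions are passed in by the caller rather than queried.

```python
import asyncio
from typing import Dict, Tuple, Union

WindowId = Union[int, str]


class FakeGeometryInterface:
    """Hypothetical stand-in that stores window geometry in memory."""

    def __init__(self) -> None:
        self._size: Dict[WindowId, Tuple[int, int]] = {}
        self._position: Dict[WindowId, Tuple[int, int]] = {}

    async def set_window_size(self, window_id: WindowId, width: int, height: int) -> None:
        self._size[window_id] = (width, height)

    async def set_window_position(self, window_id: WindowId, x: int, y: int) -> None:
        self._position[window_id] = (x, y)

    async def get_window_size(self, window_id: WindowId) -> Tuple[int, int]:
        return self._size[window_id]

    async def get_window_position(self, window_id: WindowId) -> Tuple[int, int]:
        return self._position[window_id]


async def snap_to_left_half(interface, window_id, screen_width: int, screen_height: int):
    """Move the window to the top-left corner and give it half the screen width."""
    await interface.set_window_position(window_id, 0, 0)
    await interface.set_window_size(window_id, screen_width // 2, screen_height)
    # Read the geometry back; a real window manager may clamp or adjust it.
    position = await interface.get_window_position(window_id)
    size = await interface.get_window_size(window_id)
    return position, size


position, size = asyncio.run(
    snap_to_left_half(FakeGeometryInterface(), "win-1", 1024, 768)
)
```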

BaseComputerInterface.maximize_window

async def maximize_window(self, window_id: int | str) -> None

Maximize a window.

Parameters:

Name       Type       Description
window_id  int | str  The window identifier.

BaseComputerInterface.minimize_window

async def minimize_window(self, window_id: int | str) -> None

Minimize a window.

Parameters:

Name       Type       Description
window_id  int | str  The window identifier.

BaseComputerInterface.activate_window

async def activate_window(self, window_id: int | str) -> None

Bring a window to the foreground and focus it.

Parameters:

Name       Type       Description
window_id  int | str  The window identifier.

BaseComputerInterface.close_window

async def close_window(self, window_id: int | str) -> None

Close a window.

Parameters:

Name       Type       Description
window_id  int | str  The window identifier.
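The window identifiers returned by get_application_windows can be fed into activate_window and close_window to clean up after a task. A minimal sketch, assuming a hypothetical `FakeAppInterface` stand-in that tracks windows per application in memory; `close_all_windows` is an illustrative helper, and the app names and IDs are made up.

```python
import asyncio
from typing import Dict, List, Union

WindowId = Union[int, str]


class FakeAppInterface:
    """Hypothetical stand-in that tracks open windows per application."""

    def __init__(self, windows: Dict[str, List[WindowId]]) -> None:
        self._windows = windows

    async def get_application_windows(self, app: str) -> List[WindowId]:
        return list(self._windows.get(app, []))  # copy, safe to iterate

    async def activate_window(self, window_id: WindowId) -> None:
        pass  # nothing to focus in the fake

    async def close_window(self, window_id: WindowId) -> None:
        for ids in self._windows.values():
            if window_id in ids:
                ids.remove(window_id)


async def close_all_windows(interface, app: str) -> int:
    """Focus and close every window of one application; return the count."""
    closed = 0
    for window_id in await interface.get_application_windows(app):
        await interface.activate_window(window_id)
        await interface.close_window(window_id)
        closed += 1
    return closed


async def demo():
    fake = FakeAppInterface({"Notes": [101, 102], "Terminal": ["t-1"]})
    closed = await close_all_windows(fake, "Notes")
    remaining = await fake.get_application_windows("Notes")
    untouched = await fake.get_application_windows("Terminal")
    return closed, remaining, untouched


closed, remaining, untouched = asyncio.run(demo())
```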

BaseComputerInterface.get_window_title

async def get_window_title(self, window_id: int | str) -> str

Convenience alias for get_window_name().

Parameters:

Name       Type       Description
window_id  int | str  The window identifier.

Returns: The window's title or name string.

BaseComputerInterface.window_size

async def window_size(self, window_id: int | str) -> tuple[int, int]

Convenience alias for get_window_size().

Parameters:

Name       Type       Description
window_id  int | str  The window identifier.

Returns: A tuple of (width, height) representing the window size in pixels.

BaseComputerInterface.run_command

async def run_command(self, command: str) -> CommandResult

Run a shell command and return a structured result.

Executes a shell command using subprocess.run with shell=True and check=False. The command runs in the target environment, and both stdout and stderr are captured.

Parameters:

Name     Type  Description
command  str   The shell command to execute.

Returns: CommandResult: A structured result containing:

  • stdout (str) - Standard output from the command
  • stderr (str) - Standard error from the command
  • returncode (int) - Exit code from the command (0 indicates success)

Raises:

  • RuntimeError - If the command execution fails at the system level

Example:

result = await interface.run_command("ls -la")
if result.returncode == 0:
    print(f"Output: {result.stdout}")
else:
    print(f"Error: {result.stderr}, Exit code: {result.returncode}")
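Since run_command is described as using check=False, a non-zero exit code does not raise by itself. Callers who prefer exceptions can wrap the call. The following is a sketch, not SDK code: `FakeCommandResult` and `FakeShellInterface` are hypothetical stand-ins that mirror the documented fields (stdout, stderr, returncode), and `CommandFailed` and `run_checked` are illustrative names.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeCommandResult:
    """Hypothetical result type with the three documented fields."""
    stdout: str
    stderr: str
    returncode: int


class FakeShellInterface:
    """Hypothetical stand-in: understands only 'echo', fails otherwise."""

    async def run_command(self, command: str) -> FakeCommandResult:
        if command.startswith("echo "):
            return FakeCommandResult(command[5:] + "\n", "", 0)
        return FakeCommandResult("", f"{command}: command not found\n", 127)


class CommandFailed(Exception):
    """Raised by run_checked when the exit code is non-zero."""

    def __init__(self, command: str, returncode: int, stderr: str) -> None:
        super().__init__(f"{command!r} exited with {returncode}: {stderr.strip()}")
        self.returncode = returncode
        self.stderr = stderr


async def run_checked(interface, command: str) -> str:
    """Run a command and return stdout, raising on a non-zero exit code."""
    result = await interface.run_command(command)
    if result.returncode != 0:
        raise CommandFailed(command, result.returncode, result.stderr)
    return result.stdout


async def demo():
    interface = FakeShellInterface()
    greeting = await run_checked(interface, "echo hello")
    try:
        await run_checked(interface, "no-such-tool")
    except CommandFailed as exc:
        return greeting, exc.returncode
    return greeting, None


greeting, failure_code = asyncio.run(demo())
```

The documented RuntimeError covers failures at the system level and would propagate through `run_checked` unchanged.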

BaseComputerInterface.get_accessibility_tree

async def get_accessibility_tree(self) -> Dict

Get the accessibility tree of the current screen.

Returns: Dict containing the hierarchical accessibility information of screen elements.
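This reference does not specify the keys of the returned Dict, so the sketch below walks the tree generically and yields every nested dict without assuming a schema. The `sample_tree` and its "role", "title" and "children" keys are invented purely for illustration and may not match what a real interface returns.

```python
from typing import Any, Dict, Iterator


def iter_nodes(node: Any) -> Iterator[Dict]:
    """Yield every dict found anywhere inside a nested dict/list structure."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from iter_nodes(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_nodes(item)


# Hypothetical tree shape, used only to exercise the walker.
sample_tree = {
    "role": "window",
    "title": "Demo",
    "children": [
        {"role": "button", "title": "OK"},
        {"role": "group", "children": [{"role": "text", "title": "Hello"}]},
    ],
}

roles = [node["role"] for node in iter_nodes(sample_tree) if "role" in node]
```

In practice the argument would be the result of `await interface.get_accessibility_tree()`, and the filter would use whatever keys that tree actually contains.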

BaseComputerInterface.to_screen_coordinates

async def to_screen_coordinates(self, x: float, y: float) -> tuple[float, float]

Convert screenshot coordinates to screen coordinates.

Parameters:

Name  Type   Description
x     float  X coordinate in screenshot space.
y     float  Y coordinate in screenshot space.

Returns: tuple[float, float]: (x, y) coordinates in screen space

BaseComputerInterface.to_screenshot_coordinates

async def to_screenshot_coordinates(self, x: float, y: float) -> tuple[float, float]

Convert screen coordinates to screenshot coordinates.

Parameters:

Name  Type   Description
x     float  X coordinate in screen space.
y     float  Y coordinate in screen space.

Returns: tuple[float, float]: (x, y) coordinates in screenshot space
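The two conversion methods are inverses of each other, which matters when a point located in a screenshot has to be turned into a position for input actions. The sketch below assumes, purely for illustration, that the screenshot is a uniformly scaled copy of the screen with a factor of 2. `FakeScaledInterface` is a hypothetical stand-in, and the real mapping is not described in this reference.

```python
import asyncio
from typing import Tuple


class FakeScaledInterface:
    """Hypothetical stand-in: screenshot pixels = screen points * scale."""

    def __init__(self, scale: float) -> None:
        self._scale = scale

    async def to_screen_coordinates(self, x: float, y: float) -> Tuple[float, float]:
        return (x / self._scale, y / self._scale)

    async def to_screenshot_coordinates(self, x: float, y: float) -> Tuple[float, float]:
        return (x * self._scale, y * self._scale)


async def round_trip(interface, x: float, y: float):
    """Convert a screenshot point to screen space and back again."""
    screen = await interface.to_screen_coordinates(x, y)
    back = await interface.to_screenshot_coordinates(*screen)
    return screen, back


screen_point, screenshot_point = asyncio.run(
    round_trip(FakeScaledInterface(2.0), 200.0, 100.0)
)
```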


InterfaceFactory

Factory for creating OS-specific computer interfaces.

Methods

InterfaceFactory.create_interface_for_os

def create_interface_for_os(os: OSType, ip_address: str, api_port: Optional[int] = None, api_key: Optional[str] = None, vm_name: Optional[str] = None) -> BaseComputerInterface

Create an interface for the specified OS.

Parameters:

Name        Type           Description
os          OSType         Operating system type ('macos', 'linux', or 'windows').
ip_address  str            IP address of the computer to control.
api_port    Optional[int]  Optional API port of the computer to control.
api_key     Optional[str]  Optional API key for cloud authentication.
vm_name     Optional[str]  Optional VM name for cloud authentication.

Returns: BaseComputerInterface: The appropriate interface for the OS

Raises:

  • ValueError - If the OS type is not supported

MacOSComputerInterface

Inherits from: GenericComputerInterface

Interface for macOS.

Constructor

MacOSComputerInterface(self, ip_address: str, username: str = 'lume', password: str = 'lume', api_key: Optional[str] = None, vm_name: Optional[str] = None, api_port: Optional[int] = None)

Methods

MacOSComputerInterface.diorama_cmd

async def diorama_cmd(self, action: str, arguments: Optional[dict] = None) -> dict

Send a diorama command to the server (macOS only).
