Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.praison.ai/llms.txt

Use this file to discover all available pages before exploring further.

Computer Use

Build agents that can interact with browsers and desktop applications through screenshots, clicks, and keyboard input.

Quick Start

import { createComputerUse, Agent } from 'praisonai-ts';

const computer = createComputerUse({
  requireApproval: true, // Safety: require approval for actions
});

const agent = new Agent({
  name: 'ComputerAgent',
  instructions: 'You help users automate computer tasks.',
  tools: computer.getTools(),
});

const response = await agent.chat('Take a screenshot of the current screen');

Configuration

import { createComputerUse } from 'praisonai-ts';

const computer = createComputerUse({
  // Safety
  requireApproval: true,    // Require human approval (default: true)
  
  // Capabilities
  browser: true,            // Enable browser control
  desktop: true,            // Enable desktop control
  
  // Settings
  screenshotDir: './screenshots',
  timeout: 30000,           // Action timeout in ms
});

Available Tools

The computer use module provides these tools:
ToolDescription
screenshotTake a screenshot
clickClick at coordinates
doubleClickDouble-click at coordinates
typeType text
keyPress a key or combination
moveMouseMove mouse to coordinates
scrollScroll in a direction
waitWait for a duration
executeExecute shell command

With Playwright (Browser)

import { chromium } from 'playwright';
import { createComputerUse } from 'praisonai-ts';

const browser = await chromium.launch({ headless: false });
const page = await browser.newPage();

const computer = createComputerUse({
  requireApproval: true,
  tools: {
    screenshot: async () => {
      const buffer = await page.screenshot();
      return {
        base64: buffer.toString('base64'),
        width: 1920,
        height: 1080,
      };
    },
    click: async (x, y) => {
      await page.mouse.click(x, y);
    },
    type: async (text) => {
      await page.keyboard.type(text);
    },
    // ... other tools
  },
});

Human Approval

By default, all actions require human approval:
import { createComputerUse, createCLIApprovalPrompt } from 'praisonai-ts';

const computer = createComputerUse({
  requireApproval: true,
});

// CLI approval prompt
const approvalHandler = createCLIApprovalPrompt();

computer.onApprovalRequest(async (action) => {
  console.log(`Action requested: ${action.type}`);
  console.log(`Parameters:`, action.params);
  return await approvalHandler(action);
});

Custom Tool Implementations

const computer = createComputerUse({
  tools: {
    screenshot: async () => {
      // Custom screenshot implementation
      const screenshot = await takeScreenshot();
      return {
        base64: screenshot.data,
        width: screenshot.width,
        height: screenshot.height,
      };
    },
    click: async (x, y, button = 'left') => {
      // Custom click implementation
      await simulateClick(x, y, button);
    },
    type: async (text) => {
      // Custom type implementation
      await simulateTyping(text);
    },
    execute: async (command) => {
      // Custom command execution
      return await runCommand(command);
    },
  },
});

Agent Example

import { Agent, createComputerUse } from 'praisonai-ts';

const computer = createComputerUse({ requireApproval: true });

const agent = new Agent({
  name: 'WebAutomator',
  instructions: `You automate web tasks. Always:
1. Take a screenshot first to understand the current state
2. Describe what you see
3. Plan your actions
4. Execute actions one at a time
5. Verify results with another screenshot`,
  tools: computer.getTools(),
  model: 'claude-3.5-sonnet', // Vision-capable model
});

const response = await agent.chat('Go to google.com and search for "PraisonAI"');

Safety Best Practices

  1. Always require approval - Enable requireApproval: true
  2. Limit capabilities - Only enable needed features
  3. Sandbox execution - Run in isolated environments
  4. Log all actions - Track what the agent does
  5. Set timeouts - Prevent runaway actions

Environment Variables

VariableRequiredDescription
OPENAI_API_KEYYesFor the agent
ANTHROPIC_API_KEYFor ClaudeClaude vision