Desktop Automation

Automate native desktop applications on Windows, macOS, and Linux with karate-robot, which drives the mouse, keyboard, and windows and can locate elements by image matching or OCR.

Benefits of Desktop Automation

Unified framework: Combine API, web, and desktop automation in the same test
Cross-platform support: Works on Windows, macOS, and Linux
Multiple locator strategies: Windows UI Automation, image matching, OCR text recognition
Integrated testing: Mix desktop automation with web browser testing seamlessly

Karate Robot vs Traditional Tools

Aspect	Selenium/Appium	Karate Robot
Desktop apps	Limited/no support	Full native support
Cross-platform	Browser only	Desktop + browser
API + UI testing	Separate tools	Single framework
Image recognition	Third-party tools	Built-in
OCR support	Not available	Built-in

Key Advantage

Karate Robot lets you test complete workflows, including file uploads via native dialogs, desktop application interactions, and web UI validation, all in one test scenario.

Quick Start

Maven Setup

Add the karate-robot dependency to your project:

<dependency>
    <groupId>io.karatelabs</groupId>
    <artifactId>karate-robot</artifactId>
    <version>1.5.1</version>
    <scope>test</scope>
</dependency>

Gradle Setup

testImplementation 'io.karatelabs:karate-robot:1.5.1'

Basic Example

Automate a simple desktop calculator:

Feature: Calculator Test

Scenario: Basic desktop automation
* robot { window: 'Calculator' }
* click('button-7')
* click('button-plus')
* click('button-3')
* click('button-equals')
* match text('.result') == '10'

Key Points

Use robot { window: 'name' } to target applications
Supports exact, contains, or regex window matching
Works with native OS windows and controls
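
The three matching modes named above differ in how strictly a window title is compared with the search string. The Java sketch below is illustrative only: WindowTitleMatch and its Mode enum are hypothetical names that are not part of karate-robot, and the code makes no claim about how Karate implements matching internally.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Illustrative only: three ways a window title could be compared with a search string. */
final class WindowTitleMatch {

    enum Mode { EXACT, CONTAINS, REGEX }

    static boolean matches(Mode mode, String expected, String title) {
        switch (mode) {
            case EXACT:
                // The whole title must equal the search string.
                return title.equals(expected);
            case CONTAINS:
                // The search string may appear anywhere in the title.
                return title.contains(expected);
            case REGEX:
                // find() accepts a partial match; anchor with ^ or $ to be stricter.
                return Pattern.compile(expected).matcher(title).find();
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        // Hypothetical window titles, chosen only to show where the modes disagree.
        List<String> titles = List.of("Notepad", "Untitled - Notepad", "New Tab - Google Chrome");
        for (String title : titles) {
            System.out.println(title
                    + " | exact 'Notepad': " + matches(Mode.EXACT, "Notepad", title)
                    + " | contains 'Chrome': " + matches(Mode.CONTAINS, "Chrome", title)
                    + " | regex '^New Tab': " + matches(Mode.REGEX, "^New Tab", title));
        }
    }
}
```

Many applications put the document name before the application name in the title, so an exact match on the application name alone can miss the window. A contains or regex match is usually the more forgiving choice.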

Window Management

Exact Window Name

Target a window by its exact title:

Scenario: Target Notepad window
* robot { window: 'Notepad' }
* input('Hello from Karate!')

Window Name Pattern

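Prefix the window name to match titles loosely:

Scenario: Target Chrome browser
* robot { window: '^Chrome' }
* input('karate dsl')
* input(Key.ENTER)

The ^ prefix requests a "contains" match, so this targets any window whose title contains "Chrome" (for example, "New Tab - Google Chrome"). Use a ~ prefix instead for a regular-expression match.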

Switching Between Windows

Work with multiple applications in a single test:

Scenario: Copy text between applications
* robot { window: 'Notepad' }
* input('Test data')
* input(Key.CTRL + 'a')
* input(Key.CTRL + 'c')

* robot { window: 'WordPad' }
* input(Key.CTRL + 'v')

Mouse and Keyboard

Mouse Actions

Control mouse movements and clicks:

Scenario: Mouse interactions
* robot { window: 'MyApp' }
* click('button.submit')
* doubleClick('file.txt')
* rightClick('context-menu')
* move(100, 200)
* drag(50, 50, 200, 200)

Keyboard Input

Send text and special keys:

Scenario: Keyboard automation
* robot { window: 'TextEditor' }
* input('Hello World')
* input(Key.ENTER)
* input(Key.CTRL + 'a')
* input(Key.DELETE)

Key Combinations

Common keyboard shortcuts:

Scenario: Keyboard shortcuts
* robot { window: 'Browser' }
* input(Key.CTRL + 't')
* input('karatelabs.io')
* input(Key.ENTER)
* input(Key.CTRL + 'w')
* input(Key.ALT + Key.F4)

Element Locators

Windows UI Automation

Use XPath-like selectors on Windows:

Scenario: Windows UI Automation
* robot { window: 'Calculator', root: true }
* click('//Button[@Name="Seven"]')
* click('//Button[@Name="Plus"]')
* click('//Button[@Name="Three"]')
* click('//Button[@Name="Equals"]')
* def result = text('//Text[@AutomationId="CalculatorResults"]')
* match result contains '10'

The root: true option enables Windows UI Automation for precise element targeting.

Image-Based Locators

Click elements using image matching:

Scenario: Image recognition
* robot { window: 'MyApp' }
* click('images/submit-button.png')
* waitFor('images/success-message.png')
* screenshot()

Place image files in your test resources directory (e.g., src/test/resources/images/).

OCR Text Recognition

Locate and interact with elements using OCR:

Scenario: OCR-based interaction
* robot { window: 'MyApp' }
* click('{ocr}Submit')
* waitFor('{ocr}Success')
* def message = text('{ocr}')
* match message contains 'Operation completed'

Tesseract OCR

OCR features require Tesseract OCR data files. Install Tesseract and ensure the data files are available on your system. Karate Robot will automatically detect the installation.

Advanced Patterns

Mixing Web and Desktop Automation

Combine browser and desktop automation for file uploads:

Feature: File Upload Flow

Scenario: Upload file via native dialog
# Start with web automation
* configure driver = { type: 'chrome' }
* driver 'https://example.com/upload'
* click('input[type="file"]')

# Switch to desktop automation for file dialog
* robot { window: 'Open' }
# Backslashes are doubled because the argument is a JavaScript string literal
* input('C:\\Documents\\report.pdf')
* input(Key.ENTER)

# Back to web automation
* waitFor('.upload-status')
* match text('.upload-status') == 'Upload complete'

Configuration Options

Configure retry behavior and timing:

Scenario: Custom robot configuration
* robot {
    window: 'MyApp',
    retryCount: 5,
    retryInterval: 1000,
    highlight: true,
    screenshotOnFailure: true
  }
* click('submit-button')

Configuration options:

retryCount: Number of retry attempts (default: 3)
retryInterval: Milliseconds between retries (default: 3000)
highlight: Highlight elements before interaction (default: false)
screenshotOnFailure: Capture screenshot on failure (default: true)
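
As a rough mental model of how retryCount and retryInterval interact, the Java sketch below repeats an action until it succeeds or the attempts run out. It assumes that retryCount is the total number of attempts and that the pause happens only between attempts. Both points are assumptions made for illustration, and RetrySketch is a hypothetical class, not Karate's implementation.

```java
import java.util.function.BooleanSupplier;

/** Illustrative only: a fixed-interval retry loop. */
final class RetrySketch {

    /**
     * Runs the action up to retryCount times, sleeping retryIntervalMillis between attempts.
     * Returns the 1-based attempt number that succeeded, or -1 if every attempt failed.
     */
    static int retry(int retryCount, long retryIntervalMillis, BooleanSupplier action) {
        for (int attempt = 1; attempt <= retryCount; attempt++) {
            if (action.getAsBoolean()) {
                return attempt;
            }
            if (attempt < retryCount) {
                try {
                    Thread.sleep(retryIntervalMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    return -1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Hypothetical action that only succeeds on the third call.
        int succeededOn = retry(5, 100, () -> ++calls[0] == 3);
        System.out.println("Succeeded on attempt: " + succeededOn);
    }
}
```

Under this model, the worst-case wait for a missing element is roughly (retryCount - 1) × retryInterval, which is worth keeping in mind when raising either value.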

Testing Desktop Applications

Complete desktop application test:

Feature: Desktop App Testing

Background:
* robot { window: 'MyDesktopApp' }

Scenario: Login and navigation
* input('#username', 'admin')
* input('#password', 'secret')
* click('button.login')
* waitFor('.dashboard')
* match text('.welcome') contains 'Welcome, Admin'

* click('menu.reports')
* click('button.generate')
* waitFor('{ocr}Report Generated')

Platform-Specific Notes

Windows

UI Automation support:

Full XPath-like selector support
Access to native Windows controls
Best performance and accuracy

Recommended approach:

* robot { window: 'MyApp', root: true }
* click('//Button[@Name="Submit"]')

macOS

Requirements:

Enable Accessibility permissions for terminal/IDE
Grant screen recording permissions

Window focus:

* robot { window: '^MyApp' }
* activate()

Use activate() to bring the window to the front if needed.

Linux

Requirements:

X11 display server
Compatible window manager

Display configuration:

export DISPLAY=:0

Set the DISPLAY environment variable before running tests.
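
A quick preflight check can catch a missing display before any scenario runs. The Java sketch below assumes the tests run in a JVM that needs a graphical display. DisplayPreflight is a hypothetical helper, while GraphicsEnvironment.isHeadless() is the standard JDK call for detecting a headless environment.

```java
import java.awt.GraphicsEnvironment;
import java.util.Map;

/** Illustrative only: checks whether a graphical display appears to be available. */
final class DisplayPreflight {

    /** True if the given environment defines a non-empty DISPLAY variable. */
    static boolean hasDisplayVariable(Map<String, String> env) {
        String display = env.get("DISPLAY");
        return display != null && !display.trim().isEmpty();
    }

    public static void main(String[] args) {
        boolean displaySet = hasDisplayVariable(System.getenv());
        boolean headless = GraphicsEnvironment.isHeadless();
        System.out.println("DISPLAY set: " + displaySet);
        System.out.println("JVM headless: " + headless);
        if (headless) {
            System.out.println("No graphical display detected; desktop automation is unlikely to work.");
        }
    }
}
```

On a CI agent without a physical screen, a virtual X server such as Xvfb is a common way to provide the display that DISPLAY points to.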

Debugging and Troubleshooting

Screenshot on Failure

Automatically capture screenshots when tests fail:

Scenario: Auto-screenshot
* robot { window: 'MyApp', screenshotOnFailure: true }
* click('button-that-might-fail')

Element Highlighting

Highlight elements before clicking for debugging:

Scenario: Visual debugging
* robot { window: 'MyApp', highlight: true }
* click('submit-button')

Wait Strategies

Use retry and wait for dynamic content:

Scenario: Wait for element
* robot { window: 'MyApp' }
* waitFor('success-message')
* retry(5, 2000).click('dynamic-button')

Next Steps

Combine with web automation: UI Testing
Use image comparison for validation: Image Comparison
Create reusable automation flows: Calling Features
Explore complete examples: Examples and Demos
