Desktop Automation

Automate native desktop applications across Windows, Mac, and Linux using karate-robot for mouse, keyboard, and window interactions with image and OCR-based element detection.

Benefits of Desktop Automation

  • Unified framework: Combine API, web, and desktop automation in the same test
  • Cross-platform support: Works on Windows, Mac, and Linux
  • Multiple locator strategies: Windows UI Automation, image matching, OCR text recognition
  • Integrated testing: Mix desktop automation with web browser testing seamlessly

Karate Robot vs Traditional Tools

Aspect             Selenium/Appium     Karate Robot
Desktop apps       Limited/no support  Full native support
Cross-platform     Browser only        Desktop + browser
API + UI testing   Separate tools      Single framework
Image recognition  Third-party tools   Built-in
OCR support        Not available       Built-in

Key Advantage

Karate Robot allows you to test complete workflows including file uploads via native dialogs, desktop application interactions, and web UI validation - all in one test scenario.

Quick Start

Maven Setup

Add the karate-robot dependency to your project:

<dependency>
<groupId>io.karatelabs</groupId>
<artifactId>karate-robot</artifactId>
<version>1.5.1</version>
<scope>test</scope>
</dependency>

Gradle Setup

testImplementation 'io.karatelabs:karate-robot:1.5.1'

Basic Example

Automate a simple desktop calculator:


Feature: Calculator Test

Scenario: Basic desktop automation
* robot { window: 'Calculator' }
* click('button-7')
* click('button-plus')
* click('button-3')
* click('button-equals')
* match text('.result') == '10'

Key Points
  • Use robot { window: 'name' } to target applications
  • Supports exact, contains, or regex window matching
  • Works with native OS windows and controls

Window Management

Exact Window Name

Target a window by its exact title:


Scenario: Target Notepad window
* robot { window: 'Notepad' }
* input('Hello from Karate!')

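Window Name Pattern

Prefix the title with ^ to match any window whose title contains the given text:


Scenario: Target Chrome browser
* robot { window: '^Chrome' }
* input('karate dsl')
* input(Key.ENTER)

The ^ prefix requests a "contains" match, so this targets any window with "Chrome" in its title. Use a ~ prefix instead to match the title against a regular expression.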

Switching Between Windows

Work with multiple applications in a single test:


Scenario: Copy text between applications
* robot { window: 'Notepad' }
* input('Test data')
* input(Key.CONTROL + 'a')
* input(Key.CONTROL + 'c')

* robot { window: 'WordPad' }
* input(Key.CONTROL + 'v')

Mouse and Keyboard

Mouse Actions

Control mouse movements and clicks:


Scenario: Mouse interactions
* robot { window: 'MyApp' }
* click('button.submit')
* doubleClick('file.txt')
* rightClick('context-menu')
* move(100, 200)
* drag(50, 50, 200, 200)

Keyboard Input

Send text and special keys:


Scenario: Keyboard automation
* robot { window: 'TextEditor' }
* input('Hello World')
* input(Key.ENTER)
* input(Key.CONTROL + 'a')
* input(Key.DELETE)

Key Combinations

Common keyboard shortcuts:


Scenario: Keyboard shortcuts
* robot { window: 'Browser' }
* input(Key.CONTROL + 't')
* input('karatelabs.io')
* input(Key.ENTER)
* input(Key.CONTROL + 'w')
* input(Key.ALT + Key.F4)
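
Shortcuts built on the Control key suit Windows and Linux; on macOS the equivalent shortcuts usually use Command. The following is a minimal sketch of choosing the modifier at runtime. It assumes that karate.os.type reports 'macosx' on a Mac and that Key.META maps to the Command key, both of which are worth verifying against your Karate version. The window name 'TextEditor' is a placeholder.

```gherkin
Scenario: Select all with a platform-appropriate modifier
* robot { window: 'TextEditor' }
# karate.os.type is assumed to be 'windows', 'macosx' or 'linux'
* def isMac = karate.os.type == 'macosx'
* input('Hello World')
# Cmd+A on macOS, Ctrl+A elsewhere
* input((isMac ? Key.META : Key.CONTROL) + 'a')
* input(Key.DELETE)
```

Keeping the platform check in one variable means the rest of the scenario stays identical across operating systems.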

Element Locators

Windows UI Automation

Use XPath-like selectors on Windows:


Scenario: Windows UI Automation
* robot { window: 'Calculator', root: true }
* click('//Button[@Name="Seven"]')
* click('//Button[@Name="Plus"]')
* click('//Button[@Name="Three"]')
* click('//Button[@Name="Equals"]')
* def result = text('//Text[@AutomationId="CalculatorResults"]')
* match result contains '10'

The root: true option enables Windows UI Automation for precise element targeting.

Image-Based Locators

Click elements using image matching:


Scenario: Image recognition
* robot { window: 'MyApp' }
* click('images/submit-button.png')
* waitFor('images/success-message.png')
* screenshot()

Place image files in your test resources directory (e.g., src/test/resources/images/).
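
If every image lives in one folder, repeating the folder name in each locator gets noisy. The sketch below assumes the robot configuration accepts a basePath option against which image locators are resolved (as described in the karate-robot README; verify this for your version). The image file names are the hypothetical ones from the example above.

```gherkin
Scenario: Image locators resolved from a shared folder
# basePath is an assumed option: the image paths below are taken relative to it
* robot { window: 'MyApp', basePath: 'classpath:images' }
* click('submit-button.png')
* waitFor('success-message.png')
```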

OCR Text Recognition

Locate and interact with elements using OCR:


Scenario: OCR-based interaction
* robot { window: 'MyApp' }
* click('{ocr}Submit')
* waitFor('{ocr}Success')
* def message = text('{ocr}')
* match message contains 'Operation completed'

Tesseract OCR

OCR features require Tesseract OCR data files. Install Tesseract and ensure the data files are available on your system. Karate Robot will automatically detect the installation.
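
When the data files are not picked up automatically, it may help to point the robot at them explicitly. This sketch assumes the tessData and tessLang configuration keys from the karate-robot README; the directory name and language code are placeholders, not values taken from this page.

```gherkin
Scenario: OCR with an explicit Tesseract data location
# tessData: folder holding the trained data files, e.g. eng.traineddata (assumed option name)
# tessLang: Tesseract language code to load (assumed option name)
* robot { window: 'MyApp', tessData: 'tessdata', tessLang: 'eng' }
* waitFor('{ocr}Success')
```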

Advanced Patterns

Mixing Web and Desktop Automation

Combine browser and desktop automation for file uploads:


Feature: File Upload Flow

Scenario: Upload file via native dialog
# Start with web automation
* configure driver = { type: 'chrome' }
* driver 'https://example.com/upload'
* click('input[type="file"]')

# Switch to desktop automation for file dialog
* robot { window: 'Open' }
* input('C:\\Documents\\report.pdf')
* input(Key.ENTER)

# Back to web automation
* waitFor('.upload-status')
* match text('.upload-status') == 'Upload complete'

Configuration Options

Configure retry behavior and timing:


Scenario: Custom robot configuration
* robot { window: 'MyApp', retryCount: 5, retryInterval: 1000, highlight: true, screenshotOnFailure: true }
* click('submit-button')

Configuration options:

  • retryCount: Number of retry attempts (default: 3)
  • retryInterval: Milliseconds between retries (default: 3000)
  • highlight: Highlight elements before interaction (default: false)
  • screenshotOnFailure: Capture screenshot on failure (default: true)

Testing Desktop Applications

Complete desktop application test:


Feature: Desktop App Testing

Background:
* robot { window: 'MyDesktopApp' }

Scenario: Login and navigation
* input('#username', 'admin')
* input('#password', 'secret')
* click('button.login')
* waitFor('.dashboard')
* match text('.welcome') contains 'Welcome, Admin'

* click('menu.reports')
* click('button.generate')
* waitFor('{ocr}Report Generated')

Platform-Specific Notes

Windows

UI Automation support:

  • Full XPath-like selector support
  • Access to native Windows controls
  • Best performance and accuracy

Recommended approach:

* robot { window: 'MyApp', root: true }
* click('//Button[@Name="Submit"]')

macOS

Requirements:

  • Enable Accessibility permissions for terminal/IDE
  • Grant screen recording permissions

Window focus:

* robot { window: '^MyApp' }
* activate()

Use activate() to bring the window to the front if needed.

Linux

Requirements:

  • X11 display server
  • Compatible window manager

Display configuration:

export DISPLAY=:0

Set the DISPLAY environment variable before running tests.
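
A missing DISPLAY variable tends to surface as an obscure startup error, so a test suite can check for it up front. The sketch below is one way to fail fast. It assumes karate.os.type reports 'linux' on Linux and uses Java interop to read the environment; the window name 'MyApp' is a placeholder.

```gherkin
Background:
# read the environment through Java interop; the result is null when DISPLAY is unset
* def System = Java.type('java.lang.System')
* def display = System.getenv('DISPLAY')
# stop with a clear message instead of a low-level display error
* if (karate.os.type == 'linux' && !display) karate.fail('DISPLAY is not set, cannot start desktop automation')
* robot { window: 'MyApp' }
```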

Debugging and Troubleshooting

Screenshot on Failure

Automatically capture screenshots when tests fail:


Scenario: Auto-screenshot
* robot { window: 'MyApp', screenshotOnFailure: true }
* click('button-that-might-fail')

Element Highlighting

Highlight elements before clicking for debugging:


Scenario: Visual debugging
* robot { window: 'MyApp', highlight: true }
* click('submit-button')

Wait Strategies

Use retry and wait for dynamic content:


Scenario: Wait for element
* robot { window: 'MyApp' }
* waitFor('success-message')
* retry(5, 2000).click('dynamic-button')
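
The retry() call above passes both a count and an interval. The sketch below shows the shorter forms, assuming retry() accepts the same optional arguments as in Karate's browser automation and applies only to the action chained onto it. The locator names are placeholders.

```gherkin
Scenario: Retry variants
* robot { window: 'MyApp', retryCount: 3, retryInterval: 1000 }
# no arguments: uses the configured retryCount and retryInterval
* retry().click('save-button')
# 5 attempts at the configured interval
* retry(5).click('slow-button')
# 5 attempts, 2000 ms apart, for this call only
* retry(5, 2000).click('dynamic-button')
```

Setting sensible defaults in the robot configuration and overriding them per call keeps long waits confined to the few steps that need them.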

Next Steps