Desktop Automation
Automate native desktop applications across Windows, Mac, and Linux using karate-robot for mouse, keyboard, and window interactions with image and OCR-based element detection.
Benefits of Desktop Automation
- Unified framework: Combine API, web, and desktop automation in the same test
- Cross-platform support: Works on Windows, Mac, and Linux
- Multiple locator strategies: Windows UI Automation, image matching, OCR text recognition
- Integrated testing: Mix desktop automation with web browser testing seamlessly
Karate Robot vs Traditional Tools
| Aspect | Selenium/Appium | Karate Robot |
|---|---|---|
| Desktop apps | Limited/no support | Full native support |
| Cross-platform | Browser only | Desktop + browser |
| API + UI testing | Separate tools | Single framework |
| Image recognition | Third-party tools | Built-in |
| OCR support | Not available | Built-in |
Karate Robot lets you test complete workflows, including file uploads via native dialogs, desktop application interactions, and web UI validation, all in one test scenario.
Quick Start
Maven Setup
Add the karate-robot dependency to your project:
<dependency>
<groupId>io.karatelabs</groupId>
<artifactId>karate-robot</artifactId>
<version>1.5.1</version>
<scope>test</scope>
</dependency>
Gradle Setup
testImplementation 'io.karatelabs:karate-robot:1.5.1'
Basic Example
Automate a simple desktop calculator:
Feature: Calculator Test
Scenario: Basic desktop automation
* robot { window: 'Calculator' }
* click('button-7')
* click('button-plus')
* click('button-3')
* click('button-equals')
* match text('.result') == '10'
- Use robot { window: 'name' } to target applications
- Supports exact, contains, or regex window matching
- Works with native OS windows and controls
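The three window-matching modes listed above can be pictured with a small plain-Java sketch. The class, enum, and method names are hypothetical and exist only for illustration; this is not karate-robot's API or implementation.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of karate-robot: models three ways to match a window title. */
final class TitleMatch {

    enum Mode { EXACT, CONTAINS, REGEX }

    /** Returns true when the window title satisfies the pattern under the given mode. */
    static boolean matches(Mode mode, String pattern, String title) {
        switch (mode) {
            case EXACT:
                return title.equals(pattern);
            case CONTAINS:
                return title.contains(pattern);
            case REGEX:
                // find() accepts a partial match, so anchors such as ^ and $ matter
                return Pattern.compile(pattern).matcher(title).find();
            default:
                throw new IllegalArgumentException("unknown mode: " + mode);
        }
    }
}
```

Under this model an exact match on 'Notepad' would miss a title such as 'Untitled - Notepad', which is why a contains or regex match is often the more practical choice for applications that put the document name in the title bar.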
Window Management
Exact Window Name
Target a window by its exact title:
Scenario: Target Notepad window
* robot { window: 'Notepad' }
* input('Hello from Karate!')
Window Name Pattern
Prefix the window name with ^ for a "contains" match, or with ~ for a regular-expression match:
Scenario: Target Chrome browser
* robot { window: '^Chrome' }
* input('karate dsl')
* input(Key.ENTER)
The ^ prefix matches any window whose title contains "Chrome".
Switching Between Windows
Work with multiple applications in a single test:
Scenario: Copy text between applications
* robot { window: 'Notepad' }
* input('Test data')
* input(Key.CONTROL + 'a')
* input(Key.CONTROL + 'c')
* robot { window: 'WordPad' }
* input(Key.CONTROL + 'v')
Mouse and Keyboard
Mouse Actions
Control mouse movements and clicks:
Scenario: Mouse interactions
* robot { window: 'MyApp' }
* click('button.submit')
* doubleClick('file.txt')
* rightClick('context-menu')
* move(100, 200)
* drag(50, 50, 200, 200)
Keyboard Input
Send text and special keys:
Scenario: Keyboard automation
* robot { window: 'TextEditor' }
* input('Hello World')
* input(Key.ENTER)
* input(Key.CONTROL + 'a')
* input(Key.DELETE)
Key Combinations
Common keyboard shortcuts:
Scenario: Keyboard shortcuts
* robot { window: 'Browser' }
* input(Key.CONTROL + 't')
* input('karatelabs.io')
* input(Key.ENTER)
* input(Key.CONTROL + 'w')
* input(Key.ALT + Key.F4)
Element Locators
Windows UI Automation
Use XPath-like selectors on Windows:
Scenario: Windows UI Automation
* robot { window: 'Calculator', root: true }
* click('//Button[@Name="Seven"]')
* click('//Button[@Name="Plus"]')
* click('//Button[@Name="Three"]')
* click('//Button[@Name="Equals"]')
* def result = text('//Text[@AutomationId="CalculatorResults"]')
* match result contains '10'
The root: true option enables Windows UI Automation for precise element targeting.
Image-Based Locators
Click elements using image matching:
Scenario: Image recognition
* robot { window: 'MyApp' }
* click('images/submit-button.png')
* waitFor('images/success-message.png')
* screenshot()
Place image files in your test resources directory (e.g., src/test/resources/images/).
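To make the idea of image-based location concrete, here is a deliberately naive Java sketch that searches a screenshot for an exact pixel-for-pixel copy of a reference image and derives the point a click would target. It is a stand-in for illustration only: the class and method names are hypothetical, and real matchers (including whatever karate-robot uses internally) typically tolerate scaling and small pixel differences, which this sketch does not.

```java
import java.awt.Point;
import java.awt.image.BufferedImage;

/** Hypothetical illustration of image-based location; not karate-robot's actual matcher. */
final class NaiveImageLocator {

    /** Returns the top-left corner of the first exact occurrence of needle in haystack, or null. */
    static Point find(BufferedImage haystack, BufferedImage needle) {
        int maxX = haystack.getWidth() - needle.getWidth();
        int maxY = haystack.getHeight() - needle.getHeight();
        // If the needle is larger than the haystack, the loops never run and null is returned
        for (int y = 0; y <= maxY; y++) {
            for (int x = 0; x <= maxX; x++) {
                if (matchesAt(haystack, needle, x, y)) {
                    return new Point(x, y);
                }
            }
        }
        return null;
    }

    /** Compares every needle pixel against the haystack region starting at (offX, offY). */
    private static boolean matchesAt(BufferedImage haystack, BufferedImage needle, int offX, int offY) {
        for (int y = 0; y < needle.getHeight(); y++) {
            for (int x = 0; x < needle.getWidth(); x++) {
                if (haystack.getRGB(offX + x, offY + y) != needle.getRGB(x, y)) {
                    return false;
                }
            }
        }
        return true;
    }

    /** The point a click would target: the center of the matched region. */
    static Point center(Point topLeft, BufferedImage needle) {
        return new Point(topLeft.x + needle.getWidth() / 2, topLeft.y + needle.getHeight() / 2);
    }
}
```

Because exact comparison breaks as soon as the display scale or theme changes, reference images are best captured on the same machine configuration that runs the tests.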
OCR Text Recognition
Locate and interact with elements using OCR:
Scenario: OCR-based interaction
* robot { window: 'MyApp' }
* click('{ocr}Submit')
* waitFor('{ocr}Success')
* def message = text('{ocr}')
* match message contains 'Operation completed'
OCR features require Tesseract OCR data files. Install Tesseract and ensure the data files are available on your system. Karate Robot will automatically detect the installation.
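Conceptually, an OCR locator has to turn recognized text into a screen position. The Java sketch below assumes the OCR engine has already produced a list of words with bounding boxes; the Word class and locate method are hypothetical names for illustration and do not reflect karate-robot's or Tesseract's API.

```java
import java.awt.Point;
import java.awt.Rectangle;
import java.util.List;

/** Hypothetical model of OCR output; names here are illustrative, not karate-robot API. */
final class OcrLocator {

    /** One recognized word together with its bounding box in screen coordinates. */
    static final class Word {
        final String text;
        final Rectangle box;

        Word(String text, Rectangle box) {
            this.text = text;
            this.box = box;
        }
    }

    /** Returns the center of the first recognized word equal to target, or null if absent. */
    static Point locate(List<Word> words, String target) {
        for (Word w : words) {
            if (w.text.equals(target)) {
                return new Point((int) w.box.getCenterX(), (int) w.box.getCenterY());
            }
        }
        return null;
    }
}
```

A null result corresponds to the text not being visible yet, which is the case a waitFor step would keep polling on.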
Advanced Patterns
Mixing Web and Desktop Automation
Combine browser and desktop automation for file uploads:
Feature: File Upload Flow
Scenario: Upload file via native dialog
# Start with web automation
* configure driver = { type: 'chrome' }
* driver 'https://example.com/upload'
* click('input[type="file"]')
# Switch to desktop automation for file dialog
* robot { window: 'Open' }
* input('C:\\Documents\\report.pdf')
* input(Key.ENTER)
# Back to web automation
* waitFor('.upload-status')
* match text('.upload-status') == 'Upload complete'
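Karate evaluates step arguments as JavaScript, so a single backslash inside a quoted string starts an escape sequence (for example, \r becomes a carriage return), and Windows paths need each backslash doubled. If the path comes from Java code that generates or parameterizes a test, a helper along these lines could prepare it; the class and method names are hypothetical.

```java
/** Hypothetical helper: prepares a Windows path for use inside a quoted JavaScript string. */
final class JsPath {

    /** Doubles each backslash so the JavaScript parser yields the original path. */
    static String escapeBackslashes(String path) {
        // String.replace works on literal text, not regex, so "\\" here is one backslash
        return path.replace("\\", "\\\\");
    }
}
```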
Configuration Options
Configure retry behavior and timing:
Scenario: Custom robot configuration
* robot { window: 'MyApp', retryCount: 5, retryInterval: 1000, highlight: true, screenshotOnFailure: true }
* click('submit-button')
Configuration options:
- retryCount: Number of retry attempts (default: 3)
- retryInterval: Milliseconds between retries (default: 3000)
- highlight: Highlight elements before interaction (default: true)
- screenshotOnFailure: Capture screenshot on failure (default: true)
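The interplay of retryCount and retryInterval can be sketched as a simple polling loop. This is a minimal model under the assumption that the pause happens only between attempts; the Retry class is hypothetical and is not karate-robot's implementation.

```java
import java.util.function.Supplier;

/** Hypothetical sketch of retry semantics; not karate-robot's implementation. */
final class Retry {

    /**
     * Calls action up to retryCount times, sleeping retryIntervalMillis between attempts.
     * Returns the first non-null result, or null if every attempt came back empty.
     */
    static <T> T until(Supplier<T> action, int retryCount, long retryIntervalMillis) {
        for (int attempt = 1; attempt <= retryCount; attempt++) {
            T result = action.get();
            if (result != null) {
                return result;
            }
            if (attempt < retryCount) {
                try {
                    Thread.sleep(retryIntervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return null;
                }
            }
        }
        return null;
    }
}
```

Under this model, 3 attempts at 3000 ms would wait at most 2 × 3000 = 6000 ms between attempts before giving up, not counting the time each attempt itself takes. Lowering retryInterval, as in the example above, trades a shorter worst-case wait for more frequent polling.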
Testing Desktop Applications
Complete desktop application test:
Feature: Desktop App Testing
Background:
* robot { window: 'MyDesktopApp' }
Scenario: Login and navigation
* input('#username', 'admin')
* input('#password', 'secret')
* click('button.login')
* waitFor('.dashboard')
* match text('.welcome') contains 'Welcome, Admin'
* click('menu.reports')
* click('button.generate')
* waitFor('{ocr}Report Generated')
Platform-Specific Notes
Windows
UI Automation support:
- Full XPath-like selector support
- Access to native Windows controls
- Best performance and accuracy
Recommended approach:
* robot { window: 'MyApp', root: true }
* click('//Button[@Name="Submit"]')
macOS
Requirements:
- Enable Accessibility permissions for terminal/IDE
- Grant screen recording permissions
Window focus:
* robot { window: '^MyApp' }
* activate()
Use activate() to bring the window to the front if needed.
Linux
Requirements:
- X11 display server
- Compatible window manager
Display configuration:
export DISPLAY=:0
Set the DISPLAY environment variable before running tests.
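A missing DISPLAY variable tends to surface as an obscure failure deep inside a test run, so it can help to check for it up front. The following pre-flight check is a hypothetical sketch that could be called from a test runner before any scenario starts; it is not part of karate-robot.

```java
import java.util.Map;

/** Hypothetical pre-flight check for Linux test runs; not part of karate-robot. */
final class DisplayCheck {

    /** Returns null when DISPLAY is set, otherwise a message explaining what to fix. */
    static String problem(Map<String, String> env) {
        String display = env.get("DISPLAY");
        if (display == null || display.trim().isEmpty()) {
            return "DISPLAY is not set; export DISPLAY=:0 (or your X11 display) before running tests";
        }
        return null;
    }
}
```

In practice it would be called as DisplayCheck.problem(System.getenv()), failing the run early with the returned message when it is not null.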
Debugging and Troubleshooting
Screenshot on Failure
Automatically capture screenshots when tests fail:
Scenario: Auto-screenshot
* robot { window: 'MyApp', screenshotOnFailure: true }
* click('button-that-might-fail')
Element Highlighting
Highlight elements before clicking for debugging:
Scenario: Visual debugging
* robot { window: 'MyApp', highlight: true }
* click('submit-button')
Wait Strategies
Use retry and wait for dynamic content:
Scenario: Wait for element
* robot { window: 'MyApp' }
* waitFor('success-message')
* retry(5, 2000).click('dynamic-button')
Next Steps
- Combine with web automation: UI Testing
- Use image comparison for validation: Image Comparison
- Create reusable automation flows: Calling Features
- Explore complete examples: Examples and Demos