What is OpenApps?
OpenApps is a lightweight, Python-based framework designed for researching, training, and evaluating multimodal AI agents that interact with applications just like humans (via clicking, typing, and scrolling).
Key Features
- Scalable Data: It can generate thousands of unique app versions by modifying state and design, providing virtually unlimited training data.
- Low Overhead: It runs on a single CPU without the need for complex OS emulators or Docker containers.
- Dynamic Modification: App content and appearance (e.g., titles, variables) can be modified instantly via configuration files (config/apps) or command-line overrides.
- Flexible Runtime: Supports running in either headless mode (for speed) or live mode (to visually watch the agent solve tasks).
- Accurate Evaluation: It provides ground-truth rewards based on the application’s underlying state, with all logic transparently accessible in Python.
- Human-Like Interaction: Enables multimodal agents to interact with applications exactly as humans do, by clicking, typing, and scrolling.
- Unlimited Training Data: Generates thousands of unique app versions by dynamically configuring state and design, solving data scarcity for UI agents.
- Multi-Model Support: Compatible with major LLMs, including OpenAI and Claude models as well as models served through vLLM (such as UI-TARS).
- Pre-built Agents: Includes ready-to-use agent configurations such as:
  - GPT-5-1: For advanced task solving.
  - Dummy Agent: For random exploration (random clicks).
- Task-Specific Execution: Allows launching agents with specific goals (e.g., task_name=add_meeting_with_dennis).
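
To make a few of the features above concrete, here is a minimal Python sketch. It is not OpenApps code and does not use the OpenApps API. It only illustrates, under stated assumptions, how key=value overrides can change an app configuration, how many app variants can be enumerated from a few design options, what a random-click action might look like, and how a reward can be computed from application state. Every name in it (ToyCalendar, apply_overrides, make_variants, random_click, reward, and the "theme" key) is hypothetical; only task_name=add_meeting_with_dennis comes from the text above.

```python
import itertools
import random
from dataclasses import dataclass, field


@dataclass
class ToyCalendar:
    """Hypothetical stand-in for an app; not an OpenApps class."""
    title: str = "Calendar"
    meetings: list = field(default_factory=list)


def apply_overrides(config, overrides):
    """Return a copy of config updated from 'key=value' strings."""
    merged = dict(config)
    for item in overrides:
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        merged[key] = value
    return merged


def make_variants(titles, themes):
    """Enumerate every title/theme combination as one app configuration."""
    return [{"title": t, "theme": th} for t, th in itertools.product(titles, themes)]


def random_click(rng, width, height):
    """Dummy-agent action: a click at a uniformly random pixel."""
    return {"type": "click", "x": rng.randrange(width), "y": rng.randrange(height)}


def reward(app, task_name):
    """Score a task from the app's state rather than from a screenshot."""
    if task_name == "add_meeting_with_dennis":
        return 1.0 if "Dennis" in app.meetings else 0.0
    raise ValueError(f"unknown task: {task_name!r}")
```

For example, apply_overrides({"title": "Calendar"}, ["title=Planner"]) returns {"title": "Planner"}, and reward(ToyCalendar(), "add_meeting_with_dennis") stays at 0.0 until "Dennis" is appended to the app's meetings list. Reading the reward from state rather than pixels is what lets the same check remain valid across every visual variant of the app.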



