What is Adala?
Adala is a paradigm shift in how we approach data labeling and agent intelligence. At its core, Adala is an Autonomous Data (Labeling) Agent framework designed to turn raw data into high-quality, structured training signals through intelligent, self-improving agents.
What makes Adala stand out? It moves beyond static, rule-based labeling or one-off LLM prompts. Instead, it enables the creation of autonomous agents that learn and refine their skills iteratively, not in isolation, but by interacting with a defined environment, observing outcomes, and reflecting on their own performance.
This “runtime” environment is grounded in real-world data: you provide the ground truth dataset, and the agents learn from it, adapting their behavior over time like a skilled apprentice under expert supervision.
For me as a developer, this means reliability without rigidity. The agents are built on a foundation of verified data, ensuring consistent, trustworthy outputs, which is critical when scaling labeling across thousands of documents or complex multimodal inputs. But reliability does not come at the cost of flexibility.
With Adala, I can precisely control the output: set strict constraints for consistency (e.g., “only return ‘yes’ or ‘no’”), or allow adaptive reasoning for nuanced tasks (e.g., “explain your confidence level”). This balance between structure and autonomy is what makes Adala uniquely powerful.
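As a conceptual sketch (not the Adala API), a strict constraint like "only return 'yes' or 'no'" amounts to validating the model's raw text against an allowed label set. The `constrain` helper below is hypothetical, for illustration only:

```python
# Hypothetical sketch of constraint-based output control (not the Adala API):
# reject any model output that falls outside the allowed label set.
ALLOWED_LABELS = {"yes", "no"}

def constrain(raw_output: str) -> str:
    """Normalize a raw LLM response and enforce the allowed label set."""
    token = raw_output.strip().lower().rstrip(".")
    if token not in ALLOWED_LABELS:
        raise ValueError(
            f"Model returned {raw_output!r}; expected one of {sorted(ALLOWED_LABELS)}"
        )
    return token

print(constrain(" Yes. "))  # yes
```

Frameworks like Adala implement this kind of check inside the skill itself, so downstream code never sees an out-of-vocabulary label.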
Adala’s specialization in data processing, especially labeling, is where it truly shines. Whether I’m annotating medical records, classifying financial statements, or tagging images for autonomous driving models, Adala lets me build agents trained not just to label, but to understand context, resolve ambiguity, and improve over time.
This turns data labeling from a bottleneck into a dynamic, self-optimizing pipeline.
Who can use it?
AI engineers, data scientists, ML ops teams, and even domain experts who want to accelerate data preparation without sacrificing quality.
It’s ideal for anyone working with large-scale datasets where accuracy, consistency, and speed are non-negotiable, like in healthcare, finance, or safety-critical systems.
Adala Core Features: Autonomous Data Labeling
- Autonomous Skill Acquisition: Implements agents that independently acquire and refine skills through iterative learning, observation, and reflection, reducing manual oversight.
- Ground Truth Anchoring: Ensures high-reliability output by basing agent learning on user-defined ground truth datasets, creating a trustworthy foundation for enterprise data needs.
- Constraint-Based Output Control: Offers granular control over agent outputs with flexible constraints, allowing users to enforce strict guidelines or permit adaptive creativity based on project requirements.
- Agnostic LLM Runtimes: Features a flexible, extensible runtime environment that supports multiple LLM backends and advanced architectures like Student/Teacher models for optimized cost and performance.
- Native Python & Dataframe Integration: Seamlessly integrates with Python notebooks and large Dataframes, streamlining workflows for Data Scientists and ML Engineers without complex infrastructure changes.
- Iterative Self-Optimization: Agents continuously evolve by analyzing their operating environment and feedback loops, ensuring long-term adaptability to changing data patterns.
Why is it important?
Because data quality is the bedrock of AI success. Most model failures stem not from poor architecture, but from flawed or biased training data. Adala addresses this head-on by embedding learning, reflection, and control directly into the labeling process.
It allows us to fail fast in development rather than in production, much like a sterile field in medicine: problems are caught and contained before they can spread. By building reliable, controllable, and self-improving agents, Adala turns data curation from a chore into a strategic asset. For developers like me, it's not just a tool; it's a step toward responsible, efficient AI development.
Below is a concise, developer-friendly guide to installing and using Adala.
Installation
Choose the installation method that best fits your environment.
Standard PyPI Install
```shell
pip install adala
```
Bleeding Edge (GitHub)
Recommended for the latest updates.
```shell
pip install git+https://github.com/HumanSignal/Adala.git
```
Developer Setup (Poetry)
```shell
git clone https://github.com/HumanSignal/Adala.git
cd Adala/
poetry install
```
Prerequisites
Before running the agent, export your API key.
```shell
# For OpenAI
export OPENAI_API_KEY='your-openai-api-key'

# For OpenRouter (optional)
export OPENROUTER_API_KEY='your-openrouter-api-key'
```
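Before running anything, it can help to verify that a key is actually visible to Python. The `resolve_api_key` helper below is purely illustrative (not part of the Adala package); it prefers OpenAI and falls back to OpenRouter:

```python
import os

def resolve_api_key(env=None):
    """Return (provider, key), preferring OpenAI over OpenRouter.

    Illustrative helper only; not part of the Adala package.
    """
    env = os.environ if env is None else env
    for provider, var in (("openai", "OPENAI_API_KEY"),
                          ("openrouter", "OPENROUTER_API_KEY")):
        key = env.get(var)
        if key:
            return provider, key
    raise RuntimeError("Export OPENAI_API_KEY or OPENROUTER_API_KEY first.")

# Check against a fake environment so the example is self-contained
print(resolve_api_key({"OPENROUTER_API_KEY": "sk-test"}))  # ('openrouter', 'sk-test')
```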
Quickstart: Sentiment Analysis
This script demonstrates how to train an autonomous agent to classify sentiment using OpenAI's GPT-4o.
1. Setup & Data
Define your ground truth (training) and target (test) datasets.
```python
import pandas as pd

from adala.agents import Agent
from adala.environments import StaticEnvironment
from adala.skills import ClassificationSkill
from adala.runtimes import OpenAIChatRuntime

# Ground truth data for the agent to learn from
train_df = pd.DataFrame([
    ["It was the negative first impressions, and then it started working.", "Positive"],
    ["Not loud enough and doesn't turn on like it should.", "Negative"],
    ["I don't know what to say.", "Neutral"],
    ["Manager was rude, but mic shows flat frequency response.", "Positive"],
    ["The phone doesn't seem to accept anything except CBR mp3s.", "Negative"],
    ["I tried it before, I bought this device for my son.", "Neutral"],
], columns=["text", "sentiment"])

# New data to classify
test_df = pd.DataFrame([
    "All three broke within two months of use.",
    "The device worked for a long time, can't say anything bad.",
    "Just a random line of text.",
], columns=["text"])
```
2. Configure & Train Agent
Initialize the agent with a ClassificationSkill and an OpenAI runtime, then trigger the learning loop.
```python
agent = Agent(
    # 1. Connect to the environment (ground truth)
    environment=StaticEnvironment(df=train_df),

    # 2. Define the skill to learn
    skills=ClassificationSkill(
        name='sentiment',
        instructions="Label text as positive, negative or neutral.",
        labels=["Positive", "Negative", "Neutral"],
        input_template="Text: {text}",
        output_template="Sentiment: {sentiment}",
    ),

    # 3. Define the runtime (LLM backend)
    runtimes={
        'openai': OpenAIChatRuntime(model='gpt-4o'),
    },
    teacher_runtimes={
        'default': OpenAIChatRuntime(model='gpt-4o'),
    },
    default_runtime='openai',
)

# Start autonomous learning:
# the agent iterates to improve accuracy on the training data
agent.learn(learning_iterations=3, accuracy_threshold=0.95)

# Run prediction on new data
print('\n=> Run tests ...')
predictions = agent.run(test_df)
print(predictions)
```
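`agent.run` returns predictions for the new rows. Assuming the output lands in a `sentiment` column (a guess based on the skill's `output_template`; adjust to what your version actually returns), a quick spot-check against hand labels takes only a couple of lines of pandas. The labels below are illustrative, not real model output:

```python
import pandas as pd

# Illustrative spot-check: the "sentiment" column name is assumed from the
# skill's output_template; the labels here stand in for real predictions.
preds = pd.DataFrame({
    "text": [
        "All three broke within two months of use.",
        "The device worked for a long time, can't say anything bad.",
    ],
    "sentiment": ["Negative", "Positive"],
})
hand_labels = ["Negative", "Positive"]

# Fraction of rows where the predicted label matches the hand label
accuracy = (preds["sentiment"] == pd.Series(hand_labels)).mean()
print(f"spot-check accuracy: {accuracy:.0%}")  # spot-check accuracy: 100%
```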
Alternative: Using OpenRouter (Claude, Gemini, etc.)
To use other models via OpenRouter (e.g., Claude 3.5 Haiku), simply swap the runtimes configuration in the Agent setup:
```python
import os

# ... (Imports and DataFrames remain the same) ...

agent = Agent(
    environment=StaticEnvironment(df=train_df),
    skills=ClassificationSkill(
        name='sentiment',
        instructions="Label text as positive, negative or neutral.",
        labels=["Positive", "Negative", "Neutral"],
        input_template="Text: {text}",
        output_template="Sentiment: {sentiment}",
    ),
    runtimes={
        'openrouter': OpenAIChatRuntime(
            base_url="https://openrouter.ai/api/v1",
            model="anthropic/claude-3.5-haiku",
            api_key=os.getenv("OPENROUTER_API_KEY"),
            provider="Custom",
        ),
    },
    default_runtime='openrouter',
    teacher_runtimes={
        "default": OpenAIChatRuntime(
            base_url="https://openrouter.ai/api/v1",
            model="anthropic/claude-3.5-haiku",
            api_key=os.getenv("OPENROUTER_API_KEY"),
            provider="Custom",
        ),
    },
)

# ... (agent.learn and agent.run remain the same) ...
```
Use Cases with Real-Life Examples
- TextGenerationSkill
  - Summary: Creates new, coherent text content based on specific input prompts.
  - Importance: Essential for automating creative workflows and scaling content production without manual effort.
  - Use Cases: Drafting email responses, generating marketing ad copy, or creating fictional stories.
- OntologyCreator
  - Summary: Analyzes raw text examples to automatically infer and structure underlying concepts and relationships (ontologies).
  - Importance: Critical for turning unstructured data into structured knowledge graphs, reducing the workload of data modeling.
  - Use Cases: Building knowledge bases from scratch, organizing enterprise documentation, or defining categories for a new dataset.
- ClassificationSkill
  - Summary: Sorts text inputs into a defined set of categories or labels.
  - Importance: The bedrock of data organization; it allows systems to instantly filter and route massive amounts of information.
  - Use Cases: Spam detection in emails, sentiment analysis of reviews, or tagging support tickets by topic.
- Skill Sets
  - Summary: Orchestrates multiple individual skills into a sequential pipeline to handle complex, multi-step tasks.
  - Importance: Enables the automation of sophisticated workflows that require logic, transformation, and decision-making beyond a single step.
  - Use Cases: A pipeline that first translates a user review, then classifies its sentiment, and finally generates an appropriate response.
- Math Reasoning (GSM8k)
  - Summary: Solves grade-school level mathematical problems requiring multi-step logic.
  - Importance: Essential for verifying an agent's ability to handle logic, numbers, and sequential reasoning, not just language.
  - Use Cases: Educational tutoring bots, financial data verification, or automated invoice processing.
- SummarizationSkill
  - Summary: Distills long passages of text into shorter, concise versions while retaining key information.
  - Importance: Solves information overload by allowing users to grasp the gist of documents quickly.
  - Use Cases: Generating executive summaries of reports, creating news headlines, or condensing meeting transcripts.
- ClassificationSkillWithCoT (Chain-of-Thought)
  - Summary: Classifies text but requires the agent to "show its work" by reasoning through the problem before assigning a label.
  - Importance: Drastically improves accuracy on ambiguous or complex tasks and provides transparency (audit trails) for why a decision was made.
  - Use Cases: Moderating subtle hate speech, medical diagnosis assistance, or complex legal document classification.
- TranslationSkill
  - Summary: Converts text from a source language into a target language.
  - Importance: Facilitates global communication and accessibility by removing language barriers in real time.
  - Use Cases: Localizing software interfaces, multilingual customer support chat, or translating international news.
- QuestionAnsweringSkill
  - Summary: Extracts specific answers to queries based on a provided context or document.
  - Importance: Key for Retrieval-Augmented Generation (RAG); it ensures answers are grounded in factual data rather than hallucinated.
  - Use Cases: Chatbots that answer questions about company policy, legal research assistants, or extracting specific data points from contracts.
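To make the Skill Sets idea concrete, here is a plain-Python sketch (not the Adala API) of the translate-then-classify-then-respond pipeline described above: each "skill" is a function over a record, and the set applies them in order. The translation step is faked for illustration:

```python
# Conceptual sketch of a skill set (not the Adala API): each skill maps a
# record dict to an enriched record, and the pipeline applies them in order.
def translate(record):
    # Fake translation step, for illustration only
    record["english"] = record["review"].replace("très bon", "very good")
    return record

def classify(record):
    record["sentiment"] = "Positive" if "good" in record["english"] else "Negative"
    return record

def respond(record):
    record["reply"] = f"Thanks for the {record['sentiment'].lower()} review!"
    return record

def run_skill_set(record, skills):
    for skill in skills:
        record = skill(record)
    return record

result = run_skill_set({"review": "C'est très bon"}, [translate, classify, respond])
print(result["sentiment"], "|", result["reply"])  # Positive | Thanks for the positive review!
```

In Adala proper, each stage would be a Skill with its own templates, and the framework would handle passing outputs from one skill to the next.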
License
Adala is a free, open-source project released under the Apache-2.0 License.