What is ODD or Open-Source Data Discovery?
ODD (Open-Source Data Discovery) is a self-hosted, open-source tool that helps data teams find data and share access to it. It cuts down the time spent searching by putting datasets behind a single, searchable interface.
Beyond discovery, ODD brings transparency by tracking who uses what data and how, enabling accountability and trust. It actively supports a healthy data culture by continuously monitoring data quality and compliance, reducing risks and manual overhead.
Ultimately, ODD accelerates insights by turning chaotic data exploration into a structured, collaborative, and reliable process, empowering teams to act faster with confidence.
Features
- Federated Discovery: Instant search across all data sources via a unified catalog.
- End-to-End Lineage: Visualize data flow from source ingestion to final dashboards.
- ML-First Design: Auto-log experiment parameters and link models to data.
- Governance & Security: Granular access control, compliance tagging, and usage auditing.
- Data Quality: Real-time monitoring with support for dbt and Great Expectations.
- Reference Data Hub: Centralized management for lookup tables (currencies, codes).
- Collaborative: Safer deprecation workflows with downstream risk analysis.
- Open & Private: Fully extensible, self-hosted, and privacy-first architecture.
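The downstream risk analysis mentioned under lineage and deprecation can be pictured as a traversal of the lineage graph. The sketch below is illustrative only and does not use ODD's API: the `LINEAGE` dictionary, the entity names, and the `downstream` helper are all hypothetical, and the graph is assumed to be a plain mapping from each entity to the entities it feeds.

```python
from collections import deque

# Hypothetical lineage graph (not ODD's data model):
# each key feeds the entities listed as its value.
LINEAGE = {
    "raw.orders": ["staging.orders"],
    "staging.orders": ["marts.revenue", "ml.churn_features"],
    "marts.revenue": ["dashboard.revenue"],
    "ml.churn_features": ["model.churn"],
}


def downstream(graph, start):
    """Return every entity reachable from `start`, in breadth-first order."""
    seen = {start}  # guards against cycles leading back to the start
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Everything that would be affected by deprecating staging.orders.
affected = downstream(LINEAGE, "staging.orders")
print(affected)
```

An empty result means nothing depends on the entity, so it could be deprecated without touching other assets in this toy graph.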
Available Integrations
- Airflow
- Airflow 2+
- Apache Druid
- Cassandra
- ClickHouse
- Elasticsearch
- Hive
- Kafka
- Feast
- MSSQL
- MySQL
- Microsoft ODBC
- MongoDB
- Neo4j
- MariaDB
- Oracle
- PostgreSQL
- Redshift
- Snowflake
- Vertica
- Tarantool
- Athena
- DynamoDB
- Glue
- Kinesis
- QuickSight
- S3
- SageMaker
- SageMaker Feature Store
- SQS
- Delta Lake (S3)
- Tableau
- Cube
- Superset
- Power BI
- Trino
- Presto
- dbt
- Redash
- Spark
- MLflow
- Kubeflow
- Databricks Unity Catalog
- Great Expectations
- SQLite
- Couchbase
- CockroachDB
- Fivetran
- Airbyte
- Metabase
- Mode
- BigQuery
- SingleStore
- Bigtable
- Google Cloud Storage
- Blob Storage
- DuckDB
- ScyllaDB
License
Apache-2.0 License.




