What This Stack Does
This stack addresses the reproducibility crisis in AI development: teams cannot reliably recreate training conditions when the underlying data has changed between runs. Dolt serves as the versioned data store where every dataset modification is committed with full Git semantics. Teams branch datasets to experiment with different preprocessing strategies, diff branches to understand exactly which rows changed, and merge validated modifications back to the main branch.
Versioned Data and Unified Ingestion
Dolt anchors the stack as the single source of truth for structured training data. Its MySQL wire protocol means existing data engineering tools connect without modification. Data scientists use familiar SQL to query, transform, and audit datasets while the database automatically tracks every change. When a model's performance regresses, the team can diff the current training data against the version used for the previous successful training run to identify the exact rows that changed.
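As a concrete illustration, such a regression hunt can be expressed through Dolt's `dolt_diff` table function. The sketch below only builds the SQL string; the table name `training_samples`, the `id` column, and the refs are hypothetical examples, and the commented-out connection uses a standard MySQL client, since Dolt speaks the MySQL wire protocol.

```python
# Minimal sketch: build a row-level diff query against Dolt's dolt_diff
# table function. Table name, column names, and refs are hypothetical.

def build_diff_query(table: str, from_ref: str, to_ref: str) -> str:
    """Return SQL listing rows added, removed, or modified between two
    Dolt refs (branch names, tags, or commit hashes)."""
    return (
        f"SELECT diff_type, from_id, to_id "
        f"FROM dolt_diff('{from_ref}', '{to_ref}', '{table}')"
    )

query = build_diff_query("training_samples", "last-good-run", "HEAD")
print(query)

# To execute, connect with any MySQL client, e.g.:
#   import mysql.connector
#   conn = mysql.connector.connect(host="127.0.0.1", user="root",
#                                  database="training")
#   cursor = conn.cursor()
#   cursor.execute(query)
```

Because the diff is just another SQL result set, the same query slots into whatever notebook or pipeline the team already uses.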
OpenBB feeds the stack with financial and market data from over 100 providers through a unified Python SDK. For teams building fintech AI models, OpenBB normalizes data from disparate sources into consistent formats that Dolt can version. Each data pull is committed with metadata about the source, timestamp, and provider, creating a complete provenance chain from raw market data through to the model that was trained on it.
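One way to carry that provenance is to embed it in the Dolt commit message for each pull. The sketch below shows the idea under stated assumptions: the message format and field names are illustrative, not an OpenBB or Dolt requirement, and the commented-out SDK call reflects the OpenBB Platform Python interface.

```python
# Hedged sketch: stamp each ingestion commit with source, provider, and
# timestamp so every row traces back to its origin. The message format
# here is an illustrative convention, not a Dolt requirement.
from datetime import datetime, timezone

def provenance_commit_message(source: str, provider: str, symbol: str) -> str:
    """Build a Dolt commit message embedding the provenance chain."""
    pulled_at = datetime.now(timezone.utc).isoformat()
    return f"ingest {symbol} from {source} via {provider} at {pulled_at}"

# A pull through the OpenBB Python SDK might look like (requires openbb):
#   from openbb import obb
#   df = obb.equity.price.historical("AAPL", provider="yfinance").to_df()
# After loading df into Dolt, commit with:
#   CALL DOLT_COMMIT('-Am', '<message built below>');
msg = provenance_commit_message("equity.price.historical", "yfinance", "AAPL")
print(msg)
```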
Evaluation and Branch-Per-Experiment Workflow
SWE-bench provides the evaluation harness for measuring how code generation agents perform across dataset versions. Teams can branch the training data, retrain or fine-tune their agent, and run SWE-bench evaluations to measure whether the data change improved real-world software engineering task resolution. This creates a tight feedback loop between data curation and agent capability measurement.
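That feedback loop bottoms out in a harness invocation per experiment branch. The sketch below assembles the command line for the SWE-bench evaluation harness; the flag names follow the public swebench package, but the predictions file and run ID are hypothetical, and actually running it requires the harness's own environment (Docker, etc.).

```python
# Hedged sketch: build the SWE-bench Lite evaluation command for one
# experiment. Predictions path and run_id are hypothetical examples.

def swebench_command(predictions_path: str, run_id: str) -> list[str]:
    """Command line for evaluating agent predictions with the swebench
    harness against the SWE-bench Lite dataset."""
    return [
        "python", "-m", "swebench.harness.run_evaluation",
        "--dataset_name", "princeton-nlp/SWE-bench_Lite",
        "--predictions_path", predictions_path,
        "--run_id", run_id,
    ]

cmd = swebench_command("preds_exp_dedupe.jsonl", "exp-dedupe")
print(" ".join(cmd))
```

Keying `run_id` to the Dolt branch name makes it trivial to pair each evaluation report with the exact dataset version that produced it.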
The workflow follows a branch-per-experiment pattern. A data scientist creates a branch, applies a preprocessing change, commits it, runs training on the branched dataset, evaluates results, and submits a pull request on DoltHub if the change improves model performance. Reviewers inspect the row-level diff, verify the preprocessing logic, and merge. The main branch always contains the team's best-validated training data.
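The SQL side of that loop can be sketched as the statements a data scientist would issue against Dolt. `DOLT_CHECKOUT`, `DOLT_COMMIT`, and `DOLT_MERGE` are Dolt's stored procedures; the branch name, table, and preprocessing change below are hypothetical, and the pull-request review itself happens on DoltHub, outside SQL.

```python
# Hedged sketch: the branch-per-experiment loop as Dolt SQL. Branch,
# table, and the UPDATE are illustrative examples only.

def experiment_statements(branch: str, message: str) -> list[str]:
    """SQL statements for one experiment: branch, change data, commit."""
    return [
        f"CALL DOLT_CHECKOUT('-b', '{branch}');",  # branch off main
        # hypothetical preprocessing change applied on the branch:
        "UPDATE training_samples SET price = price / split_ratio;",
        f"CALL DOLT_COMMIT('-Am', '{message}');",  # commit the row changes
        # after DoltHub review, back on main: CALL DOLT_MERGE('<branch>');
    ]

for stmt in experiment_statements("exp/split-adjust", "split-adjust prices"):
    print(stmt)
```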
The Bottom Line
All tools in this stack are open-source. Dolt is Apache 2.0, OpenBB is Apache 2.0, and SWE-bench is MIT. The managed offerings — Hosted Dolt and OpenBB Enterprise — provide team features for organizations that need SLA guarantees and dedicated support. The stack runs entirely self-hosted for teams with privacy requirements around training data.