# data-centric-ai
2 tools tagged
Cleanlab
AI-powered data quality for ML datasets
Cleanlab is a data-centric AI library that automatically detects label errors, outliers, and other data quality issues in machine learning datasets and helps fix them. It works with any ML model and with text, image, tabular, and audio data, analyzing model predictions to identify mislabeled examples, near-duplicates, and ambiguous data points. Cleanlab helps teams improve model accuracy by cleaning training data rather than tuning model architecture.
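To make "analyzing model predictions to identify mislabeled examples" concrete, here is a minimal pure-Python sketch of the underlying idea. It is a simplified illustration, not Cleanlab's API or its exact algorithm: the function name `find_likely_label_issues` and the toy data are hypothetical, and in practice the predicted probabilities should come from held-out predictions (for example, cross-validation) rather than from a model scored on its own training data.

```python
def find_likely_label_issues(labels, pred_probs):
    """Return indices of examples whose given label looks suspicious.

    Hypothetical helper for illustration only; not part of Cleanlab.
    """
    num_classes = len(pred_probs[0])

    # Per-class threshold: the mean predicted probability of class k
    # over the examples that are currently labeled k.
    thresholds = []
    for k in range(num_classes):
        probs_k = [p[k] for p, y in zip(pred_probs, labels) if y == k]
        thresholds.append(sum(probs_k) / len(probs_k) if probs_k else 1.0)

    issues = []
    for i, (p, y) in enumerate(zip(pred_probs, labels)):
        # Classes the model is confident about for this example.
        confident = [k for k in range(num_classes) if p[k] >= thresholds[k]]
        if confident:
            best = max(confident, key=lambda k: p[k])
            # Flag the example when the confident class disagrees with its label.
            if best != y:
                issues.append(i)
    return issues


# Toy data (made up): six examples, two classes.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],  # labeled 0, but the model strongly favors class 1
    [0.2, 0.8],
    [0.1, 0.9],
    [0.3, 0.7],
]
suspect = find_likely_label_issues(labels, pred_probs)
print(suspect)
```

Flagged examples are candidates for human review or relabeling, not confirmed errors.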
Snorkel AI
Data-centric AI platform for programmatic data labeling
Snorkel AI is a data-centric AI platform that enables programmatic labeling of training data through labeling functions rather than manual annotation. Spun out of the Stanford AI Lab, it lets teams write Python functions that encode domain heuristics to label data at scale; the platform then combines the resulting weak labels into high-quality training sets. It is used by Fortune 500 companies to label text, image, and structured data.
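The labeling-function idea can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not Snorkel's API: the three heuristics, the label constants, and the `majority_vote` combiner are all hypothetical, and a simple majority vote stands in for the platform's own method of combining weak labels.

```python
from collections import Counter

# Hypothetical label constants for a toy spam task.
ABSTAIN, HAM, SPAM = -1, 0, 1


def lf_contains_link(text):
    # Heuristic: messages with links are often spam.
    return SPAM if "http" in text.lower() else ABSTAIN


def lf_mentions_prize(text):
    # Heuristic: prize offers are often spam.
    return SPAM if "prize" in text.lower() else ABSTAIN


def lf_short_greeting(text):
    # Heuristic: very short messages are usually legitimate.
    return HAM if len(text.split()) <= 3 else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_link, lf_mentions_prize, lf_short_greeting]


def majority_vote(text):
    """Combine labeling-function votes; abstain on no votes or a tie."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN
    return counts[0][0]


texts = [
    "Claim your prize at http://example.com now",
    "See you soon",
    "The meeting moved to Thursday afternoon",
]
weak_labels = [majority_vote(t) for t in texts]
print(weak_labels)
```

Examples where every function abstains stay unlabeled, which is why coverage across labeling functions matters as much as their individual accuracy.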