Cleanlab: The 11K-Star AI Data Cleaning Toolkit

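Cleanlab is an open-source AI toolkit with 11K+ GitHub stars that finds and fixes data quality issues in ML datasets. It provides automatic label error detection, missing value imputation, and data cleaning for classification, regression, and clustering tasks. This article covers an installation guide, benchmarks, and production deployment.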

description: 'Cleanlab is an open-source AI toolkit with 11K+ GitHub stars that finds and fixes data quality issues in ML datasets. Automatic label error detection, missing value imputation, and data cleansing for classification, regression, and clustering tasks. Includes setup guide, benchmarks, and production deployment.'
tags: ['ai', 'data-cleaning', 'ml', 'open-source', 'self-hosted']
date: 2026-06-10
slug: 'cleanlab-11k-star-ai-data-cleaning'
category: data-science
github_repo: 'https://github.com/cleanlab/cleanlab'
license: MIT
lang: zh
faqs:

Cleanlab: The 11K-Star AI Toolkit That Cuts Data Annotation Costs by 80% - Open-Source Data Cleaning with Python