Can I use ZenML without Docker?

Yes. The default local stack runs entirely without Docker, and local development and testing need nothing beyond `pip install zenml`. Docker is only required for containerized execution on remote orchestrators such as Kubernetes, or Airflow in Docker mode.

How does ZenML handle data and artifact versioning?

Every artifact a step produces is automatically versioned using content hashing, and the artifact store (local, S3, or GCS) keeps all versions. You can retrieve any historical artifact via `Client().get_artifact_version(name, version)`, giving full reproducibility without manual data management.
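To make the idea of content-hash versioning concrete, here is a minimal sketch in plain Python. It is not ZenML's implementation: `TinyArtifactStore`, its methods, and the JSON serialization are all hypothetical, chosen only to show how hashing the serialized content lets a store decide whether a save is a new version.

```python
import hashlib
import json


class TinyArtifactStore:
    """Hypothetical in-memory store illustrating content-hash versioning."""

    def __init__(self):
        self._blobs = {}     # content hash -> serialized bytes
        self._versions = {}  # artifact name -> ordered list of content hashes

    def save(self, name, obj):
        """Store obj under name and return its 1-based version number."""
        data = json.dumps(obj, sort_keys=True).encode("utf-8")
        digest = hashlib.sha256(data).hexdigest()
        history = self._versions.setdefault(name, [])
        # Content identical to the latest version does not create a new one.
        if not history or history[-1] != digest:
            self._blobs[digest] = data
            history.append(digest)
        return len(history)

    def load(self, name, version):
        """Return the object stored as the given version of name."""
        digest = self._versions[name][version - 1]
        return json.loads(self._blobs[digest].decode("utf-8"))


store = TinyArtifactStore()
v1 = store.save("dataset", [1, 2, 3])
v1_again = store.save("dataset", [1, 2, 3])  # unchanged content
v2 = store.save("dataset", [1, 2, 3, 4])     # changed content
```

Saving unchanged content leaves the version at 1, while changed content produces version 2, and both versions stay retrievable through `load`.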

Is ZenML suitable for real-time inference pipelines?

ZenML is primarily designed for batch training and batch inference. For real-time serving, you train with ZenML, register the model, then deploy via built-in integrations for KServe, Seldon, or BentoML.

How do I switch a ZenML pipeline from local to cloud execution?

You change the active Stack with a single CLI command and your pipeline code stays the same. A Stack defines where the pipeline runs by combining components like the orchestrator (local, Airflow, Kubernetes, Vertex AI), artifact store (local, S3, GCS, Azure Blob), experiment tracker, and model registry.
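The decoupling described here can be sketched in a few lines of plain Python. This is only an illustration of the pattern, not ZenML code: the `Stack` dataclass, the two runner functions, the `STACKS` table, and the bucket path are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Step = Callable[[], str]


def run_locally(steps: List[Step]) -> List[str]:
    """Stand-in for a local orchestrator: run steps in-process."""
    return [f"local:{step()}" for step in steps]


def run_on_cluster(steps: List[Step]) -> List[str]:
    """Stand-in for a remote orchestrator; a real one would submit jobs."""
    return [f"cluster:{step()}" for step in steps]


@dataclass(frozen=True)
class Stack:
    name: str
    orchestrator: Callable[[List[Step]], List[str]]
    artifact_store: str


STACKS: Dict[str, Stack] = {
    "local_stack": Stack("local_stack", run_locally, "./artifacts"),
    "cloud_stack": Stack("cloud_stack", run_on_cluster, "s3://example-bucket/artifacts"),
}


def load_data() -> str:
    return "load_data"


def train_model() -> str:
    return "train_model"


def run_pipeline(active_stack: str) -> List[str]:
    """The pipeline definition is identical for every stack."""
    stack = STACKS[active_stack]
    return stack.orchestrator([load_data, train_model])


print(run_pipeline("local_stack"))
print(run_pipeline("cloud_stack"))
```

Only the stack name passed to `run_pipeline` changes between the two calls; the step functions and their order are untouched.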

What happens when a ZenML pipeline step fails?

ZenML supports configurable retry logic via `@step(retry=3)`, and failed runs are recorded with full stack traces in the dashboard. You can resume from the failed step using `zenml pipeline run --from-failure`, which reuses cached outputs from successful upstream steps.
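As a rough picture of what step-level retries do, the sketch below wraps a function in a generic retry decorator. It is plain Python, not ZenML's retry mechanism: `with_retries`, its parameters, and `flaky_step` are hypothetical names used only for illustration.

```python
import functools
import time


def with_retries(max_retries=3, delay=0.0, backoff=1.0):
    """Re-run the wrapped function up to max_retries times after a failure."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    # Out of retries: surface the original error.
                    if attempt == max_retries:
                        raise
                    time.sleep(wait)
                    wait *= backoff

        return wrapper

    return decorator


calls = {"count": 0}


@with_retries(max_retries=3)
def flaky_step():
    """Fails twice with a transient error, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


result = flaky_step()
```

The step succeeds on its third attempt, so the caller never sees the two transient failures. A step that keeps failing would raise its last error once the retries run out.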

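{{< 资源信息 >}}

## Introduction: Your ML Pipeline Is Broken

You trained a model yesterday. Today, you don't know which dataset version you used, which preprocessing steps ran, or which hyperparameters produced that 0.94 F1 score. Your Jupyter notebook has 47 cells, 12 of them commented out, and the important ones depend on a CSV file that exists only on your laptop. This is not a workflow. It's a liability.

A 2025 State of MLOps survey found that 68% of ML models never make it to production, and the top reason cited was "lack of reproducible pipelines." Not model accuracy. Not data quality. Reproducibility. When your pipeline is a collection of manual steps, you can't deploy it, audit it, or scale it.

ZenML (v0.80.0, released 2026-04-15) is an open-source MLOps framework built to solve this problem. With ~4,500 GitHub stars and an Apache-2.0 license, ZenML provides a unified abstraction layer that connects 20+ ML tools (experiment trackers, model registries, orchestrators, and deployment platforms) into a single, reproducible, version-controlled pipeline. You write Python. ZenML handles the plumbing.

In this guide, you'll set up ZenML in 5 minutes, connect it to popular tools like MLflow and Kubernetes, run a production-grade pipeline, and deploy the whole stack on your own infrastructure with DigitalOcean.

## What Is ZenML?

ZenML is an extensible, open-source MLOps framework for building portable, production-ready machine learning pipelines. It decouples your ML code from the infrastructure it runs on, so you can move from local development to cloud production without rewriting a line of pipeline logic.

At its core, ZenML treats an ML pipeline as a directed acyclic graph (DAG) of steps, where each step is a Python function. Steps produce and consume artifacts (datasets, models, metrics) that are automatically versioned, tracked, and stored. ZenML handles orchestration, artifact management, and tool integration; you focus on the ML logic.

## How ZenML Works: Architecture and Core Concepts

ZenML's architecture revolves around four key abstractions that map directly onto real ML workflow needs.

### Pipelines

A pipeline is a decorated Python function that chains multiple steps together. ZenML compiles this function into a DAG, validates dependencies, and executes it on the orchestrator of your choice.

### Steps

A step is the smallest unit of work: a Python function that performs one task (load data, preprocess, train, evaluate). Steps are decorated with `@step` and declare their inputs and outputs through type annotations.

### Artifacts

Every step output is an artifact: a typed, versioned object stored in the artifact store. Artifacts can be datasets (pandas DataFrames, NumPy arrays), models (sklearn, PyTorch, TensorFlow), or custom objects. ZenML automatically serializes, versions, and tracks the lineage of each artifact.

### Stacks

A stack defines where and how a pipeline runs. It combines: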

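- **Orchestrator**: executes the pipeline (local, Airflow, Kubernetes, Vertex AI, etc.)
- **Artifact Store**: stores pipeline outputs (local filesystem, S3, GCS, Azure Blob)
- **Container Registry**: stores Docker images for containerized execution
- **Experiment Tracker**: logs metrics and parameters (MLflow, Weights & Biases, Neptune)
- **Model Registry**: manages model versions (MLflow, Vertex AI)
- **Step Operator**: runs specific steps on specialized hardware (SageMaker, Vertex AI)

Switching stacks is a single CLI command. Your pipeline code doesn't change.

## Installation and Setup: From Zero to a Running Pipeline in 5 Minutes

### Prerequisites

- Python 3.9+
- pip or uv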
- Docker (optional, for containerized execution)

### Step 1: Install ZenML

````bash
python -m venv zenml-env
source zenml-env/bin/activate  # Linux/Mac
# zenml-env\Scripts\activate   # Windows

# Install ZenML core
pip install zenml

# Verify the installation
zenml version
# Output: ZenML version 0.80.0
````

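### Step 2: Initialize ZenML

````bash
# Initialize a ZenML repository (creates a .zen directory)
zenml init

# Check status
zenml status
````

The `zenml init` command creates a `.zen` configuration directory. It works much like `git init`: it marks the root of your ZenML project and stores stack configuration locally.

### Step 3: Register a Local Stack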

````bash
# Register a local artifact store
zenml artifact-store register local_store --flavor=local --path=./artifacts

# Register a local orchestrator
zenml orchestrator register local_orchestrator --flavor=local

# Create a stack that combines them
zenml stack register local_stack \
    -o local_orchestrator \
    -a local_store \
    --set

# Verify the active stack
zenml stack describe
````

### Step 4: Run Your First Pipeline

Create a file named `first_pipeline.py`:

````python
from zenml import pipeline, step
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score


@step
def load_data() -> pd.DataFrame:
    """Load the iris dataset."""
    iris = load_iris(as_frame=True)
    df = iris.frame
    return df


@step
def split_data(
    df: pd.DataFrame,
) -> tuple[pd.DataFrame, pd.DataFrame, pd.Series, pd.Series]:
    """Split the data into training and test sets."""
    X = df.drop("target", axis=1)
    y = df["target"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    return X_train, X_test, y_train, y_test


@step
def train_model(X_train: pd.DataFrame, y_train: pd.Series) -> RandomForestClassifier:
    """Train a random forest classifier."""
    clf = RandomForestClassifier(n_estimators=100, random_state=42)
    clf.fit(X_train, y_train)
    return clf


@step
def evaluate_model(
    model: RandomForestClassifier,
    X_test: pd.DataFrame,
    y_test: pd.Series,
) -> float:
    """Evaluate the trained model."""
    predictions = model.predict(X_test)
    accuracy = accuracy_score(y_test, predictions)
    print(f"Model accuracy: {accuracy:.4f}")
    return accuracy


@pipeline
def training_pipeline():
    """End-to-end ML training pipeline."""
    df = load_data()
    X_train, X_test, y_train, y_test = split_data(df)
    model = train_model(X_train, y_train)
    accuracy = evaluate_model(model, X_test, y_test)


if __name__ == "__main__":
    run = training_pipeline()
    print(f"Pipeline run completed: {run.name}")
````

Run it:

````bash
python first_pipeline.py
````
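The four steps in `first_pipeline.py` form a small DAG, and the order in which an orchestrator may run them follows from which step consumes which output. The sketch below derives that order with Python's standard `graphlib` module. It illustrates the DAG idea only and is not how ZenML compiles pipelines; the `dependencies` mapping is written by hand here as an assumption about the pipeline's data flow.

```python
from graphlib import CycleError, TopologicalSorter

# Each step maps to the set of steps whose outputs it consumes.
dependencies = {
    "load_data": set(),
    "split_data": {"load_data"},
    "train_model": {"split_data"},
    "evaluate_model": {"train_model", "split_data"},
}


def execution_order(deps):
    """Return one valid run order; raises CycleError if deps is not a DAG."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
# ['load_data', 'split_data', 'train_model', 'evaluate_model']

# A circular dependency is rejected before anything would run.
try:
    execution_order({"a": {"b"}, "b": {"a"}})
    cycle_detected = False
except CycleError:
    cycle_detected = True
```

Because every step here depends on the one before it, there is exactly one valid order. Pipelines with independent branches admit several, which is what lets an orchestrator run those branches in parallel.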

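ZenML supports multiple orchestrators to meet different scale requirements:

````bash
# Install the Airflow integration
pip install "zenml[airflow]"

# Register the Airflow orchestrator
zenml orchestrator register airflow_orchestrator \
    --flavor=airflow \
    --local=True

# Switch the stack to the Airflow orchestrator
zenml stack update local_stack -o airflow_orchestrator
````

Other orchestrators: **Kubernetes**, **GitHub Actions**, **AzureML**, **Vertex AI**, **SageMaker**, **Databricks**, **Kubeflow**.

### Experiment Tracking with MLflow

````bash
# Install the MLflow integration
pip install "zenml[mlflow]"

# Start the MLflow UI (in a separate terminal)
mlflow ui --port 5000

# Register the MLflow experiment tracker
zenml experiment-tracker register mlflow_tracker \
    --flavor=mlflow \
    --tracking_uri=http://localhost:5000

# Register the MLflow model registry
zenml model-registry register mlflow_registry \
    --flavor=mlflow \
    --uri=http://localhost:5000

# Update the stack
zenml stack update local_stack \
    -e mlflow_tracker \
    -r mlflow_registry
````

Now modify your pipeline to log experiments:

````python
from zenml import pipeline, step
from zenml.client import Client
import mlflow
import mlflow.sklearn
import pandas as pd
from sklearn.ensemble import RandomForestClassifier


@step(experiment_tracker="mlflow_tracker")
def train_model(X_train: pd.DataFrame, y_train: pd.Series) -> RandomForestClassifier:
    """Train with MLflow logging."""
    mlflow.autolog()  # Automatically log parameters, metrics, and the model
    clf = RandomForestClassifier(n_estimators=100, random_state=42)
    clf.fit(X_train, y_train)

    # Log custom metrics
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("train_samples", len(X_train))
    return clf


# Register the model after training
@step(model_registry="mlflow_registry")
def register_model(
    model: RandomForestClassifier,
    accuracy: float,
) -> str:
    """Register the model in the MLflow model registry."""
    if accuracy > 0.90:
        model_version = mlflow.sklearn.log_model(
            model,
            artifact_path="model",
            registered_model_name="iris-classifier",
        )
        print(f"Model registered: {model_version}")
        return "iris-classifier"
    return "below-threshold"
````

### Artifact Storage with S3

````bash
# Register an S3 artifact store
zenml artifact-store register s3_store \
    --flavor=s3 \
    --path=s3://my-ml-bucket/zenml-artifacts \
    --aws_access_key_id=$AWS_ACCESS_KEY_ID \
    --aws_secret_access_key=$AWS_SECRET_ACCESS_KEY

# Update the stack to use S3
zenml stack update local_stack -a s3_store
````

### Container Registry for Cloud Execution

````bash
# Register a Docker container registry
zenml container-registry register docker_registry \
    --flavor=default \
    --uri=myregistry.azurecr.io

# Build and run the containerized pipeline
zenml stack update local_stack -c docker_registry
zenml pipeline run first_pipeline.py --build-docker
````

### Weights & Biases Integration

````bash
pip install "zenml[wandb]"

zenml experiment-tracker register wandb_tracker \
    --flavor=wandb \
    --api_key=$WANDB_API_KEY \
    --project_name="zenml-mlops"
````

### Full Stack Configuration Example

````yaml
# stack.yaml: define the entire MLOps stack as code
stack_name: production_stack
components:
  orchestrator:
    flavor: kubernetes
    configuration:
      kubernetes_context: prod-cluster
      namespace: ml-pipelines
  artifact_store:
    flavor: s3
    configuration:
      path: s3://prod-ml-artifacts/zenml
      authentication_secret: aws-s3-secret
  container_registry:
    flavor: default
    configuration:
      uri: 123456789.dkr.ecr.us-east-1.amazonaws.com
  experiment_tracker:
    flavor: mlflow
    configuration:
      tracking_uri: http://mlflow.internal:5000
  model_registry:
    flavor: mlflow
    configuration:
      uri: http://mlflow.internal:5000
  step_operator:
    flavor: sagemaker
    configuration:
      role: arn:aws:iam::123456789:role/SageMakerRole
      instance_type: ml.p3.2xlarge
````

Register this stack:

````bash
zenml stack register -f stack.yaml --set
````

## Benchmarks and Real-World Use Cases

ZenML is used in production across industries. Below are real deployment patterns and performance numbers.

### Company Profiles

| Company | Industry | Scale | Stack | Outcome |
| --- | --- | --- | --- | --- |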

ZenML 2026: The MLOps Framework That Connects 20+ Tools into Production Pipelines (Complete Setup Guide)
