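11 Model Deployment: From Code to API

Chapters 01-10 trained a pile of models, but they all live inside R / Python scripts. For other people to use them, for a clinician to enter data on a web page and see a prediction, or to go into production, the model has to become an API. This chapter covers two paths, R Plumber and Python FastAPI, plus the "model card" that every deployment needs.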

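The 4 layers of deployment

Layer	Content	When to use
1. Model card	Documentation: training data / performance metrics / known biases / use cases	Any model that other people will use
2. Serialization	`saveRDS()` / `torch.save()` / `pickle` / `ONNX`	Moving the model to another machine
3. API wrapper	Plumber / FastAPI / Flask	Calls from a front end / other systems
4. Containerization + orchestration	Docker + k8s / docker-compose	High-availability deployment

This chapter covers layers 1-3; containerization is covered in detail in project-practice/module03.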

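1. Model Card: mandatory

A concept proposed by Google researchers in 2018, now the de facto standard in ML papers. Any model released externally should come with:

Task: what the input is, what the output is, the use cases
Training data: source, size, time period, limits on representativeness
Performance: metrics reported on the test set + external cohorts (AUC / C-index / Brier)
Known biases: race / gender / age / institution bias
Usage guidance: where the model applies, where it does not, how to update it
License + citation

The metrics_summary.csv + summary.txt produced by BioF3's ml-classifier tool already contain the core data of a model card; what is missing is the written description. A template appears later in this chapter.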
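The fields listed above can be kept in a small structure and rendered to text, so the written description stays next to the numbers. The following is a minimal sketch, not BioF3's own tooling: the `ModelCard` class and its field names are hypothetical, and the metric values are the ones from the template at the end of this chapter.

```python
from dataclasses import dataclass, field


@dataclass
class ModelCard:
    """Hypothetical container for the model card fields listed above."""
    name: str
    task: str
    training_data: str
    metrics: dict[str, float]
    limitations: list[str] = field(default_factory=list)
    intended_use: str = "Research-only."
    license: str = "unspecified"

    def to_markdown(self) -> str:
        # Render the card in the same layout as the chapter's template.
        lines = [
            f"## Model Card: {self.name}", "",
            f"**Task**: {self.task}", "",
            f"**Training data**: {self.training_data}", "",
            "**Performance** (test set):",
        ]
        lines += [f"- {key}: {value:.3f}" for key, value in self.metrics.items()]
        lines += ["", "**Limitations & Bias**:"]
        lines += [f"- {item}" for item in self.limitations]
        lines += ["", f"**Intended use**: {self.intended_use}", "",
                  f"**License**: {self.license}"]
        return "\n".join(lines)


card = ModelCard(
    name="BioF3 LIHC Classifier v1.0",
    task="Binary classification: HCC tumor vs adjacent normal liver tissue",
    training_data="TCGA-LIHC (n=294 train, n=62 val, n=65 test)",
    metrics={"AUC": 1.000, "AUPR": 1.000, "Brier": 0.004},
    limitations=["Trained only on TCGA-LIHC"],
    license="MIT (model weights)",
)
card_text = card.to_markdown()
print(card_text)
```

Filling `metrics` from a metrics file rather than by hand would keep the card in sync with each retraining run; the column names of metrics_summary.csv are not given here, so that step is left out.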

2. Serialization: more than just saveRDS()

The standard approaches in R / Python:

# R - the whole caret train object
saveRDS(fit_rf, "model_rf.rds")
fit_rf <- readRDS("model_rf.rds")

# Caution: deserialization can fail across R versions
# Recommended: save the model together with the preprocessing object
saveRDS(list(model = fit_rf, preProc = preProc), "bundle.rds")

# Python - PyTorch
import torch
torch.save(model.state_dict(), "model.pt")  # recommended
# Or: torch.save(model, "model.pt")  # full object, high cross-version risk

# Python - sklearn
import joblib
joblib.dump(fit, "model.pkl")

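⚠️ Cross-version / cross-platform serialization pitfalls:

R: since R 3.6.0, saveRDS() defaults to serialization format version 3, which R versions older than 3.5.0 cannot read. Pass version = 2 if an older R has to read the file.
Python pickle: files often fail to load across versions (a model trained on 3.8 may not load on 3.11), mostly because the pickled classes in scikit-learn / NumPy changed between library versions. joblib is built on pickle and has the same limitation, so use joblib and pin versions.
PyTorch: state_dict is the most stable across versions; the full model object runs into compatibility problems on version upgrades.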
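One way to make these version pitfalls visible before they bite is to write a small manifest next to the model file and check it at load time. The sketch below uses only the standard library; the manifest layout, the function names, and the pinned version strings are all illustrative assumptions, and the artifact is a stand-in byte string rather than a real model.

```python
import hashlib
import json
import platform
import tempfile
from pathlib import Path


def manifest_path(artifact: Path) -> Path:
    # e.g. model.pkl -> model.pkl.manifest.json (naming is an assumption)
    return artifact.parent / (artifact.name + ".manifest.json")


def write_manifest(artifact: Path, packages: dict[str, str]) -> Path:
    """Record file hash, Python version, and pinned package versions."""
    manifest = {
        "artifact": artifact.name,
        "sha256": hashlib.sha256(artifact.read_bytes()).hexdigest(),
        "python": platform.python_version(),
        "packages": packages,
    }
    out = manifest_path(artifact)
    out.write_text(json.dumps(manifest, indent=2))
    return out


def check_manifest(artifact: Path, packages: dict[str, str]) -> list[str]:
    """Return a list of mismatches; an empty list means the environment matches."""
    manifest = json.loads(manifest_path(artifact).read_text())
    problems = []
    if hashlib.sha256(artifact.read_bytes()).hexdigest() != manifest["sha256"]:
        problems.append("sha256 mismatch")
    recorded = tuple(manifest["python"].split(".")[:2])
    if platform.python_version_tuple()[:2] != recorded:
        problems.append("python minor version differs")
    for name, pinned in manifest["packages"].items():
        if packages.get(name) != pinned:
            problems.append(f"{name}: expected {pinned}, found {packages.get(name)}")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "model.pkl"
    artifact.write_bytes(b"stand-in for real model bytes")
    pins = {"scikit-learn": "1.4.2"}  # illustrative version string
    write_manifest(artifact, pins)
    ok = check_manifest(artifact, pins)
    drifted = check_manifest(artifact, {"scikit-learn": "1.5.0"})
    print(ok, drifted)
```

In a real service the `packages` argument could be filled from `importlib.metadata.version(name)` for each pinned package, and a non-empty result could block startup instead of letting the model load and fail later.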

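ONNX (Open Neural Network Exchange) is a genuinely cross-framework serialization format: trained model → ONNX → any ONNX runtime. Exporters are most mature on the Python side (torch.onnx.export, skl2onnx); for R caret models, check first that a converter exists for your model type. ONNX is rarely used in bioinformatics, so treat it as optional.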

3. R Plumber API

For the R models trained in chapters 04 / 06 (LR / RF / XGBoost / RSF), Plumber is the fastest path:

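# api.R
library(plumber)
library(caret)  # so predict() dispatches on the caret train / preProcess objects
fit <- readRDS("model_lr.rds")
preProc <- readRDS("preprocess.rds")

#* @apiTitle BioF3 LIHC Classifier
#* @apiDescription HCC vs Normal classifier
#* @apiVersion 1.0

#* Predict HCC probability from gene expression
#* @param expression Numeric array of expression values, in training feature order
#* @post /predict
function(req) {
  # 1. parse input: JSON body of the form {"expression": [ ... ]}
  body <- jsonlite::fromJSON(req$postBody)
  X <- matrix(as.numeric(body$expression), nrow = 1)
  # Assumption: preProc was fit with centering, so names(preProc$mean) holds the
  # training feature names; caret matches columns by name.
  colnames(X) <- names(preProc$mean)

  # 2. preprocess (apply the preProc fitted on the training set)
  X_scaled <- predict(preProc, X)

  # 3. predict
  prob <- predict(fit, X_scaled, type = "prob")[, "Tumor"]

  # 4. response
  list(probability = prob, threshold = 0.5,
       prediction = ifelse(prob > 0.5, "Tumor", "Normal"))
}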

Start the server:

library(plumber)
pr <- plumb("api.R")
pr$run(host = "0.0.0.0", port = 8080)

Call it:

curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{"expression": [1.2, 3.5, ...]}'
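The same call can be made from Python, which is handy when another analysis script consumes the API. This is a sketch using only the standard library; the helper names are hypothetical, and the URL and JSON shape simply mirror the curl example. Building the request is separated from sending it so the payload can be inspected without a running server.

```python
import json
import urllib.request

PREDICT_URL = "http://localhost:8080/predict"  # same endpoint as the curl call


def build_predict_request(expression, url=PREDICT_URL):
    """Build the POST request without sending it."""
    body = json.dumps({"expression": list(expression)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(expression, url=PREDICT_URL, timeout=10.0):
    """Send the request and decode the JSON response (needs the API running)."""
    request = build_predict_request(expression, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Inspect the request locally; the two values are placeholders, not a full profile.
request = build_predict_request([1.2, 3.5])
print(request.get_method(), request.full_url, request.data)
```

Calling `predict(values)` with a full-length vector while the Plumber server is up should return the `probability`, `threshold`, and `prediction` fields defined in api.R.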

4. Python FastAPI

The PyTorch / scVI models from chapters 08-09 use FastAPI:

# api.py
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import torch

# Hypothetical module name: import AE from wherever the autoencoder class
# from the earlier chapters is defined.
from model import AE

INPUT_DIM = 5000

app = FastAPI(title="BioF3 LIHC Autoencoder")

# Load the model once, when the module is imported
model = AE(input_dim=INPUT_DIM, latent_dim=16)
model.load_state_dict(torch.load("model.pt", map_location="cpu"))
model.eval()

class ExprInput(BaseModel):
    expression: list[float]

@app.post("/embed")
def embed(payload: ExprInput):
    # Reject wrong-sized input with HTTP 422 rather than an error body with status 200
    if len(payload.expression) != INPUT_DIM:
        raise HTTPException(
            status_code=422,
            detail=f"expected {INPUT_DIM} features, got {len(payload.expression)}",
        )
    x = torch.tensor(payload.expression, dtype=torch.float32).unsqueeze(0)
    with torch.no_grad():
        latent = model.encoder(x)[0]
    return {"latent": latent.tolist()}

Start the server:

uvicorn api:app --host 0.0.0.0 --port 8000

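FastAPI ships with Swagger UI (http://localhost:8000/docs). Plumber also generates Swagger docs, but FastAPI's built-in request validation through pydantic makes it the better fit for a public-facing API.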
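A length check alone does not catch NaN or infinite values, should such values reach the endpoint. A plain validation function that the endpoint could call before building the tensor might look like the sketch below; `validate_expression` is a hypothetical helper, not part of FastAPI or pydantic.

```python
import math


def validate_expression(values, expected_dim):
    """Return human-readable problems; an empty list means the input is usable."""
    problems = []
    if len(values) != expected_dim:
        problems.append(f"expected {expected_dim} features, got {len(values)}")
    # Positions of NaN / +inf / -inf entries, which would propagate through the model.
    bad = [i for i, value in enumerate(values) if not math.isfinite(value)]
    if bad:
        problems.append(f"non-finite values at positions {bad[:5]}")
    return problems


clean = validate_expression([0.0] * 5000, 5000)
broken = validate_expression([1.0, float("nan")], 2)
print(clean, broken)
```

Returning a list of problems rather than raising on the first one lets the API report everything wrong with a request in a single response.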

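5. Five things to check in production

❌ No version pinning: requirements.txt / renv.lock are mandatory. When a production service crashes after running for six months, the cause is often a dependency that was silently upgraded.

❌ No rate limiting / authentication: a public API without rate limiting can be hammered by malicious traffic, which gets you a warning from the cloud provider. The simplest setup is nginx limit_req + token authentication.

❌ Prediction errors not recorded: after launch, the samples the model gets wrong are a gold mine (for retraining the next version), so write them to the log.

❌ No drift monitoring: six months after launch the input distribution drifts (the sequencing protocol changed / the age range of clinical samples changed), predictions get worse, and you do not know. Monitor the input distribution.

❌ Only one model file: the model itself needs versioning (model_v1.rds / model_v2.rds); archive old versions and A/B test a new version before it goes live.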
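For the drift item above, one common way to monitor an input distribution is the population stability index (PSI), computed per feature between the training data and recent requests. The sketch below is one possible implementation under stated assumptions: equal-width bins taken from the training range, a small floor to avoid log(0), and synthetic Gaussian data standing in for a real feature. The 0.1 / 0.25 alert levels mentioned afterwards are a widely used rule of thumb, not something prescribed by this chapter.

```python
import math
import random


def psi(reference, current, n_bins=10, eps=1e-6):
    """Population stability index of `current` against `reference` for one feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if the reference is constant

    def proportions(values):
        counts = [0] * n_bins
        for value in values:
            # Values outside the reference range are clipped into the edge bins.
            index = min(max(int((value - lo) / width), 0), n_bins - 1)
            counts[index] += 1
        # Floor each proportion at eps so the logarithm below is always defined.
        return [max(count / len(values), eps) for count in counts]

    p = proportions(reference)
    q = proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(2000)]    # stand-in for a training feature
same = [rng.gauss(0.0, 1.0) for _ in range(2000)]     # new data, same distribution
shifted = [rng.gauss(1.0, 1.0) for _ in range(2000)]  # new data, mean shifted by 1 SD

psi_same = psi(train, same)
psi_shifted = psi(train, shifted)
print(f"same: {psi_same:.3f}  shifted: {psi_shifted:.3f}")
```

A conventional reading is that PSI below 0.1 means little change and above 0.25 means a shift worth investigating. Running this per feature on a schedule, over the inputs already written to the request log, is enough for a first alert.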

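Plumber in practice at BioF3

BioF3's existing lasso-cox / ml-classifier tools are not Plumber internally. They use an R container + synchronous Rscript execution + file system IO, wrapped in a Node.js layer. The cost of this architecture is high latency (about 3 seconds to start the R container on every call); the benefit is access to many R packages plus concurrent execution of multiple jobs.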
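The "synchronous Rscript + file system IO" pattern can be illustrated in a few lines. This is only a sketch of the general pattern, written in Python instead of Node.js to match the other examples here; it is not BioF3's code, and the script name, paths, and argument order are hypothetical.

```python
import subprocess
from pathlib import Path


def build_job(script: str, input_file: Path, out_dir: Path) -> list[str]:
    """Command line for one analysis run: Rscript <script> <input> <output dir>."""
    return ["Rscript", "--vanilla", script, str(input_file), str(out_dir)]


def run_job(command: list[str], timeout_s: float = 600.0) -> list[Path]:
    """Run the job synchronously and return the files it wrote."""
    out_dir = Path(command[-1])
    out_dir.mkdir(parents=True, exist_ok=True)
    done = subprocess.run(command, capture_output=True, text=True, timeout=timeout_s)
    if done.returncode != 0:
        # Surface R's stderr so the caller can show it to the user.
        raise RuntimeError(done.stderr)
    return sorted(out_dir.iterdir())


# Hypothetical script and paths; building the command does not need R installed.
command = build_job("analysis.R", Path("in/expr.csv"), Path("out/job1"))
print(command)
```

Each call pays the interpreter start-up cost, which is the latency trade-off described above; in exchange, every job is isolated and its outputs are plain files that a report page can pick up.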

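If your scenario is "low-latency single-sample prediction", deploying Plumber + Docker directly is lighter than the BioF3 framework.

If your scenario is "research users upload a file, run an analysis, and read a report", the BioF3 framework (R container + file output) is the better fit. That is exactly the architecture the 19 tools use.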

Methods section / model card template

## Model Card: BioF3 LIHC Classifier v1.0

**Task**: Binary classification — HCC tumor vs adjacent normal liver tissue
from gene expression profiles.

**Training data**: TCGA-LIHC (n=294 train, n=62 val, n=65 test) processed
through Module 02 (vst → top 5000 HVG → Module 03 consensus features).
Class distribution: 80% Tumor, 20% Normal in training set.

**Performance** (test set):
- Best algorithm: SVM (RBF kernel)
- AUC: 1.000 (95% CI 1.000-1.000) [near-perfect, as Tumor vs Normal in
  LIHC has very strong separation]
- AUPR: 1.000
- Brier: 0.004
- Calibration: well-calibrated (slope=1.02, intercept=0.00)

**Limitations & Bias**:
- Trained only on TCGA-LIHC; generalization to other populations and
  external cohorts not validated.
- Not validated for liver metastases or non-hepatocellular tumors.
- No pediatric samples in training (TCGA-LIHC age range 16-90).

**Intended use**: Research-only. NOT for clinical decision-making.

**License**: MIT (model weights). TCGA data terms apply to training data.

**Citation**: BioF3 ML column, module 04 (2026).

Attach this text next to model.rds and you have an acceptable model card.

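Online tools

Each of BioF3's 19 tools is itself an integrated product: ML model + model card + API + Web UI. Learning with the BioF3 tools and then reproducing them in your own project with Plumber / FastAPI is the fastest way to get started.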

Chapter status

✅ Wave 4 main text complete (2026-05-27). The companion ml11_serve_sci.R (Plumber demo) + ml11_serve.py (FastAPI demo) are in the works.
