Causal Uplift Modeling: A Cross-Domain Foundation Layer for Identifying "Persuadables"

Skill-Causal-Uplift-Modeling · 01-Causal Inference

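causal · experiment · forecasting · multi_agent · pricing · Customer Service & VOC · MAS & Agent Engineering · Pricing & Profit · WF-C Customer Service Triage · WF-F Dynamic Pricing · WF-G Listing Content Optimization · WF-H Repeat-Purchase Growth · WF-I Agent Engineering · WF-J DTC Storefront Growth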

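Included in: User Growth Decision Handbook · Customer Lifetime Value Operations Handbook

Annualized ROI: 100,000

Implementation difficulty: ⭐⭐☆☆☆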

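Business Perspective

Target roles: Head of Growth / CMO · Data Analyst · Ad Optimization Specialist

Target platforms: Amazon · TikTok Shop · Meta Ads · DTC storefronts

When to use: The ad budget has been spent, but it is unclear which channel actually brought in new customers; or a major promotion ran, and it is unclear whether the sales lift came from the promotion or from seasonality.

What success looks like: True incremental purchases can be separated from organic purchases; after cutting channels whose attribution is spurious, ROI on the same budget rises by 20-40%.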

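Business Pain Points

- The money has been spent, but nobody knows whether it worked.
- Every channel's report claims that channel contributed the most.
- How do you prove to the boss that the spend was worth it?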

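1. Problem Solved

When a maternity and baby brand issues 20,000 coupons, 60-70% go to users who would have bought anyway, so margin is given away for nothing. A causal uplift model uses A/B experiment data to identify the "persuadables". With coupons targeted at them, ROI rises from 1.2x to 3-5x, saving ¥500,000-800,000 per year in wasted promotions.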

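2. Core Algorithm Logic

Traditional machine learning predicts "who will buy"; uplift modeling predicts "who will buy only because of our intervention". The difference is causal attribution: the model estimates the CATE (conditional average treatment effect), that is, the difference in each user's response between the "treated" and "not treated" scenarios.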
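In potential-outcomes notation, the quantity described above can be written as follows. The symbols $Y(1)$, $Y(0)$, $X$, $W$, $\hat{\mu}_1$, and $\hat{\mu}_0$ are standard notation assumed here, not taken from the original text. The second line is the T-Learner estimate, which matches CATE only when treatment is randomly assigned, as in an A/B experiment.

```latex
\begin{align}
\tau(x) &= \mathbb{E}\big[\,Y(1) - Y(0) \mid X = x\,\big] \\
\hat{\tau}_{T}(x) &= \hat{\mu}_1(x) - \hat{\mu}_0(x),
\qquad \hat{\mu}_w(x) \approx \mathbb{E}\big[\,Y \mid X = x,\ W = w\,\big]
\end{align}
```

Here $Y(1)$ and $Y(0)$ are a user's purchase outcomes with and without the coupon, $W$ is the treatment indicator, and $\hat{\mu}_w$ is an outcome model fitted on one experiment arm only.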

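3. Business Application Scenarios

Business problem: Each month, 20,000 coupons for 20% off are issued, and 60-70% of them go to "sure things" (users who would buy even without a coupon, so the discount is given away). The uplift model identifies the true "persuadables" and sends coupons only to them, saving budget while raising incremental GMV.

Data requirements:
- Historical A/B experiment data: purchase outcomes for the coupon group vs. the no-coupon group
- User features: purchase history, browsing behavior, category preferences, days since registration

Expected outputs:
- A CATE score for each user (sorted in ascending order)
- Optimal coupon threshold: users with CATE > X are worth a coupon
- Incremental ROI: coupon cost vs. true incremental GMV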
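The threshold and incremental-ROI outputs listed above can be sketched as follows. This is a minimal illustration, not part of the original template: `coupon_roi_by_threshold`, `avg_order_value`, and `discount_rate` are hypothetical names, and the cost model is an assumption (the discount is paid on every expected order from a couponed user, while only the uplift share of those orders counts as incremental).

```python
import numpy as np


def coupon_roi_by_threshold(uplift, base_prob, thresholds,
                            avg_order_value=200.0, discount_rate=0.20):
    """Hypothetical helper: incremental GMV vs. coupon cost per CATE threshold.

    Returns one tuple per threshold:
    (threshold, users_targeted, incremental_gmv, coupon_cost, roi).
    """
    uplift = np.asarray(uplift, dtype=float)
    base_prob = np.asarray(base_prob, dtype=float)
    rows = []
    for t in thresholds:
        target = uplift > t                      # users with CATE > X get a coupon
        n = int(target.sum())
        if n == 0:
            rows.append((t, 0, 0.0, 0.0, float("nan")))
            continue
        # Assumed: purchase probability with a coupon = baseline + uplift.
        p_buy = np.clip(base_prob[target] + uplift[target], 0.0, 1.0)
        incr_gmv = float(uplift[target].sum() * avg_order_value)
        cost = float(p_buy.sum() * avg_order_value * discount_rate)
        rows.append((t, n, incr_gmv, cost, incr_gmv / cost))
    return rows


# Toy scores: two persuadables, one "sure thing", one user the coupon puts off.
for row in coupon_roi_by_threshold(
        uplift=[0.20, 0.10, 0.01, -0.05],
        base_prob=[0.10, 0.20, 0.60, 0.30],
        thresholds=[0.0, 0.05]):
    print(row)
```

In this toy input, raising the threshold from 0.0 to 0.05 drops the high-baseline user, whose coupon costs a lot and adds almost no incremental orders, so ROI rises. In practice the threshold would be chosen on held-out experiment data rather than on the scores used for training.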

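4. Input Data Requirements

See the original code template for the input specification.

5. Output

See the original code template for the output specification.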

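6. Business Value / ROI

ROI estimate:
Precision coupon targeting: moving from blanket distribution to persuadables only raises ROI from 1.2x to 3-5x
Average monthly coupon budget of ¥100,000 → ¥40,000-60,000 saved per month after targeting
Annualized ROI: ¥500,000-800,000 (wasted promotion spend saved + incremental GMV)
Implementation difficulty: ⭐⭐☆☆☆ (requires historical A/B experiment data; can be implemented with scikit-learn; about 1-2 weeks)
Priority score: ⭐⭐⭐⭐⭐ (a foundation-layer Skill in the graph that 3 higher-level Skills depend on; the highest-ROI base tool for user operations)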
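The annualized figure can be checked with simple arithmetic. The sketch below only restates the numbers quoted above. Attributing the remainder to incremental GMV, and pairing the low and high ends of each range, are inferences from the "savings + incremental GMV" wording rather than figures given in the source.

```python
# Figures quoted above, in CNY; nothing here is new data.
monthly_savings = (40_000, 60_000)      # saved per month after targeting
annual_total = (500_000, 800_000)       # quoted annualized ROI range

annual_savings = tuple(12 * m for m in monthly_savings)
# Remainder implied for incremental GMV (an inference, see the note above).
implied_incremental = tuple(t - s for t, s in zip(annual_total, annual_savings))

print(annual_savings)        # (480000, 720000)
print(implied_incremental)   # (20000, 80000)
```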

7. Code Template

Code blocks: 1 · Path: not detected

"""
Causal Uplift Modeling: T-Learner & X-Learner implementation
Precision coupon targeting for cross-border maternity and baby e-commerce
"""
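import numpy as np
from sklearn.base import clone
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier


class TLearnerUplift:
    """T-Learner uplift model: fits separate outcome models for the treatment and control groups."""

    def __init__(self, base_model=None):
        if base_model is None:
            base_model = GradientBoostingClassifier(n_estimators=100, random_state=42)
        # Clone so the two arms never share one estimator instance; otherwise
        # fitting the control model would overwrite the treatment model.
        self.model_t = clone(base_model)
        self.model_c = clone(base_model)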

    def fit(self, X, treatment, outcome):
        idx_t = treatment == 1
        idx_c = treatment == 0
        self.model_t.fit(X[idx_t], outcome[idx_t])
        self.model_c.fit(X[idx_c], outcome[idx_c])
        return self

    def predict_uplift(self, X):
        p_t = self.model_t.predict_proba(X)[:, 1]
        p_c = self.model_c.predict_proba(X)[:, 1]
        return p_t - p_c

    def classify_users(self, X, threshold=0.05):
        uplift = self.predict_uplift(X)
        segments = np.where(uplift > threshold, 'Persuadable',
                   np.where(uplift < -threshold, 'Sleeping_Dog', 'Neutral'))
        return uplift, segments


def generate_sample_data(n=2000, seed=42):
    """Generate simulated coupon-experiment data for maternity and baby shoppers."""
    np.random.seed(seed)
    # User features
    purchase_history = np.random.poisson(3, n)
    days_since_last  = np.random.exponential(30, n)
    category_loyalty = np.random.uniform(0, 1, n)
    clv_score        = np.random.lognormal(3, 1, n)

    X = np.column_stack([purchase_history, days_since_last, category_loyalty, clv_score])

    # Random treatment assignment (A/B experiment)
    treatment = np.random.binomial(1, 0.5, n)

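    # True uplift (heterogeneous: larger for price-sensitive users, proxied here
    # by low category loyalty and a longer time since the last purchase)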
    true_uplift = 0.2 * (category_loyalty < 0.4) + 0.1 * (days_since_last > 20) - 0.05
    base_prob   = 0.1 + 0.05 * np.log1p(purchase_history)
    p_outcome   = np.clip(base_prob + treatment * true_uplift, 0.01, 0.99)
    outcome     = np.random.binomial(1, p_outcome)

    return X, treatment, outcome, true_uplift


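def run_uplift_analysis():
    """Fit the T-Learner on simulated data and print a segment summary."""
    print("=" * 60)
    print("Causal Uplift Modeling: Precision Coupon Targeting for Maternity & Baby E-commerce")
    # The source template breaks off after the banner above; the remaining
    # lines are a minimal assumed completion, not the original code.
    print("=" * 60)
    X, treatment, outcome, true_uplift = generate_sample_data()
    model = TLearnerUplift().fit(X, treatment, outcome)
    uplift, segments = model.classify_users(X)
    for name in ("Persuadable", "Neutral", "Sleeping_Dog"):
        print(f"{name:<13} {np.mean(segments == name):6.1%}")
    print(f"Correlation with true uplift: {np.corrcoef(uplift, true_uplift)[0, 1]:.3f}")


if __name__ == "__main__":
    run_uplift_analysis()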
8. Paper Source

arXiv:1706.03461