Machine Learning Regression (family = 'reg_ml')¶
In this module, we assume that the conditional outcome model in each source domain is a flexible machine learning model. For details on the methodology, please refer to CGDRO-Regression.
We can use cgdro_() with family = 'reg_ml' for machine learning regressions.
Now we give an example showing how to implement family = 'reg_ml' with three different loss functions:
- Reward-based loss
- Squared loss
- Regret-based loss
Example¶
Data Generating Process¶
In this example, we generate non-linear multi-source data with $3$ source domains, drawing $10,000$ samples in each source domain and $100,000$ samples in the target domain. The dimension of the covariates is $p=5$.
# number of source groups = 3, each with 10000 samples, and 100000 target samples
# dimension p = 5
data <- simu_reg_ml(n_vec = c(10000,10000,10000), n0=100000, N_label = 20, p=5, seed = 123)
Xlist = data$X_list
Ylist = data$Y_list
X0 = data$X0
Y0 = data$Y0
X0_label = data$X0_label
Y0_label = data$Y0_label
Implementation & Prediction¶
We implement three loss functions under family = 'reg_ml': reward, squaredloss, and regret. Geometrically, reward: $f^∗$ is the point of the convex hull of $\{f^{(l)}\}_{l\in[L]}$ closest to the origin; squaredloss: $f^{sq}$ coincides with the source model with the highest noise level when that noise is substantially larger than in the other sources; regret: $f^{reg}$ is the center of the smallest circle enclosing all individual source models.
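To build intuition for these geometries, here is a self-contained toy sketch (not the package's internals): three source models are summarized by their prediction vectors on two target points, so the convex hull is a triangle in the plane, and the reward and regret solutions are found by brute-force search over the simplex. All numbers are made up for illustration.

```r
# Toy sketch: columns of Fm are the prediction vectors f(1), f(2), f(3)
# of three source models evaluated at two target points.
Fm <- cbind(c(1, 0), c(0, 1), c(3, 3))

# grid over the simplex {q >= 0, sum(q) = 1}
step <- seq(0, 1, by = 0.01)
qs <- expand.grid(q1 = step, q2 = step)
qs <- qs[qs$q1 + qs$q2 <= 1 + 1e-12, ]
qs$q3 <- pmax(0, 1 - qs$q1 - qs$q2)
Q <- as.matrix(qs)
agg <- Q %*% t(Fm)                        # aggregated prediction F q per grid point

# reward: point of the hull closest to the origin
q_reward <- Q[which.min(rowSums(agg^2)), ]

# regret: minimize the worst squared distance to any single source model,
# i.e. the center of the smallest circle enclosing all source models
d2 <- sapply(1:3, function(l) rowSums(sweep(agg, 2, Fm[, l])^2))
q_regret <- Q[which.min(apply(d2, 1, max)), ]

round(q_reward, 2)   # puts all weight on the face spanned by f(1) and f(2)
round(q_regret, 2)   # interior point balancing all three models
```

Note how the two objectives pick different aggregation weights from the same hull: the reward solution drops the far-away third model entirely, while the regret solution keeps positive weight on it to stay equidistant from all sources.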
loss_type = reward¶
## Fit CGDRO model
## using xgboost as f_learner and logistic regression as w_learner
fit <- cgdro_(Xlist, Ylist, X0, loss_type = "reward",
family = "reg_ml", f_learner = "xgb", w_learner = "logistic",
bias_correct = TRUE,
priors = NULL,
ridge = 1e-8,
seed = 123)
## CGDRO aggregated weights
fit$weight_
- 0.317499065671085
- 0.549549234919215
- 0.1329516994097
## Prediction
pred <- predict_cgdro_(fit) # N x 1 vector of predicted values
head(pred)
- -0.459551805452054
- -0.739729688939399
- -1.60128935735992
- 34.3347048744452
- 1.62082142807518
- -4.69201440347003
loss_type = squaredloss¶
fit <- cgdro_(Xlist, Ylist, X0, loss_type = "squaredloss",
family = "reg_ml", f_learner = "xgb", w_learner = "logistic",
bias_correct = TRUE,
priors = NULL,
ridge = 1e-8,
seed = 123)
fit$weight_
- 0.202408939972164
- 0.797591058424003
- 1.60383299773305e-09
pred <- predict_cgdro_(fit) # N x 1 vector of predicted values
head(pred)
- -0.642431667311784
- -1.77324307104553
- 1.13628680747093
- 31.9207733711317
- 1.53663282149702
- -3.58658626771721
loss_type = regret¶
fit <- cgdro_(Xlist, Ylist, X0, loss_type = "regret",
family = "reg_ml", f_learner = "xgb", w_learner = "logistic",
bias_correct = TRUE,
priors = NULL,
ridge = 1e-8,
seed = 123)
fit$weight_
- 0.395450725324904
- 0.431893343112896
- 0.172655931562201
pred <- predict_cgdro_(fit) # N x 1 vector of predicted values
head(pred)
- -0.359558655594653
- -0.159141182776789
- -2.79062660295791
- 35.4144246610455
- 1.6925564260838
- -5.1456899034924
Learning with Prior¶
This experiment demonstrates how prior information can improve the reward objective and produce better aggregated models.
We use the same data-generating process as before, with XGBoost as the outcome learner (f_learner='xgb') and logistic-regression-based density-ratio estimation (w_learner='logistic').
We evaluate performance across:
target label sizes $$N_{\text{label}} \in \{20, 50, 100\},$$
prior radius $$\rho \in [0, 0.9].$$
For each combination, we compute the reward $$ \mathbb{E}\bigl[Y^2 - (Y - \hat f(X))^2\bigr], $$ for four different methods, described below.
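This reward is simply the MSE improvement of $\hat f$ over the trivial zero predictor, so positive values mean the model is informative. A quick base-R sanity check on made-up data:

```r
# The reward E[Y^2 - (Y - fhat(X))^2] equals E[Y^2] - MSE(fhat):
# the MSE reduction of fhat relative to predicting 0.
set.seed(1)
x <- rnorm(1e4)
y <- 2 * x + rnorm(1e4)                  # toy target data (made up)

reward <- function(y, pred) mean(y^2 - (y - pred)^2)

reward(y, 2 * x)    # good predictor: close to E[(2X)^2] = 4
reward(y, 0 * x)    # zero predictor: exactly 0
reward(y, -2 * x)   # harmful predictor: negative
```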
- Naive ML (No Prior)
This is the standard CGDRO reward estimator:
- learns source models independently,
- aggregates them using the CGDRO optimizer,
- no prior information is used.
This method serves as the baseline.
- Uniform Prior
We add a prior that shrinks aggregation weights toward the uniform distribution: $$ q_{\text{prior}} = \left(\tfrac{1}{L}, \dots, \tfrac{1}{L}\right). $$
The prior strength is controlled by the radius parameter $\rho$. When $\rho=0$, the solution is exactly uniform; when $\rho$ increases, the model relaxes toward the unconstrained CGDRO solution.
This stabilizes estimation when labeled target data is scarce.
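The exact constraint geometry is internal to cgdro_; as a rough mental model only (an assumption, not the package's implementation), one can picture the weights being confined to a ball of radius $\rho$ around the prior:

```r
# Rough mental model -- the package's actual constraint may differ.
# Confine the aggregation weights to a Euclidean ball of radius rho around
# the prior; moving along q_hat - q_prior keeps the weights on the simplex
# because both vectors sum to one.
shrink_to_prior <- function(q_hat, q_prior, rho) {
  d <- q_hat - q_prior
  nrm <- sqrt(sum(d^2))
  if (nrm <= rho) return(q_hat)   # constraint inactive: unconstrained solution
  q_prior + d * (rho / nrm)       # otherwise project onto the ball boundary
}

q_hat   <- c(0.7, 0.2, 0.1)       # hypothetical unconstrained CGDRO weights
q_prior <- rep(1/3, 3)            # uniform prior

shrink_to_prior(q_hat, q_prior, 0)    # rho = 0: exactly the prior
shrink_to_prior(q_hat, q_prior, 10)   # large rho: exactly q_hat
```

This reproduces the endpoint behavior described above: $\rho=0$ pins the solution at the prior, and a large enough $\rho$ recovers the unconstrained weights.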
- Prior from Labeled Target Data
We estimate a prior weight vector $q_{\text{label}}$ using the few labeled target points.
We solve: $$ q_{\text{label}} = \arg\min_{q\in \Delta^L} \frac{1}{N_{\text{label}}}\left\|Y_{\text{label}} - \hat F_{\text{src}}\, q\right\|_2^2, $$ where $\hat F_{\text{src}}$ collects source-model predictions on labeled target covariates.
This $q_{\text{label}}$ becomes a data-informed prior, with radius $\rho$ controlling how strongly it shapes the final aggregation.
This method is particularly effective when the target domain differs systematically from source domains.
- Target-Only Model
We train an XGBoost model purely on the labeled target data:
$$ \hat f_{\text{target}}(x) = \text{xgb.fit}(X_{\text{label}}, Y_{\text{label}}). $$
Because the number of labeled target points is small, this model tends to overfit, but it provides a useful benchmark: “How well can we do using only target data, without borrowing from sources?”
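To see why the target-only benchmark overfits, here is a minimal base-R sketch with a plain linear learner standing in for xgboost (all sizes and coefficients are made up):

```r
# With n = 20 labeled points and p = 5 covariates, even a linear model fits
# the training labels much more closely than fresh target data.
set.seed(123)
p <- 5
beta <- c(1, -1, 0.5, 0, 0)
make_data <- function(n) {
  X <- matrix(rnorm(n * p), n, p)
  list(X = X, y = drop(X %*% beta) + rnorm(n))
}
tr <- make_data(20)       # the few labeled target points
te <- make_data(10000)    # a large target sample, used here for evaluation

fit_to <- lm(y ~ ., data = data.frame(y = tr$y, tr$X))
mse_train <- mean(residuals(fit_to)^2)
mse_test  <- mean((te$y - predict(fit_to, data.frame(te$X)))^2)
c(train = mse_train, test = mse_test)   # train error well below test error
```

The gap between training and test error is exactly the overfitting the benchmark is meant to expose; xgboost on 20 points behaves the same way, only more so.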
Across all combinations of $N_{\text{label}}$ and $\rho$, the results show that appropriate prior information consistently improves reward performance, especially when labeled target data is limited.
# ------------------------------------------------------------------
# Main loop over N_label and rho
# ------------------------------------------------------------------
N_labels <- c(20, 50, 100)
rhos <- seq(0.0, 0.9, by=0.05)
methods <- c("No Prior", "Uniform Prior", "Labeled Prior", "Target Only")
reward_array <- array(0, dim=c(length(N_labels), length(rhos), 4))
set.seed(0)
for (iN in seq_along(N_labels)){
N_label <- N_labels[iN]
# Data with L = 3 source domains
data <- simu_reg_ml(n_vec = c(1000,1000,1000), n0=10000, N_label = N_label, p=5, seed = 123)
Xlist = data$X_list
Ylist = data$Y_list
X0 = data$X0
Y0 = data$Y0
X0_label = data$X0_label
Y0_label = data$Y0_label
L <- length(Xlist)
# ---- For each rho ----
for (ir in seq_along(rhos)) {
rho <- rhos[ir]
message(sprintf("N_label: %d rho: %.2f", N_label, rho))
# =========================
# 0) No Prior
# =========================
fit <- cgdro_(Xlist, Ylist, X0, loss_type = "reward",
family = "reg_ml", f_learner = "xgb", w_learner = "logistic",
bias_correct = TRUE,
priors = NULL,
ridge = 1e-8,
seed = 123)
pred <- predict_cgdro_(fit) # N x 1 vector of predicted values
reward <- mean(Y0^2 - (Y0 - pred)^2)
reward_array[iN, ir, 1] <- reward
# =========================
# 1) Uniform Prior (radius = rho)
# =========================
q_uni <- rep(1/L, L)
fit <- cgdro_(Xlist, Ylist, X0, loss_type = "reward",
family = "reg_ml", f_learner = "xgb", w_learner = "logistic",
bias_correct = TRUE,
priors = list(q_uni, rho),
ridge = 1e-8,
seed = 123)
pred <- predict_cgdro_(fit) # N x 1 vector of predicted values
reward <- mean(Y0^2 - (Y0 - pred)^2)
reward_array[iN, ir, 2] <- reward
# =========================
# 2) Labeled Prior
# =========================
# per-source predictions on labeled targets
library(CVXR)
# per-source predictions on the labeled target points
idx_lab <- seq_len(N_label)
pred_label <- fit$pred_full_mat[idx_lab, , drop = FALSE]
q <- Variable(L)
objective <- Minimize(sum_squares(Y0_label - pred_label %*% q) / N_label)
constraints <- list(q >= 0, sum(q) == 1)
prob <- Problem(objective, constraints)
# Try ECOS, fall back to SCS if needed
res <- tryCatch(solve(prob, solver = "ECOS"),
error = function(e) solve(prob, solver = "SCS"))
q_label <- as.numeric(res$getValue(q))
fit <- cgdro_(Xlist, Ylist, X0, loss_type = "reward",
family = "reg_ml", f_learner = "xgb", w_learner = "logistic",
bias_correct = TRUE,
priors = list(q_label, rho),
ridge = 1e-8,
seed = 123)
pred <- predict_cgdro_(fit) # N x 1 vector of predicted values
reward <- mean(Y0^2 - (Y0 - pred)^2)
reward_array[iN, ir, 3] <- reward
# =========================
# 3) Target Only
# =========================
# fit xgboost on the labeled target data only, then predict on the full target sample
umodel <- .learn_f(mode = "reg", learner = "xgb")
umodel$fit(X0_label, Y0_label)
pred_to <- umodel$predict(X0)
reward <- mean(Y0^2 - (Y0 - pred_to)^2)
reward_array[iN, ir, 4] <- reward
}
}
# Save for later (like np.save)
saveRDS(reward_array, file = "reward_array.rds")
# ------------------------------------------------------------------
# Plot (facets over N_label; lines over rho; 4 methods)
# ------------------------------------------------------------------
df <- NULL
for (iN in seq_along(N_labels)) {
tmp <- as.data.frame(reward_array[iN,,])
colnames(tmp) <- c("No Prior","Uniform Prior","Labeled Prior","Target Only")
tmp$rho <- rhos
tmp$N_label <- N_labels[iN]
df <- rbind(df, tmp)
}
df_long <- df |>
tidyr::pivot_longer(cols = c("No Prior","Uniform Prior","Labeled Prior","Target Only"),
names_to = "method", values_to = "reward")
library(ggplot2)
gg <- ggplot(df_long, aes(x = rho, y = reward, color = method)) +
geom_line(linewidth = 0.9) +
facet_wrap(~ N_label, nrow = 1, scales = "fixed",
labeller = labeller(N_label = function(v) paste0("N_label = ", v))) +
labs(x = expression(rho),
y = "Reward",
color = NULL) +
theme_minimal(base_size = 15) +
theme(legend.position = "bottom",
panel.grid.minor = element_blank())
options(repr.plot.width = 16, repr.plot.height = 6)
print(gg)