Fit CGDRO in a machine learning prediction model¶
Regression.ml(
self,
f_learner = 'xgb',
w_learner = 'logistic',
seed = 123,
verbose = False
)
Arguments:

- `f_learner` (str, optional): method used to fit the outcome models on each source. Options include 'linear', 'xgb', 'mlp', and 'rf'. Defaults to 'xgb'.
- `w_learner` (str, optional): method used to fit the density models on each source. Options include 'logistic', 'xgb', and 'kliep'. Defaults to 'logistic'.
- `seed` (int, optional): random seed for data splitting. Defaults to 123.
- `verbose` (bool, optional): whether to print fitting information. Defaults to False.
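As a sketch of how these learner options might be validated before fitting, the hypothetical helper below (not part of the `cgdro` API) simply checks the names against the documented option sets:

```python
# Hypothetical validation helper illustrating the accepted learner names;
# not part of the cgdro API.
F_LEARNERS = {'linear', 'xgb', 'mlp', 'rf'}   # outcome-model options
W_LEARNERS = {'logistic', 'xgb', 'kliep'}     # density-model options

def check_learners(f_learner='xgb', w_learner='logistic'):
    """Raise ValueError for an unsupported learner name."""
    if f_learner not in F_LEARNERS:
        raise ValueError(f"unknown f_learner: {f_learner!r}")
    if w_learner not in W_LEARNERS:
        raise ValueError(f"unknown w_learner: {w_learner!r}")
    return f_learner, w_learner
```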
Built-in functions in Regression.ml:

| Built-in Functions | Description |
|---|---|
| fit() | Fit robust machine learning regression in the target domain. |
| predict() | Make robust prediction in the target domain. |
fit(
self,
X_list,
y_list,
X0=None,
loss_type='reward',
bias_correct=True,
prior=None
)
Arguments:
- `X_list` (list): list of feature matrices on each source domain.
- `y_list` (list): list of label arrays on each source domain.
- `X0` (array, optional): feature matrix on the target domain. If None, the pooled source data are used as the target data. Defaults to None.
- `loss_type` (str, optional): type of loss function used to compute the optimal aggregation weights. Options include 'reward', 'squaredloss', and 'regret'. Defaults to 'reward'.
- `bias_correct` (bool, optional): whether to use the bias-corrected estimator of the Gamma matrix. Defaults to True.
- `prior` (tuple, optional): prior information on the aggregation weights, given as (prior_weight, rho), where prior_weight is the prior weight vector and rho is the radius of the L2-norm ball around prior_weight. If None, no prior information is used. Defaults to None.
Outputs: enables the following attributes:
weight_: CGDRO aggregated weights of the source domains.
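To illustrate the `prior` constraint described above, the pure-Python sketch below (hypothetical, not `cgdro` internals) checks whether a candidate weight vector lies in the L2 ball of radius `rho` around `prior_weight` while remaining on the probability simplex:

```python
import math

# Illustration (not cgdro code) of the prior constraint on aggregation
# weights: within an L2 ball of radius rho around prior_weight, and on
# the probability simplex (nonnegative, summing to one).
def in_prior_ball(weights, prior_weight, rho, tol=1e-9):
    """Check ||weights - prior_weight||_2 <= rho and simplex membership."""
    dist = math.sqrt(sum((w - p) ** 2 for w, p in zip(weights, prior_weight)))
    on_simplex = all(w >= -tol for w in weights) and abs(sum(weights) - 1.0) <= tol
    return dist <= rho + tol and on_simplex

# Example: a uniform prior over 3 sources with radius 0.2
prior = ([1/3, 1/3, 1/3], 0.2)
```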
predict(
self,
X=None
)
Arguments:
- `X` (array, optional): input features for prediction. If None, uses the training data. Defaults to None.
Outputs:
pred: prediction in the target domain.
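Conceptually, the robust prediction aggregates the per-source outcome models using the fitted weights in `weight_`. The pure-Python sketch below (hypothetical, not `cgdro` internals) shows such a weighted combination of per-source predictions:

```python
# Conceptual sketch (not cgdro internals): combine per-source model
# predictions using the aggregation weights stored in weight_.
def aggregate_predictions(source_preds, weights):
    """Weighted combination of per-source predictions for each sample."""
    n = len(source_preds[0])
    return [sum(w * preds[i] for w, preds in zip(weights, source_preds))
            for i in range(n)]

# Example: 3 source models, 2 samples
preds = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
weights = [0.5, 0.25, 0.25]
aggregate_predictions(preds, weights)   # [2.5, 3.5]
```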
Example¶
from cgdro.Regression import ml
from cgdro.data import DataContainerSimu_Nonlinear_reg
# number of source groups = 3, each with 10000 samples, and 100000 target samples
# dimension p = 5
# sigma: source group 1,3: 0.5; source group 2: 3.
data = DataContainerSimu_Nonlinear_reg(n=10000, N=100000)
data.generate_funcs_list(L=3, seed=0)
data.generate_data()
Xlist = data.X_sources_list
Ylist = data.Y_sources_list
X0 = data.X_target
## First instantiate the model with the chosen learners,
## then call fit() with the desired loss type.
drol = ml(f_learner='xgb', w_learner='kliep')
drol.fit(Xlist, Ylist, X0, loss_type='reward')
drol.weight_
## prediction
drol.predict()

## Repeat with the squared-loss criterion
drol = ml(f_learner='xgb', w_learner='kliep')
drol.fit(Xlist, Ylist, X0, loss_type='squaredloss')
drol.weight_
## prediction
drol.predict()

## Repeat with the regret criterion
drol = ml(f_learner='xgb', w_learner='kliep')
drol.fit(Xlist, Ylist, X0, loss_type='regret')
drol.weight_
## prediction
drol.predict()