Evaluate
Submodules
kale.evaluate.cross_validation module
Functions for assessing model fitness through cross-validation, with optional domain adaptation and an optional transformation applied before domain adaptation.
- kale.evaluate.cross_validation.leave_one_group_out(x, y, groups, estimator, use_domain_adaptation=False) → dict
Perform leave-one-group-out cross-validation for a given estimator.
- Parameters:
x (np.ndarray or torch.tensor) – Input data [num_samples, num_features].
y (np.ndarray or torch.tensor) – Target labels [num_samples].
groups (np.ndarray or torch.tensor) – Group labels to be left out [num_samples].
estimator (estimator object) – Machine learning estimator to be evaluated from kale or scikit-learn.
use_domain_adaptation (bool) – Whether to use domain adaptation during training, i.e., whether to leverage the test data while fitting.
- Returns:
- A dictionary containing results for each target group, with three keys:
- 'Target': A list of unique target groups or classes. The final entry is "Average".
- 'Num_samples': A list where each entry is the number of samples in the corresponding target group. The final entry is the total number of samples.
- 'Accuracy': A list where each entry is the accuracy score for the corresponding target group. The final entry is the overall mean accuracy.
- Return type:
dict
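The leave-one-group-out procedure and the result layout described above can be illustrated with a small dependency-free sketch. This is not PyKale's implementation: `leave_one_group_out_sketch`, `fit_centroids`, and `predict` are hypothetical names, the nearest-centroid classifier is a stand-in for a real estimator, domain adaptation is not shown, and the final "Average" accuracy is assumed here to be weighted by group size, which the description above does not specify.

```python
def fit_centroids(xs, ys):
    """Return the mean feature value of each class (a tiny stand-in estimator)."""
    sums, counts = {}, {}
    for value, label in zip(xs, ys):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, value):
    """Predict the class whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


def leave_one_group_out_sketch(xs, ys, groups):
    """Hold out each group in turn, train on the rest, and score on the held-out group."""
    results = {"Target": [], "Num_samples": [], "Accuracy": []}
    for target in sorted(set(groups)):
        train = [(x, y) for x, y, g in zip(xs, ys, groups) if g != target]
        test = [(x, y) for x, y, g in zip(xs, ys, groups) if g == target]
        centroids = fit_centroids([x for x, _ in train], [y for _, y in train])
        correct = sum(predict(centroids, x) == y for x, y in test)
        results["Target"].append(target)
        results["Num_samples"].append(len(test))
        results["Accuracy"].append(correct / len(test))

    # Assumption: the summary accuracy is weighted by the number of samples per group.
    total = len(xs)
    weighted = sum(
        n * acc for n, acc in zip(results["Num_samples"], results["Accuracy"])
    ) / total
    results["Target"].append("Average")
    results["Num_samples"].append(total)
    results["Accuracy"].append(weighted)
    return results


# Toy data: one feature, binary labels, three groups of two samples each.
xs = [0.1, 0.9, 0.2, 0.8, 0.3, 0.4]
ys = [0, 1, 0, 1, 0, 1]
groups = ["a", "a", "b", "b", "c", "c"]
results = leave_one_group_out_sketch(xs, ys, groups)
```

Each list in `results` has one entry per group plus a final summary entry, so the three lists can be zipped row by row or passed to a table constructor for display.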
- kale.evaluate.cross_validation.cross_validate(estimator, x, y=None, groups=None, transformer=None, domain_adapter=None, group_labels=None, scoring=None, cv=None, num_jobs=None, verbose=0, parameters=None, fit_args=None, score_args=None, pre_dispatch='2*n_jobs', return_train_score=False, return_estimator=False, return_indices=False, error_score=nan)
Run cross-validation and record fit and score times.
- Parameters:
estimator (sklearn.base.BaseEstimator) – A scikit-learn estimator implementing fit and predict methods.
x (array-like) – Input data for training and evaluation [num_samples, num_features].
y (array-like) – Target variable for supervised learning [num_samples] or [num_samples, num_targets].
groups (array-like, optional) – Group labels for the samples used while splitting the dataset into train/test sets.
transformer (sklearn.base.BaseEstimator, optional) – An unsupervised transformer implementing fit and transform methods applied before domain adaptation.
domain_adapter (sklearn.base.BaseEstimator, optional) – A domain adapter implementing fit and transform methods.
group_labels (array-like, optional) – The factors for adaptation [num_samples, num_factors]. Preprocess the factors before domain adaptation (e.g., one-hot encode domain or gender, standardize age).
scoring (callable, list, tuple, dict, optional) – A scoring function or a list of scoring functions to evaluate the estimator’s performance.
cv (cv_object, optional) – Cross-validation splitting strategy.
num_jobs (int, optional) – Number of jobs to run cross-validation in parallel using joblib.Parallel.
verbose (int) – Level of verbosity for logging.
parameters (dict, optional) – Parameters to configure the estimator.
fit_args (dict, optional) – Additional arguments for the estimator’s fit method.
score_args (dict, optional) – Additional arguments for the scorer’s score method.
pre_dispatch (int, str) – Controls the number of jobs that get dispatched during parallel execution for joblib.Parallel.
return_train_score (bool) – Whether to include training scores in the results.
return_estimator (bool) – Whether to include the fitted estimator in the results.
return_indices (bool) – Whether to include the indices of the training and testing sets in the results.
error_score (float or str) – Value to assign to the score if an error occurs during fitting or scoring. If set to "raise", the error is raised instead.
- Returns:
- A dictionary containing the results of fitting and scoring, including:
"train_scores" (dict, optional): Scores on training set, if return_train_score=True.
"test_scores" (dict): Scores on testing set.
"fit_time" (float, optional): Time taken to fit the estimator, if return_times=True.
"score_time" (float): Time taken to score the estimator.
"estimator" (object, optional): The fitted estimator, if return_estimator=True.
"indices" (dict, optional): Indices of the training and testing sets, if return_indices=True.
- Return type:
dict
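To make the result keys above concrete, the following dependency-free sketch fits and scores a single train/test split and assembles a dictionary with the same key names. It is an illustration under stated assumptions, not PyKale's code: `fit_and_score_fold`, `MeanCenterer`, and `ThresholdClassifier` are hypothetical, the transformer is assumed to be fitted on the training fold only, the "train" and "test" keys inside "indices" are assumed, and the domain adaptation step (which, per the parameter descriptions, would sit between the transformer and the estimator) is omitted.

```python
import time


class MeanCenterer:
    """Hypothetical unsupervised transformer: subtracts the mean seen during fit."""

    def fit(self, xs):
        self.mean_ = sum(xs) / len(xs)
        return self

    def transform(self, xs):
        return [value - self.mean_ for value in xs]


class ThresholdClassifier:
    """Hypothetical binary estimator: thresholds at the midpoint of the class means."""

    def fit(self, xs, ys):
        class0 = [x for x, y in zip(xs, ys) if y == 0]
        class1 = [x for x, y in zip(xs, ys) if y == 1]
        self.threshold_ = (sum(class0) / len(class0) + sum(class1) / len(class1)) / 2
        return self

    def predict(self, xs):
        return [int(value > self.threshold_) for value in xs]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_and_score_fold(estimator, xs, ys, train_idx, test_idx, transformer=None,
                       return_train_score=False, return_estimator=False,
                       return_indices=False):
    """Fit and score one split, timing both phases."""
    x_train, y_train = [xs[i] for i in train_idx], [ys[i] for i in train_idx]
    x_test, y_test = [xs[i] for i in test_idx], [ys[i] for i in test_idx]

    start = time.perf_counter()
    if transformer is not None:
        # Assumption: the transformer only ever sees the training fold during fit.
        x_train = transformer.fit(x_train).transform(x_train)
        x_test = transformer.transform(x_test)
    estimator.fit(x_train, y_train)
    fit_time = time.perf_counter() - start

    start = time.perf_counter()
    test_scores = {"accuracy": accuracy(y_test, estimator.predict(x_test))}
    score_time = time.perf_counter() - start

    result = {"test_scores": test_scores, "fit_time": fit_time, "score_time": score_time}
    if return_train_score:
        result["train_scores"] = {"accuracy": accuracy(y_train, estimator.predict(x_train))}
    if return_estimator:
        result["estimator"] = estimator
    if return_indices:
        result["indices"] = {"train": train_idx, "test": test_idx}
    return result


xs = [0.0, 1.0, 0.2, 0.9]
ys = [0, 1, 0, 1]
result = fit_and_score_fold(
    ThresholdClassifier(), xs, ys, train_idx=[0, 1], test_idx=[2, 3],
    transformer=MeanCenterer(), return_train_score=True, return_indices=True,
)
```

Keeping the optional keys behind flags, as the return description does, means callers that only need test scores avoid holding fitted estimators or index lists in memory for every split.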