Utilities

Submodules

kale.utils.distance module

Provide distance (i.e., similarity) calculation methods using various distance metrics.

class kale.utils.distance.DistanceMetric(value)

Bases: Enum

An enumeration.

COSINE = 'COSINE'
kale.utils.distance.calculate_distance(x1: Tensor, x2: Tensor | None = None, eps: float = 1e-08, metric: DistanceMetric = DistanceMetric.COSINE) → Tensor

Returns similarity between \(x_1\) and \(x_2\), computed along `dim`=1. This method calculates the similarity between each pair of data points in two input matrices.

Note that this implementation differs from the existing implementations in PyTorch, which calculate the similarity between each row of one matrix and its corresponding row in the other matrix (i.e., row-wise similarity rather than similarity between all pairs of rows).

Parameters:
  • x1 (torch.Tensor) – The tensor input data.

  • x2 (torch.Tensor, optional) – The tensor input data. (default: None)

  • eps (float, optional) – Small value to avoid division by zero. (default: 1e-8)

  • metric (DistanceMetric, optional) – The metric to compute distance between input matrices. (default: DistanceMetric.COSINE)

Returns:

The computed similarity tensor between \(x_1\) and \(x_2\).

Return type:

torch.Tensor
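The all-pairs behaviour described above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation, which operates on torch.Tensor inputs. The name `pairwise_cosine`, the nested-list inputs, the choice to compare x1 with itself when x2 is omitted, and the way `eps` bounds the denominator are all assumptions made for illustration.

```python
import math


def pairwise_cosine(x1, x2=None, eps=1e-8):
    """All-pairs cosine similarity between the rows of x1 and the rows of x2.

    Hypothetical helper for illustration only; inputs are lists of equal-length rows.
    """
    # Assumption: when x2 is omitted, x1 is compared with itself.
    x2 = x1 if x2 is None else x2
    norms1 = [math.sqrt(sum(v * v for v in row)) for row in x1]
    norms2 = [math.sqrt(sum(v * v for v in row)) for row in x2]
    # Entry [i][j] compares row i of x1 with row j of x2, so the result has
    # len(x1) rows and len(x2) columns. eps keeps the denominator away from zero.
    return [
        [sum(a * b for a, b in zip(r1, r2)) / max(n1 * n2, eps) for r2, n2 in zip(x2, norms2)]
        for r1, n1 in zip(x1, norms1)
    ]


x1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
x2 = [[2.0, 0.0], [0.0, 3.0]]
sim = pairwise_cosine(x1, x2)  # 3 rows of x1 against 2 rows of x2 -> a 3 x 2 result
```

A row-wise function such as `torch.nn.functional.cosine_similarity` would instead return one value per pair of corresponding rows.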

kale.utils.download module

Data downloading and compressed data extraction functions, based on https://github.com/pytorch/vision/blob/master/torchvision/datasets/utils.py and https://github.com/pytorch/pytorch/blob/master/torch/hub.py

kale.utils.download.download_file_by_url(url, output_directory, output_file_name, file_format=None)

Download file/compressed file by url.

Parameters:
  • url (string) – URL of the object to download.

  • output_directory (string, optional) – Full path where the object will be saved. An absolute path is recommended; a relative path also works.

  • output_file_name (string, optional) – File name under which the object will be saved.

  • file_format (string, optional) – File format. For compressed files, the supported formats are [“tar.xz”, “tar”, “tar.gz”, “tgz”, “gz”, “zip”].

Example: (Use the raw link from GitHub; note the “raw” in the URL.)
>>> url = "https://github.com/pykale/data/raw/main/videos/video_test_data/ADL/annotations/labels_train_test/adl_P_04_train.pkl"
>>> download_file_by_url(url, "data", "a.pkl", "pkl")
>>> url = "https://github.com/pykale/data/raw/main/videos/video_test_data.zip"
>>> download_file_by_url(url, "data", "video_test_data.zip", "zip")
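The download-then-extract flow described above can be approximated with the standard library alone. The following is a sketch under stated assumptions, not the PyKale implementation: the name `fetch_and_extract`, the `_ARCHIVE_FORMATS` mapping, and the choice to skip the download when the target file already exists are all hypothetical. The “gz” format (a single gzip-compressed file rather than an archive) is left out because `shutil.unpack_archive` does not handle it.

```python
import shutil
import urllib.request
from pathlib import Path

# Hypothetical mapping from the documented format strings to shutil archive names.
_ARCHIVE_FORMATS = {
    "zip": "zip",
    "tar": "tar",
    "tar.gz": "gztar",
    "tgz": "gztar",
    "tar.xz": "xztar",
}


def fetch_and_extract(url, output_directory, output_file_name, file_format=None):
    """Download url into output_directory and unpack it if it is a known archive."""
    out_dir = Path(output_directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / output_file_name

    # Assumption of this sketch: an existing file is reused rather than re-downloaded.
    if not target.exists():
        urllib.request.urlretrieve(url, str(target))

    # Plain files (e.g. "pkl", "csv") are left as downloaded; archives are unpacked in place.
    if file_format in _ARCHIVE_FORMATS:
        shutil.unpack_archive(str(target), str(out_dir), _ARCHIVE_FORMATS[file_format])
    return target


# Usage mirroring the documented call (requires network access):
# fetch_and_extract(url, "data", "video_test_data.zip", "zip")
```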
kale.utils.download.download_file_gdrive(id, output_directory, output_file_name, file_format=None)

Download file/compressed file by Google Drive id.

Parameters:
  • id (string) – Google Drive file id of the object to download.

  • output_directory (string, optional) – Full path where the object will be saved. An absolute path is recommended; a relative path also works.

  • output_file_name (string, optional) – File name under which the object will be saved.

  • file_format (string, optional) – File format. For compressed files, the supported formats are [“tar.xz”, “tar”, “tar.gz”, “tgz”, “gz”, “zip”].

Example

>>> gdrive_id = "1U4D23R8u8MJX9KVKb92bZZX-tbpKWtga"
>>> download_file_gdrive(gdrive_id, "data", "demo_datasets.zip", "zip")
>>> gdrive_id = "1SV7fmAnWj-6AU9X5BGOrvGMoh2Gu9Nih"
>>> download_file_gdrive(gdrive_id, "data", "dummy_data.csv", "csv")

kale.utils.initialize_nn module

Provide methods for initializing neural network parameters (i.e., weights and biases).

kale.utils.initialize_nn.xavier_init(module) → None

Fills the weights of the input module with values drawn from a normal distribution.

Parameters:

module (torch.Tensor) – The input module.

kale.utils.initialize_nn.bias_init(module) → None

Fills the bias of the input module with zeros.

Parameters:

module (torch.Tensor) – The input module.
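The documentation above states only that weights are drawn from a normal distribution and biases are set to zero. As a rough illustration, the sketch below uses the standard Xavier (Glorot) normal rule, std = gain · sqrt(2 / (fan_in + fan_out)), which is what PyTorch's `torch.nn.init.xavier_normal_` implements. Whether `xavier_init` uses exactly this rule is an assumption, and the helper names `xavier_normal_weights` and `zero_bias` are hypothetical.

```python
import math
import random


def xavier_normal_weights(fan_in, fan_out, gain=1.0, rng=random):
    """Return a fan_out x fan_in weight matrix drawn from N(0, std^2).

    Hypothetical helper: std = gain * sqrt(2 / (fan_in + fan_out)).
    """
    std = gain * math.sqrt(2.0 / (fan_in + fan_out))
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


def zero_bias(fan_out):
    """Return a bias vector of zeros, one entry per output unit."""
    return [0.0] * fan_out


random.seed(0)
weights = xavier_normal_weights(fan_in=300, fan_out=200)
bias = zero_bias(200)
```

With PyTorch modules, initializer functions of this kind are typically passed to `torch.nn.Module.apply`, which calls them on every submodule.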

kale.utils.logger module

Logging functions, based on https://github.com/HaozhiQi/ISONet/blob/master/isonet/utils/logger.py

kale.utils.logger.out_file_core()

Creates an output file name concatenating a formatted date and uuid, but without an extension.

Returns:

A string to be used in a file name.

Return type:

string

kale.utils.logger.construct_logger(name, save_dir, log_to_terminal=False)

Constructs a logger that saves its output as a text file at a specified path and saves the output of git diff HEAD to the same folder. Optionally, logging statements are also printed to the terminal (log_to_terminal, which defaults to False).

The logger is configured to output messages at the DEBUG level, and it saves the output as a text file with a name based on the current timestamp and the specified name. It also saves the output of git diff HEAD to a file with the same name and the extension .gitdiff.patch.

Parameters:
  • name (str) – The name of the logger, typically the name of the method being logged.

  • save_dir (str) – The directory where the log file and git diff file will be saved.

  • log_to_terminal (bool, optional) – Whether to also log messages to the terminal. Defaults to False.

Returns:

The constructed logger.

Return type:

logging.Logger

Reference:

https://docs.python.org/3/library/logging.html

Raises:

None.
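The behaviour described for out_file_core and construct_logger can be sketched with the standard logging module. This is an illustration under assumptions rather than the library's code: the names `make_file_core` and `make_logger`, the timestamp format, the `log-` prefix, the `.txt` extension, and the message format are all hypothetical, and the git diff step is omitted.

```python
import datetime
import logging
import os
import sys
import uuid


def make_file_core():
    """Build an extension-less file name from a formatted date and a UUID."""
    stamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
    return f"log-{stamp}-{uuid.uuid4()}"


def make_logger(name, save_dir, log_to_terminal=False):
    """Create a DEBUG-level logger that writes to a new text file in save_dir."""
    os.makedirs(save_dir, exist_ok=True)
    log_path = os.path.join(save_dir, make_file_core() + ".txt")

    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    formatter = logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s")

    file_handler = logging.FileHandler(log_path)
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)

    # Optional second handler so that messages also appear on the terminal.
    if log_to_terminal:
        stream_handler = logging.StreamHandler(sys.stdout)
        stream_handler.setFormatter(formatter)
        logger.addHandler(stream_handler)
    return logger
```

Because `logging.getLogger(name)` returns the same object for the same name, calling a constructor like this twice with one name would attach duplicate handlers; a distinct name per run avoids that.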

kale.utils.print module

Screen printing functions, from https://github.com/HaozhiQi/ISONet/blob/master/isonet/utils/misc.py

kale.utils.print.tprint(*args)

Temporarily prints things on the screen so that the screen is not flooded.

kale.utils.print.pprint(*args)

Permanently prints things on the screen to have all info displayed

kale.utils.print.pprint_without_newline(*args)

Permanently prints things on the screen, separated by space rather than newline
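One common way to get "temporary" versus "permanent" printing is to return the cursor to the start of the line with a carriage return, so that the next message overwrites the previous one unless a newline has been emitted. The sketch below shows that technique; the names `temp_print` and `perm_print` are hypothetical, and it is an assumption that the library's functions work the same way.

```python
def temp_print(*args):
    """Print without a trailing newline so that the next call overwrites this line."""
    print("\r", end="")
    print(*args, end="", flush=True)


def perm_print(*args):
    """Print with a trailing newline so that the line stays on the screen."""
    print("\r", end="")
    print(*args)


# Typical use: progress updates overwrite each other, the final summary remains.
for step in range(1, 4):
    temp_print("step", step)
perm_print("finished 3 steps")
```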

kale.utils.save_xlsx module

Authors: Lawrence Schobs, lawrenceschobs@gmail.com

Functions to save results to Excel files.

kale.utils.save_xlsx.generate_summary_df(results_dictionary: dict, cols_to_save: list, sheet_name: str, save_location: str) → DataFrame

Generates a pandas dataframe with summary statistics. Designed for use in Quantile Binning (/pykale/examples/landmark_uncertainty/main.py).

Parameters:
  • results_dictionary (dict) – A dictionary containing results for each quantile bin and uncertainty method. The keys are strings indicating the name of each uncertainty method. The values are dictionaries containing results for each quantile bin.

  • cols_to_save (list) – A list of 2-element lists, each containing a string indicating the key in the results_dictionary and a string indicating the name to use for that column in the output dataframe.

  • sheet_name (str) – The name of the sheet to create in the output Excel file.

  • save_location (str) – The file path and name to use for the output Excel file.

Returns:

A dataframe with statistics including the mean error and the std of the error for All targets and for individual targets. It also includes the success detection rates (SDR). The dataframe should have the following structure:

df = {
    “All um col_save_name Mean”: value,
    “All um col_save_name Std”: value,
    “B1 um col_save_name Mean”: value,
    “B1 um col_save_name Std”: value,
    …
}

Return type:

pd.DataFrame

kale.utils.save_xlsx.save_dict_xlsx(data_dict: Dict[Any, Any], save_location: str, sheet_name: str) → None

Save a dictionary to an Excel file using the XlsxWriter engine.

Parameters:
  • data_dict (Dict[Any, Any]) – The dictionary that needs to be saved to an Excel file. The keys represent the row index and the values represent the data in the row. If a dictionary value is a list or a series, each element in the list/series will be a column in the row.

  • save_location (str) – The location where the Excel file will be saved. This should include the full path and the filename, for example, “/path/to/save/data.xlsx”. Overwrites the file if it already exists.

  • sheet_name (str) – The name of the sheet where the dictionary will be saved in the Excel file.

Returns:

This function does not return anything. It saves the dictionary as an Excel file at the specified location.

Return type:

None

kale.utils.seed module

Setting seed for reproducibility

kale.utils.seed.set_seed(seed=1000)

Sets the seed for generating random numbers to get results that are as reproducible as possible.

The CuDNN options are set according to the official PyTorch guidance on reproducibility: https://pytorch.org/docs/stable/notes/randomness.html. Other references are:

  • https://discuss.pytorch.org/t/difference-between-torch-manual-seed-and-torch-cuda-manual-seed/13848/6

  • https://pytorch.org/docs/stable/cuda.html#torch.cuda.manual_seed

  • https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/utils.py#L58

Parameters:

seed (int, optional) – The desired seed. Defaults to 1000.
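A seeding helper of this kind usually seeds every random number generator in use and switches CuDNN to deterministic behaviour, as the PyTorch reproducibility notes recommend. The sketch below is an assumption about what such a function might contain, not a copy of set_seed. The name `seed_everything` is hypothetical, and NumPy and PyTorch are treated as optional so that the sketch also runs without them.

```python
import os
import random


def seed_everything(seed=1000):
    """Seed the available random number generators (hypothetical helper)."""
    random.seed(seed)
    # Only affects Python processes started after this point, not the current one.
    os.environ["PYTHONHASHSEED"] = str(seed)

    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is not installed; nothing to seed.

    try:
        import torch

        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
        # Prefer deterministic CuDNN kernels over auto-tuned, possibly faster ones.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch is not installed; nothing to seed.


seed_everything(1000)
first = random.random()
seed_everything(1000)
second = random.random()  # same value as `first`, because the seed was reset
```

Disabling `cudnn.benchmark` can cost some speed, which is the usual trade-off for run-to-run reproducibility.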

Module contents