torch_mimicry.datasets

Dataset Loaders

Script for loading datasets.

load_celeba_dataset(root, transform_data=True, convert_tensor=True, download=True, split='all', size=64, **kwargs)

Loads the CelebA dataset.

Parameters:
  • root (str) – Path to where datasets are stored.
  • size (int) – Size to resize images to.
  • transform_data (bool) – If True, preprocesses data.
  • split (str) – The split of data to use.
  • download (bool) – If True, downloads the dataset.
  • convert_tensor (bool) – If True, converts images to tensors and normalizes them to the range [-1, 1].
Returns:

Torch Dataset object.

Return type:

Dataset
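
The effect of convert_tensor can be illustrated without any dependencies. The sketch below assumes the common torchvision-style pipeline, in which 8-bit pixel values are first scaled to [0, 1] and then shifted using a mean and standard deviation of 0.5. The documentation only states the target range of [-1, 1], so the exact constants and the helper names normalize_pixel and denormalize_pixel are assumptions made for illustration.

```python
def normalize_pixel(value, mean=0.5, std=0.5):
    """Map an 8-bit pixel value in [0, 255] to the range [-1, 1]."""
    scaled = value / 255.0          # step 1: [0, 255] -> [0, 1]
    return (scaled - mean) / std    # step 2: [0, 1]   -> [-1, 1]


def denormalize_pixel(value, mean=0.5, std=0.5):
    """Invert normalize_pixel, e.g. before saving a generated image."""
    return round((value * std + mean) * 255.0)


# One row of pixel intensities (hypothetical data).
row = [0, 64, 128, 255]
normalized = [normalize_pixel(p) for p in row]
restored = [denormalize_pixel(v) for v in normalized]
```

A generator with a tanh output layer produces values in the same [-1, 1] range, which is why GAN training data is usually normalized this way.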

load_cifar100_dataset(root, split='train', download=True, transform_data=True, convert_tensor=True, **kwargs)

Loads the CIFAR-100 dataset.

Parameters:
  • root (str) – Path to where datasets are stored.
  • transform_data (bool) – If True, preprocesses data.
  • split (str) – The split of data to use.
  • download (bool) – If True, downloads the dataset.
  • convert_tensor (bool) – If True, converts images to tensors and normalizes them to the range [-1, 1].
Returns:

Torch Dataset object.

Return type:

Dataset

load_cifar10_dataset(root, split='train', download=True, transform_data=True, **kwargs)

Loads the CIFAR-10 dataset.

Parameters:
  • root (str) – Path to where datasets are stored.
  • transform_data (bool) – If True, preprocesses data.
  • split (str) – The split of data to use.
  • download (bool) – If True, downloads the dataset.
  • convert_tensor (bool) – If True, converts images to tensors and normalizes them to the range [-1, 1].
Returns:

Torch Dataset object.

Return type:

Dataset

load_dataset(root, name, **kwargs)

Loads a dataset by name for GAN training. By default, all images are normalized to values in the range [-1, 1].

Parameters:
  • root (str) – Path to where datasets are stored.
  • name (str) – Name of dataset to load.
Returns:

Torch Dataset object for a specific dataset.

Return type:

Dataset
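
The name-based lookup described above can be pictured as a dispatch table from dataset names to loader functions. This is a minimal sketch and not the library's implementation: load_dataset_sketch, the stand-in loaders, and the keys "cifar10" and "stl10" are all hypothetical and exist only to show how root and the extra keyword arguments might be forwarded.

```python
def load_dataset_sketch(root, name, loaders, **kwargs):
    """Look up a loader by dataset name and call it with the shared root."""
    if name not in loaders:
        raise ValueError(
            f"Unknown dataset {name!r}; choose one of {sorted(loaders)}")
    return loaders[name](root=root, **kwargs)


# Stand-in loaders that only record their arguments. Real loaders would
# return a torch Dataset instead of a dict.
def fake_cifar10(root, split="train", **kwargs):
    return {"dataset": "cifar10", "root": root, "split": split}


def fake_stl10(root, size=48, split="unlabeled", **kwargs):
    return {"dataset": "stl10", "root": root, "size": size, "split": split}


# Hypothetical registry; the keys are illustrative names only.
LOADERS = {"cifar10": fake_cifar10, "stl10": fake_stl10}

train_set = load_dataset_sketch("./datasets", "cifar10", LOADERS)
small_stl = load_dataset_sketch("./datasets", "stl10", LOADERS, size=32)
```

Forwarding **kwargs keeps the entry point small while still letting callers override per-dataset options such as size or split.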

load_fake_dataset(root, transform_data=True, convert_tensor=True, image_size=(3, 32, 32), **kwargs)

Loads a fake dataset for testing.

Parameters:
  • root (str) – Path to where datasets are stored.
  • transform_data (bool) – If True, preprocesses data.
  • convert_tensor (bool) – If True, converts images to tensors and normalizes them to the range [-1, 1].
Returns:

Torch Dataset object.

Return type:

Dataset

load_imagenet_dataset(root, size=32, split='train', download=True, transform_data=True, convert_tensor=True, **kwargs)

Loads the ImageNet dataset.

Parameters:
  • root (str) – Path to where datasets are stored.
  • size (int) – Size to resize images to.
  • transform_data (bool) – If True, preprocesses data.
  • split (str) – The split of data to use.
  • download (bool) – If True, downloads the dataset.
  • convert_tensor (bool) – If True, converts images to tensors and normalizes them to the range [-1, 1].
Returns:

Torch Dataset object.

Return type:

Dataset

load_lsun_bedroom_dataset(root, size=128, transform_data=True, convert_tensor=True, **kwargs)

Loads the LSUN-Bedroom dataset.

Parameters:
  • root (str) – Path to where datasets are stored.
  • size (int) – Size to resize images to.
  • transform_data (bool) – If True, preprocesses data.
  • convert_tensor (bool) – If True, converts images to tensors and normalizes them to the range [-1, 1].
Returns:

Torch Dataset object.

Return type:

Dataset

load_stl10_dataset(root, size=48, split='unlabeled', download=True, transform_data=True, convert_tensor=True, **kwargs)

Loads the STL-10 dataset.

Parameters:
  • root (str) – Path to where datasets are stored.
  • size (int) – Size to resize images to.
  • transform_data (bool) – If True, preprocesses data.
  • split (str) – The split of data to use.
  • download (bool) – If True, downloads the dataset.
  • convert_tensor (bool) – If True, converts images to tensors and normalizes them to the range [-1, 1].
Returns:

Torch Dataset object.

Return type:

Dataset

Image Loaders

Loads randomly sampled images from datasets for computing metrics.

get_celeba_images(num_samples, root='./datasets', size=128, **kwargs)

Loads randomly sampled CelebA images.

Parameters:
  • num_samples (int) – The number of images to randomly sample.
  • root (str) – The root directory where all datasets are stored.
  • size (int) – Size of image to resize to.
Returns:

Batch of num_samples images as a NumPy array.

Return type:

ndarray

get_cifar100_images(num_samples, root='./datasets', **kwargs)

Loads randomly sampled CIFAR-100 training images.

Parameters:
  • num_samples (int) – The number of images to randomly sample.
  • root (str) – The root directory where all datasets are stored.
Returns:

Batch of num_samples images as a NumPy array.

Return type:

ndarray

get_cifar10_images(num_samples, root='./datasets', **kwargs)

Loads randomly sampled CIFAR-10 training images.

Parameters:
  • num_samples (int) – The number of images to randomly sample.
  • root (str) – The root directory where all datasets are stored.
Returns:

Batch of num_samples images as a NumPy array.

Return type:

ndarray

get_dataset_images(dataset, num_samples=50000, **kwargs)

Randomly samples num_samples images from the dataset given by name or as a Dataset object.

Parameters:
  • dataset (str/Dataset) – Dataset to load images from.
  • num_samples (int) – The number of images to randomly sample.
Returns:

Batch of num_samples images from a dataset as a NumPy array.

The final array has shape (N, H, W, 3), the channels-last layout used for TensorFlow inference.

Return type:

ndarray
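
PyTorch datasets normally yield channels-first images, so reaching the (N, H, W, 3) layout involves moving the channel axis to the end. The sketch below shows that step with NumPy. The helper name to_channels_last is hypothetical, and the assumption that the input batch arrives as (N, 3, H, W) is not stated in the documentation above.

```python
import numpy as np


def to_channels_last(images):
    """Convert a batch from (N, 3, H, W) to (N, H, W, 3)."""
    images = np.asarray(images)
    if images.ndim != 4 or images.shape[1] != 3:
        raise ValueError(
            f"Expected shape (N, 3, H, W), got {images.shape}")
    # Move axis 1 (channels) to the last position.
    return np.transpose(images, (0, 2, 3, 1))


# Toy batch: 8 RGB images of 32x32 with only the first channel set.
batch = np.zeros((8, 3, 32, 32), dtype=np.float32)
batch[:, 0] = 1.0
converted = to_channels_last(batch)
```

After the conversion, converted[..., 0] addresses the first channel of every image, which is the indexing convention TensorFlow models expect.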

get_fake_data_images(num_samples, root='./datasets', size=32, **kwargs)

Loads fake images, intended mainly for testing.

Parameters:
  • num_samples (int) – The number of images to randomly sample.
  • root (str) – The root directory where all datasets are stored.
  • size (int) – Size of image to resize to.
Returns:

Batch of num_samples images as a NumPy array.

Return type:

ndarray

get_imagenet_images(num_samples, root='./datasets', size=32)

Reads the ImageNet folder directly and returns random images sampled in equal proportion from each class.

Parameters:
  • num_samples (int) – The number of images to randomly sample.
  • root (str) – The root directory where all datasets are stored.
  • size (int) – Size of image to resize to.
Returns:

Batch of num_samples images as a NumPy array.

Return type:

ndarray
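
Sampling "in equal proportion for each class" can be sketched as giving every class the same quota of files. The helper sample_balanced, the toy file names, and the rule for distributing a remainder when num_samples is not divisible by the number of classes are all assumptions; the documentation does not say how the library handles that case.

```python
import random


def sample_balanced(files_by_class, num_samples, seed=None):
    """Pick roughly num_samples / num_classes files from every class."""
    rng = random.Random(seed)
    classes = sorted(files_by_class)
    per_class, remainder = divmod(num_samples, len(classes))
    chosen = []
    for index, cls in enumerate(classes):
        # Assumption: spread any remainder over the first few classes.
        quota = per_class + (1 if index < remainder else 0)
        # rng.sample draws without replacement within the class.
        chosen.extend(rng.sample(files_by_class[cls], quota))
    rng.shuffle(chosen)
    return chosen


# Hypothetical class folders with ten files each.
files = {
    "cat": [f"cat_{i}.jpg" for i in range(10)],
    "dog": [f"dog_{i}.jpg" for i in range(10)],
    "owl": [f"owl_{i}.jpg" for i in range(10)],
}
picked = sample_balanced(files, num_samples=6, seed=0)
```

Balanced sampling keeps the reference statistics used for metrics from being dominated by classes that happen to contain more images.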

get_lsun_bedroom_images(num_samples, root='./datasets', size=128, **kwargs)

Loads randomly sampled LSUN-Bedroom training images.

Parameters:
  • num_samples (int) – The number of images to randomly sample.
  • root (str) – The root directory where all datasets are stored.
  • size (int) – Size of image to resize to.
Returns:

Batch of num_samples images as a NumPy array.

Return type:

ndarray

get_random_images(dataset, num_samples)

Randomly samples num_samples images without replacement.

Parameters:
  • dataset (Dataset) – Torch Dataset object for indexing elements.
  • num_samples (int) – The number of images to randomly sample.
Returns:

Batch of num_samples images as a NumPy array.

Return type:

ndarray
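
Sampling without replacement means that no image index is drawn twice. The following dependency-free sketch shows the idea on a toy dataset. It assumes each dataset item is an (image, label) pair, as torchvision-style datasets return; the helper name sample_without_replacement and the toy data are hypothetical, and the result is a plain list rather than the ndarray the real function returns.

```python
import random


def sample_without_replacement(dataset, num_samples, seed=None):
    """Return num_samples distinct images from an indexable dataset."""
    if num_samples > len(dataset):
        raise ValueError(
            f"Cannot draw {num_samples} samples from {len(dataset)} items")
    rng = random.Random(seed)
    # random.sample never repeats an index.
    indices = rng.sample(range(len(dataset)), num_samples)
    # Keep only the image and drop the label from each (image, label) pair.
    return [dataset[i][0] for i in indices]


# Toy dataset of 100 (image, label) pairs.
toy_dataset = [(f"image_{i}", i % 10) for i in range(100)]
images = sample_without_replacement(toy_dataset, num_samples=5, seed=0)
```

Requesting more samples than the dataset holds cannot be satisfied without repeats, so the sketch raises an error in that case.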

get_stl10_images(num_samples, root='./datasets', size=48, **kwargs)

Loads randomly sampled STL-10 images.

Parameters:
  • num_samples (int) – The number of images to randomly sample.
  • root (str) – The root directory where all datasets are stored.
  • size (int) – Size of image to resize to.
Returns:

Batch of num_samples images as a NumPy array.

Return type:

ndarray

sample_dataset_images(dataset, num_samples)

Randomly samples the dataset for images.

Parameters:
  • dataset (Dataset) – Torch dataset object to sample images from.
  • num_samples (int) – The number of images to randomly sample.
Returns:

NumPy array of images whose first dimension is the batch size.

Return type:

ndarray