DatasetLoader

class DatasetLoader

Bases: abc.ABC

A generic dataset: an abstract base class that exposes drug features, context features, and labeled triples.
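Because the class derives from abc.ABC and three of its methods are abstract, a concrete dataset has to implement get_context_features(), get_drug_features(), and get_labeled_triples() before it can be instantiated. The following is a minimal sketch of that pattern only. ToyDatasetLoader, InMemoryLoader, the dictionary-based feature sets, and the (drug, drug, context, label) tuple layout are hypothetical stand-ins, not the library's own classes, and the way the count attributes are derived here is an assumption.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple

# Hypothetical stand-in types; the real ContextFeatureSet and LabeledTriples
# classes are not reproduced here.
FeatureSet = Dict[str, List[float]]
LabeledTriple = Tuple[str, str, str, float]  # assumed layout: (drug, drug, context, label)


class ToyDatasetLoader(ABC):
    """Toy base class mirroring the abstract methods and attributes listed on this page."""

    @abstractmethod
    def get_context_features(self) -> FeatureSet:
        """Return one feature vector per context."""

    @abstractmethod
    def get_drug_features(self) -> FeatureSet:
        """Return one feature vector per drug."""

    @abstractmethod
    def get_labeled_triples(self) -> List[LabeledTriple]:
        """Return the labeled triples."""

    @property
    def num_contexts(self) -> int:
        return len(self.get_context_features())

    @property
    def num_drugs(self) -> int:
        return len(self.get_drug_features())

    @property
    def context_channels(self) -> int:
        # Number of features per context, read off the first feature vector.
        return len(next(iter(self.get_context_features().values())))

    @property
    def drug_channels(self) -> int:
        # Number of features per drug, read off the first feature vector.
        return len(next(iter(self.get_drug_features().values())))

    @property
    def num_labeled_triples(self) -> int:
        return len(self.get_labeled_triples())


class InMemoryLoader(ToyDatasetLoader):
    """Concrete loader backed by hard-coded toy data."""

    def get_context_features(self) -> FeatureSet:
        return {"context_a": [0.1, 0.2, 0.3]}

    def get_drug_features(self) -> FeatureSet:
        return {"drug_1": [1.0, 0.0], "drug_2": [0.0, 1.0]}

    def get_labeled_triples(self) -> List[LabeledTriple]:
        return [("drug_1", "drug_2", "context_a", 1.0)]


loader = InMemoryLoader()
print(loader.num_drugs, loader.drug_channels, loader.num_contexts, loader.context_channels)
```

Instantiating ToyDatasetLoader directly raises TypeError, which is the standard abc behaviour for a class with unimplemented abstract methods.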

Attributes Summary

context_channels

Get the number of features for each context.

drug_channels

Get the number of features for each drug.

num_contexts

Get the number of contexts.

num_drugs

Get the number of drugs.

num_labeled_triples

Get the number of labeled triples.

Methods Summary

get_context_features()

Get the context feature set.

get_drug_features()

Get the drug feature set.

get_generator(batch_size, context_features, …)

Initialize a batch generator.

get_generators(batch_size, context_features, …)

Generate a pre-stratified pair of batch generators.

get_labeled_triples()

Load the labeled triples from storage.

summarize()

Summarize the dataset.

Attributes Documentation

context_channels

Get the number of features for each context.

Return type

int

drug_channels

Get the number of features for each drug.

Return type

int

num_contexts

Get the number of contexts.

Return type

int

num_drugs

Get the number of drugs.

Return type

int

num_labeled_triples

Get the number of labeled triples.

Return type

int

Methods Documentation

abstract get_context_features()

Get the context feature set.

Return type

ContextFeatureSet

abstract get_drug_features()

Get the drug feature set.

Return type

DrugFeatureSet

get_generator(batch_size, context_features, drug_features, drug_molecules, labeled_triples=None)

Initialize a batch generator.

Parameters
  • batch_size (int) – Number of drug pairs per batch.

  • context_features (bool) – Whether the batch should include biological context features.

  • drug_features (bool) – Whether the batch should include drug features.

  • drug_molecules (bool) – Whether the batch should include drug molecules.

  • labeled_triples (Optional[LabeledTriples]) – A labeled triples object used to generate batches. If None, all triples from the dataset are used.

Return type

BatchGenerator

Returns

A batch generator.
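To illustrate how batch_size and the boolean flags described above could shape a batch, here is a toy sketch. It is not the library's BatchGenerator: iter_batches, the dict-shaped batches, the lookup tables, and the (drug, drug, context, label) tuple layout are all assumptions made for illustration, and the drug_molecules flag is left out because it would need molecule objects.

```python
from typing import Dict, Iterator, List, Optional, Tuple

LabeledTriple = Tuple[str, str, str, float]  # assumed layout: (drug, drug, context, label)
FeatureSet = Dict[str, List[float]]


def iter_batches(
    triples: List[LabeledTriple],
    batch_size: int,
    context_features: bool,
    drug_features: bool,
    context_lookup: Optional[FeatureSet] = None,
    drug_lookup: Optional[FeatureSet] = None,
) -> Iterator[dict]:
    """Yield dict batches of at most ``batch_size`` drug pairs (toy stand-in)."""
    if context_features and context_lookup is None:
        raise ValueError("context_lookup is required when context_features is True")
    if drug_features and drug_lookup is None:
        raise ValueError("drug_lookup is required when drug_features is True")
    for start in range(0, len(triples), batch_size):
        chunk = triples[start:start + batch_size]
        batch: dict = {"triples": chunk}
        if context_features:
            # One context feature vector per drug pair in the batch.
            batch["context_features"] = [context_lookup[c] for _, _, c, _ in chunk]
        if drug_features:
            # Feature vectors for the first and second drug of each pair.
            batch["drug_features_left"] = [drug_lookup[d] for d, _, _, _ in chunk]
            batch["drug_features_right"] = [drug_lookup[d] for _, d, _, _ in chunk]
        yield batch


# Usage with made-up data: three drug pairs, two pairs per batch.
triples = [("d1", "d2", "c1", 1.0), ("d1", "d3", "c1", 0.0), ("d2", "d3", "c1", 1.0)]
contexts = {"c1": [0.5]}
batches = list(iter_batches(triples, 2, context_features=True, drug_features=False,
                            context_lookup=contexts))
```

With a flag set to False, the corresponding entries are simply absent from each batch, which keeps a model that ignores those inputs from paying for them.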

get_generators(batch_size, context_features, drug_features, drug_molecules, train_size=None, random_state=None)

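Generate a pre-stratified pair of batch generators, one for training and one for testing. batch_size, context_features, drug_features, and drug_molecules have the same meaning as in get_generator(); train_size and random_state control the split.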

Return type

Tuple[BatchGenerator, BatchGenerator]
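The page does not spell out what "pre-stratified" means. One plausible reading is that the labeled triples are split into two parts so that each part keeps the same label proportions, and each part then feeds its own generator. The sketch below shows only that splitting step in plain Python. stratified_split, the 0.8 default for train_size, and the (drug, drug, context, label) tuple layout are assumptions, not the library's behaviour.

```python
import random
from collections import defaultdict
from typing import List, Optional, Tuple

LabeledTriple = Tuple[str, str, str, float]  # assumed layout: (drug, drug, context, label)


def stratified_split(
    triples: List[LabeledTriple],
    train_size: float = 0.8,  # assumed default, not taken from the library
    random_state: Optional[int] = None,
) -> Tuple[List[LabeledTriple], List[LabeledTriple]]:
    """Split triples into two lists while preserving the label proportions."""
    rng = random.Random(random_state)
    by_label = defaultdict(list)
    for triple in triples:
        by_label[triple[3]].append(triple)

    first, second = [], []
    for label in sorted(by_label):
        group = list(by_label[label])
        rng.shuffle(group)
        # Cut each label group at the same fraction so both parts stay balanced.
        cut = round(len(group) * train_size)
        first.extend(group[:cut])
        second.extend(group[cut:])
    return first, second


# Usage with made-up data: 10 positive and 10 negative triples.
data = [(f"d{i}", "x", "c", float(i % 2)) for i in range(20)]
train, test = stratified_split(data, train_size=0.8, random_state=0)
```

Passing the same random_state reproduces the same split, which is the usual reason such a parameter is exposed.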

abstract get_labeled_triples()

Load the labeled triples from storage.

Return type

LabeledTriples

summarize()

Summarize the dataset.

Return type

None