LocalDatasetLoader

class LocalDatasetLoader(directory=None)

Bases: chemicalx.data.datasetloader.DatasetLoader, abc.ABC

A dataset loader that processes and caches data locally.

Attributes Summary

contexts_name

features_name

labels_name

structures_name

Methods Summary

get_context_features()

Get the context feature set.

get_drug_features()

Get the drug feature set.

get_labeled_triples()

Get the labeled triples dataframe.

preprocess()

Download and preprocess the dataset.

write_contexts(contexts)

Write the context feature set.

write_drugs(drugs)

Write the drug data.

write_labels(df)

Write the labeled triples dataframe.

Attributes Documentation

contexts_name: ClassVar[str] = 'contexts.tsv'
features_name: ClassVar[str] = 'features.tsv'
labels_name: ClassVar[str] = 'labels.tsv'
structures_name: ClassVar[str] = 'structures.tsv'

Methods Documentation

get_context_features()

Get the context feature set.

Return type

ContextFeatureSet

get_drug_features()

Get the drug feature set.

Return type

DrugFeatureSet

get_labeled_triples()

Get the labeled triples dataframe.

Return type

LabeledTriples

abstract preprocess()

Download and preprocess the dataset.

An implementation of this method should write all three files, self.drugs_path, self.contexts_path, and self.labels_path, by calling write_drugs(), write_contexts(), and write_labels(), respectively.
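The contract above can be illustrated with a minimal, self-contained sketch. It does not import chemicalx: SketchLocalLoader is a hypothetical stand-in that mimics the documented pattern, and ToyLoader, run_demo, the drugs.tsv file name, the tab-separated row layout, and the "run preprocess() only when a file is missing" check are all assumptions made for illustration. The stand-in write_labels() also takes plain row tuples rather than a dataframe, to keep the example dependency-free.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class SketchLocalLoader(ABC):
    """Hypothetical stand-in for the documented class; NOT the chemicalx code."""

    contexts_name = "contexts.tsv"
    labels_name = "labels.tsv"
    drugs_name = "drugs.tsv"  # assumption: this file name is not documented above

    def __init__(self, directory):
        self.directory = Path(directory)
        self.drugs_path = self.directory / self.drugs_name
        self.contexts_path = self.directory / self.contexts_name
        self.labels_path = self.directory / self.labels_name
        # Assumed caching rule: preprocess only if any output file is missing.
        paths = (self.drugs_path, self.contexts_path, self.labels_path)
        if not all(path.is_file() for path in paths):
            self.preprocess()

    @abstractmethod
    def preprocess(self):
        """Write all three files through the write_* helpers."""

    def _write_rows(self, path, rows):
        # Assumed on-disk layout: one tab-separated record per line.
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("w") as file:
            for row in rows:
                print(*row, sep="\t", file=file)

    def write_drugs(self, drugs):
        self._write_rows(self.drugs_path, drugs.items())

    def write_contexts(self, contexts):
        rows = ((name, *values) for name, values in contexts.items())
        self._write_rows(self.contexts_path, rows)

    def write_labels(self, rows):
        self._write_rows(self.labels_path, rows)


class ToyLoader(SketchLocalLoader):
    """Hypothetical subclass that fulfils the preprocess() contract."""

    preprocess_calls = 0

    def preprocess(self):
        type(self).preprocess_calls += 1
        # A real loader would download and transform raw data here.
        self.write_drugs({"drug_a": "CCO", "drug_b": "CC(=O)O"})
        self.write_contexts({"cell_1": [1.0, 0.0]})
        self.write_labels([("drug_a", "drug_b", "cell_1", 1.0)])


def run_demo():
    ToyLoader.preprocess_calls = 0
    with tempfile.TemporaryDirectory() as tmp:
        ToyLoader(tmp)           # files missing, so preprocess() runs
        loader = ToyLoader(tmp)  # files present, so preprocess() is skipped
        return {
            "preprocess_calls": ToyLoader.preprocess_calls,
            "files": sorted(path.name for path in loader.directory.iterdir()),
            "labels": loader.labels_path.read_text(),
        }


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the second instantiation reuses the files written by the first, which is the kind of local caching the class description refers to. If preprocess() skipped any one of the three write calls, the stand-in would re-run it on every instantiation.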

write_contexts(contexts)

Write the context feature set.

write_drugs(drugs)

Write the drug data.

Return type

None

write_labels(df)

Write the labeled triples dataframe.