nourish.load_dataset
nourish.load_dataset(name, *, version='latest', download=True, subdatasets=None)
High-level function that wraps the dataset.Dataset class’s load and download functionality. Downloads to and loads from the directory DATADIR/dataset_schemata_name/name/version, where DATADIR is the value of nourish.get_config().DATADIR. DATADIR can be changed by calling init().
- Parameters
name (str) – Name of the dataset you want to load from Nourish’s available datasets. You can get a list of these datasets by calling list_all_datasets().
version (str) – Version of the dataset to load. Latest version is used by default. You can get a list of all available versions for a dataset by calling list_all_datasets().
download (bool) – Whether or not the dataset should be downloaded before loading.
subdatasets (Optional[Iterable[str]]) – An iterable containing the subdatasets to load. None means all subdatasets.
- Raises
FileNotFoundError – The dataset files were not previously downloaded or can’t be found, and download is False.
- Returns
Dictionary that holds all subdatasets.
- Return type
Dict[str, Any]
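The interaction between download and FileNotFoundError suggests a "use the local copy first, download only when it is missing" pattern. The following is a minimal sketch of that idea, not part of Nourish itself: the helper name load_cached_first is hypothetical, and the loader is passed in as an argument so the sketch only assumes the keyword arguments documented above.

```python
from typing import Any, Callable, Dict, Iterable, Optional


def load_cached_first(
    loader: Callable[..., Dict[str, Any]],
    name: str,
    *,
    version: str = "latest",
    subdatasets: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: try the local files first, download on a miss.

    `loader` is assumed to accept the same keyword arguments as the
    documented load_dataset signature.
    """
    # Materialize the iterable so it can safely be passed to the loader twice.
    subs = None if subdatasets is None else tuple(subdatasets)
    try:
        # First attempt: do not download, rely on previously downloaded files.
        return loader(name, version=version, download=False, subdatasets=subs)
    except FileNotFoundError:
        # Files are missing locally: retry with downloading enabled.
        return loader(name, version=version, download=True, subdatasets=subs)
```

With Nourish installed, this could be called as load_cached_first(nourish.load_dataset, 'noaa_jfk', subdatasets=['jfk_weather_cleaned']). Whether skipping the download step saves time in practice depends on how Nourish handles files that already exist, which is not described here.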
Example:
>>> from nourish import load_dataset
>>> data = load_dataset('noaa_jfk')
>>> data['jfk_weather_cleaned'][['DATE', 'HOURLYVISIBILITY', 'HOURLYDRYBULBTEMPF']].head(3)
                  DATE  HOURLYVISIBILITY  HOURLYDRYBULBTEMPF
0  2010-01-01 01:00:00               6.0                33.0
1  2010-01-01 02:00:00               6.0                33.0
2  2010-01-01 03:00:00               5.0                33.0
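To make the on-disk layout DATADIR/dataset_schemata_name/name/version concrete, the sketch below joins the four components in the documented order. It is an illustration only: the function expected_dataset_dir is hypothetical, every value passed to it is a made-up placeholder, and it does not assume anything about how Nourish resolves 'latest' to a concrete version.

```python
from pathlib import PurePosixPath


def expected_dataset_dir(data_dir, schemata_name, name, version):
    """Hypothetical helper: join the path components in the documented order."""
    return PurePosixPath(data_dir) / schemata_name / name / version


# All four values are placeholders, not real Nourish defaults.
path = expected_dataset_dir("/tmp/nourish-data", "example_schemata", "noaa_jfk", "1.0")
print(path)  # /tmp/nourish-data/example_schemata/noaa_jfk/1.0
```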