captest.io.DataLoader.load

DataLoader.load(extension='csv', summary=True, verbose=False, raise_errors=False, skip_dir_load=False, **kwargs)

Load file(s) of timeseries data from SCADA / DAS systems.

Set path to a single file to load that file, or to a directory to load every file in it whose name ends in the given extension (“csv” by default). Alternatively, set files_to_load to a list of specific files to load. Paths may be local filesystem paths or S3 URIs (e.g. s3://bucket/path/).

Multiple files will be joined together and may include files with different column headings. When multiple files with matching column headings are loaded, the individual files will be reindexed and then joined.

Missing time intervals within the individual files will be filled, but missing time intervals between the individual files will not be filled.

When multiple files are loaded, they are stored in loaded_files, a dictionary mapping each file name to the DataFrame for that file.
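The gap-filling behavior described above can be illustrated with plain pandas. This is a minimal sketch of the idea, not the library's actual implementation: the helper fill_missing_intervals, the file names, the "power" column, and the 5-minute frequency are all hypothetical, and the dictionary only mimics the shape of loaded_files.

```python
import pandas as pd


def fill_missing_intervals(df: pd.DataFrame, freq: str) -> pd.DataFrame:
    """Reindex one file's data onto a complete, regular time index."""
    # Hypothetical helper: build the full index between the first and last
    # timestamps of this file only, so rows missing inside it become NaN.
    full_index = pd.date_range(df.index.min(), df.index.max(), freq=freq)
    return df.reindex(full_index)


# A dictionary shaped like loaded_files: file name -> DataFrame.
# "a.csv" is missing its 00:10 row; "b.csv" starts 45 minutes later.
frames = {
    "a.csv": pd.DataFrame(
        {"power": [1.0, 2.0, 4.0]},
        index=pd.to_datetime(
            ["2023-01-01 00:00", "2023-01-01 00:05", "2023-01-01 00:15"]
        ),
    ),
    "b.csv": pd.DataFrame(
        {"power": [5.0, 6.0]},
        index=pd.to_datetime(["2023-01-01 01:00", "2023-01-01 01:05"]),
    ),
}

# Fill gaps within each file, then join the files end to end.
filled = [fill_missing_intervals(df, "5min") for df in frames.values()]
combined = pd.concat(filled)

print(combined)
```

In this sketch the 00:10 row inside the first file appears as NaN, while the interval between 00:15 and 01:00 (the gap between the two files) gets no rows at all.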

Parameters:
  • extension (str, default "csv") – Change the extension to load a different file type. The file_reader attribute must also be set to a function that reads that file type. Do not include the leading period (“.”).

  • summary (bool, default True) – Prints the path of each file it attempts to load, followed by a confirmation that the file loaded or a message that it failed. Only relevant if path is set to a directory rather than a single file. Set to False to suppress all file loading status output.

  • verbose (bool, default False) – Prints the same output as summary=True (and sets summary to True), plus details of the reindexing of each file after loading.

  • raise_errors (bool, default False) – Set to True to raise an error if a file fails to load.

  • skip_dir_load (bool, default False) – Set to True when using a custom file_reader that handles multiple files itself. This skips the parsing of files in a directory and instead passes the directory path and kwargs to the file_reader function.

  • **kwargs – Passed through to the file_reader callable, which by default passes them on to pandas.read_csv.
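As an illustration of the extension and file_reader parameters, the following is a minimal sketch of a custom reader for tab-separated files. The function name read_tsv, the "tsv" extension, and the sample data are hypothetical. The (path, **kwargs) signature is an assumption based on the parameter descriptions above, and the commented wiring at the end assumes DataLoader accepts path in its constructor.

```python
import io

import pandas as pd


def read_tsv(path, **kwargs):
    """Hypothetical reader for tab-separated files with timestamps in column 0."""
    # Defaults can still be overridden by kwargs forwarded from load().
    kwargs.setdefault("sep", "\t")
    kwargs.setdefault("index_col", 0)
    kwargs.setdefault("parse_dates", True)
    return pd.read_csv(path, **kwargs)


# Exercise the reader on an in-memory sample instead of a real file.
sample = io.StringIO(
    "timestamp\tpower\n"
    "2023-01-01 00:00\t1.5\n"
    "2023-01-01 00:05\t2.5\n"
)
df = read_tsv(sample)
print(df)

# Assumed wiring (not executed here; constructor arguments are an assumption):
# from captest.io import DataLoader
# loader = DataLoader(path="./data/")
# loader.file_reader = read_tsv
# loader.load(extension="tsv")
```

Using setdefault keeps the reader's defaults in place while still letting keyword arguments passed to load override them.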

Returns:

The resulting DataFrame is stored in the data attribute.

Return type:

None