ts_shape.loader.timeseries.parquet_loader
Classes:
- ParquetLoader – This class provides class methods to load parquet files from a specified directory structure.
ParquetLoader
This class provides class methods to load parquet files from a specified directory structure.
Parameters:
Methods:
- load_all_files – Loads all parquet files in the specified base directory into a single pandas DataFrame.
- load_by_time_range – Loads parquet files that fall within a specified time range based on the directory structure.
- load_by_uuid_list – Loads parquet files that match any UUID in the specified list.
- load_files_by_time_range_and_uuids – Loads parquet files that fall within a specified time range and match any UUID in the list.
Source code in src/ts_shape/loader/timeseries/parquet_loader.py
load_all_files
classmethod
Loads all parquet files in the specified base directory into a single pandas DataFrame.
Parameters:
Returns:
- pd.DataFrame – A DataFrame containing all the data from the parquet files.
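The file discovery step that a loader like this needs can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: `find_parquet_files` is a hypothetical helper, and it assumes the files carry a `.parquet` extension and may sit at any depth below the base directory.

```python
from pathlib import Path


def find_parquet_files(base_path):
    """Hypothetical helper: collect every *.parquet file below base_path."""
    # rglob walks the directory tree recursively; sorting makes the order stable.
    return sorted(Path(base_path).rglob("*.parquet"))
```

Each returned path could then be read with `pandas.read_parquet` and the resulting frames combined with `pandas.concat` to obtain a single DataFrame.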
load_by_time_range
classmethod
load_by_time_range(base_path: str, start_time: Timestamp, end_time: Timestamp) -> DataFrame
Loads parquet files that fall within a specified time range based on the directory structure.
The directory structure is expected to be in the format YYYY/MM/DD/HH.
Parameters:
- base_path (str) – The base directory where parquet files are stored.
- start_time (Timestamp) – The start timestamp.
- end_time (Timestamp) – The end timestamp.
Returns:
- pd.DataFrame – A DataFrame containing the data from the parquet files within the time range.
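To make the YYYY/MM/DD/HH layout concrete, the sketch below enumerates the hourly directories that a time range touches. It is an illustration under stated assumptions rather than the library's code: `hourly_dirs` is a hypothetical helper, the start is rounded down to the full hour, and the end hour is treated as inclusive.

```python
from datetime import datetime, timedelta
from pathlib import PurePosixPath


def hourly_dirs(base_path, start_time, end_time):
    """Hypothetical helper: list YYYY/MM/DD/HH directories covering a range."""
    # Round the start down to the full hour so a partial first hour is covered.
    current = start_time.replace(minute=0, second=0, microsecond=0)
    dirs = []
    while current <= end_time:  # assumption: the end hour is inclusive
        hour_folder = current.strftime("%Y/%m/%d/%H")
        dirs.append(str(PurePosixPath(base_path) / hour_folder))
        current += timedelta(hours=1)
    return dirs


# Example: a range that crosses midnight.
folders = hourly_dirs("data", datetime(2024, 1, 1, 22, 30), datetime(2024, 1, 2, 1, 0))
```

Because `pandas.Timestamp` subclasses `datetime`, the same helper should also accept Timestamp arguments like those in the signature above.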
load_by_uuid_list
classmethod
Loads parquet files that match any UUID in the specified list.
The UUIDs are expected to be part of the file names.
Parameters:
- base_path (str) – The base directory where parquet files are stored.
- uuid_list (list) – A list of UUIDs to filter the files.
Returns:
- pd.DataFrame – A DataFrame containing the data from the parquet files with matching UUIDs.
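Matching UUIDs against file names can be sketched as a simple substring test. The helper name `filter_by_uuid` and the short stand-in identifiers are hypothetical, and the sketch assumes that only the file name, not its parent directories, is inspected.

```python
from pathlib import Path


def filter_by_uuid(file_paths, uuid_list):
    """Hypothetical helper: keep paths whose file name contains any UUID."""
    return [
        path
        for path in file_paths
        # Only the final path component is checked, not the directories.
        if any(uuid in Path(path).name for uuid in uuid_list)
    ]


# Example with short stand-in identifiers instead of real UUIDs.
candidates = [
    "data/2024/01/01/00/sensor-a1.parquet",
    "data/2024/01/01/00/sensor-b2.parquet",
    "data/a1/other.parquet",
]
matches = filter_by_uuid(candidates, ["a1"])
```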
load_files_by_time_range_and_uuids
classmethod
load_files_by_time_range_and_uuids(base_path: str, start_time: Timestamp, end_time: Timestamp, uuid_list: list) -> DataFrame
Loads parquet files that fall within a specified time range and match any UUID in the list.
The directory structure is expected to be in the format YYYY/MM/DD/HH, and UUIDs are part of the file names.
Parameters:
- base_path (str) – The base directory where parquet files are stored.
- start_time (Timestamp) – The start timestamp.
- end_time (Timestamp) – The end timestamp.
- uuid_list (list) – A list of UUIDs to filter the files.
Returns:
- pd.DataFrame – A DataFrame containing the data from the parquet files that meet both criteria.
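Applying both criteria at once can be sketched by parsing the hour folder out of each path and checking the file name in the same pass. This is a hedged illustration, not the library's implementation: `select_files` is a hypothetical helper, paths are assumed to be relative to the base directory in the form YYYY/MM/DD/HH/<file name>, timestamps are assumed to be timezone-naive, and the end hour is treated as inclusive.

```python
from datetime import datetime
from pathlib import PurePosixPath


def select_files(relative_paths, start_time, end_time, uuid_list):
    """Hypothetical helper: keep paths inside the time range that match a UUID."""
    # Round the start down so files in a partially covered first hour are kept.
    start_hour = start_time.replace(minute=0, second=0, microsecond=0)
    selected = []
    for rel in relative_paths:
        parts = PurePosixPath(rel).parts
        if len(parts) != 5:
            continue  # not of the form YYYY/MM/DD/HH/<file name>
        try:
            year, month, day, hour = (int(p) for p in parts[:4])
            folder_time = datetime(year, month, day, hour)
        except ValueError:
            continue  # a folder name is not a valid date component
        in_range = start_hour <= folder_time <= end_time
        has_uuid = any(uuid in parts[4] for uuid in uuid_list)
        if in_range and has_uuid:
            selected.append(rel)
    return selected


# Example with short stand-in identifiers instead of real UUIDs.
paths = [
    "2024/01/01/22/a1.parquet",
    "2024/01/01/23/b2.parquet",
    "2024/01/02/03/a1.parquet",
    "misc/a1.parquet",
]
chosen = select_files(
    paths, datetime(2024, 1, 1, 22, 30), datetime(2024, 1, 2, 1, 0), ["a1"]
)
```

Filtering paths before reading anything keeps the number of files opened to those that satisfy both conditions.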