s3proxy_parquet_loader
S3ProxyDataAccess
S3ProxyDataAccess(
    start_timestamp: str,
    end_timestamp: str,
    uuids: List[str],
    s3_config: Dict[str, str],
)
A class to access timeseries data via an S3 proxy. This class retrieves data for specified UUIDs within a defined time range, with the option to output data as Parquet files or as a single combined DataFrame.
Initialize the S3ProxyDataAccess object.
:param start_timestamp: Start timestamp in "Year-Month-Day Hour:Minute:Second" format.
:param end_timestamp: End timestamp in "Year-Month-Day Hour:Minute:Second" format.
:param uuids: List of UUIDs to retrieve data for.
:param s3_config: Configuration dictionary for S3 connection.
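The constructor takes both timestamps as plain strings, so it can help to validate them before building the loader. The following is a minimal sketch, not part of the module: `build_loader_kwargs` is a hypothetical helper, the `strptime` pattern `%Y-%m-%d %H:%M:%S` is an assumed reading of "Year-Month-Day Hour:Minute:Second", and the `endpoint_url` key in `s3_config` is a placeholder, since the keys the class expects are not documented here.

```python
from datetime import datetime
from typing import Dict, List

# Assumed strptime equivalent of "Year-Month-Day Hour:Minute:Second".
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def build_loader_kwargs(
    start_timestamp: str,
    end_timestamp: str,
    uuids: List[str],
    s3_config: Dict[str, str],
) -> dict:
    """Hypothetical helper: validate inputs and return constructor keyword arguments."""
    # strptime raises ValueError if a string does not match the assumed format.
    start = datetime.strptime(start_timestamp, TIMESTAMP_FORMAT)
    end = datetime.strptime(end_timestamp, TIMESTAMP_FORMAT)
    if end < start:
        raise ValueError("end_timestamp must not be earlier than start_timestamp")
    if not uuids:
        raise ValueError("uuids must contain at least one UUID")
    return {
        "start_timestamp": start_timestamp,
        "end_timestamp": end_timestamp,
        "uuids": list(uuids),
        "s3_config": dict(s3_config),
    }


kwargs = build_loader_kwargs(
    "2024-01-01 00:00:00",
    "2024-01-01 03:00:00",
    ["00000000-0000-0000-0000-000000000001"],
    {"endpoint_url": "https://s3proxy.example.invalid"},  # placeholder key and value
)
# With the class imported from your installation, the arguments can be passed on:
# loader = S3ProxyDataAccess(**kwargs)
```

Validating up front turns a malformed timestamp into an immediate `ValueError` rather than a failure partway through a download.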
fetch_data_as_parquet
fetch_data_as_parquet(output_dir: str)
Retrieves timeseries data from S3 and saves it as Parquet files. Each file is saved in a directory structure of UUID/year/month/day/hour.
:param output_dir: Base directory to save the Parquet files.
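To illustrate the UUID/year/month/day/hour layout, the sketch below lists the directories one might expect under `output_dir` for a given time range. It is an illustration only: `hourly_output_dirs` is a hypothetical helper, and the zero-padded path components, the one-directory-per-hour slots, and the inclusive end of the range are all assumptions rather than documented behavior of `fetch_data_as_parquet`.

```python
from datetime import datetime, timedelta
from pathlib import PurePosixPath
from typing import Iterator, List

# Assumed strptime equivalent of "Year-Month-Day Hour:Minute:Second".
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def hourly_output_dirs(
    output_dir: str,
    uuids: List[str],
    start_timestamp: str,
    end_timestamp: str,
) -> Iterator[PurePosixPath]:
    """Hypothetical helper: yield one UUID/year/month/day/hour directory per hour."""
    start = datetime.strptime(start_timestamp, TIMESTAMP_FORMAT)
    end = datetime.strptime(end_timestamp, TIMESTAMP_FORMAT)
    # Truncate the start to the top of its hour so partial hours are included.
    first_hour = start.replace(minute=0, second=0, microsecond=0)
    for uuid in uuids:
        slot = first_hour
        while slot <= end:
            # Zero-padded components are an assumption about the real layout.
            yield PurePosixPath(
                output_dir, uuid, f"{slot:%Y}", f"{slot:%m}", f"{slot:%d}", f"{slot:%H}"
            )
            slot += timedelta(hours=1)


dirs = [
    str(p)
    for p in hourly_output_dirs(
        "out", ["sensor-a"], "2024-01-01 22:30:00", "2024-01-02 00:10:00"
    )
]
print(dirs)
# ['out/sensor-a/2024/01/01/22', 'out/sensor-a/2024/01/01/23', 'out/sensor-a/2024/01/02/00']
```

A layout like this keeps each UUID's data partitioned by hour, so a later reader can load a narrow time window without scanning unrelated files.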
fetch_data_as_dataframe
fetch_data_as_dataframe() -> pd.DataFrame
Retrieves timeseries data from S3 and returns it as a single DataFrame.
:return: A combined DataFrame with data for all specified UUIDs and time slots.