ts_shape.loader.combine.integrator

Classes:

  • DataIntegratorHybrid

    A flexible utility class to integrate data from various sources.

DataIntegratorHybrid

A flexible utility class to integrate data from various sources, including:

  • API instances (e.g., DatapointAPI)

  • Direct raw data (e.g., UUID list, metadata, timeseries DataFrame)

  • Hybrid approaches (combination of instances and raw data)
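The hybrid idea can be illustrated with a minimal sketch in plain pandas. It does not show the library's internal helpers, whose implementation is not included on this page. `FakeTimeseriesSource` and `resolve_timeseries` are hypothetical names introduced only for illustration; the only detail taken from the documentation is that instance sources expose `fetch_data_as_dataframe`.

```python
import pandas as pd


class FakeTimeseriesSource:
    """Hypothetical stand-in for an API instance such as DatapointAPI."""

    def fetch_data_as_dataframe(self) -> pd.DataFrame:
        # A real instance would query a backend; this one returns fixed rows.
        return pd.DataFrame({"uuid": ["a", "b"], "value": [1.0, 2.0]})


def resolve_timeseries(source) -> pd.DataFrame:
    """Return a DataFrame for a raw DataFrame or a fetch-capable instance.

    Assumption: this mirrors how a hybrid integrator might normalize its
    inputs; it is not the library's actual code.
    """
    if isinstance(source, pd.DataFrame):
        return source
    if hasattr(source, "fetch_data_as_dataframe"):
        return source.fetch_data_as_dataframe()
    raise TypeError(f"Unsupported timeseries source: {type(source).__name__}")


# Hybrid input: one instance plus one raw DataFrame.
raw = pd.DataFrame({"uuid": ["c"], "value": [3.0]})
frames = [resolve_timeseries(s) for s in (FakeTimeseriesSource(), raw)]
hybrid = pd.concat(frames, ignore_index=True)
print(hybrid)
```

Checking for the method with `hasattr` rather than for a specific class keeps the sketch consistent with the `object` type hint in the signature, which suggests that any object offering the expected fetch method is acceptable.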

Methods:

  • combine_data

    Combine timeseries and metadata from various sources.

combine_data classmethod

combine_data(timeseries_sources: Optional[List[Union[DataFrame, object]]] = None, metadata_sources: Optional[List[Union[DataFrame, object]]] = None, uuids: Optional[List[str]] = None, join_key: str = 'uuid', merge_how: str = 'left') -> DataFrame

Combine timeseries and metadata from various sources.

Parameters:

  • timeseries_sources: List of timeseries sources (DataFrame or instances with fetch_data_as_dataframe).

  • metadata_sources: List of metadata sources (DataFrame or instances with fetch_metadata).

  • uuids: Optional list of UUIDs to filter the combined data.

  • join_key: Key column to use for merging, default is "uuid".

  • merge_how: Merge strategy ('left', 'inner', etc.), default is "left".

Returns:

  • A combined DataFrame.
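The effect of merge_how and uuids can be seen with plain pandas, which the method uses for the merge. The snippet below is a sketch with made-up sample data; the frames and column values are assumptions, and it calls `pd.merge` directly rather than the class itself.

```python
import pandas as pd

# Illustrative data (assumed): uuid "c" has timeseries rows but no metadata.
timeseries = pd.DataFrame(
    {"uuid": ["a", "a", "b", "c"], "value": [1.0, 1.5, 2.0, 3.0]}
)
metadata = pd.DataFrame({"uuid": ["a", "b"], "unit": ["degC", "bar"]})

# "left" keeps every timeseries row; rows without metadata get NaN.
left = pd.merge(timeseries, metadata, on="uuid", how="left")

# "inner" keeps only rows whose uuid appears in both frames.
inner = pd.merge(timeseries, metadata, on="uuid", how="inner")

# The uuids argument corresponds to a filter on the join key after merging.
filtered = left[left["uuid"].isin(["a", "c"])]

print(len(left), len(inner), len(filtered))
```

Going by the documented signature, the equivalent call on the class would be `DataIntegratorHybrid.combine_data(timeseries_sources=[timeseries], metadata_sources=[metadata], uuids=["a", "c"], merge_how="left")`.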

Source code in src/ts_shape/loader/combine/integrator.py
@classmethod
def combine_data(
    cls,
    timeseries_sources: Optional[List[Union[pd.DataFrame, object]]] = None,
    metadata_sources: Optional[List[Union[pd.DataFrame, object]]] = None,
    uuids: Optional[List[str]] = None,
    join_key: str = "uuid",
    merge_how: str = "left",
) -> pd.DataFrame:
    """
    Combine timeseries and metadata from various sources.

    :param timeseries_sources: List of timeseries sources (DataFrame or instances with `fetch_data_as_dataframe`).
    :param metadata_sources: List of metadata sources (DataFrame or instances with `fetch_metadata`).
    :param uuids: Optional list of UUIDs to filter the combined data.
    :param join_key: Key column to use for merging, default is "uuid".
    :param merge_how: Merge strategy ('left', 'inner', etc.), default is "left".
    :return: A combined DataFrame.
    """
    # Retrieve and combine timeseries data
    timeseries_data = cls._combine_timeseries(timeseries_sources, join_key)

    if timeseries_data.empty:
        print("No timeseries data found.")
        return pd.DataFrame()

    # Retrieve and combine metadata
    metadata = cls._combine_metadata(metadata_sources, join_key)

    if metadata.empty:
        print("No metadata found.")
        return timeseries_data

    # Merge timeseries data with metadata
    combined_data = pd.merge(timeseries_data, metadata, on=join_key, how=merge_how)

    # Optionally filter the combined data by UUIDs
    if uuids:
        combined_data = combined_data[combined_data[join_key].isin(uuids)]

    return combined_data
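One consequence of the early returns in the listing above is that the uuids filter only runs after a successful merge. The sketch below reproduces that control flow in a standalone function to make the behavior visible. `combine_like_listing` is a hypothetical name, and it takes already-combined frames instead of calling the private helpers, which are not shown on this page.

```python
import pandas as pd


def combine_like_listing(timeseries_data, metadata, uuids=None,
                         join_key="uuid", merge_how="left"):
    """Standalone copy of the listing's control flow (illustration only)."""
    if timeseries_data.empty:
        return pd.DataFrame()
    if metadata.empty:
        # Early return: the uuids filter below is never reached.
        return timeseries_data
    combined = pd.merge(timeseries_data, metadata, on=join_key, how=merge_how)
    if uuids:
        combined = combined[combined[join_key].isin(uuids)]
    return combined


ts = pd.DataFrame({"uuid": ["a", "b"], "value": [1.0, 2.0]})
meta = pd.DataFrame({"uuid": ["a", "b"], "unit": ["degC", "bar"]})

no_meta = combine_like_listing(ts, pd.DataFrame(), uuids=["a"])
with_meta = combine_like_listing(ts, meta, uuids=["a"])

print(len(no_meta), len(with_meta))
```

If filtering is needed when no metadata is available, the returned frame can be filtered afterwards with `result[result["uuid"].isin(uuids)]`.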