ts_shape.features.cycles.cycle_processor

Classes:

CycleDataProcessor – A class to process cycle-based data and values. It splits, merges, and groups DataFrames based on cycles.

CycleDataProcessor

CycleDataProcessor(cycles_df: DataFrame, values_df: DataFrame, cycle_uuid_col: str = 'cycle_uuid', systime_col: str = 'systime')

Bases: Base

A class to process cycle-based data and values. It splits and merges DataFrames by cycle time interval and groups them by cycle UUID or by another column.

Parameters:

cycles_df (DataFrame) – DataFrame containing the columns 'cycle_start' and 'cycle_end', plus the cycle UUID column named by cycle_uuid_col ('cycle_uuid' by default).

values_df (DataFrame) – DataFrame containing the values and their timestamps in the column named by systime_col ('systime' by default).

cycle_uuid_col (str, default: 'cycle_uuid') – Name of the column representing cycle UUIDs.

systime_col (str, default: 'systime') – Name of the column representing the timestamps for the values.

Methods:

get_dataframe – Returns the processed DataFrame.

group_by_cycle_uuid – Groups the DataFrame by the cycle_uuid column, resulting in a list of DataFrames, each containing data for one cycle.

merge_dataframes_by_cycle – Merges the values DataFrame with the cycles DataFrame based on the cycle time intervals.

split_by_cycle – Splits the values DataFrame by cycles defined in the cycles DataFrame.

split_dataframes_by_group – Splits a list of DataFrames by groups based on a specified column.

Source code in src/ts_shape/features/cycles/cycle_processor.py

def __init__(self, cycles_df: pd.DataFrame, values_df: pd.DataFrame, cycle_uuid_col: str = "cycle_uuid", systime_col: str = "systime"):
    """
    Initializes the CycleDataProcessor with cycles and values DataFrames.

    Args:
        cycles_df: DataFrame containing columns 'cycle_start', 'cycle_end', and 'cycle_uuid'.
        values_df: DataFrame containing the values and timestamps in the 'systime' column.
        cycle_uuid_col: Name of the column representing cycle UUIDs.
        systime_col: Name of the column representing the timestamps for the values.
    """
    super().__init__(values_df)  # Call the parent constructor
    self.values_df = values_df.copy()  # Initialize self.values_df explicitly
    self.cycles_df = cycles_df.copy()
    self.cycle_uuid_col = cycle_uuid_col
    self.systime_col = systime_col

    # Ensure proper datetime format
    self.cycles_df['cycle_start'] = pd.to_datetime(self.cycles_df['cycle_start'])
    self.cycles_df['cycle_end'] = pd.to_datetime(self.cycles_df['cycle_end'])
    self.values_df[systime_col] = pd.to_datetime(self.values_df[systime_col])

    logging.info("CycleDataProcessor initialized with cycles and values DataFrames.")
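
The input shape the constructor expects can be sketched as follows. This is a minimal illustration using plain pandas, not the class itself; the sample timestamps, values, and UUID strings are made up. It applies the same `pd.to_datetime` normalization to copies of the inputs that the listing above performs.

```python
import pandas as pd

# Hypothetical sample data; timestamps arrive as strings here.
cycles_df = pd.DataFrame({
    "cycle_start": ["2024-01-01 00:00:00", "2024-01-01 00:10:00"],
    "cycle_end": ["2024-01-01 00:05:00", "2024-01-01 00:15:00"],
    "cycle_uuid": ["cycle-a", "cycle-b"],
})
values_df = pd.DataFrame({
    "systime": ["2024-01-01 00:01:00", "2024-01-01 00:07:00", "2024-01-01 00:12:00"],
    "value": [1.0, 2.0, 3.0],
})

# Same normalization as in __init__: copy first, then parse to datetime.
cycles = cycles_df.copy()
values = values_df.copy()
cycles["cycle_start"] = pd.to_datetime(cycles["cycle_start"])
cycles["cycle_end"] = pd.to_datetime(cycles["cycle_end"])
values["systime"] = pd.to_datetime(values["systime"])
```

With ts_shape installed, frames like these would presumably be passed as `CycleDataProcessor(cycles_df, values_df)`. Per the listing, the conversions are applied to the copies stored in `self.cycles_df` and `self.values_df`.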

get_dataframe

get_dataframe() -> DataFrame

Returns the processed DataFrame.

Source code in src/ts_shape/utils/base.py

def get_dataframe(self) -> pd.DataFrame:
    """Returns the processed DataFrame."""
    return self.dataframe

group_by_cycle_uuid

group_by_cycle_uuid(data: Optional[DataFrame] = None) -> List[DataFrame]

Group the DataFrame by the cycle_uuid column, resulting in a list of DataFrames, each containing data for one cycle.

Parameters:

data (Optional[DataFrame], default: None) – DataFrame containing the data to be grouped by cycle_uuid. If None, uses the internal values_df.

Returns:

List of DataFrames, each containing data for a unique cycle_uuid.

Source code in src/ts_shape/features/cycles/cycle_processor.py

def group_by_cycle_uuid(self, data: Optional[pd.DataFrame] = None) -> List[pd.DataFrame]:
    """
    Group the DataFrame by the cycle_uuid column, resulting in a list of DataFrames, each containing data for one cycle.

    Args:
        data: DataFrame containing the data to be grouped by cycle_uuid. If None, uses the internal values_df.

    Return:
        List of DataFrames, each containing data for a unique cycle_uuid.
    """
    if data is None:
        data = self.values_df

    grouped_dataframes = [group for _, group in data.groupby(self.cycle_uuid_col)]
    logging.info(f"Grouped data into {len(grouped_dataframes)} cycle UUID groups.")
    return grouped_dataframes
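
The grouping step can be tried in isolation with plain pandas. The sketch below uses the same list comprehension as the listing on a made-up frame that already carries a `cycle_uuid` column; the data is hypothetical.

```python
import pandas as pd

# Hypothetical values frame that already has a cycle_uuid column.
values = pd.DataFrame({
    "systime": pd.to_datetime([
        "2024-01-01 00:01:00",
        "2024-01-01 00:02:00",
        "2024-01-01 00:12:00",
    ]),
    "value": [1.0, 2.0, 3.0],
    "cycle_uuid": ["cycle-a", "cycle-a", "cycle-b"],
})

# Same expression as in the listing: one DataFrame per distinct key,
# in sorted key order (pandas groupby sorts keys by default).
groups = [group for _, group in values.groupby("cycle_uuid")]

for group in groups:
    print(group["cycle_uuid"].iloc[0], len(group))
```

Two points follow from the listing. When `data` is None, the internal `values_df` must already contain the cycle UUID column, either because it was supplied with one or because `merge_dataframes_by_cycle` was called first; otherwise `groupby` raises a `KeyError`. Also, pandas drops rows with a missing group key by default, so values without an assigned cycle do not appear in any group.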

merge_dataframes_by_cycle

merge_dataframes_by_cycle() -> DataFrame

Merges the values DataFrame with the cycles DataFrame based on the cycle time intervals. Appends the 'cycle_uuid' to the values DataFrame.

Returns:

DataFrame with an added 'cycle_uuid' column. Rows whose timestamp falls within no cycle interval are dropped.

Source code in src/ts_shape/features/cycles/cycle_processor.py

def merge_dataframes_by_cycle(self) -> pd.DataFrame:
    """
    Merges the values DataFrame with the cycles DataFrame based on the cycle time intervals. 
    Appends the 'cycle_uuid' to the values DataFrame.

    Return:
        DataFrame with an added 'cycle_uuid' column.
    """
    # Merge based on systime falling within cycle_start and cycle_end
    self.values_df[self.cycle_uuid_col] = None

    for _, row in self.cycles_df.iterrows():
        mask = (self.values_df[self.systime_col] >= row['cycle_start']) & (self.values_df[self.systime_col] <= row['cycle_end'])
        self.values_df.loc[mask, self.cycle_uuid_col] = row[self.cycle_uuid_col]

    merged_df = self.values_df.dropna(subset=[self.cycle_uuid_col])
    logging.info(f"Merged DataFrame contains {len(merged_df)} records.")
    return merged_df
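
The interval assignment above can be sketched as a standalone function to make its edge cases visible. `assign_cycle_uuid` is a hypothetical name, not part of ts_shape, and the sample data is made up; the loop itself mirrors the listing. Because both bounds are inclusive and cycles are processed in row order, a timestamp on a shared boundary ends up with the UUID of the later cycle.

```python
import pandas as pd


def assign_cycle_uuid(values, cycles, systime_col="systime", cycle_uuid_col="cycle_uuid"):
    """Hypothetical standalone version of the loop shown in the listing."""
    out = values.copy()
    out[cycle_uuid_col] = None
    for _, row in cycles.iterrows():
        # Inclusive on both ends, as in the listing.
        mask = (out[systime_col] >= row["cycle_start"]) & (out[systime_col] <= row["cycle_end"])
        out.loc[mask, cycle_uuid_col] = row[cycle_uuid_col]
    # Values that matched no cycle are removed.
    return out.dropna(subset=[cycle_uuid_col])


# Two adjacent cycles sharing the boundary 00:05.
cycles = pd.DataFrame({
    "cycle_start": pd.to_datetime(["2024-01-01 00:00:00", "2024-01-01 00:05:00"]),
    "cycle_end": pd.to_datetime(["2024-01-01 00:05:00", "2024-01-01 00:10:00"]),
    "cycle_uuid": ["cycle-a", "cycle-b"],
})
values = pd.DataFrame({
    "systime": pd.to_datetime([
        "2024-01-01 00:01:00",
        "2024-01-01 00:05:00",  # on the shared boundary
        "2024-01-01 00:07:00",
        "2024-01-01 00:20:00",  # outside every cycle
    ]),
    "value": [1.0, 2.0, 3.0, 4.0],
})

merged = assign_cycle_uuid(values, cycles)
print(merged)
```

In this example the 00:05 sample is tagged `cycle-b` and the 00:20 sample is dropped. Note that the method in the listing writes the new column into `self.values_df` in place, whereas this sketch works on a copy.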

split_by_cycle

split_by_cycle() -> Dict[str, DataFrame]

Splits the values DataFrame by cycles defined in the cycles DataFrame. Each cycle is defined by a start and end time, and the corresponding values are filtered accordingly.

Returns:

Dictionary where keys are cycle_uuids and values are DataFrames with the corresponding cycle data.

Source code in src/ts_shape/features/cycles/cycle_processor.py

def split_by_cycle(self) -> Dict[str, pd.DataFrame]:
    """
    Splits the values DataFrame by cycles defined in the cycles DataFrame. 
    Each cycle is defined by a start and end time, and the corresponding values are filtered accordingly.

    Return:
        Dictionary where keys are cycle_uuids and values are DataFrames with the corresponding cycle data.
    """
    result = {}
    for _, row in self.cycles_df.iterrows():
        mask = (self.values_df[self.systime_col] >= row['cycle_start']) & (self.values_df[self.systime_col] <= row['cycle_end'])
        result[row[self.cycle_uuid_col]] = self.values_df[mask].copy()

    logging.info(f"Split {len(result)} cycles.")
    return result
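
A standalone sketch of the same split, with made-up data and a hypothetical function name (`split_values_by_cycle` is not part of ts_shape), shows how the result dictionary behaves. Each cycle is filtered independently here, so a sample on a shared boundary appears in both cycles, and a cycle without any samples still gets a key with an empty DataFrame.

```python
import pandas as pd


def split_values_by_cycle(values, cycles, systime_col="systime", cycle_uuid_col="cycle_uuid"):
    """Hypothetical standalone version of the loop shown in the listing."""
    result = {}
    for _, row in cycles.iterrows():
        # Inclusive on both ends, as in the listing.
        mask = (values[systime_col] >= row["cycle_start"]) & (values[systime_col] <= row["cycle_end"])
        result[row[cycle_uuid_col]] = values[mask].copy()
    return result


cycles = pd.DataFrame({
    "cycle_start": pd.to_datetime(
        ["2024-01-01 00:00:00", "2024-01-01 00:05:00", "2024-01-01 00:30:00"]
    ),
    "cycle_end": pd.to_datetime(
        ["2024-01-01 00:05:00", "2024-01-01 00:10:00", "2024-01-01 00:35:00"]
    ),
    "cycle_uuid": ["cycle-a", "cycle-b", "cycle-c"],
})
values = pd.DataFrame({
    "systime": pd.to_datetime([
        "2024-01-01 00:01:00",
        "2024-01-01 00:05:00",  # on the boundary between cycle-a and cycle-b
        "2024-01-01 00:07:00",
    ]),
    "value": [1.0, 2.0, 3.0],
})

parts = split_values_by_cycle(values, cycles)
for uuid, frame in parts.items():
    print(uuid, len(frame))
```

Since the dictionary is keyed by UUID, two rows in the cycles frame with the same UUID would leave only the later cycle's data under that key.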

split_dataframes_by_group

split_dataframes_by_group(dfs: List[DataFrame], column: str) -> List[DataFrame]

Splits a list of DataFrames by groups based on a specified column. This function performs a groupby operation on each DataFrame in the list and then flattens the result.

Parameters:

dfs (List[DataFrame]) – List of DataFrames to be split.

column (str) – Column name to group by.

Returns:

List of DataFrames, each corresponding to a group in the original DataFrames.

Source code in src/ts_shape/features/cycles/cycle_processor.py

def split_dataframes_by_group(self, dfs: List[pd.DataFrame], column: str) -> List[pd.DataFrame]:
    """
    Splits a list of DataFrames by groups based on a specified column. 
    This function performs a groupby operation on each DataFrame in the list and then flattens the result.

    Args:
        dfs: List of DataFrames to be split.
        column: Column name to group by.

    Return:
        List of DataFrames, each corresponding to a group in the original DataFrames.
    """
    split_dfs = []
    for df in dfs:
        groups = df.groupby(column)
        for _, group in groups:
            split_dfs.append(group)

    logging.info(f"Split data into {len(split_dfs)} groups based on column '{column}'.")
    return split_dfs
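
The flattening behavior can be illustrated with plain pandas. In this sketch the `sensor` column and all values are hypothetical; the nested loop is the one from the listing. Each input frame is grouped separately, and the groups are appended in order, so the output keeps the order of the input list and, within each frame, the sorted order of the group keys.

```python
import pandas as pd

# Hypothetical per-cycle frames, e.g. the output of a grouping by cycle UUID.
frames = [
    pd.DataFrame({
        "cycle_uuid": ["cycle-a"] * 3,
        "sensor": ["temp", "temp", "pressure"],
        "value": [1.0, 2.0, 3.0],
    }),
    pd.DataFrame({
        "cycle_uuid": ["cycle-b"] * 2,
        "sensor": ["pressure", "temp"],
        "value": [4.0, 5.0],
    }),
]

# Same nested loop as in the listing: group each frame, then flatten.
split = []
for df in frames:
    for _, group in df.groupby("sensor"):
        split.append(group)

labels = [(g["cycle_uuid"].iloc[0], g["sensor"].iloc[0]) for g in split]
print(labels)
```

The result is a flat list, so the link between a group and its source frame is only recoverable from the data itself, here through the `cycle_uuid` column.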
