cycle_processor ¤

CycleDataProcessor ¤

CycleDataProcessor(
    cycles_df: DataFrame,
    values_df: DataFrame,
    cycle_uuid_col: str = "cycle_uuid",
    systime_col: str = "systime",
)

Bases: Base

Processes cycle-based data by assigning each row of values to the cycle whose time interval contains it. Cycle assignment uses a pandas IntervalIndex lookup rather than nested loops, which keeps it efficient on large inputs.

Initializes the CycleDataProcessor with cycles and values DataFrames.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| cycles_df | DataFrame | DataFrame containing columns 'cycle_start', 'cycle_end', and 'cycle_uuid'. | required |
| values_df | DataFrame | DataFrame containing the values and timestamps in the 'systime' column. | required |
| cycle_uuid_col | str | Name of the column representing cycle UUIDs. | 'cycle_uuid' |
| systime_col | str | Name of the column representing the timestamps for the values. | 'systime' |

split_by_cycle ¤

split_by_cycle() -> Dict[str, pd.DataFrame]

Splits the values DataFrame by cycles defined in the cycles DataFrame. Uses optimized interval-based assignment.

Returns:

Dictionary where keys are cycle_uuids and values are DataFrames with the corresponding cycle data.
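
As a rough illustration of the returned shape, the sketch below builds a dictionary of per-cycle DataFrames from a frame that already carries a cycle label. It uses plain pandas and is not the library's implementation; the `labelled` frame and its contents are made up for the example.

```python
import pandas as pd

# Hypothetical values that already carry a cycle label.
labelled = pd.DataFrame({
    "systime": pd.to_datetime([
        "2024-01-01 00:01:00",
        "2024-01-01 00:04:00",
        "2024-01-01 00:12:00",
    ]),
    "value_double": [1.0, 2.0, 3.0],
    "cycle_uuid": ["c1", "c1", "c2"],
})

# groupby yields (key, sub-frame) pairs; the comprehension stores them by key.
by_cycle = {
    cycle_uuid: frame.reset_index(drop=True)
    for cycle_uuid, frame in labelled.groupby("cycle_uuid")
}

for cycle_uuid, frame in by_cycle.items():
    print(cycle_uuid, len(frame))
```

Each dictionary value is an ordinary DataFrame, so per-cycle plotting or aggregation can be done by indexing with the cycle UUID.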

merge_dataframes_by_cycle ¤

merge_dataframes_by_cycle() -> pd.DataFrame

Merges the values DataFrame with the cycles DataFrame based on the cycle time intervals. Uses optimized interval-based assignment instead of nested loops.

Returns:

DataFrame with an added 'cycle_uuid' column.
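
The interval-based assignment described above can be sketched with plain pandas. This is a minimal illustration of the technique, not the library's code: the sample data is invented, `closed="both"` is an assumption about how cycle boundaries are treated, and the lookup as written requires the cycles not to overlap.

```python
import pandas as pd

# Hypothetical cycle table with the columns named in the constructor docs.
cycles_df = pd.DataFrame({
    "cycle_uuid": ["c1", "c2"],
    "cycle_start": pd.to_datetime(["2024-01-01 00:00:00", "2024-01-01 00:10:00"]),
    "cycle_end": pd.to_datetime(["2024-01-01 00:05:00", "2024-01-01 00:15:00"]),
})

# Hypothetical values table; the last row falls outside every cycle.
values_df = pd.DataFrame({
    "systime": pd.to_datetime([
        "2024-01-01 00:01:00",
        "2024-01-01 00:04:00",
        "2024-01-01 00:12:00",
        "2024-01-01 00:20:00",
    ]),
    "value_double": [1.0, 2.0, 3.0, 4.0],
})

# One interval per cycle. closed="both" is an assumption of this sketch.
intervals = pd.IntervalIndex.from_arrays(
    cycles_df["cycle_start"], cycles_df["cycle_end"], closed="both"
)

# get_indexer returns the position of the containing interval, or -1 if none.
positions = intervals.get_indexer(values_df["systime"])

# Map positions back to cycle UUIDs; unmatched rows get a missing value.
uuids = cycles_df["cycle_uuid"].to_numpy()
merged = values_df.copy()
merged["cycle_uuid"] = [uuids[p] if p >= 0 else None for p in positions]
print(merged)
```

A single vectorised lookup replaces the row-by-cycle double loop. How the library itself treats rows that match no cycle is not stated here, so the missing value in this sketch is only one possible choice.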

group_by_cycle_uuid ¤

group_by_cycle_uuid(
    data: Optional[DataFrame] = None,
) -> List[pd.DataFrame]

Group the DataFrame by the cycle_uuid column, resulting in a list of DataFrames, each containing data for one cycle.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| data | Optional[DataFrame] | DataFrame containing the data to be grouped by cycle_uuid. If None, uses the internal values_df. | None |

Returns:

List of DataFrames, each containing data for a unique cycle_uuid.

split_dataframes_by_group ¤

split_dataframes_by_group(
    dfs: List[DataFrame], column: str
) -> List[pd.DataFrame]

Splits a list of DataFrames by groups based on a specified column. This function performs a groupby operation on each DataFrame in the list and then flattens the result.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| dfs | List[DataFrame] | List of DataFrames to be split. | required |
| column | str | Column name to group by. | required |

Returns:

List of DataFrames, each corresponding to a group in the original DataFrames.
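
The group-then-flatten behaviour described above might look like the following in plain pandas. `split_frames_by_column` is a hypothetical stand-in written for this example, not the method itself, and the `part` column is invented.

```python
from typing import List

import pandas as pd


def split_frames_by_column(dfs: List[pd.DataFrame], column: str) -> List[pd.DataFrame]:
    """Hypothetical helper: group every frame by `column`, then flatten the groups."""
    return [group for df in dfs for _, group in df.groupby(column)]


# Two per-cycle frames, each with an invented 'part' column to split on.
frames = [
    pd.DataFrame({
        "cycle_uuid": ["c1", "c1"],
        "part": ["a", "b"],
        "value_double": [1.0, 2.0],
    }),
    pd.DataFrame({
        "cycle_uuid": ["c2", "c2", "c2"],
        "part": ["a", "a", "b"],
        "value_double": [3.0, 4.0, 5.0],
    }),
]

pieces = split_frames_by_column(frames, "part")
print([len(piece) for piece in pieces])
```

The output list keeps the order of the input frames, with the groups of each frame following one another.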

compute_cycle_statistics ¤

compute_cycle_statistics() -> pd.DataFrame

Compute statistics for each cycle.

Returns:

| Type | Description |
|---|---|
| DataFrame | DataFrame with cycle-level statistics including duration, value counts, etc. |
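
To give a feel for what cycle-level statistics can look like, here is a small pandas sketch. The column names in the result (`value_count`, `mean_value`, `observed_duration`, and so on) are assumptions chosen for the example; the exact columns the method returns are not listed in this reference. The duration here is the span between the first and last sample seen in each cycle, which may differ from `cycle_end - cycle_start`.

```python
import pandas as pd

# Hypothetical values that already carry a cycle label.
labelled = pd.DataFrame({
    "systime": pd.to_datetime([
        "2024-01-01 00:01:00",
        "2024-01-01 00:04:00",
        "2024-01-01 00:12:00",
        "2024-01-01 00:13:00",
        "2024-01-01 00:14:00",
    ]),
    "value_double": [1.0, 3.0, 2.0, 2.0, 5.0],
    "cycle_uuid": ["c1", "c1", "c2", "c2", "c2"],
})

# Named aggregation: one output column per (input column, function) pair.
stats = labelled.groupby("cycle_uuid").agg(
    first_seen=("systime", "min"),
    last_seen=("systime", "max"),
    value_count=("value_double", "count"),
    mean_value=("value_double", "mean"),
    std_value=("value_double", "std"),
)

# Span covered by the samples of each cycle.
stats["observed_duration"] = stats["last_seen"] - stats["first_seen"]
print(stats)
```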

compare_cycles ¤

compare_cycles(
    reference_cycle_uuid: str, metric: str = "value_double"
) -> pd.DataFrame

Compare all cycles against a reference cycle.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| reference_cycle_uuid | str | UUID of the reference cycle. | required |
| metric | str | Column name to use for comparison. | 'value_double' |

Returns:

| Type | Description |
|---|---|
| DataFrame | DataFrame with comparison metrics for each cycle. |
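
One way to picture a comparison against a reference cycle is sketched below. The comparison metrics used here (`mean_diff`, `std_ratio`) are assumptions made for illustration; this reference does not say which metrics the method actually reports.

```python
import pandas as pd

# Hypothetical values that already carry a cycle label.
labelled = pd.DataFrame({
    "cycle_uuid": ["c1", "c1", "c2", "c2", "c3", "c3"],
    "value_double": [1.0, 3.0, 2.0, 6.0, 2.0, 2.0],
})

reference_cycle_uuid = "c1"
metric = "value_double"

# Summarise the metric per cycle, then pick out the reference row.
per_cycle = labelled.groupby("cycle_uuid")[metric].agg(["mean", "std"])
reference = per_cycle.loc[reference_cycle_uuid]

# Express every cycle relative to the reference (illustrative metrics only).
comparison = per_cycle.assign(
    mean_diff=per_cycle["mean"] - reference["mean"],
    std_ratio=per_cycle["std"] / reference["std"],
)
print(comparison)
```

The reference cycle appears in the result too, with a difference of zero and a ratio of one, which makes a convenient sanity check.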

identify_golden_cycles ¤

identify_golden_cycles(
    metric: str = "value_double",
    method: str = "low_variability",
    top_n: int = 5,
) -> List[str]

Identify the best performing cycles (golden cycles).

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| metric | str | Column name to evaluate. | 'value_double' |
| method | str | Method for identification ('low_variability', 'high_mean', 'target_value'). | 'low_variability' |
| top_n | int | Number of golden cycles to identify. | 5 |

Returns:

| Type | Description |
|---|---|
| List[str] | List of cycle UUIDs identified as golden cycles. |
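
As a sketch of what a 'low_variability' selection could mean, the example below ranks cycles by the standard deviation of the metric and keeps the `top_n` smallest. This interpretation is an assumption, and `rank_low_variability` is a hypothetical helper, not part of the class.

```python
from typing import List

import pandas as pd


def rank_low_variability(
    labelled: pd.DataFrame, metric: str = "value_double", top_n: int = 5
) -> List[str]:
    """Hypothetical reading of 'low_variability': smallest per-cycle standard deviation."""
    spread = labelled.groupby("cycle_uuid")[metric].std()
    return spread.nsmallest(top_n).index.tolist()


# Hypothetical values that already carry a cycle label.
labelled = pd.DataFrame({
    "cycle_uuid": ["c1", "c1", "c2", "c2", "c3", "c3"],
    "value_double": [1.0, 3.0, 2.0, 6.0, 2.0, 2.5],
})

golden = rank_low_variability(labelled, top_n=2)
print(golden)
```

A 'high_mean' variant could follow the same pattern with `mean()` and `nlargest`, again as an assumption about what that method name implies.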