segment_processor

SegmentProcessor

SegmentProcessor(
    dataframe: DataFrame, column_name: str = "systime"
)

Bases: Base

Apply extracted time ranges to process data and compute metric profiles.

Takes the output of SegmentExtractor.extract_time_ranges and uses it to filter and annotate process parameter data, then computes statistical metrics per UUID per segment.

Methods:

- apply_ranges: Filter process data by time ranges and annotate it with segment info.
- compute_metric_profiles: Compute statistical metrics per UUID per segment.

apply_ranges (classmethod)

apply_ranges(
    dataframe: DataFrame,
    time_ranges: DataFrame,
    uuid_column: str = "uuid",
    time_column: str = "systime",
    target_uuids: Optional[List[str]] = None,
) -> pd.DataFrame

Filter process parameter data by extracted time ranges.

For each time range (segment), selects the rows of the input DataFrame whose timestamps fall within the closed interval [segment_start, segment_end] and annotates them with the segment value and segment index.
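The filtering and annotation step can be approximated in plain pandas. This is a minimal sketch, not the library's implementation: the helper name `filter_by_ranges` is hypothetical, and it assumes `time_ranges` carries `segment_start`, `segment_end`, `segment_value`, and `segment_index` columns, a layout the source does not spell out in full.

```python
import pandas as pd


def filter_by_ranges(df, time_ranges, time_column="systime"):
    """Hypothetical helper: keep rows inside each inclusive time range."""
    parts = []
    for rng in time_ranges.itertuples(index=False):
        # between() is inclusive on both ends, matching [segment_start, segment_end].
        mask = df[time_column].between(rng.segment_start, rng.segment_end)
        part = df.loc[mask].copy()
        # Annotate every selected row with the segment it belongs to.
        part["segment_value"] = rng.segment_value
        part["segment_index"] = rng.segment_index
        parts.append(part)
    if not parts:
        # No ranges at all: return an empty frame that still has the new columns.
        empty = df.iloc[0:0].copy()
        empty["segment_value"] = pd.Series(dtype="object")
        empty["segment_index"] = pd.Series(dtype="int64")
        return empty
    return pd.concat(parts, ignore_index=True)


# Made-up process data for one UUID, sampled every ten minutes.
process = pd.DataFrame({
    "uuid": ["temp", "temp", "temp", "temp"],
    "systime": pd.to_datetime([
        "2024-01-01 00:00", "2024-01-01 00:10",
        "2024-01-01 00:20", "2024-01-01 00:30",
    ]),
    "value_double": [1.0, 2.0, 3.0, 4.0],
})

# Made-up time ranges (column names are an assumption, see above).
ranges = pd.DataFrame({
    "segment_start": pd.to_datetime(["2024-01-01 00:00", "2024-01-01 00:20"]),
    "segment_end": pd.to_datetime(["2024-01-01 00:10", "2024-01-01 00:25"]),
    "segment_value": ["order_A", "order_B"],
    "segment_index": [0, 1],
})

annotated = filter_by_ranges(process, ranges)
print(annotated)
```

The row at 00:30 lies outside both ranges and is dropped; the remaining rows carry the value and index of the range that contains them.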

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `dataframe` | `DataFrame` | Input DataFrame with process parameter data (all UUIDs). | required |
| `time_ranges` | `DataFrame` | Output from SegmentExtractor.extract_time_ranges. | required |
| `uuid_column` | `str` | Column identifying each timeseries. | `'uuid'` |
| `time_column` | `str` | Column containing timestamps. | `'systime'` |
| `target_uuids` | `Optional[List[str]]` | Optional list of UUIDs to include. `None` keeps all. | `None` |
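The `target_uuids` semantics (a list restricts the result, `None` keeps everything) can be sketched with `Series.isin`. The helper `select_uuids` is hypothetical and only illustrates the documented behavior; it is not taken from the library.

```python
import pandas as pd


def select_uuids(df, uuid_column="uuid", target_uuids=None):
    """Hypothetical helper: restrict rows to the requested UUIDs."""
    if target_uuids is None:
        # None means "no restriction": every UUID is kept.
        return df
    return df[df[uuid_column].isin(target_uuids)]


# Made-up data with three timeseries.
data = pd.DataFrame({
    "uuid": ["temp", "pressure", "speed"],
    "value_double": [1.0, 2.0, 3.0],
})

subset = select_uuids(data, target_uuids=["temp", "speed"])
everything = select_uuids(data)
```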

Returns:

| Type | Description |
| --- | --- |
| `DataFrame` | Input DataFrame filtered to the time ranges, with two added columns: `segment_value` (the active order/part number for each row) and `segment_index` (the sequential segment index). |

compute_metric_profiles (classmethod)

compute_metric_profiles(
    dataframe: DataFrame,
    uuid_column: str = "uuid",
    value_column: str = "value_double",
    group_column: str = "segment_value",
    metrics: Optional[List[str]] = None,
) -> pd.DataFrame

Compute statistical metrics per UUID per segment.

Typically called on the output of apply_ranges. Computes metrics per (UUID, segment) pair using NumericStatistics.
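As a rough illustration of per-(UUID, segment) profiling, the sketch below uses a plain pandas `groupby` with a few common statistics. It does not call NumericStatistics, whose interface is not shown here; the helper name `profile_segments` and the metric selection (`mean`, `std`, `min`, `max`) are assumptions made for the example.

```python
import pandas as pd


def profile_segments(df, uuid_column="uuid", value_column="value_double",
                     group_column="segment_value"):
    """Hypothetical helper: a few summary statistics per (UUID, segment)."""
    grouped = df.groupby([uuid_column, group_column])[value_column]
    # Named aggregation: each keyword becomes an output column.
    profile = grouped.agg(
        sample_count="count",
        mean="mean",
        std="std",
        min="min",
        max="max",
    )
    return profile.reset_index()


# Made-up annotated data, shaped like the output described for apply_ranges.
annotated = pd.DataFrame({
    "uuid": ["temp"] * 4,
    "value_double": [1.0, 3.0, 10.0, 20.0],
    "segment_value": ["order_A", "order_A", "order_B", "order_B"],
})

profile = profile_segments(annotated)
print(profile)
```

Each output row corresponds to one (UUID, segment) pair, with the sample count followed by one column per metric.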

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `dataframe` | `DataFrame` | Input DataFrame (output of apply_ranges or similar). | required |
| `uuid_column` | `str` | Column identifying each timeseries. | `'uuid'` |
| `value_column` | `str` | Column containing numeric values. | `'value_double'` |
| `group_column` | `str` | Column to group segments by (e.g. `'segment_value'`, `'segment_index'`). Use `'segment_value'` to aggregate all ranges of the same order, or `'segment_index'` to treat each range individually. | `'segment_value'` |
| `metrics` | `Optional[List[str]]` | Subset of metric names to compute. `None` uses all 19. | `None` |
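The difference between the two `group_column` choices shows up when an order runs more than once. In the made-up data below, `order_A` occupies two separate ranges (segment indices 0 and 2): grouping by `segment_value` pools them, while grouping by `segment_index` keeps them apart. Plain pandas stands in for the library call here, and only the mean is computed to keep the comparison short.

```python
import pandas as pd

# Made-up annotated data: order_A appears in two non-adjacent ranges.
annotated = pd.DataFrame({
    "uuid": ["temp"] * 6,
    "value_double": [1.0, 3.0, 10.0, 20.0, 5.0, 7.0],
    "segment_value": ["order_A", "order_A", "order_B", "order_B", "order_A", "order_A"],
    "segment_index": [0, 0, 1, 1, 2, 2],
})

# Pool every range that belongs to the same order.
by_order = (
    annotated.groupby(["uuid", "segment_value"])["value_double"]
    .mean()
    .reset_index()
)

# Keep each individual range separate.
by_range = (
    annotated.groupby(["uuid", "segment_index"])["value_double"]
    .mean()
    .reset_index()
)

print(by_order)
print(by_range)
```

Grouping by order yields two rows, one per order; grouping by index yields three rows, one per range.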

Returns:

| Type | Description |
| --- | --- |
| `DataFrame` | DataFrame with columns `[uuid, <group_column>, sample_count, metric_1, ...]`. |