segment_processor
SegmentProcessor
SegmentProcessor(
dataframe: DataFrame, column_name: str = "systime"
)
Bases: Base
Apply extracted time ranges to process data and compute metric profiles.
Takes the output of SegmentExtractor.extract_time_ranges and uses it to filter and annotate process parameter data, then computes statistical metrics per UUID per segment.
Methods:
- apply_ranges: Filter process data by time ranges and annotate it with segment info.
- compute_metric_profiles: Compute statistical metrics per UUID per segment.
apply_ranges (classmethod)
apply_ranges(
dataframe: DataFrame,
time_ranges: DataFrame,
uuid_column: str = "uuid",
time_column: str = "systime",
target_uuids: Optional[List[str]] = None,
) -> pd.DataFrame
Filter process parameter data by extracted time ranges.
For each time range (segment), selects rows from the main DataFrame that fall within [segment_start, segment_end] and annotates them with the segment value and index.
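The filtering step described above can be sketched with plain pandas. This is a minimal illustration, not the library's implementation: the helper name `apply_ranges_sketch` is hypothetical, and the `time_ranges` column names (`segment_start`, `segment_end`, `segment_value`, `segment_index`) are assumptions inferred from the description, so the actual output of SegmentExtractor.extract_time_ranges may differ.

```python
import pandas as pd


def apply_ranges_sketch(dataframe, time_ranges, uuid_column="uuid",
                        time_column="systime", target_uuids=None):
    """Hypothetical stand-in that mimics the documented behaviour."""
    # Optionally restrict to a subset of UUIDs; None keeps all of them.
    if target_uuids is not None:
        dataframe = dataframe[dataframe[uuid_column].isin(target_uuids)]

    parts = []
    for _, seg in time_ranges.iterrows():
        # Inclusive on both ends: [segment_start, segment_end].
        in_range = dataframe[time_column].between(
            seg["segment_start"], seg["segment_end"]
        )
        part = dataframe.loc[in_range].copy()
        # Annotate the selected rows with the segment they belong to.
        part["segment_value"] = seg["segment_value"]
        part["segment_index"] = seg["segment_index"]
        parts.append(part)

    if not parts:
        return dataframe.iloc[0:0].assign(segment_value=None, segment_index=None)
    return pd.concat(parts, ignore_index=True)


# Made-up process data: two UUIDs sampled at three timestamps each.
process = pd.DataFrame({
    "uuid": ["temp", "temp", "temp", "pressure", "pressure", "pressure"],
    "systime": pd.to_datetime(
        ["2024-01-01 00:00", "2024-01-01 00:10", "2024-01-01 00:20"] * 2
    ),
    "value_double": [20.0, 21.0, 22.0, 1.0, 1.1, 1.2],
})

# Made-up time ranges in the assumed layout.
time_ranges = pd.DataFrame({
    "segment_start": pd.to_datetime(["2024-01-01 00:00", "2024-01-01 00:15"]),
    "segment_end": pd.to_datetime(["2024-01-01 00:10", "2024-01-01 00:30"]),
    "segment_value": ["order_A", "order_B"],
    "segment_index": [0, 1],
})

annotated = apply_ranges_sketch(process, time_ranges)
temp_only = apply_ranges_sketch(process, time_ranges, target_uuids=["temp"])
```

Because the interval is closed on both ends, a row whose timestamp equals `segment_end` is still assigned to that segment. If two ranges overlapped, this sketch would return the shared rows once per range; whether the library does the same is not stated in the source.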
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataframe | DataFrame | Input DataFrame with process parameter data (all UUIDs). | required |
| time_ranges | DataFrame | Output from SegmentExtractor.extract_time_ranges. | required |
| uuid_column | str | Column identifying each timeseries. | 'uuid' |
| time_column | str | Column containing timestamps. | 'systime' |
| target_uuids | Optional[List[str]] | Optional list of UUIDs to include. None keeps all. | None |
Returns:
| Type | Description |
|---|---|
| DataFrame | Input DataFrame filtered to the time ranges, with added columns for the segment value and segment index. |
compute_metric_profiles (classmethod)
compute_metric_profiles(
dataframe: DataFrame,
uuid_column: str = "uuid",
value_column: str = "value_double",
group_column: str = "segment_value",
metrics: Optional[List[str]] = None,
) -> pd.DataFrame
Compute statistical metrics per UUID per segment.
Typically called on the output of apply_ranges. Computes metrics per (UUID, segment) pair using NumericStatistics.
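The per-(UUID, segment) aggregation can be approximated with a pandas groupby. This is only a sketch under stated assumptions: the helper name `metric_profiles_sketch` is hypothetical, the four pandas aggregations stand in for the NumericStatistics metrics (whose names are not listed here), and the wide output layout is a guess rather than the library's documented format.

```python
import pandas as pd


def metric_profiles_sketch(dataframe, uuid_column="uuid",
                           value_column="value_double",
                           group_column="segment_value", metrics=None):
    """Hypothetical stand-in: one row per (UUID, segment) pair."""
    # Assumed default set; the real method is documented to offer 19 metrics.
    metrics = metrics or ["min", "max", "mean", "std"]
    return (
        dataframe.groupby([uuid_column, group_column])[value_column]
        .agg(metrics)
        .reset_index()
    )


# Made-up data shaped like the annotated output of apply_ranges.
annotated = pd.DataFrame({
    "uuid": ["temp", "temp", "temp", "temp", "pressure", "pressure"],
    "segment_value": ["order_A", "order_A", "order_B", "order_B",
                      "order_A", "order_B"],
    "value_double": [20.0, 22.0, 30.0, 34.0, 1.0, 1.2],
})

profiles = metric_profiles_sketch(annotated, metrics=["min", "max", "mean"])
```

Each combination of UUID and segment yields exactly one row, so the result has as many rows as there are distinct pairs in the input.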
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataframe | DataFrame | Input DataFrame (output of apply_ranges or similar). | required |
| uuid_column | str | Column identifying each timeseries. | 'uuid' |
| value_column | str | Column containing numeric values. | 'value_double' |
| group_column | str | Column to group segments by (e.g. 'segment_value', 'segment_index'). Use 'segment_value' to aggregate all ranges of the same order, or 'segment_index' for individual ranges. | 'segment_value' |
| metrics | Optional[List[str]] | Subset of metric names to compute. None uses all 19. | None |
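The effect of the group_column choice is easiest to see when the same order appears in more than one time range. The following sketch uses plain pandas with made-up data and a single mean metric; it illustrates the grouping difference only and does not call the library.

```python
import pandas as pd

# Made-up annotated data: order_A occurs in two separate ranges (index 0 and 2).
annotated = pd.DataFrame({
    "uuid": ["temp"] * 4,
    "segment_value": ["order_A", "order_A", "order_B", "order_A"],
    "segment_index": [0, 0, 1, 2],
    "value_double": [10.0, 12.0, 20.0, 14.0],
})

# Grouping by segment_value pools every range that shares the same order.
by_value = (
    annotated.groupby(["uuid", "segment_value"])["value_double"]
    .mean()
    .reset_index()
)

# Grouping by segment_index keeps each individual range separate.
by_index = (
    annotated.groupby(["uuid", "segment_index"])["value_double"]
    .mean()
    .reset_index()
)
```

Here `by_value` has two rows (order_A averages all three of its samples), while `by_index` has three rows, one per range.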
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with columns [uuid, … |