time_windowed_features
TimeWindowedFeatureTable
TimeWindowedFeatureTable(
dataframe: DataFrame, column_name: str = "systime"
)
Bases: Base
Build ML-ready feature tables from segmented timeseries data.
Takes the output of SegmentProcessor.apply_ranges and computes statistical metrics per UUID within fixed-size time windows (e.g. every 1 minute).
Methods:
- compute_long: One row per (time_window, uuid, segment). Long format.
- compute: One row per time_window with columns {uuid}__{metric}. Wide format.
compute_long (classmethod)
compute_long(
dataframe: DataFrame,
freq: str = "1min",
time_column: str = "systime",
uuid_column: str = "uuid",
value_column: str = "value_double",
segment_column: Optional[str] = "segment_value",
metrics: Optional[List[str]] = None,
) -> pd.DataFrame
Compute statistical metrics per UUID per time window (long format).
Groups the input data into fixed-size time windows using freq and
computes numeric statistics for each (time_window, uuid, segment) group.
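To make the effect of freq concrete, here is a minimal pandas sketch of how a frequency string can map timestamps onto fixed-size windows. It assumes windows are labelled by their left edge (via flooring), which is one plausible convention and not necessarily how this class assigns them internally; the sample timestamps are invented for illustration.

```python
import pandas as pd

# Hypothetical sample timestamps, chosen only for illustration.
timestamps = pd.Series(pd.to_datetime([
    "2024-01-01 00:00:05",
    "2024-01-01 00:00:35",
    "2024-01-01 00:04:59",
    "2024-01-01 00:05:00",
]))

# Flooring to the frequency labels each timestamp with the start of its window.
# Assumption: left-edge labelling; the library may label windows differently.
windows_30s = timestamps.dt.floor("30s")
windows_5min = timestamps.dt.floor("5min")

print(windows_30s.nunique(), "distinct 30-second windows")
print(windows_5min.nunique(), "distinct 5-minute windows")
```

A smaller freq yields more windows with fewer samples each, so per-window statistics become noisier as the window shrinks.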
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataframe | DataFrame | Input DataFrame (typically output of SegmentProcessor.apply_ranges). | required |
| freq | str | Pandas frequency string for the time window size (e.g. '1min', '30s', '5min'). | '1min' |
| time_column | str | Column containing timestamps. | 'systime' |
| uuid_column | str | Column identifying each timeseries signal. | 'uuid' |
| value_column | str | Column containing numeric values. | 'value_double' |
| segment_column | Optional[str] | Column to sub-group by (e.g. 'segment_value'). Set to None to ignore segments entirely. | 'segment_value' |
| metrics | Optional[List[str]] | Subset of metric names to compute. None uses all 19. | None |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with columns [time_window, uuid, segment_value?, sample_count, metric_1, ...]. |
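The long-format result described above can be approximated with plain pandas. The sketch below is not the library's implementation: it uses the default column names from the signature, an invented toy dataset, left-edge window labels (an assumption), and only three illustrative metrics (mean, min, max) rather than the full set of 19.

```python
import pandas as pd

# Toy input using the default column names from the signature above.
df = pd.DataFrame({
    "systime": pd.to_datetime([
        "2024-01-01 00:00:05",
        "2024-01-01 00:00:35",
        "2024-01-01 00:01:10",
        "2024-01-01 00:00:20",
    ]),
    "uuid": ["temperature", "temperature", "temperature", "pressure"],
    "value_double": [20.0, 22.0, 30.0, 1.0],
    "segment_value": ["A", "A", "A", "A"],
})

# Assign each row to a 1-minute window (assumption: left-edge labels).
df["time_window"] = df["systime"].dt.floor("1min")

# One row per (time_window, uuid, segment) with a sample count and a few metrics.
long_table = (
    df.groupby(["time_window", "uuid", "segment_value"])["value_double"]
    .agg(sample_count="count", mean="mean", min="min", max="max")
    .reset_index()
)

print(long_table)
```

Dropping "segment_value" from the group keys corresponds to the behaviour documented for segment_column=None, where segments are ignored entirely.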
compute (classmethod)
compute(
dataframe: DataFrame,
freq: str = "1min",
time_column: str = "systime",
uuid_column: str = "uuid",
value_column: str = "value_double",
segment_column: Optional[str] = "segment_value",
metrics: Optional[List[str]] = None,
column_separator: str = "__",
) -> pd.DataFrame
Compute a wide-format feature table with one row per time window.
Each column is named {uuid}{separator}{metric} (e.g.
temperature__mean). Windows where a UUID has insufficient data
are filled with NaN.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataframe | DataFrame | Input DataFrame (typically output of SegmentProcessor.apply_ranges). | required |
| freq | str | Pandas frequency string for the time window size. | '1min' |
| time_column | str | Column containing timestamps. | 'systime' |
| uuid_column | str | Column identifying each timeseries signal. | 'uuid' |
| value_column | str | Column containing numeric values. | 'value_double' |
| segment_column | Optional[str] | Column to sub-group by. Set to None to ignore segments. | 'segment_value' |
| metrics | Optional[List[str]] | Subset of metric names to compute. None uses all 19. | None |
| column_separator | str | Separator between uuid and metric in wide column names. | '__' |
Returns:
| Type | Description |
|---|---|
| DataFrame | DataFrame with columns [time_window, segment_value?, {uuid}__{metric}, ...]. |
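The wide layout can be pictured as a pivot of a long-format table. The following sketch is an illustration under stated assumptions, not the method's actual code: the long table is hand-written, only two metrics are shown, and the segment column is omitted for brevity.

```python
import pandas as pd

# Hypothetical long-format table: one row per (time_window, uuid).
long_table = pd.DataFrame({
    "time_window": pd.to_datetime([
        "2024-01-01 00:00", "2024-01-01 00:00", "2024-01-01 00:01",
    ]),
    "uuid": ["pressure", "temperature", "temperature"],
    "mean": [1.0, 21.0, 30.0],
    "max": [1.0, 22.0, 30.0],
})

# Pivot so that each (metric, uuid) pair becomes its own column.
wide = long_table.pivot(index="time_window", columns="uuid", values=["mean", "max"])

# Flatten the (metric, uuid) MultiIndex into "{uuid}{separator}{metric}" names.
separator = "__"
wide.columns = [f"{uuid}{separator}{metric}" for metric, uuid in wide.columns]
wide = wide.reset_index()

print(wide)
```

Here "pressure" has no samples in the second window, so its columns hold NaN in that row, matching the NaN-filling behaviour described for compute.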