statistical_process_control
StatisticalProcessControlRuleBased
StatisticalProcessControlRuleBased(
dataframe: DataFrame,
value_column: str,
tolerance_uuid: str,
actual_uuid: str,
event_uuid: str,
)
Bases: Base
Applies SPC rules (Western Electric Rules) to a DataFrame for event detection. Rows are selected by a tolerance (control limit) UUID and an actual-value UUID, and detected events are tagged with an event UUID.
Initializes the instance with UUIDs for tolerance, actual, and event values. The sorted DataFrame is inherited from the Base class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dataframe` | `DataFrame` | The input DataFrame containing the data to be processed. | required |
| `value_column` | `str` | The column containing the values to monitor. | required |
| `tolerance_uuid` | `str` | UUID identifier for rows that set tolerance values. | required |
| `actual_uuid` | `str` | UUID identifier for rows containing actual values. | required |
| `event_uuid` | `str` | UUID to assign to generated events. | required |
calculate_control_limits
calculate_control_limits() -> pd.DataFrame
Calculate the control limits (mean ± 1σ, 2σ, 3σ) for the tolerance values.
Returns:
| Type | Description |
|---|---|
| `pd.DataFrame` | DataFrame with control limits for each tolerance group. |
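A minimal sketch of the mean ± kσ calculation, assuming the limits are derived from the sample mean and sample standard deviation of the tolerance values. The helper name `sigma_limits` and the returned keys are illustrative and are not part of this class's API.

```python
import pandas as pd


def sigma_limits(values: pd.Series) -> dict:
    """Return the mean and the ±1σ, ±2σ, ±3σ limits of a reference series."""
    mean = values.mean()
    sigma = values.std()  # sample standard deviation (ddof=1)
    limits = {"mean": mean}
    for k in (1, 2, 3):
        limits[f"{k}sigma_upper"] = mean + k * sigma
        limits[f"{k}sigma_lower"] = mean - k * sigma
    return limits


# Hypothetical tolerance values used only for illustration.
reference = pd.Series([9.0, 10.0, 11.0, 10.0, 10.0])
limits = sigma_limits(reference)
```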
calculate_dynamic_control_limits
calculate_dynamic_control_limits(
method: str = "moving_range", window: int = 20
) -> pd.DataFrame
Calculate dynamic control limits that adapt over time.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `method` | `str` | Method for calculating dynamic limits. Options: `'moving_range'` (uses moving window statistics) or `'ewma'` (uses an Exponentially Weighted Moving Average). | `'moving_range'` |
| `window` | `int` | Window size for moving calculations. | `20` |
Returns:
| Type | Description |
|---|---|
| `pd.DataFrame` | DataFrame with dynamic control limits indexed by time. |
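The two approaches can be sketched with plain pandas. This is an illustration under assumptions, not the method's actual implementation: the helper names, the 3σ width, and the use of `span=window` for the EWMA are all hypothetical choices.

```python
import pandas as pd


def moving_window_limits(values: pd.Series, window: int = 20) -> pd.DataFrame:
    """3σ limits from a trailing window; rows before the window fills are NaN."""
    rolling = values.rolling(window=window)
    center = rolling.mean()
    sigma = rolling.std()
    return pd.DataFrame(
        {"center": center, "upper": center + 3 * sigma, "lower": center - 3 * sigma}
    )


def ewma_center(values: pd.Series, window: int = 20) -> pd.Series:
    """Exponentially weighted centerline with span equal to the window size."""
    return values.ewm(span=window, adjust=False).mean()


# Hypothetical measurements used only for illustration.
values = pd.Series([float(v) for v in range(1, 11)])
limits = moving_window_limits(values, window=5)
center = ewma_center(values, window=5)
```

With a trailing window the limits lag the process, so a larger `window` gives smoother limits that react more slowly to real changes.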
rule_1
rule_1(df: DataFrame, limits: DataFrame) -> pd.DataFrame
Rule 1: One point beyond the 3σ control limits.
Returns:
| Type | Description |
|---|---|
| `pd.DataFrame` | Filtered DataFrame with rule violations. |
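As a rough illustration of Rule 1, the check reduces to a single vectorized comparison. The function name and the scalar `mean` and `sigma` arguments are assumptions made for this sketch; the method itself takes a `limits` DataFrame.

```python
import pandas as pd


def beyond_three_sigma(values: pd.Series, mean: float, sigma: float) -> pd.Series:
    """Boolean mask: True where a point lies more than 3σ from the mean."""
    return (values - mean).abs() > 3 * sigma


# Hypothetical measurements; only the fourth point is beyond 3σ.
values = pd.Series([10.0, 10.5, 9.8, 14.0, 10.1])
mask = beyond_three_sigma(values, mean=10.0, sigma=1.0)
```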
rule_2
rule_2(df: DataFrame) -> pd.DataFrame
Rule 2: Nine consecutive points on one side of the mean.
Returns:
| Type | Description |
|---|---|
| `pd.DataFrame` | Filtered DataFrame with rule violations. |
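One way to detect such runs without a Python loop is to label each run of same-side points and count the position within the run. This is a sketch under assumptions: the function name is hypothetical, the mean is passed in explicitly, and every point from the ninth onward in a run is flagged.

```python
import numpy as np
import pandas as pd


def run_on_one_side(values: pd.Series, mean: float, run_length: int = 9) -> pd.Series:
    """Flag points that are at least the run_length-th in a row on one side."""
    side = np.sign(values - mean)  # +1 above, -1 below, 0 exactly on the mean
    run_id = (side != side.shift()).cumsum()  # new id whenever the side changes
    position = side.groupby(run_id).cumcount() + 1  # 1-based position in the run
    return (side != 0) & (position >= run_length)


# Hypothetical data: ten points above the mean, bracketed by points below it.
values = pd.Series([9.0] + [11.0] * 10 + [9.0])
mask = run_on_one_side(values, mean=10.0)
```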
rule_3
rule_3(df: DataFrame) -> pd.DataFrame
Rule 3: Six consecutive points steadily increasing or decreasing.
Returns:
| Type | Description |
|---|---|
| `pd.DataFrame` | Filtered DataFrame with rule violations. |
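Six steadily rising points correspond to five consecutive positive differences, which suggests a rolling count over the differences. The sketch below assumes strict monotonicity and flags only the last point of each qualifying window; the function name is hypothetical.

```python
import pandas as pd


def steady_trend(values: pd.Series, run_length: int = 6) -> pd.Series:
    """Flag the last point of run_length strictly rising or falling points."""
    steps = run_length - 1  # six points give five differences
    diff = values.diff()
    rising = (diff > 0).astype(int).rolling(steps).sum() == steps
    falling = (diff < 0).astype(int).rolling(steps).sum() == steps
    return rising | falling


# Hypothetical data: indices 1 through 6 rise steadily from 1 to 6.
values = pd.Series([5.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 3.0])
mask = steady_trend(values)
```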
rule_4
rule_4(df: DataFrame) -> pd.DataFrame
Rule 4: Fourteen consecutive points alternating up and down.
Returns:
| Type | Description |
|---|---|
| `pd.DataFrame` | Filtered DataFrame with rule violations. |
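Fourteen alternating points produce thirteen differences whose sign flips twelve times in a row, so the rule can be sketched as a rolling count of direction reversals. The function name is hypothetical, and treating repeated values (zero difference) as a break in the pattern is an assumption.

```python
import numpy as np
import pandas as pd


def alternating(values: pd.Series, run_length: int = 14) -> pd.Series:
    """Flag the last point of run_length points that alternate up and down."""
    direction = np.sign(values.diff())
    # True where the current step reverses the previous one.
    reversed_step = (direction * direction.shift()) < 0
    needed = run_length - 2  # 14 points -> 13 steps -> 12 reversals
    return reversed_step.astype(int).rolling(needed).sum() == needed


# Hypothetical data: 14 points that zigzag between 0 and 1.
values = pd.Series([0.0, 1.0] * 7)
mask = alternating(values)
```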
rule_5
rule_5(df: DataFrame, limits: DataFrame) -> pd.DataFrame
Rule 5: Two out of three consecutive points near the control limit (beyond 2σ but within 3σ).
Returns:
| Type | Description |
|---|---|
| `pd.DataFrame` | Filtered DataFrame with rule violations. |
rule_6
rule_6(df: DataFrame, limits: DataFrame) -> pd.DataFrame
Rule 6: Four out of five consecutive points near the control limit (beyond 1σ but within 2σ).
Returns:
| Type | Description |
|---|---|
| `pd.DataFrame` | Filtered DataFrame with rule violations. |
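Rules 5 and 6 share one pattern: at least m of n consecutive points fall in a given sigma zone. The sketch below parameterizes that pattern. It assumes the points must lie on the same side of the centerline, which is the usual convention but is not confirmed by the descriptions above, and the function name and arguments are hypothetical.

```python
import pandas as pd


def m_of_n_in_zone(
    values: pd.Series, mean: float, sigma: float,
    inner: float, outer: float, m: int, n: int,
) -> pd.Series:
    """Flag windows of n points with at least m in the (inner, outer] sigma zone."""
    z = (values - mean) / sigma
    upper = ((z > inner) & (z <= outer)).astype(int)
    lower = ((z < -inner) & (z >= -outer)).astype(int)
    return (upper.rolling(n).sum() >= m) | (lower.rolling(n).sum() >= m)


# Rule 5 pattern: two of three points between 2σ and 3σ.
v5 = pd.Series([10.0, 12.5, 10.0, 12.4, 10.0])
rule5 = m_of_n_in_zone(v5, mean=10.0, sigma=1.0, inner=2, outer=3, m=2, n=3)

# Rule 6 pattern: four of five points between 1σ and 2σ.
v6 = pd.Series([11.5, 11.5, 10.0, 11.5, 11.5, 10.0])
rule6 = m_of_n_in_zone(v6, mean=10.0, sigma=1.0, inner=1, outer=2, m=4, n=5)
```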
rule_7
rule_7(df: DataFrame, limits: DataFrame) -> pd.DataFrame
Rule 7: Fifteen consecutive points within 1σ of the centerline.
Returns:
| Type | Description |
|---|---|
| `pd.DataFrame` | Filtered DataFrame with rule violations. |
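A rolling count is enough to sketch this rule: a window of fifteen points is flagged when all of them lie within 1σ of the centerline. The function name and scalar arguments are assumptions for illustration only.

```python
import pandas as pd


def hugging_centerline(
    values: pd.Series, mean: float, sigma: float, run_length: int = 15
) -> pd.Series:
    """Flag the last point of run_length consecutive points within 1σ."""
    inside = ((values - mean).abs() < sigma).astype(int)
    return inside.rolling(run_length).sum() == run_length


# Hypothetical data: fifteen points close to the mean, then one far away.
values = pd.Series([10.2] * 15 + [12.0])
mask = hugging_centerline(values, mean=10.0, sigma=1.0)
```

A run like this usually points to overestimated limits or stratified sampling rather than to a genuine improvement in the process.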
rule_8
rule_8(df: DataFrame, limits: DataFrame) -> pd.DataFrame
Rule 8: Eight consecutive points on both sides of the mean within 1σ.
Returns:
| Type | Description |
|---|---|
| `pd.DataFrame` | Filtered DataFrame with rule violations. |
apply_rules_vectorized
apply_rules_vectorized(
selected_rules: Optional[List[str]] = None,
) -> pd.DataFrame
Applies SPC rules using vectorized operations. Multiple rules are evaluated together in fewer passes through the data, which improves performance compared with applying each rule separately.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `selected_rules` | `Optional[List[str]]` | List of rule names to apply. If None, applies all rules. | `None` |
Returns:
| Type | Description |
|---|---|
| `pd.DataFrame` | DataFrame with rule violations, including rule name and severity. |
detect_cusum_shifts
detect_cusum_shifts(
target: Optional[float] = None,
k: float = 0.5,
h: float = 5.0,
) -> pd.DataFrame
Detect process shifts using CUSUM (Cumulative Sum) control chart.
CUSUM charts accumulate deviations from the target over time, which makes them more sensitive to small, sustained shifts in the process mean than traditional Shewhart control charts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `target` | `Optional[float]` | Target mean value. If None, uses the mean of tolerance data. | `None` |
| `k` | `float` | Reference value (slack parameter), typically 0.5 to 1.0 times sigma. Smaller k detects smaller shifts. | `0.5` |
| `h` | `float` | Decision interval (threshold). Typical values are 4 to 5. Smaller h gives faster detection but more false alarms. | `5.0` |
Returns:
| Type | Description |
|---|---|
| `pd.DataFrame` | DataFrame with CUSUM statistics and detected shifts. Columns: `systime`, `value`, `cusum_high`, `cusum_low`, `shift_detected`, `shift_direction`. |
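The standard tabular CUSUM recursion can be sketched as follows. This is not the method's actual implementation: it assumes `k` and `h` are expressed in units of sigma and applied to standardized deviations, takes `target` and `sigma` as explicit arguments, and omits the `systime` and `shift_direction` columns. The function name `tabular_cusum` is hypothetical.

```python
import pandas as pd


def tabular_cusum(
    values: pd.Series, target: float, sigma: float, k: float = 0.5, h: float = 5.0
) -> pd.DataFrame:
    """Two-sided tabular CUSUM on standardized deviations from the target."""
    z = (values - target) / sigma
    high, low = [], []
    c_high = c_low = 0.0
    for zi in z:
        # Accumulate deviations beyond the slack k; reset to zero when negative.
        c_high = max(0.0, c_high + zi - k)
        c_low = max(0.0, c_low - zi - k)
        high.append(c_high)
        low.append(c_low)
    out = pd.DataFrame(
        {"value": values, "cusum_high": high, "cusum_low": low}, index=values.index
    )
    out["shift_detected"] = (out["cusum_high"] > h) | (out["cusum_low"] > h)
    return out


# Hypothetical data: the mean shifts upward by 1.5σ after the fifth point.
values = pd.Series([10.0] * 5 + [11.5] * 6)
result = tabular_cusum(values, target=10.0, sigma=1.0)
```

Each shifted point adds 1.5 - 0.5 = 1.0 to `cusum_high`, so the statistic crosses `h = 5.0` on the sixth shifted point, even though no single point comes close to a 3σ limit.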
interpret_violations
interpret_violations(
violations_df: DataFrame,
) -> pd.DataFrame
Add human-readable interpretations to rule violations.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `violations_df` | `DataFrame` | DataFrame with rule violations (output of the `apply_rules_vectorized` or `process` methods). | required |
Returns:
| Type | Description |
|---|---|
| `pd.DataFrame` | Enhanced DataFrame with interpretation and recommendation columns. |
process
process(
selected_rules: Optional[List[str]] = None,
include_severity: bool = False,
) -> pd.DataFrame
Applies the selected SPC rules and generates a DataFrame of events where any rules are violated.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `selected_rules` | `Optional[List[str]]` | List of rule names (e.g., `['rule_1', 'rule_3']`) to apply. | `None` |
| `include_severity` | `bool` | If True, includes severity and rule information in the output. The default of False maintains backward compatibility. | `False` |
Returns:
| Type | Description |
|---|---|
| `pd.DataFrame` | DataFrame with rule violations and detected events. If `include_severity=False`, the columns are `[systime, value_column, uuid]`. If `include_severity=True`, the columns include `[systime, value_column, uuid, rule, severity]`. |