statistical_process_control ¤

StatisticalProcessControlRuleBased ¤

StatisticalProcessControlRuleBased(
    dataframe: DataFrame,
    value_column: str,
    tolerance_uuid: str,
    actual_uuid: str,
    event_uuid: str,
)

Bases: Base

Applies SPC rules (Western Electric Rules) to a DataFrame to detect events. Control limits are derived from the rows tagged with the tolerance UUID, the rules are evaluated on the rows tagged with the actual-value UUID, and each detected event is labeled with the event UUID.

Initializes the instance with the UUIDs for tolerance, actual, and event values. The sorted DataFrame is inherited from the Base class.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| dataframe | DataFrame | The input DataFrame containing the data to be processed. | required |
| value_column | str | The column containing the values to monitor. | required |
| tolerance_uuid | str | UUID identifier for rows that set tolerance values. | required |
| actual_uuid | str | UUID identifier for rows containing actual values. | required |
| event_uuid | str | UUID to assign to generated events. | required |

calculate_control_limits ¤

calculate_control_limits() -> pd.DataFrame

Calculate the control limits (mean ± 1σ, 2σ, 3σ) for the tolerance values.

Returns:

| Type | Description |
| --- | --- |
| DataFrame | DataFrame with control limits for each tolerance group. |
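
The limit calculation can be illustrated with a minimal standard-library sketch. It works on a plain list rather than the grouped DataFrame the method returns, and the key names (`1sigma_upper` and so on) as well as the choice of the population standard deviation are assumptions made for illustration, not the library's confirmed output.

```python
from statistics import mean, pstdev


def control_limits(values):
    """Return the centerline and the ±1σ, ±2σ, ±3σ bands for a list of numbers."""
    center = mean(values)
    sigma = pstdev(values)  # assumption: population σ; a sample estimate is also common
    limits = {"mean": center}
    for k in (1, 2, 3):
        # Hypothetical key names, chosen only for this example.
        limits[f"{k}sigma_upper"] = center + k * sigma
        limits[f"{k}sigma_lower"] = center - k * sigma
    return limits


tolerance_values = [9.0, 11.0, 9.0, 11.0]
limits = control_limits(tolerance_values)
```

With this input the centerline is 10.0 and σ is 1.0, so the 3σ band runs from 7.0 to 13.0.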

calculate_dynamic_control_limits ¤

calculate_dynamic_control_limits(
    method: str = "moving_range", window: int = 20
) -> pd.DataFrame

Calculate dynamic control limits that adapt over time.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| method | str | Method for calculating dynamic limits. Options: 'moving_range' (uses moving-window statistics) or 'ewma' (uses an Exponentially Weighted Moving Average). | 'moving_range' |
| window | int | Window size for moving calculations. | 20 |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | DataFrame with dynamic control limits indexed by time. |
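
The two approaches can be sketched in plain Python as follows. This is an illustration under assumptions, not the library's code: the moving variant here takes the mean ± 3σ of a trailing window, and the `alpha` smoothing factor of the EWMA is a hypothetical parameter. How the method derives its limits internally is not stated in this reference.

```python
from statistics import mean, pstdev


def moving_limits(values, window=20):
    """Trailing-window (lower, center, upper) limits; None until the window fills."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
            continue
        chunk = values[i + 1 - window : i + 1]
        center, sigma = mean(chunk), pstdev(chunk)
        out.append((center - 3 * sigma, center, center + 3 * sigma))
    return out


def ewma(values, alpha=0.2):
    """Exponentially weighted moving average, seeded with the first value."""
    smoothed = [values[0]]
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


values = [10.0, 12.0, 10.0, 12.0, 20.0, 22.0]
limits = moving_limits(values, window=4)
trend = ewma(values, alpha=0.5)
```

A larger window (or a smaller `alpha`) makes the limits react more slowly to recent data.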

rule_1 ¤

rule_1(df: DataFrame, limits: DataFrame) -> pd.DataFrame

Rule 1: One point beyond the 3σ control limits.

Returns:

| Type | Description |
| --- | --- |
| DataFrame | Filtered DataFrame with rule violations. |
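
A minimal sketch of the check on a plain list, assuming the centerline and σ are already known. The function below is a standalone illustration and takes different arguments than the method.

```python
def rule_1(values, center, sigma):
    """Indices of points lying beyond the 3σ control limits."""
    upper, lower = center + 3 * sigma, center - 3 * sigma
    return [i for i, x in enumerate(values) if x > upper or x < lower]


values = [10.1, 9.8, 13.5, 10.0, 6.2, 10.3]
violations = rule_1(values, center=10.0, sigma=1.0)
```

Here the limits are 7.0 and 13.0, so the points at indices 2 and 4 are flagged.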

rule_2 ¤

rule_2(df: DataFrame) -> pd.DataFrame

Rule 2: Nine consecutive points on one side of the mean.

Returns:

| Type | Description |
| --- | --- |
| DataFrame | Filtered DataFrame with rule violations. |
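
The run check can be sketched on a plain list as below. Two assumptions apply: a point exactly on the centerline breaks the run, and every point that extends a qualifying run is reported. Neither is confirmed by this reference.

```python
def rule_2(values, center, run_length=9):
    """Indices at which `run_length` consecutive points sit on one side of the centerline."""
    hits = []
    run, prev_side = 0, 0
    for i, x in enumerate(values):
        side = (x > center) - (x < center)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == prev_side:
            run += 1
        else:
            run = 1 if side != 0 else 0  # start a new run, or none if on the line
        prev_side = side
        if run >= run_length:
            hits.append(i)
    return hits


values = [9.5] + [10.2] * 9 + [9.7, 10.4]
violations = rule_2(values, center=10.0)
```

The ninth consecutive point above the centerline is at index 9, which is the only index flagged.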

rule_3 ¤

rule_3(df: DataFrame) -> pd.DataFrame

Rule 3: Six consecutive points steadily increasing or decreasing.

Returns:

| Type | Description |
| --- | --- |
| DataFrame | Filtered DataFrame with rule violations. |
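
A trend of six points corresponds to five consecutive steps in the same direction. The sketch below counts the points in the current monotonic run, assuming that equal neighboring values break the trend.

```python
def rule_3(values, run_length=6):
    """Indices at which `run_length` consecutive points are strictly rising or falling."""
    hits = []
    run, prev_dir = 1, 0  # `run` counts points in the current monotonic stretch
    for i in range(1, len(values)):
        diff = values[i] - values[i - 1]
        direction = (diff > 0) - (diff < 0)
        if direction != 0 and direction == prev_dir:
            run += 1
        elif direction != 0:
            run = 2  # direction changed: previous and current point start a new trend
        else:
            run = 1  # assumption: a repeated value resets the trend
        prev_dir = direction
        if run >= run_length:
            hits.append(i)
    return hits


values = [10.0, 10.1, 10.2, 10.3, 10.4, 10.5, 10.6, 10.2]
violations = rule_3(values)
```

The sixth rising point is at index 5 and the trend continues through index 6, so both are flagged.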

rule_4 ¤

rule_4(df: DataFrame) -> pd.DataFrame

Rule 4: Fourteen consecutive points alternating up and down.

Returns:

| Type | Description |
| --- | --- |
| DataFrame | Filtered DataFrame with rule violations. |
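
Alternation can be detected by comparing the sign of each step with the sign of the previous step. In this sketch, fourteen alternating points means thirteen consecutive sign flips, and a repeated value is assumed to break the pattern.

```python
def rule_4(values, run_length=14):
    """Indices at which `run_length` consecutive points alternate up and down."""
    hits = []
    run, prev_dir = 1, 0  # `run` counts points in the current alternating stretch
    for i in range(1, len(values)):
        diff = values[i] - values[i - 1]
        direction = (diff > 0) - (diff < 0)
        if direction != 0 and direction == -prev_dir:
            run += 1
        elif direction != 0:
            run = 2  # same direction twice: restart from the last two points
        else:
            run = 1  # assumption: a repeated value resets the pattern
        prev_dir = direction
        if run >= run_length:
            hits.append(i)
    return hits


# Fourteen zigzag points followed by two rising points that end the alternation.
values = [10.5 if i % 2 else 9.5 for i in range(14)] + [10.6, 10.7]
violations = rule_4(values)
```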

rule_5 ¤

rule_5(df: DataFrame, limits: DataFrame) -> pd.DataFrame

Rule 5: Two out of three consecutive points near the control limit (beyond 2σ but within 3σ).

Returns:

| Type | Description |
| --- | --- |
| DataFrame | Filtered DataFrame with rule violations. |

rule_6 ¤

rule_6(df: DataFrame, limits: DataFrame) -> pd.DataFrame

Rule 6: Four out of five consecutive points near the control limit (beyond 1σ but within 2σ).

Returns:

| Type | Description |
| --- | --- |
| DataFrame | Filtered DataFrame with rule violations. |
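
Rules 5 and 6 share a "k of n points in a zone" structure, so one helper can illustrate both. The sketch below is a standalone example with hypothetical names. It assumes the counted points must lie on the same side of the centerline, as in the classic formulation of these rules, and that the outer zone boundary is inclusive.

```python
def k_of_n_in_zone(values, center, sigma, inner, outer, needed, window):
    """Indices where `needed` of the last `window` points lie between
    inner*sigma and outer*sigma from the centerline, on the same side."""

    def zone(x):
        dist = x - center
        if inner * sigma < abs(dist) <= outer * sigma:
            return 1 if dist > 0 else -1  # +1 upper zone, -1 lower zone
        return 0  # outside the zone of interest

    zones = [zone(x) for x in values]
    hits = []
    for i in range(window - 1, len(values)):
        recent = zones[i - window + 1 : i + 1]
        if recent.count(1) >= needed or recent.count(-1) >= needed:
            hits.append(i)
    return hits


values = [10.0, 12.4, 10.1, 12.6, 11.3, 11.6, 11.2, 10.0, 11.4]
# Rule 5: two of three points between 2σ and 3σ.
rule_5_hits = k_of_n_in_zone(values, 10.0, 1.0, inner=2, outer=3, needed=2, window=3)
# Rule 6: four of five points between 1σ and 2σ.
rule_6_hits = k_of_n_in_zone(values, 10.0, 1.0, inner=1, outer=2, needed=4, window=5)
```

With this data, rule 5 fires at index 3 (12.4 and 12.6 within three points) and rule 6 fires at index 8.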

rule_7 ¤

rule_7(df: DataFrame, limits: DataFrame) -> pd.DataFrame

Rule 7: Fifteen consecutive points within 1σ of the centerline.

Returns:

| Type | Description |
| --- | --- |
| DataFrame | Filtered DataFrame with rule violations. |
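
A run counter is enough to illustrate this rule. The sketch assumes a strict comparison (a point exactly 1σ away ends the run) and reports every point that extends a qualifying run.

```python
def rule_7(values, center, sigma, run_length=15):
    """Indices at which `run_length` consecutive points lie within 1σ of the centerline."""
    hits, run = [], 0
    for i, x in enumerate(values):
        run = run + 1 if abs(x - center) < sigma else 0
        if run >= run_length:
            hits.append(i)
    return hits


# One outlier, sixteen points hugging the centerline, then another outlier.
values = [12.0] + [10.0 + 0.1 * (i % 3 - 1) for i in range(16)] + [8.5]
violations = rule_7(values, center=10.0, sigma=1.0)
```

The fifteenth and sixteenth in-band points sit at indices 15 and 16, so those two are flagged.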

rule_8 ¤

rule_8(df: DataFrame, limits: DataFrame) -> pd.DataFrame

Rule 8: Eight consecutive points on both sides of the mean within 1σ.

Returns:

| Type | Description |
| --- | --- |
| DataFrame | Filtered DataFrame with rule violations. |

apply_rules_vectorized ¤

apply_rules_vectorized(
    selected_rules: Optional[List[str]] = None,
) -> pd.DataFrame

Applies SPC rules using vectorized operations with optimized multi-rule processing. Processes multiple rules in fewer passes through the data for better performance.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| selected_rules | Optional[List[str]] | List of rule names to apply. If None, applies all rules. | None |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | DataFrame with rule violations, including rule name and severity. |
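
The rule-selection behavior and the shape of the result can be sketched as follows. This example is not vectorized and does not reproduce the library's implementation. The two checks, the registry, and the severity labels ("critical", "medium") are all hypothetical and serve only to show how `selected_rules=None` might fall back to every registered rule.

```python
def beyond_3_sigma(values, center, sigma):
    """Rule 1 style check: points beyond the 3σ limits."""
    return [i for i, x in enumerate(values) if abs(x - center) > 3 * sigma]


def run_one_side(values, center, sigma, run_length=9):
    """Rule 2 style check; `sigma` is unused but keeps the signatures uniform."""
    hits, run, prev = [], 0, 0
    for i, x in enumerate(values):
        side = (x > center) - (x < center)
        run = run + 1 if side != 0 and side == prev else int(side != 0)
        prev = side
        if run >= run_length:
            hits.append(i)
    return hits


# Hypothetical registry: rule name -> (check function, severity label).
RULES = {
    "rule_1": (beyond_3_sigma, "critical"),
    "rule_2": (run_one_side, "medium"),
}


def apply_rules(values, center, sigma, selected_rules=None):
    """Run the selected rules (all if None) and collect one row per violation."""
    names = list(RULES) if selected_rules is None else selected_rules
    rows = []
    for name in names:
        check, severity = RULES[name]
        for i in check(values, center, sigma):
            rows.append({"index": i, "value": values[i], "rule": name, "severity": severity})
    return sorted(rows, key=lambda r: (r["index"], r["rule"]))


values = [10.2] * 9 + [14.0]
all_rows = apply_rules(values, center=10.0, sigma=1.0)
only_rule_1 = apply_rules(values, center=10.0, sigma=1.0, selected_rules=["rule_1"])
```

A single point can appear more than once in the result when it violates several rules, as index 9 does here.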

detect_cusum_shifts ¤

detect_cusum_shifts(
    target: Optional[float] = None,
    k: float = 0.5,
    h: float = 5.0,
) -> pd.DataFrame

Detect process shifts using CUSUM (Cumulative Sum) control chart.

CUSUM charts accumulate deviations from the target, which makes them more sensitive to small, sustained shifts in the process mean than traditional Shewhart control charts.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| target | Optional[float] | Target mean value. If None, uses the mean of tolerance data. | None |
| k | float | Reference value (slack parameter), typically 0.5 to 1.0 times sigma. Smaller k detects smaller shifts. | 0.5 |
| h | float | Decision interval (threshold). Typical values are 4-5. Smaller h gives faster detection but more false alarms. | 5.0 |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | DataFrame with CUSUM statistics and detected shifts. Columns: systime, value, cusum_high, cusum_low, shift_detected, shift_direction. |
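
The standard tabular CUSUM recursion can be sketched in a few lines. This illustration assumes that `k` and `h` are expressed in multiples of σ, that the sums are not reset after a detection, and that the direction labels are "upward" and "downward". None of these details is confirmed by this reference. The `systime` column is omitted.

```python
def cusum(values, target, sigma, k=0.5, h=5.0):
    """Tabular CUSUM over a list; returns one dict per observation."""
    slack, threshold = k * sigma, h * sigma
    high = low = 0.0
    rows = []
    for x in values:
        # Accumulate deviations beyond the slack; never let a sum go negative.
        high = max(0.0, high + (x - target) - slack)
        low = max(0.0, low + (target - x) - slack)
        if high > threshold:
            direction = "upward"  # hypothetical label
        elif low > threshold:
            direction = "downward"  # hypothetical label
        else:
            direction = None
        rows.append({
            "value": x,
            "cusum_high": high,
            "cusum_low": low,
            "shift_detected": direction is not None,
            "shift_direction": direction,
        })
    return rows


# Five on-target points, then a sustained 1.5σ upward shift.
values = [10.0] * 5 + [11.5] * 8
rows = cusum(values, target=10.0, sigma=1.0)
first_detection = next(i for i, r in enumerate(rows) if r["shift_detected"])
```

Each shifted point adds 1.0 to `cusum_high`, so the threshold of 5.0 is first exceeded at index 10, six observations after the shift begins. No individual point in this series exceeds a 3σ limit, so a rule such as rule_1 would not flag it.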

interpret_violations ¤

interpret_violations(
    violations_df: DataFrame,
) -> pd.DataFrame

Add human-readable interpretations to rule violations.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| violations_df | DataFrame | DataFrame with rule violations (output from apply_rules_vectorized or process methods). | required |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | Enhanced DataFrame with interpretation and recommendation columns. |

process ¤

process(
    selected_rules: Optional[List[str]] = None,
    include_severity: bool = False,
) -> pd.DataFrame

Applies the selected SPC rules and generates a DataFrame of events where any rules are violated.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| selected_rules | Optional[List[str]] | List of rule names (e.g., ['rule_1', 'rule_3']) to apply. | None |
| include_severity | bool | If True, includes severity and rule information in output. The default of False maintains backward compatibility. | False |

Returns:

| Type | Description |
| --- | --- |
| DataFrame | DataFrame with rule violations and detected events. If include_severity=False, the columns are [systime, value_column, uuid]. If include_severity=True, the columns include [systime, value_column, uuid, rule, severity]. |