ts-shape

Shape Your Timeseries Data

Load, transform, and analyze timeseries data with a clean, composable API.


Why ts-shape?

  • DataFrame-First

    Every operation accepts and returns Pandas DataFrames. No proprietary formats, no lock-in.

  • Modular Design

    Use only what you need. Loaders, transforms, features, and events are fully decoupled.

  • Multi-Source Loading

    Load from Parquet, S3, Azure Blob, or TimescaleDB with a unified interface.

  • Analysis Ready

    Built-in statistics, cycle detection, outlier detection, and event extraction.


Quick Example

import pandas as pd
from ts_shape.transform.filter.numeric_filter import NumericFilter
from ts_shape.features.stats.numeric_stats import NumericStatistics

# Load your data
df = pd.read_parquet("sensors.parquet")

# Filter to valid range
clean = NumericFilter.filter_value_in_range(df, "value_double", 0, 100)

# Get statistics
stats = NumericStatistics(clean, "value_double")
print(f"Mean: {stats.mean():.2f}, Std: {stats.std():.2f}")

Architecture

flowchart LR
    subgraph Input
        A1[Parquet]
        A2[S3/Azure]
        A3[TimescaleDB]
    end

    A1 --> L[Load]
    A2 --> L
    A3 --> L

    L --> T[Transform]
    T --> F[Features]
    F --> E[Events]
    E --> O[Output DataFrame]
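
The Load, Transform, Features, Events flow above can be sketched in plain pandas by chaining functions that each take and return a DataFrame. This is only an illustration of the composition idea: the four function names are hypothetical stand-ins, not ts-shape APIs, and the sample data is made up.

```python
import pandas as pd


def load() -> pd.DataFrame:
    # Hypothetical stand-in for a loader (Parquet, S3, TimescaleDB, ...).
    return pd.DataFrame(
        {
            "systime": pd.date_range("2024-01-01", periods=5, freq="min"),
            "value_double": [10.0, 250.0, 30.0, -5.0, 50.0],
        }
    )


def transform(df: pd.DataFrame) -> pd.DataFrame:
    # Keep only readings inside an assumed valid range of 0..100.
    return df[df["value_double"].between(0, 100)]


def features(df: pd.DataFrame) -> pd.DataFrame:
    # Add a rolling mean over the last two remaining readings.
    rolling = df["value_double"].rolling(2, min_periods=1).mean()
    return df.assign(rolling_mean=rolling)


def events(df: pd.DataFrame) -> pd.DataFrame:
    # Flag rows where the reading rises above its rolling mean.
    return df[df["value_double"] > df["rolling_mean"]]


# Every stage takes a DataFrame and returns a DataFrame, so stages chain freely.
out = load().pipe(transform).pipe(features).pipe(events)
print(out)
```

Because every stage has the same DataFrame-in, DataFrame-out shape, any stage can be dropped or swapped without touching the others.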

Core Modules

Loaders

  • Parquet - Local and remote files
  • S3 Proxy - S3-compatible storage
  • Azure Blob - Container layouts
  • TimescaleDB - SQL timeseries
  • Metadata JSON - Context enrichment

Transforms

  • Numeric Filter - Range, threshold
  • String Filter - Pattern matching
  • DateTime Filter - Time ranges
  • Boolean Filter - Flag filtering
  • Calculator - Derived columns
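
To make the transform categories above concrete, here is a minimal sketch of the equivalent operations in plain pandas. It does not use the ts-shape filter classes; the sample rows, the `error_` naming pattern, and the `value_scaled` column are assumptions chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "systime": pd.to_datetime(
            [
                "2024-01-01 08:00",
                "2024-01-01 09:00",
                "2024-01-01 10:00",
                "2024-01-01 11:00",
            ]
        ),
        "value_double": [12.0, 140.0, 55.0, 80.0],
        "value_string": ["run", "error_42", "run", "idle"],
        "value_bool": [True, True, False, True],
    }
)

# Numeric filter: keep values inside an inclusive range.
in_range = df[df["value_double"].between(0, 100)]

# String filter: keep rows whose text matches a pattern.
errors = df[df["value_string"].str.contains(r"^error_\d+$", regex=True)]

# DateTime filter: keep rows inside a half-open time window.
window = df[(df["systime"] >= "2024-01-01 09:00") & (df["systime"] < "2024-01-01 11:00")]

# Boolean filter: keep rows where the flag is set.
flagged = df[df["value_bool"]]

# Calculator: derive a new column from an existing one.
derived = df.assign(value_scaled=df["value_double"] / 100)
```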

Features

  • Numeric Stats - min, max, mean, std
  • Time Stats - Coverage, gaps
  • String Stats - Value counts
  • Cycles - Pattern detection
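
As a rough illustration of what the numeric, time, and string statistics above measure, the following sketch computes them with plain pandas. It is not the ts-shape feature API, and the sample data is invented.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "systime": pd.to_datetime(
            [
                "2024-01-01 00:00",
                "2024-01-01 00:01",
                "2024-01-01 00:02",
                "2024-01-01 00:10",
            ]
        ),
        "value_double": [1.0, 2.0, 3.0, 4.0],
        "value_string": ["a", "b", "a", "a"],
    }
)

# Numeric stats: several aggregates in one call.
numeric = df["value_double"].agg(["min", "max", "mean", "std"])

# Time stats: total span covered, and the largest gap between samples.
coverage = df["systime"].max() - df["systime"].min()
largest_gap = df["systime"].diff().max()

# String stats: how often each value occurs.
counts = df["value_string"].value_counts()

print(numeric, coverage, largest_gap, counts, sep="\n")
```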

Events

  • Quality - Outliers, SPC
  • Engineering - Setpoints, startup
  • Production - Cycles, downtime
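
For the quality events above, one common SPC-style check is to flag points that fall outside three standard deviations of an in-control baseline. The sketch below shows that idea in plain pandas. It is an assumption about what such a check could look like, not the ts-shape implementation, and both the readings and the choice of the first six points as the baseline are hypothetical.

```python
import pandas as pd

# Hypothetical sensor readings; index 6 is an obvious excursion.
values = pd.Series([10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 15.0, 10.1])

# Assume the first six readings represent the in-control process.
baseline = values.iloc[:6]
center = baseline.mean()
sigma = baseline.std()

# Classic three-sigma control limits around the baseline mean.
upper_limit = center + 3 * sigma
lower_limit = center - 3 * sigma

# Event extraction: readings outside the control limits.
out_of_control = values[(values > upper_limit) | (values < lower_limit)]
print(out_of_control)
```

Estimating the limits from a separate baseline keeps a large outlier from inflating the standard deviation and masking itself.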

Data Model

ts-shape uses a simple schema:

Column          Type      Description
uuid            string    Signal identifier
systime         datetime  Timestamp
value_double    float     Numeric values
value_integer   int       Integer values
value_string    string    String values
value_bool      bool      Boolean values

Flexible

Use only the columns you need. Not all are required.
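
A minimal sketch of a DataFrame that follows this schema, built directly with pandas. The signal names and readings are hypothetical; only the column names come from the schema above, and only a subset of the columns is used.

```python
import pandas as pd

# Hypothetical sample rows using three of the schema columns.
df = pd.DataFrame(
    {
        "uuid": ["temp_1", "temp_1", "pressure_1"],
        "systime": pd.to_datetime(
            ["2024-01-01 00:00:00", "2024-01-01 00:01:00", "2024-01-01 00:01:30"]
        ),
        "value_double": [21.5, 21.7, 1.013],
    }
)

# Select a single signal by its identifier.
temp = df[df["uuid"] == "temp_1"]
print(temp)
```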



MIT License - Built for the timeseries community