ts_shape.transform.time_functions.timestamp_converter

Classes:

  • TimestampConverter

    A class dedicated to converting high-precision timestamp data (e.g., in seconds, milliseconds, microseconds, or nanoseconds) to standard datetime formats with optional timezone adjustment.

TimestampConverter

TimestampConverter(dataframe: DataFrame, column_name: str = 'systime')

Bases: Base

A class dedicated to converting high-precision timestamp data (e.g., in seconds, milliseconds, microseconds, or nanoseconds) to standard datetime formats with optional timezone adjustment.

Parameters:

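  • dataframe

    (DataFrame) –

    The DataFrame to be processed. The constructor works on a copy, so the caller's DataFrame is not modified.

  • column_name

    (str, default: 'systime') –

    The column to sort by. If this column is present, it is converted to datetime (unparseable values become NaT) and used for sorting. If it is missing, every column whose name contains 'time' or 'date' is converted to datetime and the first such column is used for sorting.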

Methods:

  • convert_to_datetime

    Converts specified columns from a given timestamp unit to datetime format in a target timezone.

  • get_dataframe

    Returns the processed DataFrame.

Source code in src/ts_shape/utils/base.py
import pandas as pd

def __init__(self, dataframe: pd.DataFrame, column_name: str = 'systime') -> None:
    """
    Initializes the Base with a copy of the DataFrame, converts the time column(s) to datetime,
    and sorts the DataFrame by the specified column (or the first detected time column).

    Args:
        dataframe (pd.DataFrame): The DataFrame to be processed.
        column_name (str): The column to sort by. Default is 'systime'. If the column is not found, columns whose names contain 'time' or 'date' are detected and converted instead.
    """
    self.dataframe = dataframe.copy()

    # Attempt to convert the specified column_name to datetime if it exists
    if column_name in self.dataframe.columns:
        self.dataframe[column_name] = pd.to_datetime(self.dataframe[column_name], errors='coerce')
    else:
        # If the column_name is not in the DataFrame, fallback to automatic time detection
        time_columns = [col for col in self.dataframe.columns if 'time' in col.lower() or 'date' in col.lower()]

        # Convert all detected time columns to datetime, if any
        for col in time_columns:
            self.dataframe[col] = pd.to_datetime(self.dataframe[col], errors='coerce')

        # If any time columns are detected, sort by the first one; otherwise, do nothing
        if time_columns:
            column_name = time_columns[0]

    # Sort by the datetime column (either specified or detected)
    if column_name in self.dataframe.columns:
        self.dataframe = self.dataframe.sort_values(by=column_name)
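
The fallback path in the constructor can be tried without installing the package. The following is a minimal sketch that copies the logic above into a standalone function; `prepare_frame` and the sample column names are hypothetical and not part of ts_shape.

```python
import pandas as pd


def prepare_frame(dataframe: pd.DataFrame, column_name: str = "systime") -> pd.DataFrame:
    """Hypothetical stand-in that mirrors the constructor logic shown above."""
    df = dataframe.copy()  # work on a copy so the caller's frame is untouched

    if column_name in df.columns:
        # Requested column exists: coerce it to datetime (bad values become NaT)
        df[column_name] = pd.to_datetime(df[column_name], errors="coerce")
    else:
        # Requested column is missing: detect time-like columns by name
        time_columns = [c for c in df.columns if "time" in c.lower() or "date" in c.lower()]
        for col in time_columns:
            df[col] = pd.to_datetime(df[col], errors="coerce")
        if time_columns:
            column_name = time_columns[0]

    # Sort only if a usable column was specified or detected
    if column_name in df.columns:
        df = df.sort_values(by=column_name)
    return df


# Assumed sample data: there is no 'systime' column, so 'event_time' is detected.
raw = pd.DataFrame(
    {
        "event_time": ["2024-01-02 00:00:00", "2024-01-01 00:00:00"],
        "value": [2.0, 1.0],
    }
)
prepared = prepare_frame(raw)
```

In this sketch `prepared` is sorted by `event_time`, while `raw` keeps its original row order because the function operates on a copy.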

convert_to_datetime classmethod

convert_to_datetime(dataframe: DataFrame, columns: list, unit: str = 'ns', timezone: str = 'UTC') -> DataFrame

Converts specified columns from a given timestamp unit to datetime format in a target timezone.

Parameters:

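  • dataframe

    (DataFrame) –

    The DataFrame containing the data. It is copied before conversion, so the input is not modified.

  • columns

    (list) –

    A list of column names with timestamp data to convert.

  • unit

    (str, default: 'ns') –

    The unit of the timestamps ('s', 'ms', 'us', or 'ns'). Any other value raises a ValueError.

  • timezone

    (str, default: 'UTC') –

    The target timezone for the converted datetime (default is 'UTC'). Must be a name listed in pytz.all_timezones; otherwise a ValueError is raised.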

Returns:

  • DataFrame

    A copy of the input DataFrame with the listed columns converted to timezone-aware datetimes in the specified timezone.
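
The conversion described above is a two-step process: interpret the raw numbers as UTC, then convert to the target timezone. The sketch below reproduces those two steps with plain pandas calls rather than the class itself; the sample values and the 'Europe/Berlin' timezone are assumptions chosen for illustration.

```python
import pandas as pd

# Assumed sample data: epoch timestamps in milliseconds.
df = pd.DataFrame({"systime": [1_700_000_000_000, 1_700_000_060_000]})

# Step 1: interpret the integers as milliseconds since the Unix epoch, in UTC.
utc_times = pd.to_datetime(df["systime"], unit="ms", utc=True)

# Step 2: express the same instants in the target timezone.
local_times = utc_times.dt.tz_convert("Europe/Berlin")
```

The two series describe the same instants. Only the wall-clock representation differs: 22:13:20 in UTC corresponds to 23:13:20 in Berlin for the first row.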

Source code in src/ts_shape/transform/time_functions/timestamp_converter.py
import pandas as pd
import pytz

@classmethod
def convert_to_datetime(cls, dataframe: pd.DataFrame, columns: list, unit: str = 'ns', timezone: str = 'UTC') -> pd.DataFrame:
    """
    Converts specified columns from a given timestamp unit to datetime format in a target timezone.

    Args:
        dataframe (pd.DataFrame): The DataFrame containing the data.
        columns (list): A list of column names with timestamp data to convert.
        unit (str): The unit of the timestamps ('s', 'ms', 'us', or 'ns').
        timezone (str): The target timezone for the converted datetime (default is 'UTC').

    Returns:
        pd.DataFrame: A DataFrame with the converted datetime columns in the specified timezone.
    """
    # Validate unit
    valid_units = ['s', 'ms', 'us', 'ns']
    if unit not in valid_units:
        raise ValueError(f"Invalid unit '{unit}'. Must be one of {valid_units}.")

    # Validate timezone
    if timezone not in pytz.all_timezones:
        raise ValueError(f"Invalid timezone '{timezone}'. Use a valid timezone name from pytz.all_timezones.")

    df = dataframe.copy()
    for col in columns:
        # Convert timestamps to datetime in UTC first
        df[col] = pd.to_datetime(df[col], unit=unit, utc=True)
        # Adjust to the target timezone
        df[col] = df[col].dt.tz_convert(timezone)

    return df
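
Because `unit` defaults to 'ns', passing second-resolution data without setting `unit` does not raise an error; it silently yields dates close to the epoch. The sketch below illustrates this with plain pandas and an assumed sample value.

```python
import pandas as pd

# Assumed sample value: seconds since the Unix epoch.
raw = pd.Series([1_700_000_000])

# Correct unit: the value is read as seconds.
as_seconds = pd.to_datetime(raw, unit="s", utc=True)

# Wrong unit: the same value is read as nanoseconds, i.e. 1.7 seconds after the epoch.
as_nanoseconds = pd.to_datetime(raw, unit="ns", utc=True)
```

Here `as_seconds` falls in November 2023, whereas `as_nanoseconds` falls on 1 January 1970, so it is worth checking the resolution of the source data before relying on the default.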

get_dataframe

get_dataframe() -> DataFrame

Returns the processed DataFrame.

Source code in src/ts_shape/utils/base.py
def get_dataframe(self) -> pd.DataFrame:
    """Returns the processed DataFrame."""
    return self.dataframe