ts_shape.features.stats.timestamp_stats
Classes:
- TimestampStatistics – Provides class methods to calculate statistics on timestamp columns in a pandas DataFrame.
TimestampStatistics
TimestampStatistics(dataframe: DataFrame, column_name: str = 'systime')
Bases: Base
Provides class methods to calculate statistics on timestamp columns in a pandas DataFrame. The default column for calculations is 'systime'.
Parameters:
- dataframe (DataFrame) – The DataFrame to be processed.
- column_name (str, default: 'systime') – The column to sort by. If the column is not found or is not a time column, the class will attempt to detect other time columns.
Methods:
- average_time_gap – Returns the average time gap between consecutive timestamps.
- count_most_frequent_timestamp – Returns the count of the most frequent timestamp in the column.
- count_not_null – Returns the number of non-null (valid) timestamps in the column.
- count_null – Returns the number of null (NaN) values in the timestamp column.
- days_with_most_activity – Returns the top N days with the most timestamp activity.
- earliest_timestamp – Returns the earliest timestamp in the column.
- get_dataframe – Returns the processed DataFrame.
- hour_distribution – Returns the distribution of timestamps per hour of the day.
- latest_timestamp – Returns the latest timestamp in the column.
- median_timestamp – Returns the median timestamp in the column.
- month_distribution – Returns the distribution of timestamps per month.
- most_frequent_day – Returns the most frequent day of the week (0=Monday, 6=Sunday).
- most_frequent_hour – Returns the most frequent hour of the day (0-23).
- most_frequent_timestamp – Returns the most frequent timestamp in the column.
- standard_deviation_timestamps – Returns the standard deviation of the time differences between consecutive timestamps.
- timestamp_quartiles – Returns the 25th, 50th (median), and 75th percentiles of the timestamps.
- timestamp_range – Returns the time range (difference) between the earliest and latest timestamps.
- weekday_distribution – Returns the distribution of timestamps per weekday.
- year_distribution – Returns the distribution of timestamps per year.
Source code in src/ts_shape/utils/base.py
average_time_gap
classmethod
average_time_gap(dataframe: DataFrame, column_name: str = 'systime') -> Timedelta
Returns the average time gap between consecutive timestamps.
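To make "average time gap" concrete, here is a minimal pandas sketch of the computation the description suggests. It does not call ts_shape; the sample data is made up, and sorting before differencing is an assumption about how the method treats unordered input.

```python
import pandas as pd

# Made-up sample data; 'systime' matches the documented default column name.
df = pd.DataFrame({
    "systime": pd.to_datetime([
        "2024-01-01 00:00:00",
        "2024-01-01 00:00:10",
        "2024-01-01 00:00:40",
    ])
})

# Differences between consecutive timestamps. The first entry has no
# predecessor and is NaT, so it is dropped before averaging.
# Assumption: timestamps are sorted first.
gaps = df["systime"].sort_values().diff().dropna()

# Mean of the gaps (10 s and 30 s) as a pandas Timedelta.
average_gap = gaps.mean()
print(average_gap)
```

The result is a `Timedelta`, which matches the return annotation in the signature above.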
Source code in src/ts_shape/features/stats/timestamp_stats.py
count_most_frequent_timestamp
classmethod
Returns the count of the most frequent timestamp in the column.
Source code in src/ts_shape/features/stats/timestamp_stats.py
count_not_null
classmethod
Returns the number of non-null (valid) timestamps in the column.
Source code in src/ts_shape/features/stats/timestamp_stats.py
count_null
classmethod
Returns the number of null (NaN) values in the timestamp column.
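As an illustration of what the null and non-null counts mean for a timestamp column, the following sketch uses plain pandas rather than ts_shape. Note that pandas represents a missing datetime as `NaT`; treating `NaT` as the "null (NaN)" value described here is an assumption.

```python
import pandas as pd

# Made-up sample data with two missing timestamps (stored as NaT).
df = pd.DataFrame({
    "systime": pd.to_datetime(["2024-01-01", None, "2024-01-03", None])
})

# isna() flags NaT entries; notna() flags valid timestamps.
null_count = int(df["systime"].isna().sum())
not_null_count = int(df["systime"].notna().sum())

print(null_count, not_null_count)
```

The two counts always add up to the number of rows in the column.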
Source code in src/ts_shape/features/stats/timestamp_stats.py
days_with_most_activity
classmethod
Returns the top N days with the most timestamp activity.
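The signature for this method is not shown here, so the sketch below only illustrates the idea of "top N days" with plain pandas. The variable `n` is a hypothetical stand-in for whatever parameter the method actually takes, and the data is made up.

```python
import pandas as pd

# Made-up sample data spanning three calendar days.
df = pd.DataFrame({
    "systime": pd.to_datetime([
        "2024-03-01 08:00", "2024-03-01 09:30", "2024-03-01 17:45",
        "2024-03-02 10:00",
        "2024-03-03 11:00", "2024-03-03 12:00",
    ])
})

n = 2  # hypothetical: number of days to report

# Reduce each timestamp to its calendar date, count rows per date,
# and keep the n busiest dates (value_counts sorts by count, descending).
top_days = df["systime"].dt.date.value_counts().head(n)
print(top_days)
```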
Source code in src/ts_shape/features/stats/timestamp_stats.py
earliest_timestamp
classmethod
earliest_timestamp(dataframe: DataFrame, column_name: str = 'systime')
Returns the earliest timestamp in the column.
Source code in src/ts_shape/features/stats/timestamp_stats.py
get_dataframe
get_dataframe() -> DataFrame
Returns the processed DataFrame.
Source code in src/ts_shape/utils/base.py
hour_distribution
classmethod
hour_distribution(dataframe: DataFrame, column_name: str = 'systime') -> Series
Returns the distribution of timestamps per hour of the day.
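A per-hour distribution can be sketched in plain pandas as a count of rows per hour of the day. This is not the ts_shape implementation; the data is made up, and ordering the result by hour (rather than by count) is an assumption. The month, weekday, and year distributions follow the same pattern with `.dt.month`, `.dt.weekday`, and `.dt.year`.

```python
import pandas as pd

# Made-up sample data.
df = pd.DataFrame({
    "systime": pd.to_datetime([
        "2024-05-01 08:15", "2024-05-02 08:45",
        "2024-05-02 13:00", "2024-05-03 23:59",
    ])
})

# Extract the hour (0-23) of each timestamp and count occurrences.
# Assumption: results are ordered by hour.
hour_counts = df["systime"].dt.hour.value_counts().sort_index()
print(hour_counts)
```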
Source code in src/ts_shape/features/stats/timestamp_stats.py
latest_timestamp
classmethod
latest_timestamp(dataframe: DataFrame, column_name: str = 'systime')
Returns the latest timestamp in the column.
Source code in src/ts_shape/features/stats/timestamp_stats.py
median_timestamp
classmethod
median_timestamp(dataframe: DataFrame, column_name: str = 'systime')
Returns the median timestamp in the column.
Source code in src/ts_shape/features/stats/timestamp_stats.py
month_distribution
classmethod
month_distribution(dataframe: DataFrame, column_name: str = 'systime') -> Series
Returns the distribution of timestamps per month.
Source code in src/ts_shape/features/stats/timestamp_stats.py
most_frequent_day
classmethod
Returns the most frequent day of the week (0=Monday, 6=Sunday).
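The 0=Monday to 6=Sunday numbering matches the pandas `.dt.weekday` accessor. The sketch below shows one way the most frequent weekday could be derived with plain pandas; the data is made up, and how ties are broken in ts_shape is not stated here.

```python
import pandas as pd

# Made-up sample data: one Monday, two Tuesdays, one Wednesday.
df = pd.DataFrame({
    "systime": pd.to_datetime([
        "2024-01-01",  # Monday
        "2024-01-02",  # Tuesday
        "2024-01-03",  # Wednesday
        "2024-01-09",  # Tuesday
    ])
})

# .dt.weekday maps Monday to 0 and Sunday to 6; mode() returns the most
# common value(s), and .iloc[0] takes the smallest one on a tie.
most_frequent_weekday = int(df["systime"].dt.weekday.mode().iloc[0])
print(most_frequent_weekday)
```

The same pattern with `.dt.hour` gives the most frequent hour of the day.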
Source code in src/ts_shape/features/stats/timestamp_stats.py
most_frequent_hour
classmethod
Returns the most frequent hour of the day (0-23).
Source code in src/ts_shape/features/stats/timestamp_stats.py
most_frequent_timestamp
classmethod
most_frequent_timestamp(dataframe: DataFrame, column_name: str = 'systime')
Returns the most frequent timestamp in the column.
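To show how the most frequent timestamp relates to its count (see count_most_frequent_timestamp), here is a small plain-pandas sketch with made-up data. It is an illustration of the idea, not the ts_shape code.

```python
import pandas as pd

# Made-up sample data in which one timestamp appears twice.
df = pd.DataFrame({
    "systime": pd.to_datetime([
        "2024-02-01 10:00",
        "2024-02-01 10:05",
        "2024-02-01 10:05",
        "2024-02-01 10:10",
    ])
})

# mode() returns the most common value(s); take the first one.
most_frequent = df["systime"].mode().iloc[0]

# value_counts() sorts by frequency, so the first entry is the top count.
most_frequent_count = int(df["systime"].value_counts().iloc[0])

print(most_frequent, most_frequent_count)
```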
Source code in src/ts_shape/features/stats/timestamp_stats.py
standard_deviation_timestamps
classmethod
standard_deviation_timestamps(dataframe: DataFrame, column_name: str = 'systime') -> Timedelta
Returns the standard deviation of the time differences between consecutive timestamps.
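The statistic is taken over the gaps between timestamps, not over the timestamps themselves. A plain-pandas sketch with made-up data follows; it assumes the sample standard deviation (pandas' default, `ddof=1`) and sorted input, neither of which is confirmed here for ts_shape.

```python
import pandas as pd

# Made-up, irregularly sampled data: gaps of 10 s and 30 s.
df = pd.DataFrame({
    "systime": pd.to_datetime([
        "2024-01-01 00:00:00",
        "2024-01-01 00:00:10",
        "2024-01-01 00:00:40",
    ])
})

# Consecutive differences, dropping the leading NaT.
gaps = df["systime"].sort_values().diff().dropna()

# Sample standard deviation of the gaps, returned as a Timedelta.
gap_std = gaps.std()
print(gap_std)
```

A value near zero indicates evenly spaced timestamps; larger values indicate irregular sampling.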
Source code in src/ts_shape/features/stats/timestamp_stats.py
timestamp_quartiles
classmethod
timestamp_quartiles(dataframe: DataFrame, column_name: str = 'systime') -> Series
Returns the 25th, 50th (median), and 75th percentiles of the timestamps.
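Percentiles of timestamps are themselves timestamps. The sketch below illustrates this with the pandas `Series.quantile` method on made-up data; that the returned Series is indexed by 0.25, 0.5, and 0.75 is how pandas behaves, and whether ts_shape labels its result the same way is an assumption.

```python
import pandas as pd

# Made-up sample data: five consecutive days.
df = pd.DataFrame({
    "systime": pd.date_range("2024-01-01", periods=5, freq="D")
})

# quantile() accepts datetime columns and interpolates linearly.
quartiles = df["systime"].quantile([0.25, 0.5, 0.75])
print(quartiles)

# The 0.5 entry is the median timestamp.
median_ts = quartiles.loc[0.5]
```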
Source code in src/ts_shape/features/stats/timestamp_stats.py
timestamp_range
classmethod
timestamp_range(dataframe: DataFrame, column_name: str = 'systime')
Returns the time range (difference) between the earliest and latest timestamps.
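The range ties together the earliest and latest timestamps. A minimal plain-pandas sketch with made-up data is shown here; it illustrates the arithmetic only and does not call ts_shape.

```python
import pandas as pd

# Made-up sample data, deliberately unsorted.
df = pd.DataFrame({
    "systime": pd.to_datetime([
        "2024-01-01 00:00",
        "2024-01-03 12:00",
        "2024-01-02 06:00",
    ])
})

earliest = df["systime"].min()
latest = df["systime"].max()

# Subtracting two Timestamps yields a Timedelta.
time_range = latest - earliest
print(time_range)
```

Because `min` and `max` ignore row order, the input does not need to be sorted for this calculation.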
Source code in src/ts_shape/features/stats/timestamp_stats.py
weekday_distribution
classmethod
weekday_distribution(dataframe: DataFrame, column_name: str = 'systime') -> Series
Returns the distribution of timestamps per weekday.
Source code in src/ts_shape/features/stats/timestamp_stats.py
year_distribution
classmethod
year_distribution(dataframe: DataFrame, column_name: str = 'systime') -> Series
Returns the distribution of timestamps per year.
Source code in src/ts_shape/features/stats/timestamp_stats.py