ts_shape.features.stats.timestamp_stats
Classes:
- TimestampStatistics – Provides class methods to calculate statistics on timestamp columns in a pandas DataFrame.
TimestampStatistics
TimestampStatistics(dataframe: DataFrame, column_name: str = 'systime')
Bases: Base
Provides class methods to calculate statistics on timestamp columns in a pandas DataFrame. The default column for calculations is 'systime'.
Parameters:
- dataframe (DataFrame) – The DataFrame to be processed.
- column_name (str, default: 'systime') – The column to sort by. If the column is not found or is not a time column, the class will attempt to detect other time columns.
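The fallback described for column_name could look roughly like the sketch below. It is a plain-pandas illustration, not the library's actual detection logic, and the helper name `resolve_time_column` is hypothetical.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype


def resolve_time_column(df: pd.DataFrame, column_name: str = "systime") -> str:
    """Hypothetical helper (not part of ts_shape): pick a usable datetime column."""
    # Use the requested column when it exists and already holds datetimes.
    if column_name in df.columns and is_datetime64_any_dtype(df[column_name]):
        return column_name
    # Otherwise fall back to the first datetime-typed column, if there is one.
    candidates = df.select_dtypes(include=["datetime", "datetimetz"]).columns
    if len(candidates) == 0:
        raise ValueError("no datetime column found")
    return candidates[0]


df = pd.DataFrame({
    "value": [1.0, 2.0],
    "ts": pd.to_datetime(["2024-01-01 00:00", "2024-01-01 00:10"]),
})
# 'systime' is absent here, so the only datetime column, 'ts', is chosen.
print(resolve_time_column(df))
```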
Methods:
- average_time_gap – Returns the average time gap between consecutive timestamps.
- count_most_frequent_timestamp – Returns the count of the most frequent timestamp in the column.
- count_not_null – Returns the number of non-null (valid) timestamps in the column.
- count_null – Returns the number of null (NaN) values in the timestamp column.
- days_with_most_activity – Returns the top N days with the most timestamp activity.
- earliest_timestamp – Returns the earliest timestamp in the column.
- get_dataframe – Returns the processed DataFrame.
- hour_distribution – Returns the distribution of timestamps per hour of the day.
- latest_timestamp – Returns the latest timestamp in the column.
- median_timestamp – Returns the median timestamp in the column.
- month_distribution – Returns the distribution of timestamps per month.
- most_frequent_day – Returns the most frequent day of the week (0=Monday, 6=Sunday).
- most_frequent_hour – Returns the most frequent hour of the day (0-23).
- most_frequent_timestamp – Returns the most frequent timestamp in the column.
- standard_deviation_timestamps – Returns the standard deviation of the time differences between consecutive timestamps.
- timestamp_quartiles – Returns the 25th, 50th (median), and 75th percentiles of the timestamps.
- timestamp_range – Returns the time range (difference) between the earliest and latest timestamps.
- weekday_distribution – Returns the distribution of timestamps per weekday.
- year_distribution – Returns the distribution of timestamps per year.
Source code in src/ts_shape/utils/base.py
average_time_gap
classmethod
average_time_gap(dataframe: DataFrame, column_name: str = 'systime') -> Timedelta
Returns the average time gap between consecutive timestamps.
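As a rough illustration of what an average gap between consecutive timestamps means, the sketch below computes it with plain pandas. Sorting before differencing is an assumption made here; whether the library sorts first is not stated on this page.

```python
import pandas as pd

df = pd.DataFrame({
    "systime": pd.to_datetime([
        "2024-01-01 00:00",
        "2024-01-01 00:10",
        "2024-01-01 00:30",
    ])
})

# Differences between consecutive rows: NaT, 10 minutes, 20 minutes.
gaps = df["systime"].sort_values().diff()

# mean() skips the leading NaT, so this averages the 10 and 20 minute gaps.
average_gap = gaps.mean()
print(average_gap)
```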
Source code in src/ts_shape/features/stats/timestamp_stats.py
count_most_frequent_timestamp
classmethod
Returns the count of the most frequent timestamp in the column.
Source code in src/ts_shape/features/stats/timestamp_stats.py
count_not_null
classmethod
Returns the number of non-null (valid) timestamps in the column.
Source code in src/ts_shape/features/stats/timestamp_stats.py
count_null
classmethod
Returns the number of null (NaN) values in the timestamp column.
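In a datetime column, missing values show up as NaT, which pandas treats as null. The sketch below shows the null and non-null counts in plain pandas; it illustrates the idea and is not taken from the library's source.

```python
import pandas as pd

df = pd.DataFrame({
    # None is converted to NaT when the column is parsed as datetimes.
    "systime": pd.to_datetime(["2024-01-01 00:00", None, "2024-01-01 00:10"])
})

null_count = int(df["systime"].isna().sum())       # rows with a missing timestamp
not_null_count = int(df["systime"].notna().sum())  # rows with a valid timestamp
print(null_count, not_null_count)
```

The two counts always add up to the number of rows in the column.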
Source code in src/ts_shape/features/stats/timestamp_stats.py
days_with_most_activity
classmethod
Returns the top N days with the most timestamp activity.
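One way to picture "top N days" is to count rows per calendar day and keep the largest counts. The helper below is a hypothetical pandas sketch; its name and its parameter `n` are assumptions, since this page does not show the method's signature.

```python
import pandas as pd


def top_active_days(timestamps: pd.Series, n: int = 3) -> pd.Series:
    """Hypothetical helper: count rows per calendar day, keep the n busiest."""
    days = timestamps.dt.normalize()  # drop the time of day, keep the date
    # value_counts() sorts by count in descending order.
    return days.value_counts().head(n)


timestamps = pd.Series(pd.to_datetime([
    "2024-01-01 08:00", "2024-01-01 09:00", "2024-01-01 10:00",
    "2024-01-02 08:00", "2024-01-02 09:00",
    "2024-01-03 08:00",
]))
print(top_active_days(timestamps, n=2))
```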
Source code in src/ts_shape/features/stats/timestamp_stats.py
earliest_timestamp
classmethod
earliest_timestamp(dataframe: DataFrame, column_name: str = 'systime')
Returns the earliest timestamp in the column.
Source code in src/ts_shape/features/stats/timestamp_stats.py
get_dataframe
get_dataframe() -> DataFrame
Returns the processed DataFrame.
Source code in src/ts_shape/utils/base.py
hour_distribution
classmethod
hour_distribution(dataframe: DataFrame, column_name: str = 'systime') -> Series
Returns the distribution of timestamps per hour of the day.
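A per-hour distribution can be sketched in plain pandas by extracting the hour and counting occurrences. Sorting the result by hour is a choice made for this example; the ordering of the Series returned by the library is not documented here.

```python
import pandas as pd

timestamps = pd.Series(pd.to_datetime([
    "2024-01-01 09:05",
    "2024-01-02 09:45",
    "2024-01-02 14:30",
]))

# Extract the hour (0-23) of each timestamp and count how often each occurs.
hour_counts = timestamps.dt.hour.value_counts().sort_index()
print(hour_counts)
```

The same pandas pattern works for other calendar fields by swapping `.dt.hour` for `.dt.weekday`, `.dt.month`, or `.dt.year`.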
Source code in src/ts_shape/features/stats/timestamp_stats.py
latest_timestamp
classmethod
latest_timestamp(dataframe: DataFrame, column_name: str = 'systime')
Returns the latest timestamp in the column.
Source code in src/ts_shape/features/stats/timestamp_stats.py
median_timestamp
classmethod
median_timestamp(dataframe: DataFrame, column_name: str = 'systime')
Returns the median timestamp in the column.
Source code in src/ts_shape/features/stats/timestamp_stats.py
month_distribution
classmethod
month_distribution(dataframe: DataFrame, column_name: str = 'systime') -> Series
Returns the distribution of timestamps per month.
Source code in src/ts_shape/features/stats/timestamp_stats.py
most_frequent_day
classmethod
Returns the most frequent day of the week (0=Monday, 6=Sunday).
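The 0=Monday, 6=Sunday numbering matches pandas' `.dt.weekday` accessor. The sketch below finds the most common weekday in plain pandas. Taking the first mode when several weekdays tie is an assumption of this example; how the library handles ties is not stated.

```python
import pandas as pd

timestamps = pd.Series(pd.to_datetime([
    "2024-01-01",  # Monday
    "2024-01-08",  # Monday
    "2024-01-02",  # Tuesday
]))

# mode() returns every most-common value in sorted order; take the first one.
most_common_weekday = int(timestamps.dt.weekday.mode().iloc[0])
print(most_common_weekday)  # 0, i.e. Monday
```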
Source code in src/ts_shape/features/stats/timestamp_stats.py
most_frequent_hour
classmethod
Returns the most frequent hour of the day (0-23).
Source code in src/ts_shape/features/stats/timestamp_stats.py
most_frequent_timestamp
classmethod
most_frequent_timestamp(dataframe: DataFrame, column_name: str = 'systime')
Returns the most frequent timestamp in the column.
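The most frequent timestamp and its count can both be read from one `value_counts()` call. The sketch below uses plain pandas and is not taken from the library's source.

```python
import pandas as pd

timestamps = pd.Series(pd.to_datetime([
    "2024-01-01 00:00",
    "2024-01-01 00:00",
    "2024-01-01 00:10",
]))

# value_counts() sorts by count in descending order, so the first entry wins.
counts = timestamps.value_counts()
most_frequent = counts.index[0]   # the timestamp that occurs most often
occurrences = int(counts.iloc[0])  # how many times it occurs
print(most_frequent, occurrences)
```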
Source code in src/ts_shape/features/stats/timestamp_stats.py
standard_deviation_timestamps
classmethod
standard_deviation_timestamps(dataframe: DataFrame, column_name: str = 'systime') -> Timedelta
Returns the standard deviation of the time differences between consecutive timestamps.
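The sketch below measures the spread of gaps between consecutive timestamps in plain pandas. It uses the sample standard deviation (ddof=1, the pandas default) and goes through seconds as an intermediate unit. Both choices are assumptions of this example, not confirmed details of the library.

```python
import pandas as pd

timestamps = pd.Series(pd.to_datetime([
    "2024-01-01 00:00",
    "2024-01-01 00:10",
    "2024-01-01 00:30",
    "2024-01-01 01:00",
]))

# Gaps between consecutive rows: 10, 20 and 30 minutes.
gaps = timestamps.diff().dropna()

# Compute the sample standard deviation in seconds, then convert to a Timedelta.
std_seconds = gaps.dt.total_seconds().std()
gap_std = pd.Timedelta(seconds=std_seconds)
print(gap_std)
```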
Source code in src/ts_shape/features/stats/timestamp_stats.py
timestamp_quartiles
classmethod
timestamp_quartiles(dataframe: DataFrame, column_name: str = 'systime') -> Series
Returns the 25th, 50th (median), and 75th percentiles of the timestamps.
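Percentiles of timestamps can be sketched with pandas' `Series.quantile`, which accepts datetime data and interpolates linearly by default. This is an illustration of the idea rather than the library's implementation.

```python
import pandas as pd

timestamps = pd.Series(pd.to_datetime([
    "2024-01-01 00:00",
    "2024-01-01 01:00",
    "2024-01-01 02:00",
    "2024-01-01 03:00",
    "2024-01-01 04:00",
]))

# The result is a Series indexed by the requested quantiles (0.25, 0.5, 0.75).
quartiles = timestamps.quantile([0.25, 0.5, 0.75])
print(quartiles)
```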
Source code in src/ts_shape/features/stats/timestamp_stats.py
timestamp_range
classmethod
timestamp_range(dataframe: DataFrame, column_name: str = 'systime')
Returns the time range (difference) between the earliest and latest timestamps.
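In plain pandas, the range between the earliest and latest timestamps is a subtraction that yields a Timedelta. A minimal sketch:

```python
import pandas as pd

timestamps = pd.Series(pd.to_datetime([
    "2024-01-01 08:00",
    "2024-01-01 06:00",
    "2024-01-01 10:30",
]))

# max() and min() do not depend on row order, so no sorting is needed.
time_range = timestamps.max() - timestamps.min()
print(time_range)
```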
Source code in src/ts_shape/features/stats/timestamp_stats.py
weekday_distribution
classmethod
weekday_distribution(dataframe: DataFrame, column_name: str = 'systime') -> Series
Returns the distribution of timestamps per weekday.
Source code in src/ts_shape/features/stats/timestamp_stats.py
year_distribution
classmethod
year_distribution(dataframe: DataFrame, column_name: str = 'systime') -> Series
Returns the distribution of timestamps per year.
Source code in src/ts_shape/features/stats/timestamp_stats.py