string_stats
StringStatistics
StringStatistics(
dataframe: DataFrame, column_name: str = "systime"
)
Bases: Base
Provides class methods to calculate statistics on string columns in a pandas DataFrame.
count_unique
classmethod
count_unique(
dataframe: DataFrame, column_name: str = "value_string"
) -> int
Returns the number of unique strings in the column.
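As a rough illustration of the computation described above, here is a plain-pandas sketch. It does not call `StringStatistics` itself, and the sample data is made up. Whether the library counts null as a distinct value is not stated here, so the sketch assumes nulls are skipped, which is the pandas default.

```python
import pandas as pd

# Hypothetical sample data; "value_string" matches the documented default column name.
df = pd.DataFrame({"value_string": ["run", "stop", "run", None, "idle"]})

# nunique() skips nulls by default, so None is not counted as a distinct value.
unique_count = int(df["value_string"].nunique())
print(unique_count)  # 3
```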
most_frequent
classmethod
most_frequent(
dataframe: DataFrame, column_name: str = "value_string"
) -> str
Returns the most frequent string in the column.
count_most_frequent
classmethod
count_most_frequent(
dataframe: DataFrame, column_name: str = "value_string"
) -> int
Returns the count of the most frequent string in the column.
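The most frequent string and its count can both be read from a single frequency table. The following is a plain-pandas sketch with made-up data, not the library's own implementation. How ties are resolved by `most_frequent` is not documented here, so the example avoids them.

```python
import pandas as pd

# Hypothetical sample data with one clear winner ("run").
df = pd.DataFrame({"value_string": ["run", "stop", "run", None, "idle", "run"]})

# value_counts() drops nulls and sorts by descending frequency.
counts = df["value_string"].value_counts()

most_frequent = counts.index[0]            # the string itself
count_most_frequent = int(counts.iloc[0])  # how often it occurs
print(most_frequent, count_most_frequent)  # run 3
```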
count_null
classmethod
count_null(
dataframe: DataFrame, column_name: str = "value_string"
) -> int
Returns the number of null (NaN) values in the column.
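A minimal pandas sketch of a null count, using made-up data. It assumes that both `None` and `NaN` count as null, which is how `isna()` treats them.

```python
import pandas as pd

# Hypothetical sample data containing both None and a float NaN.
df = pd.DataFrame({"value_string": ["a", None, "b", float("nan")]})

# isna() flags None and NaN alike; summing the boolean mask counts them.
null_count = int(df["value_string"].isna().sum())
print(null_count)  # 2
```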
average_string_length
classmethod
average_string_length(
dataframe: DataFrame, column_name: str = "value_string"
) -> float
Returns the average length of strings in the column, excluding null values.
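The description above can be sketched in plain pandas as follows. The data is hypothetical, and this is an equivalent computation rather than the library's code.

```python
import pandas as pd

# Hypothetical sample data; the None entry must not affect the average.
df = pd.DataFrame({"value_string": ["ab", "abcd", None, "abcdef"]})

# Drop nulls first, then average the character counts: (2 + 4 + 6) / 3.
average_length = float(df["value_string"].dropna().str.len().mean())
print(average_length)  # 4.0
```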
longest_string
classmethod
longest_string(
dataframe: DataFrame, column_name: str = "value_string"
) -> str
Returns the longest string in the column.
shortest_string
classmethod
shortest_string(
dataframe: DataFrame, column_name: str = "value_string"
) -> str
Returns the shortest string in the column.
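One way to find the longest and shortest strings is to compute lengths once and look up the extremes by index. This is a sketch with made-up data. It assumes nulls are ignored, and it returns the first match when several strings share the same length; the library's tie-breaking is not documented here.

```python
import pandas as pd

# Hypothetical sample data with distinct lengths (5, 2, 8).
df = pd.DataFrame({"value_string": ["alarm", "ok", None, "shutdown"]})

strings = df["value_string"].dropna()
lengths = strings.str.len()

# idxmax()/idxmin() return the index label of the first extreme value.
longest = strings.loc[lengths.idxmax()]
shortest = strings.loc[lengths.idxmin()]
print(longest, shortest)  # shutdown ok
```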
string_length_summary
classmethod
string_length_summary(
dataframe: DataFrame, column_name: str = "value_string"
) -> pd.DataFrame
Returns a summary of string lengths, including min, max, and average lengths.
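A length summary of this kind might look like the sketch below. The column names `min_length`, `max_length`, and `average_length` are hypothetical; the actual layout of the returned DataFrame is not specified here.

```python
import pandas as pd

# Hypothetical sample data with lengths 5, 2, and 8.
df = pd.DataFrame({"value_string": ["alarm", "ok", None, "shutdown"]})

lengths = df["value_string"].dropna().str.len()

# One-row DataFrame; the column names are assumptions for illustration.
length_summary = pd.DataFrame(
    {
        "min_length": [int(lengths.min())],
        "max_length": [int(lengths.max())],
        "average_length": [float(lengths.mean())],
    }
)
print(length_summary)
```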
most_common_n_strings
classmethod
most_common_n_strings(
dataframe: DataFrame,
n: int,
column_name: str = "value_string",
) -> pd.Series
Returns the top N most frequent strings in the column.
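In plain pandas, a top-N frequency listing is typically `value_counts()` followed by `head(n)`. The sketch below uses made-up data and assumes the returned Series is indexed by string with counts as values.

```python
import pandas as pd

# Hypothetical sample data: "run" x3, "stop" x2, "idle" x1.
df = pd.DataFrame(
    {"value_string": ["run", "stop", "run", "idle", "stop", "run"]}
)

n = 2
# value_counts() sorts by descending frequency; head(n) keeps the top n.
top_n = df["value_string"].value_counts().head(n)
print(top_n)
```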
contains_substring_count
classmethod
contains_substring_count(
dataframe: DataFrame,
substring: str,
column_name: str = "value_string",
) -> int
Counts how many strings contain the specified substring.
starts_with_count
classmethod
starts_with_count(
dataframe: DataFrame,
prefix: str,
column_name: str = "value_string",
) -> int
Counts how many strings start with the specified prefix.
ends_with_count
classmethod
ends_with_count(
dataframe: DataFrame,
suffix: str,
column_name: str = "value_string",
) -> int
Counts how many strings end with the specified suffix.
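The three counting methods above (substring, prefix, suffix) correspond to pandas string predicates. The following sketch uses made-up data and assumes literal, case-sensitive matching with nulls excluded; whether the library treats `substring` as a regular expression is not stated here.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame(
    {"value_string": ["error_01", "warn_02", "error_timeout", None, "ok"]}
)

strings = df["value_string"].dropna()

# regex=False forces a literal substring match.
contains_count = int(strings.str.contains("_0", regex=False).sum())
starts_count = int(strings.str.startswith("error").sum())
ends_count = int(strings.str.endswith("out").sum())
print(contains_count, starts_count, ends_count)  # 2 2 1
```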
uppercase_percentage
classmethod
uppercase_percentage(
dataframe: DataFrame, column_name: str = "value_string"
) -> float
Returns the percentage of strings that are fully uppercase.
lowercase_percentage
classmethod
lowercase_percentage(
dataframe: DataFrame, column_name: str = "value_string"
) -> float
Returns the percentage of strings that are fully lowercase.
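A sketch of the uppercase and lowercase percentages in plain pandas, with made-up data. It assumes the result is on a 0 to 100 scale and that the denominator is the number of non-null strings; neither detail is confirmed by the descriptions above.

```python
import pandas as pd

# Hypothetical sample data: 2 of 4 non-null strings are fully uppercase,
# 1 of 4 is fully lowercase, and "Warn" is mixed case.
df = pd.DataFrame({"value_string": ["OK", "fail", "Warn", None, "STOP"]})

strings = df["value_string"].dropna()

# The mean of a boolean mask is the fraction of True values.
uppercase_pct = float(strings.str.isupper().mean() * 100)
lowercase_pct = float(strings.str.islower().mean() * 100)
print(uppercase_pct, lowercase_pct)  # 50.0 25.0
```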
contains_digit_count
classmethod
contains_digit_count(
dataframe: DataFrame, column_name: str = "value_string"
) -> int
Counts how many strings contain at least one digit.
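Counting strings with a digit can be expressed as a regular-expression match. This is a plain-pandas sketch with made-up data, assuming nulls are excluded.

```python
import pandas as pd

# Hypothetical sample data: "pump1" and "m2x" contain a digit.
df = pd.DataFrame({"value_string": ["pump1", "valve", None, "m2x", "idle"]})

strings = df["value_string"].dropna()

# r"\d" matches any single digit anywhere in the string.
digit_count = int(strings.str.contains(r"\d").sum())
print(digit_count)  # 2
```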
summary_as_dict
classmethod
summary_as_dict(
dataframe: DataFrame, column_name: str
) -> Dict[str, Union[int, str, float]]
Returns a dictionary with comprehensive string statistics for the specified column.
summary_as_dataframe
classmethod
summary_as_dataframe(
dataframe: DataFrame, column_name: str
) -> pd.DataFrame
Returns a DataFrame with comprehensive string statistics for the specified column.
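To illustrate how a summary dictionary and its DataFrame form relate, the sketch below builds a small dictionary of statistics and wraps it in a one-row DataFrame. The keys shown are hypothetical; the actual keys and the full set of statistics returned by `summary_as_dict` and `summary_as_dataframe` are not listed here.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"value_string": ["run", "stop", "run", None, "idle"]})

column = df["value_string"]
strings = column.dropna()

# Dictionary keys are assumptions chosen for illustration only.
summary = {
    "unique_values": int(strings.nunique()),
    "most_frequent": strings.value_counts().index[0],
    "null_count": int(column.isna().sum()),
    "average_length": float(strings.str.len().mean()),
}

# A list with one dict becomes a one-row DataFrame, one column per key.
summary_df = pd.DataFrame([summary])
print(summary_df)
```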