Module pydbsmgr.main

Functions

def check_dtypes(dataframe: pandas.core.frame.DataFrame, datatypes: pandas.core.series.Series) ‑> pandas.core.frame.DataFrame

Checks and updates the data types of columns in a DataFrame.

Parameters

dataframe : DataFrame
The DataFrame to check and update the data types.
datatypes : Series
The Series containing the desired data types for each column in the DataFrame.

Returns

dataframe : DataFrame
The DataFrame with updated data types.
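The following is a minimal sketch of the idea, not the library's implementation. It assumes `datatypes` is indexed by column name and holds dtype names that `astype` accepts. The function name `cast_columns_sketch` is hypothetical.

```python
import pandas as pd


def cast_columns_sketch(dataframe: pd.DataFrame, datatypes: pd.Series) -> pd.DataFrame:
    """Hypothetical stand-in: cast each column to the dtype named in `datatypes`."""
    out = dataframe.copy()  # leave the caller's frame untouched
    for column, dtype in datatypes.items():
        # Assumption: entries for columns that are absent are simply skipped.
        if column in out.columns:
            out[column] = out[column].astype(dtype)
    return out


df = pd.DataFrame({"id": ["1", "2"], "score": ["0.5", "1.5"]})
wanted = pd.Series({"id": "int64", "score": "float64"})
converted = cast_columns_sketch(df, wanted)
```

Here `converted` holds integer `id` values and floating-point `score` values, while `df` keeps its original strings.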
def check_if_contains_dates(input_string: str) ‑> bool

Checks whether a string contains a date.
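The date formats this function recognizes are not listed here. As a rough illustration only, a check of this kind could be sketched with a regular expression. The name `contains_date_sketch` and the three date shapes below are assumptions.

```python
import re

# Assumed date shapes: 2024-01-31, 31/01/2024 or 31-01-2024.
DATE_RE = re.compile(r"\b(?:\d{4}-\d{2}-\d{2}|\d{2}[/-]\d{2}[/-]\d{4})\b")


def contains_date_sketch(input_string: str) -> bool:
    """Hypothetical stand-in: True if any assumed date shape occurs in the string."""
    return DATE_RE.search(input_string) is not None
```

With these assumptions, `contains_date_sketch("Invoice issued on 2024-01-31")` is true and `contains_date_sketch("order 12345")` is false.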

def check_if_isemail(check_email: str) ‑> Tuple[str, bool]

Checks if a string is an email address and returns the cleaned string and a flag indicating if the string is an email.

Parameters

check_email : str
The input string to be checked for an email address.

Returns

check_email, found_email : Tuple[str, bool]
A tuple containing the cleaned string and a boolean flag indicating if an email address was found.
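The tuple-returning contract can be illustrated with a small sketch. It is not the library's code: the name `find_email_sketch`, the simplified address pattern, and the choice to return only the matched address as the "cleaned" string are all assumptions.

```python
import re
from typing import Tuple

# Simplified address pattern, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_email_sketch(check_email: str) -> Tuple[str, bool]:
    """Hypothetical stand-in: return (cleaned string, whether an email was found)."""
    match = EMAIL_RE.search(check_email)
    if match is None:
        # Nothing found: hand the input back unchanged with a False flag.
        return check_email, False
    # Assumption: "cleaned" means the bare address without surrounding text.
    return match.group(0), True
```

Under these assumptions, `find_email_sketch("  Contact: jane.doe@example.com ")` yields `("jane.doe@example.com", True)`.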
def clean(dirty_string: str,
pattern: str = '[a-zA-Zñáéíóú_@.0-9]+\\b',
no_emoji: bool = False,
title_mode: bool = False) ‑> str

Cleans a string of special characters.

Parameters

dirty_string : str
The string to be cleaned.
pattern : str
The regular expression used for cleaning.

Returns

result : str
The cleaned string.
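How `pattern` is applied is not spelled out above. One plausible reading, sketched here as an assumption, is that the pattern matches the tokens to keep and everything else is discarded. The name `clean_sketch` is hypothetical, and `no_emoji` and `title_mode` are left out.

```python
import re

# The default pattern from the signature, written as a raw string.
DEFAULT_PATTERN = r"[a-zA-Zñáéíóú_@.0-9]+\b"


def clean_sketch(dirty_string: str, pattern: str = DEFAULT_PATTERN) -> str:
    """Hypothetical stand-in: keep only the tokens matched by `pattern`."""
    # Assumption: matched tokens are re-joined with single spaces.
    return " ".join(re.findall(pattern, dirty_string))


print(clean_sketch("¡Hola, señor! #2024"))  # -> Hola señor 2024
```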
def clean_and_convert_to(x: str) ‑> str

Performs cleaning and some conversions on a str.

Parameters

x : str
The input string to be cleaned and converted.

Returns

x : str
The cleaned and converted string.
def clean_transform(col_index: pandas.core.indexes.base.Index,
mode: bool = True,
remove_spaces: bool = True,
remove_numeric: bool = True) ‑> List[str]

Transforms a column index by cleaning the column names and, if requested, capitalizing them.

Parameters

col_index : Index
The column index to be transformed.
mode : bool
Indicates if names will be capitalized. By default it is set to True.
remove_spaces : bool
Indicates if spaces will be removed. By default it is set to True.
remove_numeric : bool
Indicates if numeric characters will be removed. By default it is set to True.

Returns

col_name_list : List[str]
The transformed column names as a list of strings.
def clean_transform_helper(col: str,
mode: bool = True,
remove_numeric: bool = True,
remove_spaces: bool = True) ‑> str

Transforms a column name by cleaning it and, if requested, capitalizing it.

Parameters

col : str
The column name to be transformed.
mode : bool
Indicates if names will be capitalized. By default it is set to True.
remove_numeric : bool
Indicates if numeric characters will be removed. By default it is set to True.
remove_spaces : bool
Indicates if spaces will be removed. By default it is set to True.

Returns

col_name : str
The transformed column name.
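The three flags can be pictured with a short sketch. This is an illustration under assumptions, not the library's code: "capitalized" is read as upper-casing, the order of the steps is a guess, and the name `transform_name_sketch` is hypothetical.

```python
import re


def transform_name_sketch(
    col: str,
    mode: bool = True,
    remove_numeric: bool = True,
    remove_spaces: bool = True,
) -> str:
    """Hypothetical stand-in for a column-name clean-up driven by three flags."""
    name = col.strip()
    if remove_numeric:
        name = re.sub(r"\d", "", name)  # drop digits
    if remove_spaces:
        name = name.replace(" ", "")  # drop spaces
    if mode:
        name = name.upper()  # assumption: "capital" means upper case
    return name


print(transform_name_sketch(" unit price 2 "))  # -> UNITPRICE
```

Applying the same helper to every entry of a column index, for example with a list comprehension, gives the list-of-strings shape that `clean_transform` returns.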
def clearConsole()
def convert_date(date_string: str) ‑> str

Converts a date string to the YYYY-MM-DD format, which is compatible with datetime64[ns].

Parameters

date_string : str
The input string representing a date.

Returns

proper_date : str
The date string in the proper format YYYY-MM-DD.
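The input formats accepted by this function are not listed here. A conversion of this kind could be sketched by trying a few candidate formats with the standard library. The name `to_iso_date_sketch`, the format list, and the fallback of returning the input unchanged are assumptions.

```python
from datetime import datetime

# Assumed input formats, tried in order.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


def to_iso_date_sketch(date_string: str) -> str:
    """Hypothetical stand-in: normalize a date string to YYYY-MM-DD."""
    text = date_string.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # this format did not fit; try the next one
    # Assumption: unparseable input is returned as-is.
    return date_string


print(to_iso_date_sketch("31/12/2023"))  # -> 2023-12-31
```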
def correct_nan(check_missing: str) ‑> str

Replaces an incorrectly formatted missing value in a string with an empty string.

Parameters

check_missing : str
The string to be checked for incorrect missing value format.

Returns

check_missing : str
The corrected string format or empty str.
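Which spellings count as a missing value is not stated above. The sketch below assumes a small, hypothetical set of tokens and shows the general shape of such a correction. The name `normalize_missing_sketch` is also an assumption.

```python
# Assumed spellings of "missing"; the library's actual list may differ.
MISSING_TOKENS = {"nan", "none", "null", "<na>", "nat", ""}


def normalize_missing_sketch(check_missing: str) -> str:
    """Hypothetical stand-in: map missing-value spellings to an empty string."""
    if check_missing.strip().lower() in MISSING_TOKENS:
        return ""
    # Anything else is passed through untouched.
    return check_missing
```

With this token set, `"NaN"` and `" null "` become `""`, while `"Nancy"` is returned unchanged.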
def drop_empty_columns(df_: pandas.core.frame.DataFrame) ‑> pandas.core.frame.DataFrame

Removes empty columns from a DataFrame.
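What counts as "empty" is not defined above. Assuming it means a column in which every value is missing, the behaviour can be approximated with plain pandas. The name `drop_all_missing_columns_sketch` is hypothetical.

```python
import pandas as pd


def drop_all_missing_columns_sketch(df_: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in: drop columns whose values are all missing."""
    return df_.dropna(axis=1, how="all")


df = pd.DataFrame({"a": [1, 2], "b": [None, None], "c": ["x", None]})
trimmed = drop_all_missing_columns_sketch(df)
```

Column `b` is dropped because it has no values at all. Column `c` stays because it is only partly missing.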

def get_date_format(input_string: str) ‑> str

Infer the date format from a given string.

def intersection_cols(dfs_: List[pandas.core.frame.DataFrame]) ‑> pandas.core.frame.DataFrame

Resolves column issues across a list of DataFrames.

Parameters

dfs_ : List[DataFrame]
The list of DataFrames whose columns are to be resolved.

Returns

dfs_ : List[DataFrame]
The list of dataframes with the corrections in their columns (intersection).
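Reading "intersection" as "keep only the columns shared by every DataFrame", the operation might look like the sketch below. The name `keep_common_columns_sketch` and the choice to follow the first frame's column order are assumptions.

```python
from typing import List

import pandas as pd


def keep_common_columns_sketch(dfs_: List[pd.DataFrame]) -> List[pd.DataFrame]:
    """Hypothetical stand-in: restrict every frame to the shared columns."""
    if not dfs_:
        return []
    common = set(dfs_[0].columns)
    for frame in dfs_[1:]:
        common &= set(frame.columns)
    # Assumption: use the first frame's column order for all results.
    ordered = [col for col in dfs_[0].columns if col in common]
    return [frame[ordered] for frame in dfs_]


a = pd.DataFrame({"id": [1], "name": ["x"], "extra": [0]})
b = pd.DataFrame({"name": ["y"], "id": [2]})
aligned = keep_common_columns_sketch([a, b])
```

Both frames in `aligned` end up with the columns `id` and `name`, in that order, which makes them safe to concatenate.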
def is_number_regex(s: str) ‑> bool

Returns True if string is a number.
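The exact pattern used by the library is not shown here. A regex-based number check could be sketched as follows, assuming plain signed integers and decimals without exponents. The name `is_number_sketch` is hypothetical.

```python
import re

# Optional sign, then digits with an optional fraction, or a bare fraction.
NUMBER_RE = re.compile(r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)")


def is_number_sketch(s: str) -> bool:
    """Hypothetical stand-in: True if the whole string is a plain number."""
    return NUMBER_RE.fullmatch(s) is not None
```

Under this pattern `"42"`, `"-3.14"` and `".5"` are numbers, while `"1e5"`, `"abc"` and the empty string are not.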

def remove_char(input_string: str) ‑> str

Removes special characters from a string.

Parameters

input_string : str
The input string from which characters will be removed.

Returns

input_string : str
The string with specified characters removed.
def remove_numeric_char(input_string: str) ‑> str

Remove all numeric characters from a string.

Parameters

input_string : str
The string to be cleaned of numeric characters.

Returns

str
The string with all numeric characters removed.
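As a minimal sketch of the same idea using only built-ins (the name `strip_digits_sketch` is hypothetical, and the library may implement this differently):

```python
def strip_digits_sketch(input_string: str) -> str:
    """Hypothetical stand-in: drop every character for which str.isdigit() is true."""
    return "".join(ch for ch in input_string if not ch.isdigit())


print(strip_digits_sketch("abc123def4"))  # -> abcdef
```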