Helper Functions Table of Contents
- Importing: pd_np_mpl_import; url_import; yf_import; import_all
- Utilities: css_styling; ser; dfme; sp, p, & d; get_complementary; time conversion
- Display: pretty; header_text; head_tail_vert; head_tail_horz; see & multi; missing_values
- Styles: get_stylers; df_style_string; style_df; fancy_style_df; apply_style
- Time Series: timeseries_overview; featurize_dt_index; add_change_column; add_lags; boxplot_correlation; get_accuracy; get_daily_error
Evan Marie online: | EvanMarie.com | EvanMarie@Proton.me | LinkedIn | GitHub | Hugging Face | Mastodon | Jovian.ai | TikTok | CodeWars | Discord ⇨ ✨ EvanMarie ✨#6114 |

Defining Colors for Function Use:


Importing Helpers

pd_np_mpl_import() [Return to Top]
Imports Pandas, NumPy, Matplotlib.pyplot, Reload, and Seaborn

url_import() [Return to Top]
Imports urlretrieve()

yf_import() [Return to Top]
Imports yfinance, the Yahoo Finance module

import_all() [Return to Top]
Runs whichever of the import helpers above it contains, so that most of my importing consists of importing the helpers file and calling this one function


Utility Helpers

css_styling() [Return to Top]
Imports the custom_styles.css file to be used with the notebook

ser() [Return to Top]
Converts the passed data to a Pandas Series; a shortcut for pd.Series()

dfme() [Return to Top]
Converts the passed data to a Pandas DataFrame; a shortcut for pd.DataFrame()
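
The two shortcuts above are thin wrappers, so a minimal sketch is easy to give. Passing extra keyword arguments through to Pandas is an assumption here, not something the description confirms.

```python
import pandas as pd


def ser(data, **kwargs):
    """Shortcut for pd.Series(); keyword pass-through is an assumption."""
    return pd.Series(data, **kwargs)


def dfme(data, **kwargs):
    """Shortcut for pd.DataFrame(); keyword pass-through is an assumption."""
    return pd.DataFrame(data, **kwargs)


s = ser([1, 2, 3], name="values")
frame = dfme({"a": [1, 2], "b": [3, 4]})
```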

sp(), p(), & d() [Return to Top]
Three more shortcut functions:
- sp() - makes an empty row of space
- p() - shortcut for print()
- d() - shortcut for IPython display()
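
A minimal sketch of how these three shortcuts might be written. The fallback to print() when IPython is not installed is an assumption added so the sketch also runs outside a notebook.

```python
def p(*args, **kwargs):
    """Shortcut for print()."""
    print(*args, **kwargs)


def sp():
    """Print one empty row of space."""
    print()


def d(obj):
    """Shortcut for IPython's display(); falls back to print() (assumption)."""
    try:
        from IPython.display import display
    except ImportError:
        print(obj)
    else:
        display(obj)


p("rows:", 3)
sp()
```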

get_complementary() [Return to Top]
Pass a hexcode for a color as a string and get the hexcode string for the complementary color.
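
One common way to compute a complementary color is to invert each RGB channel. The sketch below assumes that definition and a six-digit hexcode; the helper in this file may use a different method (for example a 180° hue rotation), so the name get_complementary_sketch is used to keep the two apart.

```python
def get_complementary_sketch(hex_color):
    """Return the RGB-inverted color for a six-digit hexcode string.

    Channel inversion is an assumed definition of "complementary".
    """
    digits = hex_color.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected a six-digit hexcode such as '#ff0000'")
    # Split into red, green, and blue channel values (0-255).
    r, g, b = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return "#{:02x}{:02x}{:02x}".format(255 - r, 255 - g, 255 - b)


complement = get_complementary_sketch("#ff0000")
```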

Timestamp Conversion Functions [Return to Top]
These functions are used within the display functions. When working with time series data whose timestamps include intraday values, a user can choose to show just the day (intraday = False, the default) or the entire timestamp (intraday = True).
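
The intraday switch can be illustrated with a small sketch. The function name format_timestamps and the two format strings are hypothetical; only the intraday parameter and its default come from the description above.

```python
import pandas as pd


def format_timestamps(index, intraday=False):
    """Format a DatetimeIndex as strings (hypothetical helper).

    intraday=False keeps only the date; intraday=True keeps the time too.
    """
    fmt = "%Y-%m-%d %H:%M:%S" if intraday else "%Y-%m-%d"
    return index.strftime(fmt)


idx = pd.to_datetime(["2022-01-01 06:30:00", "2022-01-02 18:00:00"])
days_only = list(format_timestamps(idx))
full = list(format_timestamps(idx, intraday=True))
```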


Display Helpers

pretty() [Return to Top]
Displays non-series and non-df data in a clean and attractive way. Pass, for example, array data followed by a label, and it will print the data with a header label as passed.

header_text() [Return to Top]
This function can be used to output any string data as a header for other output data to help organize a notebook's outputs. It is used within many other display functions to create the headers/titles for the data they are displaying.

describe_em() [Return to Top]
Outputs the describe() data individually for each column name passed in col_list.
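
A rough sketch of the per-column idea, assuming the helper takes a dataframe and col_list. The name describe_columns and the choice to return a dictionary (rather than display each summary) are assumptions made for illustration.

```python
import pandas as pd


def describe_columns(df, col_list):
    """Return one describe() summary per requested column (hypothetical)."""
    return {col: df[col].describe() for col in col_list}


df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [10.0, 20.0, 30.0],
    "c": [0, 0, 1],
})
summaries = describe_columns(df, ["a", "b"])
for col, summary in summaries.items():
    print(col)
    print(summary)
```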

head_tail_vert() [Return to Top]
Displays the head and tail portions of a passed df according to the num passed and includes a header/title description for the data being displayed for easy notebook navigation and data comprehension. This function will display the head and tail vertically on top of one another.

head_tail_horz() [Return to Top]
Displays the head and tail portions of a passed df according to the num passed and includes a header/title description for the data being displayed for easy notebook navigation and data comprehension. This function will display the head and tail horizontally beside one another. If the current window size is too small, it will stack them on top of one another and display identically to head_tail_vert().
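
The side-by-side layout with automatic stacking can be approximated with a CSS flex container that wraps. This is only a sketch under that assumption: head_tail_html is a hypothetical name, and the real helper also adds a header and custom styling.

```python
import pandas as pd


def head_tail_html(df, num=5):
    """Build HTML showing head and tail next to each other (hypothetical).

    flex-wrap lets the second table drop below the first in narrow windows.
    """
    head_html = df.head(num).to_html()
    tail_html = df.tail(num).to_html()
    return (
        '<div style="display: flex; flex-wrap: wrap; gap: 2em;">'
        "<div>{}</div><div>{}</div></div>"
    ).format(head_html, tail_html)


df = pd.DataFrame({"value": range(20)})
html = head_tail_html(df, num=3)
```

In a notebook, the resulting string could be rendered with IPython's HTML and display.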

see() [Return to Top]
Used for displaying any dataframe or series data along with a descriptive header / title.

multi() [Return to Top]
This function takes a list of tuples of dataframes or series and their corresponding headers/titles, and displays them side by side, horizontally for easier data comparison.

missing_values() [Return to Top]
Calculates all the missing values in a dataframe and displays the column names and how many missing values each has.
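
The counting step can be sketched with isna() and sum(). The name missing_counts is hypothetical, and dropping columns with no missing values is an assumption about how the helper reports its results.

```python
import numpy as np
import pandas as pd


def missing_counts(df):
    """Count missing values per column, keeping only affected columns."""
    counts = df.isna().sum()
    return counts[counts > 0]


df = pd.DataFrame({
    "a": [1, np.nan, 3],
    "b": [np.nan, np.nan, 6],
    "c": [7, 8, 9],
})
missing = missing_counts(df)
print(missing)
```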


Styling Helpers

get_stylers() [Return to Top]
This function is used with style_df() to convert a style dictionary into a styler that will be used to display a dataframe.

df_style_string() [Return to Top]
Pass a dataframe name as a string, and this returns the style string needed for styling a dataframe, to which the df_stylers can be passed. It eliminates the index and column axes for clean display. The output string can then be used in other code to style a dataframe. Note: the hiding of the index and columns can be configured after cutting and pasting the output string.

style_df() [Return to Top]
Takes a dataframe and a styles dictionary and returns the dataframe with the styles passed.

- style.Styler.set_table_styles() documentation
- Pandas table visualization documentation
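
As a sketch of how a style dictionary might become input for set_table_styles(): the dictionary layout (CSS selector mapped to property/value pairs) and the names to_table_styles and style_df_sketch are assumptions, since the dictionary format used by get_stylers() is not shown here. Only the list-of-dicts format with "selector" and "props" keys comes from Pandas itself.

```python
import pandas as pd


def to_table_styles(style_dict):
    """Convert {selector: {property: value}} into set_table_styles() input."""
    return [
        {"selector": selector, "props": list(props.items())}
        for selector, props in style_dict.items()
    ]


def style_df_sketch(df, style_dict):
    """Return a Styler with the table styles applied (needs jinja2)."""
    return df.style.set_table_styles(to_table_styles(style_dict))


styles = {
    "th": {"background-color": "#333333", "color": "white"},
    "td": {"text-align": "center"},
}
table_styles = to_table_styles(styles)
# In a notebook: style_df_sketch(pd.DataFrame({"a": [1, 2]}), styles)
```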

fancy_style_df() [Return to Top]
Utilizes the various Pandas df stylers and combines them into one function to produce the following options: (see the documentation for each below for more details on usage)

- background_gradient
- bars (a bar graph for each value)
- highlight_max
- highlight_min
- highlight_null
- highlight_between
- highlight_quantile
# DF with no styling
# DF with background_gradient
# DF with bars
# DF with highlight_max, axis = 1
# DF with highlight_max, axis = 0
# DF with highlight_min, axis = 1
# DF with highlight_min, axis = 0
# DF with highlight_null
# DF with highlight_between, range 0.2 - 1, subset of columns
# DF with highlight_quantile, range 0.0002 - 0.5
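
One way to combine these stylers behind a single function is to dispatch on the name of the Styler method. The sketch below assumes that design; fancy_style_sketch and its kind parameter are hypothetical, while the method names and keyword arguments in the commented calls are those of the Pandas Styler (note the Pandas method is bar, not bars).

```python
import pandas as pd

SUPPORTED = (
    "background_gradient", "bar", "highlight_max", "highlight_min",
    "highlight_null", "highlight_between", "highlight_quantile",
)


def fancy_style_sketch(df, kind, **kwargs):
    """Apply one built-in Styler method chosen by name (hypothetical)."""
    if kind not in SUPPORTED:
        raise ValueError("unsupported style: {}".format(kind))
    # df.style needs jinja2; background_gradient also needs matplotlib.
    return getattr(df.style, kind)(**kwargs)


df = pd.DataFrame({"a": [0.1, 0.5, 0.9], "b": [0.3, None, 0.7]})
# In a notebook:
# fancy_style_sketch(df, "highlight_max", axis=0)
# fancy_style_sketch(df, "highlight_between", left=0.2, right=1, subset=["a"])
# fancy_style_sketch(df, "highlight_quantile", q_left=0.0002, q_right=0.5)
```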

apply_style() [Return to Top]
Applies a style function to a passed df and returns the styled df.
# apply_style() with a defined function
# apply_style() with a lambda function
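
A sketch of both usages, assuming the helper wraps Styler.apply(), which calls the style function once per column (axis=0) and expects one CSS string per cell. The names apply_style_sketch and highlight_negative are hypothetical.

```python
import pandas as pd


def apply_style_sketch(df, style_func, axis=0):
    """Return a Styler with style_func applied along axis (needs jinja2)."""
    return df.style.apply(style_func, axis=axis)


def highlight_negative(column):
    """Return one CSS string per cell: red text for negative values."""
    return ["color: red" if value < 0 else "" for value in column]


df = pd.DataFrame({"a": [1, -2], "b": [-3, 4]})
css_for_a = highlight_negative(df["a"])
# In a notebook, with a defined function:
# apply_style_sketch(df, highlight_negative)
# With a lambda function:
# apply_style_sketch(
#     df, lambda col: ["font-weight: bold" if v == col.max() else "" for v in col]
# )
```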


Time Series Helpers

timeseries_overview() [Return to Top]
Produces an overview of timeseries data and important information about the key metric column passed.
# data for this section will be energy consumption in megawatts in the UK from 2009 through 2022

add_change_column() [Return to Top]
Adds a column to time series data for the change from one timestamp to another for the given column.
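
The change between consecutive timestamps can be sketched with diff(). The "_change" suffix, the new_column parameter, and the column name "load" are assumptions for illustration.

```python
import pandas as pd


def add_change_column_sketch(df, column, new_column=None):
    """Add the row-to-row change of column as a new column (hypothetical)."""
    result = df.copy()
    result[new_column or column + "_change"] = result[column].diff()
    return result


idx = pd.date_range("2022-01-01", periods=3, freq="D")
df = pd.DataFrame({"load": [100.0, 110.0, 105.0]}, index=idx)
changed = add_change_column_sketch(df, "load")
```

The first row has no previous timestamp, so its change is NaN.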

featurize_dt_index() [Return to Top]
Creates time series features from a datetime index.
# featurize_dt_index() with daytime = False
# featurize_dt_index() with daytime = True
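
A sketch of typical datetime-index features. The column names are assumptions, and so is the meaning given to daytime here (adding an hour-of-day column); the actual helper may define that flag differently.

```python
import pandas as pd


def featurize_dt_index_sketch(df, daytime=False):
    """Add calendar features derived from a DatetimeIndex (hypothetical)."""
    result = df.copy()
    idx = result.index
    result["year"] = idx.year
    result["month"] = idx.month
    # isocalendar() returns a frame; convert to plain integers before assigning.
    result["week_of_year"] = idx.isocalendar().week.astype(int).to_numpy()
    result["day_of_week"] = idx.dayofweek  # Monday = 0
    result["day_of_year"] = idx.dayofyear
    if daytime:
        # Assumed meaning of daytime: include the hour of day.
        result["hour"] = idx.hour
    return result


idx = pd.date_range("2022-01-03", periods=3, freq="D")
df = pd.DataFrame({"load": [1.0, 2.0, 3.0]}, index=idx)
features = featurize_dt_index_sketch(df)
```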

add_lags() [Return to Top]
Takes a dataframe and a list of lags and their column labels and creates lag features for the given lags and the key metric passed.
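
Lag features can be sketched with shift(). This version assumes the lags are given as (label, number of rows) pairs; the real helper may instead express lags as time offsets, so treat the signature as hypothetical.

```python
import pandas as pd


def add_lags_sketch(df, lags, column):
    """Add one shifted copy of column per (label, periods) pair."""
    result = df.copy()
    for label, periods in lags:
        result[label] = result[column].shift(periods)
    return result


idx = pd.date_range("2022-01-01", periods=4, freq="D")
df = pd.DataFrame({"load": [1.0, 2.0, 3.0, 4.0]}, index=idx)
lagged = add_lags_sketch(df, [("lag_1", 1), ("lag_2", 2)], "load")
```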

boxplot_correlation() [Return to Top]
Creates a boxplot for viewing the correlation between a variable and the key metric in time series data.
# boxplot_correlation: week of year vs energy consumption
# boxplot_correlation: month of year vs energy consumption
# boxplot_correlation: time of day vs energy consumption

get_accuracy() [Return to Top]
Calculates the RMSE score, the absolute accuracy, the relative accuracy, the Sharpe ratio, and the overall accuracy for predictions and target data.
# for the next examples, predictions from a trained model are needed
# preparing the data for a regressor model
# defining the model, training, and getting predictions
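
Of the metrics get_accuracy() reports, RMSE has a standard definition, so it can be sketched on its own. How the helper defines its accuracy scores and Sharpe ratio is not described here, so they are left out rather than guessed; the function name rmse is hypothetical.

```python
import numpy as np


def rmse(predictions, targets):
    """Root mean squared error between predictions and targets."""
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return float(np.sqrt(np.mean((predictions - targets) ** 2)))


score = rmse([2.0, 4.0], [0.0, 0.0])  # sqrt((4 + 16) / 2)
```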

get_daily_error() [Return to Top]
Calculates the daily error for a key metric from the prediction results dataframe.
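
A sketch of one plausible reading of "daily error": the mean absolute difference between target and prediction, grouped by calendar day. That definition, the function name, and the column names "load" and "prediction" are all assumptions.

```python
import pandas as pd


def get_daily_error_sketch(results, target_col, prediction_col="prediction"):
    """Mean absolute error per calendar day (assumed definition)."""
    errors = (results[target_col] - results[prediction_col]).abs()
    return errors.groupby(results.index.date).mean()


idx = pd.to_datetime(["2022-01-01 00:00", "2022-01-01 12:00", "2022-01-02 00:00"])
results = pd.DataFrame(
    {"load": [10.0, 20.0, 30.0], "prediction": [12.0, 16.0, 30.0]},
    index=idx,
)
daily = get_daily_error_sketch(results, "load")
```

Sorting the result in descending order would surface the days with the worst predictions first.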

[Return to Top]