Waterfall Plot#
A waterfall plot shows the cumulative effect of sequentially introduced positive or negative values.
To use it, import the ‘bistro’ module (lets_plot.bistro):
import pandas as pd
from lets_plot import *
from lets_plot.bistro import *
LetsPlot.setup_html()
data = {
    "Accounts": ["Product revenue", "Services revenue", "Fixed costs", "Variable costs"],
    "Values": [830_000, 290_000, -360_000, -150_000],
}
Default View#
waterfall_plot(data, "Accounts", "Values")
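To make the "cumulative effect" concrete, the running totals behind the bars can be worked out by hand. The sketch below is plain Python and does not call Lets-Plot. It assumes that each bar starts where the previous one ended and that the final total bar equals the last running value; the variable names are illustrative only.

```python
from itertools import accumulate

accounts = ["Product revenue", "Services revenue", "Fixed costs", "Variable costs"]
values = [830_000, 290_000, -360_000, -150_000]

# Running total after each account has been applied.
ends = list(accumulate(values))
# Each bar starts where the previous one ended; the first one starts at zero.
starts = [0] + ends[:-1]

for name, start, end in zip(accounts, starts, ends):
    print(f"{name:17} {start:>9,} -> {end:>9,}")

# The total is simply the last running value.
total = ends[-1]
print(f"{'Total':17} {total:>9,}")  # Total 610,000
```

For this data set the running value climbs to 1,120,000 after both revenue lines and falls to 610,000 once both cost lines are subtracted.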
Improved View#
waterfall_plot(data, "Accounts", "Values",
               size=.75, width=.8, total_title="Profit",
               hline=element_line(linetype='solid', size=1),
               connector=element_line(linetype='dotted'),
               label=element_text(size=20, family="Courier", face='bold'),
               label_format="$~s") + \
    scale_y_continuous(name="Values", format="$~s") + \
    ggtitle("Company Profit (in USD)") + \
    ggsize(1000, 500) + \
    theme_minimal() + \
    theme(plot_title=element_text(size=20, face='bold', hjust=.5))
Additional Parameters#
measure and group#
df = pd.DataFrame({
    "Company": ["Badgersoft"] * 7 + ["AIlien Co."] * 7,
    "Accounts": ["initial", "revenue", "costs", "Q1", "revenue", "costs", "Q2"] * 2,
    "Values": [200, 200, -100, None, 250, -100, None,
               150, 50, -100, None, 100, -100, None],
    "Measure": ['absolute', 'relative', 'relative', 'total', 'relative', 'relative', 'total'] * 2,
})
# Single-company subset, reused in the examples that follow
company_df = df[df["Company"] == "Badgersoft"]
waterfall_plot(df, "Accounts", "Values", measure="Measure", group="Company") + \
    facet_grid(x="Company", scales='free_x')
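The Measure column above mixes three kinds of rows. The following plain-Python sketch shows one way to read them, under the assumption that 'absolute' sets the running value, 'relative' adds a change to it, and 'total' ignores its own value and reports the current running value. The helper running_values is hypothetical and only illustrates that interpretation; it is not part of Lets-Plot.

```python
def running_values(values, measures):
    """Return the running value after each row (assumed semantics, see comments)."""
    current = 0
    result = []
    for value, measure in zip(values, measures):
        if measure == "absolute":
            current = value        # assumed: sets the running value directly
        elif measure == "relative":
            current += value       # assumed: adds a change to the running value
        elif measure == "total":
            pass                   # assumed: own value ignored, running value reported
        else:
            raise ValueError(f"unknown measure: {measure!r}")
        result.append(current)
    return result


# The "Badgersoft" rows from the data frame above.
values = [200, 200, -100, None, 250, -100, None]
measures = ["absolute", "relative", "relative", "total",
            "relative", "relative", "total"]

print(running_values(values, measures))
# [200, 400, 300, 300, 550, 450, 450]
```

Under this reading the None entries in the Values column are harmless, because rows marked 'total' never use their own value.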
calc_total#
calc_total=False disables the calculation of the total.
If a measure series is specified, however, the calc_total setting has no effect.
gggrid([
    waterfall_plot(data, "Accounts", "Values", calc_total=False),
    waterfall_plot(company_df, "Accounts", "Values", measure="Measure", calc_total=False),
])
Labels#
There are several parameters that allow you to control the text labels on the waterfalls:
relative_labels: content and formatting of annotation labels on relative change bars (result of the call to the layer_labels() function);
absolute_labels: content and formatting of annotation labels on absolute value bars (result of the call to the layer_labels() function);
label: style settings for all text labels (result of the call to the element_text() function).
waterfall_plot(data, "Accounts", "Values",
               relative_labels=layer_labels().line("@{..flow_type..}d:\n@..label.."),
               absolute_labels=layer_labels().line("Result:\n@..label.."),
               label=element_text(face="bold_italic"))
Hiding Labels
gggrid([
    waterfall_plot(data, "Accounts", "Values", relative_labels='none') + ggtitle("Hide relative labels only"),
    waterfall_plot(data, "Accounts", "Values", absolute_labels='none') + ggtitle("Hide absolute labels only"),
    waterfall_plot(data, "Accounts", "Values", label='blank') + ggtitle("Hide all labels"),
])
Tooltips#
Tooltips for relative and absolute measures should be specified independently.
relative_tooltips = layer_tooltips().title("Account: @..xlabel..")\
    .format("@..initial..", " $,.3~s")\
    .format("@..value..", " $,.3~s")\
    .line("@{..flow_type..}d from @..initial.. to @..value..")\
    .disable_splitting()
absolute_tooltips = 'none'
gggrid([
    waterfall_plot(data, "Accounts", "Values",
                   relative_tooltips='detailed', absolute_tooltips='detailed') + \
        ggtitle("'detailed' tooltips"),
    waterfall_plot(data, "Accounts", "Values",
                   relative_tooltips=relative_tooltips, absolute_tooltips=absolute_tooltips) + \
        ggtitle("Custom tooltips"),
])
sorted_value#
waterfall_plot(data, "Accounts", "Values", sorted_value=True)
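The section gives no description of sorted_value, so the sketch below only illustrates one plausible reading: that sorted_value=True orders the categories by the absolute size of their change, largest first. That ordering rule is an assumption, and the code is plain Python rather than a call into Lets-Plot.

```python
data = {
    "Accounts": ["Product revenue", "Services revenue", "Fixed costs", "Variable costs"],
    "Values": [830_000, 290_000, -360_000, -150_000],
}

pairs = list(zip(data["Accounts"], data["Values"]))

# Assumed rule: largest absolute change first, regardless of sign.
ordered = sorted(pairs, key=lambda pair: abs(pair[1]), reverse=True)

for name, value in ordered:
    print(f"{name:17} {value:>9,}")

# Reordering the steps does not change where the waterfall ends.
print(sum(value for _, value in ordered))  # 610000
```

Under this assumption "Fixed costs" moves ahead of "Services revenue", while the final total stays the same.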
threshold/max_values#
gggrid([
    waterfall_plot(data, "Accounts", "Values") + ggtitle("Default"),
    waterfall_plot(data, "Accounts", "Values", threshold=300_000) + ggtitle("Specified threshold"),
    waterfall_plot(data, "Accounts", "Values", max_values=2) + ggtitle("Specified max_values"),
])
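The effect of these two parameters can be approximated in plain Python. The sketch assumes that threshold merges every category whose absolute change is below the given value into a single extra category, and that max_values keeps only the given number of largest changes and merges the rest. Both rules, the "Other" name, and the helper functions are assumptions made for illustration; they are not taken from the Lets-Plot source.

```python
def group_small(pairs, keep, other_name="Other"):
    """Keep the pairs accepted by `keep`; sum all others into one extra entry."""
    kept = [(name, value) for name, value in pairs if keep(name, value)]
    rest = [value for name, value in pairs if not keep(name, value)]
    if rest:
        kept.append((other_name, sum(rest)))
    return kept


def by_threshold(pairs, threshold):
    # Assumed rule: a category survives if its absolute change reaches the threshold.
    return group_small(pairs, lambda name, value: abs(value) >= threshold)


def by_max_values(pairs, max_values):
    # Assumed rule: only the `max_values` largest absolute changes survive.
    largest = sorted(pairs, key=lambda pair: abs(pair[1]), reverse=True)[:max_values]
    return group_small(pairs, lambda name, value: (name, value) in largest)


pairs = [("Product revenue", 830_000), ("Services revenue", 290_000),
         ("Fixed costs", -360_000), ("Variable costs", -150_000)]

print(by_threshold(pairs, 300_000))
# [('Product revenue', 830000), ('Fixed costs', -360000), ('Other', 140000)]
print(by_max_values(pairs, 2))
# [('Product revenue', 830000), ('Fixed costs', -360000), ('Other', 140000)]
```

For this data set both settings happen to merge the same two categories, and in either case the merged entry carries their net change, so the final total is preserved.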
base#
waterfall_plot(data, "Accounts", "Values", base=400_000)
Combining waterfall_plot() with Other Geometry Layers#
Waterfall plots can be enhanced by adding background and foreground layers. Foreground layers can be added using the regular + operator. Background layers can be added using the background_layers parameter.
Limitations:
layers must provide their own data;
data coordinates must be numeric.
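Because additional layers need numeric coordinates, category names cannot be used to position them. The sketch below assumes that the bars sit at consecutive integer positions starting at 0, in data order; this matches the 0.5, 3.5 and 6.5 bounds used in this section but is not stated explicitly by it. The helpers band_bounds and band_center are hypothetical.

```python
def band_bounds(first_index, last_index, half_width=0.5):
    """x-range of a band that covers bars first_index..last_index (inclusive)."""
    return first_index - half_width, last_index + half_width


def band_center(first_index, last_index):
    """x-position for a label centered over the same bars."""
    return (first_index + last_index) / 2


# Assumed positions: initial=0, revenue=1, costs=2, Q1=3, revenue=4, costs=5, Q2=6
accounts = ["initial", "revenue", "costs", "Q1", "revenue", "costs", "Q2"]

q1 = band_bounds(1, 3)   # first "revenue" bar through "Q1"
q2 = band_bounds(4, 6)   # second "revenue" bar through "Q2"
print(q1, q2)            # (0.5, 3.5) (3.5, 6.5)
print(band_center(1, 3), band_center(4, 6))  # 2.0 5.0
```

Deriving the bounds from bar indices keeps the bands aligned with the bars if rows are later added or removed.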
# background layer and its data
quarter_data = {
    "period_start": [0.5, 3.5],
    "period_end": [3.5, 6.5],
    "ai_introduced": [False, True],
}
quarter_layer = geom_band(
    aes(
        xmin="period_start",
        xmax="period_end",
        paint_a="ai_introduced"
    ),
    data=quarter_data,
    alpha=0.2,
    # we use "paint_a" to color the bands based on a separate category (e.g., quarters),
    # so they have their own color palette independent from the waterfalls
    fill_by="paint_a", color_by="paint_a"
)
# foreground layers and their data
quarter_label_data = {
    "name": ["Q1", "Q2"],
    "x": [2, 5],
    "y": [600, 600],
}
quarter_ai_status_data = {
    "text": ["Before AI\nintroduction", "After AI\nintroduction"],
    "x": [1.5, 4.5],
    "y": [100, 100],
}
text_layers = geom_text(aes(x="x", y="y", label="name"), data=quarter_label_data, size=8) + \
    geom_text(aes(x="x", y="y", label="text"), data=quarter_ai_status_data, size=12)
# whole plot
(waterfall_plot(company_df, "Accounts", "Values", measure="Measure",
                background_layers=quarter_layer)  # background layer
 + text_layers  # foreground layers
 + scale_hue("paint_a", guide="none")  # color for the background layer (bands)
 + ggtitle("Waterfall with additional layers"))
Customize Colors#
Let’s look at the names of the flow types using the show_legend parameter:
wp = waterfall_plot(company_df, "Accounts", "Values", measure="Measure", show_legend=True)
wp
Use these names to customize the colors:
wp + scale_fill_manual(values={
    "Increase": "#66c2a5",
    "Decrease": "#fc8d62",
    "Absolute": "#e78ac3",
    "Total": "#8da0cb",
})
If desired, you can also change the names of the flow types in the legend:
wp + scale_fill_manual(values={
    "Increase": "#66c2a5",
    "Decrease": "#fc8d62",
    "Absolute": "#e78ac3",
    "Total": "#8da0cb",
}, labels=["inc", "dec", "abs", "total"])
You can use a constant fill color for the boxes and map their border color to 'flow_type':
waterfall_plot(company_df, "Accounts", "Values", measure="Measure", size=.75,
               fill="gray90", color="flow_type")
To paint the text labels, combine color="flow_type" and label=element_text(color='inherit'):
waterfall_plot(company_df, "Accounts", "Values", measure="Measure",
               fill="gray90",
               color="flow_type",  # Needed for mapping color to flow type
               label=element_text(color='inherit'))  # Needed so the text labels inherit the box border color
The same can be done, for example, only for the relative text labels:
waterfall_plot(company_df, "Accounts", "Values", measure="Measure",
               fill="gray90",
               color="flow_type",  # Map color to flow type
               label=element_text(color="indigo"),  # Choose some default color for absolute labels
               relative_labels=layer_labels().line("@..label..").inherit_color())  # Inherit color for the relative text labels