lets_plot.geom_boxplot
- lets_plot.geom_boxplot(mapping=None, *, data=None, stat=None, position=None, show_legend=None, sampling=None, tooltips=None, orientation=None, fatten=None, outlier_color=None, outlier_fill=None, outlier_shape=None, outlier_size=None, outlier_stroke=None, varwidth=None, whisker_width=None, color_by=None, fill_by=None, **other_args)
Display the distribution of data based on a five-number summary (“minimum”, first quartile (Q1), median, third quartile (Q3), and “maximum”), and plot “outlying” points individually.
- Parameters
- mapping : FeatureSpec
Set of aesthetic mappings created by aes() function. Aesthetic mappings describe the way that variables in the data are mapped to plot “aesthetics”.
- data : dict or DataFrame or polars.DataFrame
The data to be displayed in this layer. If None, the default, the data is inherited from the plot data as specified in the call to ggplot.
- stat : str, default='boxplot'
The statistical transformation to use on the data for this layer, as a string.
- position : str or FeatureSpec, default='dodge'
Position adjustment, either as a string ('identity', 'stack', 'dodge', …), or the result of a call to a position adjustment function.
- show_legend : bool, default=True
Set to False to hide the legend for this layer.
- sampling : FeatureSpec
Result of the call to the sampling_xxx() function. To disable sampling for this layer, pass the string "none".
- tooltips : layer_tooltips
Result of the call to the layer_tooltips() function. Specify appearance, style and content.
- orientation : str, default='x'
Specify the axis that the layer’s stat and geom should run along. Possible values: 'x', 'y'.
- fatten : float, default=1.0
A multiplicative factor applied to the size of the middle bar.
- outlier_color : str
Default color aesthetic for outliers.
- outlier_fill : str
Default fill aesthetic for outliers.
- outlier_shape : int
Default shape aesthetic for outliers, an integer from 0 to 25.
- outlier_size : float
Default size aesthetic for outliers.
- outlier_stroke : float
Default width of the border for outliers.
- varwidth : bool, default=False
If False, make a standard box plot. If True, boxes are drawn with widths proportional to the square roots of the number of observations in the groups.
- whisker_width : float, default=0.5
A multiplicative factor applied to the box width to draw horizontal segments on whiskers.
- color_by : {'fill', 'color', 'paint_a', 'paint_b', 'paint_c'}, default='color'
Define the color aesthetic for the geometry.
- fill_by : {'fill', 'color', 'paint_a', 'paint_b', 'paint_c'}, default='fill'
Define the fill aesthetic for the geometry.
- other_args
Other arguments passed on to the layer. These are often aesthetic settings used to set an aesthetic to a fixed value, like color='red', fill='blue', size=3 or shape=21. They may also be parameters to the paired geom/stat.
- Returns
- LayerSpec
Geom object specification.
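The effect of the varwidth and whisker_width parameters on box geometry can be sketched in plain Python. This is only an illustration of the proportionality described above, not the library's implementation: the helper box_widths, its width default, and the choice to give the largest group the full width are all assumptions made for the example.

```python
import math

def box_widths(counts, width=0.9, varwidth=False, whisker_width=0.5):
    """Hypothetical helper: return {group: (box_width, whisker_cap_width)}.

    `counts` maps each group to its number of observations.
    `width=0.9` is an arbitrary value chosen for this sketch.
    """
    if varwidth:
        # Widths are proportional to sqrt(n).
        # Assumption: the largest group is drawn at the full `width`.
        roots = {g: math.sqrt(n) for g, n in counts.items()}
        largest = max(roots.values())
        boxes = {g: width * r / largest for g, r in roots.items()}
    else:
        # Standard box plot: every box has the same width.
        boxes = {g: width for g in counts}
    # The whisker cap is the box width scaled by whisker_width.
    return {g: (w, w * whisker_width) for g, w in boxes.items()}

widths = box_widths({'a': 100, 'b': 25, 'c': 4}, width=0.8, varwidth=True)
for group, (box, cap) in widths.items():
    print(group, round(box, 3), round(cap, 3))
```

Under these assumptions a group with a quarter of the observations gets a box half as wide, which is why varwidth=True makes sample-size differences visible without exaggerating them.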
Notes
Computed variables:
..lower.. : lower hinge, 25% quantile.
..middle.. : median, 50% quantile.
..upper.. : upper hinge, 75% quantile.
..ymin.. : lower whisker = smallest observation greater than or equal to lower hinge - 1.5 * IQR.
..ymax.. : upper whisker = largest observation less than or equal to upper hinge + 1.5 * IQR.
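The definitions of the computed variables above can be reproduced with a short standard-library sketch. The function boxplot_summary and its coef argument are hypothetical names introduced for this example, and the quantile interpolation used here (the 'inclusive' method of statistics.quantiles) is an assumption; the quantile method lets_plot uses internally may differ.

```python
import statistics

def boxplot_summary(values, coef=1.5):
    """Hypothetical helper: five-number summary following the definitions above.

    Needs at least two observations (a requirement of statistics.quantiles).
    """
    data = sorted(values)
    # Hinges and median as the 25%, 50% and 75% quantiles.
    lower, middle, upper = statistics.quantiles(data, n=4, method='inclusive')
    iqr = upper - lower
    low_fence = lower - coef * iqr
    high_fence = upper + coef * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        'ymin': inside[0],
        'lower': lower,
        'middle': middle,
        'upper': upper,
        'ymax': inside[-1],
        'outliers': outliers,
    }

summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(summary)
```

For this sample the fences are -3.5 and 14.5, so the whiskers stop at 1 and 9 and the value 100 is reported as an outlier. Values computed this way could be supplied through the ymin, lower, middle, upper and ymax aesthetics together with stat='identity'.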
geom_boxplot() understands the following aesthetic mappings:
lower : lower hinge.
middle : median.
upper : upper hinge.
ymin : lower whisker.
ymax : upper whisker.
alpha : transparency level of a layer. Accepts values between 0 and 1.
color (colour) : color of the geometry lines. String in the following formats: RGB/RGBA (e.g. “rgb(0, 0, 255)”); HEX (e.g. “#0000FF”); color name (e.g. “red”).
fill : fill color. String in the following formats: RGB/RGBA (e.g. “rgb(0, 0, 255)”); HEX (e.g. “#0000FF”); color name (e.g. “red”).
size : line width.
linetype : line type of the border. Codes and names: 0 = 'blank', 1 = 'solid', 2 = 'dashed', 3 = 'dotted', 4 = 'dotdash', 5 = 'longdash', 6 = 'twodash'.
width : width of the boxplot. Typically ranges between 0 and 1. Values greater than 1 cause the boxes to overlap.
Examples
```python
import numpy as np
from lets_plot import *
LetsPlot.setup_html()
n = 100
np.random.seed(42)
x = np.random.choice(['a', 'b', 'c'], size=n)
y = np.random.normal(size=n)
# Basic box plot: one box per category of x, default 'boxplot' stat.
ggplot({'x': x, 'y': y}, aes(x='x', y='y')) + \
    geom_boxplot()
```
```python
import numpy as np
from lets_plot import *
LetsPlot.setup_html()
n = 100
np.random.seed(42)
# 'b' is listed twice, so it is sampled about twice as often as 'a' or 'c'.
x = np.random.choice(['a', 'b', 'b', 'c'], size=n)
y = np.random.normal(size=n)
# Thicker median bar, variable box widths, and custom outlier markers.
ggplot({'x': x, 'y': y}, aes(x='x', y='y')) + \
    geom_boxplot(fatten=5, varwidth=True,
                 outlier_shape=8, outlier_size=5)
```
```python
import numpy as np
import pandas as pd
from lets_plot import *
LetsPlot.setup_html()
n = 100
np.random.seed(42)
x = np.random.choice(['a', 'b', 'c'], size=n)
y = np.random.normal(size=n)
df = pd.DataFrame({'x': x, 'y': y})
# Precompute custom box statistics per group (terciles instead of quartiles).
agg_df = df.groupby('x').agg({'y': [
    'min', lambda s: np.quantile(s, 1/3),
    'median', lambda s: np.quantile(s, 2/3), 'max'
]}).reset_index()
agg_df.columns = ['x', 'y0', 'y33', 'y50', 'y66', 'y100']
# stat='identity' draws the boxes from the precomputed columns as-is.
ggplot(agg_df, aes(x='x')) + \
    geom_boxplot(aes(ymin='y0', lower='y33', middle='y50',
                     upper='y66', ymax='y100'), stat='identity')
```
```python
import numpy as np
import pandas as pd
from lets_plot import *
LetsPlot.setup_html()
n, m = 100, 5
np.random.seed(42)
# Wide frame with columns x1..x5, each holding n normal samples.
df = pd.DataFrame({'x%s' % i: np.random.normal(size=n)
                   for i in range(1, m + 1)})
# melt() reshapes to long form with 'variable' and 'value' columns.
ggplot(df.melt()) + \
    geom_boxplot(aes(x='variable', y='value', color='variable',
                     fill='variable'),
                 outlier_shape=21, outlier_size=4, size=2,
                 alpha=.5, width=.5, show_legend=False)
```