lets_plot.bistro.residual.residual_plot
- lets_plot.bistro.residual.residual_plot(data=None, x=None, y=None, *, method='lm', deg=1, span=0.5, seed=None, max_n=None, geom='point', bins=None, binwidth=None, color=None, size=None, alpha=None, color_by=None, show_legend=None, hline=True, marginal='dens:r')
Produce a residual plot that shows the difference between the observed response and the fitted response values.
residual_plot() requires the numpy and pandas libraries. The 'lm' and 'loess' methods additionally require statsmodels and scipy.
- Parameters:
- data : dict or Pandas or Polars DataFrame
The data to be displayed.
- x : str
Name of the independent variable.
- y : str
Name of the dependent variable that will be fitted.
- method : {'lm', 'loess', 'lowess', 'none'}, default='lm'
Fitting method: 'lm' (Linear Model) or 'loess'/'lowess' (Locally Estimated Scatterplot Smoothing). If deg is greater than 1, the linear model becomes a polynomial of the given degree. If method is 'none', the data is left as is.
- deg : int, default=1
Degree of the polynomial for the linear regression model.
- span : float, default=0.5
Only for the 'loess' method. The fraction of source points closest to the current point that is taken into account when computing the least-squares regression. A sensible value is usually between 0.25 and 0.5.
- seed : int
Random seed for 'loess' sampling.
- max_n : int
Maximum number of data points for the 'loess' method. If this number is exceeded, random sampling is applied to the data.
- geom : {'point', 'tile', 'density2d', 'density2df', 'none'}, default='point'
The geometric object used to display the data. No object is used if geom='none'.
- bins : int or list of int
Number of bins in both directions, vertical and horizontal. Overridden by binwidth. If only one value is given, it is interpreted as a list of two equal values. Applies to both the 'tile' geom and the 'histogram' marginal.
- binwidth : float or list of float
The width of the bins in both directions, vertical and horizontal. Overrides bins. The default is to use bin widths that cover the entire range of the data. If only one value is given, it is interpreted as a list of two equal values. Applies to both the 'tile' geom and the 'histogram' marginal.
- color : str
Color of the geometry. For more info see https://lets-plot.org/python/pages/aesthetics.html#color-and-fill.
- size : float
Size of the geometry.
- alpha : float
Transparency level of the geometry. Accepts values between 0 and 1.
- color_by : str
Name of the grouping variable.
- show_legend : bool, default=True
If False, do not show the legend for the main layer.
- hline : bool, default=True
If False, do not show the horizontal line passing through 0.
- marginal : str, default='dens:r'
Description of the marginal layers packed into a single string. Different marginals are separated by the ',' character. Parameters of a marginal are separated by the ':' character. The first parameter of a marginal is a geometry name. Possible values: 'dens'/'density', 'hist'/'histogram', 'box'/'boxplot'. The second parameter is a string specifying which sides of the plot the marginal layer will appear on. Possible values: 't' (top), 'b' (bottom), 'l' (left), 'r' (right). The third parameter (optional) is the size of the marginal. To suppress marginals, use marginal='none'. Examples: "hist:tr:0.3", "dens:tr,hist:bl", "box:tr:.05, hist:bl, dens:bl".
- Returns:
- PlotSpec
Plot object specification.
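The string format described for the marginal parameter can be easier to follow when read as a small parser. The sketch below is only an illustration of that format in plain Python: parse_marginal, GEOM_ALIASES and VALID_SIDES are hypothetical names, not part of lets_plot, and the library's own parsing and validation may differ.

```python
# Hypothetical helper, not part of lets_plot: it only mirrors the string
# format described for the `marginal` parameter.
GEOM_ALIASES = {
    "dens": "density", "density": "density",
    "hist": "histogram", "histogram": "histogram",
    "box": "boxplot", "boxplot": "boxplot",
}
VALID_SIDES = set("tblr")


def parse_marginal(spec):
    """Split a marginal description into (geometry, sides, size) tuples."""
    if spec.strip() == "none":
        return []  # marginals are suppressed
    layers = []
    for item in spec.split(","):          # ',' separates marginals
        parts = [p.strip() for p in item.split(":")]  # ':' separates parameters
        if len(parts) not in (2, 3):
            raise ValueError(f"expected 'geom:sides[:size]', got {item!r}")
        geom = GEOM_ALIASES.get(parts[0])
        if geom is None:
            raise ValueError(f"unknown marginal geometry: {parts[0]!r}")
        sides = parts[1]
        if not sides or not set(sides) <= VALID_SIDES:
            raise ValueError(f"sides must use only 't', 'b', 'l', 'r': {sides!r}")
        size = float(parts[2]) if len(parts) == 3 else None  # size is optional
        layers.append((geom, sides, size))
    return layers


print(parse_marginal("box:tr:.05, hist:bl, dens:bl"))
```

Reading the specification this way shows that a single entry such as "hist:tr" places the same geometry on two sides at once, while the comma lets different geometries share or split the sides.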
Notes
When using the 'lm' and 'loess' methods, this function requires the statsmodels and scipy libraries to be installed.
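To make the plotted quantity concrete, the sketch below computes residuals for the simplest case (a straight-line fit, corresponding to method='lm' with deg=1) using ordinary least squares in plain Python. It is not the library's implementation, which according to the note above depends on statsmodels and scipy; the helper name linear_residuals and the sign convention (observed minus fitted) are assumptions made for illustration.

```python
def linear_residuals(xs, ys):
    """Residuals (observed - fitted) of a least-squares line y = a + b*x.

    Hypothetical helper for illustration only; not a lets_plot function.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Observed response minus fitted response, point by point.
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.1, 4.9, 7.0]
residuals = linear_residuals(xs, ys)
print([round(r, 3) for r in residuals])
```

For a least-squares line with an intercept, the residuals sum to zero, which is why a horizontal line through 0 (the hline option) is a natural reference when reading the plot.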
Examples
# Default settings: linear fit, points, density marginal on the right.
import numpy as np
from lets_plot import *
from lets_plot.bistro.residual import *
LetsPlot.setup_html()
n = 100
np.random.seed(42)
data = {
    'x': np.random.uniform(size=n),
    'y': np.random.normal(size=n)
}
residual_plot(data, 'x', 'y')

# Quadratic fit (deg=2) drawn as tiles, with histogram marginals on top and right.
import numpy as np
from lets_plot import *
from lets_plot.bistro.residual import *
LetsPlot.setup_html()
n, m = 1000, 5
np.random.seed(42)
x = np.random.uniform(low=-m, high=m, size=n)
y = x**2 + np.random.normal(size=n)
residual_plot({'x': x, 'y': y}, 'x', 'y',
              deg=2, geom='tile', binwidth=[1, .5],
              hline=False, marginal="hist:tr")

# No fitting (method='none'): points grouped by 'g', with several marginals.
import numpy as np
from lets_plot import *
from lets_plot.bistro.residual import *
LetsPlot.setup_html()
n = 200
np.random.seed(42)
x = np.random.uniform(size=n)
y = x * np.random.normal(size=n)
g = np.random.choice(['A', 'B'], size=n)
residual_plot({'x': x, 'y': y, 'g': g}, 'x', 'y',
              method='none', bins=[30, 15],
              size=5, alpha=.5, color_by='g', show_legend=False,
              marginal="hist:t:.2, hist:r, dens:tr, box:bl:.05")

# Turn off the built-in layers, then add a custom line, points and marginal.
import numpy as np
from lets_plot import *
from lets_plot.bistro.residual import *
LetsPlot.setup_html()
n = 100
color, fill = "#bd0026", "#ffffb2"
np.random.seed(42)
data = {
    'x': np.random.uniform(size=n),
    'y': np.random.normal(size=n)
}
residual_plot(data, 'x', 'y', geom='none', hline=False, marginal='none') + \
    geom_hline(yintercept=0, size=1, color=color) + \
    geom_point(shape=21, size=3, color=color, fill=fill) + \
    ggmarginal('r', layer=geom_area(stat='density', color=color, fill=fill))
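The second and third examples pass two-element lists for binwidth and bins. The rules stated for those parameters (a single value stands for a list of two equal values, and binwidth overrides bins) can be sketched as follows. The names as_pair and resolve_binning are hypothetical and exist only for this illustration; they are not lets_plot functions, and the library may handle these arguments differently internally.

```python
def as_pair(value):
    """Expand a single number to [value, value]; pass a two-item list through."""
    if value is None:
        return None
    if isinstance(value, (int, float)):
        return [value, value]
    pair = list(value)
    if len(pair) != 2:
        raise ValueError("expected one value or a list of two values")
    return pair


def resolve_binning(bins=None, binwidth=None):
    """Return ('binwidth', pair), ('bins', pair), or None if neither is set."""
    if binwidth is not None:
        return ("binwidth", as_pair(binwidth))  # binwidth overrides bins
    if bins is not None:
        return ("bins", as_pair(bins))
    return None


print(resolve_binning(bins=30))
print(resolve_binning(bins=[30, 15], binwidth=0.5))
```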