lets_plot.bistro.residual.residual_plot

lets_plot.bistro.residual.residual_plot(data=None, x=None, y=None, *, method='lm', deg=1, span=0.5, seed=None, max_n=None, geom='point', bins=None, binwidth=None, color=None, size=None, alpha=None, color_by=None, show_legend=None, hline=True, marginal='dens:r')

Produce a residual plot that shows the difference between the observed response values and the fitted response values.

residual_plot() requires the numpy and pandas libraries. The ‘lm’ and ‘loess’ methods additionally require statsmodels and scipy.
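To make the plotted quantity concrete, here is a minimal sketch in plain Python of residuals for a straight-line least-squares fit, taken as observed minus fitted values. It is an illustration only and not how residual_plot() computes them internally (the function relies on statsmodels and scipy for that); `linear_residuals` is a hypothetical helper introduced for this example.

```python
def linear_residuals(xs, ys):
    """Return observed minus fitted values for an ordinary least-squares line.

    Hypothetical helper for illustration; not part of lets_plot.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form slope and intercept of the least-squares line.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = observed response - fitted response.
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 8.0]
residuals = linear_residuals(xs, ys)
```

With an intercept in the model, least-squares residuals sum to zero, which is why a horizontal line through 0 is a natural reference for this kind of plot.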

Parameters:
data : dict or Pandas or Polars DataFrame

The data to be displayed.

x : str

Name of the independent variable.

y : str

Name of the dependent variable that will be fitted.

method : {‘lm’, ‘loess’, ‘lowess’, ‘none’}, default=‘lm’

Fitting method: ‘lm’ (Linear Model) or ‘loess’/‘lowess’ (Locally Estimated Scatterplot Smoothing). If the value of the deg parameter is greater than 1, the linear model becomes a polynomial of the given degree. If method is ‘none’, the data is left as is.

deg : int, default=1

Degree of the polynomial for the linear regression model.

span : float, default=0.5

Only for the ‘loess’ method. The fraction of source points closest to the current point that is taken into account when computing a least-squares regression. A sensible value is usually 0.25 to 0.5.
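As a rough illustration of what "the fraction of source points closest to the current point" means, the sketch below selects a local neighborhood in plain Python. The helper name `local_neighborhood` and the rounding rule (`ceil(span * n)`) are assumptions made for this example; the actual fitting is delegated to the underlying libraries and may round differently.

```python
import math


def local_neighborhood(xs, x0, span=0.5):
    """Return the source points closest to x0, covering a `span` fraction of xs.

    Hypothetical helper; the ceil() rounding is an assumption for illustration.
    """
    k = max(1, math.ceil(span * len(xs)))
    # Sort by distance to the current point and keep the k nearest.
    return sorted(xs, key=lambda x: abs(x - x0))[:k]


xs = [0.0, 0.1, 0.2, 0.5, 0.9, 1.0, 1.5, 2.0]
nearest = local_neighborhood(xs, x0=0.15, span=0.5)
```

A smaller span uses fewer neighbors per local fit and follows the data more closely; a larger span gives a smoother curve.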

seed : int

Random seed for ‘loess’ sampling.

max_n : int

Maximum number of data points for the ‘loess’ method. If the data contains more points than this, random sampling is applied to it.
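The interaction between max_n and seed can be pictured with the following sketch, which uses only the standard library. It is not the library's implementation; `downsample` is a hypothetical helper that shows the general idea of seeded sampling when a size limit is exceeded.

```python
import random


def downsample(rows, max_n=None, seed=None):
    """Return at most max_n rows, sampling at random when the limit is exceeded.

    Hypothetical helper for illustration; not part of lets_plot.
    """
    if max_n is None or len(rows) <= max_n:
        return list(rows)  # small enough: keep everything
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    return rng.sample(rows, max_n)


rows = list(range(100))
sample_a = downsample(rows, max_n=10, seed=42)
sample_b = downsample(rows, max_n=10, seed=42)
```

Passing the same seed yields the same sample, so repeated renderings of a plot stay identical.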

geom : {‘point’, ‘tile’, ‘density2d’, ‘density2df’, ‘none’}, default=‘point’

The geometric object to use to display the data. No object will be used if geom=‘none’.

bins : int or list of int

Number of bins in both directions, vertical and horizontal. Overridden by binwidth. If only one value is given, it is interpreted as a list of two equal values. Applies to both the ‘tile’ geom and the ‘histogram’ marginal.

binwidth : float or list of float

The width of the bins in both directions, vertical and horizontal. Overrides bins. The default is to use bin widths that cover the entire range of the data. If only one value is given, it is interpreted as a list of two equal values. Applies to both the ‘tile’ geom and the ‘histogram’ marginal.
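The rules for bins and binwidth (a single value expands to a pair, and binwidth takes precedence) can be sketched as follows. The helpers `as_pair` and `resolve_binning` are hypothetical and exist only to illustrate the documented behavior; they are not lets_plot functions.

```python
def as_pair(value):
    """Expand a single value to a list of two equal values; pass pairs through."""
    if isinstance(value, (list, tuple)):
        if len(value) != 2:
            raise ValueError("expected one value or a list of two values")
        return [value[0], value[1]]
    return [value, value]


def resolve_binning(bins=None, binwidth=None):
    """Return which setting is in effect, with binwidth overriding bins."""
    if binwidth is not None:
        return ("binwidth", as_pair(binwidth))
    if bins is not None:
        return ("bins", as_pair(bins))
    return (None, None)  # neither given: defaults apply


only_bins = resolve_binning(bins=30)
both_given = resolve_binning(bins=[30, 15], binwidth=0.5)
```

In this sketch, `only_bins` resolves to 30 bins in each direction, while `both_given` ignores bins because binwidth was supplied.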

color : str

Color of the geometry.

size : float

Size of the geometry.

alpha : float

Transparency level of the geometry. Accepts values between 0 and 1.

color_by : str

Name of the grouping variable.

show_legend : bool, default=True

Set to False to hide the legend for the main layer.

hline : bool, default=True

Set to False to hide the horizontal line passing through 0.

marginal : str, default=‘dens:r’

Description of the marginal layers, packed into a string value. Different marginals are separated by the ‘,’ character. Parameters of a marginal are separated by the ‘:’ character. The first parameter of a marginal is a geometry name. Possible values: ‘dens’/‘density’, ‘hist’/‘histogram’, ‘box’/‘boxplot’. The second parameter is a string specifying which sides of the plot the marginal layer will appear on. Possible values: ‘t’ (top), ‘b’ (bottom), ‘l’ (left), ‘r’ (right). The third parameter (optional) is the size of the marginal. To suppress marginals, use marginal=‘none’. Examples: “hist:tr:0.3”, “dens:tr,hist:bl”, “box:tr:.05, hist:bl, dens:bl”.
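To show how such a string breaks down, here is a small parser sketch written against the format described above. It is not the library's parser: `parse_marginal` and `GEOM_ALIASES` are hypothetical names, and the choice to emit one `(geometry, side, size)` tuple per side is an assumption made for readability.

```python
# Short and long geometry names accepted by the format described above.
GEOM_ALIASES = {
    "dens": "density", "density": "density",
    "hist": "histogram", "histogram": "histogram",
    "box": "boxplot", "boxplot": "boxplot",
}


def parse_marginal(spec):
    """Split a marginal string into (geometry, side, size) tuples.

    Hypothetical helper for illustration; size is None when omitted.
    """
    if spec.strip() == "none":
        return []  # marginals suppressed
    layers = []
    for item in spec.split(","):
        parts = [p.strip() for p in item.strip().split(":")]
        if len(parts) not in (2, 3):
            raise ValueError(f"bad marginal description: {item!r}")
        if parts[0] not in GEOM_ALIASES:
            raise ValueError(f"unknown geometry: {parts[0]!r}")
        sides = parts[1]
        if not sides or any(s not in "tblr" for s in sides):
            raise ValueError(f"unknown side in: {sides!r}")
        size = float(parts[2]) if len(parts) == 3 else None
        for side in sides:
            layers.append((GEOM_ALIASES[parts[0]], side, size))
    return layers


layers = parse_marginal("box:tr:.05, hist:bl, dens:bl")
```

For the last example string from the description, this yields a boxplot on the top and right sides with size 0.05, followed by histograms and densities on the bottom and left sides with no explicit size.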

Returns:
PlotSpec

Plot object specification.

Examples

import numpy as np
from lets_plot import *
from lets_plot.bistro.residual import *
LetsPlot.setup_html()
n = 100
np.random.seed(42)
data = {
    'x': np.random.uniform(size=n),
    'y': np.random.normal(size=n)
}
# Default settings: linear model, point geometry, density marginal on the right
residual_plot(data, 'x', 'y')

import numpy as np
from lets_plot import *
from lets_plot.bistro.residual import *
LetsPlot.setup_html()
n, m = 1000, 5
np.random.seed(42)
x = np.random.uniform(low=-m, high=m, size=n)
y = x**2 + np.random.normal(size=n)
# Quadratic fit (deg=2), shown as tiles with histogram marginals on top and right
residual_plot({'x': x, 'y': y}, 'x', 'y',
              deg=2, geom='tile', binwidth=[1, .5],
              hline=False, marginal="hist:tr")

import numpy as np
from lets_plot import *
from lets_plot.bistro.residual import *
LetsPlot.setup_html()
n = 200
np.random.seed(42)
x = np.random.uniform(size=n)
y = x * np.random.normal(size=n)
g = np.random.choice(['A', 'B'], size=n)
# No fitting (method='none'), points grouped by 'g', several marginal layers
residual_plot({'x': x, 'y': y, 'g': g}, 'x', 'y',
              method='none', bins=[30, 15],
              size=5, alpha=.5, color_by='g', show_legend=False,
              marginal="hist:t:.2, hist:r, dens:tr, box:bl:.05")

import numpy as np
from lets_plot import *
from lets_plot.bistro.residual import *
LetsPlot.setup_html()
n = 100
color, fill = "#bd0026", "#ffffb2"
np.random.seed(42)
data = {
    'x': np.random.uniform(size=n),
    'y': np.random.normal(size=n)
}
# Disable the built-in layers, then add custom ones on top of the residual data
residual_plot(data, 'x', 'y', geom='none', hline=False, marginal='none') + \
    geom_hline(yintercept=0, size=1, color=color) + \
    geom_point(shape=21, size=3, color=color, fill=fill) + \
    ggmarginal('r', layer=geom_area(stat='density', color=color, fill=fill))