Overriding Default Grouping with the group Aesthetic
How Grouping Works in Lets-Plot
Default Grouping Behavior:
- Lets-Plot automatically groups data by discrete variables mapped to aesthetics like color, shape, linetype, etc.
- This creates separate visual elements (lines, paths, polygons) for each unique combination of these variables
Explicit Group Control:
- Use group = 'var' to group only by that specific variable, overriding default grouping
- Use group = [var1, var2, ...] to group by the interaction of multiple variables
- Use group = [] to disable all grouping completely
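The three forms of the group setting can be pictured as different ways of partitioning rows. The sketch below is a conceptual model only, not Lets-Plot's internal implementation: partition_rows is a hypothetical helper, and the rows are toy values invented for illustration (they are not taken from mpg.csv).

```python
from collections import defaultdict


def partition_rows(rows, group_keys):
    """Split rows into groups keyed by the values of group_keys.

    An empty group_keys list yields a single group holding every row,
    which mirrors the idea behind group=[].
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[k] for k in group_keys)
        groups[key].append(row)
    return dict(groups)


# Toy rows, invented for illustration only.
rows = [
    {"drv": "f", "cyl": 4, "hwy": 29},
    {"drv": "f", "cyl": 6, "hwy": 26},
    {"drv": "4", "cyl": 6, "hwy": 20},
    {"drv": "4", "cyl": 8, "hwy": 17},
    {"drv": "r", "cyl": 8, "hwy": 23},
]

by_drv = partition_rows(rows, ["drv"])              # analogous to group='drv'
by_drv_cyl = partition_rows(rows, ["drv", "cyl"])   # analogous to group=['drv', 'cyl']
ungrouped = partition_rows(rows, [])                # analogous to group=[]

print(len(by_drv), len(by_drv_cyl), len(ungrouped))  # prints: 3 5 1
```

Grouping by the interaction of two variables produces one group per distinct pair of values, while an empty list collapses everything into one group.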
In [1]:
from lets_plot import *
import polars as pl
In [2]:
LetsPlot.setup_html()
In [3]:
# Note: despite the variable name, this loads the mpg dataset (mpg.csv).
mtcars = pl.read_csv("https://raw.githubusercontent.com/JetBrains/lets-plot-docs/refs/heads/master/data/mpg.csv")
mtcars.head()
Out[3]:
1. Highway MPG by Drive Type
In [4]:
seed = 21
(
    ggplot(mtcars, aes(x='drv', y='hwy'))
    + geom_violin(tooltips='none')
    + geom_sina(seed=seed)
)
Out[4]:
2. Add More Information - color
In [5]:
(
    ggplot(mtcars, aes(x='drv', y='hwy'))
    + geom_violin(tooltips='none')
    + geom_sina(aes(color='cyl'), seed=seed)
)
Out[5]:
3. Discrete color: Default Grouping Creates Unwanted Separation
Let's add discrete colors by marking the cyl variable as discrete.
When we map color=as_discrete('cyl'), Lets-Plot automatically groups the data by the discrete color variable.
This means:
- Automatic grouping: Each combination of drv (x-axis) and cyl (color) becomes a separate group
- Position adjustment: geom_sina() uses "dodge" positioning by default, which separates overlapping groups horizontally
- Result: Instead of one sina plot per drive type, we get up to 4 separate sina plots (one for each cylinder count present) within each drive type category
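To see why more groups means more horizontal separation, here is a simplified model of dodging. It is an assumption-laden sketch, not Lets-Plot's actual algorithm: dodge_offsets is a hypothetical helper, and the total_width value of 0.8 is an arbitrary illustrative choice.

```python
def dodge_offsets(n_groups, total_width=0.8):
    """Return horizontal offsets that spread n_groups evenly around 0.

    Simplified model: the category's width is divided into equal slots,
    and each group is centered in its own slot.
    """
    slot = total_width / n_groups
    return [(i - (n_groups - 1) / 2) * slot for i in range(n_groups)]


# One group per category: no shift, so points stay centered on the violin.
single = dodge_offsets(1)

# Four groups per category: each group is pushed to its own slot.
four = dodge_offsets(4)

print(single)
print([round(x, 2) for x in four])
```

With a single group per drive type the offset is zero, which is the situation the explicit group settings in the following sections restore.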
In [6]:
(
    ggplot(mtcars, aes(x='drv', y='hwy'))
    + geom_violin(tooltips='none')
    + geom_sina(aes(color=as_discrete('cyl')), seed=seed)
)
Out[6]:
4. Fix with Explicit Grouping by Drive Type
In [7]:
(
    ggplot(mtcars, aes(x='drv', y='hwy'))
    + geom_violin(tooltips='none')
    + geom_sina(
        aes(
            color=as_discrete('cyl'),
            group='drv',  # <-- group only by drive type (ignoring the color variable for grouping)
        ),
        seed=seed,
    )
)
Out[7]:
5. Cleaner Fix: Disable All Grouping
In [8]:
(
    ggplot(mtcars, aes(x='drv', y='hwy'))
    + geom_violin(tooltips='none')
    + geom_sina(
        aes(
            color=as_discrete('cyl'),
            group=[],  # <-- disable all grouping entirely
        ),
        seed=seed,
    )
)
Out[8]: