Plotting is a crucial part of bioinformatics and computational biology: it is both the starting point of a data workflow, through exploratory data analysis, and its endpoint.

One of the best-known plotting libraries in Python is Matplotlib. It was one of the first widely adopted plotting libraries in Python and is used as the plotting package within many domain-specific analysis packages, such as Scanpy.

Matplotlib supports a wide variety of plot types and allows for extremely fine-grained control over plot styling, which makes it a great choice for producing publication-ready plots.

However, this level of control is a double-edged sword: the syntax can be quite verbose. Because Matplotlib uses the imperative programming paradigm, you need to explicitly define the construction of each visualization with step-by-step commands, from the axes to the marks to the labels and legends.
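To make the step-by-step construction concrete, here is a minimal sketch of a Matplotlib scatter plot of flipper length vs. body mass, colored by species. It assumes a tiny hand-typed stand-in table with illustrative values (not the real Palmer Penguins measurements), and the column names `species`, `flipper_length_mm`, and `body_mass_g` are likewise assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in data: illustrative values only, not the real dataset.
penguins = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Chinstrap", "Chinstrap", "Gentoo", "Gentoo"],
    "flipper_length_mm": [181, 195, 192, 196, 211, 230],
    "body_mass_g": [3750, 3250, 3500, 3900, 4500, 5700],
})

# Step 1: create the figure and axes.
fig, ax = plt.subplots(figsize=(6, 4))

# Step 2: draw one set of marks per species so each gets its own color.
for species, group in penguins.groupby("species"):
    ax.scatter(group["flipper_length_mm"], group["body_mass_g"], label=species)

# Step 3: add the labels and the legend by hand.
ax.set_xlabel("Flipper length (mm)")
ax.set_ylabel("Body mass (g)")
ax.legend(title="Species")
```

Each visual element (axes, marks, labels, legend) is requested with its own call, and coloring by species takes an explicit loop.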

Many bioinformaticians and computational biologists who are familiar with the ggplot2 package in R, which uses a declarative programming paradigm, find the procedural nature of plotting with Matplotlib to be quite onerous.

In this notebook, we briefly introduce a few declarative plotting packages that may allow you to more quickly jump into exploratory data analysis or modify figures as needed for presentations or reports.

We use the Palmer Penguins dataset throughout.

Data were collected and made available by Dr. Kristen Gorman and the Palmer Station, Antarctica LTER, a member of the Long Term Ecological Research Network.

We demonstrate how to create a scatter plot of flipper length vs. body mass, colored by penguin species, in Matplotlib, Seaborn, Plotnine, Vega-Altair, and Holoviews.

Seaborn
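A minimal sketch of the scatter plot in Seaborn. It uses a small hand-typed stand-in table with illustrative values rather than the real dataset, and the column names are assumptions.

```python
import pandas as pd
import seaborn as sns

# Stand-in data: illustrative values only, not the real dataset.
penguins = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Chinstrap", "Chinstrap", "Gentoo", "Gentoo"],
    "flipper_length_mm": [181, 195, 192, 196, 211, 230],
    "body_mass_g": [3750, 3250, 3500, 3900, 4500, 5700],
})

# Map columns to x, y, and hue; Seaborn handles the grouping and the legend.
ax = sns.scatterplot(
    data=penguins,
    x="flipper_length_mm",
    y="body_mass_g",
    hue="species",
)
ax.set(xlabel="Flipper length (mm)", ylabel="Body mass (g)")
```

Seaborn returns a regular Matplotlib `Axes`, so further styling can still be done with Matplotlib calls.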

Plotnine

Vega-Altair

Holoviews