Plotting in Python with Seaborn: Bar Plot
Introduction
In Visualization with Seaborn, an earlier post in this series, we were introduced to the power of visualization and dove into distribution plots. In this post we are going to learn about the bar plot. A bar plot captures the relationship between a categorical column and a numerical column. For each unique value in the categorical column, a bar is drawn whose height, by default, is the mean of the numeric column for that category. Seaborn has a nifty function called barplot(), which is dedicated to this type of plot.
Loading libraries
Most people are familiar with plotting using Matplotlib, which borrowed much of its interface from MATLAB. Python also has an extremely handy library for data visualization called seaborn. The Seaborn library is built on top of the Matplotlib library, so you will also need to import Matplotlib.
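A minimal sketch of the imports this post relies on. The aliases plt and sns are the usual conventions, and the sns.set_theme() call is an optional extra that is not part of the original text.

```python
import matplotlib.pyplot as plt  # underlying plotting library
import seaborn as sns            # statistical plotting built on top of Matplotlib

# Optional: apply seaborn's default theme to all subsequent plots
sns.set_theme()
```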
Dataset
We are going to use the penguins dataset from the palmerpenguins package (Horst, Hill, and Gorman 2020). We do not need to download this dataset by hand, because seaborn offers it as one of its example datasets (seaborn fetches the file from its online data repository on first use and then caches it locally). We only need to load it into our session using the sns.load_dataset function, specify penguins as the name of the dataset, and assign the result to df:
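The loading step might look like the sketch below. The helper name load_penguins is hypothetical and is used here only so that nothing is fetched until the function is called. The copy of the data that seaborn provides may differ slightly from the printout shown in this post, for example in which columns it includes.

```python
import seaborn as sns


def load_penguins():
    """Return the penguins example dataset as a pandas DataFrame.

    seaborn retrieves the file on first use and caches it locally,
    so the first call needs an internet connection.
    """
    return sns.load_dataset("penguins")


# Typical use in a session:
#   df = load_penguins()
#   print(df.head())  # first five rows
```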
species island bill_length_mm ... body_mass_g sex year
0 Adelie Torgersen 39.1 ... 3750 male 2007
1 Adelie Torgersen 39.5 ... 3800 female 2007
2 Adelie Torgersen 40.3 ... 3250 female 2007
3 Adelie Torgersen NaN ... -2147483648 NaN 2007
4 Adelie Torgersen 36.7 ... 3450 female 2007
[5 rows x 8 columns]
Printing df shows that the dataset is made up of various measurements of three different penguin species: Adelie, Gentoo, and Chinstrap. The dataset contains eight variables: species, island, bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g, sex, and year.
Next, we will call the barplot() function from the Seaborn library to draw a bar plot that displays the average length for each penguin species.
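As a rough sketch of this call, the helper below (a hypothetical name, plot_mean_by_category) wraps sns.barplot so that one bar is drawn per category, with the bar height equal to the mean of the numeric column. The small data frame is made-up stand-in data, not real penguin measurements, and exists only so the sketch runs without a download.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def plot_mean_by_category(data, x, y):
    """Draw one bar per unique value of `x`; bar height is the mean of `y`."""
    fig, ax = plt.subplots(figsize=(6, 4))
    # barplot aggregates `y` within each category using the mean by default
    sns.barplot(data=data, x=x, y=y, ax=ax)
    ax.set_xlabel(x)
    ax.set_ylabel(f"mean {y}")
    return ax


# Toy stand-in data (made-up values, NOT real penguin measurements)
toy = pd.DataFrame({
    "species": ["A", "A", "B", "B"],
    "length_mm": [10.0, 20.0, 30.0, 50.0],
})
ax = plot_mean_by_category(toy, x="species", y="length_mm")
```

With the penguins data loaded as df, the equivalent call would be plot_mean_by_category(df, x="species", y="bill_length_mm") followed by plt.show(). The choice of bill_length_mm is an assumption, since the text does not say which length column is plotted.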
Figure 1 shows the average length of penguins from the three islands. We can pass the argument hue = "sex" to split each bar by sex, which draws the bars for each sex side by side rather than stacking them, as Figure 2 shows.
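The effect of hue can be sketched as follows. The helper name plot_mean_by_group and the toy data frame are hypothetical, and the values are made up rather than taken from the penguins dataset. Each category on the x axis gets one bar per hue level, and each bar shows the mean within that subgroup.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def plot_mean_by_group(data, x, y, hue):
    """Draw grouped bars: one bar per (x, hue) pair, height = mean of `y`."""
    fig, ax = plt.subplots(figsize=(6, 4))
    # hue splits every category into side-by-side bars, one per hue level
    sns.barplot(data=data, x=x, y=y, hue=hue, ax=ax)
    ax.set_ylabel(f"mean {y}")
    return ax


# Toy stand-in data (made-up values, NOT real penguin measurements)
toy = pd.DataFrame({
    "species": ["A"] * 4 + ["B"] * 4,
    "sex": ["male", "male", "female", "female"] * 2,
    "length_mm": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0],
})
ax = plot_mean_by_group(toy, x="species", y="length_mm", hue="sex")
```

For the penguins data, the same idea would be plot_mean_by_group(df, x="species", y="bill_length_mm", hue="sex"), again assuming bill_length_mm is the length of interest.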