Visualize and compare change over time using gganimate.

The article provides detailed instructions on using the gganimate package to animate a static ggplot2 plot, enhancing visualizations with dynamic transitions between data states.
Data Science
Data Visualization
ggplot
Published: June 20, 2024

Introduction

One of the main critiques of ggplot2 (Wickham, 2016) is its static nature, often cited as a limitation compared to more interactive tools like Highcharts. However, this doesn’t mean that animation is entirely unavailable. Enter gganimate, a natural extension of ggplot2 designed to animate visualizations and capture changes in data over time or other variables. With gganimate, you can create dynamic charts and export them as a series of PNGs, or as a single GIF or MP4 file.

Today, you’ll explore gganimate (Pedersen and Robinson, 2024) and discover how to create compelling animated visualizations. By the end of this article, you’ll have a set of visually engaging animations to showcase your data. Let’s start by loading the packages we need for this session:

library(gganimate)
library(tidyverse)

Dataset

The dataset we will use for the example exercises is “Global CO2 emissions from cement production” (Andrew 2022).

dat <- readr::read_csv("https://zenodo.org/record/7081360/files/1.%20Cement_emissions_data.csv", show_col_types = FALSE)
dat
# A tibble: 142 × 221
    Year Afghanistan Albania Algeria Andorra Angola Anguilla
   <dbl>       <dbl>   <dbl>   <dbl>   <dbl>  <dbl>    <dbl>
 1  1880          NA      NA      NA       0     NA       NA
 2  1881          NA      NA      NA       0     NA       NA
 3  1882          NA      NA      NA       0     NA       NA
 4  1883          NA      NA      NA       0     NA       NA
 5  1884          NA      NA      NA       0     NA       NA
 6  1885          NA      NA      NA       0     NA       NA
 7  1886          NA      NA      NA       0     NA       NA
 8  1887          NA      NA      NA       0     NA       NA
 9  1888          NA      NA      NA       0     NA       NA
10  1889          NA      NA      NA       0     NA       NA
# ℹ 132 more rows
# ℹ 214 more variables: `Antigua and Barbuda` <dbl>, Argentina <dbl>,
#   Armenia <dbl>, Aruba <dbl>, Australia <dbl>, Austria <dbl>,
#   Azerbaijan <dbl>, Bahamas <dbl>, Bahrain <dbl>, Bangladesh <dbl>,
#   Barbados <dbl>, Belarus <dbl>, Belgium <dbl>, Belize <dbl>, Benin <dbl>,
#   Bermuda <dbl>, Bhutan <dbl>, `Bonaire, Saint Eustatius and Saba` <dbl>,
#   `Bosnia and Herzegovina` <dbl>, Botswana <dbl>, Brazil <dbl>, …

We subset the emission measurements from 1960 onwards and drop every column that contains a missing value (NA) or a zero. The output below shows the result.

dat |> 
  dplyr::filter(Year >= 1960) |>            # keep 1960 onwards
  select_if(function(x) all(!is.na(x))) |>  # drop columns containing any NA
  select_if(function(x) all(!x == 0))       # drop columns containing any zero
# A tibble: 62 × 109
    Year Afghanistan Algeria Angola Argentina Armenia Australia Austria
   <dbl>       <dbl>   <dbl>  <dbl>     <dbl>   <dbl>     <dbl>   <dbl>
 1  1960        18.0    523.   79.4      1305    89.6      1381    1399
 2  1961        21.8    530.   76.9      1436   100.       1414    1523
 3  1962        29.1    432.   83.3      1446   113.       1450    1512
 4  1963        50.9    436.   94.5      1254   120.       1541    1635
 5  1964        61.8    389.  105.       1439   128.       1792    1865
 6  1965        83.6    367.  120.       1632   143.       1879    1999
 7  1966        87.2    327.  131.       1723   159.       1817    2224
 8  1967        65.4    360.  138.       1756   168.       1890    2246
 9  1968        47.1    429   152.       2064   174        1941    2250
10  1969        50.9    469.  189.       2148   179.       2130    2253
# ℹ 52 more rows
# ℹ 101 more variables: Azerbaijan <dbl>, Bangladesh <dbl>, Belarus <dbl>,
#   Belgium <dbl>, `Bosnia and Herzegovina` <dbl>, Brazil <dbl>,
#   Bulgaria <dbl>, Canada <dbl>, Chile <dbl>, China <dbl>, Colombia <dbl>,
#   Croatia <dbl>, Cyprus <dbl>, `Czech Republic` <dbl>, `North Korea` <dbl>,
#   `Democratic Republic of the Congo` <dbl>, Denmark <dbl>,
#   `Dominican Republic` <dbl>, Ecuador <dbl>, Egypt <dbl>, …
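As a side note, select_if() is superseded in recent versions of dplyr in favour of select(where(...)). The sketch below shows the same column filter written that way on a small made-up tibble; the toy data and the names toy and kept are assumptions for illustration only, not part of the cement dataset.

```r
library(dplyr)

# Toy data (made up): one complete column, one with an NA, one with a zero.
toy <- tibble(
  Year = c(1960, 1961, 1962),
  A    = c(1.5, 2.0, 2.5),
  B    = c(NA, 3.0, 4.0),
  C    = c(0, 1.0, 2.0)
)

kept <- toy |>
  select(where(\(x) !anyNA(x))) |>   # drop columns containing any NA
  select(where(\(x) all(x != 0)))    # drop columns containing any zero

names(kept)
```

Only Year and A survive, because B contains a missing value and C contains a zero. This is also why a country with a single NA or zero anywhere in the period disappears from the filtered table.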

Notice that select_if has also removed some of the countries we are interested in. We therefore change the code: we pick the countries of interest by name with select and keep the years after 1960 with filter.

selected.countries = dat |> 
  select(Year, Kenya, Uganda, Tanzania, Angola, Malawi, Zambia, Mozambique) |> 
  filter(Year > 1960)

Unfortunately, the dataset is in wide format. To visualize and compare the emissions across countries, we first need to convert it from wide to long format.

selected.countries.long = selected.countries |> 
  pivot_longer(cols =  -Year, names_to = "countries", values_to = "Emmissions")
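To make the reshaping step concrete, here is a minimal sketch of the same pivot_longer() call on a tiny made-up data frame. The values and the names wide and long are hypothetical; only the column names mirror the ones used in this article.

```r
library(tidyr)

# Toy wide table (made-up values): one row per year, one column per country.
wide <- data.frame(
  Year   = c(2020, 2021),
  Kenya  = c(10, 12),
  Uganda = c(3, 4)
)

# Every column except Year is stacked into a name column and a value column.
long <- pivot_longer(
  wide,
  cols      = -Year,
  names_to  = "countries",
  values_to = "Emmissions"
)

long
```

Two years times two countries gives four rows, each holding one year, one country and one value, which is the shape ggplot2 expects when mapping country to colour.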

Next, plot the emissions from cement production over time for the selected African countries.

selected.countries.long |> 
  ggplot(aes(x = Year, y = Emmissions, color = countries)) +
  geom_line() + 
  # scale_y_log10() +  # uncomment for a log scale and relabel the y axis to match
  labs(
    x = "Year", 
    y = "Emissions",
    color = "Country")

Next, rank the countries by their cement emissions within each year of the dataset. To do that, we group by Year and pass the negated annual emissions to the rank function, so that the country with the highest emissions gets rank 1 and the country with the lowest emissions gets the largest rank.

rankings = selected.countries.long |> 
  group_by(Year) |> 
  mutate(Rank = rank(-Emmissions)) |> 
  # filter(Year == 1970) |> 
  arrange(Rank) |> 
  mutate(labelling = as.character(Emmissions))
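The behaviour of rank() on negated values can be checked on a small example. The numbers below are made up; the second half shows what happens with ties, which by default receive an averaged (possibly fractional) rank.

```r
# Made-up emissions for three countries in a single year.
emissions <- c(Kenya = 30, Uganda = 10, Malawi = 20)

# Negating the values makes the largest emission rank first.
ranks <- rank(-emissions)
ranks

# Ties: the default method averages the tied positions.
tied <- c(A = 10, B = 30, C = 30)
rank(-tied)                          # B and C share rank 1.5
rank(-tied, ties.method = "first")   # ties broken by order of appearance
rank(-tied, ties.method = "min")     # both tied values get the lower rank
```

If two countries ever report identical emissions in a year, choosing ties.method = "first" is one way to keep every rank a distinct whole number, so bars do not land on the same position.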

Let’s filter the cement production emissions for 2021:

rankings |> 
  filter(Year == 2021) |> 
  select(-labelling) |> 
  flextable::flextable()

Year   countries     Emmissions   Rank
2021   Kenya           3,583.00      1
2021   Tanzania        2,524.00      2
2021   Angola          1,124.00      3
2021   Mozambique      1,011.00      4
2021   Zambia            961.00      5
2021   Uganda            581.80      6
2021   Malawi             94.16      7

In 2021, Kenya had the highest emissions from cement production among the selected countries, at 3583 tons. It was followed by Tanzania (2524 tons), Angola (1124 tons), Mozambique (1011 tons), Zambia (961 tons) and Uganda (581.8 tons). Malawi emitted the least, at 94.16 tons.

Creating and Styling a Bar Chart for a Single Time Period

Creating an animated chart can be time-consuming, so it’s wise to start with a simpler task by building a visualization for a single time period. This allows you to ensure that everything appears exactly as you intend. For race charts to function correctly, they require one essential element: the rank. In our case, the rank indicates the position of a country’s emissions compared to the other countries. Essentially, it determines the position of each bar in the chart.

ggplot(
  data = rankings |> filter(Year == 1980), 
  aes(x = reorder(countries, -Rank), y = Emmissions, color = countries, fill = countries)
  )+
  geom_col()+
  coord_flip(clip = "off", expand = FALSE) +
  geom_text(aes(label = labelling), hjust = -.25) +
  theme_minimal() +
  theme(
    legend.position = "none", axis.title.y = element_blank(),
    plot.margin = margin(1, 2, 1, 2, unit = "cm")
    ) +
  ggsci::scale_color_jama()+
  ggsci::scale_fill_jama()
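To see how the rank decides where each bar lands, it helps to look at what reorder(countries, -Rank) does to the factor levels. The snapshot below is a hypothetical single-year example with made-up ranks, not data from the article.

```r
# Hypothetical single-year snapshot (made-up ranks).
snapshot <- data.frame(
  countries = c("A", "B", "C"),
  Rank      = c(2, 1, 3)
)

# reorder() sorts the factor levels by increasing -Rank, so the worst rank
# comes first and the best rank comes last.
ordered_countries <- reorder(snapshot$countries, -snapshot$Rank)
levels(ordered_countries)
```

The levels come out as C, A, B. ggplot2 draws the first level closest to the origin, so after coord_flip() the country with rank 1 ends up at the top of the chart.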

Animate

Animating a chart with gganimate is easy. The only changes required in the charting code are removing the year filter, so that all time periods are included, and storing the entire plot in a variable. We assign our ggplot object to p, which is then passed to the animation functions.

p = ggplot(
  data = rankings , 
  aes(x = reorder(countries, -Rank), y = Emmissions, color = countries, fill = countries)
  )+
  geom_col()+
  coord_flip(clip = "off", expand = FALSE) +
  geom_text(aes(label = labelling), hjust = -.25) +
  theme_minimal() +
  theme(
    legend.position = "none", axis.title.y = element_blank(),
    plot.margin = margin(1, 2, 1, 2, unit = "cm")
    ) +
  ggsci::scale_color_jama()+
  ggsci::scale_fill_jama()
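One point worth checking once the year filter is gone: reorder() computes a single ordering for the whole data frame, using the mean of -Rank per country by default. The toy example below (made-up ranks, hypothetical names) sketches the difference between that overall ordering and the ordering within each year.

```r
# Toy data (made up): two countries that swap places between two years.
toy <- data.frame(
  Year      = c(2000, 2000, 2001, 2001),
  countries = c("A", "B", "A", "B"),
  Rank      = c(1, 2, 2, 1)
)

# Across all rows, reorder() averages -Rank per country (default FUN = mean).
overall <- reorder(toy$countries, -toy$Rank)
attr(overall, "scores")   # both countries score -1.5

# Within a single year the ordering follows that year's rank.
y2000 <- subset(toy, Year == 2000)
y2001 <- subset(toy, Year == 2001)
levels(reorder(y2000$countries, -y2000$Rank))
levels(reorder(y2001$countries, -y2001$Rank))
```

In this sketch the per-year levels flip between 2000 and 2001, while the overall ordering stays fixed. If the bars are meant to overtake each other during the animation, a common alternative is to map the Rank column itself to the position aesthetic; whether that is needed here depends on the effect you want.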

Once we have the plot object, we can animate it with gganimate, as the chunk below shows. The p object holds the static ggplot created above. The transition_states function animates the plot across the values of the Year variable. Its transition_length and state_length arguments are relative lengths rather than seconds: here the transition between two states takes four times as long as the pause at each state. The view_follow(fixed_x = TRUE) function keeps the x scale fixed throughout the animation while the y scale follows the data. The plot’s title updates dynamically to show the current year through the {closest_state} label variable, and the caption at the bottom gives the source as “SEMBA@2024”.

animated_plot = p + 
  transition_states(
    states = Year, 
    transition_length = 4, 
    state_length = 1
    ) +
  view_follow(fixed_x = TRUE) +
  labs(title = "Annual CO2 Emissions from Cement Production in Tonnes ({closest_state})", caption = "SEMBA@2024")

animated_plot
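The {closest_state} placeholder in the title follows glue-style string interpolation. As a rough illustration of how such a template is filled in for each state, the sketch below uses the glue package directly; the template text and the three years are made up, and the code only mimics the substitution rather than reproducing what gganimate does internally.

```r
library(glue)

# Illustrative template; closest_state is supplied by hand here,
# whereas gganimate provides it for every frame.
title_template <- "Year: {closest_state}"

frame_titles <- vapply(
  c(1980, 2000, 2021),
  function(yr) as.character(glue(title_template, closest_state = yr)),
  character(1)
)

frame_titles
```

Each element of frame_titles is the template with the placeholder replaced by one year, which is the effect seen in the animated title.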

This approach allows for a dynamic and engaging visualization of changes in annual emissions from cement production over time.

The animate() function renders the animation as a GIF of 1024x800 pixels, at a resolution of 150 ppi, with a total of 600 frames. At 30 frames per second this gives an animation of about 20 seconds. The end_pause = 60 argument repeats the last frame 60 times, so the final state stays on screen before the GIF loops, and gifski_renderer() writes the result to the named file.

animate(
  plot = animated_plot,
  width = 1024,
  height = 800,
  res = 150,
  nframes = 600,
  fps = 30,
  end_pause = 60,
  renderer = gifski_renderer("cement_production_racee_chart.gif")
)
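The timing implied by these settings is simple arithmetic, sketched below. The number of states assumes one state per year from 1961 to 2021, which matches the filter used earlier; how the pause frames are counted against nframes is left to the gganimate documentation.

```r
# Values taken from the animate() call above.
nframes   <- 600
fps       <- 30
end_pause <- 60

duration_s <- nframes / fps     # playback length in seconds
pause_s    <- end_pause / fps   # how long the last frame is held

# Assumption: one state per year, 1961 to 2021.
n_states         <- length(1961:2021)
frames_per_state <- nframes / n_states

c(duration_s = duration_s, pause_s = pause_s, frames_per_state = frames_per_state)
```

With roughly ten frames per year, each transition is fairly quick. Raising nframes or lowering fps are the two obvious knobs if the bars move too fast to follow.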

References

Pedersen, T.L., Robinson, D., 2024. gganimate: A grammar of animated graphics.
Wickham, H., 2016. ggplot2: Elegant graphics for data analysis. Springer-Verlag New York.

Citation

BibTeX citation:
@online{semba2024,
  author = {Semba, Masumbuko},
  title = {Visualize and Compare Change over Time Using Gganimate.},
  date = {2024-06-20},
  url = {https://lugoga.github.io/kitaa/posts/visualize_animate/},
  langid = {en}
}
For attribution, please cite this work as:
Semba, M., 2024. Visualize and compare change over time using gganimate. [WWW Document]. URL https://lugoga.github.io/kitaa/posts/visualize_animate/