Unlocking Dynamic Data Visualization in R with Meta-Programming: A Deep Dive into Enquo, Quosures, and ggplot2

Understanding Meta-programming in R with ggplot2

Meta-programming means writing code that inspects or manipulates other code. In R, and in particular with the data visualization package ggplot2, it lets a function receive a column name as an argument and build a plot around it, which makes visualizations dynamic and reusable.

In this article, we will explore how to use meta-programming functions in R to write a function that picks a specific column from a dataframe and creates a ggplot from it. We will also look at the underlying concepts of enquo(), quosures, and the !! (unquote) operator, with examples and explanations for each step.

Introduction to Enquo()

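enquo() is a meta-programming function from the rlang package, re-exported by dplyr and ggplot2. Called inside a function, it captures the expression the caller supplied for an argument (for example a bare column name such as sex), together with the environment it came from, and returns it as a quosure instead of evaluating it. The quosure can be evaluated later in the context of a dataframe, which lets a function work with columns whose names are not known when the function is written.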
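As a minimal sketch of this capturing behaviour, the snippet below wraps enquo() in a small helper. The name capture_col() is hypothetical and used only for illustration; is_quosure(), as_label(), and quo_get_expr() are rlang functions.

```r
library(rlang)

# Hypothetical helper: captures whatever expression the caller typed for `col`
capture_col <- function(col) {
  enquo(col)
}

# `height` is never evaluated here, so no object called height needs to exist
q <- capture_col(height)

is_quosure(q)     # TRUE: the result is a quosure, not a value
as_label(q)       # "height": the captured expression as text
quo_get_expr(q)   # the bare symbol height
```

The quosure only becomes a column lookup once it is unquoted inside a data-aware function such as aes() or dplyr::summarise().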

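The examples in this article use the starwars dataset that ships with dplyr. A typical starting point is a subset of its columns:

main_df <- starwars %>% dplyr::select(name, height, mass)

This line only prepares the data; it does not use enquo(). The goal is a function, plot_gg(), whose column_for_clr argument selects the column that colours the plot. enquo() works when the caller passes a bare column name, but here we want to pass column_for_clr as a character string instead.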
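A small sketch makes the difference between a bare name and a string concrete. The helper capture_arg() and the data frame toy_df are made up for this illustration; quo_is_symbol(), sym(), and eval_tidy() are rlang functions.

```r
library(rlang)

# Hypothetical helper that returns the quosure captured from its argument
capture_arg <- function(col) enquo(col)

# A bare name is captured as a symbol, which can later be looked up in a data frame
quo_is_symbol(capture_arg(sex))     # TRUE

# A string is captured as a constant, so it never refers to the column
quo_is_symbol(capture_arg("sex"))   # FALSE

# sym() turns the string into a symbol; eval_tidy() then finds the column
toy_df <- data.frame(sex = c("male", "female"), stringsAsFactors = FALSE)
col_sym <- sym("sex")
eval_tidy(col_sym, data = toy_df)   # the column contents: "male" "female"
```

This is why a string argument needs a conversion step before it can be unquoted inside aes().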

Using sym() and !! from rlang

One way to achieve this is with the rlang package, which provides the tidy-evaluation tools used by dplyr and ggplot2. Because the column name arrives as a string, capturing it with enquo() is not enough: unquoting the result would map colour to the constant "sex" rather than to the sex column. Instead, we convert the string to a symbol with rlang::sym() and unquote that symbol inside aes() with the !! operator.

Here’s how you might do it:

# Load required packages
library(dplyr)
library(rlang)
library(ggplot2)

# Define the function that picks a specific column from a dataframe and creates a ggplot
plot_gg <- function(main_df,
                    another_df,
                    column_for_clr = c("sex", "gender", "species",
                                       "homeworld")) {

  # Validate the choice; with no argument supplied this falls back to "sex"
  column_for_clr <- match.arg(column_for_clr)

  # Convert the column name (a string) into a symbol that !! can unquote
  var_clf <- sym(column_for_clr)

  new_df <- main_df %>% left_join(another_df, by = "name")

  p_p <- new_df %>%
    ggplot(aes(x = height,
               y = mass,
               colour = !!var_clf)) +
    geom_point() +   # without a geom layer the plot would be empty
    xlab("Height") +
    ylab("Mass")

  # Map classes to colours
  classes_to_colors_10 <- c(
    "feminine"       = "red",        # gender
    "masculine"      = "#3cb44b",    # gender
    "female"         = "navy",       # sex
    "hermaphroditic" = "darkgreen",  # sex
    "male"           = "purple4"     # sex
  )

  if (column_for_clr == "sex") {
    print("colour is sex")
    p_p <- p_p +
      scale_colour_manual(values = classes_to_colors_10) +
      ggtitle("title ")
  }

  # Save the plot as a PNG image in the working directory
  ggsave(plot     = p_p,
         device   = "png",
         filename = file.path(getwd(), "test_starwars21.png"),
         units    = "in",
         width    = 6,
         height   = 4,
         dpi      = 300)

  # Return the ggplot object
  return(p_p)
}

# Create dataframes and plot
main_df1 <- starwars %>% dplyr::select(name, height, mass)
main_df2 <- starwars %>% dplyr::select(name, sex, gender, homeworld, species)

plot_gg(main_df1, main_df2, column_for_clr = "sex")
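When the column name is a string, ggplot2 also provides the .data pronoun, which looks the column up by name without building a symbol by hand. The sketch below is an alternative for illustration only, not the approach used in the function above; plot_by_name() is a hypothetical helper, and it plots starwars directly to stay short.

```r
library(dplyr)    # provides the starwars dataset
library(ggplot2)  # re-exports the .data pronoun

# Hypothetical helper: colours a height/mass scatter plot by a column given as a string
plot_by_name <- function(df, column_for_clr) {
  ggplot(df, aes(x = height,
                 y = mass,
                 colour = .data[[column_for_clr]])) +
    geom_point() +
    labs(x = "Height", y = "Mass", colour = column_for_clr)
}

p <- plot_by_name(starwars, "gender")
```

Setting the colour label explicitly keeps the legend title readable, since the mapping itself is the expression .data[[column_for_clr]].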

Using Quosures

Another option is to let the caller pass a bare column name and capture it as a quosure with enquo(). A quosure bundles the captured expression with the environment it came from, and !! injects it into aes(). Here’s how you might do it:

# Load required packages
library(dplyr)
library(rlang)
library(ggplot2)

# Define the function that picks a specific column from a dataframe and creates a ggplot
# Here column_for_clr is a bare column name (sex), not a string ("sex")
plot_gg <- function(main_df,
                    another_df,
                    column_for_clr) {

  # Capture the bare column name as a quosure using rlang::enquo()
  var_clf <- enquo(column_for_clr)

  new_df <- main_df %>% left_join(another_df, by = "name")

  p_p <- new_df %>%
    ggplot(aes(x = height,
               y = mass,
               colour = !!var_clf)) +
    geom_point() +   # without a geom layer the plot would be empty
    xlab("Height") +
    ylab("Mass")

  # Map classes to colours
  classes_to_colors_10 <- c(
    "feminine"       = "red",        # gender
    "masculine"      = "#3cb44b",    # gender
    "female"         = "navy",       # sex
    "hermaphroditic" = "darkgreen",  # sex
    "male"           = "purple4"     # sex
  )

  # as_label() turns the captured expression into text for the comparison
  if (as_label(var_clf) == "sex") {
    print("colour is sex")
    p_p <- p_p +
      scale_colour_manual(values = classes_to_colors_10) +
      ggtitle("title ")
  }

  # Save the plot as a PNG image in the working directory
  ggsave(plot     = p_p,
         device   = "png",
         filename = file.path(getwd(), "test_starwars21.png"),
         units    = "in",
         width    = 6,
         height   = 4,
         dpi      = 300)

  # Return the ggplot object
  return(p_p)
}

# Create dataframes and plot
main_df1 <- starwars %>% dplyr::select(name, height, mass)
main_df2 <- starwars %>% dplyr::select(name, sex, gender, homeworld, species)

# Note the bare column name: sex, not "sex"
plot_gg(main_df1, main_df2, column_for_clr = sex)

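Note the difference between enquo() and !!. enquo() captures the expression supplied for a function argument as a quosure without evaluating it. The double exclamation points (!!, read "bang-bang") do the opposite: they unquote a quosure or symbol, injecting it into a call such as aes() so that it is evaluated against the dataframe's columns.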
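The capture-then-inject pattern is not specific to ggplot2. The sketch below applies it to dplyr::summarise() and also shows the {{ }} ("curly-curly") shorthand from rlang 0.4.0 and later, which performs both steps at once. The function names mean_of() and mean_of_embraced() are hypothetical.

```r
library(dplyr)

# Explicit two-step form: capture with enquo(), inject with !!
mean_of <- function(df, col) {
  col_quo <- enquo(col)
  df %>% summarise(avg = mean(!!col_quo, na.rm = TRUE))
}

# Shorthand: {{ col }} captures and injects in a single step
mean_of_embraced <- function(df, col) {
  df %>% summarise(avg = mean({{ col }}, na.rm = TRUE))
}

# Both calls take a bare column name and return a one-row tibble
res_explicit <- mean_of(starwars, height)
res_embraced <- mean_of_embraced(starwars, height)
```

The explicit form is still useful when the quosure has to be inspected before use, for example with as_label().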

Conclusion

In conclusion, meta-programming tools such as enquo(), sym(), and the !! operator let a function refer to dataframe columns whose names are supplied by the caller rather than fixed when the function is written. Combined with dplyr and ggplot2, they make it possible to write plotting functions that adapt to different input columns.

Additional Resources

For further learning, here are some additional resources:

  • dplyr: The package for fast data manipulation in R.
  • ggplot2: A framework for creating informative and attractive statistical graphics.
  • rlang: The toolkit behind tidy evaluation, providing quosures, symbols, and the !! operator.

Contributing

If you’d like to contribute new features or fix bugs in the code above, here’s how you might do it:

  1. Fork this repository and make your desired changes.
  2. Open a pull request that describes your changes.
  3. Include unit tests for any new functions you add.

Unit Tests

To test the function plot_gg, we can write unit tests with testthat, a testing package from CRAN (it is not part of base R). Here’s an example of how we might do it:

# Load required packages
library(testthat)
library(dplyr)
library(ggplot2)

# Assumes the string-based plot_gg() from the first example has already been
# defined or sourced in this session
test_that("plot_gg maps colour to the requested column", {
  # Create dataframes and plot
  main_df1 <- starwars %>% dplyr::select(name, height, mass)
  main_df2 <- starwars %>% dplyr::select(name, sex, gender, homeworld, species)

  p_p <- plot_gg(main_df1, main_df2, column_for_clr = "sex")

  # Comparing whole ggplot objects is brittle, so check the pieces that matter
  expect_true(inherits(p_p, "ggplot"))
  expect_equal(rlang::as_label(p_p$mapping$x), "height")
  expect_equal(rlang::as_label(p_p$mapping$y), "mass")
  expect_equal(rlang::as_label(p_p$mapping$colour), "sex")
})

# To run every test file in a directory instead, point test_dir() at that
# directory, for example test_dir("tests/testthat")

Note that this is a simplified example and real-world testing would likely involve more complex scenarios.


Last modified on 2025-03-08