Creating Custom Heat Maps with R: A Step-by-Step Guide

Understanding Heat Maps and Creating a “Heat Map” of Draws
===========================================================

In this article, we explore the concept of heat maps and build a custom plot that represents a distribution of draws in a “heat map” style. This involves calculating quantiles for each column of the data, reshaping the result, and then plotting overlapping semi-transparent ribbons, so that regions covered by more ribbons appear darker and highlight the central part of the distribution.

Background on Heat Maps

A heat map is a graphical representation of data in which values are depicted by colors or intensities. It is commonly used in fields such as statistics, economics, and science to visualize how a value varies across two dimensions.

In the context of this problem, we want to create a “heat map” of draws, which represents a distribution of forecasts, with x indexing the columns of the draw matrix (for example, the forecast step) and y representing the forecast values. The idea is to shade the bands between matching lower and upper quantiles, so that the shading is darkest around the median, similar to how heat maps display color intensities.

Preparing Data

To start creating our custom plot, we need to prepare our data in a suitable format.

# Set seed for reproducibility
set.seed(123)

# Create a matrix with 10,000 rows and 10 columns using a uniform distribution
n <- 10000
k <- 10
mat <- matrix(runif(n * k), nrow = n)

Next, we calculate quantiles for each column of the matrix using apply. Because apply returns one row per quantile and one column per matrix column, we transpose the result before converting it into a data frame.

# Calculate quantiles for each column
dat <- as.data.frame(t(apply(mat, MARGIN = 2, FUN = quantile, probs = seq(.1, 0.9, 0.1))))
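
To see why the transpose is needed, the following minimal sketch inspects the shape of what apply returns. It uses a smaller matrix, `m_small`, which is a hypothetical stand-in for `mat` and not part of the original example.

```r
# Minimal sketch: inspect the shape returned by apply() before and after t().
# `m_small` is a hypothetical stand-in for `mat`, kept small for speed.
set.seed(1)
m_small <- matrix(runif(200 * 3), nrow = 200)   # 200 draws, 3 columns

probs <- seq(0.1, 0.9, 0.1)
q <- apply(m_small, MARGIN = 2, FUN = quantile, probs = probs)

dim(q)       # one row per quantile, one column per matrix column
dim(t(q))    # after transposing: one row per matrix column
rownames(q)  # quantile labels such as "10%", "50%", "90%"
```

The row names produced by quantile become the column names of `dat` after the transpose, which is what the later reshaping step relies on.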

Each row of the data frame now corresponds to one column of the original matrix, so we add an x variable that records that column index.

# Add x variable
dat$x <- 1:nrow(dat)

Converting Data into a “Long” Format

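To create our custom plot, we need to convert our data into a long format with one row per combination of x and quantile, and a single y column holding the quantile values. We also add a group variable, the distance of each quantile from 50, so that mirror-image quantiles (such as 10 and 90) share the same group.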

# Load required libraries
library(dplyr)
library(tidyr)

# Convert data into a long format
dat_long <- gather(dat, "quantile", value = "y", -x) %>%
    mutate(quantile = as.numeric(gsub("%", "", quantile)),
           group = abs(50 - quantile))
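
The tidyr documentation marks gather as superseded in favor of pivot_longer. As a hedged alternative, the same reshaping could be sketched as follows. The table `dat_demo` is a small hypothetical stand-in for `dat`, and the row order of the result may differ from what gather produces.

```r
library(dplyr)
library(tidyr)

# Small hypothetical wide table shaped like `dat`: quantile columns plus x.
dat_demo <- data.frame(`10%` = c(0.1, 0.2), `50%` = c(0.5, 0.6),
                       `90%` = c(0.9, 1.0), x = 1:2, check.names = FALSE)

# One row per (x, quantile) pair; strip the "%" and measure distance from 50.
dat_long_demo <- dat_demo %>%
    pivot_longer(cols = -x, names_to = "quantile", values_to = "y") %>%
    mutate(quantile = as.numeric(sub("%", "", quantile, fixed = TRUE)),
           group = abs(50 - quantile))

dat_long_demo
```

Here the 10% and 90% columns both end up with group 40, which is what lets them be paired into a single band later.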

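We then split the data into the quantiles below 50 and those above 50, and join the two halves on x and group so that each row holds the lower (ymin) and upper (ymax) bound of one band.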

# Pair each quantile below 50 (ymin) with its mirror above 50 (ymax)
dat_ribbon <- dat_long %>%
    filter(quantile < 50) %>%
    mutate(ymin = y) %>%
    select(x, ymin, group) %>%
    left_join(
        dat_long %>%
            filter(quantile > 50) %>%
            mutate(ymax = y) %>%
            select(x, ymax, group),
        by = c("x", "group")  # explicit keys avoid the "Joining with" message
    )
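
As a quick sanity check on this pairing, the sketch below builds a tiny long table with made-up values and confirms that each (x, group) pair yields exactly one band with its lower bound below its upper bound. The names `dat_long_demo`, `lower`, `upper`, and `ribbon_demo` are hypothetical and only serve this illustration.

```r
library(dplyr)

# Hypothetical long table: 2 x positions, quantiles 10/30/50/70/90.
dat_long_demo <- expand.grid(x = 1:2, quantile = c(10, 30, 50, 70, 90)) %>%
    mutate(y = quantile / 100 + x,      # made-up values, increasing in quantile
           group = abs(50 - quantile))

lower <- dat_long_demo %>% filter(quantile < 50) %>% select(x, ymin = y, group)
upper <- dat_long_demo %>% filter(quantile > 50) %>% select(x, ymax = y, group)

ribbon_demo <- left_join(lower, upper, by = c("x", "group"))

# Each (x, group) pair should appear once, with the lower bound below the upper.
stopifnot(nrow(ribbon_demo) == 4, all(ribbon_demo$ymin < ribbon_demo$ymax))
ribbon_demo
```

The median (quantile 50) is excluded from both halves, which is why it gets its own data frame.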

We also create a separate data frame for the median.

# Create a data frame for the median
dat_median <- filter(dat_long, quantile == 50)

Plotting the Custom “Heat Map”

Using ggplot2, we can create our custom plot with a transparent ribbon for each group and a single line at the median.

# Load ggplot2 library
library(ggplot2)

# Plot the custom "heat map"
ggplot(dat_ribbon, aes(x = x)) +
    geom_ribbon(aes(ymin = ymin, ymax = ymax, group = group), alpha = 0.2) +
    geom_line(aes(y = y), data = dat_median, color = "white")

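The resulting plot shows one semi-transparent ribbon per group. Every ribbon uses the same alpha, but the ribbons are nested, so the shading grows darker toward the median, where more ribbons overlap. The darkest band marks the range between the 40% and 60% quantiles.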
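
To get a feel for how dark the overlapping regions become, the following sketch computes the effective opacity where several ribbons of equal alpha are stacked. It assumes standard “over” alpha compositing of same-colored layers, which is an assumption about the rendering and not something stated in the original code.

```r
# Effective opacity where n ribbons of equal alpha overlap,
# assuming standard "over" alpha compositing of same-coloured layers.
alpha <- 0.2
n_layers <- 1:4                       # 4 ribbons: 10-90, 20-80, 30-70, 40-60
effective <- 1 - (1 - alpha)^n_layers
round(effective, 4)                   # 0.2000 0.3600 0.4880 0.5904
```

Under that assumption, the outermost band is drawn at 20% opacity and the innermost at roughly 59%, so lowering alpha or adding more quantiles gives a smoother gradient.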

Conclusion

In this article, we created a custom plot that represents a distribution of draws in a “heat map” style. We calculated quantiles for each column, reshaped the data into a long format, and plotted nested semi-transparent ribbons whose overlap darkens the central part of the distribution. This is just one example of how the heat map idea can be adapted; feel free to experiment and apply this concept to your own projects!

Last modified on 2024-02-03