Why Hinton Diagrams?

library(gghinton)
library(ggplot2)

Hinton diagrams were introduced in the 1980s by Geoffrey Hinton, one of the founders of deep learning, as a practical tool for debugging neural network weights. They appeared in textbooks on neural networks and connectionist models and became a standard visualization in that literature. Despite this long history, they remain underused in modern data analysis toolkits, in part because no convenient, ggplot2-native implementation existed.

gghinton aims to fix that.

The problem with heatmaps for signed data

Suppose you are training a neural network and you want to inspect a weight matrix: to understand which connections are large, which are small, and which are inhibitory versus excitatory. The standard tool is a heatmap:

set.seed(7)
nr <- 10
nc <- 18
W <- matrix(rnorm(nr * nc, sd = 0.4), nrow = nr, ncol = nc)
rownames(W) <- paste0("neuron_", 1:nr)
colnames(W) <- paste0("input_",  1:nc)

# The standard heatmap approach
df <- as.data.frame(as.table(W))
names(df) <- c("row", "col", "value")

ggplot(df, aes(x = col, y = row, fill = value)) +
  geom_tile() +
  scale_fill_gradient2(low = "blue", mid = "white", high = "red",
                       midpoint = 0) +
  coord_fixed() +
  theme_minimal() +
  theme(panel.grid = element_blank(),
        axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  labs(title = "Weight matrix as a heatmap")

This works, but it has weaknesses:

  1. Colour choice matters a lot. Blue/white/red is readable; many diverging palettes are not (especially for colourblind readers).
  2. Small differences are hard to judge. Is 0.35 more than twice 0.16? With colour, you can’t easily tell.
  3. Near-zero entries look similar to each other and to slightly positive/negative entries.
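
To make the second weakness concrete, the sketch below computes the ratio in question and an approximate fill colour for each of the two values. It uses only base R. The symmetric limit of 1 and the linear interpolation in RGB space are assumptions chosen for illustration, so the hex codes will not exactly match what scale_fill_gradient2() produces.

```r
# Two weights whose ratio is hard to judge from colour alone
w <- c(0.35, 0.16)
ratio <- w[1] / w[2]   # about 2.19, so yes: more than twice

# Rough stand-in for a blue/white/red diverging scale (assumptions:
# symmetric limits of -1 and 1, linear interpolation in RGB space)
ramp <- colorRamp(c("blue", "white", "red"))

# Map a value in [-limit, limit] onto [0, 1], with 0 landing on white
to_unit <- function(v, limit = 1) (v + limit) / (2 * limit)

fills <- rgb(ramp(to_unit(w)), maxColorValue = 255)

data.frame(weight = w, fill = fills)
```

Both values land on pale tints of red. Deciding from those two tints alone that one weight is roughly double the other is the judgement a heatmap asks the reader to make.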

Now the same data as a Hinton diagram:

df_h <- matrix_to_hinton(W,
  rowname_col = "row", colname_col = "col", value_col = "weight")

ggplot(df_h, aes(x = col, y = row, weight = weight)) +
  geom_hinton() +
  scale_fill_hinton() +
  scale_x_continuous(breaks = seq_along(colnames(W)), labels = colnames(W)) +
  scale_y_continuous(breaks = seq_along(rownames(W)), labels = rev(rownames(W))) +
  coord_fixed() +
  theme_hinton() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  labs(title = "Weight matrix as a Hinton diagram")

The key differences:

Why area beats colour for magnitude

Research in graphical perception (Cleveland & McGill 1984; Mackinlay 1986) ranks visual encoding channels by how accurately people can decode quantitative information from them. A simplified version of that ranking for magnitude:

  1. Position on a common scale (best)
  2. Length
  3. Angle / slope
  4. Area
  5. Colour saturation (worst for quantitative comparison)

Heatmaps use colour saturation, the worst channel for magnitude. Hinton diagrams use area, which ranks above it. The improvement is most pronounced when:

The signed data advantage

For correlation matrices or weight matrices where sign matters, Hinton diagrams have an additional advantage. A heatmap must choose a diverging colour scheme, map its midpoint correctly to zero, and hope that readers can distinguish near-zero from slightly positive from slightly negative values.

A Hinton diagram encodes sign with the most basic visual distinction possible: black vs white. There is no perceptual ambiguity.
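
The two channels can be sketched in a few lines of base R. The helper hinton_encoding() below is hypothetical and is not part of gghinton. It assumes that square area is proportional to the absolute value relative to the largest absolute value in the data. That is the usual Hinton convention, but it is not confirmed here as the package's exact internal scaling.

```r
# Hypothetical helper (not a gghinton function): split each value into
# a sign channel (fill) and a magnitude channel (square area).
hinton_encoding <- function(w, max_abs = max(abs(w))) {
  area <- abs(w) / max_abs            # assumed: area proportional to |w|
  data.frame(
    value = w,
    fill  = ifelse(w >= 0, "white", "black"),  # sign: white = positive
    area  = area,
    side  = sqrt(area)                # side length that yields that area
  )
}

enc <- hinton_encoding(c(0.72, -0.35, 0.15, -0.58))
enc

# Area ratios equal magnitude ratios, so relative size can be read off
enc$area[1] / enc$area[2]   # same as 0.72 / 0.35
```

Sign and magnitude never share a channel in this scheme: the fill answers "which direction?" and the area answers "how much?".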

set.seed(3)
# A small hand-written correlation matrix (illustrative values)
S <- matrix(c(
   1.00,  0.72, -0.35,  0.15,
   0.72,  1.00, -0.21,  0.08,
  -0.35, -0.21,  1.00, -0.58,
   0.15,  0.08, -0.58,  1.00
), 4, 4)
vars <- c("IQ", "Memory", "Anxiety", "Stress")
rownames(S) <- colnames(S) <- vars

df_cor <- matrix_to_hinton(S)

ggplot(df_cor, aes(x = col, y = row, weight = weight)) +
  geom_hinton() +
  scale_fill_hinton() +
  scale_x_continuous(breaks = 1:4, labels = vars) +
  scale_y_continuous(breaks = 1:4, labels = rev(vars)) +
  coord_fixed() +
  theme_hinton() +
  labs(title = "Correlation matrix",
       subtitle = "White = positive, black = negative")

Notice how the negative Anxiety-Stress correlation (-0.58) is immediately visible as a large black square, while the small positive IQ-Stress correlation (0.15) is barely visible.

When other visualisations are better

Hinton diagrams are not universally superior. Other visualisations may be better when: