Why Hinton Diagrams?

Hinton diagrams were introduced in the 1980s by Geoffrey Hinton, one of the founders of deep learning, as a practical tool for inspecting neural network weights. They appeared in textbooks on neural networks and connectionist models and became a standard visualization in that literature. Despite this long history, they remain underused in modern data analysis toolkits, in part because no convenient, ggplot2-native implementation existed.

The problem with heatmaps for signed data

Suppose you are training a neural network and you want to inspect a weight matrix: to understand which connections are large, which are small, and which are inhibitory versus excitatory. The standard tool is a heatmap:

library(ggplot2)

set.seed(7)
nr <- 10
nc <- 18
W <- matrix(rnorm(nr*nc, sd = 0.4), nrow = nr, ncol = nc)
rownames(W) <- paste0("neuron_", 1:nr)
colnames(W) <- paste0("input_",  1:nc)

# The standard heatmap approach
df <- as.data.frame(as.table(W))
names(df) <- c("row", "col", "value")

ggplot(df, aes(x = col, y = row, fill = value)) +
  geom_tile() +
  scale_fill_gradient2(low = "blue", mid = "white", high = "red",
                       midpoint = 0) +
  coord_fixed() +
  theme_minimal() +
  theme(panel.grid = element_blank(),
        axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  labs(title = "Weight matrix as a heatmap")

This works, but it has weaknesses:

Colour choice matters a lot. Blue/white/red is readable; many diverging palettes are not (especially for colourblind readers).
Small differences are hard to judge. Is 0.35 more than twice 0.16? With colour, you can’t easily tell.
Near-zero entries look similar to each other and to slightly positive/negative entries.
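
To make the last point concrete, the sketch below approximates a blue-white-red scale with base R's colorRamp() and prints the hex colours for a few weights. It assumes a colour scale spanning -1 to 1 and plain RGB interpolation, so the exact colours will differ from what scale_fill_gradient2() draws. to_hex() is a hypothetical helper written for this illustration, not part of any package.

```r
# Approximate a blue-white-red diverging scale (assumption: limits of -1 to 1,
# linear RGB interpolation; ggplot2 interpolates in a different colour space).
ramp <- colorRamp(c("blue", "white", "red"))

# Hypothetical helper: map values in [-limit, limit] to hex colours.
to_hex <- function(v, limit = 1) {
  rgb_vals <- ramp((v + limit) / (2 * limit))  # rescale to [0, 1]
  rgb(rgb_vals[, 1], rgb_vals[, 2], rgb_vals[, 3], maxColorValue = 255)
}

vals <- c(-0.05, 0, 0.05, 0.16, 0.35)
print(data.frame(value = vals, colour = to_hex(vals)))
```

Under these assumptions the three values closest to zero all land very close to white, which is why small positive and small negative cells are hard to tell apart on a heatmap.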

Now the same data as a Hinton diagram:

df_h <- matrix_to_hinton(W,
  rowname_col = "row", colname_col = "col", value_col = "weight")

ggplot(df_h, aes(x = col, y = row, weight = weight)) +
  geom_hinton() +
  scale_fill_hinton() +
  scale_x_continuous(breaks = seq_along(colnames(W)), labels = colnames(W)) +
  scale_y_continuous(breaks = seq_along(rownames(W)), labels = rev(rownames(W))) +
  coord_fixed() +
  theme_hinton() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  labs(title = "Weight matrix as a Hinton diagram")

The key differences:

Dominant weights are immediately visible: large squares catch the eye.
Near-zero weights are nearly invisible: the background shows through.
Sign is black-and-white: no colour palette decisions, no colourblind concerns.
Magnitude comparisons are easier: readers judge relative area more reliably than relative colour saturation, although less precisely than position or length.
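
The area encoding behind the last point can be sketched in a few lines. This assumes the usual Hinton convention that a square's area is proportional to the absolute value, scaled by the largest magnitude. hinton_side() is a hypothetical helper for illustration, not the package's internal code.

```r
# Hypothetical helper: side length of a square whose AREA is proportional
# to |w|, scaled so that the largest magnitude fills a unit cell.
hinton_side <- function(w, max_abs = max(abs(w))) {
  sqrt(abs(w) / max_abs)
}

w <- c(0.16, -0.35)
side <- hinton_side(w, max_abs = 1)
area <- side^2

area[2] / area[1]   # ratio of areas matches the ratio of magnitudes
side[2] / side[1]   # side lengths differ by only the square root of that ratio
```

Because the side grows with the square root of the value, a weight has to quadruple before its square doubles in width.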

Why area beats colour for magnitude

A large body of research in visual perception (Mackinlay 1986; Cleveland & McGill 1984) ranks visual encoding channels by how accurately humans can decode quantitative information. In simplified form, the ranking for magnitude is:

Position on a common scale (best)
Length
Angle / slope
Area
Colour saturation (at or near the bottom for quantitative comparison)

Heatmaps use colour saturation, which sits at or near the bottom of these rankings. Hinton diagrams use area, which ranks higher. The improvement is most pronounced when:

Values span a wide range (e.g., 0.01 to 0.99): tiny vs large squares are unmistakable; pale vs saturated blue is not.
You need to compare non-adjacent entries: a square's size reads the same wherever it sits in the matrix, whereas a cell's apparent colour shifts with the colours around it.

The signed data advantage

For correlation matrices or weight matrices where sign matters, Hinton diagrams have an additional advantage. A heatmap must choose a diverging colour scheme, map its midpoint correctly to zero, and hope that readers can distinguish near-zero from slightly-positive from slightly-negative.

A Hinton diagram encodes sign with the most basic visual distinction possible: black vs white. There is no perceptual ambiguity.

# An illustrative, hard-coded correlation matrix (no random numbers needed)
S <- matrix(c(
   1.00,  0.72, -0.35,  0.15,
   0.72,  1.00, -0.21,  0.08,
  -0.35, -0.21,  1.00, -0.58,
   0.15,  0.08, -0.58,  1.00
), 4, 4)
vars <- c("IQ", "Memory", "Anxiety", "Stress")
rownames(S) <- colnames(S) <- vars

df_cor <- matrix_to_hinton(S)

ggplot(df_cor, aes(x = col, y = row, weight = weight)) +
  geom_hinton() +
  scale_fill_hinton() +
  scale_x_continuous(breaks = 1:4, labels = vars) +
  scale_y_continuous(breaks = 1:4, labels = rev(vars)) +
  coord_fixed() +
  theme_hinton() +
  labs(title = "Correlation matrix",
       subtitle = "White = positive, black = negative")

Notice how the Anxiety-Stress negative correlation is immediately visible as a large black square, while the small positive IQ-Stress correlation is nearly absent.
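
The same reading can be checked numerically. The sketch below lists each variable pair from the matrix above with the fill its sign implies and a relative area, assuming area proportional to the absolute correlation. It uses base R only and does not reproduce how geom_hinton() computes sizes; pair_df is a name chosen for this example.

```r
vars <- c("IQ", "Memory", "Anxiety", "Stress")
S <- matrix(c(
   1.00,  0.72, -0.35,  0.15,
   0.72,  1.00, -0.21,  0.08,
  -0.35, -0.21,  1.00, -0.58,
   0.15,  0.08, -0.58,  1.00
), 4, 4, dimnames = list(vars, vars))

# One row per off-diagonal pair (upper triangle only; S is symmetric).
idx <- which(upper.tri(S), arr.ind = TRUE)
pair_df <- data.frame(
  var1 = vars[idx[, "row"]],
  var2 = vars[idx[, "col"]],
  r    = S[upper.tri(S)]
)

# Sign decides the fill; magnitude decides the (assumed) relative area.
pair_df$fill     <- ifelse(pair_df$r > 0, "white", "black")
pair_df$rel_area <- abs(pair_df$r)

# Largest squares first
print(pair_df[order(-pair_df$rel_area), ], row.names = FALSE)
```

Sorting by relative area puts the pairs a reader would notice first at the top of the table, with sign carried by the fill column alone.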
