To illustrate the ideas behind discriminant analysis, I created the image below containing a scatterplot with two groups and a plot of the densities of the scores, LD1 on the discriminant function.

The ggplot code for this plot is contained in the gist [discrim-demo.R](https://gist.github.com/friendly/7cd853ced1eeae3b3a91b64205163743)

enter image description here

But I had to do this as two separate plots plots and then compose them in PPT in a single figure. I'm wondering if I can do more of this in R.

  • For example, I my code draws thick line between the means A and B. How can I find and plot the line perpendicular to this, half way between A and B?

  • Is there any way to compose a figure from two separate plots, with one rotated a specified angle?

5 Replies 5

This is very uncommon illustration. Why not rotating the data?

In my view, R's graphics capabilities are designed primarily to represent data rather than create illustrations from scratch. In that sense, you are trying to force a quart into a pint pot: you may be better off using another tool. You ask "… can I find and plot the line perpendicular to this, half way between A and B?". Yes: with geometry. That's not intended to be a flippant answer. If you want a custom plot that lies outside the capabilities of existing packages, you need to put in some effort.

When I need to superimpose two plots with different orientations, I use something like

library(grid)
assemble_plot <- function(plot1, plot2){
  dev.off()
  print(plot1)
  pushViewport(
    viewport(
      name = "rotate", 
      angle = 45, 
      clip = "off", 
      width = sqrt(2)/2, 
      height = sqrt(2)/2
      )
    )
  print(plot2, vp="rotate") 
}

assemble_plot(plot1, plot2)

You could draw a geom_density layer for the rotated (!) data, add it to the scatterplot, and rotate/shift the density plot layer with {ggblend}.

Sorry to make this more complicated, but I think as it stands the diagram gives the wrong impression of how LDA works. I mean, it does give a rough visual intuition of two-group LDA, but it could really wrong-foot someone who was trying to get a mathematical or geometrical understanding of how it works.

The LD vector is not just the line connecting the raw means of each group (unless both groups have equal and spherical covariance, which in this case they don't). It is only equivalent to the line joining the group means after the original data has undergone a whitening transform (i.e. a linear transformation that makes the covariance ellipses circular).

The discriminant boundary is orthogonal to the LD vector, but it is not in general orthogonal to the line connecting the means of the two groups in raw data space.

For your diagram to work as an accurate representation of LDA in its current format, you would need to show the data after a whitening transform.

Let's set-up the same way as you did in your gist:

library(MASS)
library(tidyverse)

set.seed(1234)
n <- 100
mu1 <- c(3,4)
mu2 <- c(6,2)
Sigma <- matrix(c(2, 1, 1, 2), 2, 2)

X1 <- mvrnorm(n, mu1, Sigma) |>
  as.data.frame() |>
  setNames(nm = c("y1", "y2")) |>
  cbind(group = "A")

X2 <- mvrnorm(n, mu2, Sigma) |>
  as.data.frame() |>
  setNames(nm = c("y1", "y2")) |>
  cbind(group = "B")

X <- rbind(X1, X2)

Suppose now we make this easier to transform and visualise by doing away with the ggplot grid and annotations altogether and create our own background grid

grid <- data.frame(x1 = c(seq(-10, 10, 2.5), rep(-10, 9)),
                   y1 = c(rep(-10, 9), seq(-10, 10, 2.5)),
                   x2 = c(seq(-10, 10, 2.5), rep(10, 9)),
                   y2 = c(rep(10, 9), seq(-10, 10, 2.5)),
                   wt = rep(c(0.2, 0.1, 0.2, 0.1, 0.5, 0.1, 0.2, 0.1, 0.2), 2)
                   )

Then we can make a plot that shows our initial data:

ggplot(X, aes(y1, y2, color = group)) +
  geom_segment(aes(x = x1, y = y1, xend = x2, yend = y2, linewidth = I(wt)),
               colour = "gray50", data = grid) +
  geom_point() +
  stat_ellipse(geom = "polygon", aes(fill = group), level = 0.68, alpha = 0.2) +
  theme_void(16) +
  coord_equal(xlim = c(-2.5, 10), ylim = c(-2.5, 10))

enter image description here

If you want to demonstrate how an LDA works geometrically, the first thing that needs to happen is you center your data, so you could plot this

X_centered <- scale(as.matrix(X[1:2]), center = TRUE, scale = FALSE) |>
  as.data.frame() |>
  cbind(group = X$group) 

ggplot(X_centered, aes(y1, y2, color = group)) +
  geom_segment(aes(x = x1, y = y1, xend = x2, yend = y2, linewidth = I(wt)),
               colour = "gray50", data = grid) +
  geom_point() +
  stat_ellipse(geom = "polygon", aes(fill = group), level = 0.68, alpha = 0.2) +
  theme_void(16) +
  coord_equal(xlim = c(-6.25, 6.25), ylim = c(-6.25, 6.25))

enter image description here

The next thing you need to do is whiten the data (so that the intragroup covariance is spherical). This is a linear transformation, and we can apply it to our grid as well so that we know what the transformation is actually doing, effectively squeezing the grid in the Northeast-Southwest direction.

e <- eigen((cov(X1[1:2]) + cov(X2[1:2])) / 2)
white_mat <- e$vectors %*% diag(1 / sqrt(e$values)) %*% t(e$vectors)

X_white <- as.matrix(X_centered[1:2]) %*% white_mat

X_white <- as.data.frame(X_white) |>
  cbind(group = X$group) |>
  rename(y1 = V1, y2 = V2)

grid_white <- cbind(as.data.frame(as.matrix(grid[1:2]) %*% white_mat),
                    as.data.frame(as.matrix(grid[3:4]) %*% white_mat),
                    grid[5]) |> setNames(names(grid))

ggplot(X_white, aes(y1, y2, color = group)) +
  geom_segment(aes(x = x1, y = y1, xend = x2, yend = y2, linewidth = I(wt)),
               colour = "gray50", data = grid_white) +
  geom_point() +
  stat_ellipse(geom = "polygon", aes(fill = group), level = 0.68, alpha = 0.2) +
  theme_void(16) +
  coord_equal(xlim = c(-6.25, 6.25), ylim = c(-6.25, 6.25))

enter image description here

We can see that the covariance ellipses are now approximately circular. (The reason why they are not exactly circular is that though they were drawn from a bivariate normal distribution with the same covariance, neither sample actually has that exact covariance - this is why we had to take their mean covariance for our eigen() decomposition).

Now we can rotate this around the origin so that the group means and the origin all lie along a single horizontal line. We can do this by using the normalized vector between the two group means as a basis for a rotation matrix.

Z <- as.matrix(X_white[1:2])
v <- colMeans(Z[X_white$group == "B", ]) - colMeans(Z[X_white$group == "A", ])
v_len <- sqrt(sum(v^2))
u <- v / v_len  
rotate_mat <- t(rbind(u, c(-u[2], u[1])))

X_rot <- (Z %*% rotate_mat) |>
  as.data.frame() |>
  setNames(c("y1", "y2"))
X_rot$group <- X_white$group

grid_rot <- cbind(as.data.frame(as.matrix(grid_white[1:2]) %*% rotate_mat),
                  as.data.frame(as.matrix(grid_white[3:4]) %*% rotate_mat),
                  grid[5]) |> setNames(names(grid))

ggplot(X_rot, aes(y1, y2, color = group)) +
    geom_segment(aes(x = x1, y = y1, xend = x2, yend = y2, linewidth = I(wt)),
                 colour = "gray50", data = grid_rot) +
    geom_point() +
    stat_ellipse(geom = "polygon", aes(fill = group), level = 0.68, alpha = 0.2) +
    theme_void(16) +
    coord_equal(xlim = c(-6.25, 6.25), ylim = c(-6.25, 6.25))

enter image description here

Now we can draw a line between the group means and project down the centre to get our LD1 density image:

A <- colMeans(X_rot[X_rot$group == "A", 1:2])
B <- colMeans(X_rot[X_rot$group == "B", 1:2])

ggplot(X_rot, aes(y1, y2 + 5, color = group)) +
  geom_segment(aes(x = x1, y = y1 + 5, xend = x2, yend = y2 + 5, 
                   linewidth = I(wt)),
               colour = "gray50", data = grid_rot) +
  geom_point() +
  stat_ellipse(geom = "polygon", aes(fill = group), level = 0.68, alpha = 0.2) +
  annotate("point", x = c(A[1], B[1]), y = c(A[2], B[2]) + 5) +
  annotate("segment", x = A[1], xend = B[1], y = A[2] + 5, yend = B[2] + 5) +
  annotate("segment", x = 0, y = 5, xend = 0, yend = 0, linetype = 2) +
  geom_density(aes(x = y1, y = after_stat(density) * 4, fill = group), 
               inherit.aes = FALSE, alpha = 0.5) +
  annotate("text", x = c(seq(-5, 5, 2.5)), y = -0.5,
           label = seq(-5, 5, 2.5)) +
  annotate("segment", x = c(seq(-5, 5, 2.5)), y = -0.2, yend = 0) +
  annotate("text", x = 0, y = -1, label = "LD1") +
  theme_void(16) +
  coord_equal(xlim = c(-5, 5), ylim = c(-1, 9))

enter image description here

To prove that this geometrical interpretation really shows the LDA, we can take the values from an "actual" LDA and compare them to the values we have plotted. We can see they are the same (within the limits of numerical precision expected from different calculation methods)

LDA_values  <- predict(lda(group ~ y1 + y2, data = X))$x
our_values  <- X_rot$y1
differences <- abs(LDA_values - our_values)
all(differences < 10^-15)
#> [1] TRUE

Now, you could reverse all the transformations and show the original data with the LDA below it, and it would remain accurate. But the discrimination line would no longer be at 90 degrees to the line joining your group means, the bumps on the LDA plot would no longer be orthogonal projections from your point clouds, the scale on the LDA plot would be different from the scale in data space, and it would no longer be parallel to the LDA vector, so the explanatory power of the diagram as a visual representation of LDA would be lost.

I know my sequence of plots here is a lot more complicated to code, but it has the virtue of being a genuine geometric interpretation of LDA that gives a clear intuition of the process.

Your Reply

By clicking “Post Your Reply”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.