
I am trying to find R implementations that allow performing hierarchical classification (not clustering).

The classification problems considered consist of hierarchically nested outcome classes. For example, consider the class "sport" (1). Sub-classes of that class could be "basketball" (1.1), "soccer" (1.2), and "tennis" (1.3). Usually, however, more than two levels are nested: the first level could have classes 1 to 4, the second level classes 1.1, 1.2, 1.3, 2.1, 2.2, 2.3, 2.4, 2.5, ..., the third level classes 1.1.1, 1.1.2, ..., the fourth level classes 1.1.1.1, 1.1.1.2, ..., and so on. This is a tree-structured classification problem in which each sub-class belongs to exactly one parent class.

Hierarchical classification problems of this kind can be tackled using so-called top-down classification approaches, in which conventional multi-class classifiers are applied at each node of the tree. That is, in the example above, one classifier would differentiate between classes 1 to 4, another classifier between classes 1.1, 1.2, and 1.3, another classifier between classes 2.1 to 2.5, and so on. One R implementation of this was the package "HieRanFor", which seems to have been developed on R-Forge but is no longer available. There also exist specialized classifiers designed for hierarchical classification problems, so-called big-bang approaches.
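In case it helps to make the top-down idea concrete, here is a minimal sketch of it in base R. The function names (`fit_top_down`, `predict_top_down`) are made up for this illustration, class labels are assumed to be dot-separated paths like "1.2.3", and `nnet::multinom()` stands in for whatever multi-class learner you would actually use at each node:

```r
library(nnet)  # multinom(); any multi-class learner could be substituted

# Train one multi-class classifier per internal node of the class tree.
fit_top_down <- function(X, labels) {
  paths <- strsplit(labels, ".", fixed = TRUE)
  depth <- max(lengths(paths))
  models <- list()
  for (d in seq_len(depth)) {
    keep <- lengths(paths) >= d
    parent <- vapply(paths[keep], function(p)
      paste(p[seq_len(d - 1)], collapse = "."), character(1))
    child <- vapply(paths[keep], function(p) p[d], character(1))
    for (node in unique(parent)) {
      idx <- parent == node
      if (length(unique(child[idx])) < 2) next  # nothing to discriminate
      dat <- data.frame(X[keep, , drop = FALSE][idx, , drop = FALSE],
                        .y = factor(child[idx]))
      models[[paste0("node:", node)]] <-
        multinom(.y ~ ., data = dat, trace = FALSE)
    }
  }
  models
}

# Prediction walks down the tree, picking the most probable child at each node.
predict_top_down <- function(models, xnew) {
  path <- character(0)
  repeat {
    m <- models[[paste0("node:", paste(path, collapse = "."))]]
    if (is.null(m)) break
    path <- c(path, as.character(predict(m, newdata = xnew)))
  }
  paste(path, collapse = ".")
}
```

This is only a sketch of the recursive structure, not a polished implementation; a real one would also need to handle partial-depth labels, probability propagation, and evaluation with hierarchical metrics.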

Are there any possibilities in R to perform hierarchical classification?

  • Sounds similar to divisive cluster analysis where the data is split into two groups, then those groups are split, etc. The package cluster has functions mona (mona: MONothetic Analysis Clustering of Binary Variables) and diana (diana: DIvisive ANAlysis Clustering) that might work for your application. Commented Aug 17, 2022 at 19:01
  • Thank you for your answer. However, we do not consider clustering, but have fixed outcome classes, which is why we would need to apply a classification algorithm. Commented Aug 18, 2022 at 9:38

1 Answer


Mixed effects models are the traditional way to capture hierarchies in a regression model. The way the likelihood is expressed creates a preference for keeping the coefficients of sub-categories within a category similar to each other, and there can be any number of layers of hierarchy. With a binomial family, they can perform classification. lme4 is the most widely used package for this.

There is probably a better data set to demonstrate this on, but, for example, the following model classifies high-efficiency cars by number of cylinders (a categorical variable) and by manufacturer nested within number of cylinders.

library(tidyverse)
library(lme4)
#> Loading required package: Matrix
#> 
#> Attaching package: 'Matrix'
#> The following objects are masked from 'package:tidyr':
#> 
#>     expand, pack, unpack

mpg2 <- mpg %>% 
  mutate(
    high_efficiency = as.factor(cty > 20),
    cyl = as.factor(cyl)
  )

model <- glmer(high_efficiency ~ (1 | cyl / manufacturer), data = mpg2, family = "binomial")

summary(model)
#> Generalized linear mixed model fit by maximum likelihood (Laplace
#>   Approximation) [glmerMod]
#>  Family: binomial  ( logit )
#> Formula: high_efficiency ~ (1 | cyl/manufacturer)
#>    Data: mpg2
#> 
#>      AIC      BIC   logLik deviance df.resid 
#>    119.7    130.1    -56.9    113.7      231 
#> 
#> Scaled residuals: 
#>      Min       1Q   Median       3Q      Max 
#> -1.53719 -0.02545 -0.02457 -0.02434  2.66476 
#> 
#> Random effects:
#>  Groups           Name        Variance Std.Dev.
#>  manufacturer:cyl (Intercept)  2.61    1.615   
#>  cyl              (Intercept) 29.89    5.467   
#> Number of obs: 234, groups:  manufacturer:cyl, 32; cyl, 4
#> 
#> Fixed effects:
#>             Estimate Std. Error z value Pr(>|z|)
#> (Intercept)   -5.997      5.835  -1.028    0.304

coef(model)
#> $`manufacturer:cyl`
#>              (Intercept)
#> audi:4         -6.832850
#> audi:6         -6.011465
#> audi:8         -5.999128
#> chevrolet:4    -5.963526
#> chevrolet:6    -6.002150
#> chevrolet:8    -6.020727
#> dodge:4        -6.781927
#> dodge:6        -6.020611
#> dodge:8        -6.031986
#> ford:6         -6.013001
#> ford:8         -6.022351
#> honda:4        -3.710665
#> hyundai:4      -6.832850
#> hyundai:6      -6.006829
#> jeep:6         -6.002150
#> jeep:8         -6.005873
#> land rover:8   -6.004195
#> lincoln:8      -6.002512
#> mercury:6      -6.000581
#> mercury:8      -6.000823
#> nissan:4       -5.209047
#> nissan:6       -6.009925
#> nissan:8       -5.999128
#> pontiac:6      -6.003715
#> pontiac:8      -5.999128
#> subaru:4       -7.897771
#> toyota:4       -5.077635
#> toyota:6       -6.017581
#> toyota:8       -6.002512
#> volkswagen:4   -5.152200
#> volkswagen:5   -5.530277
#> volkswagen:6   -6.006829
#> 
#> $cyl
#>   (Intercept)
#> 4 -0.05988493
#> 5 -0.64663800
#> 6 -7.40782848
#> 8 -7.33366858
#> 
#> attr(,"class")
#> [1] "coef.mer"

Created on 2022-08-17 by the reprex package (v2.0.1)
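To turn the fitted model above into an actual classifier, you can threshold the predicted probabilities; the 0.5 cutoff below is just the conventional default, not anything special:

```r
# Predicted probability of high efficiency for each observation in mpg2,
# then a hard classification at a 0.5 cutoff.
p_hat <- predict(model, type = "response")
mpg2$predicted <- p_hat > 0.5
table(observed = mpg2$high_efficiency, predicted = mpg2$predicted)

# For new data that may contain manufacturer/cyl combinations not seen
# during fitting, predict.merMod accepts allow.new.levels = TRUE, e.g.
# predict(model, newdata = new_cars, type = "response", allow.new.levels = TRUE)
```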


1 Comment

Thank you for your answer. However, in my case the outcome classes are hierarchical and there can be many classes, often hundreds or even thousands of classes. The covariates are just plain metric or binary (dummy-coded), as in conventional regression models.
