How do I convert a hierarchical dataframe to a list in R?

Question

If I had a hierarchical dataframe like this:

level_1<-c("a","a","a","b","c","c")
level_2<-c("flower","flower","tree","mushroom","dog","cat")
level_3<-c("rose","sunflower","pine",NA,"spaniel",NA)
level_4<-c("pink",NA,NA,NA,NA,NA)
df<-data.frame(level_1,level_2,level_3,level_4)

How do I convert this to a list that is ordered according to the hierarchy, like this:

> list
 [1] "a"         "flower"    "rose"      "pink"      "sunflower" "tree"      "pine"      "b"         "mushroom"  "c"        
[11] "dog"       "spaniel"   "c"         "cat"

So for each value in level 1, it lists all level 2 values, each expanded across the deeper levels. Hopefully that makes sense?

Thanks in advance!

ThomasIsCoding · Accepted Answer · 2022-09-13 12:09:56Z

4

We can try this

> unique(na.omit(c(t(df))))
 [1] "a"         "flower"    "rose"      "pink"      "sunflower" "tree"
 [7] "pine"      "b"         "mushroom"  "c"         "dog"       "spaniel"
[13] "cat"
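To see why the one-liner keeps the hierarchy order, it may help to unpack it step by step. The sketch below rebuilds the question's `df` and uses intermediate names (`m`, `v`, `result`) that are only illustrative; it also swaps `na.omit()` for plain logical indexing, which should give the same values without the extra attributes.

```r
# Rebuild the data frame from the question
level_1 <- c("a", "a", "a", "b", "c", "c")
level_2 <- c("flower", "flower", "tree", "mushroom", "dog", "cat")
level_3 <- c("rose", "sunflower", "pine", NA, "spaniel", NA)
level_4 <- c("pink", NA, NA, NA, NA, NA)
df <- data.frame(level_1, level_2, level_3, level_4)

m <- t(df)          # 4 x 6 character matrix: each column is one row of df
v <- c(m)           # flatten column by column, i.e. row by row of df
v <- v[!is.na(v)]   # drop the NA padding
result <- unique(v) # keep only the first occurrence of each label
print(result)
```

Strictly speaking `result` is a character vector; `as.list(result)` would turn it into an R list if that is what is needed.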

answered Sep 13, 2022 at 12:09

G. Grothendieck · Accepted Answer · 2022-09-13 14:00:48Z

4

In the question "c" appears twice in the desired answer but "a" and "b" only appear once. We assume that this is an error and what is wanted is that each should only appear once.

uniq <- function(x) unique(na.omit(c(t(x))))
unname(unlist(by(df, df$level_1, uniq)))
##  [1] "a"         "flower"    "rose"      "pink"      "sunflower" "tree"     
##  [7] "pine"      "b"         "mushroom"  "c"         "dog"       "spaniel"  
## [13] "cat"

It could also be expressed using pipes:

uniq <- \(x) x |> t() |> c() |> na.omit() |> unique()
by(df, df$level_1, uniq) |> unlist() |> unname()

As one of the other answers points out, the same result could be obtained using just uniq(df).
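The two forms agree on the question's data, but they can differ when the same label occurs under more than one level_1 group: the grouped version de-duplicates within each group, while the plain version de-duplicates globally. A small sketch, using a hypothetical data frame `df2` that is not part of the question:

```r
# Hypothetical data: "dog" appears under both "a" and "b"
df2 <- data.frame(
  level_1 = c("a", "b"),
  level_2 = c("dog", "dog"),
  level_3 = c("spaniel", "poodle")
)

uniq <- function(x) unique(na.omit(c(t(x))))

# Global de-duplication: the second "dog" is dropped
global <- uniq(df2)

# Per-group de-duplication: "dog" is kept once per level_1 group
grouped <- unname(unlist(by(df2, df2$level_1, uniq)))

print(global)
print(grouped)
```

Which behaviour is wanted depends on whether a repeated label under a different parent should be listed again. Note also that `by()` orders the groups by the sorted values of level_1, so the grouped form regroups rows that are not already sorted.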

edited Sep 13, 2022 at 14:00

answered Sep 13, 2022 at 11:49

zx8754 · Accepted Answer · 2022-09-13 11:50:11Z

1

Convert columnwise duplicated values to NA, then rowwise exclude NAs and unlist.

df[ sapply(df, duplicated) ] <- NA

unlist(apply(df, 1, function(i){ i[ !is.na(i) ]}), use.names = FALSE)
# [1] "a"         "flower"    "rose"      "pink"      "sunflower"
# [6] "tree"      "pine"      "b"         "mushroom"  "c"        
# [11] "dog"       "spaniel"   "cat"
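Note that the assignment above overwrites `df` in place. A non-destructive variant, sketched under the assumption that the original `df` should stay intact, works on a copy (the name `masked` is only illustrative) and flattens it row by row with `t()`:

```r
# Rebuild the data frame from the question
df <- data.frame(
  level_1 = c("a", "a", "a", "b", "c", "c"),
  level_2 = c("flower", "flower", "tree", "mushroom", "dog", "cat"),
  level_3 = c("rose", "sunflower", "pine", NA, "spaniel", NA),
  level_4 = c("pink", NA, NA, NA, NA, NA)
)

masked <- df                                # work on a copy
masked[sapply(masked, duplicated)] <- NA    # blank out repeats within each column

v <- c(t(masked))                           # read the copy row by row
result <- v[!is.na(v)]                      # keep only the surviving labels
print(result)
```

Because duplicates are detected per column, a label that repeats in the same level under a different parent would also be blanked out.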

answered Sep 13, 2022 at 11:50

jpdugo17 · Accepted Answer · 2022-09-13 12:44:31Z

0

An alternative way:

library(magrittr)

df %>%
  apply(1, function(x) x) %>%
  as.character() %>% 
  {.[!is.na(.)]} %>%
  unique()


# [1] "a"         "flower"    "rose"      "pink"      "sunflower"
# [6] "tree"      "pine"      "b"         "mushroom"  "c"        
# [11] "dog"       "spaniel"   "cat"

edited Sep 13, 2022 at 12:44

answered Sep 13, 2022 at 12:34
