4

If I have a hierarchical data frame like this:

level_1<-c("a","a","a","b","c","c")
level_2<-c("flower","flower","tree","mushroom","dog","cat")
level_3<-c("rose","sunflower","pine",NA,"spaniel",NA)
level_4<-c("pink",NA,NA,NA,NA,NA)
df<-data.frame(level_1,level_2,level_3,level_4)

How do I convert this to a list that is ordered according to the hierarchy, like this:

> list
 [1] "a"         "flower"    "rose"      "pink"      "sunflower" "tree"      "pine"      "b"         "mushroom"  "c"        
[11] "dog"       "spaniel"   "c"         "cat"      

So for each value in level 1, it lists all level 2 values, expanded across the other levels. Hopefully that makes sense?

Thanks in advance!

4 Answers

4

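We can try this, which transposes the data frame so that c() reads it row by row, then drops the NA values and any repeats: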

> unique(na.omit(c(t(df))))
 [1] "a"         "flower"    "rose"      "pink"      "sunflower" "tree"
 [7] "pine"      "b"         "mushroom"  "c"         "dog"       "spaniel"
[13] "cat"
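
To see why the one-liner works, it may help to unpack it into separate steps. The sketch below rebuilds the data frame from the question and uses the intermediate names m, flat and result, which are only illustrative. It also swaps na.omit() for plain logical indexing with is.na(), which should give the same values here.

```r
# Rebuild the example data frame from the question
df <- data.frame(
  level_1 = c("a", "a", "a", "b", "c", "c"),
  level_2 = c("flower", "flower", "tree", "mushroom", "dog", "cat"),
  level_3 = c("rose", "sunflower", "pine", NA, "spaniel", NA),
  level_4 = c("pink", NA, NA, NA, NA, NA)
)

m <- t(df)                  # 4 x 6 character matrix: one column per original row
flat <- c(m)                # column-major flattening, i.e. df read row by row
flat <- flat[!is.na(flat)]  # drop the NA placeholders
result <- unique(flat)      # keep only the first occurrence of each value
print(result)
```

Because R stores matrices column by column, flattening the transposed matrix walks through each original row from level_1 to level_4 before moving on to the next row, which is what produces the hierarchical order.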

4

In the question "c" appears twice in the desired answer but "a" and "b" only appear once. We assume that this is an error and what is wanted is that each should only appear once.

uniq <- function(x) unique(na.omit(c(t(x))))
unname(unlist(by(df, df$level_1, uniq)))
##  [1] "a"         "flower"    "rose"      "pink"      "sunflower" "tree"     
##  [7] "pine"      "b"         "mushroom"  "c"         "dog"       "spaniel"  
## [13] "cat"

It could also be expressed using pipes:

uniq <- \(x) x |> t() |> c() |> na.omit() |> unique()
by(df, df$level_1, uniq) |> unlist() |> unname()

As one of the other answers points out, the same result could be obtained using just uniq(df).
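
The grouped and ungrouped versions agree on the question's data, but they can differ when the same value occurs under more than one level_1 group. The sketch below uses a small hypothetical data frame, df2, which is not from the question, to illustrate that difference.

```r
# Hypothetical variation of the question's data: "flower" also appears under "c"
df2 <- data.frame(
  level_1 = c("a", "a", "c"),
  level_2 = c("flower", "tree", "flower"),
  level_3 = c("rose", "pine", NA)
)

uniq <- function(x) unique(na.omit(c(t(x))))

# Deduplicate across the whole data frame: "flower" is kept once
global <- uniq(df2)

# Deduplicate within each level_1 group: "flower" is kept once per group
grouped <- unname(unlist(by(df2, df2$level_1, uniq)))

print(global)
print(grouped)
```

Note also that by() returns the groups in sorted order of level_1 rather than in order of first appearance. That makes no difference for the question's data, where level_1 is already sorted.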

1

Convert columnwise duplicated values to NA, then rowwise exclude NAs and unlist.

df[ sapply(df, duplicated) ] <- NA

unlist(apply(df, 1, function(i){ i[ !is.na(i) ]}), use.names = FALSE)
# [1] "a"         "flower"    "rose"      "pink"      "sunflower"
# [6] "tree"      "pine"      "b"         "mushroom"  "c"        
# [11] "dog"       "spaniel"   "cat" 
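
Note that the first line above overwrites df in place. If the original data frame is still needed afterwards, the same idea can be applied to a copy. The following is a minimal sketch along those lines; the names tmp and result are illustrative only.

```r
# Rebuild the example data frame from the question
df <- data.frame(
  level_1 = c("a", "a", "a", "b", "c", "c"),
  level_2 = c("flower", "flower", "tree", "mushroom", "dog", "cat"),
  level_3 = c("rose", "sunflower", "pine", NA, "spaniel", NA),
  level_4 = c("pink", NA, NA, NA, NA, NA)
)

tmp <- df                            # work on a copy so df itself is untouched
tmp[sapply(tmp, duplicated)] <- NA   # blank out repeats within each column

# Row by row, keep the non-NA entries, then flatten into one vector
result <- unlist(apply(tmp, 1, function(i) i[!is.na(i)]), use.names = FALSE)
print(result)
```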

0

An alternative way:

library(magrittr)

df %>%
  apply(1, function(x) x) %>%
  as.character() %>% 
  {.[!is.na(.)]} %>%
  unique()


# [1] "a"         "flower"    "rose"      "pink"      "sunflower"
# [6] "tree"      "pine"      "b"         "mushroom"  "c"        
# [11] "dog"       "spaniel"   "cat"      
