I have a folder structure along a path like this: ":/Users/XXX/Individual reports/". Inside the "Individual reports" folder there is one folder per state, and each state folder holds several .xlsx files per year (2014-2023). Every file name begins with the state abbreviation in parentheses, e.g. "(AK) FFY 2014 - Survey Results v1.xlsx".

I used this post to try the following: https://www.reddit.com/r/rstats/comments/z0eali/trying_to_load_multiple_excel_sheets_in_multiple/

library(tidyverse)
library(readxl)

file_path = ":/Users/XXX/Individual reports/"

df <- list.files(file_path,full.names = TRUE,pattern = ".xlsx",recursive = TRUE) %>%
  set_names(basename(.)) %>%
  map_df(.x = .,
         .f = read_xlsx,
         .id = "filename",
         sheet = "Table 5")

However, I get the following error:

> df <- list.files(file_path,full.names = TRUE,pattern = ".xlsx",recursive = TRUE) %>%
+   set_names(basename(.)) %>%
+   map_df(.x = .,
+          .f = read_xlsx,
+         # col_types = cols(.default = "c"), # if you want to set specific coltypes. I usually read everything as character first otherwise you can omit this
+          .id = "filename",
+         sheet = 5)
Error in `map()`:
ℹ In index: 1.
ℹ With name: (AK) FFY 2014 - Synar Survey Results v1.xlsx.
Caused by error in `utils::unzip()`:
! zip file 'C:\Users\XXX\Individual reports\AK\(AK) FFY 2014 - Survey Results v1.xlsx' cannot be opened
Run `rlang::last_trace()` to see where the error occurred.
> rlang::last_trace()
<error/purrr_error_indexed>
Error in `map()`:
ℹ In index: 1.
ℹ With name: (AK) FFY 2014 - Survey Results v1.xlsx.
Caused by error in `utils::unzip()`:
! zip file 'C:\Users\XXX\Individual reports\AK\(AK) FFY 2014 - Survey Results v1.xlsx' cannot be opened
---
Backtrace:
     ▆
  1. ├─... %>% ...
  2. ├─purrr::map_df(.x = ., .f = read_xlsx, .id = "filename", sheet = 5)
  3. │ └─purrr::map(.x, .f, ...)
  4. │   └─purrr:::map_("list", .x, .f, ..., .progress = .progress)
  5. │     ├─purrr:::with_indexed_errors(...)
  6. │     │ └─base::withCallingHandlers(...)
  7. │     ├─purrr:::call_with_cleanup(...)
  8. │     └─readxl (local) .f(.x[[i]], ...)
  9. │       └─readxl:::read_excel_(...)
 10. │         ├─readxl:::set_readxl_names(...)
 11. │         │ └─tibble::as_tibble(l, .name_repair = .name_repair)
 12. │         └─readxl (local) read_fun(...)
 13. └─readxl (local) `<fn>`(...)
 14.   └─utils::unzip(zip_path, list = TRUE)
Run rlang::last_trace(drop = FALSE) to see 4 hidden frames.
> rlang::last_trace(drop = FALSE)
<error/purrr_error_indexed>
Error in `map()`:
ℹ In index: 1.
ℹ With name: (AK) FFY 2014 - Survey Results v1.xlsx.
Caused by error in `utils::unzip()`:
! zip file 'C:\Users\XXX\Individual reports\AK\(AK) FFY 2014 - Survey Results v1.xlsx' cannot be opened
---
Backtrace:
     ▆
  1. ├─... %>% ...
  2. ├─purrr::map_df(.x = ., .f = read_xlsx, .id = "filename", sheet = 5)
  3. │ └─purrr::map(.x, .f, ...)
  4. │   └─purrr:::map_("list", .x, .f, ..., .progress = .progress)
  5. │     ├─purrr:::with_indexed_errors(...)
  6. │     │ └─base::withCallingHandlers(...)
  7. │     ├─purrr:::call_with_cleanup(...)
  8. │     └─readxl (local) .f(.x[[i]], ...)
  9. │       └─readxl:::read_excel_(...)
 10. │         ├─readxl:::set_readxl_names(...)
 11. │         │ └─tibble::as_tibble(l, .name_repair = .name_repair)
 12. │         └─readxl (local) read_fun(...)
 13. ├─readxl (local) `<fn>`(...)
 14. │ └─utils::unzip(zip_path, list = TRUE)
 15. └─base::.handleSimpleError(...)
 16.   └─purrr (local) h(simpleError(msg, call))
 17.     └─cli::cli_abort(...)
 18.       └─rlang::abort(...)

When I try to open the file individually, I also get a similar error:

> df1 = read_xlsx(path = "C:/Users/XXX/Individual reports/AK/(AK) FFY 2014 - Survey Results v1.xlsx",
+                  sheet = 5)
Error in utils::unzip(zip_path, list = TRUE) : 
  zip file 'C:\Users\XXX\Individual reports\AK\(AK) FFY 2014 - Survey Results v1.xlsx' cannot be opened

I'm confused because my folders and files are not zipped. Any idea of how I can successfully import multiple xlsx files within a folder tree?
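One thing worth ruling out in the `list.files()` call above: `pattern` is a regular expression, so ".xlsx" is unanchored and its dot matches any character. It would also pick up the hidden `~$...xlsx` owner files that Excel creates next to a workbook while it is open, and those are not zip archives. This would not by itself explain the failure on the named AK file, but a stricter filter removes one source of noise. A minimal sketch in base R; `keep_xlsx` is a hypothetical helper name, not part of readxl or purrr, and the example paths are made up for illustration:

```r
# Hypothetical helper: keep only real-looking .xlsx paths.
# - "\\.xlsx$" anchors the extension at the end and escapes the dot
# - files whose base name starts with "~$" are Excel owner/lock files
keep_xlsx <- function(paths) {
  is_xlsx <- grepl("\\.xlsx$", paths, ignore.case = TRUE)
  is_lock <- startsWith(basename(paths), "~$")
  paths[is_xlsx & !is_lock]
}

# Illustrative input only; in practice this would be the result of
# list.files(file_path, full.names = TRUE, recursive = TRUE)
candidates <- c(
  "AK/(AK) FFY 2014 - Survey Results v1.xlsx",
  "AK/~$(AK) FFY 2014 - Survey Results v1.xlsx",
  "AK/notes_xlsx.txt"
)

keep_xlsx(candidates)
```

The filtered vector can then be piped into `set_names(basename(.))` and `map_df()` exactly as before.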

  • .xlsx files are actually .zip files with a specific structure and a different extension. You can change their extension to .zip and look inside at their structure (which contains mostly .xml files). It's therefore strange that you are getting this warning - are you sure you can open these files with Excel, and that they definitely have the correct extension (.xls files are not .zip files)? Commented May 28 at 21:58
  • @AllanCameron Perhaps it is because they are 'protected', i.e., I think they are locked for editing, but not for opening and viewing? Commented May 28 at 23:13
  • Protected files can still be imported using your code. Are the files open in another application, such as Excel? If so, you cannot import them while they are open. Commented May 29 at 4:16
  • I get this error if I name a csv file to have the xlsx extension. Maybe the files are misnamed. Commented May 30 at 2:44
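The comments above suggest checking whether the files really are zip containers, regardless of their extension. A rough way to do that is to read the first few bytes of each file: a zip archive (and therefore a normal .xlsx) starts with the bytes `PK\003\004`, while the older OLE2 compound format used by legacy .xls files starts with `D0 CF 11 E0 A1 B1 1A E1`. As far as I know, password-encrypted workbooks are also wrapped in that OLE2 container, which may be relevant if "protected" here means encrypted. The sketch below uses only base R; `sniff_container` is a hypothetical helper name, and the temporary file stands in for one of the real reports:

```r
# Hypothetical helper: classify a file by its leading bytes, not its extension.
sniff_container <- function(path) {
  # readBin() opens the path in binary mode; a missing or locked file
  # raises an error, which is reported as "unreadable" here.
  sig <- tryCatch(
    suppressWarnings(readBin(path, what = "raw", n = 8L)),
    error = function(e) NULL
  )
  if (is.null(sig)) return("unreadable")

  zip_sig  <- as.raw(c(0x50, 0x4b, 0x03, 0x04))                         # "PK\003\004"
  ole2_sig <- as.raw(c(0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1)) # OLE2 header

  if (length(sig) >= 4L && identical(sig[1:4], zip_sig)) {
    "zip"     # what a regular .xlsx should look like
  } else if (length(sig) >= 8L && identical(sig[1:8], ole2_sig)) {
    "ole2"    # legacy .xls layout, or possibly an encrypted workbook
  } else {
    "other"   # e.g. a csv or text file saved with an .xlsx extension
  }
}

# Stand-in example: a csv written under an .xlsx name
demo <- tempfile(fileext = ".xlsx")
writeLines("a,b\n1,2", demo)
sniff_container(demo)
```

Applied to the real folder, something like `table(vapply(files, sniff_container, character(1)))`, where `files` is the vector returned by `list.files()`, would show how many files fall into each group before any attempt to read them.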
