1

I need to extract data from a BYTEA field of a postgeSQL database. In the BYTEA field, the data is stored in the RDS format (it is the contents of a .rds file created by the command saveRDS corresponding to a dataframe). Using the package RPostgreSQL I am able to retrieve the bytea column in a character object thanks to the code given in this link https://github.com/codeinthehole/rpostgresql/blob/master/RPostgreSQL/tests/bytea.R

But how can I transform then my character object (in RDS format) in the corresponding dataframe ? I try to use readRDS but it seems only to work with a file.

Thank you

## (supposing the connection to the database)
## I do the select statement to retrieve the bytea column
##
rows <- dbGetQuery(con, "SELECT image_bytea from tabdata")
str(rows)
# 'data.frame': 1 obs. of  1 variable:
#   $ image_bytea: chr "\\x1f8b08000000000000065| __truncated__
str(rows[1,1])
# chr "\\x1f8b08000000000000065c9d655456585"| __truncated__
test <- postgresqlUnescapeBytea(rows[1,1])
conraw <- rawConnection(test,open="r")
mydf <- readRDS(file=conraw)
# Error in readRDS(file = conraw) : format d'entrée inconnu

When I use the psql in a cmd MSwindows, it works fine:

psql -At -p 5432 -h localhost -d cabri -U postgres -W -c "select lo_from_bytea(0,image_bytea) from tabdata'
# 31385
psql -At -p 5432 -h localhost -d cabri -U postgres -W -c "\lo_export 31385 'c:/upload-postgre/test-out.rds'"

And in R:

mydf <- readRDS(file = "c:/upload-postgre/test-out.rds")
class(mydf)
##   [1] "data.frame"
head(mydf)

##     tps      pc1     pc1f pc1f_partiel
## 1 -0.4000000 154.2554 70.62402     157.4627  93.89868 
## 2 -0.3999995 154.2127 70.94351     157.4627 153.9946 
## etc... 

Laurent

6
  • What does your code look like? It seems like you can use postgresqlUnescapeBytea to get the byte data in a raw vector, and then you can use rawConnection() as an input to readRDS to read from that raw vector. Commented Mar 25, 2021 at 20:29
  • Thank you Mr Flick. Here is my code: Commented Mar 26, 2021 at 16:11
  • I added also the psql commands because when I use psql it works fine. Commented Mar 26, 2021 at 16:33
  • @MrFlick I tried to use rawConnection without success. However the format is correct since the psql method works fine. Commented Mar 27, 2021 at 7:38
  • The page of test on github doesn't use lo_from_bytea so i'm not sure what that function does. I do not have access to a postgeSQL so I can't run any tests or anything myself. Sorry. Commented Mar 27, 2021 at 17:11

1 Answer 1

1

Here's what worked for me - do note that it makes use of dbplyr (here assuming that you're trying to get a single row from some table "tablename" in schema "schemaname", and the column with the RDS data is called "rds_output"):

library(tidyverse)
library(dbplyr)

con <- DBI::dbConnect(RPostgres::Postgres(), dbname = "example_db", host = "localhost", port = 5432, user = "username", password = "supersafepassword")

proj_tbl <- tbl(con, in_schema("schemaname", "tablename"))

proj_id <- 1 ## just something to grab a specific row

data <- proj_tbl %>%
  filter(pr_id == proj_id) %>% ## filtering for a specific row
  pull(rds_output) %>% unlist %>% rawConnection %>% gzcon %>% readRDS
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.