I have an 11 GB SQLite database and 16 GB of RAM (shared with the OS and everything else). I want to subset a table with data.table:

library(DBI)
library(RSQLite)
library(data.table)

# database connection
con <- dbConnect(dbDriver("SQLite"), dbname = sqlite_database)

# read table from database
inventory <- as.data.table(dbGetQuery( con,'select * from inventory'))

# subset table
unfulfilled_inventory <- inventory[period >= stableStateStart, .(period, articleID, unfulfilledQuantities, backlog, stock)]

Getting more RAM would be the cheapest way to solve this problem, but unfortunately this is not an option.

The inventory object has 127,500,000 rows and 6 variables, and occupies 5.2 GB of memory.

dim(unfulfilled_inventory)
[1] 127500000         6

Is there a more memory-efficient way to do this subsetting? I tried building a logical vector first and subsetting with it (vector scanning), but the result was the same. Alternatively, is there a way to use swap space for this operation? I do not really care about speed.

  • Just to confirm, which object is 5.2GB here? Also what does dim(inventory) result in? Commented Sep 24, 2015 at 13:23
  • updated the question Commented Sep 24, 2015 at 13:35

1 Answer

The only two options I have in mind at the moment:

  1. Use setDT instead of as.data.table. setDT converts the data.frame by reference, so you avoid a copy and save some memory when reading from the database.

  2. You can compute your condition on the database side and then use the computed column in R:

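# assumes stableStateStart is a column of inventory; if it is an R variable, paste its value into the query
sql = "SELECT *, period >= stableStateStart AS tmpcol FROM inventory"
inventory = setDT(dbGetQuery(con, sql), key="tmpcol")
# SQLite returns the comparison as an integer 0/1, so join on 1L
inventory[.(1L)]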
     Adding ORDER BY tmpcol to the SQL query may also help setDT(., key="tmpcol") in the later step, because the rows then arrive already sorted by the key.

Be sure to use data.table 1.9.6 - recently published to CRAN.
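
A variation on option 2, as a minimal sketch rather than a tested solution for the real 11 GB file: instead of computing a flag column, the filter and the column selection can both be pushed into the query, so rows and columns that are not needed never reach R. The toy in-memory table, the `other` column, and the threshold value below are made up for illustration, and stableStateStart is assumed to be an R variable rather than a column.

```r
library(DBI)
library(RSQLite)
library(data.table)

# Toy in-memory database standing in for the real file (hypothetical data).
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "inventory", data.frame(
  period = 1:10,
  articleID = rep(c("a", "b"), 5),
  unfulfilledQuantities = 0L,
  backlog = 0L,
  stock = 100L,
  other = 0.5  # stand-in for the sixth column that is not needed downstream
))

stableStateStart <- 6L  # assumption: an R variable, not a table column

# Filter rows and select columns inside SQLite; the threshold is bound
# as a query parameter instead of being pasted into the SQL string.
sql <- "SELECT period, articleID, unfulfilledQuantities, backlog, stock
        FROM inventory
        WHERE period >= ?"
unfulfilled_inventory <- dbGetQuery(con, sql, params = list(stableStateStart))

# Convert to a data.table by reference, without copying the data.frame.
setDT(unfulfilled_inventory)

dbDisconnect(con)
```

How much memory this saves depends on how many rows pass the filter; if nearly all of them do, the gain comes mainly from dropping the unused column and from never holding the full table and the subset in memory at the same time.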
