I have an Ubuntu laptop with 8 GB of RAM and a 2 GB CSV file. When I load the file with pandas' read_csv, my RAM fills up completely even though about 7 GB was free beforehand. How can a 2 GB file fill 7 GB of RAM?
Can you paste code to accompany your question? – user378704, Nov 9, 2016 at 20:16
These SO threads may be helpful: stackoverflow.com/questions/19590966/… stackoverflow.com/questions/17557074/… – Bharath, Nov 9, 2016 at 20:28
2 Answers
The low_memory warning you may see appears because guessing dtypes for each column is very memory-demanding: pandas analyzes the data in each column to decide which dtype to assign, and text columns end up stored as Python objects, which occupy far more RAM than the same bytes do on disk. That is how a 2 GB file can swell to several times its size once loaded.
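As a rough sketch of both points, you can measure what a loaded frame actually costs with memory_usage(deep=True), and cut that cost by declaring dtypes up front so pandas skips the guessing pass; the column names and dtypes below are hypothetical placeholders for your own schema:

import pandas as pd

# Measure the true in-memory size, including the per-row Python
# string objects behind any 'object' columns.
df = pd.read_csv('file_name.csv')
print(df.memory_usage(deep=True).sum() / 1e9, 'GB in RAM')

# Declaring dtypes up front skips the guessing pass and allows
# compact representations ('category' works well for repetitive text).
df = pd.read_csv(
    'file_name.csv',
    dtype={'id': 'int32', 'price': 'float32', 'label': 'category'},  # hypothetical columns
)
print(df.memory_usage(deep=True).sum() / 1e9, 'GB in RAM')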
If you are on a 32-bit system: memory errors are common with 32-bit Python on Windows, because a 32-bit process only gets about 2 GB of address space to work with by default.
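A quick way to check which interpreter you are running, using only the standard library:

import struct
import sys

# A 64-bit interpreter uses 8-byte pointers; a 32-bit one uses 4-byte.
print(struct.calcsize('P') * 8, 'bit Python')
print('64-bit' if sys.maxsize > 2**32 else '32-bit')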
Try this:

import pandas as pd

# Read the file in 1,000-row chunks; concat is called once on the
# whole iterator, so there is no repeated copying of a growing frame.
tp = pd.read_csv('file_name.csv', header=None, chunksize=1000)
df = pd.concat(tp, ignore_index=True)
Try making use of the chunksize parameter:
import pandas as pd

# read_csv with chunksize returns an iterator of DataFrames,
# which concat can consume directly in a single call.
df = pd.concat(pd.read_csv('/path/to/file.csv', chunksize=10**4),
               ignore_index=True)
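Keep in mind that concatenating every chunk still rebuilds the full DataFrame in RAM, so chunking only saves memory when you reduce each chunk as it arrives. A minimal sketch of that pattern, assuming a hypothetical numeric column named 'value':

import pandas as pd

total = 0.0
rows = 0
# Only ~10,000 rows are held in memory at any one time.
for chunk in pd.read_csv('/path/to/file.csv', chunksize=10**4):
    total += chunk['value'].sum()
    rows += len(chunk)
print('mean of value:', total / rows)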
2 Comments
Jeff
Your first snippet is horribly inefficient; add a note pointing to pandas.pydata.org/pandas-docs/stable/merging.html
Jeff
On every loop iteration you were making a copy of a bigger and bigger frame; instead, append to a list and call concat once (as the current example does).