25 events
when what by license comment
Sep 7, 2024 at 19:52 vote accept dipea
Sep 7, 2024 at 15:08 answer added jjanes timeline score: 2
Sep 7, 2024 at 14:36 comment added dipea @0xn0b174 yes that was the Freeable memory during the load
Sep 7, 2024 at 14:35 history edited dipea CC BY-SA 4.0
added 701 characters in body
Sep 7, 2024 at 13:14 comment added kdgregory And lastly, you're almost certainly throwing away money on provisioned IOPS. First, because with that small memory allotment you're going to be frequently hitting the disk, and second, because instance types impose their own limits on IO.
Sep 7, 2024 at 13:13 comment added kdgregory You also said that you can't use COPY, but you might find that COPY to a staging table and insert from that table will also help you.
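The staging-table idea in the comment above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual setup: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the table names `events` and `staging`, the columns `id` and `payload`, and the function `load_via_staging` are all hypothetical. On PostgreSQL the bulk load into the staging table would presumably be a COPY, and the final statement would be INSERT ... SELECT ... ON CONFLICT DO NOTHING; SQLite's INSERT OR IGNORE plays that role here.

```python
import sqlite3


def load_via_staging(conn, rows):
    """Bulk-load rows into an unconstrained staging table, then copy only
    rows with new keys into the hypothetical target table `events`."""
    cur = conn.cursor()
    before = cur.execute("SELECT COUNT(*) FROM events").fetchone()[0]

    # The staging table has no constraints or indexes, so loading it is cheap.
    cur.execute("CREATE TEMP TABLE staging (id INTEGER, payload TEXT)")

    # Stand-in for COPY: load everything, duplicates included.
    cur.executemany("INSERT INTO staging (id, payload) VALUES (?, ?)", rows)

    # One set-based insert; rows whose id already exists are skipped.
    # (SQLite spelling; PostgreSQL would use ON CONFLICT DO NOTHING.)
    cur.execute(
        "INSERT OR IGNORE INTO events (id, payload) "
        "SELECT id, payload FROM staging"
    )

    cur.execute("DROP TABLE staging")
    conn.commit()

    after = cur.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    return after - before  # number of rows actually added
```

The function returns how many rows were actually added, so a caller can compare that with the number of rows supplied to see how many duplicates were skipped. Whether this is faster than row-by-row INSERT on a given RDS instance would still need to be measured.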
Sep 7, 2024 at 13:12 comment added kdgregory With only 2 GB of RAM available, you're almost certainly hitting the disk often, shuffling pages in and out of the buffer cache (out because you say there are other processes that are active and writing pages), especially if you have multiple indexes on the table. So increasing to a larger instance size is almost certainly the solution (but, as always, test this hypothesis).
Sep 7, 2024 at 13:10 comment added kdgregory That said, I think you're on the right track with suspecting read waits. You have a single server-side thread doing this work. If it needs to read a block from disk, it needs to wait until that block is available. And regardless of your IOPS, individual reads are still in the millisecond range (IOPS primarily indicates how much concurrent activity can take place).
Sep 7, 2024 at 13:08 comment added kdgregory Things that you didn't put into your question: (1) how you're inserting these rows (although from comments it seems to be discrete/bulk INSERT statements), (2) how big the pipe is from whatever is doing the loading (are you running this on the AWS network, or from your office network?), (3) how long it takes to perform a load using a local database with the same existing data (this calls out whether the performance issue is caused by things like indexes on the tables that have to be rewritten).
Sep 7, 2024 at 12:11 comment added Dunes Indexes are always cached in RAM. This is unlike tables that are cached into RAM on demand, and then evicted if other queries need the RAM. Assuming a single database with a single empty table, with a single index on an 8-byte wide column and each csv record being about 1kB, then your final index size will be about 1.6GB. Your instance doesn't have anywhere near enough RAM to work with the schema and data you have.
Sep 7, 2024 at 11:55 comment added 0xn0b174 Is that memory chart from during the bulk insert?
Sep 7, 2024 at 11:38 comment added dipea @JohnRotenstein I have added a chart of Freeable memory. That metric seems to hardly have been affected by the load.
Sep 7, 2024 at 11:37 history edited dipea CC BY-SA 4.0
added 113 characters in body
Sep 7, 2024 at 11:24 comment added John Rotenstein Databases love RAM. Can you show that too? I originally suspected it was due to your usage of a T-family instance (that normally has CPU limits), but it seems that RDS databases using T-family instances have Unlimited mode activated, which gives full CPU at an additional charge (I think).
Sep 7, 2024 at 11:20 comment added dipea I can imagine more RAM would improve it and I may have to change the instance type if I can't figure out a cheaper solution using COPY. I'm not sure how more vCPUs would help though; from what I can see from the first screenshot in my post (showing Database Load) it doesn't look like even one of my two vCPUs is being maxed out, and I can see that I was using hardly any CPU credits during the load.
Sep 7, 2024 at 10:45 comment added 0xn0b174 db.t3.small only has 2 vCPUs and 2 GB of RAM, and after your CPU credits run out it will start to slow down. Do you think that will address your bulk insert? @dipea
Sep 7, 2024 at 10:05 comment added Bohemian If it's OK to prevent other processes from accessing the table, try dropping all indexes except the unique one that detects the conflict, and begin; lock table in exclusive mode before inserting and commit after.
Sep 7, 2024 at 9:48 comment added dipea @0xn0b174 I don't think COPY will work in my case, because I have a uniqueness constraint that I need to maintain. Currently my INSERT is using ON CONFLICT DO NOTHING for this. It's a bit complicated by the fact that in the production version of this system there is another process writing rows to the table, which may include duplicates of the ones from the bulk insert. When you say the instance is too small - which resource in particular do you think makes the difference? CPU, RAM or something else?
Sep 7, 2024 at 9:40 comment added dipea @JohnRotenstein I have added a chart of the CPU usage
Sep 7, 2024 at 9:39 history edited dipea CC BY-SA 4.0
added 143 characters in body
Sep 7, 2024 at 9:31 comment added 0xn0b174 Your instance is too small for this kind of operation. Use COPY instead of INSERT, which is relatively fast, and also disable non-essential indexes during bulk loading and re-enable them afterward. There must be other options you can research too, but most likely your throughput is not being utilized because of the small instance size.
Sep 7, 2024 at 9:29 comment added John Rotenstein What do the CPU Metrics look like?
Sep 7, 2024 at 9:22 history edited John Rotenstein
edited tags
Sep 7, 2024 at 9:16 review Close votes
Sep 11, 2024 at 0:01
Sep 7, 2024 at 8:34 history asked dipea CC BY-SA 4.0