I have to merge multiple one-page PDFs into one PDF. I'm using iTextSharp 5.5.5.0 to accomplish this, but when I merge more than 900-1000 PDFs I get an out-of-memory exception. I noticed that even if I free my reader and close it, the memory never gets cleaned up properly (the amount of memory used by the process never decreases), so I was wondering what I could possibly be doing wrong. This is my code:

    using (MemoryStream msOutput = new MemoryStream())
    {
        Document doc = new Document();
        PdfSmartCopy pCopy = new PdfSmartCopy(doc, msOutput);
        doc.Open();
        foreach (Tuple<string, int> file in filesList)
        {
            PdfReader pdfFile = new PdfReader(file.Item1);
            for (int j = 0; j < file.Item2; j++)
                for (int i = 1; i < pdfFile.NumberOfPages + 1; i++) // in this case it's always 1
                    pCopy.AddPage(pCopy.GetImportedPage(pdfFile, i));
            pCopy.FreeReader(pdfFile);
            pdfFile.Close();
            File.Delete(file.Item1);
        }
        pCopy.Close();
        doc.Close();

        byte[] content = msOutput.ToArray();
        using (FileStream fs = File.Create(Out))
        {
            fs.Write(content, 0, content.Length);
        }
    }

It never gets to writing the file; I get an out-of-memory exception during the pCopy.AddPage() part. I even tried flushing the pCopy variable, but that didn't change anything. I looked through the iText documentation and various questions on StackOverflow, and it seems to me that I'm following every suggestion to keep memory usage low, yet memory usage stays high. Any ideas on this?

Comments

  • Use PdfCopy instead of PdfSmartCopy. PdfSmartCopy is "smart" because it keeps plenty of stuff in memory. Commented Apr 2, 2015 at 14:09
  • Also try writing to the FileStream directly instead of the MemoryStream. You might literally be running out of memory. Commented Apr 2, 2015 at 14:10
  • Hi, thank you very much for the very quick responses. Using PdfCopy instead of PdfSmartCopy helped and cut the memory usage nearly in half, but the FileStream was much better: I now use only 15 MB of memory instead of 1 GB, not bad. And I can still use PdfSmartCopy, which produces much smaller PDFs. If you make an answer out of it, I will upvote and accept. Thank you. Commented Apr 2, 2015 at 14:18

1 Answer

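Since you are merging a large number of files, I'd recommend writing directly to a FileStream instead of a MemoryStream. The MemoryStream holds the entire merged PDF in memory, and ToArray() then makes a second full copy of it. This might be an actual case where an OutOfMemoryException literally means "out of memory".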

Also, as Bruno pointed out, the "smart" part of PdfSmartCopy unfortunately comes at the cost of memory, too. Switching to PdfCopy should reduce memory pressure although your final PDF might be larger.
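
As a rough sketch of the FileStream suggestion, the loop from the question could be rearranged along the following lines. The class name PdfMerger, the method name Merge, and the outputPath parameter are made up for illustration; the iTextSharp calls are the ones already used in the question. Deleting the source files is left out here, and the sketch assumes filesList contains at least one page to copy, since closing an empty document throws.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class PdfMerger // hypothetical helper class, not part of iTextSharp
{
    // filesList: (path, number of copies) pairs, as in the question.
    // outputPath: hypothetical parameter standing in for the question's "Out".
    public static void Merge(IEnumerable<Tuple<string, int>> filesList, string outputPath)
    {
        // Write straight to disk so the merged PDF is never held in memory as a whole.
        using (FileStream fs = File.Create(outputPath))
        {
            Document doc = new Document();
            // Swap in PdfCopy here if memory is still tight; the output may be larger.
            PdfSmartCopy copy = new PdfSmartCopy(doc, fs);
            doc.Open();

            foreach (Tuple<string, int> file in filesList)
            {
                PdfReader reader = new PdfReader(file.Item1);
                try
                {
                    for (int j = 0; j < file.Item2; j++)
                        for (int i = 1; i <= reader.NumberOfPages; i++)
                            copy.AddPage(copy.GetImportedPage(reader, i));

                    // Let the copy release what it holds for this reader.
                    copy.FreeReader(reader);
                }
                finally
                {
                    reader.Close();
                }
            }

            // Closing the document also finishes and closes the copy writer.
            doc.Close();
        }
    }
}
```

A call would then look like PdfMerger.Merge(filesList, Out). According to the asker's comment above, this kind of change brought memory usage down from about 1 GB to about 15 MB while still using PdfSmartCopy.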
