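I have been trying to merge several smaller PDFs (the largest I have used so far is 6 MB) into a single PDF. Whenever I try with more than 14 MB of input in total, I get an "out of memory" error.
While merging, the process's memory usage jumps to over 550 MB. That seems excessive for 14 MB of input.
I am using PDFBox version 1.8.5, and the application runs locally on IBM WebSphere Application Server.
I increased the heap size to 1024 MB. That let me use more files as input, but I soon ran into the same problem.
At a commenter's suggestion, I changed my approach to merge the documents in pairs and then merge the previously merged pairs further. This got me further than before. I still hit the out-of-memory error at around 30 MB of files, but it is more workable.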
import java.io.File;
import org.apache.pdfbox.util.PDFMergerUtility;

// Build paths with File(parent, child): java.io.tmpdir may lack a trailing separator
File tmpDir = new File(System.getProperty("java.io.tmpdir"));
File sourceLoc = new File(tmpDir, "source_files");
File scratch = new File(tmpDir, "scratch.txt");
PDFMergerUtility merger = new PDFMergerUtility();
merger.setDestinationFileName(new File(tmpDir, "merged.pdf").getPath());
for (File file : sourceLoc.listFiles()) {
    merger.addSource(file);
}
// Merge using a file-backed scratch buffer
merger.mergeDocumentsNonSeq(new org.apache.pdfbox.io.RandomAccessFile(scratch, "rw"));
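For reference, the pairwise strategy mentioned earlier (merge documents two at a time, then merge the results) could look roughly like the sketch below. It is an illustration only, not the poster's actual code: `PairwiseMerge`, `mergePairwise`, and the `mergeTwo` callback are hypothetical names, Java 8 or later is assumed, and the PDFBox wiring is deliberately left out so that the reduction logic stands on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;

class PairwiseMerge {
    /**
     * Reduces the inputs by merging neighbours two at a time, round after
     * round, until a single result remains. Input order is preserved.
     * mergeTwo is a hypothetical callback that merges two items and returns
     * the merged item (for PDFs: two files merged into a new temporary file).
     */
    static <T> T mergePairwise(List<T> inputs, BinaryOperator<T> mergeTwo) {
        if (inputs.isEmpty()) {
            throw new IllegalArgumentException("nothing to merge");
        }
        List<T> current = new ArrayList<>(inputs);
        while (current.size() > 1) {
            List<T> next = new ArrayList<>();
            // Merge items (0,1), (2,3), ... of the current round
            for (int i = 0; i + 1 < current.size(); i += 2) {
                next.add(mergeTwo.apply(current.get(i), current.get(i + 1)));
            }
            if (current.size() % 2 == 1) {
                // An odd item out is carried into the next round unchanged
                next.add(current.get(current.size() - 1));
            }
            current = next;
        }
        return current.get(0);
    }
}
```

With PDFBox 1.8, `mergeTwo` might create a fresh `PDFMergerUtility` per pair, add both files as sources, write to a new temporary file, and return that file. Each step then only ever loads two documents, although the later rounds still handle documents that approach the full combined size.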
This is the log produced when the merge fails:
JVMDUMP039I Processing dump event "systhrow", detail "java/lang/OutOfMemoryError" at 2014/08/01 13:01:50 - please wait.
JVMDUMP032I JVM requested System dump using 'C:\Working\IntranetApps\I-Document\Services\core.20140801.130150.2408.0001.dmp' in response to an event
JVMDUMP010I System dump written to C:\Working\IntranetApps\I-Document\Services\core.20140801.130150.2408.0001.dmp
JVMDUMP032I JVM requested Heap dump using 'C:\Working\IntranetApps\I-Document\Services\heapdump.20140801.130150.2408.0002.phd' in response to an event
JVMDUMP010I Heap dump written to C:\Working\IntranetApps\I-Document\Services\heapdump.20140801.130150.2408.0002.phd
JVMDUMP032I JVM requested Java dump using 'C:\Working\IntranetApps\I-Document\Services\javacore.20140801.130150.2408.0003.txt' in response to an event
JVMDUMP010I Java dump written to C:\Working\IntranetApps\I-Document\Services\javacore.20140801.130150.2408.0003.txt
JVMDUMP032I JVM requested Snap dump using 'C:\Working\IntranetApps\I-Document\Services\Snap.20140801.130150.2408.0004.trc' in response to an event
JVMDUMP010I Snap dump written to C:\Working\IntranetApps\I-Document\Services\Snap.20140801.130150.2408.0004.trc
JVMDUMP013I Processed dump event "systhrow", detail "java/lang/OutOfMemoryError".
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.apache.pdfbox.io.RandomAccessBuffer.clone(RandomAccessBuffer.java:69)
    at org.apache.pdfbox.cos.COSStream.clone(COSStream.java:72)
    at org.apache.pdfbox.cos.COSStream.<init>(COSStream.java:96)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseCOSStream(NonSequentialPDFParser.java:1513)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseObjectDynamically(NonSequentialPDFParser.java:1266)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseObjectDynamically(NonSequentialPDFParser.java:1192)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parseDictObjects(NonSequentialPDFParser.java:1166)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.initialParse(NonSequentialPDFParser.java:479)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parse(NonSequentialPDFParser.java:740)
    at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1306)
    at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1289)
    at org.apache.pdfbox.util.PDFMergerUtility.mergeDocuments(PDFMergerUtility.java:232)
    at org.apache.pdfbox.util.PDFMergerUtility.mergeDocumentsNonSeq(PDFMergerUtility.java:201)
    at com.my.pkg.MyMergeClass.main(MyMergeClass.java:90)
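PDFs are mostly PostScript, which is a language in its own right... so 14 MB of input can turn into anything from zero to infinite output. Your best bet is to work out how to handle running out of memory gracefully.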
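One way to read "handle running out of memory gracefully" is to wrap the merge in a guard that catches `OutOfMemoryError` and reports a failure instead of letting the thread die. The following is a minimal sketch under that assumption. `OomGuard` and `runOrFallback` are hypothetical names, and catching `OutOfMemoryError` is not guaranteed to leave the JVM in a healthy state, so treat this as a last-resort safety net rather than a fix.

```java
import java.util.function.Supplier;

class OomGuard {
    /**
     * Runs the task and returns its result. If the heap is exhausted while
     * the task runs, logs the failure and returns the fallback value instead.
     */
    static <T> T runOrFallback(Supplier<T> task, T fallback) {
        try {
            return task.get();
        } catch (OutOfMemoryError e) {
            // Objects held only by the failed task become unreachable once its
            // stack has unwound, so this small amount of work is usually possible
            System.err.println("Task aborted, out of memory: " + e.getMessage());
            return fallback;
        }
    }
}
```

A caller could pass the merge as the task and return, say, `Boolean.FALSE` as the fallback, then tell the user that the input was too large rather than crashing.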