PDF Processing with Gnostice PDFtoolkit – Part 2 - Improvements in version 3

By: V. Subhash

Abstract: In the first part of this article, we saw how PDFtoolkit could meet your PDF-related needs. In the second part of this article, we will see what improvements users will see when Gnostice PDFtoolkit VCL v3.0 comes out.

    A New Version, A New Beginning

Version 3 of PDFtoolkit will use a new PDF processing engine that has been rewritten from the ground up. Thanks to the highly modular design of the PDF processor code, PDFtoolkit will see enormous improvements in speed, robustness, and scalability.

The new PDF processor offers more capabilities than PDFtoolkit currently needs to implement its interface. So, the scope for bringing new features to PDFtoolkit has also greatly increased.

    PDF Processor On Steroids

Quite a lot of original ideas have gone into the new version. Many improvements are based on customer feedback and our own observations. What are they?

    Instant Loading

Even if you throw a 10,000-page PDF document at it, PDFtoolkit will load the document almost instantly. This greatly improves the robustness and responsiveness of applications written using PDFtoolkit.

    Intelligent Processing

PDFtoolkit processes documents at a blistering speed that promises near-instantaneous access to different parts of a document. PDFtoolkit intelligently loads only those parts of a document that will be required for a particular action. This also gives PDFtoolkit v3 a very small resource footprint.
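To illustrate the general idea of loading only what is needed, here is a minimal Python sketch. It is not PDFtoolkit code and says nothing about how the new engine is implemented. It relies only on a fact of the PDF file format: a PDF file ends with a `startxref` keyword followed by the byte offset of the cross-reference table, so a reader can locate that table by reading just the tail of the file. The function name `find_xref_offset` and the `tail_size` parameter are our own choices for this example.

```python
import io
import re


def find_xref_offset(stream, tail_size=1024):
    """Return the byte offset recorded after the last 'startxref' keyword.

    Only the final `tail_size` bytes of the stream are read, so the cost
    does not grow with the size of the document.
    """
    stream.seek(0, io.SEEK_END)
    file_size = stream.tell()

    # Jump close to the end of the file instead of reading from the start.
    stream.seek(max(0, file_size - tail_size))
    tail = stream.read()

    # An incrementally updated file may contain several 'startxref'
    # entries; the last one is the current one.
    matches = re.findall(rb"startxref\s+(\d+)", tail)
    if not matches:
        raise ValueError("no startxref keyword found in the file tail")
    return int(matches[-1])
```

With the offset in hand, a reader can seek straight to the cross-reference table and then to individual objects, without parsing the pages it has not been asked for.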

    Heavy-Duty Performance

PDFtoolkit can process a large number of files at once. For example, in a merge-and-split test case, we merged over 10,000 files into a single document in one go, without any problems.

    Fault Tolerance

Not all PDF documents are blessed alike. Some have errors. Others have lost a few chunks of bytes along the way. PDFtoolkit will be kind to such files. Even if a few pages are missing, PDFtoolkit will enable applications to handle these documents gracefully and without much fuss.
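One common way for PDF readers in general to cope with a damaged file is to ignore a broken cross-reference table and rebuild an index by scanning for object headers of the form `<number> <generation> obj`. The Python sketch below shows that technique in its simplest form. It is an illustration under our own assumptions, not a description of what PDFtoolkit does internally, and the names `OBJ_HEADER` and `rebuild_object_index` are hypothetical.

```python
import re

# Matches an indirect object header such as b"12 0 obj".
OBJ_HEADER = re.compile(rb"(?<!\d)(\d+)\s+(\d+)\s+obj\b")


def rebuild_object_index(data):
    """Map (object number, generation) to the byte offset of its header."""
    index = {}
    for match in OBJ_HEADER.finditer(data):
        key = (int(match.group(1)), int(match.group(2)))
        # If the same object appears more than once, keep the later
        # definition, since updates are appended to the end of a PDF file.
        index[key] = match.start()
    return index
```

A real recovery routine would have to do more, for example skip matches that fall inside binary stream data, but the sketch shows why a file with a damaged tail can still yield usable objects.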

    Optimized Algorithms

The development team decided that some popular implementations were not up to the mark and chose to implement their own logic to handle compression standards such as JPEG, CCITT, and Flate. This has greatly contributed to the speed and robustness of the API.

    Better Content Extraction

Text in a PDF document is not organized into well-marked paragraphs, as one might expect in a web page or a Microsoft Word document. Therefore, extracting text from a PDF document to everyone’s satisfaction is not easy. The new version has improved greatly on this and, if the beta version is anything to go by, users should be very happy.

    New Viewer

The PDF viewer component has seen a lot of improvements.

  • Instant loading and navigation thanks to intelligent processing
  • Multi-page views and new events to support multi-page views
  • Automatic zoom based on page width and also custom multi-page zoom
  • Faster performance with Search Panel component

    Quality Control

Over the years, we have come to know a wide variety of PDF documents that our clients have used. We have incorporated this knowledge into our new testing framework, which uses AQtime. PDFtoolkit features have been and are being tested every day against tens of thousands of PDF documents of various types in a fully automated process. This has helped us simulate different types of client requirements efficiently. The ultimate goal is to provide a product that customers can trust blindly.

    In The Next Part

In the final part of this article, we will go into more detail on all these improvements.

Trial Download - http://www.gnostice.com/PDFtoolkitOverview.asp

Feature Matrix - http://www.gnostice.com/pdftoolkitFeatures.asp

Pre-Order and Get 20% Discount - http://www.gnostice.com/pdftoolkitbuynow.asp
