Posts tagged: PDF

toPDF update

Some time ago, I wrote about toPDF. It's a simple web application that converts office formats to PDF using PDFCreator. It's open source and allows synchronous conversion via HTTP POST requests or REST calls.

We made some changes to toPDF that make it possible to use different converter tools/SDKs. It's now possible to use either PDFCreator or easyPDF SDK. The first is open source and the second is commercial. I prefer the commercial SDK because it has a lot of useful features, offers a Java bridge and works very fast in server environments. The open source solution also works, but needs some tinkering.

Anyway, the toPDF application supports both, and it's easy to integrate other converters. Simply implement the interface

public interface IPdfOperator
{
    public void convert(File pDocument, File pPdf, PdfSettings pSettings) throws Exception;
}

Then set the context parameter operatorClass in your web.xml, e.g.

<context-param>
  <description>PDF operator</description>
  <param-name>operatorClass</param-name>
  <param-value>com.sibvisions.topdf.operator.pdfforge.PdfCreatorOperator</param-value>
</context-param>

The default operator is com.sibvisions.topdf.operator.bcl.EasyPdfOperator.
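
As a rough sketch of how a custom operator and its lookup could fit together, the following example defines a trivial operator and creates it from a class name, the way a value from operatorClass might be resolved. This is only an illustration: PdfSettings is an empty stand-in for the real settings class, PassThroughOperator and OperatorFactory are hypothetical names, and the reflection-based lookup is an assumption about how toPDF loads the configured class, not its actual code.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Empty stand-in for toPDF's settings class (its real content is not shown in this post).
class PdfSettings
{
}

// Same method signature as the interface shown above.
interface IPdfOperator
{
    void convert(File pDocument, File pPdf, PdfSettings pSettings) throws Exception;
}

// Hypothetical operator: accepts only files that are already PDFs and copies them.
class PassThroughOperator implements IPdfOperator
{
    @Override
    public void convert(File pDocument, File pPdf, PdfSettings pSettings) throws Exception
    {
        if (!pDocument.getName().toLowerCase().endsWith(".pdf"))
        {
            throw new IOException("Unsupported input: " + pDocument.getName());
        }

        Files.copy(pDocument.toPath(), pPdf.toPath(), StandardCopyOption.REPLACE_EXISTING);
    }
}

// Assumed lookup: instantiate the operator from the class name configured as operatorClass.
class OperatorFactory
{
    static IPdfOperator create(String pClassName) throws Exception
    {
        Class<?> clazz = Class.forName(pClassName);

        return (IPdfOperator)clazz.getDeclaredConstructor().newInstance();
    }
}
```

A real operator would call its converter tool inside convert and write the result to pPdf. The class needs a no-argument constructor for this kind of lookup to work.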

New project: toPDF

We tried to find a simple way to convert MS Office files to PDF without online services. We tried OpenOffice, but the results were awful! There are a lot of free and commercial PDF printers available, but they are made for desktops and a user has to print manually. We wanted a solution that works without user interaction.

There is a very useful open source project called PDFCreator. It is also a printer, but it has a useful API. The API is available via COM, which is not the best technology for Java, but it's also not bad.

We didn't find a ready-to-use solution for our idea. It shouldn't cost money and it had to be open source. We found some great commercial tools and SDKs, but none of them were cheap.

We spent some hours and used PDFCreator, Jacob and some other open source tools to create an "Online service for PDFCreator". The result of our work is toPDF.

What is toPDF?

It's a small library that converts files to PDF via PDFCreator. It's also a web application that offers services for remote conversion via HTTP. The application has a REST service and a simple servlet service.

Simply POST binary data in an HTTP request and receive a PDF in the response. The servlet accepts multipart/form-data and plain application/octet-stream requests. The REST service supports multipart/form-data as well as JSON requests.

A short example:

// getServletService() returns the URL of the toPDF servlet
URL url = new URL(getServletService());

URLConnection ucon = url.openConnection();
ucon.setDoOutput(true);
ucon.setDoInput(true);
ucon.setUseCaches(false);
ucon.setRequestProperty("Content-Type", "application/octet-stream");
ucon.setRequestProperty("Content-Disposition", "attachment; filename=\"Forms.docx\";");

// send the document as request body
FileUtil.copy(ResourceUtil.getResourceAsStream("/com/sibvisions/topdf/Forms.docx"),
              ucon.getOutputStream());

// the response body is the converted PDF
byte[] byData = FileUtil.getContent(ucon.getInputStream());

or as Multipart:

MultipartUtil multipart = new MultipartUtil("UTF-8");
// add the document as form part "data"
multipart.addDataPart("data", "Forms.docx",
                  ResourceUtil.getResourceAsStream("/com/sibvisions/topdf/Forms.docx"));

// post the form and read the converted PDF
byte[] byData = multipart.post(getServletService());
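
The snippets above use helper classes (FileUtil, ResourceUtil, MultipartUtil) that are not part of the JDK. If you don't want that dependency on the client side, the octet-stream variant can be sketched with plain JDK classes (Java 9 or later). ToPdfClient and its convert method are hypothetical names, and the check for HTTP status 200 is an assumption about how the servlet reports success:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

class ToPdfClient
{
    // Posts a document as application/octet-stream and returns the response body.
    static byte[] convert(URL pServiceUrl, String pFileName, byte[] pDocument) throws IOException
    {
        HttpURLConnection con = (HttpURLConnection)pServiceUrl.openConnection();
        con.setRequestMethod("POST");
        con.setDoOutput(true);
        con.setUseCaches(false);
        con.setRequestProperty("Content-Type", "application/octet-stream");
        // same header format as in the example above
        con.setRequestProperty("Content-Disposition", "attachment; filename=\"" + pFileName + "\";");

        try
        {
            // send the document as request body
            try (OutputStream out = con.getOutputStream())
            {
                out.write(pDocument);
            }

            // assumption: the service answers with 200 when the conversion succeeded
            int status = con.getResponseCode();

            if (status != HttpURLConnection.HTTP_OK)
            {
                throw new IOException("Conversion failed with HTTP status " + status);
            }

            // read the whole response, which should be the PDF
            try (InputStream in = con.getInputStream())
            {
                return in.readAllBytes();
            }
        }
        finally
        {
            con.disconnect();
        }
    }
}
```

A call could look like ToPdfClient.convert(new URL(serviceUrl), "Forms.docx", documentBytes), where serviceUrl points to the servlet of your toPDF installation.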

The conversion via PDFCreator works great, but not perfectly. There are various problems with small page margins in Word documents, problems with OpenOffice documents, ...

The problem is not toPDF itself, because it can only work as well as PDFCreator does. If PDFCreator can't convert a document, toPDF has no chance to convert it either.

We had problems converting simple images to PDF, because the default Windows print dialog appeared and we didn't associate image extensions with another tool. We solved this by converting images with iText instead of PDFCreator. Now it's possible to create PDFs from images very easily and without pop-ups.

License?

AGPL 3.0, because PDFCreator is licensed under the GPL and iText is licensed under the AGPL.

Used tools and libraries

toPDF is a mixture of different open source projects:

  • PDFCreator
  • iText
  • RESTlet
  • Jackson
  • JVx
  • Apache Commons FileUpload and IO
  • Jacob
  • PDFCreator4J

Installation?

  • toPDF is written in Java, but installing it only makes sense on Windows (same requirements as PDFCreator)
  • Install PDFCreator (default desktop installation, with COM)
  • Deploy topdf.war on Tomcat, JBoss or your preferred Java application server. If your application server runs as a Windows service, make sure that it runs under an OS user account.