Mihail Radu Solcan

  Note on pdf2djvu installation

2008-12-21

The program pdf2djvu converts pdf files into djvu, carrying over the text layer, the outline etc. It is written by Jakub Wilk.

I first noticed the program in Ubuntu 8.04. It does a very useful job: it transforms a pdf into a fully featured djvu. For example, I can write a LaTeX source, make a pdf out of it and then convert the pdf into a djvu.

I looked for a Fedora package, but there was none for FC7, which is my working installation. I did, however, find a spec file for FC9. I build the RPMs in my home directory.
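Building in a home directory needs a build tree and a macro that points rpmbuild at it. The following is only a minimal sketch: the function name setup_rpm_tree and the example directory are hypothetical, and the layout is the conventional one rather than necessarily the one used for this note.

```shell
# Create the conventional rpmbuild directory tree under $1 and write a
# macros file ($2, default ~/.rpmmacros) that sets %_topdir to it.
# Note: this overwrites an existing macros file.
setup_rpm_tree() {
    topdir=$1
    macros=${2:-$HOME/.rpmmacros}
    mkdir -p "$topdir/BUILD" "$topdir/RPMS" "$topdir/SOURCES" \
             "$topdir/SPECS" "$topdir/SRPMS"
    printf '%%_topdir %s\n' "$topdir" > "$macros"
}

# Example (hypothetical location):
#   setup_rpm_tree "$HOME/rpmbuild"
```

Source archives then go into SOURCES and spec files into SPECS, so no root privileges are needed for the build itself.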

What is to be done? I took the source from Google Code and the spec from the Fedora Project. The author of the spec is Rakesh Pandit. I also had a look at another spec, written by Krzysztof Kotlenga. From both spec files and Jakub Wilk's notes on dependencies I realized that I was in trouble on FC7.

Rakesh Pandit indicates the following dependencies:

BuildRequires: djvulibre-devel

BuildRequires: libjpeg-devel

BuildRequires: pkgconfig

BuildRequires: poppler-devel

BuildRequires: pstreams-devel

Only pkgconfig and libjpeg-devel were not a problem. I will analyze the rest step by step. 
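A quick way to see which build dependencies are missing is to read them out of the spec and ask rpm about each one. This is a sketch under two assumptions: each BuildRequires line names a single package, as in the list above, and the function names are made up for illustration.

```shell
# Print the package names from the BuildRequires lines of a spec file,
# one per line, dropping any version constraint such as ">= 0.10.1".
# Assumes one package per BuildRequires line.
list_build_requires() {
    awk '/^BuildRequires:/ { print $2 }' "$1"
}

# Report the build dependencies that no installed package provides.
missing_build_requires() {
    list_build_requires "$1" | while read -r pkg; do
        rpm -q --whatprovides "$pkg" >/dev/null 2>&1 || echo "missing: $pkg"
    done
}

# Example:
#   missing_build_requires pdf2djvu.spec
```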

An RPM for PStreams

First, I had a look at the PStreams Project. The author is Jonathan Wakely. I downloaded version 0.6.0.

In fact, you only have to compile anything in order to run the tests. As the author says, “just copy pstream.h to some directory and include in your programs”. Does it make sense to build an RPM? I think it does, for two reasons.

First, the spec for pdf2djvu expects PStreams to be installed as an RPM. We could comment out the corresponding line, but this does not seem to be good practice.

Second, the RPM manager is invaluable when you want to know what is on your RPM-based GNU/Linux system. After a year I will have forgotten about the pstream.h file and might install another version in a different location. Then I might get into trouble. In any case, not keeping systematic track of what is in the system is a recipe for an intractable maze of files.

I just wrote a simple spec file that installs pstream.h plus the manuals. For details, you may consult both the spec and the result in this archive.

The Poppler

The program pdf2djvu relies on two essential resources. First, it uses Poppler to read the pdf file. Then it builds the djvu file with the help of djvulibre.

As I write this, FC7 has poppler-0.5.4. According to Wilk, this version has security vulnerabilities (when confronted with a crafted pdf). Also according to Wilk, poppler-0.5.4 works with pdf2djvu. I decided, however, to go for a newer version.

The process is somewhat similar to what I have already explained. I downloaded the latest Poppler from the Poppler Project's site. I looked for a spec on the Fedora Project. I tried to make sense of the specific situation on my box. For example, the FC10 spec contains:

BuildRequires: qt3-devel

This caused an error. I used the command

rpm -qa | grep qt

and I noticed that it should be only

BuildRequires: qt-devel

On FC7 this package is, in fact, what the FC10 spec calls qt3-devel. There is also another qt3 requirement that has to be corrected.

In other situations, there might be other things to check. 
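The qt3-devel adjustment can also be made mechanically rather than by hand. This is a sketch only: the function name fix_qt_requires is hypothetical, and it touches BuildRequires lines naming qt3-devel and nothing else, so the other qt3 requirement still has to be checked by hand.

```shell
# Rewrite "qt3-devel" to "qt-devel" on BuildRequires lines only and
# write the adjusted spec to standard output; the input file is untouched.
fix_qt_requires() {
    sed 's/^\(BuildRequires:[[:space:]]*\)qt3-devel/\1qt-devel/' "$1"
}

# Example:
#   fix_qt_requires poppler.spec > poppler-fc7.spec
```

Writing to a new file keeps the original FC10 spec around for comparison.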

The build went smoothly, but the command

rpm -Uvh poppler-0.10.2-1.fc7.i386.rpm poppler-devel-0.10.2-1.fc7.i386.rpm poppler-glib-0.10.2-1.fc7.i386.rpm poppler-glib-devel-0.10.2-1.fc7.i386.rpm poppler-qt-0.10.2-1.fc7.i386.rpm poppler-qt4-0.10.2-1.fc7.i386.rpm poppler-qt-devel-0.10.2-1.fc7.i386.rpm poppler-qt4-devel-0.10.2-1.fc7.i386.rpm poppler-utils-0.10.2-1.fc7.i386.rpm

showed that I was in trouble. Evince, the viewer, depends on poppler-0.5.4. One has to make a decision. I decided to add the option --nodeps to the rpm command. The command

rpm -qR evince | grep poppler

shows that what evince needs is libpoppler-glib.so.1. 

Rebuilding RPMs, beyond certain limits, can start a snowball: you have to rebuild package after package. This does not make sense. You should just upgrade. However, in my case, I stopped because I do not open pdf files with evince. I use xpdf.
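Before resorting to --nodeps, it can help to ask rpm what an upgrade would break without installing anything. The sketch below uses rpm's --test and --whatrequires options; the function names and the RPM variable (which only lets you point at a different rpm binary) are assumptions made for illustration.

```shell
# Which rpm binary to call; override RPM to use another one.
RPM=${RPM:-rpm}

# Check an upgrade without installing anything; rpm reports the
# dependency problems it would run into.
dry_run_upgrade() {
    "$RPM" -Uvh --test "$@"
}

# List the installed packages that require the given capability.
who_requires() {
    "$RPM" -q --whatrequires "$1"
}

# Examples:
#   dry_run_upgrade poppler-*.rpm
#   who_requires libpoppler-glib.so.1
```

If only packages you do not use turn up, as with evince here, forcing the upgrade is a tolerable risk; otherwise the snowball is about to start.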

DjVuLibre

The final dependency is the most interesting. In this case, it is easy to download from the DjVuLibre Project's site and build the RPM. There is a spec inside the archive. 

There are, however, two problems. As I write this note, only djvulibre-3.5.19 works. Wilk is right: djvulibre 3.5.20-4 and 3.5.20-5 are broken. I tried 3.5.19-1 and this one works.

The second problem is that the spec from DjVuLibre generates only one RPM. The “devel” files are in this RPM. This has to be kept in mind when we finally adjust the spec for pdf2djvu.

Building pdf2djvu

When we have solved all the dependency problems, we may go back to the spec file for pdf2djvu. In my case, I have adjusted the dependencies in the following way:

BuildRequires: djvulibre >= 3.5.19

BuildRequires: libjpeg-devel

BuildRequires: pkgconfig

BuildRequires: poppler-devel >= 0.10.1

BuildRequires: pstreams >= 0.6.0

The rest is just

rpmbuild -bb pdf2djvu.spec

The following command reports the version of pdf2djvu, shown here with its output:

pdf2djvu --version

pdf2djvu 0.4.13 (DjVuLibre 3.5.19, poppler 0.10.2)

The use of pdf2djvu is quite simple:

pdf2djvu the-pdf-file.pdf -o the-djvu-file.djvu

You have to name the output djvu file explicitly.
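Since the output name is mandatory, a small wrapper can derive it from the input name when converting several files. This is a minimal sketch: the helper names djvu_name and convert_all are hypothetical, and only the pdf2djvu invocation shown above is used.

```shell
# Derive the output name by replacing a trailing ".pdf" with ".djvu".
djvu_name() {
    printf '%s\n' "${1%.pdf}.djvu"
}

# Convert every pdf given on the command line, stopping at the first failure.
convert_all() {
    for pdf in "$@"; do
        pdf2djvu "$pdf" -o "$(djvu_name "$pdf")" || return 1
    done
}

# Example:
#   convert_all *.pdf
```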