The program pdf2djvu converts pdf files into djvu. It includes the text layer, the outline, etc. The program was written by Jakub Wilk.
I first noticed this program in Ubuntu 8.04. It does a very useful job: it transforms a pdf into a fully featured djvu. For example, I can write a LaTeX source, make a pdf out of it and then convert the pdf into a djvu.
I looked for a Fedora package, but there was none for FC7, which is my working installation. I did, however, find a spec file for FC9. I build RPMs in my home directory.
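Building in the home directory takes a build tree plus one line in ~/.rpmmacros. What follows is a minimal sketch, assuming the conventional five-directory layout; the function name setup_rpm_tree and the location ~/rpmbuild are only illustrative assumptions, and any other directory would do.

```shell
# Create a private rpmbuild tree and point rpmbuild at it.
# $1: top directory (assumed here: ~/rpmbuild)
# $2: macros file   (rpm reads ~/.rpmmacros for per-user settings)
setup_rpm_tree() {
    topdir=$1
    macros=$2
    for d in BUILD RPMS SOURCES SPECS SRPMS; do
        mkdir -p "$topdir/$d" || return 1
    done
    # Add %_topdir only once, so repeated runs do not duplicate the line.
    if ! grep -q '^%_topdir' "$macros" 2>/dev/null; then
        printf '%%_topdir %s\n' "$topdir" >> "$macros"
    fi
}

# Usage (not run here): setup_rpm_tree "$HOME/rpmbuild" "$HOME/.rpmmacros"
```

With this in place, the sources go into SOURCES, the spec files into SPECS, and rpmbuild can run without root privileges.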
What is to be done? I took the source from Google Code and the spec from the Fedora Project. The author of the spec is Rakesh Pandit. I also had a look at another spec, written by Krzysztof Kotlenga. From both spec files and Jakub Wilk's notes on dependencies I realized that I was in trouble on FC7.
Rakesh Pandit indicates the following dependencies:
BuildRequires: djvulibre-devel
BuildRequires: libjpeg-devel
BuildRequires: pkgconfig
BuildRequires: poppler-devel
BuildRequires: pstreams-devel
Only pkgconfig and libjpeg-devel were not a problem. I will analyze the rest step by step.
First, I had a look at the PStreams Project. The author is Jonathan Wakely. I downloaded version 0.6.0.
In fact, you need to compile only in order to run the tests. As the author says, “just copy pstream.h to some directory and include in your programs”. Does it make sense to build an RPM? I think it does, for two reasons.
First, the spec for pdf2djvu, for example, looks for an RPM. We could comment out the corresponding line, but this does not seem to be good practice.
Second, the RPM manager is invaluable when you want to know what is on your RPM-based GNU/Linux system. A year from now I will have forgotten about the pstream.h file and I may install another version in a different location. Then I might get in trouble. Anyway, not knowing systematically what is in the system is a recipe for creating an intractable maze of files.
I just wrote a simple spec file that installs pstream.h plus the manuals. For details, you may consult both the spec and the result in this archive.
The program pdf2djvu relies on two essential resources. First, it reads the pdf file with the help of Poppler. Then it builds the djvu file with the help of djvulibre.
As I write this, FC7 has poppler-0.5.4. According to Wilk, this version has security vulnerabilities (when confronted with a crafted pdf). Also according to Wilk, poppler-0.5.4 works with pdf2djvu. I decided, however, to go for a newer version.
The process is somewhat similar to what I have already explained. I downloaded the latest Poppler from the Poppler Project's site. I looked for a spec on the Fedora Project. I then adapted the spec to the specific situation on my box. For example, in the FC10 spec:
BuildRequires: qt3-devel
This caused an error. I used the command
rpm -qa | grep qt
and I noticed that it should be only
BuildRequires: qt-devel
On FC7, the package called qt-devel is in fact Qt 3, that is, what the FC10 spec calls qt3-devel. There is also another qt3 requirement in the spec that has to be corrected in the same way.
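The same correction can be applied with sed instead of an editor. This is only a sketch: the function name fix_qt_requires is made up for illustration, and it assumes that the other qt3 requirement is also a Requires or BuildRequires line that mentions qt3-devel, which should be checked against the actual spec.

```shell
# Rewrite qt3-devel to qt-devel on dependency lines of a spec file.
# Only lines starting with "Requires:" or "BuildRequires:" are touched.
# A backup of the original is kept as <spec>.bak.
fix_qt_requires() {
    sed -i.bak -e '/^\(Build\)\{0,1\}Requires:/s/qt3-devel/qt-devel/g' "$1"
}

# Usage (not run here): fix_qt_requires poppler.spec
```

Comparing the spec with its .bak copy (for instance with diff) shows exactly which lines were changed.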
In other situations, there might be other things to check.
The build went smoothly, but the command
rpm -Uvh poppler-0.10.2-1.fc7.i386.rpm poppler-devel-0.10.2-1.fc7.i386.rpm poppler-glib-0.10.2-1.fc7.i386.rpm poppler-glib-devel-0.10.2-1.fc7.i386.rpm poppler-qt-0.10.2-1.fc7.i386.rpm poppler-qt4-0.10.2-1.fc7.i386.rpm poppler-qt-devel-0.10.2-1.fc7.i386.rpm poppler-qt4-devel-0.10.2-1.fc7.i386.rpm poppler-utils-0.10.2-1.fc7.i386.rpm
showed that I was in trouble. Evince, the viewer, depends on poppler-0.5.4. One has to make a decision. I decided to add the option --nodeps to the rpm command. The command
rpm -qR evince | grep poppler
shows that what evince needs is libpoppler-glib.so.1.
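Before forcing an upgrade with --nodeps, it may be worth asking the opposite question: which installed packages need a given library? A small sketch using the --whatrequires query of rpm; the function name who_needs and the overridable RPM variable are illustrative assumptions, not part of rpm itself.

```shell
# Command used to query the RPM database; overridable for dry runs.
RPM=${RPM:-rpm}

# List the installed packages that require the given capability,
# for example the shared library that evince depends on.
who_needs() {
    "$RPM" -q --whatrequires "$1"
}

# Usage (not run here): who_needs libpoppler-glib.so.1
```

Every package in the resulting list may stop working once the library it needs is replaced.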
Rebuilding RPMs, beyond certain limits, can start a snowball: you have to rebuild package after package. This does not make sense. You should just upgrade the distribution. In my case, however, I stopped here because I do not open pdf files with evince. I use xpdf.
The final dependency is the most interesting. In this case, it is easy to download the source from the DjVuLibre Project's site and build the RPM. There is a spec inside the archive.
There are, however, two problems. The first is that, as I write this note, only djvulibre-3.5.19 works. Wilk is right: djvulibre 3.5.20-4 and 3.5.20-5 are broken. I tried 3.5.19-1 and this one works.
The second problem is that the spec from DjVuLibre generates only one RPM. The “devel” files are in this RPM, so there is no separate djvulibre-devel package. This must be kept in mind when we finally adjust the spec for pdf2djvu.
Once all the dependency problems are solved, we can go back to the spec file for pdf2djvu. In my case, I adjusted the dependencies in the following way:
BuildRequires: djvulibre >= 3.5.19
BuildRequires: libjpeg-devel
BuildRequires: pkgconfig
BuildRequires: poppler-devel >= 0.10.1
BuildRequires: pstreams >= 0.6.0
The rest is just
rpmbuild -bb pdf2djvu.spec
The following lines show how to check the version of pdf2djvu:
pdf2djvu --version
pdf2djvu 0.4.13 (DjVuLibre 3.5.19, poppler 0.10.2)
The use of pdf2djvu is quite simple:
pdf2djvu the-pdf-file.pdf -o the-djvu-file.djvu
You have to name the output djvu file explicitly.
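Since the output name must be given every time, a small wrapper helps when there are many files to convert. A minimal sketch, using only the -o option shown above; the function names djvu_name and convert_all are made up for illustration.

```shell
# Derive the output name: report.pdf -> report.djvu
djvu_name() {
    printf '%s\n' "${1%.pdf}.djvu"
}

# Convert every existing file given as an argument.
convert_all() {
    for pdf in "$@"; do
        [ -f "$pdf" ] || continue   # skip an unmatched glob or a missing file
        pdf2djvu "$pdf" -o "$(djvu_name "$pdf")" || return 1
    done
}

# Usage (not run here): convert_all ./*.pdf
```

Each djvu file lands next to its pdf, and the loop stops at the first conversion that fails.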