Today I reviewed, once again, the advantages and disadvantages of DjVu and PDF for scanning and archiving old books, and I’ve decided to abandon DjVu. While googling, I found a post which confirmed that the situation with DjVu is bad (the post is from Feb 2007, but I have no reason to believe that the situation is better today).
When I started using it several years ago (my oldest DjVu files date from late 2004), support was sparse, but I understood that the format was still relatively new. I started using it in the hope that it would catch on. It seems, however, that the situation has not improved much since then. Yes, there is more software available for it, but most of it is not free (in any sense of the word free), and there does not seem to be any kind of end-to-end support for all the features that the format allows. For instance, I have no tool in Ubuntu which allows me to add comments to a DjVu file like I do with PDF files. So I’ve given up on DjVu. It used to be that the size advantage more than made up for all the other disadvantages. But PDF has evolved and gotten better at compressing images (by using more advanced compression algorithms), disk space is not the issue it was years ago, and PDF is just better supported.
That’s very unfortunate, because I think that if the creators of DjVu had opened it fully, it would most likely have displaced PDF.
Michael White over on the Scientific Blogging Site posted an article in which he exposes the largely mythical nature of the narrative of the scientific underdog repressed by an entrenched scientific establishment. I urge people to read his article, but if you are too pressed for time, here is White’s own summary of what that narrative consists of. Keep the context in mind as you read it: White is critical of this narrative.
The narrative goes like this:
1. The famous, brilliant scientist So-and-so hypothesized that X was true.
2. X, forever after, became dogma among scientists, simply by virtue of the brilliance and fame of Dr. So-and-so.
3. This dogmatic assent continues unchallenged until an intrepid, underdog scientist comes forward with a dramatic new theory, completely overturning X, in spite of sustained, hostile opposition by the dogmatic scientific establishment.
I’ve been working on some Chinese extensions for OO, but at every step of the way I have to fight with obscure documentation and really strange design decisions. Here’s the latest example. Want to display an image in a dialog? We’re not talking about anything fancy here, just one single static image. So can you just put a relative path in the dlg:src attribute, which indicates where the image lives (e.g. dlg:src="../image.jpg")? No way! That would be way too simple and would violate the spirit of OO, which is “why make things simple when you can make them complicated”. Instead you have to create two additional XML files to tell Open Office where to find the image in your extension, and then at run time you have to query Open Office to find where the image really lives and load it into your dialog. Yay! The reason for this is that you do not know ahead of time where your extension is going to reside on disk. You’d think a relative path would be rock solid precisely because it is relative to where your extension is located, but no: that does not work. You have to file extra paperwork with Open Office to declare the existence of the images.
(I searched through the dialog files bundled with Open Office to see if I could find something useful in there, but what I found were paths like “file://D:/…”. Oops, I guess even the Open Office developers are having a hard time keeping their paths portable.)
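For the curious, here is roughly what the run-time half of that dance looks like. It is a minimal sketch in Python, not the code from my extension. It assumes a provider object with a getPackageLocation(extension_id) method that returns the install URL of the extension, or an empty string if it is not installed. As far as I know, that matches the PackageInformationProvider service in the Open Office API, but treat that as an assumption. The extension id, the install URL and the FakeProvider class are all made up so that the sketch runs without an office installation.

```python
from urllib.parse import quote

# Hypothetical extension identifier, used for illustration only.
EXTENSION_ID = "org.example.chinese-tools"


def image_url(package_location, relative_path):
    """Join the extension's install URL and a path inside the extension."""
    base = package_location.rstrip("/")
    # Percent-encode characters such as spaces; "/" is left untouched.
    rel = quote(relative_path.lstrip("/"))
    return base + "/" + rel


def resolve_image(provider, extension_id, relative_path):
    """Ask the provider where the extension lives, then build the image URL.

    `provider` is assumed to expose getPackageLocation(extension_id) and to
    return an empty string for an unknown extension.
    """
    location = provider.getPackageLocation(extension_id)
    if not location:
        raise LookupError("extension not installed: " + extension_id)
    return image_url(location, relative_path)


class FakeProvider:
    """Stand-in for the office's provider so the sketch runs anywhere."""

    def __init__(self, locations):
        self._locations = locations

    def getPackageLocation(self, extension_id):
        return self._locations.get(extension_id, "")


if __name__ == "__main__":
    # Made-up install location; the real one is only known at run time.
    provider = FakeProvider({EXTENSION_ID: "file:///home/me/extensions/chinese-tools.oxt"})
    print(resolve_image(provider, EXTENSION_ID, "images/image.jpg"))
```

In a real extension the resulting URL would then be assigned to the image control of the dialog (I believe through the ImageURL property of its model) before the dialog is shown. That is all it takes to replace what a relative path would have done for free.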
And this is just the latest in a loooooooooooooooooooong series of grievances. Here’s a new motto: “you don’t know the meaning of bondage-and-discipline programming until you’ve tried to write extensions for Open Office.” Open Office is like a bureaucrat: you can’t do anything without filing multiple forms to announce what you want to do and justify it.
References: here, here and here.