One of the best-known features of the OpenOffice.org suite is its ability to export files into PDF format.
Combined with the ability to read files in many formats such as the ones used in WordPerfect and MS Office, this makes OpenOffice.org a good converter.
There are other free office converters, of course (wvWare, for example), but OpenOffice.org is probably the best one, thanks to the large number of formats it supports and the good quality of its filters.
Its main drawback, however, is that it’s not trivial to convert documents using this suite in batch mode from the command line.
Of all the methods I’ve seen, this is probably the easiest one. Here are the steps:
- Start OpenOffice.org:
$ openoffice "-accept=socket,host=localhost,port=2002;urp;"
- Download this script.
- Use it!
$ ./ooextract.py --pdf mydocument.odt
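Since the point is batch conversion, here is a minimal sketch of a wrapper that runs the command above once per document. The helper names (`build_commands`, `convert_all`) are my own, and the sketch assumes that ooextract.py sits in the current directory and exits with a non-zero status when a conversion fails, which I haven't verified.

```python
import subprocess


def build_commands(paths, script="./ooextract.py"):
    """Return one 'ooextract.py --pdf <file>' command line per input document."""
    return [[script, "--pdf", str(p)] for p in paths]


def convert_all(paths, script="./ooextract.py"):
    """Run the conversions one after another and return the files that failed.

    Assumption: the script signals failure through a non-zero exit status.
    """
    failed = []
    for cmd in build_commands(paths, script):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            failed.append(cmd[-1])
    return failed
```

Something like `convert_all(["report.odt", "letter.doc"])` would then convert both files in turn, as long as OpenOffice.org is already listening on port 2002.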
OpenOffice.org has to be running in the background for this script to work. To avoid having the office window lying around on your desktop (or to use it on a machine with no display), you can use Xvfb or a VNC server. Example:
$ Xvfb :1 &
$ DISPLAY=:1 openoffice "-accept=socket,host=localhost,port=2002;urp;"
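When the office is started from a script like this, it may take a few seconds before it accepts connections. A rough sketch of a check that waits until something is listening on the port given in the `-accept` option could look like the following. The function name and its timeout defaults are my own choices, and the check only proves that the port is open, not that OpenOffice.org has finished starting up.

```python
import socket
import time


def wait_for_listener(host="localhost", port=2002, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Open and immediately close a connection; we only care that it succeeds.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling `wait_for_listener()` before the first conversion avoids running the script against an office that isn't up yet.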
And that’s it. It would be nice if OpenOffice.org had direct support for conversion from the command line, but in the meantime this method will do the job.