Indexing PDF

From: David C Sterratt <david c sterratt ed ac uk>
To: David Wheeler <dwheeler ida org>
Cc: gnome-devel-list gnome org, david c sterratt ed ac uk
Subject: Indexing PDF
Date: Wed, 9 Apr 2003 18:11:13 +0100

>>>>> David Wheeler writes:

 > David C Sterratt <david c sterratt ed ac uk> proclaimed:
 >> * Content indexers for PDF, OpenOffice and Word documents.  If it
 >> really is quite simple to add support for these (e.g. by
 >> specifying command line tools to call) and someone could tell me
 >> where to look in the code, I'd be happy to work on this.

 > For PDF, take a peek at "pdftotext".  On Red Hat Linux it's part of
 > the xpdf package; the URL given is "http://www.foolabs.com/xpdf".
 > Probably sufficient for indexing, at least as a first cut.

Thanks for the tip.

Perhaps I didn't make it clear, but what I was really wondering is where
in the medusa codebase programs like pdftotext could be hooked in.  From
my quick look at the medusa code today, it seems that at least a C file
or two is needed to implement each type of converter...
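
To make the command-line idea concrete, here is a minimal sketch of a
converter hook. It is not medusa's actual code: the CONVERTERS table, the
"{input}" placeholder and the extract_text function are hypothetical names
chosen for illustration. The only external fact relied on is that
"pdftotext FILE -" writes the extracted text to standard output.

```python
import subprocess

# Hypothetical registry: MIME type -> argv template for an external tool.
# The literal "{input}" argument is replaced by the path of the file to index.
# "pdftotext FILE -" sends the extracted text to standard output.
CONVERTERS = {
    "application/pdf": ["pdftotext", "{input}", "-"],
}


def extract_text(path, mime_type, converters=CONVERTERS):
    """Return the text of `path` for indexing, or None if no converter exists."""
    template = converters.get(mime_type)
    if template is None:
        return None  # no command-line tool registered for this type

    # Build the real argv; passing a list avoids shell quoting problems.
    argv = [path if arg == "{input}" else arg for arg in template]

    # check=True raises CalledProcessError if the tool exits with an error.
    result = subprocess.run(
        argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=True
    )
    return result.stdout.decode("utf-8", errors="replace")
```

With a table like this, supporting another format would mean adding one
entry rather than writing a new source file per converter, assuming the
indexer only needs plain text from each document.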

David.

 > --- David A. Wheeler
 >      dwheeler ida org

