I am trying to take a PDF and extract any text from it. I then want to make that text searchable using ColdFusion's built-in Verity search.
Is there a library that already does this well? I can call either Java or .NET libraries from CF (Java preferred).
Any insight or experience will be highly appreciated ... Thanks!
EDIT: As far as I know, PDF indexing in CF already works when the text is embedded in the PDF. What I have to deal with are PDFs where the text is a scanned image.
If you are able to run your own software on the server (i.e. dedicated / VPS), then you can look at using cfexecute to call an external tool that converts the PDF to text.
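Since you said Java is preferred, the same idea can be sketched with ProcessBuilder instead of cfexecute. This is only a rough sketch: the class name PdfTextExtractor and the command layout "tool input.pdf output.txt" are made up for illustration and would need to match whatever converter you actually install. For scanned pages the converter would also have to do OCR.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: shells out to an external PDF-to-text converter.
final class PdfTextExtractor {

    // Builds the argument list for an ASSUMED command layout:
    //   <tool> <input.pdf> <output.txt>
    // Adjust this to the real syntax of the converter you use.
    static List<String> buildCommand(String tool, Path pdf, Path txt) {
        return List.of(tool, pdf.toString(), txt.toString());
    }

    // Runs the command and waits up to timeoutSeconds.
    // Returns true only if the process finished in time with exit code 0.
    static boolean run(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)                       // merge stderr into stdout
                .redirectOutput(ProcessBuilder.Redirect.DISCARD) // ignore console output
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly(); // converter hung: kill it
            return false;
        }
        return process.exitValue() == 0;
    }
}
```

You would call buildCommand with the path to your converter, then run, and afterwards point the Verity collection at the resulting text file. Passing the arguments as a list rather than one concatenated string avoids quoting problems with file names that contain spaces.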