Performing Optical Character Recognition on PDFs from ColdFusion using a Java or .NET Library?


I am trying to take a PDF and extract any text from it. I then want to make that text searchable through ColdFusion's built-in Verity search.

Is there a library that already does this well and that I can call from CF? CF can call Java or .NET libraries (Java preferred).

Any insight or experience will be highly appreciated ... Thanks!

EDIT: As far as I know, PDF indexing in CF works when the text is embedded in the PDF. The PDFs I have to deal with contain scanned text stored as images.

If you are able to run your own software on the server (i.e. dedicated / VPS), then you could look into using cfexecute to run a command-line tool that converts the PDF to text.
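Since Java was preferred, here is a rough sketch of that external-tool approach driven from Java rather than from cfexecute directly. It is only an illustration under some assumptions that are not in the question: that Poppler's pdftoppm and the Tesseract OCR binary are installed and on the PATH, and that rasterizing each page first and then running OCR on the images is acceptable. The class and method names (PdfOcrSketch, rasterizeCommand, ocrCommand, run) are made up for this example.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; class and method names are illustrative only.
final class PdfOcrSketch {

    // Builds the command that renders the PDF pages to 300 dpi PNG files
    // whose names start with imagePrefix.
    // Assumption: Poppler's pdftoppm is installed and on the PATH.
    static List<String> rasterizeCommand(String pdfPath, String imagePrefix) {
        return Arrays.asList("pdftoppm", "-r", "300", "-png", pdfPath, imagePrefix);
    }

    // Builds the command that runs OCR on one page image.
    // Tesseract writes its result to outputBase + ".txt".
    // Assumption: the tesseract binary is installed and on the PATH.
    static List<String> ocrCommand(String imagePath, String outputBase) {
        return Arrays.asList("tesseract", imagePath, outputBase);
    }

    // Runs a command in workingDir, waits for it to finish,
    // and returns the process exit code.
    static int run(List<String> command, File workingDir)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.directory(workingDir);
        // Forward the tool's output so the child process cannot block on a full pipe.
        builder.inheritIO();
        return builder.start().waitFor();
    }
}
```

The same two commands could just as well be issued through cfexecute. Either way, the resulting .txt files are plain text, so they should be indexable like any other text document.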
