extracting text from *.pdf files

Microsoft Microsoft visual foxpro profes...
November 3, 2009 at 16:04:40
Specs: Windows XP
We receive laboratory data as *.pdf files. I am trying to write a program that extracts the text from a PDF file and imports it into a VFP table. At the moment I open the PDF document manually, select all the text, and copy it to the Windows clipboard; a prg I wrote then parses _CLIPTEXT and files the data. I want to do the first part programmatically from VFP so that I don't have to open the PDF file by hand.


November 3, 2009 at 22:51:41
I've worked some with VFP and, based on my experience, I would guess that VFP cannot open a PDF file and extract its text on its own. If you have access to Visual Basic, however, it's not much of a problem: the Visual Basic program can be compiled into an .EXE, and VFP can run it with the ! (RUN) command.
Here's the core Visual Basic code needed:
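Const g = 5 'seconds to wait for the windows to load (assumed value, tune as needed)

Sub Main()
    Dim t1 As Single
    Shell "notepad", 1
    'start is a cmd built-in, so it has to be launched through cmd /c
    Shell "cmd /c start xxxxx.pdf", 1
    'the following is a "wait n seconds" routine set for "g" seconds
    GoSub WaitSecs
    AppActivate "Acrobat Reader -"
    'alt-E, S select-all, ctl-c
    SendKeys "%E", True
    SendKeys "S", True
    SendKeys "^C", True
    AppActivate "Untitled - Notepad"
    'ctl-v paste, alt-F, A saveAs, file name, Enter to confirm, close notepad
    SendKeys "^V", True
    SendKeys "%f", True
    SendKeys "a", True
    SendKeys "\sykoPath\testfil.txt", True
    SendKeys "{ENTER}", True
    SendKeys "%{F4}", True
    Exit Sub
    'end of the main sequence; the wait routine follows

WaitSecs:
    'pause until g seconds have elapsed, letting other windows process events
    t1 = Timer
    Do While Timer - t1 < g
        DoEvents
    Loop
    Return
End Sub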

I know this is one more step, and Visual Basic window manipulation can be funky (very slow loads, other windows interrupting, etc.). It's the only way I know of to get from PDF to text without paying a bunch of dollars to Adobe, which does sell programs for this kind of import/export.


November 8, 2009 at 08:26:40
Conversions have been done in many ways; I would suggest a Google search for "pdf2txt".
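
If a command-line converter turns up, the manual copy step could probably be scripted from VFP itself. The following is only a rough sketch under several assumptions: it assumes the free pdftotext utility (from the Xpdf tools) is installed and on the PATH, the file path is made up, and parseclip stands in for the existing prg that parses _CLIPTEXT (that name is hypothetical). It uses the Windows Script Host shell object so that VFP waits for the conversion to finish before reading the result.

```foxpro
* Sketch only: assumes pdftotext.exe is on the PATH (an assumption, not
* something confirmed in this thread).
LOCAL lcPdf, lcTxt, loShell, lnExit

lcPdf = "c:\labdata\report.pdf"     && hypothetical input file
lcTxt = FORCEEXT(lcPdf, "txt")      && same path with a txt extension

* WScript.Shell Run(command, window style, wait):
* 0 hides the console window, .T. blocks until the converter exits
loShell = CREATEOBJECT("WScript.Shell")
lnExit = loShell.Run('pdftotext -layout "' + lcPdf + '" "' + lcTxt + '"', 0, .T.)

IF lnExit = 0 AND FILE(lcTxt)
    * Hand the text to the existing parser through the clipboard variable
    _CLIPTEXT = FILETOSTR(lcTxt)
    DO parseclip                    && hypothetical name for the existing prg
ELSE
    MESSAGEBOX("Conversion failed for " + lcPdf)
ENDIF
```

Loading the text into _CLIPTEXT keeps the existing parsing prg unchanged; it could equally read the string returned by FILETOSTR() directly. The -layout option asks pdftotext to preserve the column layout of the page, which may or may not suit the lab reports.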

