extracting text from *.pdf files

Name: drmkahn
Date: November 3, 2009 at 16:04:40 Pacific
OS: Windows XP
Product: Microsoft Visual FoxPro Professional 9.0 (Win32, English, academic CD, model 340-01232)
Subcategory: General
Comment:

We are sent laboratory data as *.pdf files. I am trying to write a program that will extract the text from a PDF file and then import it into a VFP table. I can manually open the PDF document, select all the text, and copy it to the Windows clipboard, and I have written a prg that parses _cliptext and files it. I want to do the first part programmatically from VFP so I don't have to open the PDF file manually.



Response Number 1
Name: nbrane
Date: November 3, 2009 at 22:51:41 Pacific
Reply:

I've worked some with VFP, and based on my experience I would guess that VFP cannot open a PDF file and extract its text on its own. If, however, you have access to Visual Basic, it's not much of a problem: the Visual Basic program can be compiled to an .EXE, and VFP can run it with the ! (RUN) command.
Here's the core Visual Basic code needed:
' open Notepad, then open the PDF in its associated viewer
' ("start" is a cmd.exe built-in, so it has to go through cmd /c)
Shell "notepad", 1
Shell "cmd /c start xxxxx.pdf", 1
' the following is a "wait n seconds" routine, set for "g" seconds
g = 20
GoSub wait
AppActivate "Acrobat Reader -"
' alt-E, S select-all, ctl-c copy
SendKeys "%E", True
SendKeys "S", True
SendKeys "^C", True
AppActivate "Untitled - Notepad"
' ctl-v paste, alt-F, A saveAs, type the file name, Enter, close notepad
SendKeys "^V", True
SendKeys "%f", True
SendKeys "a", True
SendKeys "\sykoPath\testfil.txt", True
SendKeys "{ENTER}", True
SendKeys "%{F4}", True
' end vbasic program
End

wait:
' loop until g seconds have passed, yielding so other windows keep responding
t1 = Timer
Do While Timer - t1 < g
DoEvents
Loop
Return

I know this is one more step, and Visual Basic window manipulation can be funky (very slow loads, other windows interrupting, etc.). It's the only way I know of to get from PDF to text without paying a bunch of dollars to Adobe (they do offer programs for this kind of import/export).


Response Number 2
Name: tvc
Date: November 8, 2009 at 08:26:40 Pacific
Reply:

Conversions have been done in many ways; I would suggest a Google search for "pdf2txt".
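
For illustration, here is a minimal sketch of how a command-line converter could be driven from VFP, so the PDF never has to be opened by hand. It assumes the free pdftotext utility (from Xpdf or Poppler) is installed and on the PATH; that tool is an assumption, not something named in this thread, and the file path is hypothetical. The sketch runs the converter through the Windows Script Host shell object so that VFP waits for it to finish, then loads the resulting text into _CLIPTEXT so an existing clipboard parser could be reused unchanged.

```foxpro
* Sketch only: assumes pdftotext (Xpdf/Poppler) is installed and on the PATH.
* The path below is hypothetical.
LOCAL lcPdf, lcTxt, lcCmd, loShell, lcText, lnLines
LOCAL ARRAY laLines[1]

lcPdf = "c:\labdata\report.pdf"
lcTxt = FORCEEXT(lcPdf, "txt")

* -layout asks pdftotext to keep the page's column layout as far as it can
lcCmd = 'pdftotext -layout "' + lcPdf + '" "' + lcTxt + '"'

* Run hidden (0) and wait for the converter to finish (.T.)
loShell = CREATEOBJECT("WScript.Shell")
loShell.Run(lcCmd, 0, .T.)

IF FILE(lcTxt)
    lcText = FILETOSTR(lcTxt)
    * Either hand the text to the existing clipboard parser...
    _CLIPTEXT = lcText
    * ...or split it into lines and parse it directly
    lnLines = ALINES(laLines, lcText)
    ? "Lines extracted:", lnLines
ELSE
    ? "Conversion failed: no output file for " + lcPdf
ENDIF
```

This only works for PDFs that contain real text; scanned pages are images and would come out empty without OCR.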

