Serving the Quantitative Finance Community

 
theGreek
Topic Author
Posts: 3
Joined: September 30th, 2003, 12:29 pm

PDF to excel

February 11th, 2004, 12:21 pm

Hi people, I've recently been assigned a project by a back-office manager to help with some data collection (I should have kept my mouth shut at the pub). Apparently these people collect data from PDFs and save it in Excel in a certain format. The problem is that PDF is probably the worst file type one can work with. To add insult to injury, the data in the PDFs does not always have the same format. I was wondering whether anyone has worked with a reliable tool that converts PDFs to Excel. I have come across a few programs on the net, but most of them seem to make a real mess of the original file. Any ideas on how to approach the problem? My first thought is to get some tool that extracts the data from the PDF and then use VBA to put it in order. A long shot... is there a programming language specific to PDF (like VBA is to Excel and Access)? Any suggestions for solving the problem would be highly valued.
 
player
Posts: 0
Joined: August 5th, 2002, 10:00 am

September 15th, 2006, 11:29 am

Anyone else know how to do it for free??
 
zeta
Posts: 26
Joined: September 27th, 2005, 3:25 pm
Location: Houston, TX

September 15th, 2006, 11:42 am

Do you just want to extract tabulated data? As it happens, I have been working on a database project extracting data from PDFs, images, etc., but my work has been more slanted towards converting 2D plots of data into numbers and strings.
 
player
Posts: 0
Joined: August 5th, 2002, 10:00 am

September 15th, 2006, 1:54 pm

Basically a table from a PDF document. I tried copying and pasting, but it doesn't recognise the individual columns...
 
PointerLover
Posts: 0
Joined: March 1st, 2006, 4:14 pm

September 15th, 2006, 5:12 pm

I recently had a similar problem: I wanted to import a data table from a PDF file into Excel. I solved it as follows:
- Start the Xpdf (http://www.foolabs.com/xpdf/download.html) program "pdftotext.exe" via VBA, passing the required parameters (input file / output file) to convert the .pdf to a pure text file without any formatting.
- Read this text file line by line until one of the column headers (which are always the same in my case) is found.
- Now read the following lines and assume that the different columns are separated by blanks.
Maybe not the cleanest solution, but it works fine and was easy to implement.
Last edited by PointerLover on September 14th, 2006, 10:00 pm, edited 1 time in total.
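
For anyone who wants to try the pdftotext approach described above, here is a minimal sketch of the same three steps. It is written in Python rather than VBA, so it is not PointerLover's actual code. It assumes the Xpdf `pdftotext` binary is on the PATH; the function names, the header marker, and the rule that a blank line ends the table are all assumptions made for illustration.

```python
import csv
import subprocess


def pdf_to_text(pdf_path, txt_path, keep_layout=False):
    """Step 1: run Xpdf's pdftotext (assumed to be on the PATH)."""
    cmd = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to preserve the physical page layout,
        # which can help keep columns aligned (optional, not in the post).
        cmd.append("-layout")
    cmd += [str(pdf_path), str(txt_path)]
    subprocess.run(cmd, check=True)


def extract_table(lines, header_marker):
    """Steps 2 and 3: find the header line, then split rows on blanks.

    Assumption: the table ends at the first blank line after the header.
    """
    rows = []
    in_table = False
    for line in lines:
        if not in_table:
            if header_marker in line:
                in_table = True
                rows.append(line.split())
            continue
        if not line.strip():
            break
        rows.append(line.split())
    return rows


def save_as_csv(rows, csv_path):
    """Write the rows to a CSV file that Excel can open directly."""
    with open(csv_path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)


# Hypothetical usage (file names and header text are placeholders):
#   pdf_to_text("report.pdf", "report.txt")
#   with open("report.txt") as handle:
#       table = extract_table(handle, "Date")
#   save_as_csv(table, "report.csv")
```

Splitting on blanks breaks any cell that itself contains a space (for example a company name), so tables like that would need fixed column positions or some other rule instead.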
 
csa
Posts: 0
Joined: February 21st, 2003, 3:16 am

September 16th, 2006, 3:33 am

Some newer versions of Acrobat (6 and higher, if I'm not mistaken) will let you do something like "Copy As Table" from a PDF. You can then paste it into Excel as you normally would (i.e. CTRL+V) and it will preserve the columns. There may be some tricky issues depending on the PDF, so you may have to fiddle around with it. For example, you may have to copy and paste the headers separately, because for some reason some headers are not recognized as columns and mess up the paste. Although that takes two steps, it sure beats re-typing the whole table.