Bharg Shah India Local time: 20:59 French to English + ...
Nov 1, 2003
Hi all,
One of my clients has given me a bilingual glossary of about 400 pages as a PDF file. The terms are arranged in a table with two distinct columns. I was wondering whether I could convert this into a two-column Excel worksheet, which I could then import into MultiTerm. I tried saving the PDF file as RTF, but that doesn't retain the table format: all the terms are just listed one after the other. The PDF contains editable text, not a scanned image, so I guess there must be some way to extract the table. All help will be appreciated.
Natalie Poland Local time: 17:29 Member (2002) English to Russian + ...
MODERATOR
SITE LOCALIZER
Try using good OCR software
Nov 1, 2003
For example, FineReader Pro version 6 or higher. If your file is large, first divide it into smaller parts using the full version of Acrobat; otherwise opening the file in FineReader will take ages.
Once the file is open, recognize the text as usual and then choose "Send to Word". About 99% of the formatting will be preserved.
Harry Bornemann Mexico Local time: 09:29 English to German + ...
Write a macro
Nov 1, 2003
I would write a macro in Word VBA or Perl. First you could insert a marker such as # after every second end-of-paragraph mark, and then search and replace (turning the unmarked paragraph marks into tabs) until you have a tab-separated table.
400 pages might be too much for FineReader and even too much for Word. That's where Perl becomes interesting: it would do the job within a few seconds. HTH, Harry
[Edited at 2003-11-01 12:04]
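The pairing idea above can be sketched in a few lines. This is only a rough illustration, written in Python rather than Word VBA or Perl, and it assumes the text extracted from the PDF strictly alternates between one source term and one target term per line. The function names `pair_terms` and `to_tsv` and the sample terms are made up for the example; entries that wrap over several lines would need extra handling first.

```python
import csv
import io


def pair_terms(lines):
    """Pair alternating source/target lines into (source, target) rows.

    Assumption: after blank lines are dropped, the input strictly
    alternates source term, target term, source term, target term, ...
    """
    # Strip surrounding whitespace and drop empty lines.
    terms = [line.strip() for line in lines if line.strip()]
    if len(terms) % 2 != 0:
        # An odd count means the alternation broke somewhere.
        raise ValueError(
            f"odd number of terms ({len(terms)}); check the extracted text"
        )
    # Even positions are source terms, odd positions are target terms.
    return list(zip(terms[0::2], terms[1::2]))


def to_tsv(rows):
    """Render rows as tab-separated text that Excel can open."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical sample standing in for the text saved from the PDF.
    sample = ["maison", "house", "", "chat", "cat"]
    print(to_tsv(pair_terms(sample)), end="")
```

Saving the result with a .txt or .tsv extension and opening it in Excel should give the two columns; the odd-count check is there only to flag early that the alternation assumption did not hold.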
Mónica Machado United Kingdom Local time: 16:29 English to Portuguese + ...
FineReader 7 could be useful
Nov 1, 2003
Hello,
FineReader 7 could be useful. You can download a 15-day trial version (search for ABBYY). If 400 pages is too much for it, split the document in two. FineReader 7 works OK with 270 pages (I have never tried more than that in a single document).
Hope this helps
Regards, Mónica