Is there a CAT tool with integrated OCR? Thread poster: 6764890385 (X)
|
I need this because I have a PowerPoint file that's made up of both text and images with text in them. The OCR tool I have (FineReader) won't format the target file properly the way Trados does, and Trados won't read the images and extract the text from them. Doing this manually would take a ridiculous amount of time. Any suggestions? | | | Michael Beijer United Kingdom Local time: 06:39 Member (2009) Dutch to English + ...
Sadly, FineReader is pretty much the best there is in terms of converting a difficult PDF into a usable document. Some CAT tools have built-in PDF converters (some using OCR), but none of them is as good as FineReader. One question: how well do you know FR? Often, with a little messing around, you can greatly improve your results. Fiddle with the little drop-down that selects whether the area under scrutiny will be read as text, as an image, or as a table. Michael
[Edited at 2013-06-25 21:14 GMT] | | | Michael Beijer United Kingdom Local time: 06:39 Member (2009) Dutch to English + ... try various different converters | Jun 25, 2013 |
My second piece of advice would be to try as many converters as you can get your hands on. I find that sometimes a certain converter will just do a much better job on a particular file. For example, do you own Adobe Acrobat? Adobe Acrobat can sometimes do a great job, sometimes even better than FineReader. Or try the free Wordfast Anywhere converter. I have heard good things about that too. Michael
[Edited at 2013-06-25 16:13 GMT] | | | Michael Davies Denmark Local time: 07:39 Member (2009) Danish to English + ... OCR (powerpoint with text in images) | Jun 25, 2013 |
I do not know of any CAT tool with built-in OCR, but there are a number of online OCR services (some require payment, often quite a low price per page; others are free, though usually with a limit on the number of pages or pages per hour). One such service I have just googled is able to convert PDF documents to PowerPoint, so an alternative solution could be to first convert your PowerPoint slides to PDF and then use this online OCR (or maybe your FR?) to convert back to PowerPoint, turning the text within images into editable text, which Trados should be happier with. You can see more about this online OCR and the conversion of PDF to PowerPoint at: http://www.verypdf.com/wordpress/201211/how-to-convert-image-pdf-to-editable-powerpoint-by-ocr-tech-33121.html. The OCR referred to in the text is http://www.verypdf.com/app/pdf-to-table-extractor-ocr/index.html. I wish you luck! | |
|
|
Michael Davies Denmark Local time: 07:39 Member (2009) Danish to English + ... Wordfast Anywhere | Jun 25, 2013 |
I can definitely recommend Wordfast Anywhere as an excellent online translation tool, which I sometimes use as an alternative to Trados (I have Studio 2011), as it has some features that Trados does not. I do not, however, have any experience of using it on documents such as the ones you are having problems with. According to http://www.wordfast.com/products_wordfast_anywhere.html it can work with both PowerPoint and PDF (including scanned PDFs, which suggests that it does have OCR capabilities). | | | 6764890385 (X) TOPIC STARTER Couldn't batch modify the resolutions of images within the pptx | Jun 25, 2013 |
I don't know much about FR, but I wanted to increase the resolution of each image and couldn't find a way to do that in batch. You can change the resolution of images individually, or mess around as you said, showing the program where there's text, tables, images, etc., but that just takes too much time. I tried Wordfast Anywhere and it didn't even have a clue that there are images in the file. Is there a way to batch modify the resolution of images in FR? --- Michael Davies: thanks, but MS Office can already make a PDF from the presentation or save each slide as an image, and that's already what I'm giving FR to read from (PNG images). And I did hear that Wordfast Anywhere has OCR, but the file probably needs to be purely image-based for it to apply that (see above); it recognized no images in my PowerPoint and only took the text. | | | ghislandi Local time: 06:39 English to Italian | 6764890385 (X) TOPIC STARTER SDL Trados and OCR | Jun 25, 2013 |
Hi Massimo, maybe Trados should have OCR. I don't know, maybe it could come with FineReader integrated, in partnership with ABBYY, so that we don't have to figure out ridiculous workarounds to manage difficult files like this. Just a thought.
[Edited at 2013-06-25 16:49 GMT] | |
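On the batch problem raised above: a .pptx file is an ordinary ZIP archive, and the embedded pictures are stored under ppt/media/ inside it. Below is a minimal Python sketch, assuming all you want is every embedded image pulled out as stored so the whole set can be fed to FineReader (or another OCR tool) in one go. The function name extract_pptx_images and the file names in the usage note are hypothetical. The sketch does not upscale anything, and it does not put recognized text back into the slides.

```python
import zipfile
from pathlib import Path

# Common picture formats that PowerPoint stores under ppt/media/.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tif", ".tiff"}


def extract_pptx_images(pptx_path, out_dir):
    """Copy every embedded image out of a .pptx into out_dir.

    Returns the list of saved file paths, sorted by archive name.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(pptx_path) as archive:
        for name in sorted(archive.namelist()):
            member = Path(name)
            # Skip slides, layouts, and anything that is not a picture.
            if not name.startswith("ppt/media/"):
                continue
            if member.suffix.lower() not in IMAGE_SUFFIXES:
                continue
            target = out / member.name
            target.write_bytes(archive.read(name))
            saved.append(target)
    return saved
```

Calling extract_pptx_images("deck.pptx", "deck_images") would leave files such as image1.png in the deck_images folder. The embedded originals may be sharper than slides exported as PNG, but that depends on how the presentation was saved, so check a few before relying on them.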
|
|
Fluency 2013 | Jun 25, 2013 |
Hi Mert, Fluency 2013 (Western Standard) has a built-in OCR app (under Tools). Download a trial version and give it a go. HTH, Bernard | | | 6764890385 (X) TOPIC STARTER
I can't download the trial; it won't accept my phone number. I tried putting both a plus and two zeros at the beginning of the number, to no avail: "Please enter a valid phone number." By the way, what I want is software that can recognize both images and text. Wordfast also has OCR, but it didn't read the images in my pptx, which also contains regular text. So do you know if Fluency can manage that?
[Edited at 2013-06-25 18:22 GMT] | | | Tech Support | Jun 25, 2013 |
Hi again, Send an e-mail to [email protected], you should get a reply within 5-10 minutes. HTH, Bernard
[Edited at 2013-06-25 19:23 GMT] | | | 6764890385 (X) TOPIC STARTER Did get the trial | Jun 25, 2013 |
Sorry, I did get the trial. Apparently the problem wasn't my phone number (I had typed my email wrong), but the form nonetheless complained about the phone number. | |
|
|
Keep me posted | Jun 25, 2013 |
... about the results as I've never tried that functionality and tech support will certainly help you out if necessary. Thanks, Bernard
[Edited at 2013-06-25 20:14 GMT] | | | 6764890385 (X) TOPIC STARTER | Wordfast Anywhere is Good | Jun 25, 2013 |
I have used the OCR function of Wordfast Anywhere several times, and I can say that it is very good. And you don't have to use the TM to make it work. Go to the Wordfast Anywhere website, go to File, Upload Document, select the document on your hard drive, then click Upload. WFA will display a disclaimer, with the option to download the result as a .doc file or to load it and start using WFA. If you choose to download it, just click Download as .Doc file. At the next prompt, click OK. Choose where you want it downloaded to. It will be downloaded as a .zip file. Just open the ZIP package with WinRAR, et voilà, you have your scanned file, with the same format as the original. Provided, of course, that your PDF file is of good quality. I converted a file while writing this for you. The scanning finished long before I finished writing, and the file is open and ready for me to work on. Hope this helps. Regards,