Note that this thread has not been updated in a long time, and its content might not be up-to-date anymore.

Scanning / Translation for Japanese 2021/1/27 05:44
Hello. This might be a weird question to answer but I'm not sure where else to look. Does anyone know of a good scanning / translation software for the Japanese language? Thank you, Evan

PS. If you need a fuller explanation:
Scanning / OCR technology helps you recognize text that was, for example, scanned from a book or from notes. Usually when you scan pages you can't edit the text, but OCR (optical character recognition) software can recognize it so that you can edit it later (Adobe's PDF software has some OCR abilities). With English, it's pretty easy to scan text and then transfer it into a Word document, HTML, etc.

But is there any software that could do it reliably for Japanese? I found that OmniPage and Grooper Reader https://www.bisok.com/grooper-data-capture-method-features/multi-pass-... should be able to do it, but I have never used either in real life, and before shelling out $500 I'd like to know how good they are.

I need this because I am currently working on my dissertation, which requires me to read a lot of Japanese literature, but with my level of Japanese I still can't read very quickly (I constantly have to look up kanji that I don't know). Since this obviously takes a ton of time, I thought it would be much easier if I scanned the books, transferred the text into HTML and then used Rikaichan while reading.
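
To make the last step concrete: once an OCR program has exported plain text, wrapping it in HTML is the easy part. Below is a minimal Python sketch, assuming the OCR output is UTF-8 text with blank lines between paragraphs. The function name text_to_html and the page layout are made up for illustration and are not part of any OCR product.

```python
import html


def text_to_html(text: str, title: str = "OCR text") -> str:
    """Wrap plain OCR output in a minimal HTML page.

    Assumption: paragraphs are separated by blank lines, and line breaks
    inside a paragraph should be kept as <br>.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    body = "\n".join(
        # Escape &, < and > so stray symbols in the OCR output cannot break the page.
        "<p>" + html.escape(p).replace("\n", "<br>\n") + "</p>"
        for p in paragraphs
    )
    return (
        "<!DOCTYPE html>\n"
        '<html lang="ja">\n'
        "<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n"
        "<body>\n"
        f"{body}\n"
        "</body>\n"
        "</html>\n"
    )
```

Saving the returned string to a .html file with UTF-8 encoding and opening it in the browser should be enough for a pop-up dictionary such as Rikaichan to work on the text.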

Any ideas or help will be truly appreciated.
by Evan (guest)  

Re: Scanning / Translation for Japanese 2021/1/27 14:57
I know that the Miraitranslation application (it's web based, no installation needed) is really good (at least compared to Google Translate, which often just comes up with gibberish). Miraitranslation can also handle PDF, PowerPoint and Excel files and create formatted output. I have a vague memory that it also worked for a scanned PDF. I can't test it now, as my account only works in Japan (long story), but their website can probably explain.
A user account costs a fee.
by LikeBike

Re: Scanning / Translation for Japanese 2021/1/28 00:12
I've never used OCR, but did find a "best 15" list on the internet.

It suggests that it's best to use software made in the country that uses the language you want to scan. For example, if you want to scan a Japanese book, you should use a Japanese OCR.

As for automatic translation, so far, I could find nothing better than DeepL.

Someone was recommending it on TV, hoping it would be useful for viewers who wish to keep up with news about the new virus. I'm a professional E-to-J / J-to-E translator, and it's so good that it's almost a joke. I mean, it even translates jokes well enough that you can properly laugh. I've only used the free version, but it seems that the Pro version lets you translate Word, PowerPoint and .txt files.

Note, however, that DeepL is so smart that when the original text is grammatically incorrect, it simply skips that part and produces a translation that only looks good as a whole. Hence, so far I haven't been able to use it to write translations, but it can often help you read, especially when your brain is tired.
by Uco

Re: Scanning / Translation for Japanese 2021/3/1 03:15
1. Try Japanese OCR software.
This also has the ability to translate into English, but like all J-to-E software, don't expect more than a rough/poor translation.

Others to try that have been on the market for years. Performance on the texts you read will differ.

No OCR is perfect, and you may need to get a scanner that'll give you good source images. Even then, you'll often find a few characters per page that you need to proofread and replace with the correct kanji.

For the scanner: a Fujitsu ScanSnap with an automatic sheet feeder if you cut up the books; otherwise, an overhead camera book scanner where you flip the pages.

2. Google Translate in camera mode on the phone. It's quick for a rough translation of a few characters or a sentence, but the output gets worse as the text becomes more complex/erudite/technical.
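
On the proofreading point under 1: one cheap way to narrow down where to look is to flag characters in the OCR output that don't belong in Japanese running text. The Python sketch below is only a heuristic with made-up function names, assuming the page is all Japanese (no Latin letters or ASCII digits). It will not catch a wrong kanji that is itself a valid kanji, so it doesn't replace reading the page against the scan.

```python
def is_expected_japanese(ch: str) -> bool:
    """True if ch falls in a Unicode range typical of Japanese text."""
    cp = ord(ch)
    return (
        ch.isspace()
        or 0x3000 <= cp <= 0x303F  # CJK symbols and punctuation
        or 0x3040 <= cp <= 0x309F  # hiragana
        or 0x30A0 <= cp <= 0x30FF  # katakana
        or 0x4E00 <= cp <= 0x9FFF  # CJK unified ideographs (kanji)
        or 0xFF00 <= cp <= 0xFFEF  # full-width and half-width forms
    )


def flag_suspects(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, character) for every character outside those ranges."""
    suspects = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if not is_expected_japanese(ch):
                suspects.append((line_no, col, ch))
    return suspects
```

For example, if the OCR misread a character as a vertical bar, flag_suspects would report its line and column so you can jump straight to that spot.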
by D (guest)