CiviCRM Community Forums (archive)


This forum was archived on 25 November 2017. Learn more.




Author Topic: Automatic translation with Google Translate  (Read 4791 times)

dragontree

  • Guest
Automatic translation with Google Translate
January 21, 2010, 10:19:23 pm
I'm finally at the point that I need to translate CiviCRM.
I thought it would be a good start to first use Google Translate to translate everything and then check all the strings and correct the wrong ones. After some searching I found out that this can be done with the Google API. Most of the scripts I found were written in Python, which I have never worked with.

One script can be found here: http://senko.net/services/googtext/index.html The online service didn't work for bigger .po files, so I downloaded the source, but I have no idea how to fix it or even how to find out what the problem is.

Now, if someone is willing to help me with this I would be very grateful.

Piotr Szotkowski

  • I live on this forum
  • *****
  • Posts: 1497
  • Karma: 57
Re: Automatic translation with Google Translate
January 22, 2010, 09:33:05 am
Hm, that sounds interesting – provided the strings are actually checked by someone before being marked as ‘translated’ (I have a long history of Google Translate mishaps).

I’ll take a look at the script sometime next week (we’re a bit overloaded now with the just-released CiviCRMs 3.0.4 and 3.1.beta6, as well as with CiviCRM 3.1.0 scheduled for sometime next week or so).
If you found the above helpful, please consider helping us in return – you can even steer CiviCRM’s future and help us extend CiviCRM in ways useful to you.

dragontree

  • Guest
Re: Automatic translation with Google Translate
January 23, 2010, 03:45:01 am
As far as I know, it's possible to add something like "#, fuzzy" to each translation, though I'm not sure how.

Not sure if I have a week, so I'll just try to figure this out myself. I'll let you know how it goes.

dragontree

  • Guest
Re: Automatic translation with Google Translate
January 25, 2010, 08:51:11 am
Had some trouble importing Python modules but got it working.
The script requires Python, of course, and also Django and Translate-Toolkit.
Gonna test it more tomorrow, but it seems to work.

There is something I could use some help with. All the autotranslated strings should be set as fuzzy, but I don't understand how to do that. The script should add "#, fuzzy" to all of them like this:
Code:
#: CRM/Core/Permission.php
#, fuzzy
msgid "profile create"
msgstr "CiviCRM profiili loomine"
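
For readers who want to see what the flag amounts to at the file level, here is a minimal sketch in plain Python 3 (the script in this thread is Python 2 and relies on Translate-Toolkit instead). The function name `mark_fuzzy` is made up for illustration. It assumes one PO entry is passed in as text, and it relies on the gettext convention that all flags of an entry share a single comma-separated "#," line.

```python
def mark_fuzzy(entry: str) -> str:
    """Return a single PO entry (as text) with the 'fuzzy' flag set."""
    lines = entry.splitlines()

    # An existing flag line ("#, php-format") gets 'fuzzy' merged into it.
    for i, line in enumerate(lines):
        if line.startswith("#,"):
            flags = [f.strip() for f in line[2:].split(",") if f.strip()]
            if "fuzzy" not in flags:
                flags.insert(0, "fuzzy")
            lines[i] = "#, " + ", ".join(flags)
            return "\n".join(lines)

    # No flag line yet: add one directly above msgctxt/msgid.
    for i, line in enumerate(lines):
        if line.startswith(("msgctxt", "msgid")):
            lines.insert(i, "#, fuzzy")
            return "\n".join(lines)

    return entry  # not a recognisable entry; leave it untouched


entry = (
    '#: CRM/Core/Permission.php\n'
    'msgid "profile create"\n'
    'msgstr "CiviCRM profiili loomine"'
)
print(mark_fuzzy(entry))
```

Calling it twice on the same entry does not duplicate the flag, so it should be safe to re-run over a partly flagged file.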

On line 3 I had to use an absolute path for some reason.
The script itself:
Code:
#!/usr/bin/python
import sys, os, re, urllib
sys.path.append('/usr/local/lib/python2.6/dist-packages/translate/storage/')

import po

from django.utils import simplejson
from htmlentitydefs import name2codepoint


def htmldecode(text):
    """Decode HTML entities in the given text."""
    if type(text) is unicode:
        uchr = unichr
    else:
        uchr = lambda value: value > 255 and unichr(value) or chr(value)

    def entitydecode(match, uchr=uchr):
        entity = match.group(1)
        if entity.startswith('#x'):
            return uchr(int(entity[2:], 16))
        elif entity.startswith('#'):
            return uchr(int(entity[1:]))
        elif entity in name2codepoint:
            return uchr(name2codepoint[entity])
        else:
            return match.group(0)

    charrefpat = re.compile(r'&(#(\d+|x[\da-fA-F]+)|[\w.:-]+);?')
    return charrefpat.sub(entitydecode, text)


def get_translation(sl, tl, text):
    """
    Response is in the format
    '{"responseData": {"translatedText":"Ciao mondo"}, "responseDetails": null, "responseStatus": 200}'
    """
    if text.startswith('"'): text = text[1:-1]
    params = {'v':'1.0', 'q': text.encode('utf-8')}
    try:
        result = simplejson.load(urllib.urlopen('http://ajax.googleapis.com/ajax/services/language/translate?%s&langpair=%s%%7C%s' % (urllib.urlencode(params), sl, tl)))
    except IOError, e:
        print e
        return ""
    else:
        try:
            status = result['responseStatus']
        except KeyError:
            status = -1
        if status == 200:
            return result['responseData']['translatedText']
        else:
            print "Error %s: Translating string %s" % (status, text)
            return ""


def translate_po(file, sl, tl):
    openfile = po.pofile(open(file))
    nb_elem = len(openfile.units)
    moves = 1
    cur_elem = 0
    for unit in openfile.units:
        # report progress
        cur_elem += 1
        s = "\r%f %% - (%d msg processed out of %d) " \
            % (100 * float(cur_elem) / float(nb_elem), cur_elem, nb_elem)
        sys.stderr.write(s)
        if not unit.isheader():
            if len(unit.msgid):
                if unit.msgstr==[u'""']:
                    moves += 1
                    unit.msgstr = ['"%s"' % htmldecode(get_translation(sl, tl, x)) for x in unit.msgid ]
        if not bool(moves % 50):
            print "Saving file..."
            openfile.save()
    openfile.save()


if __name__ == "__main__":

    if len(sys.argv) < 4 or \
       not os.path.exists(sys.argv[1]):
        sys.stderr.write("""
usage example: python autotranslate.py <lang.po> en fr
""")
        sys.exit(1)
    else:
        in_pofile = os.path.abspath(sys.argv[1])
        from_lang = sys.argv[2]
        to_lang = sys.argv[3]
        print('Translating %s to %s' %(from_lang,  to_lang))
        translate_po(in_pofile, from_lang, to_lang)
        print('Translation done')
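
A note for anyone adapting the script to Python 3: the `htmlentitydefs` module and the `unicode`/`unichr` builtins used by `htmldecode` above do not exist there. As a rough sketch, and not part of the original script, the standard library's `html.unescape` handles named, decimal and hexadecimal character references, so a stand-in could look like this:

```python
import html


def htmldecode(text: str) -> str:
    """Decode HTML entities, e.g. in text returned by a translation service."""
    return html.unescape(text)


print(htmldecode("Tom &amp; Jerry say &quot;hi&quot; &#39;again&#39;"))
```

Unrecognised entity names are left as they are, which matches what the original function does for names missing from `name2codepoint`.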



dragontree

  • Guest
Re: Automatic translation with Google Translate
January 25, 2010, 10:46:31 pm
Thinking with a clear head is so good  ;D

Marking everything as fuzzy is quite simple actually.
Just add one more line: unit.markfuzzy()

Code:
moves += 1
unit.markfuzzy()
unit.msgstr = ['"%s"' % htmldecode(get_translation(sl, tl, x)) for x in unit.msgid ]

After looking at some translated strings, it looks really promising. Most of the shorter strings are translated correctly. I'm still gonna have to check them all, but this should make translating a bit faster.

Piotr Szotkowski

  • I live on this forum
  • *****
  • Posts: 1497
  • Karma: 57
Re: Automatic translation with Google Translate
January 26, 2010, 06:37:09 am
Ah, very cool! I’ll try to take a look at this sometime this week, as I’d love to have this exercised before the FOSDEM CiviCRM camp in Brussels on February 8-9th.

dragontree

  • Guest
Re: Automatic translation with Google Translate
January 28, 2010, 05:04:08 am
Some more info on that script...

There are some problems. For some reason Google messes up escaped double quotes: it adds a space in the middle (\ "text...") or sometimes even removes the backslashes. That leaves syntax errors in the .po file. It's not an issue if you double-check every string with an editor like Poedit, though: after you mark a string as not fuzzy or edit it, the syntax error is fixed automatically (at least in Poedit).

These syntax errors don't really matter while you are translating the file, but they must be fixed before you compile the .mo file.
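
If the file is not going through Poedit string by string, the two kinds of damage described above could also be repaired in bulk. The following is only a sketch in plain Python 3. The function name `repair_quotes` is hypothetical, and it assumes it is given the text between the outer quotes of a msgstr line, not the whole line.

```python
import re


def repair_quotes(body: str) -> str:
    """Re-escape double quotes inside the body of a msgstr string."""
    # 1. Rejoin a backslash that was separated from its quote:  \ "  ->  \"
    body = re.sub(r'\\\s+"', r'\\"', body)
    # 2. Escape quotes that lost their backslash entirely:  "  ->  \"
    body = re.sub(r'(?<!\\)"', r'\\"', body)
    return body


print(repair_quotes(r'Click \ "Save\ " now'))
print(repair_quotes('Click "Save" now'))
```

One known limitation: a literal backslash that sits directly before a quote (written `\\"` in a .po file) would be misread by the second step, so the result is still worth checking before the .mo file is compiled.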


This forum was archived on 2017-11-26.