CiviCRM Community Forums (archive)


This forum was archived on 25 November 2017. Learn more.

Author Topic: How to find and merge "near duplicate" strings  (Read 2134 times)

Michael McAndrew

  • Forum Godess / God
  • I live on this forum
  • *****
  • Posts: 1274
  • Karma: 55
    • Third Sector Design
  • CiviCRM version: various
  • CMS version: Nearly always Drupal
  • MySQL version: 5.5
  • PHP version: 5.3
Re: How to find and merge "near duplicate" strings
December 01, 2014, 11:44:04 am
erm, well it is on the list...

mathieu

  • Administrator
  • Ask me questions
  • *****
  • Posts: 620
  • Karma: 36
    • Work
  • CiviCRM version: 4.7
  • CMS version: Drupal
  • MySQL version: MariaDB 10
  • PHP version: 7
Re: How to find and merge "near duplicate" strings
December 01, 2014, 12:19:36 pm
@Coleman: the "master" branch will be sent to Transifex when 4.6 is branched. If we sent strings regularly, we would risk adding/removing too many strings all the time, before review. Changes to 4.5 should be sent when new releases are done, but honestly I have not been doing it very regularly.

"Sentence case" vs "Title Case": having a style/standard would be great, but please avoid massively renaming existing strings (this will really annoy translators).

Michael McAndrew

  • Forum Godess / God
  • I live on this forum
  • *****
  • Posts: 1274
  • Karma: 55
    • Third Sector Design
  • CiviCRM version: various
  • CMS version: Nearly always Drupal
  • MySQL version: 5.5
  • PHP version: 5.3
Re: How to find and merge "near duplicate" strings
December 02, 2014, 03:53:12 am
Here is a start at User interface text standards: http://wiki.civicrm.org/confluence/display/CRM/User+interface+text

I also started collecting legacy interface guidelines here: http://wiki.civicrm.org/confluence/display/CRM/Legacy+interface+standard+pages

It is ripped off from https://www.drupal.org/node/604342.

Quote
please avoid massively renaming existing strings (this will really annoy translators).

Well, choosing sentence case over Title Case would do that, right? I do think it is a lot nicer, but that's only me. If there is no way to do it automatically, and it is really going to annoy translators, then maybe we should leave it.

If you want to take a quick look over what I have done and add stuff from this forum that you think is useful, that would be cool.

Coleman Watts

  • Administrator
  • I’m (like) Lobo ;)
  • *****
  • Posts: 2346
  • Karma: 183
  • CiviCRM version: The Bleeding Edge
  • CMS version: Various
Re: How to find and merge "near duplicate" strings
December 02, 2014, 01:16:44 pm
Thanks Michael for getting the ball rolling there.

My biggest concern remains that there is nothing to assist developers with this. IDEs do not understand ts() and so cannot do anything helpful like auto-suggesting existing strings as the developer types (man that would be cool!). The best I can think of would be to do some kind of PR-level testing to report on what new strings are being introduced by each commit (and ideally put up a red flag if they are very similar to any existing string). But I'm not really sure how to implement that either...
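
As a rough illustration of the "red flag" idea, here is a minimal sketch in Python using the standard library's difflib. It assumes the new and existing strings have already been extracted into plain lists; the function names and the 0.9 threshold are made up for this example and would need tuning against real data.

```python
import difflib


def normalize(s):
    """Lower-case and collapse whitespace so trivial differences compare equal."""
    return " ".join(s.lower().split())


def find_near_duplicates(new_strings, existing_strings, threshold=0.9):
    """Return (new, existing, ratio) for each new string that closely
    resembles, but is not identical to, an existing string."""
    flagged = []
    for new in new_strings:
        for old in existing_strings:
            if new == old:
                continue  # exact reuse is the desired outcome, not a near duplicate
            ratio = difflib.SequenceMatcher(
                None, normalize(new), normalize(old)).ratio()
            if ratio >= threshold:
                flagged.append((new, old, round(ratio, 2)))
    return flagged


# Illustrative data only; these are not taken from the real string catalogue.
existing = ["Add Contact", "Delete Contact", "Save and New"]
new = ["Add contact", "Export to CSV", "Save and new."]
for new_s, old_s, ratio in find_near_duplicates(new, existing):
    print("%r is very similar to existing %r (ratio %.2f)" % (new_s, old_s, ratio))
```

This compares every new string with every existing one, which is fine for the handful of strings in a single PR but would be slow if run over the whole catalogue at once.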

totten

  • Administrator
  • Ask me questions
  • *****
  • Posts: 695
  • Karma: 64
Re: How to find and merge "near duplicate" strings
December 03, 2014, 03:00:26 am
Quote from: Michael McAndrew on December 02, 2014, 03:53:12 am
Here is a start at User interface text standards: http://wiki.civicrm.org/confluence/display/CRM/User+interface+text

Cool. A lot of these are good points. Added some comments.

Quote from: Michael McAndrew on December 02, 2014, 03:53:12 am
Quote
please avoid massively renaming existing strings (this will really annoy translators).

Well, choosing sentence case over Title Case would do that, right? I do think it is a lot nicer, but that's only me. If there is no way to do it automatically, and it is really going to annoy translators, then maybe we should leave it.

Agree we shouldn't change things on a whim. But it seems like this would be a common enough problem ... that someone would have written a re-keying mechanism for use when the English-language string changes in trivial ways ...
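
For what it's worth, gettext's msgmerge already does fuzzy matching of changed source strings, which is close in spirit. As a hedged sketch of what a simple re-keying pass could look like, the Python below treats the catalogue as a plain dict (English string to translation), which is an assumption for illustration and not how Transifex or the .po files are actually stored. The function names are hypothetical, and the sample translations are illustrative only.

```python
import string


def trivial_key(s):
    """Reduce a string to a form that ignores case, spacing and ASCII punctuation."""
    kept = "".join(ch for ch in s.casefold() if ch not in string.punctuation)
    return " ".join(kept.split())


def rekey(translations, new_source_strings):
    """Carry translations over to source strings that changed only trivially.

    translations: dict mapping old English string -> translated string.
    Returns (rekeyed dict, list of strings that still need translating).
    """
    by_key = {}
    for old, translated in translations.items():
        # If two old strings collapse to the same key, keep the first one seen.
        by_key.setdefault(trivial_key(old), translated)

    rekeyed, untranslated = {}, []
    for new in new_source_strings:
        if new in translations:
            rekeyed[new] = translations[new]          # unchanged string
        elif trivial_key(new) in by_key:
            rekeyed[new] = by_key[trivial_key(new)]   # trivially changed string
        else:
            untranslated.append(new)                  # genuinely new string
    return rekeyed, untranslated


old_catalogue = {"Add Contact": "Ajouter un contact"}
rekeyed, todo = rekey(old_catalogue, ["Add contact", "Export"])
print(rekeyed)
print(todo)
```

Dropping punctuation is deliberately crude: "Contact's" and "Contacts" would collapse to the same key, so a real tool would probably mark carried-over translations for review instead of accepting them silently.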

Quote from: Coleman Watts on December 02, 2014, 01:16:44 pm
The best I can think of would be to do some kind of PR-level testing to report on what new strings are being introduced by each commit (and ideally put up a red flag if they are very similar to any existing string). But I'm not really sure how to implement that either...

Trying to break that down into smaller questions...

  • What's the baseline? -- The tool should report "new" strings, but compared to what? We could compare to "the current official code in the target branch" or "the strings in the last release" or "the strings currently known to Transifex" or "all of the above".
  • How to report? -- It's straightforward to add a new item to the Jenkins navbar (e.g. like "CiviBuild" in https://test.civicrm.org/job/CiviCRM-Core-Matrix/CIVIVER=4.4,label=test-debian6-1/433/ ) or to mark a build as *failed* (red). It's a bit more involved to display detailed messages in GitHub's UI.
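
To make the baseline question concrete, here is a minimal Python sketch of the "all of the above" option. It assumes each baseline is already available as a list of strings; the baseline names and function names are hypothetical.

```python
def new_strings(candidate, baselines):
    """Return the strings in candidate that are absent from every baseline, sorted."""
    known = set()
    for strings in baselines.values():
        known.update(strings)
    return sorted(set(candidate) - known)


def report(candidate, baselines):
    """Build a plain-text report: one summary line per baseline,
    then one line for each string that is new relative to all of them."""
    lines = []
    for name, strings in sorted(baselines.items()):
        missing = set(candidate) - set(strings)
        lines.append("%s: %d string(s) not present" % (name, len(missing)))
    for s in new_strings(candidate, baselines):
        lines.append("NEW: " + s)
    return "\n".join(lines)


# Illustrative data only.
baselines = {
    "last release": ["Add Contact"],
    "target branch": ["Add Contact", "Delete Contact"],
    "transifex": ["Add Contact"],
}
print(report(["Add Contact", "Delete Contact", "Export"], baselines))
```

A script like this could end with a non-zero exit status when the "NEW" list is non-empty, which is the usual way to turn a Jenkins shell step red. Whether new strings should fail the build or only be reported is a policy choice.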

Regarding PHPUnit -- suppose we had a rule like "strings must not include markup." A simple unit test might look like this (pseudocode):

Code: [Select]
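// bin/extract-strings is a placeholder for whatever tool extracts the strings.
class StringConformanceTest extends PHPUnit_Framework_TestCase { // PHPUnit\Framework\TestCase in PHPUnit 6+
  function testNoMarkup() {
    system("bin/extract-strings > /tmp/all-strings.txt");
    $strings = file("/tmp/all-strings.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    foreach ($strings as $string) {
      // strip_tags() leaves a markup-free string unchanged.
      $this->assertSame(strip_tags($string), $string, "Markup in: $string");
    }
  }
}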

If one wanted to run the same test but focus only on changed files, then maybe pseudocode like:

Code: [Select]
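$ git diff $targetBranch..HEAD > /tmp/changes.diff
$ env DIFF=/tmp/changes.diff ./scripts/phpunit StringConformanceTest

class StringConformanceTest extends PHPUnit_Framework_TestCase { // PHPUnit\Framework\TestCase in PHPUnit 6+
  function testNoMarkup() {
    // Limit extraction to the diff when DIFF is set; otherwise scan everything.
    $diffFile = getenv('DIFF');
    $args = $diffFile ? ' ' . escapeshellarg($diffFile) : '';
    system("bin/extract-strings$args > /tmp/all-strings.txt");
    $strings = file("/tmp/all-strings.txt", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    foreach ($strings as $string) {
      // strip_tags() leaves a markup-free string unchanged.
      $this->assertSame(strip_tags($string), $string, "Markup in: $string");
    }
  }
}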
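
The part the pseudocode leaves open is what "extract strings from a diff" would actually do. One possible reading, sketched in Python, is below. It assumes a unified diff as produced by git diff, and it only recognises ts() calls whose first argument is a simple quoted literal on a single line. Smarty {ts}...{/ts} blocks, concatenation and multi-line calls are not handled. The function names and regular expressions are assumptions for this sketch, not the behaviour of any existing extractor.

```python
import re

# ts('...') or ts("...") with a plain literal as the first argument.
TS_CALL = re.compile(r"""\bts\(\s*(['"])((?:\\.|(?!\1).)*)\1""")

# A crude test for HTML-style tags such as <b> or </a>.
MARKUP = re.compile(r"</?[A-Za-z][^>]*>")


def strings_added_in_diff(diff_text):
    """Return ts() strings found on lines added by a unified diff, in order."""
    found = []
    for line in diff_text.splitlines():
        # Added lines start with '+'; '+++ ' introduces a file header instead.
        if line.startswith("+") and not line.startswith("+++ "):
            for match in TS_CALL.finditer(line):
                found.append(match.group(2))
    return found


def contains_markup(s):
    """True if the string appears to contain an HTML tag."""
    return MARKUP.search(s) is not None


# Illustrative diff; the file name and strings are made up.
sample_diff = "\n".join([
    "--- a/CRM/Example.php",
    "+++ b/CRM/Example.php",
    "@@ -1,2 +1,3 @@",
    " $a = ts('Old string');",
    "-$b = ts('Removed');",
    "+$b = ts('Add contact');",
    "+$c = ts(\"Save <b>now</b>\");",
])
for s in strings_added_in_diff(sample_diff):
    print("%s -> markup: %s" % (s, contains_markup(s)))
```

Looking only at added lines means a string that already existed elsewhere and is merely reused will still be listed, so the result would still need to be compared against a baseline.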

joanne

  • Administrator
  • Ask me questions
  • *****
  • Posts: 852
  • Karma: 83
  • CiviCRM version: 4.4.16
  • CMS version: Drupal 7
Re: How to find and merge "near duplicate" strings
December 03, 2014, 02:42:18 pm
I realise the discussion has moved on from this, but I am sure I have met situations where having Title Case instead of Sentence case offered more flexibility in terms of word replacement.

This forum was archived on 2017-11-26.