CiviCRM Community Forums (archive)

*

News:

Have a question about CiviCRM?
Get it answered quickly at the new
CiviCRM Stack Exchange Q+A site

This forum was archived on 25 November 2017. Learn more.
How to get involved.
What to do if you think you've found a bug.



  • CiviCRM Community Forums (archive) »
  • Old sections (read-only, deprecated) »
  • Developer Discussion (Moderator: Donald Lobo) »
  • MySQL implementation of CRM_Utils_String::munge() ?
Pages: [1]

Author Topic: MySQL implementation of CRM_Utils_String::munge() ?  (Read 1868 times)

JoeMurray

  • Administrator
  • Ask me questions
  • *****
  • Posts: 578
  • Karma: 24
    • JMA Consulting
  • CiviCRM version: 4.4 and 4.5 (as of Nov 2014)
  • CMS version: Drupal, WordPress, Joomla
  • MySQL version: MySQL 5.5, 5.6, MariaDB 10.0 (as of Nov 2014)
MySQL implementation of CRM_Utils_String::munge() ?
June 12, 2013, 08:17:39 am
We are moving more and more to supporting machine name columns in our tables, which will facilitate migrating data between instances. Core has a good practice of calling a munge function as a standard way to consistently convert a string so that it is MySQL compatible as a table or field name:
Code: [Select]
/**
   * given a string, replace all non alpha numeric characters and
   * spaces with the replacement character
   *
   * @param string $name the name to be worked on
   * @param string $char the character to use for non-valid chars
   * @param int    $len  length of valid variables
   *
   * @access public
   *
   * @return string returns the manipulated string
   * @static
   */
  static function munge($name, $char = '_', $len = 63) {
    // replace all white space and non-alpha numeric with $char
    // we only use the ascii character set since mysql does not create table names / field names otherwise
    // CRM-11744
    $name = preg_replace('/[^a-zA-Z0-9]+/', $char, trim($name));

    if ($len) {
      // lets keep variable names short
      return substr($name, 0, $len);
    }
    else {
      return $name;
    }
  }

It would be useful for extensions to have a similar function that could be used for their install and uninstall sql functions. While it is possible to do this by using php to create and run the SQL, it's not perfect.

Wouldn't it be nice to have a MySQL function that could do this for us? I don't think we can rely on being able to have a library installed that provides preg_replace functionality in MySQL (like https://github.com/mysqludf/lib_mysqludf_preg#readme).

However, most of the time there is not a need to convert all characters, just the ones that are commonly found in a label that should be converted to a name. Here is a version that implements a very restricted set of replacements, and could be expanded a bit if this is deemed a useful idea:

Code: [Select]
mysql> create function CIVICRM_MUNGE (s CHAR(255)) RETURNS CHAR(255) LANGUAGE SQL DETERMINISTIC RETURN LEFT(REPLACE(REPLACE(TRIM(s),' ','_'),'-','_'),63);

mysql> SELECT CIVICRM_MUNGE('hello joe'), CIVICRM_MUNGE('my-variable'), CIVICRM_MUNGE('This doesn''t work!');
+----------------------------+------------------------------+--------------------------------------+
| CIVICRM_MUNGE('hello joe') | CIVICRM_MUNGE('my-variable') | CIVICRM_MUNGE('This doesn''t work!') |
+----------------------------+------------------------------+--------------------------------------+
| hello_joe                  | my_variable                  | This_doesn't_work!                   |
+----------------------------+------------------------------+--------------------------------------+
1 row in set (0.00 sec)

mysql>

A few more REPLACE functions could be added to handle more commonly used characters quite easily. A full MySQL implementation of preg_replace for our simple regex here would be longer but very easy to implement.

This requires the MySQL CREATE ROUTINE privilege. As we already are relying on this privilege for some BETA functionality in logging as well as CREATE VIEW privilege for i18n, I don't think it is out of the question to consider making this a dependency for some extensions, but I would welcome feedback on whether it is commonly available with hosting providers used by CiviCRM installs.

Co-author of Using CiviCRM https://www.packtpub.com/using-civicrm/book

xavier

  • Forum Godess / God
  • I’m (like) Lobo ;)
  • *****
  • Posts: 4453
  • Karma: 161
    • Tech To The People
  • CiviCRM version: yes probably
  • CMS version: drupal
Re: MySQL implementation of CRM_Utils_String::munge() ?
June 12, 2013, 09:20:29 am
Hi,

What is the limitation of using the php version? On the extension that have custom data, I'm calling the sql commands from the php and I haven't experienced problems.
-Hackathon and data journalism about the European parliament 24-26 jan. Watch out the result

Coleman Watts

  • Administrator
  • I’m (like) Lobo ;)
  • *****
  • Posts: 2346
  • Karma: 183
  • CiviCRM version: The Bleeding Edge
  • CMS version: Various
Re: MySQL implementation of CRM_Utils_String::munge() ?
June 12, 2013, 09:22:28 am
White lists are safer than black lists. Can't the function use RLIKE [_a-z0-9]
Try asking your question on the new CiviCRM help site.

JoeMurray

  • Administrator
  • Ask me questions
  • *****
  • Posts: 578
  • Karma: 24
    • JMA Consulting
  • CiviCRM version: 4.4 and 4.5 (as of Nov 2014)
  • CMS version: Drupal, WordPress, Joomla
  • MySQL version: MySQL 5.5, 5.6, MariaDB 10.0 (as of Nov 2014)
Re: MySQL implementation of CRM_Utils_String::munge() ?
June 12, 2013, 11:07:41 am
Xavier:

It's possible to use the PHP function, but we have lots of sql upgrade files like https://github.com/civicrm/civicrm-core/blob/master/CRM/Upgrade/Incremental/sql/4.4.alpha1.mysql.tpl that could benefit from this sort of a function. If we're mainly just executing insert statements, it's cleaner and easier to maintain if it is mysql statements rather than mysql wrapped in PHP.

Coleman:
I was thinking of looping through characters and doing the Replace when they were found. I could use RLIKE in that more complicated implementation.
Co-author of Using CiviCRM https://www.packtpub.com/using-civicrm/book

totten

  • Administrator
  • Ask me questions
  • *****
  • Posts: 695
  • Karma: 64
Re: MySQL implementation of CRM_Utils_String::munge() ?
June 29, 2013, 01:51:11 am
If you do this, I suggest writing a test case to ensure that the two munge functions always behave the same way, e.g. pseudocode:

Code: [Select]
<?php
/**
 * A list of example inputs/outputs for munge()
 * @return array (array(0=>input, 1=>expectedMungeOutput))
 */
function mungeValues() {
  return array(...);
}

/** 
 * @dataProvider mungeValues
 * @param string $input
 * @param string $expected expected munge output
 */
function testMungePHP($input, $expected) {
 
$this->assertEquals($expected, CRM_Utils_String::munge($input));
}

/** 
 * @dataProvider mungeValues
 * @param string $input
 * @param string $expected expected munge output
 */
function testMungeSQL($input, $expected) {
 
$this->assertEquals($expected, CRM_Core_DAO::singleValueQuery(
    
'SELECT civicrm_munge(%1)',
    array(
1 => array($input, 'String'))
  ));
}

Pages: [1]
  • CiviCRM Community Forums (archive) »
  • Old sections (read-only, deprecated) »
  • Developer Discussion (Moderator: Donald Lobo) »
  • MySQL implementation of CRM_Utils_String::munge() ?

This forum was archived on 2017-11-26.