CiviCRM Community Forums (archive)

This forum was archived on 25 November 2017.



  • CiviCRM Community Forums (archive) »
  • Old sections (read-only, deprecated) »
  • Developer Discussion (Moderator: Donald Lobo) »
  • New cache, file based

Author Topic: New cache, file based  (Read 2357 times)

xavier

  • Forum Godess / God
  • I’m (like) Lobo ;)
  • *****
  • Posts: 4453
  • Karma: 161
    • Tech To The People
  • CiviCRM version: yes probably
  • CMS version: drupal
New cache, file based
October 13, 2012, 02:59:55 pm
As discussed, I created a new file-based cache.
I have installed the Performance logging Drupal module, which logs, for each page, how much memory/time and how many SQL requests are needed (max and average):
https://drupal.org/project/performance

So the obvious problem is that once everything is in memory and MySQL has everything cached, the second page load is super fast, but that doesn't reflect real usage.

However, something that I find weirder: I'm switching between the default cache (ArrayCache), the new cache:
define('CIVICRM_DB_CACHE_CLASS', 'FileCache');
and the NoCache class:
define('CIVICRM_DB_CACHE_CLASS', 'NoCache');


Loading the summary view of a contact, or the edit screen of a contact, generates basically the same number of SQL requests no matter which cache is active (and more or less the same memory, same everything).

I am confused: either the cache isn't doing anything on these two screens (unlikely, it sets about 20 keys), or my methodology is wrong.

Is there a way to be sure CiviCRM queries are properly logged? Do I need a special config?

X+

P.S. I'm using serialize(); I quickly tried var_export() and will dig in to compare.

-Hackathon and data journalism about the European parliament 24-26 jan. Watch out the result

totten

  • Administrator
  • Ask me questions
  • *****
  • Posts: 695
  • Karma: 64
Re: New cache, file based
October 13, 2012, 03:15:36 pm
Quote from: xavier on October 13, 2012, 02:59:55 pm
As discussed, created a new file based cache.

Cool!

Quote from: xavier on October 13, 2012, 02:59:55 pm
Is there a way to be sure civicrm queries are properly logged? do I need a special config?

Have you tried CIVICRM_DEBUG_LOG_QUERY? It won't provide metrics -- you'd have to do that yourself, e.g.:
 * Before a test run, delete the log file
 * After a test run, use "grep" and "wc" to determine the number of queries

Quote from: xavier on October 13, 2012, 02:59:55 pm
P.S I'm using serialize, quickly tried var_export, will dig to compare

I don't have any data on performance, but I believe the var_export() approach can take advantage of opcode caching, while the serialize() approach requires an extra round of parsing on each page request.

To make the var_export approach work, you'd need something like this:

Code: [Select]
<?php
/**
 * Save $data to $cache_file.
 */
function save($cache_file, $data) {
  // Write a PHP file that returns the cached value when included.
  $code = "<?php\n  return " . var_export($data, TRUE) . ";\n";
  file_put_contents($cache_file, $code);
}

/**
 * Read $data from $cache_file.
 */
function load($cache_file) {
  return include $cache_file;
}

xavier

Re: New cache, file based
October 14, 2012, 02:05:15 am
Thx for the log tip, will try.

The problem with var_export() is that, when the value is an object, it generates code like this:

Code: [Select]
CRM_Core_Config::__set_state(array(
   'dsn' => 'mysql://trunk:trunk@localhost/trunk',
   'userFramework' => 'Drupal',
   'userFrameworkURLVar' => 'q',
   'userFrameworkDSN' => 'mysql://trunk:trunk@localhost/trunk',
   'userSystem' =>
  CRM_Utils_System_Drupal::__set_state(array(
     'is_drupal' => true,
     'is_joomla' => false,
     'is_wordpress' => false,
     'supports_form_extensions' => true,
  )),
   'templateCompileDir' => '/var/www/drupal/trunk/sites/default/files/civicrm/templates_c/en_US/',
   'configAndLogDir' => '/var/www/drupal/trunk/sites/default/files/civicrm/ConfigAndLog/',
   'initialized' => 0,
   'DAOFactoryClass' => 'CRM_Contact_DAO_Factory',
   'componentRegistry' =>
  CRM_Core_Component::__set_state(array(
...

So a __set_state() method would have to be added to each class, setting the right attributes from the array parameter.

I checked quickly and we don't have a common parent class for all of these classes.
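
For illustration, a generic __set_state() implementation could look like the sketch below. CacheableExample is a hypothetical class, not existing CiviCRM code; each real class would need something equivalent:

```php
<?php
/**
 * Hypothetical cacheable class, for illustration only.
 */
class CacheableExample {
  public $dsn;
  public $userFramework;

  /**
   * Called by the code that var_export() generates; rebuilds the
   * object from the exported array of property values.
   */
  public static function __set_state(array $properties) {
    $obj = new static();
    foreach ($properties as $name => $value) {
      $obj->$name = $value;
    }
    return $obj;
  }
}
```

A generic version like this could in principle live in a common parent or trait, but as noted above the classes involved don't share a parent.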

X+

xavier

Re: New cache, file based
October 14, 2012, 06:23:58 am
So, counting queries with:
Code: [Select]
grep -c 'Query = string' ConfigAndLog/md5logfile.log

Editing a contact (cleared cache):
ArrayCache: 81 queries
SerializeCache: 81

After cache:
ArrayCache: 162 queries (twice, as expected)
SerializeCache: 131 (so 50 queries on subsequent calls instead of 81)

It seems to generate plenty of cache files; some of them have super long keys (I had to md5 them, otherwise the filenames were too long).
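
That key-to-filename mapping might be sketched like this (the function name is illustrative, not the actual FileCache code):

```php
<?php
/**
 * Map an arbitrarily long cache key to a safe, fixed-length filename.
 */
function cacheFilePath($cacheDir, $key) {
  // md5() yields a fixed 32-char hex string, which sidesteps
  // filesystem limits on filename length and avoids unsafe
  // characters that may appear in cache keys.
  return rtrim($cacheDir, '/') . '/' . md5($key) . '.php';
}
```

The trade-off is that the original key is no longer readable from the filename, which makes manual inspection of the cache directory harder.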


Existing code issue: several places call flushCache with a param (presumably to clear only a part of the cache?):
CRM_Utils_System::flushCache('CRM_SMS_DAO_Provider');

but the method doesn't take any param, so it clears all caches every time:
CRM/Utils/System.php:  static function flushCache( )

Was that done on purpose?
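
If the intent was a targeted flush, the behaviour might look something like this hypothetical sketch (not the actual CiviCRM API; the cache is modelled as a plain array of groups for illustration):

```php
<?php
/**
 * Hypothetical targeted flush: clear one named cache group,
 * or everything when no group is given.
 */
function flushCache(array &$caches, $group = NULL) {
  if ($group !== NULL) {
    // Drop only that group's entries.
    unset($caches[$group]);
  }
  else {
    // No group given: clear all caches.
    $caches = [];
  }
}
```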

X+

xavier

Re: New cache, file based
October 14, 2012, 06:34:04 am
With the performance module, it seems that memory usage is the same between array and serialize.

As for execution speed: I think it mostly measures how things work when a single user on a dedicated machine loads a simple page. So it's fast because everything is already in memory and the disk isn't touched ;)

I don't know how to do a useful benchmark.

X+

mathieu

  • Administrator
  • Ask me questions
  • *****
  • Posts: 620
  • Karma: 36
    • Work
  • CiviCRM version: 4.7
  • CMS version: Drupal
  • MySQL version: MariaDB 10
  • PHP version: 7
Re: New cache, file based
October 15, 2012, 05:16:05 am
It's not really a benchmark, but I find that tracing requests with php-xdebug can be useful to see where CPU time is spent in PHP: https://wiki.koumbit.net/XdebugEn

(it's how I found the gettext performance issue for localised sites)
CiviCamp Montréal, 29 septembre 2017 | Co-founder / consultant / turn-key CiviCRM hosting for Quebec/Canada @ SymbioTIC.coop

xavier

Re: New cache, file based
October 15, 2012, 06:23:54 am
Cool, seems to be a great tool; I've never tried the xdebug profiler so far, only "normal" xdebug.

Here the load is spread between two processes: having the cache in files is likely to increase the CPU/IO on the PHP process, but it decreases the CPU/IO on the MySQL one.

The goal is to see if it's worthwhile.

Here you only see the PHP side, right?

totten

Re: New cache, file based
October 15, 2012, 12:08:27 pm
1. To what extent do we expect the performance to be consistent across different server environments?

For example, given one caching strategy (serialized records stored in MySQL), two different hosts may perform differently depending on (a) whether WWW and DBMS are on the same server, rack, data-center, or region, (b) whether the DBMS uses HDDs/SSDs, RAID-0/1/10, etc., (c) the size of internal caches/buffers within the DBMS, and (d) the actual workloads (e.g. #concurrent users, #databases, #competing processes). Given multiple caching strategies (serialized records in MySQL; serialized records in files; var_export; memcache; etc.), you get a combinatorial set of considerations. It seems quite likely that one environment/workload would show MySQL to be much slower than files -- but another environment/workload would show a tie or the opposite result.

Put another way: if a developer uses his laptop to demonstrate that page-loading with "read-file-and-deserialize" is faster than with "read-mysql-and-deserialize", then... what have we really proven? Is there (or is there not) a way to frame the test so that it provides useful conclusions for other environments?

2. Short of defining specific workloads and environments, the best I can think of is to isolate very small parts of the process and benchmark each, e.g.

 * Run unserialize($data) 10,000 times with 1k/10k/100k datasets -- and count milliseconds (php.net/microtime)
 * Run unserialize(SQL SELECT) 10,000 times with 1k/10k/100k datasets -- and count milliseconds (php.net/microtime)
 * Run unserialize(file_get_contents()) 10,000 times with 1k/10k/100k datasets -- and count milliseconds (php.net/microtime)
 * Run "include" (with files produced by var_export) 10,000 times with 1k/10k/100k datasets -- and count milliseconds (php.net/microtime)

(To mitigate interference from OS/DBMS caches, you might use 10,000 different files or 10,000 different database rows -- or, if you want to assume optimistically that the OS/DBMS caches are working, then you could choose to reuse the same files/database-rows.)
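
The first of those micro-benchmarks could be sketched like this (a minimal harness; the payload sizes and iteration counts are illustrative, and the other variants would swap in file_get_contents(), a SQL SELECT, or an include):

```php
<?php
/**
 * Time unserialize() over many iterations for a given payload size.
 * Returns elapsed wall-clock time in milliseconds.
 */
function benchUnserialize($payloadBytes, $iterations) {
  // Build a serialized payload of roughly the requested size.
  $data = serialize(str_repeat('x', $payloadBytes));
  $start = microtime(TRUE);
  for ($i = 0; $i < $iterations; $i++) {
    unserialize($data);
  }
  return (microtime(TRUE) - $start) * 1000;
}

// Example: 1k/10k/100k payloads, 10,000 iterations each.
foreach ([1000, 10000, 100000] as $size) {
  printf("%6d bytes: %.1f ms\n", $size, benchUnserialize($size, 10000));
}
```

As noted above, a single-string payload is the optimistic case; real cache entries are nested arrays/objects, so it would be worth repeating the run with a representative structure.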
