CiviCRM Community Forums (archive)

*

News:

Have a question about CiviCRM?
Get it answered quickly at the new
CiviCRM Stack Exchange Q+A site

This forum was archived on 25 November 2017. Learn more.
How to get involved.
What to do if you think you've found a bug.



  • CiviCRM Community Forums (archive) »
  • Old sections (read-only, deprecated) »
  • Support »
  • Using CiviCRM »
  • Using Import (Moderator: Yashodha Chaku) »
  • Importing two record CSV file causes fatal memory limit error
Pages: [1]

Author Topic: Importing two record CSV file causes fatal memory limit error  (Read 2077 times)

dalin

  • I post occasionally
  • **
  • Posts: 89
  • Karma: 8
  • CiviCRM version: many
  • CMS version: Drupal 6
  • MySQL version: 5.0
  • PHP version: 5.2
Importing two record CSV file causes fatal memory limit error
January 23, 2011, 10:59:25 pm
The CSV is very simple and contains only two records.  Fields:
email - home
first
last
address - home
city - home
state - home
postal code - home
gender
source
1 custom field

Memory limit is 96M.  Error happens almost immediately upon starting the import (not the preview) and regardless of which dedupe option is chosen.  Geo-coding on import was not chosen. Installation contains about 700k contacts.   If I enable the devel module it shows that steps 1-3 are only using ~15-20MB.

I then hard-coded the following at the end of CRM_Import_ImportJob->__construct()
ini_set('memory_limit', '192M');

On my laptop the import (Using "fill" deduping) proceeds, but it pins the CPU and takes > 15 mins.  top and mytop shows no active queries, it's all on the PHP side. But on the server it still gets a memory limit fatal error. On the server I had to increase to 512M!

Both the strict and default fuzzy dedupe rules are just email address.

CiviCRM 3.3.1
--
Dave Hansen-Lange
Web Developer
Advomatic LLC
http://advomatic.com
Hong Kong office

xavier

  • Forum Godess / God
  • I’m (like) Lobo ;)
  • *****
  • Posts: 4453
  • Karma: 161
    • Tech To The People
  • CiviCRM version: yes probably
  • CMS version: drupal
Re: Importing two record CSV file causes fatal memory limit error
January 23, 2011, 11:08:01 pm
hi,

Imports don't scale that well, but never had problems under a few 1000th records.

Is there something different in your install (eg. chinese as default) or in the file ?

Can you anonymise the records and try on demo so we can exclude a config problem?

-Hackathon and data journalism about the European parliament 24-26 jan. Watch out the result

dalin

  • I post occasionally
  • **
  • Posts: 89
  • Karma: 8
  • CiviCRM version: many
  • CMS version: Drupal 6
  • MySQL version: 5.0
  • PHP version: 5.2
Re: Importing two record CSV file causes fatal memory limit error
January 23, 2011, 11:51:31 pm
Well I assume that the problem is somehow related to the large number of contacts already in CiviCRM (700k).  Note that the CSV file that I'm importing is only 2 contacts. 

Unless I'm missing something obvious I don't see how I could replicate this on the demo site.  I'd need to anonymize the 700k records, split them into a few hundred CSV files, and import the files one-by-one. 
--
Dave Hansen-Lange
Web Developer
Advomatic LLC
http://advomatic.com
Hong Kong office

xavier

  • Forum Godess / God
  • I’m (like) Lobo ;)
  • *****
  • Posts: 4453
  • Karma: 161
    • Tech To The People
  • CiviCRM version: yes probably
  • CMS version: drupal
Re: Importing two record CSV file causes fatal memory limit error
January 24, 2011, 12:39:14 am
Hi,

What version are you on? The dedupe process has been improved recently.

(ie. can you make a normal dedupe without freezing the system)?

Can you log the slow queries on your laptop?

X+

P.S. Yeap, replied to you before my first coffee and missed the k after the 700 ;(
-Hackathon and data journalism about the European parliament 24-26 jan. Watch out the result

dalin

  • I post occasionally
  • **
  • Posts: 89
  • Karma: 8
  • CiviCRM version: many
  • CMS version: Drupal 6
  • MySQL version: 5.0
  • PHP version: 5.2
Re: Importing two record CSV file causes fatal memory limit error
January 24, 2011, 12:44:44 am
CiviCRM 3.3.1

There's no slow queries, it's all on the PHP side.

Yes I can do a dedupe with the default rules with no problems.  If I run a dedupe using the group that holds just the two contacts in this import it takes < 30 seconds.  If I run a dedupe of all 700k contacts it also runs in about 30 seconds. 

So it's not anything to do with deduping, it's the import that's the issue.
--
Dave Hansen-Lange
Web Developer
Advomatic LLC
http://advomatic.com
Hong Kong office

xavier

  • Forum Godess / God
  • I’m (like) Lobo ;)
  • *****
  • Posts: 4453
  • Karma: 161
    • Tech To The People
  • CiviCRM version: yes probably
  • CMS version: drupal
Re: Importing two record CSV file causes fatal memory limit error
January 24, 2011, 01:52:50 am
No idea then (btw, I use innotop instead of mytop, find the output more useful).

Can you do a last test I can think of is to do the import without doing any dedupe check?

The import is fairly simple, it creates a temporary table with the data to be imported, read them, check the dedupe and insert the record. Fill is obviously the most complicated of the modes, but not that complicated. Trying to exclude the obvious.

Do you have xdebug installed on your laptop ? The profiling might give you ideas of what the process is doing
X+
-Hackathon and data journalism about the European parliament 24-26 jan. Watch out the result

dalin

  • I post occasionally
  • **
  • Posts: 89
  • Karma: 8
  • CiviCRM version: many
  • CMS version: Drupal 6
  • MySQL version: 5.0
  • PHP version: 5.2
Re: Importing two record CSV file causes fatal memory limit error
January 24, 2011, 11:05:00 pm
It was quite difficult to get profiling working with CiviCRM until I discovered that you could initiate it with a cookie rather than just the usual URL query parameter.  Once I got profiling working it was even tougher to work with a grind of a 15 min process.  The grind was > 1.2GB and every click in wincachegrind took 10 mins to process. 

I did manage to narrow down the problem:
- PEAR DB is kinda crappy
- The import process rebuilds the ACL cache both before and after the import even if ACLs are not being used. 
- The ACL cache needs an architectural overhaul - it won't scale.

http://issues.civicrm.org/jira/browse/CRM-7444
--
Dave Hansen-Lange
Web Developer
Advomatic LLC
http://advomatic.com
Hong Kong office

xavier

  • Forum Godess / God
  • I’m (like) Lobo ;)
  • *****
  • Posts: 4453
  • Karma: 161
    • Tech To The People
  • CiviCRM version: yes probably
  • CMS version: drupal
Re: Importing two record CSV file causes fatal memory limit error
January 24, 2011, 11:17:56 pm
I did manage to narrow down the problem:
Quote from: dalin on January 24, 2011, 11:05:00 pm
- PEAR DB is kinda crappy

Can't say I really disagree ;). Did you send them your patch already ?

Quote from: dalin on January 24, 2011, 11:05:00 pm
- The import process rebuilds the ACL cache both before and after the import even if ACLs are not being used. 
- The ACL cache needs an architectural overhaul - it won't scale.
http://issues.civicrm.org/jira/browse/CRM-7444

I'm impressed by your work on that one. What's the time/memory used after all these improvments ?

X+
-Hackathon and data journalism about the European parliament 24-26 jan. Watch out the result

dalin

  • I post occasionally
  • **
  • Posts: 89
  • Karma: 8
  • CiviCRM version: many
  • CMS version: Drupal 6
  • MySQL version: 5.0
  • PHP version: 5.2
Re: Importing two record CSV file causes fatal memory limit error
January 24, 2011, 11:58:37 pm
Quote
I'm impressed by your work on that one. What's the time/memory used after all these improvments ?

Time was down to a few seconds.  I didn't check the memory usage.
--
Dave Hansen-Lange
Web Developer
Advomatic LLC
http://advomatic.com
Hong Kong office

Donald Lobo

  • Administrator
  • I’m (like) Lobo ;)
  • *****
  • Posts: 15963
  • Karma: 470
    • CiviCRM site
  • CiviCRM version: 4.2+
  • CMS version: Drupal 7, Joomla 2.5+
  • MySQL version: 5.5.x
  • PHP version: 5.4.x
Re: Importing two record CSV file causes fatal memory limit error
January 25, 2011, 06:47:32 am

so was rebuilding the ACL cache taking 10+ mins? Any thoughts on what are the major bottlenecks in the ACL cache code?

D7's DBTNG is being made independent of Drupal. It already has its own independent github repo.

Would be great if we can replace PEAR DB/DBO with DBTNG. Any interest in leading this effort? Would be great for a group of community developers to take this on and start working on the switch.

lobo


A new CiviCRM Q&A resource needs YOUR help to get started. Visit our StackExchange proposed site, sign up and vote on 5 questions

dalin

  • I post occasionally
  • **
  • Posts: 89
  • Karma: 8
  • CiviCRM version: many
  • CMS version: Drupal 6
  • MySQL version: 5.0
  • PHP version: 5.2
Re: Importing two record CSV file causes fatal memory limit error
January 25, 2011, 04:05:14 pm
Quote from: Donald Lobo on January 25, 2011, 06:47:32 am
so was rebuilding the ACL cache taking 10+ mins? Any thoughts on what are the major bottlenecks in the ACL cache code?

Yes it was taking about 8mins to rebuild * 2.  The major performance bottleneck is the fact that we have an array with 700k elements and we do a loop to break pieces of 200 off the front.  This is very CPU intensive, especially for a large array.  It would be much faster to simply read from the array rather than continually change it.  The memory problem is caused by simply having the huge array.  It would be fastest to do the whole process purely in SQL (but of course that would be difficult to maintain).  Second best might be to encapsulate the whole thing in another loop to iterate over batches of say 1000 or so.  But this brings up a bigger point that CiviCRM really needs Batch and Queue APIs a la Drupal. 

Quote
D7's DBTNG is being made independent of Drupal. It already has its own independent github repo.

Would be great if we can replace PEAR DB/DBO with DBTNG. Any interest in leading this effort? Would be great for a group of community developers to take this on and start working on the switch.

That would be real sweet.  I've really enjoyed working with DBTNG so far.  Just the fact that it's very well documented would be a huge improvement.  But I don't think I'm able to increase my current commitment level (which currently averages to be about a patch every month or two). 
--
Dave Hansen-Lange
Web Developer
Advomatic LLC
http://advomatic.com
Hong Kong office

Donald Lobo

  • Administrator
  • I’m (like) Lobo ;)
  • *****
  • Posts: 15963
  • Karma: 470
    • CiviCRM site
  • CiviCRM version: 4.2+
  • CMS version: Drupal 7, Joomla 2.5+
  • MySQL version: 5.5.x
  • PHP version: 5.4.x
Re: Importing two record CSV file causes fatal memory limit error
January 25, 2011, 05:13:03 pm

I dont have the time to test it, but the ACL DB code can potentially be optimized by this patch. This moves the work to the mysql server and hence also saves memory.

Code: [Select]
Index: CRM/Contact/BAO/Contact/Permission.php
===================================================================
--- CRM/Contact/BAO/Contact/Permission.php (revision 32023)
+++ CRM/Contact/BAO/Contact/Permission.php (working copy)
@@ -133,25 +133,12 @@
         $from       = CRM_Contact_BAO_Query::fromClause( $whereTables );
 
         $query = "
-SELECT DISTINCT(contact_a.id) as id
+DELETE FROM civicrm_acl_contact_cache WHERE user_id = $userID;
+INSERT INTO civicrm_acl_contact_cache ( user_id, contact_id, operation )
+SELECT $userID, contact_a.id, '{$operation}'
        $from
 WHERE $permission
 ";
 
-        $values = array( );
-        $dao = CRM_Core_DAO::executeQuery( $query );
-        while ( $dao->fetch( ) ) {
-            $values[] = "( {$userID}, {$dao->id}, '{$operation}' )";
-        }
-
-        // now store this in the table
-        while ( ! empty( $values ) ) {
-            $processed = true;
-            $input = array_splice( $values, 0, self::NUM_CONTACTS_TO_INSERT );
-            $str   = implode( ',', $input );
-            $sql = "REPLACE INTO civicrm_acl_contact_cache ( user_id, contact_id, operation ) VALUES $str;";
-            CRM_Core_DAO::executeQuery( $sql );
-        }
         CRM_Core_DAO::executeQuery('DELETE FROM civicrm_acl_contact_cache WHERE contact_id IN (SELECT id FROM civicrm_contact WHERE is_deleted = 1)');
 
         $_processed[$userID] = 1;

2. We are on the lookout for folks interested in helping decide/lead/participate in the DBTNG conversion process. It is a fair amount of work and we do need help and support from consulting orgs and the community to make this happen

lobo
A new CiviCRM Q&A resource needs YOUR help to get started. Visit our StackExchange proposed site, sign up and vote on 5 questions

Pages: [1]
  • CiviCRM Community Forums (archive) »
  • Old sections (read-only, deprecated) »
  • Support »
  • Using CiviCRM »
  • Using Import (Moderator: Yashodha Chaku) »
  • Importing two record CSV file causes fatal memory limit error

This forum was archived on 2017-11-26.