CiviCRM Community Forums (archive)

This forum was archived on 25 November 2017.

Author Topic: command-line CSV import uses ridiculous RAM  (Read 1326 times)

JonGold

  • Ask me questions
  • ****
  • Posts: 638
  • Karma: 81
    • Palante Technology
  • CiviCRM version: 4.1 to the latest
  • CMS version: Drupal 6-7, Wordpress 4.0+
  • PHP version: PHP 5.3-5.5
command-line CSV import uses ridiculous RAM
January 08, 2015, 01:04:13 pm
I'm a big fan of the command-line CSV import.  As someone who specializes in migrations, the ability to script my imports is crucial.

I've noticed that on large imports, the amount of memory PHP uses gets VERY high - I'm at the 2GB mark about 24,000 contacts into an import.  I'm happy to hack on this code (I'm probably one of its heaviest users) but I was wondering if anyone else has taken a look, or could give me some tips?  I haven't tried to optimize PHP scripts for memory before.
Sign up to StackExchange and get free expert CiviCRM advice: https://civicrm.org/blogs/colemanw/get-exclusive-access-free-expert-help

Donald Lobo

  • Administrator
  • I’m (like) Lobo ;)
  • *****
  • Posts: 15963
  • Karma: 470
    • CiviCRM site
  • CiviCRM version: 4.2+
  • CMS version: Drupal 7, Joomla 2.5+
  • MySQL version: 5.5.x
  • PHP version: 5.4.x
Re: command-line CSV import uses ridiculous RAM
January 08, 2015, 02:05:55 pm

I've not seen the script, but an easy way to reduce memory requirements is:

a: Modify the script to accept "OFFSET, LIMIT" parameters (i.e. process LIMIT rows starting from line OFFSET)

b: Write a wrapper script to "exec" the modified script with the appropriate OFFSET, LIMIT parameters based on the number of rows in the file

Something like 2048 or 4096 might be a good value for LIMIT
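
A minimal sketch of the wrapper described in (b), written in Python for brevity. It assumes a hypothetical modified import script, csv_import_chunk.php, that accepts --file, --offset and --limit flags; neither the script name nor the flags come from CiviCRM itself. Each batch runs in a fresh PHP process, so any memory leaked during a batch is released when that process exits.

```python
import csv
import subprocess


def count_data_rows(csv_path):
    """Count CSV records after the header (quoted multi-line fields count once)."""
    with open(csv_path, newline="") as handle:
        return max(sum(1 for _ in csv.reader(handle)) - 1, 0)


def batch_ranges(total_rows, limit=2048):
    """Yield (offset, limit) pairs that together cover total_rows data rows."""
    for offset in range(0, total_rows, limit):
        yield offset, min(limit, total_rows - offset)


def build_command(csv_path, offset, limit, script="csv_import_chunk.php"):
    # Hypothetical script name and flags: adjust to whatever the modified
    # import script actually accepts.
    return ["php", script, "--file", csv_path,
            "--offset", str(offset), "--limit", str(limit)]


def run_import(csv_path, limit=2048):
    """Run one PHP process per batch; stop at the first failing batch."""
    for offset, size in batch_ranges(count_data_rows(csv_path), limit):
        subprocess.run(build_command(csv_path, offset, size), check=True)
```

Calling run_import("contacts.csv", limit=4096) would then launch one PHP process per 4096-row batch. Stopping on the first failure (check=True) makes it easy to see which offset to resume from.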

The other option is to find where memory is being leaked (most likely DB_DataObject) and then make sure the leaked objects are freed at the right time (which might mean some fairly deep dives into parts of the codebase)

lobo
A new CiviCRM Q&A resource needs YOUR help to get started. Visit our StackExchange proposed site, sign up and vote on 5 questions

JonGold

Re: command-line CSV import uses ridiculous RAM
January 08, 2015, 03:42:13 pm
Thank you Lobo!  I was thinking about something along the lines of the OFFSET, LIMIT approach.  I think I remember that the script bypasses SQL - calling fgetcsv() and then the API - but I'm glad to hear from someone more experienced that the general approach is correct.
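
To illustrate the slicing such a script would need, here is a small Python sketch of reading LIMIT rows starting at OFFSET. The function name iter_batch is made up for illustration; the PHP script would do the equivalent with fgetcsv() and pass each row to the API.

```python
import csv
from itertools import islice


def iter_batch(handle, offset, limit):
    """Yield up to `limit` data rows as dicts, skipping the first `offset` rows."""
    reader = csv.DictReader(handle)  # the first line is treated as the header
    # islice still reads the skipped rows, but discards them one at a time,
    # so memory use does not grow with the offset.
    yield from islice(reader, offset, offset + limit)
```

With an open file handle, iter_batch(handle, 2048, 2048) would yield the second batch of 2048 rows. Skipping rows costs read time on every run, which is usually acceptable next to the cost of the API calls.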

xavier

  • Forum Godess / God
  • I’m (like) Lobo ;)
  • *****
  • Posts: 4453
  • Karma: 161
    • Tech To The People
  • CiviCRM version: yes probably
  • CMS version: drupal
Re: command-line CSV import uses ridiculous RAM
January 09, 2015, 04:19:09 am
We try to unset the DAO objects in the API call; one might be missing. What entity are you using? Contact?

With xdebug on, you might get a better view of where the memory is used and what to unset to improve it.

Quickly looking at bin/cli.class.php, nothing stands out as a variable that wouldn't be released; it might be something in the lower levels (api/bao/dao...)

X+
-Hackathon and data journalism about the European parliament 24-26 jan. Watch out the result

JonGold

Re: command-line CSV import uses ridiculous RAM
January 09, 2015, 08:32:21 am
Xavier,

I've noticed this behavior on both Contact and Contribution import.  That suggests to me that perhaps the DAO is the place to look, since the auto-generated code is more likely to have a shared problem - not sure if that thinking is sound!

I've used XDebug for step-by-step debugging, but not for profiling.  I'll put this on my list for when work quiets down.

xavier

Re: command-line CSV import uses ridiculous RAM
January 09, 2015, 09:00:21 am
Grr, not sure then. The API is supposed to take extra care to release memory in these frequently used APIs. That said, PEAR DB_DataObject has an unhealthy love of stuffing things into global variables...

magnolia61

  • I post occasionally
  • **
  • Posts: 37
  • Karma: 0
  • CiviCRM version: 4.5.5
  • CMS version: Drupal 7.34 / Joomla 3.3.6
  • MySQL version: MySQL 5.5.40
  • PHP version: PHP 5.5.19
Re: command-line CSV import uses ridiculous RAM
March 16, 2015, 05:43:56 am
Hello Xavier,
I run into similar problems when importing from the GUI.
Would you have an example of a CSV command-line import script that we can use as a starting point?

Regards, Richard

JonGold

Re: command-line CSV import uses ridiculous RAM
March 16, 2015, 02:30:08 pm
Richard,

Check this out, and then let us know if you have any other questions!
https://civicrm.org/blogs/xavier/api_batch_tools


This forum was archived on 2017-11-26.