CiviCRM Community Forums (archive)


This forum was archived on 25 November 2017.



  • CiviCRM Community Forums (archive) »
  • Old sections (read-only, deprecated) »
  • Support »
  • Using CiviCRM »
  • Using Import (Moderator: Yashodha Chaku) »
  • Writing custom import program.

Author Topic: Writing custom import program.  (Read 3543 times)

nmiracle

  • Guest
Writing custom import program.
August 01, 2008, 02:53:48 pm
We are new users of CiviCRM and have a database of about a million records we need to import. Our database is in MySQL on the same server where we are running CiviCRM.

Initially, I wrote a program that dumped the DB into CSV files that were each just under 2 MB, and we have proven that those import easily. Unfortunately, there are almost 300 of them, and bringing them in is REALLY boring and very slow.

I ended up writing a little script that reads our database and inserts the records directly into the CiviCRM tables civicrm_contact, civicrm_address, civicrm_phone and civicrm_email, as well as the tables for the custom fields that we added. The script was written in Perl, and my original question to the group here was about the field civicrm_contact.hash (specifically, what should be there -- it looked like a random value).

For whatever it is worth, I ended up using Digest::MD5::md5_hex on a field that held our original file key (and which is known to be unique) and the records seem to work fine in the application.
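
The hashing step described above can be sketched as follows. This is a minimal illustration in Python rather than the Perl the script was written in; the function name and the sample keys are hypothetical, and the only detail taken from the post is that the hash is the MD5 hex digest of a key known to be unique.

```python
import hashlib


def contact_hash(legacy_key: str) -> str:
    """Return the 32-character MD5 hex digest of a unique legacy key.

    Hypothetical helper: mirrors the idea of calling
    Digest::MD5::md5_hex on the original file key.
    """
    return hashlib.md5(legacy_key.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Sample keys are made up for illustration only.
    for key in ("A-000001", "A-000002"):
        print(key, contact_hash(key))
```

Because the digest is derived from a key that is already unique, two different source records would only share a hash in the event of an MD5 collision.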

The import process, done this way, seems to move along pretty well -- about a thousand records per minute.  It won't be finished until morning, but it does seem to be quite a bit faster than our earlier approach with the csv files.

Best,
Nancy Miracle
« Last Edit: August 01, 2008, 08:56:07 pm by nmiracle »

Donald Lobo

  • Administrator
  • I’m (like) Lobo ;)
  • *****
  • Posts: 15963
  • Karma: 470
    • CiviCRM site
  • CiviCRM version: 4.2+
  • CMS version: Drupal 7, Joomla 2.5+
  • MySQL version: 5.5.x
  • PHP version: 5.4.x
Re: Need information to write a custom import program.
August 01, 2008, 08:58:13 pm

I deleted the duplicate forum post since this is a better place.

1. Your understanding of what is going on is correct.

2. Yes, we only care about uniqueness. We use this hash in CiviMail and in profiles when generating URLs.

There are a couple of other forum topics about optimizing import

http://forum.civicrm.org/index.php/topic,4148.html
http://forum.civicrm.org/index.php/topic,4105.msg18178.html#msg18178

I think driving the import from a shell script is fairly easy and makes cases like yours more manageable.

lobo
A new CiviCRM Q&A resource needs YOUR help to get started. Visit our StackExchange proposed site, sign up and vote on 5 questions

nmiracle

  • Guest
Re: Writing custom import program.
August 01, 2008, 09:24:57 pm
Thank you very much! I wasn't quite sure where to put the post, and I apologize for having it in two places.

That first forum topic (4148) sounds interesting, but I'm not enough of a PHP coder to understand all the niceties of what is being discussed there. The idea of copying from another MySQL table to a temporary table in a known standard format and then using a standard processing module seems very reasonable (if I understand what is being said there).

I'm not sure I understand 'the new system will apply transformations to the entire table at once whenever possible'. Is this (for instance) the sort of case where, if the source database used char(1) fields holding "Y" or "N" and the equivalent field in CiviCRM is Boolean, you would use an SQL statement such as "UPDATE database.tablename SET targetField='1' WHERE targetField='Y'" or something like that?
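
To make the question concrete, here is a rough sketch of a whole-table transformation, assuming that is what the linked topic means. It uses Python's built-in sqlite3 module with an in-memory table so it runs anywhere; the table and column names (staging_contact, wants_mail_raw, wants_mail) are hypothetical and are not part of the CiviCRM schema.

```python
import sqlite3


def convert_flags(conn: sqlite3.Connection) -> None:
    """Convert every 'Y'/'N' flag in the staging table with one statement.

    A single set-based UPDATE touches all rows at once, instead of
    looping over the records and updating them one by one.
    Values other than 'Y' or 'N' are left as NULL.
    """
    conn.execute(
        "UPDATE staging_contact "
        "SET wants_mail = CASE wants_mail_raw "
        "WHEN 'Y' THEN 1 WHEN 'N' THEN 0 END"
    )
    conn.commit()


# Hypothetical staging table standing in for the temporary import table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staging_contact ("
    "id INTEGER PRIMARY KEY, wants_mail_raw TEXT, wants_mail INTEGER)"
)
conn.executemany(
    "INSERT INTO staging_contact (wants_mail_raw) VALUES (?)",
    [("Y",), ("N",), ("Y",)],
)
convert_flags(conn)
rows = [r[0] for r in conn.execute(
    "SELECT wants_mail FROM staging_contact ORDER BY id"
)]
```

The same UPDATE should carry over to MySQL largely unchanged, since CASE expressions are standard SQL, but that has not been checked against a CiviCRM database here.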

Again, thanks very much for the help.
Nancy

jcims

  • Guest
Re: Writing custom import program.
September 17, 2008, 05:34:32 am
Hi Nancy,

Did you find it relatively straightforward to import the records into the database directly? I'm having some significant performance (and resulting reliability) issues with import, and since I'm just 'bootstrapping' the database, I have the freedom to ignore dupe checking and other features that the import process provides. Thank you for sharing your notes; the hash detail will be very useful.

jcims

  • Guest
Re: Writing custom import program.
September 17, 2008, 05:40:32 am
Never mind, Nancy,

I found your comments in the linked thread above...  Thank you for all the detail you've provided.

kevcol

  • I’m new here
  • *
  • Posts: 9
  • Karma: 0
Re: Need information to write a custom import program.
September 21, 2009, 12:53:20 pm
Quote from: Donald Lobo on August 01, 2008, 08:58:13 pm


2. yes, we only care about uniqueness. We use this hash in civimail and in profile when generating urls


lobo

So is there a good way to guarantee uniqueness?  We know the data we're importing is unique (using MD5 to create 32-character hash id), but when we add new contacts in the future, does CiviCRM check to make sure it's not generating a number we manually generated already?  Odds are pretty slim on that, I suppose, but I imagine things could get screwy if we had duplicates.


Donald Lobo

Re: Writing custom import program.
September 21, 2009, 01:57:41 pm

The columns (hash and external identifier) in the civicrm_contact table have a unique constraint on them. So the DB takes care of this for us :)

Most of the code does check that no duplicate exists before inserting a record
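
As an illustration of letting the database enforce uniqueness, here is a minimal sketch using Python's built-in sqlite3 module. The contact_demo table is a hypothetical stand-in, not the real civicrm_contact definition, and the helper name is made up; the point is only that a second insert of the same hash is rejected by the constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical demo table; only the UNIQUE constraint matters here.
conn.execute(
    "CREATE TABLE contact_demo (id INTEGER PRIMARY KEY, hash TEXT UNIQUE)"
)


def try_insert(conn: sqlite3.Connection, hash_value: str) -> bool:
    """Insert one row; return False if the UNIQUE constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO contact_demo (hash) VALUES (?)", (hash_value,)
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = try_insert(conn, "900150983cd24fb0d6963f7d28e17f72")
second = try_insert(conn, "900150983cd24fb0d6963f7d28e17f72")
count = conn.execute("SELECT COUNT(*) FROM contact_demo").fetchone()[0]
```

An import script that writes directly to the tables could catch the equivalent duplicate-key error from its own database driver and log or skip the offending record.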

lobo

kevcol

  • I’m new here
  • *
  • Posts: 9
  • Karma: 0
Re: Writing custom import program.
September 21, 2009, 02:04:30 pm
Thank you, sir.  So we will proceed with confidence.  ;D


This forum was archived on 2017-11-26.