CiviCRM Community Forums (archive)


This forum was archived on 25 November 2017. Learn more.

Author Topic: contact API - dedupe  (Read 4444 times)

lcdweb

  • Forum Godess / God
  • I live on this forum
  • *****
  • Posts: 1620
  • Karma: 116
    • www.lcdservices.biz
  • CiviCRM version: many versions...
  • CMS version: Joomla/Drupal
  • MySQL version: 5.1+
  • PHP version: 5.2+
contact API - dedupe
August 19, 2011, 08:01:45 am
we added a param in the v2 api so that you could specify which dedupe rule to use when creating a contact with dedupe checking. it looks like that option was lost in the v3 api

Eileen

  • Forum Godess / God
  • I’m (like) Lobo ;)
  • *****
  • Posts: 4195
  • Karma: 218
    • Fuzion
Re: contact API - dedupe
August 20, 2011, 01:47:11 am
Hmm - that does sound like a useful option. But if we're going to re-add it we'll need a test on it I think.

NB - we would like at some point to add dedupe actions to the contact api

Michael McAndrew

  • Forum Godess / God
  • I live on this forum
  • *****
  • Posts: 1274
  • Karma: 55
    • Third Sector Design
  • CiviCRM version: various
  • CMS version: Nearly always Drupal
  • MySQL version: 5.5
  • PHP version: 5.3
Re: contact API - dedupe
February 14, 2012, 10:03:03 am
Hey guys,

Exposing dedupe via the API is definitely a super cool idea.  One for a code sprint maybe?!

fen

  • I post frequently
  • ***
  • Posts: 216
  • Karma: 13
    • CivicActions
  • CiviCRM version: 3.3-4.3
  • CMS version: Drupal 6/7
  • MySQL version: 5.1/5.5
  • PHP version: 5.3/5.4
Re: contact API - dedupe
February 15, 2012, 09:03:11 am
trunk now has civicrm_api3_contact_merge() in api/v3/Contact.php and civicrm_api3_process_batch_merge() in api/v3/Job.php

I'm starting to look at these (and their associated batch dedupe/merge facility in the GUI, see CRM-9312 for more) for use by one of my clients, but am currently feeling at a loss wrt how to use these excellent-sounding features.

Might anyone be able to:
  • describe what situations these are/are not best suited for?
  • suggest best practices for their use?

For my part, I can:
  • test these features on our dataset
  • document methods of use and possible issues

lcdweb

Re: contact API - dedupe
February 15, 2012, 09:37:22 am
the api is available if you want to write a script that will bulk merge your contacts. it basically does the same thing as triggering a bulk merge from the interface.

bulk merge works by reviewing the dupe contacts identified by a dedupe rule, and then doing a "safe" merge -- a safe merge is one where there are no fields in direct conflict -- i.e. it operates using a fill process only.
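To make the "fill process only" idea concrete, here is a minimal stand-alone sketch. It is not the CiviCRM merge code: `safe_fill_merge()` is a hypothetical helper and contacts are modelled as plain arrays. It copies a value from the duplicate only where the main record is empty, and gives up on the pair as soon as two different non-empty values meet.

```php
<?php
/**
 * Fill-only ("safe") merge sketch. Returns the merged record, or NULL when
 * any field holds two different non-empty values (a direct conflict).
 */
function safe_fill_merge(array $main, array $other) {
  $merged = $main;
  foreach ($other as $field => $value) {
    $mainValue = isset($main[$field]) ? $main[$field] : '';
    if ($value === '' || $value === NULL) {
      continue; // nothing to copy from the duplicate
    }
    if ($mainValue === '' || $mainValue === NULL) {
      $merged[$field] = $value; // fill the gap on the main record
    }
    elseif ($mainValue !== $value) {
      return NULL; // direct conflict: skip this pair
    }
  }
  return $merged;
}

$main = array('first_name' => 'Ada', 'job_title' => '');
$other = array('first_name' => 'Ada', 'job_title' => 'Engineer');
$result = safe_fill_merge($main, $other);
$conflict = safe_fill_merge(array('job_title' => 'Analyst'), $other);
```

Here `$result` carries the filled-in job title, while `$conflict` is NULL because the two job titles disagree.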

you can use the merge hook to impact that behavior. for example, if job_title is in conflict, and you don't really care which value is retained, you can unset the conflict and allow the bulk merge to proceed. you can view our implementation of bulk merge rules here: https://github.com/nysenate/Bluebird-CRM/blob/dev/modules/nyss_massmerge/nyss_massmerge.module
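A rough sketch of the job_title example follows. The module name `example` is hypothetical, and the shape of `$data` (a `fields_in_conflict` array keyed like `move_job_title`) is an assumption to verify against your CiviCRM version and against the linked module. The demonstration at the bottom calls the function directly with made-up data so it runs without CiviCRM.

```php
<?php
/**
 * Sketch of hook_civicrm_merge() for a hypothetical module named "example".
 * Assumption: in batch mode, $data['fields_in_conflict'] lists conflicts
 * under keys such as 'move_job_title'.
 */
function example_civicrm_merge($type, &$data, $mainId = NULL, $otherId = NULL, $tables = NULL) {
  if ($type !== 'batch' || empty($data['fields_in_conflict'])) {
    return;
  }
  // Fields where either value is acceptable, so a conflict need not block the merge.
  $ignorable = array('move_job_title');
  foreach ($ignorable as $key) {
    unset($data['fields_in_conflict'][$key]);
  }
}

// Stand-alone demonstration with made-up data; CiviCRM would call the hook itself.
$data = array('fields_in_conflict' => array(
  'move_job_title' => 'Engineer',
  'move_birth_date' => '1980-01-01',
));
example_civicrm_merge('batch', $data, 1, 2);
$remaining = array_keys($data['fields_in_conflict']);
```

Only the birth date conflict is left afterwards, so in this sketch the pair would still be skipped until that conflict is resolved too.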

daven

  • I’m new here
  • *
  • Posts: 5
  • Karma: 0
  • CiviCRM version: 3.4
  • CMS version: Drupal 6
  • MySQL version: 5.1
  • PHP version: 5.3 / 5.2
Re: contact API - dedupe
February 15, 2012, 06:30:31 pm
Thanks lcdweb, that's a helpful example module. The hook offers a lot of options for customizing the merging.

It wasn't immediately obvious to me how to actually make use of the batch merge. There's a new button called "Batch Merge Duplicates" at the bottom of the list of duplicates when you do a dupe search. Simple! So far, very impressed with the new version.

Michael McAndrew

Re: contact API - dedupe
February 16, 2012, 11:01:53 am
Hey there,

Wondering how merge and import would interact.  To my mind they are separate use cases.

import = something like

Check to see if the contact is there already.

If they are, do a get

If they are NOT, do an add (or maybe a merge!)

But the check bit is the bit that isn't covered by merge.  Or am I missing something? (Admittedly, I haven't looked at the merge code).
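The check / get / add flow described above could be sketched like this. Everything here is illustrative: `$contacts` is an in-memory stand-in for the database, `find_match()` and `import_row()` are hypothetical helpers, and matching on email alone stands in for a real dedupe rule.

```php
<?php
/**
 * Import sketch: look for a match first, reuse it if found, otherwise add.
 * $contacts stands in for the database; matching on email alone is an
 * assumption standing in for a real dedupe rule.
 */
function find_match(array $contacts, array $row) {
  foreach ($contacts as $id => $contact) {
    if (strtolower($contact['email']) === strtolower($row['email'])) {
      return $id;
    }
  }
  return NULL;
}

function import_row(array &$contacts, array $row) {
  $id = find_match($contacts, $row);
  if ($id !== NULL) {
    return array('id' => $id, 'action' => 'get'); // contact already exists
  }
  $id = $contacts ? max(array_keys($contacts)) + 1 : 1;
  $contacts[$id] = $row; // no match: add a new contact
  return array('id' => $id, 'action' => 'add');
}

$contacts = array(1 => array('email' => 'ada@example.org'));
$first = import_row($contacts, array('email' => 'ADA@example.org'));
$second = import_row($contacts, array('email' => 'bob@example.org'));
```

The "check" step is `find_match()`, which is the part the merge API does not provide on its own.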

fen

Re: contact API - dedupe
March 06, 2012, 03:13:41 pm
The batch/merge is (so far) working great, though still in testing mode.  I made a small change to the API call doc (and switched the main_id and other_id params, which were backward) so that it makes more sense, and I've attached the patch here (didn't feel it was significant enough to create a new ticket).

fen

Re: contact API - dedupe
March 07, 2012, 01:42:30 pm
I know this code hasn't been released yet, but I've found an issue that (currently) has me stumped: I'm calling $result = civicrm_api("Contact",'merge',$params); with two existing (duplicate) IDs and usually it works, but sometimes it silently fails; that is, it does nothing and returns with $result['is_error'] == 0.

Would be better if it failed with some sort of error message or at least is_error != 0.

I'm digging, but if you have any ideas where to look for where/how it may be failing, I'd appreciate the suggestions.

fen

Re: contact API - dedupe
March 07, 2012, 01:57:26 pm
Found the issue: CRM_Dedupe_Merger::merge returns $resultStats, and the API checks for $result['is_error'] == 0. Since $result['is_error'] is never set, the API wrapper considers this a pass.
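A small stand-alone illustration of why a missing `is_error` key reads as success, plus a stricter check a calling script could use in the meantime. The `merged` and `skipped` keys are illustrative assumptions about the stats array, not a documented return shape, and both helper functions are hypothetical.

```php
<?php
/**
 * Sketch of the silent failure: a result array that never sets 'is_error'
 * looks like success to a caller that only checks for a non-zero flag.
 * The 'merged'/'skipped' keys are illustrative, not the real return shape.
 */
function looks_like_success(array $result) {
  return empty($result['is_error']);
}

// A stricter check for calling scripts: also require evidence that work was done.
function merge_succeeded(array $result) {
  return empty($result['is_error']) && !empty($result['merged']);
}

$stats = array(
  'merged' => array(),
  'skipped' => array(array('main_id' => 1, 'other_id' => 2)),
);
$naive = looks_like_success($stats);
$strict = merge_succeeded($stats);
```

With nothing merged and one pair skipped, the naive check still reports success, while the stricter one does not.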

xavier

  • Forum Godess / God
  • I’m (like) Lobo ;)
  • *****
  • Posts: 4453
  • Karma: 161
    • Tech To The People
  • CiviCRM version: yes probably
  • CMS version: drupal
Re: contact API - dedupe
March 07, 2012, 02:04:40 pm
Great,

and this API sounds like a great addition, thanks to all involved.

On a separate thread, there was a comment about dedupe_check not working. Have you used/added an api.contact.CheckDedupe or something along those lines?

X+

fen

Re: contact API - dedupe
March 26, 2012, 09:24:26 am
I don't know the dedupe_check() function, so I can't help there :(

I've been working on a script to take a list of dupes returned from a mailhouse (which used data exported from CiviCRM) and merge them.  One question I am currently up against is the merging of addresses.  Even if all the values in the source (other_id) address == all the values in the destination (main_id) address (but their address ids are different) we end up with a merged contact with two addresses, the src_address being placed in location_type_id == 2 for the dest contact.

I'm thinking of checking for address_match in the batchmerge hook and removing the older address before the merge, but this is dangerous because if the merge fails then I've lost an address.  Is there some way to make this happen more safely that I am missing?

Also, and perhaps more importantly, the merge process keeps the destination primary address as primary, whereas our experience is that whichever address was most recently created (or changed) is generally the correct address and should be primary.  This should be easier to add to the batchmerge_hook, though perhaps it should be default functionality?
« Last Edit: March 26, 2012, 09:57:17 am by fen »
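One way the "newest address becomes primary" rule might look, as a stand-alone sketch. `promote_newest_address()` is a hypothetical helper, addresses are plain arrays, and it assumes a higher address id means a more recently created row; it does not account for addresses that were edited later.

```php
<?php
/**
 * Mark the most recently created address as primary. Assumption: a higher
 * address id means a later creation, since the rows carry no timestamp here.
 */
function promote_newest_address(array $addresses) {
  if (!$addresses) {
    return $addresses;
  }
  $newestId = max(array_map(function ($a) { return $a['id']; }, $addresses));
  foreach ($addresses as &$address) {
    $address['is_primary'] = ($address['id'] === $newestId) ? 1 : 0;
  }
  unset($address); // break the reference left by foreach
  return $addresses;
}

$addresses = array(
  array('id' => 10, 'street_address' => '1 Old Rd', 'is_primary' => 1),
  array('id' => 42, 'street_address' => '9 New St', 'is_primary' => 0),
);
$updated = promote_newest_address($addresses);
```

Because this only flips flags and deletes nothing, it avoids the risk of losing an address if the merge itself fails.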

xavier

Re: contact API - dedupe
March 26, 2012, 09:37:36 am
Quote from: fen on March 26, 2012, 09:24:26 am
One question I am currently up against is the merging of addresses.  Even if all the values in src_address == all the values in dest_address (but their address_ids are different) we end up with a merged contact with two addresses, the src_address being placed in location_type_id == 2 for the dest contact.

It might be that some stuff (e.g. the geo data) is different, even if not displayed?

Or it might be that the dedupe code has a bug. Would be great if you could fix it, might be easier as well and more stable than hooking/patching your way through it ;)

X+

lcdweb

Re: contact API - dedupe
March 26, 2012, 09:48:01 am
yes, this is a consequence of the "safe" batch merge.
i've chewed over options as well, as it impacts email/phone/etc. -- any of the 1-to-many records on the contact object.

my current thought is to create a cleanup script to be run post merge where I compare and remove duplicate records on a contact. the batch merge is pretty slow as it is. if we also have to do a lookup and a comparison during the merge, it will really slow things down. a post import script could be done with straight sql and be much faster.

on the positive side --
because it adds these records instead of comparing and merging, you pick up a lot more merges. if we were comparing them, there'd typically be a ton more conflicts that would block the merge.
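The comparison such a cleanup pass needs can be sketched in plain PHP. This is only an illustration of the logic; a straight SQL version, as suggested above, would likely be faster on real data. `remove_duplicate_rows()` is a hypothetical helper, and the list of ignored columns (ids, location type, primary flag, geo codes) is an assumption to adjust per table.

```php
<?php
/**
 * Post-merge cleanup sketch: drop rows that duplicate an earlier row once
 * identifying and derived columns are ignored. The ignored column list is an
 * assumption; adjust it to the table being cleaned.
 */
function remove_duplicate_rows(array $rows, array $ignore) {
  $seen = array();
  $kept = array();
  foreach ($rows as $row) {
    $comparable = array_diff_key($row, array_flip($ignore));
    ksort($comparable); // make the fingerprint independent of column order
    $fingerprint = md5(serialize($comparable));
    if (!isset($seen[$fingerprint])) {
      $seen[$fingerprint] = TRUE;
      $kept[] = $row; // first occurrence wins
    }
  }
  return $kept;
}

$ignore = array('id', 'location_type_id', 'is_primary', 'geo_code_1', 'geo_code_2');
$rows = array(
  array('id' => 10, 'location_type_id' => 1, 'is_primary' => 1, 'street_address' => '9 New St', 'city' => 'Albany'),
  array('id' => 42, 'location_type_id' => 2, 'is_primary' => 0, 'street_address' => '9 New St', 'city' => 'Albany'),
  array('id' => 43, 'location_type_id' => 3, 'is_primary' => 0, 'street_address' => '1 Old Rd', 'city' => 'Albany'),
);
$clean = remove_duplicate_rows($rows, $ignore);
```

The same approach would apply to emails and phones on a contact, with a different ignore list for each table.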

fen

Re: contact API - dedupe
March 26, 2012, 10:00:00 am
Thanks for both your answers!  I just added this to the original question (as I had not updated my window to see your quick replies):
Quote
Also, and perhaps more importantly, the merge process keeps the destination primary address as primary, whereas our experience is that whichever address was most recently created (or changed) is generally the correct address and should be primary.  This should be easier to add to the batchmerge_hook, though perhaps it should be default functionality?


This forum was archived on 2017-11-26.