CiviCRM Community Forums (archive)


This forum was archived on 25 November 2017.




Author Topic: Automatic Dedupe exceptions  (Read 330 times)

rocxa

  • I post occasionally
  • **
  • Posts: 40
  • Karma: 4
  • CiviCRM version: 4.5.5
  • CMS version: Drupal 7.34
  • MySQL version: 5.1.71
  • PHP version: 5.3.3
Automatic Dedupe exceptions
February 03, 2015, 07:45:05 am
Deduping records in CiviCRM is powerful with the rules/weights, but we have encountered a scenario where some extra options may help:

It would be very handy to be able to exclude pairs where the email matches but the external ID doesn't.

i.e.
test@test.com 123456
&
test@test.com 123458

should not be considered a match.  With the dedupe rules this doesn't appear to be possible at the moment unless I am missing something obvious?

JonGold

  • Ask me questions
  • ****
  • Posts: 638
  • Karma: 81
    • Palante Technology
  • CiviCRM version: 4.1 to the latest
  • CMS version: Drupal 6-7, Wordpress 4.0+
  • PHP version: PHP 5.3-5.5
Re: Automatic Dedupe exceptions
February 03, 2015, 08:49:06 am
While I agree that generally speaking, dedupes don't work the way you describe, in your case they shouldn't need to.  External identifier is defined at the database level as unique - you can't possibly have two external identifiers that are the same.  In which case you should only need to match on e-mail. Or am I missing something?
Sign up to StackExchange and get free expert CiviCRM advice: https://civicrm.org/blogs/colemanw/get-exclusive-access-free-expert-help

rocxa

  • I post occasionally
  • **
  • Posts: 40
  • Karma: 4
  • CiviCRM version: 4.5.5
  • CMS version: Drupal 7.34
  • MySQL version: 5.1.71
  • PHP version: 5.3.3
Re: Automatic Dedupe exceptions
February 03, 2015, 09:26:25 am
Sorry if the example wasn't clear.

External identifiers will never be the same, as you point out (unless you have custom fields set up for multiple external identifiers and don't make the fields unique, which is not an option in CiviCRM).

The example I was trying to demonstrate was how to avoid false positives in the dedupe list. (marking them all as not duplicates is tedious when an additional rule could exclude them automatically :-))

Perhaps a better example is:
If we find 100 pairs of contacts with the same email address, but know that 98 of them aren't duplicates because the two contacts in each pair have different external IDs, it would be good to be able to exclude these from appearing as duplicates in the first place.

In theory, contact records which have different external IDs should be excluded by default, as they shouldn't ever be matches unless you have a data problem in the external system.

There appear to be hooks to add custom rule queries to CiviCRM but it seems the user interface for creating rules could use this feature too?

JonGold

  • Ask me questions
  • ****
  • Posts: 638
  • Karma: 81
    • Palante Technology
  • CiviCRM version: 4.1 to the latest
  • CMS version: Drupal 6-7, Wordpress 4.0+
  • PHP version: PHP 5.3-5.5
Re: Automatic Dedupe exceptions
February 03, 2015, 10:05:59 am
Ah!  I understand now.

It's actually quite common in my use of dedupes for there to be duplicates with different external identifiers.  Sometimes there are multiple external systems, and a contact might be in both.  Sometimes there's a legacy system with poor duplicate detection.

Now that I understand your use case though - I see where this can be a useful feature for some.  To be honest though, I don't think this will be prioritized by the core team, since I don't think many people will use it.  Most organizations don't have 100 contacts with the same e-mail, and if they do, they're unlikely to use e-mail as their dedupe criterion.  So this would most likely have to come from a client who's invested enough to develop or fund it.

That said - the "civicrm_dedupe_exception" table is very simple, and I wouldn't hesitate to manually update that via SQL.  So if your data has a lot of records that need to be dedupe exceptions, you can probably handle them with a single SQL statement.  You can do a self-join of the e-mail table on the e-mail address.  Something like:
Code:
INSERT INTO civicrm_dedupe_exception (contact_id1, contact_id2)
SELECT DISTINCT t1.contact_id, t2.contact_id
FROM civicrm_email t1
JOIN civicrm_email t2 ON t1.email = t2.email
WHERE t1.email = 'notdupe@email.org'
  AND t1.contact_id < t2.contact_id; -- one row per pair, no self-pairs
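
The same idea could probably be stretched to cover the original use case (same e-mail, different external ID) in one pass rather than one address at a time. Below is a rough sketch only, run against an in-memory SQLite database so it can be tried safely. The three tables are cut-down stand-ins containing only the columns used, and the assumption that the external ID lives in civicrm_contact.external_identifier should be checked against your own schema. The statement in INSERT_EXCEPTIONS is the part you would adapt for MySQL, after taking a backup.

```python
import sqlite3

# Cut-down stand-ins for the CiviCRM tables (assumption: only these columns matter here).
SCHEMA = """
CREATE TABLE civicrm_contact (id INTEGER PRIMARY KEY, external_identifier TEXT UNIQUE);
CREATE TABLE civicrm_email (id INTEGER PRIMARY KEY, contact_id INTEGER, email TEXT);
CREATE TABLE civicrm_dedupe_exception (
    id INTEGER PRIMARY KEY,
    contact_id1 INTEGER,
    contact_id2 INTEGER,
    UNIQUE (contact_id1, contact_id2)
);
"""

# Pairs of contacts that share an e-mail address and whose external IDs are
# both set and different. Contacts with a NULL external ID are left alone,
# because NULL <> 'x' is not true in SQL.
# e1.contact_id < e2.contact_id stores each pair once and skips self-pairs;
# DISTINCT guards against a contact holding the same address twice.
INSERT_EXCEPTIONS = """
INSERT INTO civicrm_dedupe_exception (contact_id1, contact_id2)
SELECT DISTINCT e1.contact_id, e2.contact_id
FROM civicrm_email e1
JOIN civicrm_email e2
  ON e1.email = e2.email AND e1.contact_id < e2.contact_id
JOIN civicrm_contact c1 ON c1.id = e1.contact_id
JOIN civicrm_contact c2 ON c2.id = e2.contact_id
WHERE c1.external_identifier <> c2.external_identifier
"""


def add_exceptions(conn):
    """Run the insert and return the resulting exception pairs, sorted."""
    conn.execute(INSERT_EXCEPTIONS)
    rows = conn.execute(
        "SELECT contact_id1, contact_id2 FROM civicrm_dedupe_exception "
        "ORDER BY contact_id1, contact_id2"
    )
    return rows.fetchall()


def demo():
    """Build made-up sample data and return the exceptions it produces."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO civicrm_contact (id, external_identifier) VALUES (?, ?)",
        [(1, "123456"), (2, "123458"), (3, None), (4, "999")],
    )
    conn.executemany(
        "INSERT INTO civicrm_email (contact_id, email) VALUES (?, ?)",
        [
            (1, "test@test.com"),
            (2, "test@test.com"),
            (2, "test@test.com"),  # same address stored twice for contact 2
            (3, "test@test.com"),  # no external ID, so still a dedupe candidate
            (4, "other@test.com"),
        ],
    )
    return add_exceptions(conn)


if __name__ == "__main__":
    print(demo())
```

With the sample rows above, only contacts 1 and 2 end up as an exception pair. Contact 3 shares the address but has no external ID, so it would still show up in the dedupe list for manual review. The sketch does not handle pairs that already have an exception row, so a second run against the same data would hit the unique constraint used here.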


This forum was archived on 2017-11-26.