CiviCRM Community Forums (archive)

This forum was archived on 25 November 2017.

Author Topic: Deduping in 1.8 - lack of contextual info to help decide which record 2 keep  (Read 1853 times)

petednz

  • Forum Godess / God
  • I’m (like) Lobo ;)
  • *****
  • Posts: 4899
  • Karma: 193
    • Fuzion
  • CiviCRM version: 3.x - 4.x
  • CMS version: Drupal 6 and 7
Deduping in 1.8 - lack of contextual info to help decide which record 2 keep
August 07, 2007, 04:55:02 pm
This is from recollection, as I can't reproduce the problem right now, but when I tried out 'find matching contact' I found that when it did find two similar contacts, the system didn't show me the other information (e.g. address, phone) about the two contacts, so I couldn't decide whether the 'new contact' was indeed a duplicate.
Also - having set the rules for Individual to be
First Name (length 1) Weight 5
Surname Weight 7
Email (length 0)

Weight threshold to be match 10

I would expect that given I have a David Smith and a Letitia Smith, when I add a D Smith, it would warn me of the potential duplicate of John Smith (and show me other details about that person) - but it actually says "2 matching contacts were found. You can edit them here:John Smith, Mary Smith"

Is it because it is finding the 'a' in both names?

If so, how can I run a check for someone where I have their initial on one document and the full name in the database?
Sign up to StackExchange and get free expert advice: https://civicrm.org/blogs/colemanw/get-exclusive-access-free-expert-help

pete davis : www.fuzion.co.nz : connect + campaign + communicate

petednz

Re: Deduping in 1.8 - lack of contextual info to help decide which record 2 keep
August 07, 2007, 04:58:32 pm
Nuggh - I just tested with V Smith, which of course would only match David, but it still brought up David and Letitia. Where have I gone wrong?

Dave Greenberg

  • Administrator
  • I’m (like) Lobo ;)
  • *****
  • Posts: 5760
  • Karma: 226
    • My CiviCRM Blog
Re: Deduping in 1.8 - lack of contextual info to help decide which record 2 keep
August 08, 2007, 10:30:56 am
Peter - The person who designed and wrote the dedupe code (Piotr) is back from leave tonight. I've forwarded him this thread and asked him to look at your issues / use cases with you. Thx for working with us on this!
Protect your investment in CiviCRM by  becoming a Member!

Piotr Szotkowski

  • I live on this forum
  • *****
  • Posts: 1497
  • Karma: 57
Re: Deduping in 1.8 - lack of contextual info to help decide which record 2 keep
August 09, 2007, 05:15:46 am
Quote from: peterd on August 07, 2007, 04:55:02 pm
This is from recollection, as I can't reproduce the problem right now, but when I tried out 'find matching contact' I found that when it did find two similar contacts, the system didn't show me the other information (e.g. address, phone) about the two contacts, so I couldn't decide whether the 'new contact' was indeed a duplicate.

Hmm, the merge screen (with the two contacts side-by-side) should show all the info that differs between the two contacts, as well as all their locations. If that’s not so, that’s clearly a bug; can you please replicate it on the demo and file an issue?

Quote from: peterd on August 07, 2007, 04:55:02 pm
Also - having set the rules for Individual to be
First Name (length 1) Weight 5
Surname Weight 7
Email (length 0)

Weight threshold to be match 10

What’s the weight for email?

Quote from: peterd on August 07, 2007, 04:55:02 pm
I would expect that given I have a David Smith and a Letitia Smith, when I add a D Smith, it would warn me of the potential duplicate of John Smith (and show me other details about that person) - but it actually says "2 matching contacts were found. You can edit them here:John Smith, Mary Smith"

I got a bit lost in all the Davids, Johns, Letitias and Marys above ;) but I created a similar rule in my sandbox, added David, Letitia and D Smiths and the Find Duplicate Contacts action displays David and D Smiths as potential duplicates (of each other).
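
The result above fits a simple weighted-threshold reading of the rule. What follows is a minimal Python sketch of that reading, not CiviCRM's actual dedupe code: the function names, the dictionary layout, the case-insensitive comparison, and the assumption that a pair matches once the summed weights reach the threshold are all illustrative. Email is left out because its weight isn't given in the thread.

```python
# Hypothetical rule using the numbers from the thread:
# first name compared on its first character (weight 5), surname in full (weight 7).
RULE = [
    {"field": "first_name", "length": 1, "weight": 5},
    {"field": "last_name", "length": None, "weight": 7},
]
THRESHOLD = 10


def field_matches(a, b, length):
    """Compare two values case-insensitively, optionally only on a leading prefix."""
    if not a or not b:
        return False
    a, b = a.lower(), b.lower()
    if length:
        a, b = a[:length], b[:length]
    return a == b


def score(contact_a, contact_b, rule=RULE):
    """Sum the weights of every rule field on which the two contacts agree."""
    return sum(
        r["weight"]
        for r in rule
        if field_matches(contact_a.get(r["field"]), contact_b.get(r["field"]), r["length"])
    )


def is_potential_duplicate(contact_a, contact_b, threshold=THRESHOLD):
    # Assumption: reaching the threshold counts as a match.
    return score(contact_a, contact_b) >= threshold


new = {"first_name": "D", "last_name": "Smith"}
david = {"first_name": "David", "last_name": "Smith"}
letitia = {"first_name": "Letitia", "last_name": "Smith"}

print(score(new, david), is_potential_duplicate(new, david))
print(score(new, letitia), is_potential_duplicate(new, letitia))
```

Under these assumptions, D Smith and David Smith score 5 + 7 = 12 and clear the threshold of 10, while D Smith and Letitia Smith share only the surname (7) and do not.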

Note: There is an ‘agile’ check on contact creation that tries to guess whether a new, not-yet-created contact might be a potential dupe, but it uses the Contact Matching rules (the same ones used for import matching), not the new dedupe ones (this will be changed in CiviCRM 2).

In general, I’d be most grateful if you replicated the issue on demo; I could work on that. :)
If you found the above helpful, please consider helping us in return – you can even steer CiviCRM’s future and help us extend CiviCRM in ways useful to you.

petednz

Re: Deduping in 1.8 - lack of contextual info to help decide which record 2 keep
August 10, 2007, 02:19:55 pm
Well I think the problem is that I am changing rules for DeDupe and expecting them to function on 'matching'. So why am I getting confused (in case it isn't only me)!
In fact, why can't we avoid any potential confusion and either use the DeDupe rules for 'matching' at the point of data entry and on imports, since they give more flexibility, or at least give 'matching' the same level of flexibility, especially regarding length?
So here is the scenario:-
I already have David Smith in the system and then get something from a D Smith, who I think is potentially a new person. So I begin to enter them, type in D Smith and hit "find matching contact" in the hope that it will show me all people called D......... Smith along with their addresses and emails, so I can determine whether I need to continue entering the new contact. But instead it says: no matching contact found.

Also - I have to ask: if I set length in the Dedupe rules to 1, does it count from the beginning of the text string or anywhere in the text string?

Finally, in terms of deduping: having entered D Smith, Davy Smith, Dave Smith and David Smith, I then use 'Find duplicate contacts' with the fields set to First Name (Length = 1, Weight = 5) and Surname (Weight = 7), with a threshold of 12. (There's no great reason for the weighting - I just want to pull up everyone with similar names to explain the next problem.) I then get a window showing "potentially duplicate contacts" with all four of my D...... Smiths - but it gives me no other information, e.g. city, email or phone, on which to judge which ones might be duplicates.
Even if I do select one and progress to the next screen I still don't get any 'clues'. It is only once I pick one and click merge that some address info comes through.
I think it is essential that this info is shown on an earlier screen; otherwise we are working blind - or at least entirely reliant on the fields we used in setting up the rules.
« Last Edit: August 10, 2007, 07:04:49 pm by peterd »

Piotr Szotkowski

Re: Deduping in 1.8 - lack of contextual info to help decide which record 2 keep
August 11, 2007, 12:58:09 am
Quote from: peterd on August 10, 2007, 02:19:55 pm
Well I think the problem is that I am changing rules for DeDupe and expecting them to function on 'matching'. So why am I getting confused (in case it isn't only me)!
In fact, why can't we avoid any potential confusion and either use the DeDupe rules for 'matching' at the point of data entry and on imports, since they give more flexibility, or at least give 'matching' the same level of flexibility, especially regarding length?

That’s our target for CiviCRM 2.0; in general, we tend to roll in improvements gradually, in small chunks, so we can both keep our fast development/release cycle and be able to get new features in every new version.

I agree that having two mechanisms for considering contacts duplicate is a step back in general and introduces unnecessary confusion. The new dedupe mechanism is developed to be extensible and ‘pluggable’ (so that other parts can easily plug in to it – matching on import, contact creation, API calls, etc.), and we simply weren’t able to plug into it all the places we wished to in CiviCRM 1.8; we also prefer to have the mechanism tested and all its major flaws (if any) ironed out before using it everywhere across CiviCRM – we know the old mechanism is tried and works, so we’d rather stick with that (and the unfortunate confusion) for this release.

Quote from: peterd on August 10, 2007, 02:19:55 pm
So here is the scenario:-
I already have David Smith in the system and then get something from a D Smith, who I think is potentially a new person. So I begin to enter them, type in D Smith and hit "find matching contact" in the hope that it will show me all people called D......... Smith along with their addresses and emails, so I can determine whether I need to continue entering the new contact. But instead it says: no matching contact found.

One of the issues with simply plugging in the new dedupe mechanism to the contact creation screen is that dedupe is not currently hooked into our ACL system. The new mechanism is currently accessible to users with the ‘CiviCRM admin’ permission and does the matching across the whole contact database (for a given domain_id); making it available for the contact creation screen means also hooking it to ACLs, so the search gets limited only to the contacts accessible to the user who tries to create a contact.

Quote from: peterd on August 10, 2007, 02:19:55 pm
Also - I have to ask: if I set length in the Dedupe rules to 1, does it count from the beginning of the text string or anywhere in the text string?

From the beginning. The dedupe mechanism is feasible for bigger contact sets only when the queried columns have associated indices, so that the database can quickly retrieve the possible matching contacts.
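
To illustrate "from the beginning", here is a minimal Python sketch of a prefix comparison driven by a length setting. The helper name and the case-insensitive comparison are assumptions made for illustration; this is not CiviCRM's implementation.

```python
def prefix_match(value_a, value_b, length):
    """Return True when the first `length` characters of both values agree.

    Only the start of each string is inspected; a character that merely
    occurs somewhere later in the string does not count.
    """
    if length < 1:
        raise ValueError("length must be at least 1 in this sketch")
    return value_a[:length].lower() == value_b[:length].lower()


print(prefix_match("D", "David", 1))     # first characters agree
print(prefix_match("V", "David", 1))     # 'v' occurs inside 'David', not at its start
print(prefix_match("Dave", "David", 3))  # both start with 'dav'
print(prefix_match("Davy", "David", 4))  # 'davy' versus 'davi'
```

A leading-prefix comparison like this is also the kind of lookup that a database index on the column can usually serve, which ties in with the point about indices above.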

Quote from: peterd on August 10, 2007, 02:19:55 pm
Finally, in terms of deduping: having entered D Smith, Davy Smith, Dave Smith and David Smith, I then use 'Find duplicate contacts' with the fields set to First Name (Length = 1, Weight = 5) and Surname (Weight = 7), with a threshold of 12. (There's no great reason for the weighting - I just want to pull up everyone with similar names to explain the next problem.) I then get a window showing "potentially duplicate contacts" with all four of my D...... Smiths - but it gives me no other information, e.g. city, email or phone, on which to judge which ones might be duplicates.
Even if I do select one and progress to the next screen I still don't get any 'clues'. It is only once I pick one and click merge that some address info comes through.

From your real-life cases – would it be enough if the dedupe showed a preview of the primary location of the given contact on these two screens (i.e., the initial selection and the set of potential duplicates)?
« Last Edit: August 11, 2007, 01:00:13 am by Piotr Szotkowski »

petednz

Re: Deduping in 1.8 - lack of contextual info to help decide which record 2 keep
August 11, 2007, 02:33:24 am
Good morning Piotr - probably easier to continue this via a Skype chat, but a couple of comments.

I hadn't realised that the Dedupe feature came after Matching Contacts, which explains some of the issues.

I was going to make some suggestions for additions to the Edit Rule for Matching Contacts, but it still won't show the contact details I am hoping for.

So maybe I was being an eeejut, but I realise the solution is just for us to set up the QUICK SEARCH as the thing people use when entering 'possible' new contacts, e.g. info gathered from a petition where data can be ambiguous.

Simply type 'surname, initial' and hit enter, at which point e.g. "Smith, D" will pull up all D**** Smiths with their address and email data shown, and we can ascertain whether we have a 'match' or whether we go ahead and input the person as a new contact.

Similarly typing 'davis%, p' will bring up Peter Davis and Paul Davison!

This is what I have been needing - and it was there all the time. So will just tweak so that having done this we have 'add new contact' as the next easy step if the data does indeed belong to a new contact. Phew.
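
The wildcard behaviour described above resembles SQL `LIKE` matching against a "Surname, First name" string. The sketch below reproduces that behaviour with Python's built-in `sqlite3` module; the `contact` table, the `sort_name` column, and the implicit trailing `%` are assumptions made for illustration and do not show how CiviCRM's quick search is actually built.

```python
import sqlite3

# In-memory database with a few hypothetical contacts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (sort_name TEXT)")
conn.executemany(
    "INSERT INTO contact (sort_name) VALUES (?)",
    [
        ("Davis, Peter",),
        ("Davison, Paul",),
        ("Davis, Anne",),
        ("Smith, David",),
        ("Smith, Letitia",),
    ],
)


def quick_search(term):
    """Match names that start with `term`; any '%' typed by the user acts as a wildcard."""
    rows = conn.execute(
        "SELECT sort_name FROM contact WHERE sort_name LIKE ? ORDER BY sort_name",
        (term + "%",),
    )
    return [name for (name,) in rows]


print(quick_search("davis%, p"))
print(quick_search("Smith, D"))
```

With this sample data, 'davis%, p' returns Peter Davis and Paul Davison but not Anne Davis, and 'Smith, D' returns David Smith but not Letitia Smith.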


This forum was archived on 2017-11-26.