CiviCRM Community Forums (archive)

This forum was archived on 25 November 2017. Learn more.
Author Topic: Dedupe Orgs return nothing on obvious fuzzy match  (Read 1358 times)

alanski

  • I post frequently
  • ***
  • Posts: 216
  • Karma: 5
  • Cup of tea? Yes please
    • Joomkit
  • CiviCRM version: Version in post
  • CMS version: Joomla
  • MySQL version: 5.0
Dedupe Orgs return nothing on obvious fuzzy match
April 24, 2012, 05:01:56 am
Joomla 2.5.4
CiviCRM 4.1.1

Fuzzy dedupe rule is set for Org name value = 10
Threshold match value = 5
[tried reverse values too]

3 records
A: The Cardinal Wiseman
B: The Cardinal Wiseman School
C: Cardinal Wiseman

No dupes found.

What am I missing? Surely this would give a match?




ctarascio

  • I post frequently
  • ***
  • Posts: 334
  • Karma: 30
    • American Friends Service Committee
  • CiviCRM version: 4.1.3
  • CMS version: Drupal 6.26
  • MySQL version: 5.5.20
  • PHP version: 5.3.13
Re: Dedupe Orgs return nothing on obvious fuzzy match
April 24, 2012, 05:54:22 am
hi,
i think the reason no dupes were found is that the fuzzy dedupe rule you are using compares the entire organization name instead of just part of it. if you edit that rule and set the "length" for organization name to e.g. 8, only the first 8 characters are compared, and you will get a match for the two records that begin with "The Cardinal". you will not, however, get a match for the "Cardinal Wiseman" record, because there is no way (that i can see) that it will ever match the two records that begin with "The".

hope this was helpful,
cynthia
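
Editor's sketch of the "length" behaviour described above, applied to the three records from the first post. It is written in Python purely for illustration and is not CiviCRM's actual code: the function name `prefix_matches` and its parameters are hypothetical, and treating the 10 and 5 from the first post as the rule weight and the threshold is an assumption.

```python
from itertools import combinations

def prefix_matches(names, length, weight, threshold):
    """Hypothetical helper: pair up names whose first `length` characters
    are identical. length=None compares the whole name."""
    pairs = []
    for a, b in combinations(names, 2):
        # A single-field rule scores its weight on a match, otherwise 0.
        score = weight if a[:length] == b[:length] else 0
        if score > 0 and score >= threshold:
            pairs.append((a, b))
    return pairs

names = [
    "The Cardinal Wiseman",
    "The Cardinal Wiseman School",
    "Cardinal Wiseman",
]

# Whole-name comparison: no two names are identical, so nothing is found.
print(prefix_matches(names, length=None, weight=10, threshold=5))

# First 8 characters only: the two "The Card..." records pair up,
# while "Cardinal Wiseman" still matches neither of them.
print(prefix_matches(names, length=8, weight=10, threshold=5))
```

Under these assumptions, no prefix length can pair "Cardinal Wiseman" with the other two, because the names differ from the first character onward.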

alanski

Re: Dedupe Orgs return nothing on obvious fuzzy match
April 24, 2012, 06:28:08 am
With all the good will in the world [which I am finding hard to extend to some aspects of Civi] that doesn't make business sense.
The business meaning of a dupe has to be on the two keywords 'Cardinal' and 'Wiseman'.

Length just checks the beginning of the string for a match (as far as I can tell from the FLOSS manual).

It's fuzzy all right, but not logical. It almost makes Civi unusable for Organisational Membership profile creation if you can't dedupe new Organisations against existing data.

Yes, it's OS and we are all free to code solutions for the platform / submit patches, but this is basic stuff done wrong as far as I can tell.
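
Editor's sketch of the keyword-based matching asked for above, in Python for illustration only. This is not something CiviCRM provides in this thread: the names `keywords`, `share_keywords`, and `likely_duplicates`, the `IGNORED_WORDS` list, and the `minimum=2` cut-off are all hypothetical choices.

```python
import re
from itertools import combinations

# Hypothetical list of words that carry no matching value.
IGNORED_WORDS = {"the"}

def keywords(name):
    """Lower-case the name and return its words, minus ignored words."""
    words = re.findall(r"[a-z0-9]+", name.lower())
    return frozenset(w for w in words if w not in IGNORED_WORDS)

def share_keywords(a, b, minimum=2):
    """True when the two names have at least `minimum` keywords in common."""
    return len(keywords(a) & keywords(b)) >= minimum

def likely_duplicates(names, minimum=2):
    """Return every pair of names that shares enough keywords."""
    return [(a, b) for a, b in combinations(names, 2)
            if share_keywords(a, b, minimum)]

names = [
    "The Cardinal Wiseman",
    "The Cardinal Wiseman School",
    "Cardinal Wiseman",
]
for pair in likely_duplicates(names):
    print(pair)
```

All three records pair up here because each shares "cardinal" and "wiseman" with the others. The same cut-off would also pair "Cardinal Newman School" with "The Cardinal Wiseman School" (shared words "cardinal" and "school"), so candidates from a rule like this still need a human check before merging.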



ctarascio

Re: Dedupe Orgs return nothing on obvious fuzzy match
April 24, 2012, 06:49:00 am
alanski,
i do hear your frustration but in this particular instance i am not sure i agree.

under the best of circumstances, deduping data is a daunting task. that is why some organizations pay thousands of dollars for software that is devoted to that task. and even with that special software, nothing, in my opinion, replaces the human aspect of eyeballing data and handling each instance with care so as not to merge records that are not really duplicates. i think this is true regardless of whether or not the software is "open source".

the specific examples you listed might be able to be handled by the code if it were modified say, to ignore the word "The". but there are hundreds of other examples where it would be difficult to apply logic to determine if records were duplicates. this, in my opinion, is why standardization of data entry is so important. having said that, i realize that standardizing data is easier/doable if data entry is done by staff and not done directly by "members".

i hope you can figure out a way around these issues. post your ideas and if the community can help you i am sure it will.

cynthia

Donald Lobo

  • Administrator
  • I’m (like) Lobo ;)
  • *****
  • Posts: 15963
  • Karma: 470
    • CiviCRM site
  • CiviCRM version: 4.2+
  • CMS version: Drupal 7, Joomla 2.5+
  • MySQL version: 5.5.x
  • PHP version: 5.4.x
Re: Dedupe Orgs return nothing on obvious fuzzy match
April 24, 2012, 07:00:57 am

a couple of thoughts and comments:

1. I do think that "normalization" of various strings is the next phase of dedupe. We have quite a few hooks built into the system, so people can do this today by writing their own query code (the NYSS does this). Things like stripping common words and using either abbreviations or full names consistently when doing a match (Street instead of St or St. etc.)

2. Contributing back to the platform helps the platform grow and makes it more robust. The "basic stuff done wrong as far as I can tell" is used by quite a few people successfully in quite a few places.

lobo
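
Editor's sketch of the kind of string normalization described in point 1, in Python for illustration only. It does not use CiviCRM's hooks or query code; the function `normalize` and the contents of `COMMON_WORDS` and `ABBREVIATIONS` are hypothetical and would need tuning to real data (for instance, "St" can also mean "Saint").

```python
import re

# Hypothetical word lists, chosen only to fit the examples in this thread.
COMMON_WORDS = {"the"}
ABBREVIATIONS = {"st": "street", "rd": "road", "ave": "avenue"}

def normalize(text):
    """Lower-case, drop punctuation and common words, expand abbreviations."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    kept = [ABBREVIATIONS.get(w, w) for w in words if w not in COMMON_WORDS]
    return " ".join(kept)

# Two spellings are treated as duplicates when their normalized forms match.
print(normalize("The Cardinal Wiseman") == normalize("Cardinal Wiseman"))
print(normalize("12 Main St.") == normalize("12 Main Street"))
print(normalize("The Cardinal Wiseman School") == normalize("Cardinal Wiseman"))
```

With these lists the first two comparisons match and the third does not: stripping "The" is enough to pair records A and C from the first post, but "The Cardinal Wiseman School" still differs by a whole word.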

alanski

Re: Dedupe Orgs return nothing on obvious fuzzy match
April 24, 2012, 07:38:19 am
Don't get me wrong, Civi is awesome :). But there are just some very painful issues with it {e.g. changing a button's text comes to mind: you have to write a Joomla plugin to hijack the buildForm event and manipulate a button name value - c'mon!}

I haven't fully explained the case scenario behind this 'whinge'.

Membership sign-ups "on behalf of" an organisation create a new record 'Cardinal Wiseman', which is a duplicate of the existing record 'The Cardinal Wiseman School'.

The original record has loads of other data attached to it, geo stuff and custom data.

So the good admin wants to dedupe memberships/org records.
What are the choices?
Edit the membership / transfer it to another Org contact? Nope, can't do that.

Only way I can see without development is to dedupe and merge.

But the dedupe can't match the two words that are the same for three contacts.

Ergo duplicates exist and can't be changed without a heavier code solution.

My 'solution' to this is to ajax autocomplete the Org onBehalfOf field which should catch most but not all before dupes are created.

But it isn't guaranteed to resolve the issue.
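
Editor's sketch of the lookup behind such an autocomplete, in Python for illustration only. The real field would query CiviCRM from the browser; the function `suggest`, its `limit` parameter, and the sample names other than those in this thread are hypothetical.

```python
def suggest(typed, existing_names, limit=5):
    """Hypothetical lookup: return existing organisation names that contain
    the typed text, ignoring case, up to `limit` suggestions."""
    needle = typed.strip().lower()
    if not needle:
        return []
    hits = [name for name in existing_names if needle in name.lower()]
    return hits[:limit]

existing = ["The Cardinal Wiseman School", "Example Primary School"]

# Typing the start of the short name surfaces the existing full record.
print(suggest("cardinal wise", existing))

# A differently abbreviated name finds nothing, so a duplicate can still
# slip through.
print(suggest("Card. Wiseman", existing))
```

A substring lookup like this catches the 'Cardinal Wiseman' case but not every variant spelling, which fits the "most but not all" caveat above.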


This forum was archived on 2017-11-26.