CiviCRM Community Forums (archive)


This forum was archived on 25 November 2017.




Author Topic: Deduplication code possibly being run twice.  (Read 1459 times)

dougall

  • Guest
Deduplication code possibly being run twice.
July 22, 2008, 08:04:18 am
Hi,

We have 30000+ contacts and many duplicates.

Not surprisingly, we hit PHP's 60-second maximum execution time every time we try to run the Admin contact deduper on all of them.

We also have a large number of groups (1000+) so deduping by group (as suggested here: http://wiki.civicrm.org/confluence/display/CRMDOC/Find+Duplicate+Contacts) is not feasible.

So instead I've set the execution time much higher and limited the deduper to stop searching after getting the first 100 results (which is an acceptable workaround for us for now).
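
The workaround described above (a longer time limit plus an early stop after the first 100 matches) could look roughly like the sketch below. It is not the CiviCRM 2.0 dedupe code: `findFirstDuplicates()`, the shape of the contact array, and matching on a lower-cased email address are all assumptions made for illustration, and it uses modern PHP type hints. Only `set_time_limit()` is a real PHP built-in here.

```php
<?php
// Hypothetical sketch: NOT CiviCRM's Finder code. The function name, the
// contact array shape and the email-based match rule are assumptions.

/**
 * Return at most $limit pairs of contact ids that share an email address.
 *
 * @param array $contacts list of ['id' => int, 'email' => string]
 * @param int   $limit    stop searching once this many pairs are found
 * @return array list of [firstId, duplicateId] pairs
 */
function findFirstDuplicates(array $contacts, int $limit = 100): array
{
    $seen  = []; // normalised email => id of the first contact seen with it
    $pairs = [];

    foreach ($contacts as $contact) {
        $key = strtolower(trim($contact['email']));
        if ($key === '') {
            continue; // contacts without an email cannot match under this rule
        }
        if (isset($seen[$key])) {
            $pairs[] = [$seen[$key], $contact['id']];
            if (count($pairs) >= $limit) {
                break; // early exit: the first $limit results are enough
            }
        } else {
            $seen[$key] = $contact['id'];
        }
    }

    return $pairs;
}

// Raise the per-request time limit (in seconds) before the expensive work starts.
set_time_limit(600);
```

Capping the result count bounds the work only when duplicates are plentiful, as they are here; on a mostly clean data set the loop would still scan every contact.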

As part of this process I also inserted some trace statements to see what was happening, including an error_log(...) statement in /CRM/Dedupe/Finder.php at the start of the function findDupes(...) (line 130).

When I then look in the server log, I can see that this function appears to be called twice when the Use Rule button is first pressed on the Find Contact Dupes page. It seems to me that this expensive function should only be called once?
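
One way to tell whether the two calls come from the same code path is to log the caller alongside a per-request call count. The sketch below uses only PHP built-ins (`error_log()` and `debug_backtrace()`); the helper name `traceCall()` and the stand-in function `expensiveFind()` are hypothetical and not part of CiviCRM.

```php
<?php
// Hypothetical tracing helper; traceCall() is not a CiviCRM function.

/**
 * Log how many times $label has been hit in this request and from where.
 * Returns the logged message so it can also be inspected directly.
 */
function traceCall(string $label): string
{
    static $counts = [];
    $counts[$label] = ($counts[$label] ?? 0) + 1;

    // Frame 0 describes where traceCall() was invoked; frame 1 (if present)
    // describes where the traced function itself was invoked.
    $frames = debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS, 2);
    $site   = $frames[1] ?? $frames[0];

    $message = sprintf(
        '%s call #%d from %s:%d',
        $label,
        $counts[$label],
        $site['file'] ?? 'unknown',
        $site['line'] ?? 0
    );
    error_log($message);

    return $message;
}

// Stand-in for an expensive function such as a duplicate search.
function expensiveFind(): void
{
    traceCall(__FUNCTION__);
    // the expensive work would go here
}
```

If the two log lines show different file and line values, the function is being reached from two separate call sites; identical values suggest the same code path is being executed twice, for example once when the form is built and once when it is processed.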

Apologies if I've misunderstood something here.

We're using CiviCRM 2.0.2, although I've just tried 2.0.5 and the behaviour seems to be the same there.

Thanks,
  Dougall

Donald Lobo

Re: Deduplication code possibly being run twice.
July 22, 2008, 11:29:11 am

We've completely rearchitected the dedupe engine in v2.1 (different algorithm, etc.). As such, we expect it to be more efficient and to use less memory. It would be great if you could test your use case when 2.1 hits alpha.

The 2.0 version has a fair number of issues. We are not planning on upgrading or fixing the dedupe code in that version.

lobo

xavier

Re: Deduplication code possibly being run twice.
July 23, 2008, 06:44:59 am
Hi,

Out of curiosity, how does your install behave with such a large number of groups? Aren't some pages too slow to load?

X+


dougall

  • Guest
Re: Deduplication code possibly being run twice.
July 23, 2008, 07:20:36 am
Hi Xavier,

It does take quite a while on some pages, but it's totally manageable :)

The two most obvious consequences are largely superficial:
- enormously long lists to select groups from in some combo boxes
- these combo boxes are also very wide due to some of the group names being ridiculously long

The reason we have so many (and why some of the names are so long) is that we are synchronizing many (but not all) of the contacts, groups and group memberships with another database via a combination of CiviCRM's hooks and a SOAP server.

...Dougall


This forum was archived on 2017-11-26.