CiviCRM Community Forums (archive)

This forum was archived on 25 November 2017. Learn more.

Author Topic: Duplicate Search Hangs  (Read 1352 times)

FredJones

  • Guest
Duplicate Search Hangs
June 29, 2008, 08:18:34 am
When I run a duplicate search, my system hangs. My DB is not very big, and all other functions seem to work fine. When I examine my query log, I see a long series of queries like the ones below (I replaced the actual email addresses with nonsense letters):

Code:
      6 Init DB     fred_civicrm_2
      6 Query       SELECT param.contact_id AS contact_id FROM civicrm_email param WHERE param.email = 'aaa@bbb.edu'
      6 Init DB     fred_civicrm_2
      6 Query       SELECT param.email AS 'match' FROM civicrm_email param
WHERE param.contact_id = 6131
      6 Init DB     fred_civicrm_2
      6 Query       SELECT param.contact_id AS contact_id FROM civicrm_email param WHERE param.email = 'ccc@ddd.edu'
      6 Init DB     fred_civicrm_2
      6 Query       SELECT param.email AS 'match' FROM civicrm_email param
WHERE param.contact_id = 6132
      6 Init DB     fred_civicrm_2
      6 Query       SELECT param.contact_id AS contact_id FROM civicrm_email param WHERE param.email = 'ee@ff.edu'
      6 Init DB     fred_civicrm_2
      6 Query       SELECT param.email AS 'match' FROM civicrm_email param
WHERE param.contact_id = 6133
      6 Init DB     fred_civicrm_2
      6 Query       SELECT param.contact_id AS contact_id FROM civicrm_email param WHERE param.email = 'ggg@hh.com'
      6 Init DB     fred_civicrm_2
      6 Query       SELECT param.email AS 'match' FROM civicrm_email param
WHERE param.contact_id = 6134
      6 Init DB     fred_civicrm_2
      6 Query       SELECT param.contact_id AS contact_id FROM civicrm_email param WHERE param.email = 'ii@jj.com'
080629 10:05:57       6 Init DB     fred_civicrm_2
      6 Query       SELECT param.email AS 'match' FROM civicrm_email param
WHERE param.contact_id = 6135

Is this normal/expected?

This is for an Individual search, and my matching rules there (civicrm/admin/deduperules?action=update&id=1) are just Email with weight 5; the "Weight Threshold to Consider Contacts 'Matching'" is 4.
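
For readers unfamiliar with weighted dedupe rules, here is a minimal Python sketch of how a rule like the one described above could be evaluated: each field on which two contacts agree contributes its weight, and the pair counts as a duplicate once the total reaches the threshold. This is a simplified model, not CiviCRM's actual implementation. The names `RULE_WEIGHTS`, `THRESHOLD`, `match_score` and `is_duplicate` are hypothetical, and the case-insensitive comparison is an assumption.

```python
# Simplified model of a weighted dedupe rule (NOT CiviCRM's real code).
# Values taken from the post: Email has weight 5, the threshold is 4.
RULE_WEIGHTS = {"email": 5}
THRESHOLD = 4


def match_score(a, b, weights=RULE_WEIGHTS):
    """Sum the weights of every rule field on which both contacts agree."""
    score = 0
    for field, weight in weights.items():
        va, vb = a.get(field), b.get(field)
        # Empty or missing values never count as a match.
        # Assumption: values are compared case-insensitively.
        if va and vb and va.strip().lower() == vb.strip().lower():
            score += weight
    return score


def is_duplicate(a, b, threshold=THRESHOLD):
    """Two contacts match when their score reaches the threshold."""
    return match_score(a, b) >= threshold


if __name__ == "__main__":
    c1 = {"id": 6131, "email": "aaa@bbb.edu"}
    c2 = {"id": 7000, "email": "AAA@bbb.edu"}  # hypothetical second contact
    c3 = {"id": 6132, "email": "ccc@ddd.edu"}
    print(is_duplicate(c1, c2))
    print(is_duplicate(c1, c3))
```

With a single Email rule of weight 5 and a threshold of 4, any two contacts sharing an email address would be reported as duplicates under this model.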

When I say it hangs, I mean the browser never returns anything, even though my PHP max_execution_time is 1200 seconds (20 minutes). :)

Any ideas? I am using Drupal 5.7 and CiviCRM 2.0.4.

Thanks

petednz

Re: Duplicate Search Hangs
June 30, 2008, 02:01:14 pm
You could try a smaller subset, e.g. create a group of everyone whose surname begins with A and try the Duplicate Match on just that group. I can't recall where the instructions are for restricting the duplicate match to a group, but I am sure a search on 'duplicate and group' will find them for you.

pete davis : www.fuzion.co.nz : connect + campaign + communicate

tonyg

  • Guest
Re: Duplicate Search Hangs
June 30, 2008, 03:01:34 pm
Here is how to find duplicate contacts one group at a time. I had this problem before and this worked for me.

Hope it helps.

http://wiki.civicrm.org/confluence/display/CRMDOC/Find+Duplicate+Contacts

FredJones

  • Guest
Re: Duplicate Search Hangs
July 01, 2008, 09:17:51 am
This trick of using a group does work, but with 20 groups it's a bit of a hack. Besides, I would prefer to fix the underlying problem if possible.

Here are my findings on my local dev server, which runs Windows 2000 on a dual-core 2 GHz Intel CPU with 2 GB of RAM. My local DB has on the order of 22K contacts. While the duplicate search ran, I watched the CPU monitor. Total usage never went past 60%. In the process list, the top two items alternated between the "System Idle Process" and httpd.exe, each taking between 47% and 50% in turn. MySQL never used more than 2%.

This went on for approximately 5 minutes, until the browser returned a blank page. No errors, just a completely blank page. I see that my php.ini had:

Code:
max_execution_time = 300;

which is indeed 5 minutes.

I then checked

Code:
select count(*) as c from civicrm_email;

and the result is 30697. Given that the SQL in my original post takes 5 log lines per email, I would expect the log to run to about 150,000 lines before this operation finishes. Mine only reached 19,000, about 1/8 of that.
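
The estimate above can be checked with a few lines of Python. This is only a worked version of the arithmetic in the post, using the figures quoted there. The constant and function names are made up for illustration.

```python
# Figures quoted in the post.
EMAIL_ROWS = 30697             # result of SELECT COUNT(*) on civicrm_email
LOG_LINES_PER_EMAIL = 5        # log lines per email, as counted in the first post
LINES_BEFORE_TIMEOUT = 19_000  # log lines written before the 300 s timeout


def expected_log_lines(rows=EMAIL_ROWS, per_row=LOG_LINES_PER_EMAIL):
    """Total log lines expected if every email row is processed once."""
    return rows * per_row


def fraction_done(lines_seen=LINES_BEFORE_TIMEOUT):
    """Share of the expected log that was written before the timeout."""
    return lines_seen / expected_log_lines()


if __name__ == "__main__":
    print(expected_log_lines())       # 153485
    print(round(fraction_done(), 3))  # 0.124
```

So the run that timed out had written only about 12% of the log lines this estimate predicts.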

So I applied:

Code:
max_execution_time = 6000;

which is 20 times the previous value, and ran the Duplicate Search again. I left the PC and came back after an hour, and it had completed successfully, with 35 pages of results in my browser and about 230,000 lines in my log. I am not sure why there are so many lines in the log, but at least on this server, raising max_execution_time did fix the problem.
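
As a rough cross-check of these numbers, the sketch below extrapolates the runtime from the logging rate of the first, timed-out run. It assumes the log grows at a constant rate, which the post does not confirm. All names are hypothetical.

```python
# Figures quoted in the post.
FIRST_RUN_LINES = 19_000    # log lines written before the first timeout
FIRST_RUN_SECONDS = 300     # original max_execution_time
SECOND_RUN_LINES = 230_000  # approximate log lines for the completed run
NEW_LIMIT_SECONDS = 6000    # raised max_execution_time


def lines_per_second():
    """Logging rate observed during the first run."""
    return FIRST_RUN_LINES / FIRST_RUN_SECONDS


def estimated_runtime_seconds(total_lines=SECOND_RUN_LINES):
    """Runtime needed to write total_lines at the observed rate."""
    return total_lines / lines_per_second()


if __name__ == "__main__":
    seconds = estimated_runtime_seconds()
    print(round(seconds / 60, 1))       # 60.5 (minutes)
    print(seconds < NEW_LIMIT_SECONDS)  # True
```

Under that assumption the full run needs roughly an hour, which agrees with the observed completion time and fits within the new 6000-second limit.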


This forum was archived on 2017-11-26.