CiviCRM Community Forums (archive)


This forum was archived on 25 November 2017. Learn more.

Topic: Cronjob interrupted, couldn't be resumed. Rogue table lock was the cause.

nkinkade

July 13, 2009, 06:19:14 pm
Earlier today we created and scheduled a fairly large mailing (4000+ recipients). It was then triggered by manually accessing civimail.cronjob.php via a web browser. The person who ran it contacted me to say that the script never returned and the page never loaded. They waited for over an hour before giving up and closing the tab that was accessing the script. Afterward, the mailing was nowhere to be found in the CiviCRM UI: not as a Draft, not as Scheduled, and not as Completed. It's not clear why the script never returned, but I suspect there was a network outage or a brief loss of connectivity.

I poked around in the database and found that the is_completed flag was set in the civicrm_mailing table, but that the status was still "Running" in the civicrm_mailing_job table.  Re-running the script did nothing.  It returned no errors, but neither did it finish the mailing.
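For anyone wanting to check for the same inconsistent state, here is a minimal sketch of the kind of query I mean. It runs against an in-memory SQLite stand-in rather than a real CiviCRM database. The `mailing_id` join column and the cut-down table layouts are assumptions for illustration; only `is_completed` and `status` are the columns mentioned above.

```python
import sqlite3

# Mailings flagged as completed while one of their jobs is still "Running".
# NOTE: the mailing_id join column is an assumption made for this sketch.
INCONSISTENT_SQL = """
SELECT m.id, j.id, j.status
FROM civicrm_mailing AS m
JOIN civicrm_mailing_job AS j ON j.mailing_id = m.id
WHERE m.is_completed = 1
  AND j.status = 'Running'
ORDER BY m.id, j.id
"""


def build_demo_db():
    """Create a tiny stand-in database with hand-written sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE civicrm_mailing (
            id INTEGER PRIMARY KEY,
            is_completed INTEGER
        );
        CREATE TABLE civicrm_mailing_job (
            id INTEGER PRIMARY KEY,
            mailing_id INTEGER,
            status TEXT
        );
        INSERT INTO civicrm_mailing VALUES (1, 1), (2, 1), (3, 0);
        INSERT INTO civicrm_mailing_job VALUES
            (10, 1, 'Complete'),
            (11, 2, 'Running'),
            (12, 3, 'Running');
        """
    )
    return conn


def find_inconsistent(conn):
    """Return (mailing id, job id, job status) rows in the odd state."""
    return conn.execute(INCONSISTENT_SQL).fetchall()


if __name__ == "__main__":
    print(find_inconsistent(build_demo_db()))
```

In the sample data only mailing 2 is in the odd state: it is flagged as completed while its job is still Running.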

On #civicrm, Lobo determined that the cause was the script being unable to get a lock on a table. (Thanks, Lobo!) I found the culprit MySQL thread and killed it, after which running the cronjob again allowed the job to finish.
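In case it helps someone else, this is roughly how a culprit thread could be picked out of the output of `SHOW FULL PROCESSLIST`. It is a sketch, not what I actually ran. The rows are hand-written sample data, the function names and the 600-second threshold are made up, and it only suggests `KILL` statements for a human to review rather than executing anything.

```python
def suspect_threads(rows, min_seconds=600, db=None):
    """Return ids of threads that have sat in their current state too long.

    rows: dicts keyed by the SHOW FULL PROCESSLIST column names
          (Id, User, Host, db, Command, Time, State, Info).
    min_seconds: hypothetical threshold; tune it for your own site.
    db: if given, only consider threads attached to that database.
    """
    suspects = []
    for row in rows:
        if db is not None and row.get("db") != db:
            continue
        if int(row["Time"]) >= min_seconds:
            suspects.append(int(row["Id"]))
    return sorted(suspects)


def kill_statements(thread_ids):
    """Build KILL statements for review; nothing is executed here."""
    return ["KILL %d;" % tid for tid in thread_ids]


# Hand-written sample rows, not real output from our server.
SAMPLE_ROWS = [
    {"Id": 101, "User": "civicrm", "Host": "localhost", "db": "civicrm",
     "Command": "Sleep", "Time": 5400, "State": "", "Info": None},
    {"Id": 102, "User": "civicrm", "Host": "localhost", "db": "civicrm",
     "Command": "Query", "Time": 0, "State": "", "Info": "SHOW FULL PROCESSLIST"},
    {"Id": 103, "User": "other", "Host": "localhost", "db": "other",
     "Command": "Sleep", "Time": 9000, "State": "", "Info": None},
]

if __name__ == "__main__":
    for statement in kill_statements(suspect_threads(SAMPLE_ROWS, db="civicrm")):
        print(statement)
```

A long-idle thread is not necessarily the one holding the lock, so check each candidate before killing anything.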

However, this raises several questions for me:

1) Why did the table lock get held open?  If it's the job of civimail.cronjob.php to both open and release the lock, then there needs to be some safeguard to prevent the lock from staying open permanently if the script is interrupted.

2) Should the is_completed flag have been set if the status of the job was still Running?  That seems counterintuitive.  Should the code have prevented that, or is it normal?

3) Why did the job totally disappear from the UI while it was in that state?
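To illustrate the kind of safeguard I have in mind for question 1, here is a rough sketch of a lock that expires on its own if the holder never releases it. This is purely hypothetical and is not how CiviCRM or MySQL actually handle locks; the `LeaseLock` class and its parameters are invented for the example.

```python
import time


class LeaseLock:
    """In-memory lock whose ownership lapses after ttl_seconds.

    Hypothetical illustration only: a real cron script would need the
    lease stored somewhere shared, such as a database row.
    """

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.owner = None
        self.expires_at = 0.0

    def acquire(self, owner, now=None):
        """Take (or renew) the lock; fail if another owner's lease is live."""
        now = time.time() if now is None else now
        held_by_other = self.owner is not None and self.owner != owner
        if held_by_other and now < self.expires_at:
            return False
        self.owner = owner
        self.expires_at = now + self.ttl
        return True

    def release(self, owner):
        """Release the lock, but only if the caller currently owns it."""
        if self.owner == owner:
            self.owner = None
            return True
        return False


if __name__ == "__main__":
    lock = LeaseLock(ttl_seconds=60)
    print(lock.acquire("first-run", now=0))    # first run takes the lock
    print(lock.acquire("second-run", now=30))  # blocked while lease is live
    print(lock.acquire("second-run", now=61))  # stale lease can be taken over
```

With something like this, an interrupted run would block later runs only until its lease ran out, rather than indefinitely.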

Thanks,

Nathan

Donald Lobo

July 13, 2009, 08:04:27 pm

1. MySQL is supposed to manage the locks and release them if the PHP process gets killed or aborts. It seems MySQL did not know that the PHP process had long since died, and it still had a process running. I think this is a bit beyond CiviCRM.

2 and 3 are related. Since the is_completed flag was set but the job status did not match, the mailing was suppressed (since it's an unexpected condition). We'll need to do some research and figure out why this happened. I did check the code, and the only place is_completed is set is the same place the job status is set to complete, so I'm a bit lost here :)

lobo

This forum was archived on 2017-11-26.