So I set up a simple script to send an email alert when a certain web service stops running.
The flow is simple:
test=$( curl [address] | grep [a certain string in response] | wc -l )
if [ "$test" -ne 1 ]; then
echo "there has been an error" | mail -s "Error" -t "[my-mail-address]"
fi
In crontab, it is set to run the check once every five minutes:
*/5 * * * * sh /path/to/script/
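For reference, here is a slightly more defensive sketch of the same check. It is only a sketch under assumptions: the function names (count_matches, alert_message, run_check) and the 30-second limit are hypothetical, and the recipient is passed to mail as a plain argument rather than with -t. It caps how long curl may wait and puts a timestamp in the mail body, so that delayed mails can be told apart from fresh ones.

```shell
#!/bin/sh
# Sketch only: function names and the 30-second limit are hypothetical,
# not part of the original script.

# count_matches PATTERN: read a response on stdin and print the number of
# lines that match PATTERN.
count_matches() {
    # grep -c prints 0 but exits with status 1 when nothing matches;
    # "|| true" keeps that from being treated as a failure.
    grep -c -- "$1" || true
}

# alert_message: print the mail body, including the time the check ran.
alert_message() {
    echo "there has been an error at $(date +'%d-%m-%y %H:%M')"
}

# run_check URL PATTERN RECIPIENT: perform one check and mail on failure.
run_check() {
    # -s hides the progress meter; --max-time aborts a request that hangs.
    count=$(curl -s --max-time 30 "$1" | count_matches "$2")
    if [ "$count" -ne 1 ]; then
        alert_message | mail -s "Error" "$3"
    fi
}
```

The script would then end with a single call such as run_check [address] [a certain string in response] [my-mail-address]. If overlapping runs turned out to matter, the crontab command could additionally be wrapped in flock -n with a lock file (flock is part of util-linux; the lock file path would be my own choice).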
It worked well for a couple of days, but about ten minutes ago almost a hundred e-mails from the server arrived simultaneously. That does not seem possible, since there are no loops in the script.
Syslog:
Jan 26 01:05:01 sv1 CRON[23310]: (munin) CMD (if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi)
Jan 26 01:10:01 sv1 CRON[23815]: (munin) CMD (if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi)
Jan 26 01:12:12 sv1 kernel: [5962667.417178] [ 1106] 0 1106 5914 168 17 0 0 cron
Jan 26 01:12:12 sv1 kernel: [5962667.417250] [27493] 0 27493 14949 224 34 0 0 cron
Jan 26 01:12:12 sv1 kernel: [5962667.417252] [27939] 0 27939 14949 224 34 0 0 cron
Jan 26 01:12:12 sv1 kernel: [5962667.417254] [28436] 0 28436 14948 224 34 0 0 cron
Jan 26 01:12:12 sv1 kernel: [5962667.417256] [28943] 0 28943 14949 224 34 0 0 cron
Jan 26 01:12:12 sv1 kernel: [5962667.417258] [29408] 0 29408 14949 224 34 0 0 cron
...
* This continues for more than 800 lines with similar timestamps (until 01:12:24). The timestamps of these lines coincide with the simultaneous mails. That is odd, because cron is scheduled to run every five minutes, which accounts for the first two lines. The lines starting at 01:12:12 are the suspicious ones.
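To see how that burst is spread over time, the kernel lines that name cron can be counted per second. This is a minimal sketch that assumes the syslog layout shown above (timestamp in the third field, "kernel:" in the fifth, process name last); count_cron_lines is a hypothetical helper name.

```shell
#!/bin/sh
# count_cron_lines: read syslog lines on stdin and print "HH:MM:SS count"
# for kernel lines whose last field is "cron". Hypothetical helper.
count_cron_lines() {
    awk '$5 == "kernel:" && $NF == "cron" { n[$3]++ }
         END { for (t in n) print t, n[t] }' | sort
}
```

It could be run as count_cron_lines < /var/log/syslog, where the log path is an assumption about this server.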
Update:
Just brought the service down again and let cron and the script do their job. A single mail was sent.
As the test is a very simple true/false, I am struggling to figure out what kind of special circumstances would result in multiple mails being sent simultaneously.
59 */6 * * * script.sh | mail -s "Subject of Mail" someother@address.com http://www.nixtutor.com/linux/sending-email-alerts-through-cron/? – 030 Jan 25 '15 at 17:31
man 5 crontab and look for MAILTO. example. Does not explain the number of emails - logically if there were 124 emails received and this is the only script sending emails - it was called 124 times. 123 not via cron. – AD7six Jan 25 '15 at 17:37
tail -f /var/log/cron. What time did you receive 124 emails? Could you check the log around that time? – 030 Jan 25 '15 at 17:41
the response from the API takes a long time. Is this caused by the API or the grep command? The grep command time could be shortened (see updated answer) – 030 Jan 25 '15 at 22:27
echo "there has been an error" to echo "there has been an error at `date +'%d-%m-%y %H:%M'`" (or whatever format you like) to avoid such problems in future. – Nehal Dattani Jan 26 '15 at 09:37