Kernel_Killer
December 18th, 2007, 02:19
I was wondering if anyone has ever dealt with a script that makes sure an app is always running. I figure I could set up a cron job to check every 5 minutes and, if the app isn't running, start it again. Just wondering if there's a better way of doing this. It's a normal user process, by the way.

molotov
December 18th, 2007, 06:08
Do you have access to the source of the process you want to check? And when you say "running constantly", do you want to check whether the application is simply still running, or whether it is still working? (E.g. parsing ps | grep apache vs. a netcat script that does an HTTP GET.)

The cron job would be the easiest, and the best I can think of, but there is one alternative: if the process creates a pid/lockfile anywhere, you can monitor the status of that file with a couple of file system utilities (I think Strog/bmw/elmore/someone mentioned one 6 months or so ago), and trigger events off the deletion of that file. If it doesn't create one and you have access to the source, you can make it create such a file.

If you're going to go the timed check route, and you want to check very often, it might make more sense to write something small in C to just do a syscall on the target process directly instead of a shell script to parse ps/whatever.

bmw
December 18th, 2007, 13:48
What molotov said. But assuming you merely want to see whether a simple process is still active (i.e. it would be seen in ps output), a handy method to run every 5 mins is:


1. Get the app's process-id, often from /var/db; let's call it $pid.
2. Do a /bin/kill -0 $pid and check the error return from kill (either the commandline version or the syscall). [That kill option is dash-zero BTW.]
3. If the error return is ESRCH (kill(2)) or non-zero (/bin/kill) then the process doesn't exist -- it's no longer running.
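
The steps above can be sketched as a small shell function. This is only a minimal sketch under assumptions: the function name pid_alive and the pid-file path in the usage comment are made up for illustration, and it uses the shell's built-in kill rather than /bin/kill. Note that kill -0 also fails when the process exists but belongs to another user, which should not matter for a normal user process checking its own app.

```shell
#!/bin/sh
# pid_alive PIDFILE: return 0 if the pid stored in PIDFILE names a
# process we are allowed to signal, non-zero otherwise.
pid_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1                # no pid file: treat as "not running"
    pid=$(head -n 1 "$pidfile" | tr -cd '0-9')   # keep only the digits of the first line
    [ "${pid:-0}" -gt 0 ] || return 1            # empty or garbage pid file
    kill -0 "$pid" 2>/dev/null                   # signal 0: existence check, nothing is delivered
}

# Hypothetical use from a cron-launched script:
#   if ! pid_alive /path/to/app.pid; then
#       echo "app is not running, restarting"
#   fi
```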


A further check is that if either there's no pid-file in /var/db (and you expected one), or the above check suggests that there's no process, before you launch your replacement process, you might want to run ps and grep the output to check in case something simply happened to your pidfile. If that check detects the process, it might be wise to kill it and restart anyway to restore sanity.

If you're coding up a cron-launched shell-script to do this, an alternative method is to check the error return from killall -0 appname. This works well if your app is not well-behaved and doesn't leave its pid in a /var/db file.

Kernel_Killer
December 18th, 2007, 15:59
Here's what I have, and for some reason, it's having issues:

#!/usr/local/bin/bash
tag=`date | cut -d " " -f 4 | sed s/:/./g`
ascent="/home/vile/ascent-current/bin"
aproc=`ps ax | grep ascent | wc -l`
lproc=`ps ax | grep logonserver | wc -l`

if [ $aproc -lt 1 ]; then
    mv $ascent/ascent.core $ascent/cores/$tag.core
    cd $ascent
    ./ascent &
fi

if [ $lproc -lt 1 ]; then
    mv $ascent/logonserver.core $ascent/cores/ls$tag.core
    cd $ascent
    ./logonserver &
fi

exit



Now, if I run the script by hand from the directory, everything is fine. If I let the cron job start it, it seems to fail. It starts the process, but I'm not sure whether it's starting, crashing, and starting again... Going to add an echo so the output gets sent by mail.
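
One possible contributor to the odd behaviour (a guess, not a diagnosis): ps ax | grep ascent usually lists the grep process itself, so the count rarely drops below 1, and any other command line containing the string also matches, which could include the watchdog script if cron runs it by a path that contains "ascent". The sketch below filters the grep line out before counting. The function name count_procs is made up for illustration.

```shell
#!/bin/sh
# count_procs NAME: read `ps ax`-style lines on stdin and print how many
# mention NAME, skipping any line that belongs to a grep command.
count_procs() {
    # tr strips the leading padding that some wc implementations print
    grep -- "$1" | grep -v 'grep' | wc -l | tr -d ' '
}

# Intended use in the script above (ps output piped in):
#   aproc=$(ps ax | count_procs ascent)
```

This still counts every command line that contains the name, so checking the app's own pid file is likely the more reliable route.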

bmw
December 19th, 2007, 14:57
KK, I'm suspicious about those apps: are they really properly coded for being daemons? Try an experiment: launch one from the terminal, then log out of that terminal. Does the app keep running? If that works, then launching it from cron should also work.

True daemons do things like closing all open file descriptors (especially 0, 1 and 2 [stdin, stdout and stderr]) and relinquishing their controlling terminal. This requires two forks and some assorted futzing around.

An app that needs to be run as a daemon but that wasn't completely coded to do that can be launched using the "daemon" command -- see "man 8 daemon". E.g.:

$ daemon -cf /home/vile/ascent-current/bin/ascent

Kernel_Killer
December 19th, 2007, 19:27
They aren't true daemons at all. I either have to put them in the background or run them from screen. Regardless, something is not right when starting them from the script.

Kernel_Killer
December 20th, 2007, 07:10
Thanks a ton. Going to look at this some more.

Kernel_Killer
December 20th, 2007, 07:27
Here are a few notes:

Running as a daemon didn't work. I of course had to run with:

daemon -cf ./ascent

EDIT: Even with full path it doesn't work

The reason is that some of the files it looks for are in the PWD. I'll test it some more to see if it will work another way. The app itself does create a pid file for each process though, so I think I'll get rid of the ugly grep and use that. Thanks again. :)

Kernel_Killer
December 20th, 2007, 12:13
Well, I feel stupid. I've looked through lots of bash scripts, and can't seem to make the 'pidof' function work at all. Sigh. Any examples? I'm going to look some more.

bmw
December 21st, 2007, 12:54
What's "pidof" attempting to do exactly?

BTW: if you are using daemon and your app needs to stay in the current working directory, leave off the -c option. That might fix your problem above.
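
If the daemon command turns out not to fit, the working-directory requirement can also be handled in plain shell. A minimal sketch, assuming the app only needs to be started from its own directory with its standard streams detached; the function name start_in_dir is made up for illustration, and this does not do the full double-fork daemonization described earlier.

```shell
#!/bin/sh
# start_in_dir DIR CMD [ARGS...]: run CMD in the background with DIR as
# its working directory and stdin/stdout/stderr pointed at /dev/null.
# Returns non-zero if DIR cannot be entered.
start_in_dir() {
    dir=$1
    shift
    # The subshell keeps the cd from affecting the calling script.
    (
        cd "$dir" || exit 1
        "$@" </dev/null >/dev/null 2>&1 &
    )
}

# Hypothetical use, with the path taken from the script earlier in the thread:
#   start_in_dir /home/vile/ascent-current/bin ./ascent
```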

Kernel_Killer
December 22nd, 2007, 06:24
I was trying to use pidof with an if statement, based on a few examples I found, but they didn't work. I'm trying to remember what they were.


ascent=ascent
pid1=`pidof $ascent`



There are a few other ways as well, but I can't remember them, and my shell history is far too short.
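
For reference, a hedged sketch of using a pid lookup in an if statement. It assumes pidof may simply not be installed (it is not available everywhere) and falls back to pgrep -x, which matches the process name exactly. The function name find_pid is made up for illustration. The key point is that the if tests the command's exit status, not the text it prints.

```shell
#!/bin/sh
# find_pid NAME: print the pid(s) of processes named NAME and return 0,
# return 1 if none are found, or 2 if neither lookup tool is installed.
find_pid() {
    if command -v pidof >/dev/null 2>&1; then
        pidof "$1"
    elif command -v pgrep >/dev/null 2>&1; then
        pgrep -x "$1"
    else
        return 2
    fi
}

# In an if statement, the assignment passes through the exit status:
#   if pid1=$(find_pid ascent); then
#       echo "ascent is running as pid $pid1"
#   else
#       echo "ascent is not running"
#   fi
```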

Kernel_Killer
December 22nd, 2007, 08:02
Well, it seems to be working, somewhat, but the daemon command is keeping it alive after crashes. I'll check the man page some more, but I'm guessing I'll have to initiate the server without the daemon command, since when it crashes, it stays alive. :\

EDIT: This is getting quite annoying. lol. Even if the script is set to run as './ascent &', when it crashes, it stays alive. If I start in screen, once it crashes, it dies.