Although we generally use Keepalive to monitor AOLserver-based Web services, it will work fine to monitor any HTTP service on a Unix machine.
cd /web
gunzip < keepalive.tar.gz | tar xvf -    # creates /web/keepalive
cp /web/keepalive/modules/nsperm/*.dat /[ns_home]/servers/keepalive/modules/nsperm/
cd /web/keepalive/modules/nsperm/
cp passwd perms /[ns_home]/servers/keepalive/modules/nsperm
WARNING: If an intruder gains unauthorized access to your Keepalive server, they will be able to execute arbitrary commands by setting the failure action. To restrict the Keepalive pages to SSL, make sure that the following line in defs.tcl is uncommented:
ns_register_filter preauth GET /keepalive/* ad_restrict_to_https
Direct your web browser to the host name specified in your AOLserver configuration. You will be redirected to the main Keepalive page. Initially no servers are listed; click the "add a service" link to add one. When adding a new service you will have to specify the following:
The service name is important because it acts as a 'primary key'. It must be unique, and it is used to identify the service when you want to modify or deactivate it later on.
The test URL is the URL that Keepalive requests to check the status of the web server it is monitoring. The page at this URL should query the database to make sure the database connection is working, and return a value that depends on whether or not the database is available. At ArsDigita we place dbtest.tcl in the /www/SYSTEM directory, where /www/ is the web page root.
Keepalive compares the value that the test URL returns with the expected result to determine whether the web server is functioning properly.
The failure action is a shell command that will presumably restart a hung web server. Read the 'Which Shell Command?' section for details. At ArsDigita we use the restart-aolserver script. Note that to get restart-aolserver working on Linux, the command ps -ef should be replaced with ps auxww.
The admin emails are a list of email addresses to notify when there are problems with the web service being monitored. Enter as many email addresses as necessary, separating them with spaces.
The pager emails work the same way as the admin emails, except that the message sent is shorter.
The number of times a web service may fail the Keepalive test before Keepalive runs the failure action script, presumably restarting the web server.
The number of times a web service may fail before Keepalive starts sending out email.
Before restarting a web server, Keepalive will attempt to retrieve this page. The page specified should be a Tcl script that writes the current status of the broken web server to the log file. If the failed web server is not accepting any connections, this won't do any good. However, when the database has failed, this is often a useful debugging tool. At ArsDigita we place log-monitor.tcl in the /www/SYSTEM directory, where /www is the web page root.
It is sometimes necessary to stop monitoring a web service temporarily, perhaps during scheduled downtime. You could remove the web service from the configuration file, but then you would have to re-enter all the information later. Instead, you can 'deactivate' a service: it stays in the system's config file but is not actively monitored.
Also please note that many of the error pages are quite rudimentary, sometimes only displaying what went wrong. In almost all cases simply press the back button in your browser and correct the problem.
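The way these fields work together can be sketched in shell. This is an illustration, not Keepalive's own code, and it rests on several assumptions. The function names response_ok, next_steps and check_service are made up for this example. The comparison is assumed to be an exact match after trimming surrounding whitespace. A threshold is assumed to be reached when the failure count is greater than or equal to it. The 30-second curl timeout is arbitrary.

```shell
# Sketch only: names and matching rules here are assumptions, not Keepalive's code.

# Succeeds when the body returned by the test URL equals the expected
# result, ignoring leading and trailing whitespace.
response_ok() {
    body=$1
    expected=$2
    # Strip leading whitespace, then trailing whitespace (POSIX expansions).
    trimmed=${body#"${body%%[![:space:]]*}"}
    trimmed=${trimmed%"${trimmed##*[![:space:]]}"}
    [ "$trimmed" = "$expected" ]
}

# Prints the steps to take after a failed check, one per line.
# Arguments: failure count, failure-action threshold, email threshold.
next_steps() {
    failures=$1
    action_after=$2
    email_after=$3
    if [ "$failures" -ge "$email_after" ]; then
        echo email
    fi
    if [ "$failures" -ge "$action_after" ]; then
        echo restart
    fi
}

# Fetches the test URL with curl and compares the body with the expected result.
check_service() {
    url=$1
    expected=$2
    body=$(curl -s --max-time 30 "$url") || return 1
    response_ok "$body" "$expected"
}
```

A monitoring loop built on this sketch would call check_service with the test URL and expected result, increment a failure counter whenever it fails, and pass that counter to next_steps to decide whether to send email or run the failure action.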
Suppose AOLserver is started from /etc/inittab with an entry such as
nsjw:34:respawn:/home/nsadmin/bin/nsd -i -c /home/nsadmin/nsd.ini
which tells Unix to restart nsd if it should die for any reason. Thus Keepalive just needs to kill the existing nsd process. The problem is that Web servers must be owned by root if they are to grab port 80, and Keepalive can't kill a Web server unless it runs as root (a security risk). The solution at ArsDigita is to build a setuid Perl script that Keepalive can call:
restart-aolserver
#!/usr/local/bin/perl
## Restarts an AOLserver. Takes as its only argument the name of the server to kill.
## This is a perl script because it needs to run setuid root,
## and perl has fewer security gotchas than most shells.
$ENV{'PATH'} = '/sbin:/bin';
# uncomment this stuff if you're at an installation where a server
# takes a long time to restart or keeps important state
# if (scalar(@ARGV) == 0) {
# die "Don't run this without any arguments!";
# }
$server = shift;
$< = $>; # set realuid to effective uid (root)
sub getpids {
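## get the PIDs of all nsd processes running the named server
my $ps_output = `/usr/bin/ps -ef`;
my @pids;
foreach (split(/\n/, $ps_output)) {
# \Q...\E quotes $server and \. matches a literal dot, so only "<server>.ini" matches
next unless /^\s*\S+\s+(\d+).*nsd.*\Q$server\E\.ini/;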
push(@pids, $1);
}
@pids;
}
@pids = &getpids;
print "Killing ", join(" ", @pids), "\n";
kill 'KILL', @pids;
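Before using restart-aolserver as a failure action, it can help to see which processes its pattern would select, without killing anything. The sketch below applies a similar filter in shell. The function name nsd_pids is hypothetical. It assumes the PID is in the second column, which is the layout of ps -ef and of ps auxww on Linux.

```shell
# Sketch only: prints the PIDs of nsd processes whose command line names
# <server>.ini. Reads ps-style text on stdin; PID assumed in column 2.
nsd_pids() {
    server=$1
    # [.] matches a literal dot without needing a backslash in the awk string.
    awk -v pat="nsd.*${server}[.]ini" '$0 ~ pat { print $2 }'
}
```

Running `ps -ef | nsd_pids arsdigita` (or `ps auxww | nsd_pids arsdigita` on Linux) lists the PIDs that would be killed for the server named arsdigita.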
lappend keepalive_monitor_list [new_monitor "arsdigita" "http://www.arsdigita.com/SYSTEM/dbtest.tcl" "success" "/usr/local/bin/restart-aolserver arsdigita" [list email_1@arsdigita.com email_2@arsdigita.com] [list your_pager@arsdigita.com] 2 2] http://www.arsdigita.com/SYSTEM/log-monitor.tcl
arsdigita http://www.arsdigita.com/SYSTEM/dbtest.tcl success {/usr/local/bin/restart-aolserver arsdigita} {email_1@arsdigita.com email_2@arsdigita.com} your_pager@arsdigita.com 2 2 http://www.arsdigita.com/SYSTEM/log-monitor.tcl 1