tarballed
April 21st, 2004, 12:36
Hello everyone.

A very interesting thing happened this morning when I came into work. One of my FreeBSD 4.9 servers, which runs Samba (mostly as a WINS server), was powered off when I came in...

After doing some research, I found this entry when I typed 'last' at the command line:

[code]jwilliams ttyp0 192.168.1.90 Tue Apr 20 11:28 - crash (21:07)[/code]

Which did not make me feel warm and fuzzy.

So now I need to track down why this server crashed. For reference, the IP address listed above is my workstation's IP address, not the server's.

So what I wanted to know is how to do some research and see why this server crashed.
What are the suggested steps to take?
What should I be looking for?
..and so forth.

Any help is greatly appreciated.

Thanks,

Tarballed

frisco
April 21st, 2004, 14:54
Read through the logfiles (/var/log/* and your app's logfiles if separate) from around 8:35am on Wednesday morning (11:28+21:07). Check dmesg for anything strange. What do you do to record system status (load average, vmstat, etc.)? Look through those too. Was there a power failure at the site? Microwaves and coffee machines might help here: a clock flashing 12:00 means the power went out. Consider setting up auto-reboot in the OS/BIOS if necessary. Consider remote monitoring via nagios or another product. Is the server in a secure location, or could someone have "accidentally" powered it off? A secure location includes everything the machine needs to function: if the server is in a locked-down server room but the power mains are accessible to anyone outside, then the server is not fully secure.
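
In case the 8:35am figure looks like magic, here is a rough sketch of the arithmetic in Python. The login time and session length are read straight off the 'last' line; the year is not in that output, so 2004 is my assumption based on the dates in this thread, and the variable names are just for illustration.

```python
from datetime import datetime, timedelta

# Fields read off the `last` entry: "Tue Apr 20 11:28 - crash (21:07)".
# `last` does not print the year; 2004 is assumed from the thread dates.
login = datetime(2004, 4, 20, 11, 28)

# "(21:07)" is the session length as hours:minutes.
duration = timedelta(hours=21, minutes=7)

# Login time plus session length gives the approximate time the session
# record ended, which is where to start reading the logs.
crash = login + duration
crash_str = crash.strftime("%a %b %d %H:%M")
print(crash_str)
```

This only tells you roughly when the session ended, not why the machine went down, so treat it as a starting point for where to look in /var/log.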

tarballed
April 29th, 2004, 21:04
Found out the problem...you guys won't believe what it is...

I'll post in detail this weekend. :)

Tarballed

tarballed
April 30th, 2004, 12:39
Ok...here is what the problem was.

I have 3 FreeBSD servers that are all plugged into an APC. The APC plugs into a big surge protector, which then plugs into the outlet in the wall.

The outlet was just recently put in place.

Anyway, like I said, I would come in early in the morning and see that my 3 servers were all powered down. I checked everything: logs, power supply, swapped out RAM...everything I could think of on the server side, I checked...

Well, I came in very early one morning, flipped on the lights to the suite, then walked into my office. When I walked in, I noticed the servers had started booting up....

The problem: The outlet was wired completely wrong, and when you shut off the lights to the office, it would cut power to the outlet my servers were plugged into...

Fixed the problem, but lost some sleep...

Thought you guys would get a kick out of it.

Tarballed

ealwen
April 30th, 2004, 22:26
Oops. I can hear the call to the electrician now: "Ya, that was the problem, and you are gonna pay for it this time."