Check your assumptions

I just went through one of those random Linux trial-by-fire exercises.  I had two web servers, one cloned from the other, behaving differently: One would send emails, the other wouldn’t.

I walked through the tree of everything that could be the problem: user-installed software, system software, system configuration, right down to logs and individual files…  and only then realized the most obvious source of the difference:  The processes that handle mail had died on one server, but not on the other.

Yes, two hours of debugging mail handling on a Linux machine, only to discover that I could have figured this out with “ps aux | grep mail” in about 10 seconds, had I known what to look for.
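For next time, here is a rough sketch of the 10-second check as a small script. It assumes pgrep (from procps) is installed, and the daemon names in the loop are only guesses at common mail software, not what was actually running on my servers; check_running is just a helper name I made up. Swap in whatever your machine uses.

```shell
#!/bin/sh
# Report whether any process whose name matches the given pattern is running.
# pgrep without -f matches against the process name only, so it will not
# accidentally match this script's own command line.
check_running() {
    if pgrep "$1" >/dev/null 2>&1; then
        echo "$1: running"
    else
        echo "$1: NOT running"
    fi
}

# Hypothetical list of mail daemons; these names are assumptions.
# Note that some mail systems run under a different process name
# than the package name, so adjust the list for your setup.
for name in sendmail exim; do
    check_running "$name"
done
```

Running this on both servers and comparing the output would have shown the difference right away.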

Well, that’s the nature of troubleshooting – the answer is always obvious after you find it.

On the bright side, it means my lab has a shiny new blog to play with – and it seems like everything is working now.


