Tuesday, 23 March 2010

Remembering the importance of server reboot.

Too often in a support situation, people poke fun at requests for screenshots and reboots, seeing them as either delaying tactics or lazy troubleshooting. As a result, the humble server reboot is sometimes overlooked while we try too hard to find another fix for the problem at hand.

Earlier this week a client got in touch about a problem they encountered following some issues with their Cognos service account on their Windows 2003 servers.  Initially all of the Contributor applications were unavailable, each giving a message on the web page stating that:

"The application definition is being updated on the server.  Please try again in a little while"

Very polite, but not too helpful.  As far as we knew the application definition was in no need of being updated.  But try again in a little while we did.

Four out of five applications did indeed become available, but the last one refused. We viewed the error logs, only to find no specific help there. We tried a Go to Production (GTP), a Synchronise, then a GTP again. Each time the GTP was successful unless a reconcile job was required, at which point the reconcile failed to complete even a single e.list item.

"The application must be corrupted," we declared.  "I suggest that you restore the database from a backup file."  And so within 30 minutes a backup file from Friday night was restored and we tried again.

"The application definition is being updated on the server.  Please try again in a little while" the server responded.

By now time was getting on and nobody wanted to stay on the phone, so we offered to take the databases off-site and test them on our own servers, to limit the drain on the customer's time of course.  And so we suggested the following:
  1. Try a reboot of all the servers if possible
  2. If the reboot is unsuccessful, please upload the backup files so we can try them ourselves
An hour later we received an email saying the servers had been rebooted and the application was now working.  The customer quite rightly said it was a bit of a shame, in a way, that we had not tried this earlier in the day.  Though I have to wonder whether it would have been greeted so warmly had we suggested rebooting the servers as a first stab at fixing the problem.

So maybe we should remember that rebooting the server is not always the lazy option, but simply a way of ensuring that all the cobwebs have been blown out before resorting to more thorough investigations.
