PNG TIME

ipblocker

4/10/2012

crashing fixed

A while back we decided to upgrade our primary file server to Windows
Server 2008 R2. As soon as we did, it started crashing, randomly, about
once a week. Suddenly the file and printer shares would go offline.

Each week we'd try something new and wait.
Then we virtualized it, in preparation to rebuild it... and then we
wondered, 'maybe virtualizing it made it worse?'

We tried everything, not excluding writing back to friends in the U.S.
for ideas.

Randomly throughout the week, ALL our printers and ALL our files would
go offline. 5TB worth. And the only fix was a total reboot.

In my head I was building the unix file server already. We were at our
wits' end on this issue, having read every forum, every article, every
bit of data we could find. No error logs, no errors, no symptom other
than file sharing going offline and the service crashing.

And then.... we were able to isolate the problem. After the crashes
increased, we were able to run perf mon tools and found that right
around the time of each crash, someone had accessed the same folder on
the same drive.

We scanned the folder and found 4.5 million tiny files in it.
Sometime over 6 years ago, someone had run a program that fragmented
files into millions of tiny little files. No idea why this was done or
what was supposed to be achieved by it.
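
For anyone hunting a similar problem, here is a minimal sketch of how a scan for oversized folders could look. This is not the tool we used; it is written in Python purely as an illustration, and the function name find_large_dirs and the threshold value are made up for this example. It counts entries one at a time so it never has to hold millions of file names in memory.

```python
import os

def find_large_dirs(root, threshold=100_000):
    """Return (directory, entry_count) pairs for directories under root
    that hold at least `threshold` entries. Hypothetical helper."""
    results = []
    stack = [root]
    while stack:
        current = stack.pop()
        count = 0
        try:
            # os.scandir yields entries lazily instead of building a full list
            with os.scandir(current) as entries:
                for entry in entries:
                    count += 1
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
        except OSError:
            # Skip directories we cannot read
            continue
        if count >= threshold:
            results.append((current, count))
    return results
```

Running something like find_large_dirs on a data drive, preferably locally on the server rather than over a share, would list any folder that has quietly grown out of control.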

So delete the files, right?

How? If we accessed the files through Windows, file sharing crashed, AND
they were totally un-deletable.

The only way we could effectively delete them was from the command
prompt.

It took more than a full day to run the scan and delete all the
files... but it worked. And as soon as they were all gone... the server
was happy again, and has been for weeks.
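
We did the cleanup from the command prompt, and I am not reproducing the exact commands here. The same idea, deleting files one at a time without ever asking anything to list or display the whole folder, could be sketched in Python roughly like this. The function name delete_files_in_dir is hypothetical, and this is an illustration under those assumptions, not what we actually ran.

```python
import os

def delete_files_in_dir(path):
    """Delete every regular file directly inside `path`, one at a time.
    Subdirectories are left alone. Returns the number of files deleted.
    Hypothetical helper for illustration only."""
    total = 0
    while True:
        deleted = 0
        # Stream entries lazily; never build a list of millions of names
        with os.scandir(path) as entries:
            for entry in entries:
                if entry.is_file(follow_symlinks=False):
                    try:
                        os.unlink(entry.path)
                        deleted += 1
                    except OSError:
                        # Leave files we cannot delete and keep going
                        pass
        total += deleted
        # Repeat passes in case the directory listing skipped entries
        # while files were being removed; stop when nothing was deleted
        if deleted == 0:
            return total
```

Anything like this deserves a test run on a copy of the folder first, since there is no undo.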

I'm not bashing Windows.... but I am saying... the command line still
has its place.

Why do I mention this?
Because often in the world of computing, answers come immediately. One
of the skill sets necessary to excel in mission tech is patience and
longevity. We had been working this particular problem for weeks.

Another example of this is the VSAT that recently came online. I've
been working that since October of last year.

Patience, as defined in my context: I'm a computer guy who moved to a
third world country. Someone who valued speed and quick resolution...
getting things done... moved to a world that does not value speed.
Patience as defined by me is... the ability to not get angry when
something that should take minutes to accomplish takes months. To not
let it drive your blood pressure up, to not let it eat at you, and to
never utter the phrase 'this would be SO MUCH FASTER where I come from!'

My God is teaching me patience. He has been for 5 years, if not my whole
life.

I have also recently learned another survival technique for living here.
HAVE something in your life, where at least once a day you can have success.
For some that's cooking. No matter how many things at work break....
when they go home, they know they can cook a decent meal.

Find something you can be successful at every day, otherwise the
daily failures will overwhelm you.
For me that thing is repair.
If I can fix something... a broken toy, or a burned out light bulb....
just one thing, then I feel I've somehow stemmed the onrush of broken
things coming at me... and at least fixed 1 thing today.

-chad