Tuesday, January 09, 2007

I know what you did last summer...

For most people this summer holiday was a time to get away from home. For me it was a time to get back to my geek roots :)

My home network needed a major overhaul and I couldn't think of a better time to do it than the Xmas break. The main idea was to rebuild the storage servers that hold all my legal Linux ISOs and stuff. The two one-terabyte servers I had were clearly not enough any more. They were too slow, too hot, too noisy and too power hungry. The solution was to build a single server with more storage and fewer individual drives. So I went shopping. I already had lots of parts I could re-use from old machines. All I needed were the new drives and a RAID controller to suit my requirements.

My first requirement was that I needed over 2TB of redundant storage. I was shooting for around 3TB of total storage. After doing some basic maths on price per gigabyte I settled on 8x400GB drives and an 8-port RAID5 SATA controller. This would give me just over 2.5TB of redundant storage. The rest could be made up of non-redundant drives as more space is required.
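For anyone who wants to check the maths, here is a rough sketch of the capacity sum in Python. The function name is made up for illustration, and I'm assuming the "just over 2.5TB" figure is what you get once the 2.8TB of decimal, drive-label gigabytes is shown in the binary units an operating system typically reports.

```python
def raid5_usable_bytes(drive_count: int, drive_bytes: int) -> int:
    """Usable space of a RAID5 array: one drive's worth of capacity goes to parity."""
    if drive_count < 3:
        raise ValueError("RAID5 needs at least 3 drives")
    return (drive_count - 1) * drive_bytes


GB = 10**9    # decimal gigabyte, as printed on the drive label
TIB = 2**40   # binary terabyte, as most operating systems report it

usable = raid5_usable_bytes(8, 400 * GB)
usable_decimal_tb = usable / 10**12
usable_binary_tb = usable / TIB

print(f"{usable_decimal_tb:.2f} TB decimal, {usable_binary_tb:.2f} TiB binary")
# prints "2.80 TB decimal, 2.55 TiB binary"
```

So eight drives buy seven drives' worth of space, and the eighth pays for the redundancy.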


The second requirement was that the server had to have the grunt to stream HD video over the network to my media center and other PCs. So this time I couldn't settle for some budget PCI RAID controller. After reading lots of reviews I had to go upmarket a little and get a PCIe x4 card, as well as find a decent PCIe-based motherboard. At first I was planning on getting a proper server motherboard, as standard boards usually lack PCIe x4 slots, but then I spotted this. To my surprise this board from DFI featured a full-speed x4 port and cost a fraction of what I would pay for a proper server board. So I grabbed one. I already had a spare Athlon 64 3700+, some RAM, a video card and a couple of 10,000RPM SATA drives to use as a system drive.

The last requirement was to make the system as quiet as possible while still making sure nothing overheats (a constant problem with the old servers). I had already ended up with fewer than half of the drives I used to have in the old servers, so that would make it quite a bit quieter and save some power too.
Now, cooling this system efficiently was always going to be a problem. I wanted to use as many large, low-RPM fans as possible (fast smaller fans make more noise for little effect). I settled on 2x120mm intake fans, 3x80mm exhaust fans and a 120mm CPU fan on top of a heatpipe-based cooler.


So here is my final solution in all its glory. Well sorta, some assembly was still required. This is where all the problems began.


Firstly, the CPU cooler ended up being way too massive to fit inside my server case, so I had to revert to standard cooling and a smaller 80mm fan. Second, and by far the worst: while disassembling the old server (while it was still running) I miscalculated exactly how many screws were holding the hard drive cage to the case and accidentally removed the last screw. This was a bad idea because it sent two hard drives into a half-meter freefall to the floor. Upon impact one of the drives died instantly while the other suffered some mortal wounds. In other words, it refused to spin up properly, giving out lots of clicking and crunching sounds as it tried. This was a problem. Both drives were part of the same RAID5 array, so losing one drive wasn't a problem, but I couldn't lose both as that would lose all the data on the array.

Now, I don't know where from, but I knew a simple trick that had a chance of bringing my drive back to life for a short amount of time. A shot of adrenaline, so to speak, to give the drive just that little bit of strength, if only for a short time (hopefully long enough to copy all the data off the array). The trick involves deep freezing the drive. No, "Deep Freezing" is not some technical geeky term for anything. You take a drive, wrap it in an anti-static plastic bag and chuck it into a freezer overnight. I had never tried this before in my life, but I had heard and read lots of success stories about this method and I was willing to try it. So I present to you exhibit A.
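Why one dead drive is survivable on RAID5 but two are not comes down to parity. Below is a toy sketch in Python, not how any real controller is implemented. The block contents and the `xor_blocks` helper are made up for illustration, and real RAID5 rotates the parity across all the drives rather than keeping it on one.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks (one per drive) plus their parity block.
data = [b"disk", b"two!", b"3333"]
parity = xor_blocks(data)

# Lose one drive (the second one here): XOR the survivors with the parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])

# Lose two drives (the first and second): all that can be recovered is the
# XOR of the two missing blocks, which cannot be split back into either one.
leftover = xor_blocks([data[2], parity])
print(leftover == xor_blocks([data[0], data[1]]))
```

Both checks print `True`: a single missing block can be rebuilt exactly, while two missing blocks leave only their combined XOR, which is why the second drive had to be coaxed back to life.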

While the injured drive was chilling, it was time for me to build the new server. Whenever I build a new machine I prefer to lay out all the parts on a desk or the floor and connect them right there, not inside a case. This allows for much better access to all the parts in case I need to move things around.

This was a great idea this time as I kept hitting more problems. The original plan was to run two 10,000RPM drives in RAID1 off the motherboard's onboard controller as a system drive and 8x400GB off the PCIe card. So I plugged it all in, defined all the arrays, popped in the OS CD, started the install and popped in the RAID driver floppy. Everything was going smoothly until the first reboot. BAM! "Windows could not read from the boot disk" error. I was kinda surprised as it had no problem detecting the RAID0 array during the install process. I double-checked that I was giving it the right RAID driver floppy, all the cables, everything. Tried the process a couple more times with the same result. At this stage any respectable tech will tell you: to find the cause of the problem, strip the PC down to its most basic parts and try the process again, add one extra part, and repeat until the problem occurs. So it begins. Installing Windows over and over again is probably the most boring thing you can imagine, but I had to do it.

First try: no PCIe card, single sata system drive - works fine
Second try: no PCIe card, RAID0 system drive - works fine, to my surprise. Finished the Windows install. Hmm... I thought. It must be the RAID card. So I popped that in, connected all the drives and, again, to my amazement it booted fine. Something was missing but I just couldn't put my finger on it. I thought I'd just go on with the rest of the process and see where I get. Rebooted and defined the array on the RAID card, rebooted again - BAM! Same error. Rebooted, deleted the array - no error. Defined the array again - error. WTF! I thought. I tried several other combinations and eventually found one that worked: a single system drive with an array defined on the PCIe card. There seemed to be some sort of problem with having two arrays defined while booting off the first one. I didn't know why and at this stage I didn't care, so I just went with this setup. Finished the Windows install, drivers, patches, everything. I felt like I had done enough for a day. The next day was going to be the real test to see if the injured drive would spring back to life.
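The strip-it-down-and-add-one-part-at-a-time routine is really just a loop. Here is a rough Python sketch of it. Everything in it is hypothetical: `find_breaking_step` and `simulated_boot` are names I made up, and the simulated boot check simply encodes what I saw (two arrays defined at once means no boot) in place of a real install and reboot.

```python
from typing import Callable, List, Optional, Sequence


def find_breaking_step(
    steps: Sequence[str],
    boots_ok: Callable[[List[str]], bool],
) -> Optional[str]:
    """Add one step at a time to a minimal setup; return the first step that breaks it."""
    active: List[str] = []
    for step in steps:
        active.append(step)
        if not boots_ok(active):
            return step
    return None


def simulated_boot(active: List[str]) -> bool:
    # Hypothetical stand-in for "install, reboot and see what happens":
    # booting fails only when both arrays are defined at the same time.
    return not (
        "onboard RAID array" in active and "array defined on PCIe card" in active
    )


steps = [
    "onboard RAID array",
    "PCIe RAID card installed",
    "array defined on PCIe card",
]
culprit = find_breaking_step(steps, simulated_boot)
print(culprit)
# prints "array defined on PCIe card"
```

The loop only tells you which step tipped things over, not why. In my case the step on its own was fine, it was the combination with the first array that caused the error.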

The next day I woke up sometime in the afternoon and got straight back into it. Ran down to the fridge to see how the injured drive was doing. Did a stupid thing by taking the drive out of the bag with bare hands (cold metal objects tend to stick to your skin). Plugged it back into the old server and to my amazement the drive spun up. It was still making random clicking noises depending on which side it was sitting on. After playing with it for a while I found a position where it seemed to click the least.

It was still not healthy as it was reading really slowly, but it was reading!! Started copying everything. Estimated time: 8 hours :( So I met up with some friends, played some pool, you know, all the boring stuff :) Had another friend over for drinks. Ksenia was even nice enough to bring me dinner (if you are reading this, thanks, you are the best).


By the next morning all the data was copied, including all the other arrays from the old servers. In fact the rest of the process went without a glitch. I got everything finished before the 30th, which is when I planned to go visit my family for New Year's. As always, they had lots of other Russians over for the celebrations. Lots of food and drink was consumed. Got back on the 2nd and decided to clean up the flat and get rid of lots of useless junk. Managed to fill 4 large boxes with stuff I didn't need any more just from my room.

Now I'm back at work, still not realizing just how another year has passed by. Happy new year everyone, hope you had as much fun during your break as I did :)

Oh, I also promise I will make another post with some more pics of what happened to me between this post and my last one (I know I slacked and didn't update my blog).

3 comments:

Anonymous said...

Ksenia was sitting on the couch at your place and your farting around with your PC? Idiot. :)

Anonymous said...

You can build me a server any day :-)
If you move to sweden that is!

Andrey said...

Hmm I had to turn on comment moderation cos I got some spam before. I didn't know I would have to approve every single comment from now on. And the stupid thing didn't notify me. Sigh