Congrats on the Sponsor Forum at Head-fi!


Fud

Just saw it up today, and I'm looking forward to seeing it full of threads in the future. Hope to see many of you early adopters posting initial impressions on the line of earphones...hopefully before I make my purchase =)

I'd also like to say, very creative banner ad =P


This is what you get if you click the Head-Fi hyperlink that comes with the error message; it would appear catastrophic for the website. I imagine that right about now they are having those "total redundancy for our network" meetings. I will say this is the first glitch I've ever encountered at Head-Fi[^o)] Everyone with a large network suffers a blow every now and then, but the gap is being closed further and further all the time.

2007-11-13 0857 EST

This is obviously our worst outage in the history of Head-Fi.org. What happened was that we had Head-Fi.org's files and backups moved to a multi-terabyte network attached storage (NAS) unit while we continued to work on the proper implementation of a true clustering configuration for Head-Fi.org. From what we can tell, this particular NAS unit--with a reputation for being ultra-reliable--had one of its 12-channel RAID controllers malfunction.

This particular NAS unit is a 24-drive unit, made up of two 12-drive arrays, each array with two parity drives (RAID 6). Maybe we put too much faith in it, but we thought we were safe housing everything on it for the time being. From what we're being told, when the controller card malfunctioned, it messed up the NAS unit's logical volume, which is where we're at now. We are working closely with the vendor and the technical support team in Europe to restore the logical volume and get the NAS back up again. We feel confident we will be able to restore Head-Fi to its state just before the outage, but won't know for sure whether we'll have to fall back to a backup, of which there are many on the NAS. Unfortunately, the only off-NAS backups we have of Head-Fi.org's databases are quite old, meaning we'd potentially lose thousands of posts, so I will not put Head-Fi.org back up until we know for sure the status of the logical volume restoration.

The repair was well under way yesterday when the repair process ran out of RAM. (The NAS has four gigabytes of RAM.) Since, for a number of reasons, the repair process was being run almost entirely from RAM, the four gigs was apparently not enough. I have ordered 16 gigabytes of RAM, which will arrive this morning, immediately after which I will head to the datacenter to install it; then the team in Europe can commence with the remotely administered repair process(es). The repair has gone slower than we anticipated, and running out of RAM yesterday was an unfortunate setback. But, once again, 16 gigabytes of RAM (versus the four gigabytes in there now) should be arriving this morning, whereupon we will immediately call our friends in Europe so they can continue with the repair work. We already know some data was lost, but hope and pray that what we do retrieve will be enough to let us get the site back up this evening.

We know we should have been more diligent about keeping more backups off the NAS. But given that we were running two 12-drive arrays, each array in RAID 6 (two parity drives, for a total of four), and given that our previous, self-built NAS units ran without problems for over seven years, we felt we were safe in keeping them there until we were finally through with the proper clustering we've intended for months.
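For readers following the drive counts above, the arithmetic can be sketched in a few lines of Python. This is only an illustration of the layout as described in the announcement (two 12-drive arrays, each in RAID 6); the function name and the returned keys are made up for this example, and drive sizes are left out because the announcement does not give them.

```python
# Illustrative arithmetic only; raid6_layout and its keys are hypothetical names.
# RAID 6 reserves two drives' worth of capacity per array for parity
# (in practice the parity is spread across the drives in the array).
RAID6_PARITY_PER_ARRAY = 2


def raid6_layout(arrays: int, drives_per_array: int) -> dict:
    """Return drive counts for a set of identical RAID 6 arrays."""
    if arrays < 1:
        raise ValueError("need at least one array")
    if drives_per_array <= RAID6_PARITY_PER_ARRAY:
        raise ValueError("a RAID 6 array needs more drives than its parity count")

    total = arrays * drives_per_array
    parity = arrays * RAID6_PARITY_PER_ARRAY
    data = total - parity
    return {
        "total_drives": total,
        "parity_drives": parity,
        "data_drives": data,
        "usable_fraction": data / total,
    }


# The configuration described in the announcement: two 12-drive arrays.
layout = raid6_layout(arrays=2, drives_per_array=12)
print(layout)
```

With these inputs the sketch gives 24 drives in total, four of them accounted for by parity, which matches the "total of four" in the announcement. Note that this kind of redundancy covers failed drives within an array; as the announcement itself describes, it did not help when a controller fault damaged the logical volume.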

All I can do at this point (other than what we're doing above) is to apologize to you all for the outage. Though Head-Fi isn't what I do for a living, it is very important to me as a gathering place for friends, and I know it is for many of you, too. Once again, I'm sorry about this extended outage, and will continue to work on it until we're back up (hopefully tonight).


"This is obviously our worst outage in the history of Head-Fi.org." He's not kidding!

2007-11-14 1025 EST

"We are still at the datacenter, working with the vendor and OS team in Europe to restore the NAS. Even with 16 gigabytes of memory, the file system repair ran out of memory. The OS team in Europe has since upgraded our version of the file system repair program, which is apparently more efficient and less memory-intensive. It is in process right now, but getting it re-configured (including installing still another drive to provide ~160GB of swap) took most of the night. I have, for the most part, been at this datacenter since Saturday night. I have my portable rig to help me block out the din of the thousands of servers behind me in this datacenter, but I'm so ready to go home, and I'm so ready to get back to Head-Fi'ing with the rest of you.

Thank you for your patience, and, once again, sorry."

There must be some serious withdrawal happening over there[:o]
