
Viewer Feedback: IBM Supercomputer to use Linux OS


  • Viewer Feedback: IBM Supercomputer to use Linux OS

    I always like to get viewer feedback regarding news posts. It just so happens that "Brett" saw our post and decided to respond with a bit of a humorous situation!

    Thanks, Brett!


    Hi...

    I'm a consultant in New York. I have a client who houses their systems in
    an IBM data center in New Jersey. In the cage next to my client's is the
    current IBM Linux supercomputer called the Shark. I'm not sure of the
    specs, but it is the precursor to the one that you discuss in your post. IBM
    uses it as a giant multiprocessor time-share Linux machine. Basically you
    can "buy" as many simultaneous instances of Linux with as much RAM and disk
    (arranged however you like) as you need. The CPU and RAM are in one cage,
    and the disks are in another, all connected via some type of fiber
    connections. (If you want 500 virtual Linux boxes each with 2GB of RAM and
    500GB of disk, it's only a few keystrokes and mouse clicks away. That's the
    theory anyway.)

    Anyway, the reason I'm writing to you is to tell you of a BIG screw-up that
    happened on Friday. Although the data center is IBM's, the actual facility
    is owned and mostly operated by AT&T. IBM occupies about 85% of the
    gigantic facility, but things like HVAC, security and power are handled by
    AT&T. At about 1pm here in New York it seems that all of IBM's space in the
    data center lost all electrical power. Under no circumstances should this
    have happened, as there are 5 different levels of power redundancy built into
    the facility (multiple outside power sources, multiple UPSs and multiple
    diesel generators).

    At 11pm that evening, the IBM Shark was still totally down. There were tons
    of IBM technicians and managers bleary-eyed and very frustrated. Although
    the power was restored minutes after it was lost, the system was totally
    mangled. They just couldn't get the overarching OS that controls each
    instance of Linux to load. Also, the disks were in terrible condition with
    corruption all over the place. The ramifications of this were (and maybe
    still are) that none of the many existing customers who buy capacity on this
    system were operational and surely will have problems when things get sorted
    out. (Imagine having to manually fix THOUSANDS of individual disks?)

    One more thing... The reason for the power outage was that AT&T was trying
    to save some money by turning off power distribution units that were not
    being used. The technician who was doing it never bothered to determine
    which PDUs actually had loads on them. He simply switched them all off.
    When I walked into the data center to fix my client's systems one of the IBM
    people said to me that he had never seen so many "OK" prompts in one day.
    Basically everything in the place just came to a screeching halt. It was a
    disaster but it was funny too.

    Just thought that you might like to know.

    Brett

    Chris "Raven"
    News Crew - TweakTown
    "Look at life like your morning cup of coffee. You might have one every day, yet you still enjoy it."


  • #2
    Ohh Man...that had to have hurt!

    I would think they had to have had some sort of backup system but still...it's gonna be a huge job getting things back up and running.

    Wonder if the tech that shut the power off is still working :shoot3:



    • #3
      Holy crap! :eek:

      That's got to hurt. :(
