February 17, 2003  

Expanded storage helps U.S. Energy Department lab maintain scientific research projects

By Stephen Lawton

Imagine yourself at the foot of Niagara Falls. Now imagine that all that falling water was data, and your job was to collect, store and process that data. A daunting task, to say the least, but it's not as uncommon as you might think.

Today, laboratories throughout the country are collecting, storing and processing terabytes of streaming data from a plethora of scientific experiments, ranging from sonar scans of the ocean's depths and images from the Hubble Space Telescope to measurements of subatomic particles generated at the nation's many research centers. The sheer volume of data is mind-boggling.

The U.S. Department of Energy's Brookhaven National Laboratory alone is collecting and processing multiple terabytes of data in its Relativistic Heavy Ion Collider (RHIC) experiments.

A lot to ask
To gain a better understanding of what actually goes on at Brookhaven, located in Upton, N.Y., let's look at how it describes its activities: "Hundreds of physicists from around the world use RHIC to study what the universe may have looked like in the first few moments after its creation. RHIC drives two intersecting beams of gold ions head-on, in a subatomic collision. What physicists learn from these collisions may help us understand more about why the physical world works the way it does, from the smallest subatomic particles, to the largest stars."

That's a lot to ask of a single laboratory. Managing all these computing resources is Maurice Askinazi, group leader of the general computing environment at RHIC. Askinazi's IT department consists of some 20 staffers, four of whom manage overall operations, with the others helping out on specific lab projects. The primary platforms are Linux-based servers from Sun Microsystems, disk storage from MTI Technology Corp. and tape storage from Storage Technology Corp.

Currently, the facility is operating four key experiments:

* The Solenoidal Tracker at RHIC (STAR), a detector that specializes in tracking the thousands of particles produced by each ion collision at RHIC.

* The PHENIX detector, which records many different particles emerging from RHIC collisions, including photons, electrons, muons and quark-containing particles called hadrons.

* Broad Range Hadron Magnetic Spectrometer (BRAHMS), a device that studies particles called charged hadrons as they pass through detectors called spectrometers.

* PHOBOS, an experiment based on the premise that, as RHIC describes it, "Interesting collisions will be rare, but that when they do occur, new physics will be readily identified."

Collider data is collected by sensors, transferred to the MTI disk arrays, archived to tape, then brought back to the disk subsystems in much smaller chunks for analysis, says Askinazi. Today, RHIC has 800 dual-processor servers crunching on nearly 100 terabytes of data.
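The staging pattern described above - archive everything to tape, then pull data back to disk in much smaller chunks for analysis - can be sketched in a few lines. This is an illustrative sketch only; the function and names are hypothetical, not RHIC's actual software.

```python
# Hypothetical sketch of the chunked-staging pattern described above:
# a large archived "run" is brought back to disk only a small piece
# at a time, so analysis jobs never need the whole dataset on disk.

def stage_in_chunks(dataset, chunk_size):
    """Yield fixed-size slices of a dataset in order."""
    for start in range(0, len(dataset), chunk_size):
        yield dataset[start:start + chunk_size]

# Example: a 10-record run retrieved in chunks of 4 records.
run = list(range(10))
chunks = list(stage_in_chunks(run, 4))
print(chunks)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

The same generator shape applies whether the "records" are list items, file offsets or tape blocks; only the slicing step changes.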

Performance, reliability, price
Recently, Askinazi needed to expand his disk storage. He had three criteria that had to be met before he would even consider looking at a storage vendor's offerings. First, the new system needed to meet the performance specifications to read and write data. Second, it needed to meet the reliability requirements set forth by the managers of the various experiments. Last, it needed to hit a price point predetermined by the IT budget.

Askinazi realized he might have to trade one criterion against another: give up some reliability or performance to hit the price point, or exceed the budget to get the performance and reliability the experiments required.

After wading through the proposals, Askinazi reduced the candidates to two. The least expensive proposal met all the criteria on paper, he says, but could not perform in RHIC's environment. The second candidate, MTI's Vivant D100 direct-attached storage system, not only provided the required performance and reliability, he says, but also met the laboratory's budget target.

The Vivant D100 was flexible enough that Askinazi was able to reconfigure it so that three RAID 5 subsystems (four data disks plus one parity disk each) could be presented as a single logical unit number (LUN), providing even greater performance than MTI promised.

Stretching performance
Originally, the unit was configured so that each RAID 5 subsystem was a separate LUN consisting of five drives. In Askinazi's configuration, the system can write across 12 data drives and three parity drives while using only a single LUN, effectively tripling the number of drives each LUN stripes across. "It turns out that the aggregate data stream almost maxes out the Fibre Channel connection," he explains.
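The drive counts above follow from simple arithmetic on the 4+1 RAID 5 geometry the article describes. A quick sketch (the function is illustrative, not vendor software):

```python
# Arithmetic for the RAID 5 regrouping described above. The 4+1 group
# geometry and the three-group count come from the article; everything
# else is straightforward calculation.

def raid5_lun(groups, data_per_group, parity_per_group):
    """Return (data drives, parity drives, usable fraction) for a LUN
    built by striping across several RAID 5 groups."""
    data = groups * data_per_group
    parity = groups * parity_per_group
    return data, parity, data / (data + parity)

# Three 4+1 groups presented as ONE LUN:
data, parity, usable = raid5_lun(groups=3, data_per_group=4, parity_per_group=1)
print(data, parity)      # 12 3
print(f"{usable:.0%}")   # 80%
```

Whether striped as one wide LUN or three narrow ones, the usable capacity fraction stays at 80 percent; what the wide LUN buys is more spindles serving each I/O stream.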

In fact, Askinazi documented his modifications and submitted the document to MTI, where it can help other customers get this higher level of performance from their SANs. Reliability was key: Askinazi has neither the time nor the staff to spare for storage systems that need constant oversight. His goal was to install a storage system he could plug in, configure and then essentially forget about. "There are so few of us here, we can't absorb the hit of something not working," he notes.

The lab, which already was using an older MTI 3600 storage system, recently added 13 Vivant D100 systems - one with 96 drives and the others with 48 drives each - for a total of 43 terabytes of storage. Askinazi expects to do more business with MTI in the future, as RHIC is "consolidating the number of vendors we work with," he says. The recent deal was a joint effort that involved MTI's Federal Systems Group and regional operations in New York, as well as A&T Systems, a GSA-approved reseller that served as the prime contractor.

Expanding the system
RHIC, in fact, already is planning its next rollouts of MTI SAN systems. The lab is considering upgrading its infrastructure to 2Gbps links between machines in the server farm. It already has 1Gbps links out to the workstations, so Askinazi wants to make sure the internal 1Gbps links don't become a bottleneck. RHIC has a 2Gbps fabric in place using technology from Brocade Communications Systems, which also happens to be one of MTI's business partners.
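The bottleneck concern above comes down to back-of-the-envelope bandwidth arithmetic. The Fibre Channel payload figures below are standard approximations (roughly 100 MB/s for 1Gb FC after line encoding, double that for 2Gb), not numbers from the article:

```python
# Approximate payload rates for Fibre Channel links, in MB/s.
# (Assumption: the commonly quoted ~100 MB/s for 1Gb FC after
# 8b/10b encoding; 2Gb FC roughly doubles it.)
FC_PAYLOAD_MB_S = {1: 100, 2: 200}

def hours_to_move(terabytes, link_gbps):
    """Time to move a dataset over a single FC link at full payload rate."""
    megabytes = terabytes * 1_000_000  # decimal TB -> MB
    return megabytes / FC_PAYLOAD_MB_S[link_gbps] / 3600

# Moving a 1TB dataset over each link speed:
print(round(hours_to_move(1, 1), 1))  # 2.8 hours
print(round(hours_to_move(1, 2), 1))  # 1.4 hours
```

At the scale of RHIC's multi-terabyte runs, halving transfer times by keeping internal traffic off the slower links is a meaningful win.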

With a SAN in place that supports 2Gbps speeds, considerable back-end work can be shifted to a second 1Gbps link, keeping that overhead from slowing the primary links to the workstations. Additionally, upgrading the disk farm so that logical arrays sit in the same physical cabinet will make the farm easier to maintain.

Next on Askinazi's to-do list is the Vivant S200 SAN solution - which handles up to 8 terabytes of storage and offers an end-to-end Fibre Channel data path - and the Vivant 400, which accommodates nearly 26 terabytes and extends that end-to-end data path to 2Gbps.

One area where the Vivant 400 and the lower end S200 differ is in the management software they use. The S200 uses management software provided by Mylex Corp., which manufactures the disk controllers for the system, while the Vivant 400 uses a more robust package from FalconStor Software that the larger SAN requires, says MTI.

Today, a single rack holds 25 to 30 terabytes of storage. Over the next five years, Askinazi expects 10 to 20 times that, with storage subsystems reaching into the petabytes.

Now, remember that waterfall of data? At Brookhaven National Laboratory, Maurice Askinazi is taking that flood of photons, electrons, muons and hadrons from the collider's sensors and turning it into an organized stream to be stored and processed. From the way he describes it, the scientists fishing for a breakthrough won't have to worry about being overcome by too much input, as long as they continue to have the resources to store and manage their data.

About the author
Stephen Lawton (Lawton@reporters.net) is a freelance writer and former editor-in-chief of MicroTimes and other technology publications.

Copyright 2001-2002 Stephen M. Lawton
All trademarks are the property of their respective companies.