Trestles, a new supercomputer using flash-based memory and launched earlier this year by the San Diego Supercomputer Center (SDSC) at the University of California, San Diego, has made this year’s Graph500 list, a new ranking that measures how well supercomputers handle data-intensive challenges.
Based on runs using less than half of its overall compute capabilities, Trestles took the 42nd spot on the latest Graph500 ranking, announced during the SC11 (Supercomputing 2011) conference in Seattle this week.
Trestles solved the second-largest problem size in this ranking while using a relatively small number of cores. Only two other flash-based systems used fewer cores to solve size 2^36, or scale 36, problems.
The Trestles runs, performed last month after about two months of smaller experiments, were made using just 144 nodes and 4,608 cores, or about 45% of the overall cluster. The scale 36 run submitted requires about 12 terabytes to store. For comparison, one terabyte equals one trillion bytes of information, which in printed form would consume paper from about 50,000 trees.
“Trestles’ overall performance, especially among only a handful of other flash-memory based systems on the latest Graph500 list, demonstrates that flash systems are ideally suited for large data-intensive problems,” said Allan Snavely, associate director of SDSC and co-principal investigator for the SDSC system. “In fact, some non-flash based systems used more than 130,000 cores in their entry.”
“We are looking forward to how Gordon performs in the next Graph500 ranking,” said Richard Moore, SDSC’s deputy director. “While Trestles is ideally suited for modest-scale applications from researchers who require quick turnaround times, Gordon is scaled up to excel at solving some of the most data-intensive challenges facing scientists today.”
Developers of the new ranking say the Graph500 is designed to complement the widely regarded Top500 list of the world’s most powerful supercomputers. Industry experts contend that the Graph500 ranking seeks to quantify how much work a supercomputer can do based on its ability to analyze very large graph-based datasets that have millions of disparate points, while the Top500 ranking uses LINPACK software to determine sheer speed – or how fast a supercomputer can perform linear algebraic calculations.
In a separate run, SDSC’s new Gordon supercomputer, a much larger system with more than 300 trillion bytes of high-performance flash-memory solid state drives, made its debut by ranking among the top 50 fastest supercomputers in the world, according to the latest Top500 list, which was also announced at SC11. Gordon ranked 48th on the latest version of that widely watched ranking.
“There is no doubt that we are entering the era of data-intensive computing, and the ability of supercomputers that use new architectures to rapidly sift through massive data sets to find that elusive ‘needle in the haystack’ will be a critical advantage for many researchers,” said SDSC Director Michael Norman. “Trestles is just one of several new systems at SDSC that have been specifically designed to handle such data-intensive queries, saving time and accelerating scientific discovery.”
“Graph algorithms are fundamental to a wide variety of domains including bioinformatics, cybersecurity, and social network analysis,” said Maya Gokhale, a computer scientist in the Center for Applied Scientific Computing at the Lawrence Livermore National Laboratory (LLNL), and principal investigator for the recent Graph500 runs on Trestles. “There is a growing realization in the HPC community that new architectures are needed to efficiently process these sorts of data-intensive workloads, and Trestles and the LLNL Hyperion Data Intensive Testbed represent a new approach to such data-intensive architectures.”
Specifically, the algorithm used in the Graph500 runs caches portions of the graph in main memory on each node. It exploits thread-level concurrency on the compute node to maximize the number of in-flight random read requests to the flash array, thus reducing the effective I/O latency and speeding up the analysis.
“The flash array is much faster than disk partly because of much lower latency and partly because of much higher parallelism and its ability to handle lots of thread-level concurrent I/O,” said Gokhale, adding that “other approaches that don’t use flash have to store the entire graph in main memory, thus requiring a very large number of nodes for large graphs.”
With 10,368 processor cores, 324 nodes, and a peak speed of 100 teraflops (TFlop/s), Trestles has already been used for more than 200 separate research projects since its launch last January, with research areas ranging from astrophysics to molecular dynamics. Designed and built by SDSC researchers and Appro, a leading developer of supercomputing solutions, Trestles features 120GB (gigabytes) of flash solid-state drives (SSDs) in each node, for a total system capacity of 38TB (terabytes) of flash memory. Users have already demonstrated substantial performance improvements for many applications when compared to spinning disk.
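For readers who want a concrete picture of the approach Gokhale describes (keeping part of the graph cached in main memory while many threads issue random reads against flash), the following is a minimal single-node sketch in Python. It is not the code used in the Graph500 runs, which is not described in detail here. The file layout, the names write_graph and FlashGraph, the LRU cache policy, and the worker count are all assumptions made for illustration, and an ordinary file stands in for the flash array. It relies on os.pread, which is available on Unix-like systems.

```python
import os
import struct
import tempfile
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache


def write_graph(path, adjacency):
    """Write adjacency lists to a binary file (hypothetical layout).

    Returns an in-memory index {vertex: (byte_offset, degree)}.
    """
    index = {}
    with open(path, "wb") as f:
        for vertex, nbrs in adjacency.items():
            index[vertex] = (f.tell(), len(nbrs))
            f.write(struct.pack(f"<{len(nbrs)}q", *nbrs))
    return index


class FlashGraph:
    """Graph whose adjacency lists live in a file standing in for flash."""

    def __init__(self, path, index, workers=8, cache_size=1024):
        self.index = index
        self.workers = workers
        self.fd = os.open(path, os.O_RDONLY)
        # Assumption: an LRU cache keeps recently read adjacency lists in
        # main memory, so repeated searches avoid re-reading them.
        self.neighbors = lru_cache(maxsize=cache_size)(self._read_neighbors)

    def _read_neighbors(self, vertex):
        offset, degree = self.index[vertex]
        # os.pread takes an explicit offset, so many threads can share one
        # file descriptor and keep several random reads in flight at once.
        data = os.pread(self.fd, 8 * degree, offset)
        return struct.unpack(f"<{degree}q", data)

    def bfs(self, source):
        """Level-synchronous breadth-first search; returns {vertex: depth}."""
        level = {source: 0}
        frontier = [source]
        depth = 0
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            while frontier:
                depth += 1
                next_frontier = []
                # Fetch the adjacency lists of the whole frontier concurrently.
                for nbrs in pool.map(self.neighbors, frontier):
                    for w in nbrs:
                        if w not in level:
                            level[w] = depth
                            next_frontier.append(w)
                frontier = next_frontier
        return level

    def close(self):
        os.close(self.fd)


if __name__ == "__main__":
    adjacency = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3], 5: []}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "graph.bin")
        graph = FlashGraph(path, write_graph(path, adjacency))
        print(graph.bfs(0))  # {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}
        graph.close()
```

In this sketch only the small offset index and the cached adjacency lists occupy main memory, while the bulk of the graph stays in the file. That mirrors, at toy scale, why a flash-backed system can handle a large graph with fewer nodes than one that must hold the entire graph in memory.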
Jan Zverina, SDSC Communications,
858 534-5111 or email@example.com
Warren R. Froelich, SDSC Communications,
858 822-3622 or firstname.lastname@example.org
Maria McLaughlin, Appro Marketing,
408 941-8100 x250, or email@example.com