UC San Diego News Center


Experts Explain Why Big Data is a Big Deal

Photos by Erik Jepsen/UC San Diego Publications

Turns out even the experts have difficulty wrapping their heads around how quickly – and drastically – what’s come to be called ‘big data’ has changed our daily lives.

Defined as the gathering, storing and analysis of massive amounts of computerized data, both unstructured and multi-structured, big data keeps growing at an almost unimaginable rate.

“It’s like saying you’ve spent your whole life on an exponential escalator,” said Larry Smarr, director of the California Institute for Telecommunications and Information Technology (Calit2), at a seminar titled “Big Data: A Conversation with the Experts” held on campus March 12. “It just keeps getting bigger and bigger.”

Amid intermittent references to petabytes and exabytes – units of a million billion and a billion billion bytes of digital information, respectively – each speaker expressed wonderment at big data’s impact on scientific inquiry and everyday life.


Smarr cited the example of a typical Google search on a smartphone, a now-ubiquitous device that brims with more computer wizardry than all of the Apollo space missions put together. “It’s unbelievable how quickly this field has changed,” he said. “It’s been breathtaking.”

Take another example, he suggested: “You know Angry Birds, the app? There have been one billion downloads. And when you hit ‘Like’ on Facebook, there are five billion of those each day around the world…This is a totally new world, in which the generation of big data has gotten out of the hands of researchers and exploded across the planet to our society as a whole.”

How to begin to comprehend such growth? Start by placing one grain of rice on the first square of a chessboard, said Smarr, then double the number of grains on each successive square.

“By the time you get to the 64th [and final] square, you now have a pile of rice that’s big as Mount Everest,” he said, “which is a 1,000 times the production of rice on the planet.”
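For readers who want to check the arithmetic behind the chessboard illustration, here is a minimal sketch in Python. It assumes one grain on the first square and a doubling on each square after that; the function names are made up for this example, and the sketch only counts grains. It does not attempt to verify the comparisons to Mount Everest or to world rice production.

```python
def grains_on_square(square: int) -> int:
    """Grains on a given square (1-based), assuming square 1 holds one
    grain and every later square holds double the previous one."""
    if not 1 <= square <= 64:
        raise ValueError("square must be between 1 and 64")
    return 2 ** (square - 1)


def total_grains(squares: int = 64) -> int:
    """Total grains on the first `squares` squares of the board."""
    return sum(grains_on_square(s) for s in range(1, squares + 1))


if __name__ == "__main__":
    print(grains_on_square(64))  # 9223372036854775808 (2**63)
    print(total_grains())        # 18446744073709551615 (2**64 - 1)
```

The final square alone holds about 9.2 quintillion grains, and the whole board roughly 18.4 quintillion. Each square holds one grain more than all the squares before it combined, which is what makes sustained doubling so hard to picture.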

Added Smarr: “Never in our history have we had a sustained period of this kind of exponential growth [in computer science]. What we’re talking about is something humanity has never tried to deal with before.”

With that mind-bending prelude, trying to fathom what the future holds was the central theme of the 90-minute seminar held in Atkinson Hall at UC San Diego’s Qualcomm Institute.

Presented by UC San Diego Extension, the event was recorded by UCSD-TV for a future airing.

Another speaker, Dr. Michael Norman, director of the campus’ San Diego Supercomputer Center (SDSC), spoke glowingly of its massive supercomputer, launched nearly two years ago and dubbed “Gordon.”

Rated as one of the world’s fastest large-capacity supercomputers, Gordon shines as a big data repository. In simple terms, Gordon is designed to move, store and analyze data.

“We were going to name it ‘Flash Gordon,’ but we didn’t want to get sued by Marvel Comics,” joked Norman.

In truth, the computer is named for its massive amounts of flash-based memory. Gordon’s role in keeping tabs on the far-flung fields of climatology, finance (think Wall Street), food production, big industry, physics, biological science, government – the list goes on and on – should never be underestimated.

Norman outlined the three defining characteristics of big data: 1) volume, the amount of data; 2) velocity, the speed at which information is generated; and 3) variety, the range of data that’s readily available.

Michael Zeller, founder and chief executive officer of San Diego-based Zementis, a leading big data firm, pointed out a fourth “V” category, that of value.

“Is there big hype?” asked Zeller. “Absolutely. But big data transcends all typical boundaries, and we’re just now scratching the surface of big data’s ultimate value in business opportunities. The hype has alerted the executive level about big data. Cutting through the noise is a bigger issue.”

Another speaker, Stefan Savage, professor of computer science and engineering at UC San Diego, spoke of the inherent security risks associated with big data. Unrelenting threats – from pranksters and criminals alike – are out there.

“We are building an enormous structure of stored big data,” said Savage, “and that centralization creates risk.”

He pointed to revelations of massive data leaks from Target and, more troubling, from the National Security Agency, as prime examples of big data’s vulnerability.

His admonition: “Security is very much a data-driven field. The goal is to understand the environment better, faster and more efficiently than your adversaries.”