The San Diego Supercomputer Center (SDSC) at the University of California, San Diego is kicking off its 2014 series of “Data Mining Boot Camps,” aimed at helping business professionals and academic research scientists gain a clear understanding of how to rapidly make use of burgeoning amounts of data and how to design, build, verify and evaluate predictive models.
Developed and organized by SDSC’s Predictive Analytics Center of Excellence, or PACE, the two-day sessions provide conceptual and hands-on training with critical predictive analytic tools and techniques that non-computer science professionals can use to detect patterns and relationships in what is now referred to as “Big Data.”
Each day, our society creates 2.5 quintillion bytes of data (that’s 25 followed by 17 zeros). The need among researchers to make sense of all this information has become more acute, and as a result demand for data scientists has increased steadily. According to the Harvard Business Review and Fortune magazine, a career as a data scientist is “the” job to have in the 21st century. The McKinsey Global Institute’s Big Data Report notes that by 2018, the U.S. alone could face a shortage of 140,000 to 190,000 professionals with deep analytical skills, as well as 1.5 million managers and analysts with the know-how to properly analyze Big Data to make more effective decisions.
“Conventional statistical analysis and business intelligence software, although useful, are not designed to capture, curate, manage and process large quantities of data generated by most enterprises,” according to Natasha Balac, PACE’s director. “Data mining and predictive modeling, now commonly referred to as data science, are capable of automatic extraction of meaningful value hidden in this data, enabling discovery of new insights and providing a competitive edge.”
Launched in October 2012, the PACE Boot Camps assist organizations by expanding the analytical skills of their own subject matter experts to develop a built-in pool of talented data scientists, as well as by preparing managers and analysts to perform in-depth analyses of massive or multiple data sets. The workshops have generated a tremendous amount of interest from both industry and academia, including many UC San Diego colleagues. The sessions have attracted a wide range of industry participants, including some from unanticipated business sectors such as utilities, food services, and gaming.
The comprehensive, hands-on curriculum used during the PACE Data Mining Boot Camps is an outgrowth of the data mining certification courses offered through UC San Diego Extension. The instructors for those certification courses also lead the Boot Camp training. Specifically, the PACE Boot Camps cover basic data mining, data analysis, pattern recognition concepts, and predictive modeling algorithms so that participants can explore and implement analyses on their own data.
Participants will also have access to a comprehensive set of data mining tools available on SDSC’s Gordon, one of the world’s most innovative supercomputers, with 300 terabytes of flash memory. Participants will be able to apply data mining algorithms to real data and interpret the results. The classroom setting allows the instructors to work one-on-one with participants during the hands-on training. By area, Boot Camp training includes:
- Overview: Data Mining, Machine Learning and Statistics
- Overview of CRISP-DM: Cross Industry Standard Process for Data Mining
- Introduction to Data Mining Tools
- Data Preparation
- Learning Algorithms Implementations
- Model Evaluation and Validation
- Data Mining Trends, Applications and Guidelines
Participants can register for the 2014 series of PACE Boot Camps at pace.sdsc.edu/boot-camps or ask for more information by emailing firstname.lastname@example.org or calling (858) 534-8321. PACE also conducts onsite training tailored to meet an organization’s specific core research and business objectives.