Injecting Machine Learning And Bayesian Optimization Into HPC – The Next Platform
Posted: December 3, 2020 at 4:58 am
No matter what kind of traditional HPC simulation and modeling system you have, no matter what kind of fancy new machine learning AI system you have, IBM has an appliance that it wants to sell you to help make these systems work better and work better together if you are mixing HPC and AI.
It is called the Bayesian Optimization Accelerator, or BOA, and it is a homegrown statistical analytics stack that runs on one or more of Big Blue's "Witherspoon" Power AC922 hybrid CPU-GPU supercomputer nodes, the ones that are used in the Summit supercomputer at Oak Ridge National Laboratory and the Sierra supercomputer at Lawrence Livermore National Laboratory.
IBM has been touting the ideas behind the BOA system for more than two years now, and it is finally being commercialized after some initial testing in specific domains that illustrate the principles that can be modified and applied to all kinds of simulation and modeling workloads. Dave Turek, now retired from IBM but the longtime executive steering the company's HPC efforts, walked us through the theory behind the BOA software stack way back at SC18 two years ago; the stack presumably came out of IBM Research. As far as we can tell, this is still the best English language description of what BOA does and how it does it. Turek gave us an update on BOA at our HPC Day event ahead of SC19 last year, focusing specifically on how Bayesian statistical principles can be applied to ensembles of simulations in classical HPC applications to do better work and get to results faster.
In the HPC world, we tend to try to throw more hardware at the problem and then figure out how to scale up frameworks to share memory and scale out applications across the more capacious hardware, but this is different. With BOA, the ideas can be applied to any HPC system, regardless of vendor or architecture. This is not only transformational for IBM in that it feels more like a service encapsulated in an appliance and will have an annuity-like revenue stream across many thousands of potential HPC installations. It is also important for IBM in that the next generation exascale machines in the United States, where IBM won the big deals for Summit and Sierra, are not based on the combination of IBM Power processors, Nvidia GPU accelerators, and Mellanox InfiniBand interconnects. The follow-on Frontier and El Capitan systems at these labs instead use AMD CPU and GPU compute engines, with Infinity Fabric for in-node connectivity and Slingshot Ethernet from Cray (now part of Hewlett Packard Enterprise) for lashing nodes together. Even these machines might benefit from BOA, which gives Big Blue some play across the HPC spectrum, much as its Spectrum Scale (formerly GPFS) parallel file system is often used in systems where IBM is not the primary contractor. BOA is even more open in this sense, although the underlying software stack used in the BOA appliance is not open source any more than GPFS is. This is very unlikely to change, even with IBM acquiring Red Hat last year and becoming the largest vendor of support contracts for tested and integrated open source software stacks in the world.
So what is this thing that IBM is selling? As the name suggests, it is based on Bayesian optimization, a field of mathematics that was created by Jonas Mockus in the 1970s and that has been applied to all kinds of algorithms, including various kinds of reinforcement learning systems in the artificial intelligence field. It is important to note that Bayesian optimization does not itself involve machine learning based on neural networks. What IBM is in fact doing is using Bayesian optimization and machine learning together to drive ensembles of HPC simulations and models. This is the clever bit.
With Bayesian optimization, you know there is a function in the world and it is in a black box (mathematically speaking, not literally). You have a set of inputs and you see how the function behaves through its outputs. The optimization part is to build a database of inputs and outputs, statistically infer something about what is going on between the two, and then create a mathematical guess about what a better set of inputs might be to get a desired output. The trick is to use machine learning training to watch what a database of inputs yields for outputs, and then to use the results of that to infer what the next set of inputs should be. In the case of HPC simulations, this means you can figure out what should be simulated instead of trying to simulate all possible scenarios, or at least a very large number of them. BOA doesn't change the simulation code one bit, and that is important. It is just given a sense of the desired goal of the simulation (that's the tricky part that requires the domain expertise that IBM Research can supply), and it watches the inputs and outputs of simulations and offers suggested inputs.
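To make the loop described above concrete, here is a minimal sketch in plain Python. It is not IBM's implementation, which is proprietary; it assumes a one-dimensional input, a Gaussian process surrogate with a squared-exponential kernel, and an upper confidence bound rule for picking the next input. The names `toy_simulation`, `bayes_opt`, `gp_posterior`, and all parameter values are hypothetical and chosen only for illustration. The simulation is treated as a black box that is called but never modified.

```python
import math

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two scalar inputs (assumed choice)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Posterior mean and standard deviation at x_new, given past runs."""
    mu = sum(ys) / len(ys)  # center the outputs so a zero-mean prior is reasonable
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x_new) for a in xs]
    alpha = solve(K, [y - mu for y in ys])
    v = solve(K, k)
    mean = mu + sum(ki * ai for ki, ai in zip(k, alpha))
    var = max(rbf(x_new, x_new) - sum(ki * vi for ki, vi in zip(k, v)), 0.0)
    return mean, math.sqrt(var)

def bayes_opt(simulate, candidates, n_init=3, n_iter=10, kappa=2.0):
    """Maximize a black-box `simulate` over `candidates` with few evaluations."""
    # Seed the database of inputs and outputs with a few spread-out runs.
    step = max(len(candidates) // n_init, 1)
    xs = candidates[::step][:n_init]
    ys = [simulate(x) for x in xs]
    for _ in range(n_iter):
        untried = [c for c in candidates if c not in xs]
        if not untried:
            break

        def ucb(c):
            # Prefer inputs that look good (mean) or are still uncertain (std).
            m, s = gp_posterior(xs, ys, c)
            return m + kappa * s

        nxt = max(untried, key=ucb)   # suggested next input
        xs.append(nxt)
        ys.append(simulate(nxt))      # run the unmodified simulation
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best], len(xs)

def toy_simulation(x):
    """Stand-in for an expensive HPC run; the optimum is at x = 0.7."""
    return -(x - 0.7) ** 2

if __name__ == "__main__":
    grid = [i / 100 for i in range(101)]
    best_x, best_y, runs = bayes_opt(toy_simulation, grid)
    print(f"best input {best_x:.2f}, output {best_y:.4f}, "
          f"after {runs} of {len(grid)} possible runs")
```

The point of the sketch is the shape of the loop: the optimizer only sees inputs and outputs, and each new run is chosen from what the earlier runs revealed rather than by sweeping the whole grid.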
The net effect of BOA is that, over time, you need less computing to run an HPC ensemble, and you can also converge to the answer in less time. Or, more of that computing can be dedicated to driving larger or more fine-grained simulations because the number of runs in an ensemble is a lot lower. We all know that time is fluid money and that hardware is frozen money, depreciated one little trickle at a time through use. Add them together and there is a lot of money that can potentially be saved.
Chris Porter, offering manager for HPC cloud for Power Systems at IBM, walked us through how BOA is being commercialized and some of the data from the early use cases where BOA was deployed.
One of the early use cases was at the Texas Advanced Computing Center at the University of Texas at Austin, where Mary Wheeler, a world-renowned expert in numerical methods for partial differential equations as they apply to oil and gas reservoir models, used the BOA appliance in some simulations. To be specific, Wheeler's reservoir model is called the Integrated Parallel Accurate Reservoir Simulator, or IPARS, and it has a gradient descent/ascent model built within it. Using their standard technique for maximizing the oil extraction from a reservoir with the model, it would take on the order of 200 evaluations of the model to get what Porter characterized as a good result. But by injecting BOA into the flow of simulations, they could get the same result with only 73 evaluations. That is a 63.5 percent reduction in the number of evaluations performed.
IBM's own Power10 design team also used BOA in its electronic design automation (EDA) workflow, specifically to check the signal integrity of the design. To do so using the raw EDA software took over 5,600 simulations, and IBM did all of that work as it normally would do. But then IBM added BOA to the stack and redid all of the work, and got to the same level of accuracy in analyzing the signal integrity of the Power10 chip's traces with only 140 simulations. That is a 97.5 percent reduction in computing needed, or a factor of 40X speedup if you want to look at it that way. (Porter warns that not all simulations will see this kind of huge bump.)
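The percentages quoted for the reservoir and Power10 cases follow directly from the run counts. The short sketch below works through the arithmetic, taking the approximate figures in the text ("on the order of 200" and "over 5,600") as exact for the purpose of the calculation; the function name `reduction` is ours.

```python
def reduction(baseline_runs, boa_runs):
    """Return (fraction of runs saved, speedup factor) for an ensemble."""
    saved = 1 - boa_runs / baseline_runs
    speedup = baseline_runs / boa_runs
    return saved, speedup

if __name__ == "__main__":
    for name, before, after in [("IPARS reservoir model", 200, 73),
                                ("Power10 signal integrity", 5600, 140)]:
        saved, speedup = reduction(before, after)
        print(f"{name}: {saved:.1%} fewer runs, {speedup:.1f}X speedup")
```

Note that the two ways of stating the result are not interchangeable: a 63.5 percent reduction is a speedup of roughly 2.7X, while a 97.5 percent reduction is 40X.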
In a third use case, a petroleum company that creates industrial lubricants, which Porter could not name, was creating a lubricant that had three components. There are myriad different proportions to mix them in to get a desired viscosity and slipperiness, and the important factor is that one of these components was very expensive and the other two were not. Maximizing the performance of the lubricant while minimizing the amount of the expensive item was the task in this case, and this company ran the simulation without and then with the BOA appliance plugged in. Here's the fun bit: BOA found a totally unusual configuration that this company's scientists would never have thought of, found the right mix with four orders of magnitude more certainty than prior ensemble simulations, and needed only one-third as many simulations to get to the result.
These are dramatic speedups, and demonstrate the principle that changing algorithms and methods is as important as changing hardware to run older algorithms and methods.
IBM is being a bit secretive about what is in the BOA software stack, but it is using PyTorch and TensorFlow for machine learning frameworks in different stages and GP Pro for sparse Gaussian process analysis, all of which have been tuned to run across the IBM Power9 processors and Nvidia V100 GPU accelerators in a hybrid (and memory coherent) fashion. The BOA stack could, in theory, run on any system with any CPU and any GPU, but it really is tuned up for the Power AC922 hardware.
At the moment, IBM is selling two different configurations of the BOA appliance. One has two V100 GPU accelerators, each with 16 GB of HBM2 memory, and two Power9 processors with a total of 40 cores running at a base 2 GHz and a turbo boost to 2.87 GHz, with 256 GB of their own DDR4 memory. The second BOA hardware configuration has a pair of Power9 chips with a total of 44 cores running at a base 1.9 GHz and a turbo boost to 3.1 GHz, with their own 1 TB of memory, plus four of the V100 GPU accelerators with 16 GB of HBM2 memory each.
IBM is not providing pricing for these two machines, or for the BOA stack on top of them, but Porter says that it is sold under an annual subscription that runs to hundreds of thousands of dollars per server per year. That may sound like a lot, but considering the cost of an HPC cluster, which runs from millions of dollars to hundreds of millions of dollars, this is a small percentage of the overall cost and can help boost the effective performance of the machine by an order of magnitude or more.
The BOA appliance became available on November 27. Initial target customers are in molecular modeling, aerospace and auto manufacturing, drug discovery, and oil and gas reservoir modeling and a bit of seismic processing, too.