
Archive for the ‘Alphago’ Category

The Alpha of ‘Go’. What is AlphaGo? | by Christopher Golizio | Apr, 2021 | Medium – Medium

Posted: April 24, 2021 at 1:58 am


Photo by Maximalfocus on Unsplash

The game of Go is believed to be the oldest continuously played board game in the world. It was invented in China over 2,500 years ago and is still wildly popular to this day. A 2016 survey estimated around 20 million Go players worldwide, most of them in Asia. The game is played by two players on a grid-lined board. The players take turns placing black or white stones on the intersections of the grid. If player A surrounds player B's stone(s), the surrounded stones are removed from the board and later factor into player A's score. When a player encloses territory with a connected group of stones, the points inside that territory count toward that player's final score. The player with the most points wins.

Of course that explanation of the game is over-simplified, but the game itself actually appears to be simple as well. This is true in terms of the rules and the goal of the game; however, the sheer number of legal board positions (roughly 2.1×10^170) vastly exceeds the total number of atoms in the observable universe (roughly 10^80). This incomprehensible number of possible positions alone adds an extreme level of complexity to the game. On top of that, it was long believed that Go required a certain level of human intuition to excel at; however, the reigning champion of Go, a computer program, inherently disagrees with this particular assessment.
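For a sense of scale, the two commonly cited estimates can be compared directly with Python's arbitrary-precision integers (the figures are the standard approximations, not exact counts):

```python
# ~2.1 * 10^170 legal Go positions vs ~10^80 atoms in the observable universe
go_positions = 21 * 10**169   # 2.1e170 as an exact integer
atoms = 10**80

# how many legal Go positions exist per atom in the universe
ratio = go_positions // atoms  # an integer with 91 digits
```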

It was believed not too long ago that a computer would never be able to beat a high-ranking human Go player. It's happened in other, similar games, namely chess. In 1997 a computer developed by IBM, named Deep Blue, beat Garry Kasparov, the reigning world chess champion, under standard regulated time controls. Deep Blue used a brute-force approach: searching through enormous numbers of possible moves for both sides of the board before ultimately choosing the move that gave it the highest probability of winning. This was more a big win for hardware; AlphaGo is something completely different.

AlphaGo, developed by the artificial intelligence research company DeepMind, is the result of combining machine learning and tree search techniques, specifically Monte Carlo tree search, along with extensive training. Its training consisted of playing games, against both human and computer opponents. Decision making is executed via a deep neural network that implements both a value network and a policy network. These two networks guide which tree branches should be traversed and which should be ignored due to a low probability of winning. This greatly decreases the time complexity of AlphaGo's search, while the system also improves itself over time.
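To make the tree-search idea concrete, here is a minimal sketch of the UCB1 selection rule commonly used in Monte Carlo tree search; the move statistics are invented for the example, and this is not DeepMind's actual implementation:

```python
import math

def uct_score(wins, visits, parent_visits, c=1.4):
    """UCB1: exploitation (win rate) plus an exploration bonus."""
    if visits == 0:
        return float("inf")  # always try unvisited branches first
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)

# (wins, visits) for three candidate moves at one tree node
stats = {"A": (6, 10), "B": (3, 4), "C": (0, 1)}
parent_visits = sum(v for _, v in stats.values())

# the rule picks "C": barely explored, so its bonus outweighs "B"'s win rate
best = max(stats, key=lambda m: uct_score(*stats[m], parent_visits))
```

In a full search, the chosen branch is simulated to the end of the game and the result is backed up the tree; AlphaGo replaces the random simulations with its policy and value networks.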

After DeepMind was acquired by Google, and after a couple of new versions, AlphaGo was eventually succeeded by AlphaGo Zero. AlphaGo Zero differed from its predecessor by being completely self-taught: earlier versions of AlphaGo were trained in part on games played by humans, but AlphaGo Zero was not fed any human dataset. Even though it pursued its goal blindly, AlphaGo Zero was able to learn and improve until it surpassed all versions of the original AlphaGo in a mere 40 days. Eventually AlphaGo Zero was generalized into AlphaZero.

AlphaGo, and the familial programs that succeeded it, were a major breakthrough in the world of AI. Driven by the hard work of many developers, and aided by its capability for self-improvement, this is far from the ceiling of AlphaGo's full potential. AlphaGo Zero and AlphaZero push this further: because they require no human-backed training, there is a high probability of a completely generalized AI algorithm that could be applied to many different and diverse situations and, over a period of time, come to function at a level at which humans are easily outperformed.

Two more fun facts: along with Go and chess, MuZero, the successor to AlphaZero, is also capable of playing at least 57 different Atari games at a superhuman level. Additionally, the hardware cost of a single unit used for the AlphaGo Zero system was quoted at $25 million.

Continue reading here:

The Alpha of 'Go'. What is AlphaGo? | by Christopher Golizio | Apr, 2021 | Medium - Medium

Written by admin

April 24th, 2021 at 1:58 am

Posted in Alphago

Why AI That Teaches Itself to Achieve a Goal Is the Next Big Thing – Harvard Business Review

Posted: at 1:58 am


What's the difference between the creative power of game-playing AIs and the predictive AIs most companies seem to use? How they learn. The AIs that thrive at games like Go, creating never-before-seen strategies, use an approach called reinforcement learning: a mature machine learning technology that's good at optimizing tasks in which an agent takes a series of actions over time, where each action is informed by the outcome of the previous ones, and where you can't find a right answer the way you can with a prediction. It's a powerful technology, but most companies don't know how or when to apply it. The authors argue that reinforcement learning algorithms are good at automating and optimizing in dynamic situations with nuances that would be too hard to describe with formulas and rules.








Lee Sedol, a world-class Go champion, was flummoxed by the 37th move DeepMind's AlphaGo made in the second match of the famous 2016 series. So flummoxed that it took him nearly 15 minutes to formulate a response. The move was strange to other experienced Go players as well, with one commentator suggesting it was a mistake. In fact, it was a canonical example of an artificial intelligence algorithm learning something that seemed to go beyond just pattern recognition in data: learning something strategic and even creative. Indeed, beyond just feeding the algorithm past examples of Go champions playing games, DeepMind developers trained AlphaGo by having it play many millions of matches against itself. During these matches, the system had the chance to explore new moves and strategies, and then evaluate whether they improved performance. Through all this trial and error, it discovered a way to play the game that surprised even the best players in the world.

If this kind of AI with creative capabilities seems different from the chatbots and predictive models most businesses end up with when they apply machine learning, that's because it is. Instead of machine learning that uses historical data to generate predictions, game-playing systems like AlphaGo use reinforcement learning, a mature machine learning technology that's good at optimizing tasks. To do so, an agent takes a series of actions over time, and each action is informed by the outcome of the previous ones. Put simply, it works by trying different approaches and latching onto (reinforcing) the ones that seem to work better than the others. With enough trials, you can reinforce your way to beating your current best approach and discover a new best way to accomplish your task.
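The "try approaches and reinforce what works" loop can be sketched as an epsilon-greedy bandit; the success rates below are invented for illustration, and this is a toy, not any production system:

```python
import random

random.seed(0)  # make the toy run repeatable

true_rates = {"A": 0.3, "B": 0.7}     # hidden quality of two approaches
counts = {a: 0 for a in true_rates}   # how often each was tried
wins = {a: 0 for a in true_rates}     # how often each succeeded

def choose(eps=0.1):
    # mostly exploit the best approach so far, occasionally explore
    if random.random() < eps or not any(counts.values()):
        return random.choice(list(true_rates))
    return max(true_rates, key=lambda a: wins[a] / counts[a] if counts[a] else 0.0)

for _ in range(2000):
    arm = choose()
    counts[arm] += 1
    wins[arm] += random.random() < true_rates[arm]  # bool adds as 0/1

# after enough trials, the genuinely better approach "B" gets most of the pulls
```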

Despite its demonstrated usefulness, however, reinforcement learning is mostly used in academia and niche areas like video games and robotics. Companies such as Netflix, Spotify, and Google have started using it, but most businesses lag behind. Yet opportunities are everywhere. In fact, any time you have to make decisions in sequence (what AI practitioners call sequential decision tasks), there's a chance to deploy reinforcement learning.

Consider the many real-world problems that require deciding how to act over time, where there is something to maximize (or minimize), and where you're never explicitly given the correct solution.

If you're a company leader, there are likely many processes you'd like to automate or optimize but that are too dynamic, or have too many exceptions and edge cases, to program into software. Through trial and error, reinforcement learning algorithms can learn to solve even the most dynamic optimization problems, opening up new avenues for automation and personalization in quickly changing environments.

Many businesses think of machine learning systems as prediction machines and apply algorithms to forecast things like cash flow or customer attrition based on data such as transaction patterns or website analytics behavior. These systems tend to use what's called supervised machine learning. With supervised learning, you typically make a prediction: the stock will likely go up by four points in the next six hours. Then, after you make that prediction, you're given the actual answer: the stock actually went up by three points. The system learns by updating its mapping between input data (like past prices of the same stock, and perhaps of other equities and indicators) and its output prediction to better match the actual answer, which is called the ground truth.
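That predict-then-compare-to-ground-truth loop fits in a few lines; the numbers mirror the stock example in the text, and the model is a deliberately tiny linear one:

```python
# model: predicted_points = w * signal + b
w, b = 0.0, 0.0
lr = 0.1  # learning rate

def predict(signal):
    return w * signal + b

signal = 2.0    # input data (e.g., a feature built from past prices)
truth = 3.0     # ground truth: the stock actually went up 3 points

pred = predict(signal)      # the model's (initially useless) prediction
err = pred - truth
w -= lr * err * signal      # gradient step on squared error
b -= lr * err

# the updated mapping now predicts 1.5: closer to the ground truth of 3
```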

With reinforcement learning, however, there's no correct answer to learn from. Reinforcement learning systems produce actions, not predictions: they'll suggest the action most likely to maximize (or minimize) a metric. You can only observe how well you did on a particular task and whether it was done faster or more efficiently than before. Because these systems learn through trial and error, they work best when they can rapidly try an action (or sequence of actions) and get feedback. A stock market algorithm that takes hundreds of actions per day is a good use case; optimizing customer lifetime value over the course of five years, with only irregular interaction points, is not. Significantly, because of how they learn, they don't need mountains of historical data; they'll experiment and create their own data along the way.

They can therefore be used to automate a process, like placing items into a shipping container with a robotic arm, or to optimize a process, like deciding when and through what channel to contact a client who missed a payment, with the highest recouped revenue and lowest expended effort. In either case, designing the inputs, actions, and rewards the system uses is the key: it will optimize exactly what you encode it to optimize, and it doesn't do well with any ambiguity.

Google's use of reinforcement learning to help cool its data centers is a good example of how this technology can be applied. Servers in data centers generate a lot of heat, especially when they're in close proximity to one another, and overheating can lead to IT performance issues or equipment damage. In this use case, the input data is various measurements about the environment, like air pressure and temperature. The actions are fan speed (which controls air flow) and valve opening (the amount of water used) in air-handling units. The system includes some rules to follow safe operating guidelines, and it sequences how air flows through the center to keep the temperature at a specified level while minimizing energy usage. The physical dynamics of a data center environment are complex and constantly changing; a shift in the weather impacts temperature and humidity, and each physical location often has a unique architecture and setup. Reinforcement learning algorithms are able to pick up on nuances that would be too hard to describe with formulas and rules.
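The point that the system optimizes exactly what you encode is easiest to see in the reward function. Here is a hypothetical reward for a cooling problem like the one described, not Google's actual formulation; the target temperature and penalty weight are made-up values:

```python
def reward(energy_kw, temp_c, target_c=22.0, band_c=2.0):
    # penalize energy use; heavily penalize leaving the safe temperature
    # band, so the safety constraint dominates the energy objective
    penalty = 100.0 if abs(temp_c - target_c) > band_c else 0.0
    return -energy_kw - penalty
```

An agent maximizing this reward is pushed to save energy only within the safe band; leave the band out of the encoding and it would happily overheat the room to save power.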

Here at Borealis AI, we partnered with Royal Bank of Canada's Capital Markets business to develop a reinforcement learning-based trade execution system called Aiden. Aiden's objective is to execute a customer's stock order (to buy or sell a certain number of shares) within a specified time window, seeking prices that minimize loss relative to a specified benchmark. This becomes a sequential decision task because of the detrimental market impact of buying or selling too many shares at once: the task is to sequence actions throughout the day to minimize price impact.

The stock market is dynamic, and the performance of traditional algorithms (the rules-based algorithms traders have used for years) can vary when today's market conditions differ from yesterday's. We felt this was a good reinforcement learning opportunity: it had the right balance between clarity and dynamic complexity. We could clearly enumerate the different actions Aiden could take and the reward we wanted to optimize (minimizing the difference between the prices Aiden achieved and the market's volume-weighted average price benchmark). The stock market moves fast and generates a lot of data, giving the algorithm quick iterations to learn.
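For reference, the volume-weighted average price benchmark mentioned above is straightforward to compute; the fills below are invented numbers for illustration:

```python
# (price, shares) fills observed in the market over the execution window
trades = [(10.00, 100), (10.10, 300), (9.95, 100)]

notional = sum(price * qty for price, qty in trades)
volume = sum(qty for _, qty in trades)
vwap = notional / volume   # 10.05 for this data

# an execution's slippage vs. the benchmark (positive = paid more than VWAP)
avg_fill_price = 10.08
slippage = avg_fill_price - vwap
```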

We let the algorithm do just that, through countless simulations, before launching the system live to the market. Ultimately, Aiden proved able to perform well during some of the more volatile market periods at the beginning of the Covid-19 pandemic, conditions that are particularly tough for predictive AIs. It was able to adapt to the changing environment while continuing to stay close to its benchmark target.

How can you tell if you're overlooking a problem that reinforcement learning might be able to fix? Here's where to start:

Create an inventory of business processes that involve a sequence of steps, and clearly state what you want to maximize or minimize. Focus on processes with dense, frequent actions and opportunities for feedback, and avoid processes with infrequent actions, where it's difficult to observe which actions worked best and collect feedback. Getting the objective right will likely require iteration.

Don't start with reinforcement learning if you can tackle a problem with other machine learning or optimization techniques. Reinforcement learning is helpful when you lack sufficient historical data to train an algorithm and need to explore options (creating data along the way).

If you do want to move ahead, domain experts should closely collaborate with technical teams to help design the inputs, actions, and rewards. For inputs, seek the smallest set of information you could use to make a good decision. For actions, ask how much flexibility you want to give the system; start simple and later expand the range of actions. For rewards, think carefully about the outcomes, and avoid the traps of considering one variable in isolation or opting for short-term gains with long-term pains.

Will the possible gains justify the costs for development? Many companies need to make digital transformation investments to have the systems and dense, data-generating business processes in place to really make reinforcement learning systems useful. To answer whether the investment will pay off, technical teams should take stock of computational resources to ensure you have the compute power required to support trials and allow the system to explore and identify the optimal sequence. (They may want to create a simulation environment to test the algorithm before releasing it live.) On the software front, if you're planning to use a learning system for customer engagement, you need to have a system that can support A/B testing. This is critical to the learning process, as the algorithm needs to explore different options before it can latch onto which one works best. Finally, if your technology stack can only release features universally, you likely need to upgrade before you start optimizing.

And last but not least, as with many learning algorithms, you have to be open to errors early on while the system learns. It won't find the optimal path from day one, but it will get there in time, and it may find surprising, creative solutions beyond human imagination when it does.

While reinforcement learning is a mature technology, it's only now starting to be applied in business settings. The technology shines when used to automate or optimize business processes that generate dense data, and where there could be unanticipated changes you couldn't capture with formulas or rules. If you can spot an opportunity, and either lean on an in-house technical team or partner with experts in the space, there's a window to apply this technology to outpace your competition.

Read more:

Why AI That Teaches Itself to Achieve a Goal Is the Next Big Thing - Harvard Business Review


The 13 Best Deep Learning Courses and Online Training for 2021 – Solutions Review

Posted: at 1:58 am


The editors at Solutions Review have compiled this list of the best deep learning courses and online training to consider.

Deep learning is a class of machine learning algorithms that uses multiple layers to progressively extract higher-level features from the raw input. Based on artificial neural networks and representation learning, deep learning can be supervised, semi-supervised, or unsupervised. Deep learning models are commonly based on convolutional neural networks but can also include propositional formulas or latent variables organized by layer.
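The "multiple layers progressively extracting higher-level features" idea reduces to repeated weighted sums and nonlinearities. As a minimal sketch (the weights are arbitrary illustration values, not a trained model):

```python
def relu(x):
    return max(0.0, x)  # the nonlinearity between layers

def layer(inputs, weights, biases):
    # one fully connected layer: a weighted sum per unit, then ReLU
    return [relu(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

raw = [1.0, -2.0, 0.5]                       # raw input features
hidden = layer(raw, [[0.5, -0.5, 1.0],       # first layer: 2 derived features
                     [1.0, 1.0, 0.0]], [0.0, 0.1])
out = layer(hidden, [[1.0, -1.0]], [0.0])    # second layer: final output
```

Libraries like TensorFlow and Keras, which many of the courses below teach, stack dozens of such layers and learn the weights from data.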

With this in mind, we've compiled this list of the best deep learning courses and online training to consider if you're looking to grow your neural network and machine learning skills for work or play. This is not an exhaustive list, but one that features the best deep learning courses and online training from trusted online platforms. We made sure to mention and link to related courses on each platform that may be worth exploring as well. Click "Go to training" to learn more and register.

Platform: Coursera

Description: In the first course of the Deep Learning Specialization, you will study the foundational concepts of neural networks and deep learning. By the end, you will be familiar with the significant technological trends driving the rise of deep learning; build, train, and apply fully connected deep neural networks; implement efficient (vectorized) neural networks; identify key parameters in a neural network's architecture; and apply deep learning to your own applications.

Related paths/tracks: Introduction to Deep Learning, Applied AI with DeepLearning, Introduction to Deep Learning & Neural Networks with Keras, An Introduction to Practical Deep Learning, Building Deep Learning Models with TensorFlow

Platform: Codecademy

Description: Deep learning is a cutting-edge form of machine learning inspired by the architecture of the human brain, but it doesn't have to be intimidating. With TensorFlow, coupled with the Keras API and Python, it's easy to train, test, and tune deep learning models without knowing advanced math. To start this Path, sign up for Codecademy Pro.

Platform: DataCamp

Description: Deep learning is the machine learning technique behind the most exciting capabilities in diverse areas like robotics, natural language processing, image recognition, and artificial intelligence, including the famous AlphaGo. In this course, you'll gain hands-on, practical knowledge of how to use deep learning with Keras 2.0, the latest version of a cutting-edge library for deep learning in Python.

Related paths/tracks: Introduction to Deep Learning with PyTorch, Introduction to Deep Learning with Keras, Advanced Deep Learning with Keras

Platform: Edureka

Description: Deep Learning Training with TensorFlow Certification by Edureka is curated with the help of experienced industry professionals as per the latest requirements and demands. This deep learning certification course will help you master popular algorithms like CNN, RCNN, RNN, LSTM, and RBM using the latest TensorFlow 2.0 package in Python. In this deep learning training, you will be working on various real-time projects like Emotion and Gender Detection, Auto Image Captioning using CNN and LSTM, and many more.

Related path/track: Reinforcement Learning

Platform: edX

Description: This 3-credit-hour, 16-week course covers the fundamentals of deep learning. Students will gain a principled understanding of the motivation, justification, and design considerations of the deep neural network approach to machine learning and will complete hands-on projects using TensorFlow and Keras.

Related paths/tracks: Deep Learning Fundamentals with Keras, Deep Learning with Python and PyTorch, Deep Learning with Tensorflow, Using GPUs to Scale and Speed-up Deep Learning, Deep Learning and Neural Networks for Financial Engineering, Machine Learning with Python: from Linear Models to Deep Learning

Platform: Intellipaat

Description: Intellipaat's Online Reinforcement Learning course is designed by industry experts to assist you in learning and gaining expertise in reinforcement learning, one of the core areas of machine learning. In this training, you will be educated on the concepts of machine learning fundamentals, reinforcement learning fundamentals, dynamic programming, temporal difference learning methods, policy gradient methods, Markov Decision Processes, and Deep Q-Learning. This Reinforcement Learning certification course will enable you to learn how to make decisions in uncertain circumstances.

Platform: LinkedIn Learning

Description: In this course, learn how to build a deep neural network that can recognize objects in photographs. Find out how to adjust state-of-the-art deep neural networks to recognize new objects, without the need to retrain the network. Explore cloud-based image recognition APIs that you can use as an alternative to building your own systems. Learn the steps involved to start building and deploying your own image recognition system.

Related paths/tracks: Neural Networks and Convolutional Neural Networks Essential Training, Building and Deploying Deep Learning Applications with TensorFlow, PyTorch Essential Training: Deep Learning, Introduction to Deep Learning with OpenCV, Deep Learning: Face Recognition

Platform: Mindmajix

Description: Mindmajix Deep learning with Python Training helps you in mastering various features of debugging concepts, introduction to software programmers, language abilities and capacities, modification of module and pattern designing, and various OS and compatibility approaches. This course also provides training on how to optimize a simple model in Pure Theano, convolutional and pooling layers, and reducing overfitting with dropout regularization. Enroll and get certified now.

Related path/track: AI & Deep Learning with TensorFlow Training

Platform: Pluralsight

Description: In this course, Deep Learning: The Big Picture, you will first learn about the creation of deep neural networks with tools like TensorFlow and the Microsoft Cognitive Toolkit. Next, you'll touch on how they are trained, by example, using data. Finally, you will be provided with a high-level understanding of the key concepts, vocabulary, and technology of deep learning. By the end of this course, you'll understand what deep learning is, why it's important, and how it will impact you, your business, and our world.

Related paths/tracks: Deep Learning with Keras, Building Deep Learning Models Using PyTorch, Deep Learning Using TensorFlow and Apache MXNet on Amazon Sagemaker

Platform: Simplilearn

Description: In this deep learning course with Keras and TensorFlow certification training, you will become familiar with the language and fundamental concepts of artificial neural networks, PyTorch, autoencoders, and more. Upon completion, you will be able to build deep learning models, interpret results, and build your own deep learning project.

Platform: Skillshare

Description: It's hard to imagine a hotter technology than deep learning, artificial intelligence, and artificial neural networks. If you've got some Python experience under your belt, this course will de-mystify this exciting field with all the major topics you need to know. A few hours is all it takes to get up to speed and learn what all the hype is about. If you're afraid of AI, the best way to dispel that fear is by understanding how it really works, and that's what this course delivers.

Related paths/tracks: Ultimate Neural Network and Deep Learning Masterclass, Deep Learning and AI with Python

Platform: Udacity

Description: Become an expert in neural networks, and learn to implement them using the deep learning framework PyTorch. Build convolutional networks for image recognition, recurrent networks for sequence generation, generative adversarial networks for image generation, and learn how to deploy models accessible from a website.

Related path/track: Become a Deep Reinforcement Learning Expert

Platform: Udemy

Description: Artificial intelligence is growing exponentially. There is no doubt about that. Self-driving cars are clocking up millions of miles, IBM Watson is diagnosing patients better than armies of doctors, and Google DeepMind's AlphaGo beat the world champion at Go, a game where intuition plays a key role. But the further AI advances, the more complex the problems it needs to solve become. Only deep learning can solve such complex problems, which is why it's at the heart of artificial intelligence.

Related paths/tracks: Machine Learning, Data Science and Deep Learning with Python, Deep Learning Prerequisites: The Numpy Stack in Python, Complete Guide to TensorFlow for Deep Learning with Python, Data Science: Deep Learning and Neural Networks in Python, Tensorflow 2.0: Deep Learning and Artificial Intelligence, Complete Tensorflow 2 and Keras Deep Learning Bootcamp, Deep Learning Prerequisites: Linear Regression in Python, Natural Language Processing with Deep Learning in Python, Deep Learning: Convolutional Neural Networks in Python, Deep Learning: Recurrent Neural Networks in Python, Deep Learning and Computer Vision A-Z, Deep Learning Prerequisites: Logistic Regression in Python

Tim is Solutions Review's Editorial Director and leads coverage on big data, business intelligence, and data analytics. A 2017 and 2018 Most Influential Business Journalist and 2021 "Who's Who" in data management and data integration, Tim is a recognized influencer and thought leader in enterprise business software. Reach him via tking at solutionsreview dot com.

See more here:

The 13 Best Deep Learning Courses and Online Training for 2021 - Solutions Review


How AI is being used for COVID-19 vaccine creation and distribution – TechRepublic

Posted: at 1:58 am


Artificial intelligence is being used in a variety of ways by those trying to address variants and for data management.

Image: Pinyo

Millions of people across the world have already started the process of receiving a COVID-19 vaccine. More than half of all adults in the U.S. have gotten at least one dose of a COVID-19 vaccine, while state and local officials seek to get even more people vaccinated as quickly as possible. Some health experts have said artificial intelligence will be integral not just in managing the process of creating boosters for the COVID-19 variants but also in the distribution of the vaccine.

David Smith, associate VP of virtual medicine at UMass Memorial Health Care, explained that the difference between predictive modeling and AI, or machine learning, is that predictive models depend on past data to foretell future events.

AI, on the other hand, not only uses historical data, it makes assumptions about the data without applying a defined set of rules, Smith said.

"This allows the software to learn and adapt to information patterns in more real time. The AI utility for vaccine distribution could be applied in a variety of ways from understanding which populations to target to curve the pandemic sooner, adjusting supply chain and distribution logistics to ensure the most people get vaccinated in the least amount of time, to tracking adverse reactions and side effects," he noted.

SEE: AI in healthcare: An insider's guide (free PDF) (TechRepublic Premium)

Matthew Putman, an expert in artificial intelligence and CEO of Nanotronics, has been working with one of the top vaccine developers and said AI was helping teams manage the deluge of data that comes with a project like this.

While the number of vaccinated people in the country continues to rise by the millions each day, there is already some concern about how the vaccines will hold up against the multitude of variants.

The biggest challenge right now and the biggest opportunity for changing the way that therapeutics are both developed and deployed, Putman explained, is being able to handle new types of variants.

"In the case of mRNA vaccines, being able to actually do reprogramming as fast as possible in a way that is as coordinated as possible. The things that we have realized in many parts of our lives now is that as good as humans are at exploring things and being creative, being able to deal with enough data and to be able to make intelligent choices about that data is something that actually artificial intelligence agents can do at a pace that is required to keep up with this," Putman said.

"So it means a lot of multivariate correlations to different parts of the process. It means being able to detect potential intrusion and it's a way that we can avoid these lengthy phase three trials. Everything that's going on right now is so incredibly urgent."

Putman added that an AI system would help with building actionable data sets that allow doctors to examine root causes or things that researchers don't have time to spend on.

When researchers are dealing with things like lipid nanoparticles and the tasks of imaging and classifying different features and trends that are on a scale, it can be difficult for humans to manage. AI is now being used for analyzing these images in real time and has helped researchers try to pick out genetic mutations and variations, according to Putman.

"People are more open to AI than ever, and this emergency has brought a focus on things that probably would have been on the backburner. But AI is starting to be used for classification and to understand what genomic features and what type of nano compounding has been going on," Putman added.

"AI has been used for the development of components and much more. It's been crucial to the process and will be crucial to an alteration to the vaccine, which is looking like it will have to be done at some point. The way I look at contemporary AI systems, it's taking into account what move is being made next. This is Alpha Go for drug discovery. A virus will mutate in different ways and now a response to that can be developed in new ways."

Putman went on to compare the situation to the yearly creation of a new flu vaccine, noting that once you've grown a lot of biological specimens, it's a slow tedious process to change for new mutations.

"Using mRNA, it's not, and using AI for being able to see what changes are going on from everywhere from the sequence to the quality inspection is a big deal," Putman said.

When asked about the production of boosters for variants, Putman said adjusting a process usually takes years just for a normal product, and if you're doing something as new as what is going on with the vaccine and you're dealing with the entirety of the supply chain, the process has to be adjusted as fast as the science does.

"We have the science now. We've shown that these types of vaccines can be developed. Now, making sure that your production process stays the same, even if you've adjusted something else, is something that if it's put in place, the process will adjust," Putman said.

"If an AI system worked for this or an intelligent factory system is put into place, then the process can adjust as quickly as the R&D can. Without AI, it would be very difficult."

Cheryl Rodenfels, a healthcare strategist at Nutanix, echoed those remarks, explaining that AI can be an incredibly useful tool when it comes to vaccine distribution.

Organizations that utilize workflow improvement processes can harness AI tools to ensure that the processes are being followed as designed and that any missing elements are identified, Rodenfels said, adding that this process plays into vaccine tracking measures specifically, as AI will track vaccine handling, storage and administration.

"Relying on the technology to manage distribution data eliminates human error, and ensures that healthcare organizations are accurately tracking the vast amounts of data associated with the vaccine rollout," Rodenfels said.

"However, the biggest problem with using AI to assist with vaccine rollout is that each manufacturer has its own process and procedure for the handling, storage, tracking, training and administration of the vaccine. This is then complicated by the amount of manufacturers in the market. Another issue is that hospital pharmacies and labs don't have a lot of extra space to stage and set up the doses. In order to insert effective AI, a hospital would need to ensure a process architect and a data scientist work collaboratively."

These issues are compounded by the fact that there is no baseline for how these things are supposed to work, she noted. The measurements, analytics and information will be developed on the fly, and because it is unknown how many vaccines each organization will be required or allowed to have, it is difficult to predict the capacity or amount of data that will be produced.

The advantage to using AI in vaccine rollout is that it will set us up for success during round two of vaccine dosing. It will also positively impact future vaccine dissemination by creating a blueprint for the next mass inoculation need, both Rodenfels and Putman said.

Walter McAdams, SQA Group senior vice president of solutions delivery, said that AI will be useful in analyzing how the virus is mutating over time, how variations could affect current vaccine make-ups, and how to use that information to accelerate the development of virus countermeasures.

Researchers, he said, can leverage data about how COVID-19 has mutated and vaccine effectiveness to continuously refine the vaccine sequence and, in some cases, get ahead of COVID-19 and prepare new vaccines before additional strains fully develop.


How AI is being used for COVID-19 vaccine creation and distribution - TechRepublic

Written by admin

April 24th, 2021 at 1:58 am

Posted in Alphago

Examining the world through signals and systems – MIT News

Posted: February 10, 2021 at 9:52 pm

There's a mesmerizing video animation on YouTube of simulated, self-driving traffic streaming through a six-lane, four-way intersection. Dozens of cars flow through the streets, pausing, turning, slowing, and speeding up to avoid colliding with their neighbors, and not a single car stops. But what if even one of those vehicles was not autonomous? What if only one was?

In the coming decades, autonomous vehicles will play a growing role in society, whether keeping drivers safer, making deliveries, or increasing accessibility and mobility for elderly or disabled passengers.

But MIT Assistant Professor Cathy Wu argues that autonomous vehicles are just part of a complex transport system that may involve individual self-driving cars, delivery fleets, human drivers, and a range of last-mile solutions to get passengers to their doorstep, not to mention road infrastructure like highways, roundabouts, and, yes, intersections.

Transport today accounts for about one-third of U.S. energy consumption. The decisions we make today about autonomous vehicles could have a big impact on this number, ranging from a 40 percent decrease in energy use to a doubling of energy consumption.

So how can we better understand the problem of integrating autonomous vehicles into the transportation system? Equally important, how can we use this understanding to guide us toward better-functioning systems?

Wu, who joined the Laboratory for Information and Decision Systems (LIDS) and MIT in 2019, is the Gilbert W. Winslow Assistant Professor of Civil and Environmental Engineering as well as a core faculty member of the MIT Institute for Data, Systems, and Society. Growing up in a Philadelphia-area family of electrical engineers, Wu sought a field that would enable her to harness engineering skills to solve societal challenges.

During her years as an undergraduate at MIT, she reached out to Professor Seth Teller of the Computer Science and Artificial Intelligence Laboratory to discuss her interest in self-driving cars.

Teller, who passed away in 2014, met her questions with warm advice, says Wu. "He told me, 'If you have an idea of what your passion in life is, then you have to go after it as hard as you possibly can. Only then can you hope to find your true passion.'"

"Anyone can tell you to go after your dreams, but his insight was that dreams and ambitions are not always clear from the start. It takes hard work to find and pursue your passion."

Chasing that passion, Wu would go on to work with Teller, as well as in Professor Daniela Rus's Distributed Robotics Laboratory, and finally as a graduate student at the University of California at Berkeley, where she won the IEEE Intelligent Transportation Systems Society's best PhD award in 2019.

In graduate school, Wu had an epiphany: She realized that for autonomous vehicles to fulfill their promise of fewer accidents, time saved, lower emissions, and greater socioeconomic and physical accessibility, these goals must be explicitly designed-for, whether as physical infrastructure, algorithms used by vehicles and sensors, or deliberate policy decisions.

At LIDS, Wu uses a type of machine learning called reinforcement learning to study how traffic systems behave, and how autonomous vehicles in those systems ought to behave to get the best possible outcomes.

Reinforcement learning, which was most famously used by AlphaGo, DeepMind's human-beating Go program, is a powerful class of methods that captures the idea behind trial and error: given an objective, a learning agent repeatedly attempts to achieve the objective, failing and learning from its mistakes in the process.

In a traffic system, the objectives might be to maximize the overall average velocity of vehicles, to minimize travel time, to minimize energy consumption, and so on.

When studying common components of traffic networks such as grid roads, bottlenecks, and on- and off-ramps, Wu and her colleagues have found that reinforcement learning can match, and in some cases exceed, the performance of current traffic control strategies. More importantly, reinforcement learning can shed new light on complex networked systems that have long evaded classical control techniques. For instance, if just 5 to 10 percent of vehicles on the road were autonomous and used reinforcement learning, that could eliminate congestion and boost vehicle speeds by 30 to 140 percent. And the learning from one scenario often translates well to others. These insights could one day soon help to inform public policy or business decisions.
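
The trial-and-error loop described here can be made concrete with a tabular Q-learning sketch on an invented two-phase traffic signal. The environment, states, and rewards below are toy assumptions chosen for illustration; they are not taken from Wu's research:

```python
import random

# Toy environment: a signal gives green to one of two approaches.
# State = (queue_ns, queue_ew), each capped at 3; reward = -(total queue).
def step(state, action):
    ns, ew = state
    if action == 0:                          # green for north-south
        ns = max(0, ns - 2)
    else:                                    # green for east-west
        ew = max(0, ew - 2)
    ns = min(3, ns + random.randint(0, 1))   # random arrivals
    ew = min(3, ew + random.randint(0, 1))
    return (ns, ew), -(ns + ew)

random.seed(0)
Q = {(ns, ew): [0.0, 0.0] for ns in range(4) for ew in range(4)}
alpha, gamma, eps = 0.1, 0.9, 0.2            # learning rate, discount, exploration
state = (3, 3)
for _ in range(50_000):
    greedy = Q[state].index(max(Q[state]))
    action = random.randrange(2) if random.random() < eps else greedy
    nxt, reward = step(state, action)
    Q[state][action] += alpha * (reward + gamma * max(Q[nxt]) - Q[state][action])
    state = nxt

# After training, the greedy policy should serve the longer queue.
print(Q[(3, 0)].index(max(Q[(3, 0)])), Q[(0, 3)].index(max(Q[(0, 3)])))
```

In Wu's setting, the same loop is scaled up with neural-network function approximators and objectives such as average velocity or energy consumption, but the agent-environment-reward structure is the same.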

In the course of this research, Wu and her colleagues helped improve a class of reinforcement learning methods called policy gradient methods. Their advancements turned out to be a general improvement to most existing deep reinforcement learning methods.
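
A policy gradient method adjusts a parameterized policy directly, nudging it toward actions that earned reward. Here is a minimal REINFORCE sketch on a two-armed bandit; the bandit and its payoffs are a toy of my own construction, not the specific improvement Wu's group published:

```python
import math, random

random.seed(1)
theta = 0.0                             # preference for arm 1 over arm 0

def prob_arm1(t):
    return 1.0 / (1.0 + math.exp(-t))   # sigmoid policy: P(choose arm 1)

def reward(arm):
    # Invented payoffs: arm 1 pays off on 80% of pulls, arm 0 on 20%.
    return 1.0 if random.random() < (0.8 if arm == 1 else 0.2) else 0.0

lr = 0.1
for _ in range(5000):
    p1 = prob_arm1(theta)
    arm = 1 if random.random() < p1 else 0
    r = reward(arm)
    # grad of log pi(arm | theta): (1 - p1) if arm 1 was taken, else -p1
    grad = (1.0 - p1) if arm == 1 else -p1
    theta += lr * r * grad              # REINFORCE: reward-weighted gradient step

print(prob_arm1(theta))                 # close to 1: the policy prefers the better arm
```

Deep reinforcement learning replaces the single scalar `theta` with the weights of a neural network, but the reward-weighted log-probability gradient is the same idea.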

But reinforcement learning techniques will need to be continually improved to keep up with the scale and shifts in infrastructure and changing behavior patterns. And research findings will need to be translated into action by urban planners, auto makers and other organizations.

Today, Wu is collaborating with public agencies in Taiwan and Indonesia to use insights from her work to guide better dialogues and decisions. By changing traffic signals or using nudges to shift drivers' behavior, are there other ways to achieve lower emissions or smoother traffic?

"I'm surprised by this work every day," says Wu. "We set out to answer a question about self-driving cars, and it turns out you can pull apart the insights, apply them in other ways, and then this leads to new exciting questions to answer."

Wu is happy to have found her intellectual home at LIDS, which she describes as a very deep, intellectual, friendly, and welcoming place. And she counts among her research inspirations MIT course 6.003 (Signals and Systems), a class she encourages everyone to take, taught in the tradition of professors Alan Oppenheim (Research Laboratory of Electronics) and Alan Willsky (LIDS). "The course taught me that so much in this world could be fruitfully examined through the lens of signals and systems, be it electronics or institutions or society," she says. "I am just realizing as I'm saying this that I've been empowered by LIDS thinking all along!"

Research and teaching through a pandemic haven't been easy, but Wu is making the best of a challenging first year as faculty. ("I've been working from home in Cambridge; my short walking commute is irrelevant at this point," she says wryly.) To unwind, she enjoys running, listening to podcasts covering topics ranging from science to history, and reverse-engineering her favorite Trader Joe's frozen foods.

She's also been working on two Covid-related projects born at MIT: One explores how data from the environment, such as data collected by internet-of-things-connected thermometers, can help identify emerging community outbreaks. Another project asks if it's possible to ascertain how contagious the virus is on public transport, and how different factors might decrease the transmission risk.

"Both are in their early stages," Wu says. "We hope to contribute a bit to the pool of knowledge that can help decision-makers somewhere. It's been very enlightening and rewarding to do this and see all the other efforts going on around MIT."

Examining the world through signals and systems - MIT News

February 10th, 2021 at 9:52 pm

Are we ready for bots with feelings? Life Hacks by Charles Assisi – Hindustan Times

Posted: December 12, 2020 at 7:54 am

In the 2008 Pixar film Wall-E, a robot that's operated alone for centuries meets and falls in love with another bot, starting a relationship with serious implications for mankind.

Does my phone know when I've exchanged it and moved on to a new one? Does it work better with its new owner because he follows its instructions when he drives, meaning that it doesn't have to scramble to do one more task while its processor is focused on something else? Does the contraption prefer my kids over me?! They seem to have a lot more fun together.

As a species, we tend to anthropomorphise objects (a lamp looks cute, a couch looks sad), but what do we do with objects to which we have given a kind of operational intelligence, enough at least to operate independently of us? How do we view them? What standing do they have, relative to the lamp and the couch?

The idea that it might be time to start thinking about rights and status for artificial intelligence was explored last month in a lovely essay on the moral implications of building true artificial intelligence, written by Anand Vaidya, professor of philosophy at San Jose State University, and published on the academic news portal The Conversation.

His attempt to place things in perspective begins with a question. What is the basis upon which something has rights? What gives an entity moral standing?

That my phone has a kind of intelligence is obvious, because the answers that the voice assistant comes up with in response to questions are often indistinguishable from how a human might answer. But this is rather basic. Science has been at work to push those boundaries. Three years ago, DeepMind's AlphaZero, a successor to the Go-playing AlphaGo, taught itself chess well enough to beat the strongest chess programs. A very gracious Garry Kasparov, the former world champion who had himself lost to IBM's Deep Blue in 1997, applauded the algorithm and called its success a victory for humankind.

Advances such as these place in perspective why my younger daughter sneaks away with my phone when she thinks no one is looking, as if running off with a friend. She asks Siri to play her a song, tell her a joke, help with her homework. The algorithm powering the device does all that, and rather nonchalantly. When looked at from a distance, it appears, they bond, my daughter and the bot.

Now, it is broadly agreed that rights are to be granted only to beings with a capacity for consciousness. That's why animals have rights, in our systems of justice, and not hills or rocks.

It is also generally agreed that there are two types of consciousness. One is linked to the experience of phenomena: the scent of a rose, the prick of a thorn. Our devices are bereft of this phenomenal consciousness.

But they do have what is called access consciousness, Vaidya points out. In the same way that you can automatically catch and pass a ball mid-game on reflex, a smart device can alert me when it is low on battery and suggest I recharge, save my work, switch to another device.

As the algorithms that allow it to do that evolve, and artificial intelligence gets smarter, developing even more advanced forms of this access consciousness, it is conceivable that a future algorithm will interact very differently with a younger user than with an adult. That it will know one from the other more specifically.

Isn't it time, then, that we started thinking about creating a code of conduct around how we will interact with such devices, how we will allow them to interact with each other, and at what points we will intervene to control, moderate, or terminate?

I believe it is time we started thinking about these ethical voids. Because the AI of science fiction is still in the future, but we can feel it getting closer all the time.


Are we ready for bots with feelings? Life Hacks by Charles Assisi - Hindustan Times

December 12th, 2020 at 7:54 am

What are proteins and why do they fold? – DW (English)

Posted: at 7:53 am

The proteins in our bodies are easily confused with the protein in food. There are similarities and links between the two; for example, both consist of amino acids.

But, when scientists talk about proteins in biology, they are talking about tiny but complex molecules that perform a huge range of functions at a cellular level, keeping us healthy and functioning as a whole.

Scientists will often talk about proteins "folding" and say that when they fold properly, we're OK. The way they fold determines their shape, or 3D structure, and that determines their function.

But, when proteins fail to fold properly, they malfunction, leaving us susceptible to potentially life-threatening conditions.

We don't fully understand why: why proteins fold and how, and why it doesn't always work out.

When proteins go wrong: 'Lewy bodies' or protein deposits in neurons can lead to Parkinson's disease

The whole thing has been bugging biologists for 50 or 60 years, with three questions summarized as the "protein-folding problem."

It appears that that final question has now been answered, at least in part.

An artificial intelligence systemknown as AlphaFold can apparently predict the structure of proteins.

AlphaFold is a descendant of AlphaGo, a gaming AI that beat human Go champion Lee Sedol in 2016. Go is a game like chess, but tougher to the power of 10.

DeepMind, the company behind AlphaFold, is calling it a "major scientific advance."

To be fair, it's not the first time that scientists have reported they have used computer modeling to predict the structure of proteins; they have done that for a decade or more.

Perhaps it's the scale that AI brings to the field: the ability to do more, faster. DeepMind says it hopes to map the structures of the human proteome soon, the same way that scientists sequenced the human genome and gave us all our knowledge about DNA.

But why do it? What is it about proteins that makes them so important for life?

Well, predicting protein structure may help scientists predict your health for instance, the kinds of cancer you may or may not be at risk of developing.

Proteins are indeed vital for life; they are like mechanical components, such as cogs in a watch or strings and keys in a piano.

Proteins form when amino acids connect in a chain. And that chain "folds" into a 3D structure. When it fails to fold, it forms a veritable mess a sticky lump of dysfunctional nothing.

Proteins can lend strength to muscle cells, or form neurons in the brain. The US National Institutes of Health lists five main groups of proteins and their functions:

There can be between 20,000 and 100,000 unique types of proteins within a human cell. They form out of an average of 300 amino acids, sometimes referred to as protein building blocks. Each is a mix of the 22 different known amino acids.

Those amino acids are chained together, and the sequence, or order, of that chain determines how the protein folds upon itself and, ultimately, its function.
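
As a small illustration of the point that the chain's sequence determines everything downstream, a protein can be modeled as a string over the one-letter amino-acid alphabet, and simple properties computed from that sequence alone. The fragment below is hypothetical, not a real protein, and for simplicity the alphabet covers only the 20 standard amino acids (omitting the rarer 21st and 22nd, selenocysteine and pyrrolysine):

```python
from collections import Counter

# The 20 standard amino acids, as one-letter codes.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

def composition(seq):
    """Validate a protein sequence and return its amino-acid counts."""
    seq = seq.upper()
    bad = set(seq) - AMINO_ACIDS
    if bad:
        raise ValueError(f"not standard amino-acid codes: {sorted(bad)}")
    return Counter(seq)

frag = "MKTAYIAKQR"             # hypothetical 10-residue fragment
counts = composition(frag)
print(len(frag), counts["K"])   # 10 2: ten residues, lysine appears twice
```

Predicting the folded 3D structure from such a string is, of course, the vastly harder problem that AlphaFold addresses.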

Protein-folding can be a process of hit-and-miss. It's a four-part process that usually begins with two basic folds.

Healthy proteins depend on a specific sequence of amino acids and how the molecule 'folds' and coils

First, parts of a protein chain coil up into what are known as "alpha helices."

Then, other parts or regions of the protein form "beta sheets," which look a bit like the improvised paper fans we make on a hot summer's day.

In steps three and four, you get more complex shapes. The two basic structures combine into tubes and other shapes that resemble propellers, horseshoes or jelly rolls. And that gives them their function.

Tube or tunnel-like proteins, for instance, can act as an express route for traffic to flow in and out of cells. There are "coiled coils" that move like snakes to enable a function in DNA clearly, it takes all types in the human body.

Successful protein folding depends on a number of things, such as temperature, sufficient space in a cell and, it is said, even electrical and magnetic fields.

Temperature and acidity (pH values) in a cell, for instance, can affect the stability of a protein: its ability to hold its shape and therefore perform its correct function.

Chaperone proteins can assist other proteins while folding and help mitigate bad folding. But it doesn't always work.

Misfolded proteins are thought to contribute to a range of neurological diseases, including Alzheimer's, Parkinson's and Huntington's disease, and ALS.

It's thought that when a protein fails to fold and perform a specific function, known as "loss of function," that specific job just doesn't get done.

As a result, cells can get tired (for instance, when a protein isn't there to give them the energy they need) and eventually they get sick.

Researchers have been trying to understand why some proteins misfold more than others, why chaperones sometimes fail to help, and why exactly misfolded proteins cause the diseases they are believed to cause.

Who knows? DeepMind's AlphaFold may help scientists answer these questions a lot faster now. Or throw up even more questions to answer.

(Embedded DW picture gallery: insects as a sustainable, protein-rich food source around the world, from mopane caterpillars to farmed mealworms. Gallery author: Lori Herber.)

What are proteins and why do they fold? - DW (English)

December 12th, 2020 at 7:53 am

Are Computers That Win at Chess Smarter Than Geniuses? – Walter Bradley Center for Natural and Artificial Intelligence

Posted: December 4, 2020 at 5:52 am

Big computers conquered chess quite easily. But then there was the Chinese game of go (pictured), estimated to be 4,000 years old, which offers more degrees of freedom (possible moves, strategy, and rules) than chess, with some 2.1 × 10^170 legal board positions. As futurist George Gilder tells us in Gaming AI, it was a rite of passage for aspiring intellects in Asia: "Go began as a rigorous rite of passage for Chinese gentlemen and diplomats, testing their intellectual skills and strategic prowess. Later, crossing the Sea of Japan, Go enthralled the Shogunate, which brought it into the Japanese Imperial Court and made it a national cult." (p. 9)

Then AlphaGo, from Googles DeepMind, appeared on the scene in 2016:

As the Chinese American titan Kai-Fu Lee explains in his bestseller AI Superpowers, the riveting encounter between man and machine across the Go board had a powerful effect on Asian youth. Though mostly unnoticed in the United States, AlphaGo's 2016 defeat of Lee Sedol was avidly watched by 280 million Chinese, and Sedol's loss was a shattering experience. The Chinese saw DeepMind as an alien system defeating an Asian man in the epitome of an Asian game.

Lee Se-dol, the Korean champion who was 33 when he lost to AlphaGo, later announced his retirement from the game. Meanwhile, Gilder tells us, that defeat, plus a later one, sparked a huge surge in Chinese investment in AI in response: "Less than two months after Ke Jie's defeat, the Chinese government launched an ambitious plan to lead the world in artificial intelligence by 2030. Within a year, Chinese venture capitalists had already surpassed US venture capitalists in AI funding."

AI went on to conquer poker, StarCraft II, and virtual aerial dogfights.

The machines won because improvements in machine learning techniques such as reinforcement learning enable much more effective data crunching. In fact, soon after the defeats of human go champions, a more sophisticated machine was beating a less sophisticated machine at go. As Gilder tells it, in 2017, Google's DeepMind launched AlphaGo Zero. Using a generic adversarial program, AlphaGo Zero played itself billions of times and then went on to defeat AlphaGo 100 games to 0 (p. 11). This incident went largely unremarked because it was a mere conflict between machines.

But what has really happened with computers, humans, and games is not what we are sometimes urged to think, that machines are rapidly developing human-like capacities. In all of these games, one feature stands out: The map is the territory.

Think of a simple game like checkers. There are 64 squares and each of two players is given 12 pieces. Each player tries to eliminate the other player's pieces from the board, following the rules. Essentially, in checkers, there is nothing beyond the pieces, the board, and the official rules. Like go, it's a map and a territory all in one.

Games like chess, go, and poker are vastly more complex than checkers in their degrees of freedom. But they all resemble checkers in one important way: In all cases, the map is the territory. And that limits the resemblance to reality. As Gilder puts it, "Go is deterministic and ergodic; any specific arrangement of stones will always produce the same results, according to the rules of the game. The stones are at once symbols and objects; they are always mutually congruent." (pp. 50-51)
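
The scale behind those "degrees of freedom" is easy to check directly. A crude upper bound on Go configurations treats each of the board's 361 points as empty, black, or white; even before discarding illegal positions (the legal count is the cited 2.1 × 10^170), the raw total dwarfs the roughly 10^80 atoms in the observable universe:

```python
import math

points = 19 * 19                  # 361 intersections on a Go board
upper_bound = 3 ** points         # empty / black / white at each point
digits = math.floor(math.log10(upper_bound)) + 1
print(digits)                     # 173: about 10^172 raw configurations

atoms = 10 ** 80                  # rough atom count, observable universe
print(upper_bound > atoms ** 2)   # True: more than the atom count squared
```

Brute-force enumeration, Deep Blue style, is therefore hopeless for Go, which is why AlphaGo's learning-based approach was needed at all.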

In other words, the structure of a game rules out, by definition, the very types of events that occur constantly in the real world where, as many of us have found reason to complain, the map is not the territory.

Or, as Gilder goes on to say in Gaming AI,

"Plausible on the Go board and other game arenas, these principles are absurd in real world situations. Symbols and objects are only roughly correlated. Diverging constantly are maps and territories, population statistics and crowds of people, climate data and the actual weather, the word and the thing, the idea and the act. Differences and errors add up as readily and relentlessly on gigahertz computers as lily pads on the famous exponential pond."

Generally, AI succeeds wherever the skill required to win is calculation and the territory is only a map. For example, take IBM Watson's win at Jeopardy! in 2011. As Larry L. Linenschmidt of Hill Country Institute has pointed out, Watson had, it would seem, a built-in advantage by having infinite (maybe not infinite, but virtually infinite) information available to it to do those matches.

Indeed. But Watson was a flop later in clinical medicine. That's probably because computers only calculate, and not everything in the practice of medicine in a real-world setting is a matter of calculation.

Not every human intellectual effort involves calculation. That's why increases in computing power cannot solve all our problems. Computers are not creative and they do not tolerate ambiguity well. Yet success in the real world consists largely in mastering these non-computable areas.

Science fiction has dreamed that ramped-up calculation will turn computers into machines that can think like humans. But even the steepest, most impressive calculations do not suddenly become creativity, for the same reasons as maps do not suddenly become the real-world territory. To think otherwise is to believe in magic.

Note: George Gilder's book, Gaming AI, is free for download here.

You may also enjoy: Six limitations of artificial intelligence as we know it. You'd better hope it doesn't run your life, as Robert J. Marks explains to Larry Linenschmidt.

Are Computers That Win at Chess Smarter Than Geniuses? - Walter Bradley Center for Natural and Artificial Intelligence

December 4th, 2020 at 5:52 am

An AI winter may be inevitable. What we should fear more: an AI ice age – ITProPortal

Posted: at 5:52 am

In The Queen's Gambit, the recent Netflix drama about a chess genius, the main character is incredibly focused and driven. You might even say machine-like. Perhaps you could go so far as to say she's a little bit like an incredibly single-minded Artificial Intelligence (AI) program like AlphaGo.

Hoping not to give any spoilers here, but in the drama, Beth eventually succeeds not just because she's a chess prodigy, able to see the board many moves ahead. She succeeds because she teams up with fellow players who give her hints and tips on the psychology and habits of her main Big Boss opponent.

In other words, she employs tactics, strategy, reasoning and planning; she sees more than the board. She reads the room, one might say. Emotions play a huge part in all she does and are key to her eventual triumph in Moscow.

And this is why we're potentially in a lot of trouble in AI. AlphaGo can't do any of what Beth and her friends do. It's a brilliant bit of software, but it's an idiot savant: all it knows is Go.

Right now, very few people care about that. Which is why I fear we may be headed not just into another AI Winter, but an almost endless AI Ice Age: perhaps decades of rejection of the approach, all the VC money taps turned off, a lack of university research funding, all the things we saw in the first AI Winter of 1974-80 and the second of 1987-93.

Only much, much worse.

I'm also convinced the only way to save the amazing achievements we've seen with programs like AlphaGo is to make them more like Beth: able to see much, much more than just the board in front of them.

Let's put all this in context. Right now, we are without doubt enjoying the best period AI has ever had. Years and years of hard backroom slog at the theoretical level have been accompanied by superb improvements in hardware performance, a combination that raised our game really without us asking. Hence today's undoubted AI success story: Machine Learning. Everyone is betting on this approach and its roots in Deep Learning and Big Data, which is fine; genuine progress and real applications are being delivered at the moment.

But there's a hard stop coming. One of the inherent issues for Deep Learning is that you need bigger and bigger neural networks and more parameters to achieve more than you did last time, and so you soon end up with incredible numbers of parameters: the full version of GPT-3 has 175 billion. But training networks of that size takes immense computational power, and even though Moore's Law continues to be our friend, even it has limits. And we're set to reach them a lot sooner than many would like to think.
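
To put the 175-billion-parameter figure in hardware terms, here is a back-of-the-envelope estimate; the 2-bytes-per-parameter (half precision) figure and the ~16-bytes-per-parameter training multiplier are common rules of thumb, not published specs for GPT-3 itself:

```python
params = 175_000_000_000        # GPT-3, full version
weights_gb = params * 2 / 1e9   # fp16: 2 bytes per parameter
print(round(weights_gb))        # 350 GB just to store the weights

# Training needs far more: fp32 master weights plus Adam optimizer moments,
# often estimated at around 16 bytes per parameter in mixed precision.
training_gb = params * 16 / 1e9
print(round(training_gb))       # 2800 GB of weight and optimizer state
```

Numbers like these, spread across hundreds of accelerators for weeks, are why "just make it bigger" runs into the hard stop described above.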

Despite its reputation for handwaving and love of unobtanium, the AI field is full of realists. Most have painful memories of what happened the last time AI's promise met intractable reality, a cycle which gives rise to the concept of the AI Winter. In the UK in 1973, a scathing analysis (the infamous Lighthill Report) concluded that AI just wasn't worth putting any more money into. Similarly fed up, the once amazingly generous Defence paymasters in the US ended the first heuristic-search based boom, and the field went into steep decline until the expert systems/knowledge engineering explosion of the 1980s, which also, eventually, went cold when too many over-egged promises met the real world.

To be clear, both periods provided incredible advances, including systems that saved money for people and improved industries. AI never goes away, either; anyone working in IT knows that there's always advanced programming and smart systems somewhere helping out. We don't even call them AI anymore; they just work without issue. So on one hand, AI won't stop, even if it goes out of fashion once again; getting computers and robots to help us is just too useful an endeavour to stop.

But what will happen is an AI Winter that will follow today's boom. Sometime soon, data science will stop being fashionable; ML models will stop being trusted; entrepreneurs offering the City a Deep Learning solution to financial problem X won't have their emails returned.

And what might well happen beyond that is even worse: not just a short period of withdrawal of interest, but a deep, deep freeze, 10, 20, 30 years long. I don't want to see that happen, and that's not just because I love AI or want my very own HAL 9000 (though, of course, I do; so do you). I don't want to see it happen because I know that Artificial Intelligence is real, and while there may be genuinely fascinating philosophical arguments for and against it, eventually we will create something that can do things as well as humans can.

But note that I said things. AlphaGo is better than all of us (certainly me) at playing games. Google Translate is better than me at translating multiple languages, and so on. What we need are smart systems that are better at more than one thing, systems that can start being intelligent, even at very low levels, outside a very narrow domain. Step forward AGI, Artificial General Intelligence: suites of programs that apply intelligence to a wide variety of problems, in much the same ways humans can.

For example: we've only been focused on learning for the last 15 years. But AI done properly needs to cover a range of intelligence capabilities, of which being able to learn and spot patterns is just one; there's reasoning, there's understanding, and there's a lot of other types of intelligence capability that should be part of an overall Artificial Intelligence practice or capability.

We know why that is: we're focused on learning because we got good traction with that and made solid progress. But there are all the other AI capabilities that we should also be looking at and investigating, and we're just not. It's a Catch-22: all the smart money is going into Machine Learning because that's where we're seeing the most progress, but we're only seeing the most progress in learning because that's where all the investment is going!

To sum up, then: we may not be able to stave off a Machine Learning AI Winter; perhaps it's an inevitable cycle. But to stave off an even worse and very, very destructive AI Ice Age, I think we need to widen the focus here, get AGI back on its feet, and help our Beths get better at a lot more than just chess, or we're just going to see them turned off, one by one.

Andy Pardoe, founder and managing director, Pardoe Ventures

See the original post here:

An AI winter may be inevitable. What we should fear more: an AI ice age - ITProPortal

Written by admin

December 4th, 2020 at 5:52 am

Posted in Alphago

What the hell is reinforcement learning and how does it work? – The Next Web

Posted: November 2, 2020 at 1:56 am

without comments

Reinforcement learning is a subset of machine learning. It enables an agent to learn through the consequences of actions in a specific environment. It can be used to teach a robot new tricks, for example.

Reinforcement learning is a behavioral learning model: the algorithm receives feedback on its actions from the environment, directing the agent toward the best result.

It differs from supervised learning in that no sample data set trains the machine. Instead, it learns by trial and error: a series of right decisions strengthens the approach, because it better solves the problem.

Reinforcement learning is similar to how we humans learn as children. We all went through this kind of reinforcement: when you started crawling and tried to get up, you fell over and over, but your parents were there to lift you and teach you.

It is teaching based on experience, in which the machine must deal with what went wrong before and look for the right approach.

Although we do set the reward policy, that is, the rules of the game, we give the model no tips or advice on how to solve it. It is up to the model to figure out how to execute the task to optimize the reward, beginning with random trials and working up to sophisticated tactics.

By exploiting the power of search and many trials, reinforcement learning is currently the most effective way to hint at a computer's creativity. Unlike humans, an artificial intelligence can gain experience from thousands of side games, provided the reinforcement learning algorithm runs on a sufficiently robust computer infrastructure.
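This trial-and-error loop can be sketched with tabular Q-learning on a toy problem. Everything here, the chain environment, the rewards, and the hyperparameters, is invented for illustration and is not from any particular system:

```python
import random

# Toy chain environment: states 0..4, start at state 0, reward only at the goal.
# Actions: 0 = move left, 1 = move right.
N_STATES, GOAL = 5, 4

def step(state, action):
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action]
alpha, gamma, epsilon = 0.5, 0.9, 0.1
random.seed(0)

for episode in range(200):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
        if random.random() < epsilon:
            action = random.randint(0, 1)
        else:
            action = 0 if Q[state][0] > Q[state][1] else 1
        nxt, reward, done = step(state, action)
        # Q-learning update: nudge Q toward reward + discounted best future value.
        target = reward + gamma * max(Q[nxt])
        Q[state][action] += alpha * (target - Q[state][action])
        state = nxt

# After training, the greedy policy moves right (action 1) in every state.
policy = [0 if q[0] > q[1] else 1 for q in Q]
print(policy[:GOAL])  # [1, 1, 1, 1]
```

No reward policy hints are given beyond the reward signal itself; the table fills in purely from repeated episodes, which is the trial-and-error process the article describes.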

An example of reinforcement learning is YouTube's recommendations. After you watch a video, the platform will show you similar titles that it believes you will like. However, suppose you start watching the recommendation and do not finish it. In that case, the machine understands that the recommendation was not a good one and will try another approach next time.

Reinforcement learning's key challenge is planning the simulation environment, which relies heavily on the task to be performed. When training on Chess, Go, or Atari games, preparing the simulation environment is relatively easy. For a model capable of driving an autonomous car, building a realistic prototype is key before letting the car ride the street: the model must learn how to brake or prevent a collision in a safe environment. Transferring the model from the training setting to the real world is where it becomes problematic.

Scaling and modifying the agent's neural network is another problem. There is no way to communicate with the network except through incentives and penalties. This may lead to catastrophic forgetting, where gaining new information causes some of the old knowledge to be removed from the network. In other words, we must keep learning in the agent's memory.
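A standard way to "keep learning in the agent's memory" is an experience replay buffer: past transitions are stored, and training batches mix old and new experience so fresh updates are less likely to overwrite earlier knowledge. A minimal sketch, with class and parameter names of my own invention rather than from any specific library:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop off first

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling mixes older and newer experience in every batch.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)

# Usage: store transitions as the agent acts, then train on random mini-batches.
buf = ReplayBuffer(capacity=100)
for t in range(150):
    buf.push(t, 0, 0.0, t + 1, False)
print(len(buf))         # 100: the capacity caps the buffer
batch = buf.sample(32)  # a mix of old and new transitions for one update
```

The fixed capacity is a deliberate trade-off: very old experience eventually ages out, but within the window every update sees a spread of past behaviour rather than only the most recent episode.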

Another difficulty is the agent reaching a local optimum, that is, executing the mission as given, but not in the ideal or required manner. A hopper jumping like a kangaroo instead of doing what is expected of it is a perfect example. Finally, some agents will maximize the prize without completing their mission.


RL is so well known today because it is the conventional approach used to solve various games, sometimes achieving superhuman performance.

The most famous examples must be AlphaGo and AlphaGo Zero. AlphaGo, trained on countless human games, achieved superhuman performance using Monte Carlo tree search (MCTS) together with its value network and policy network. The researchers then tried a purer approach to RL: training an agent from scratch. They left the new agent, AlphaGo Zero, to play against itself, and it finally defeated AlphaGo 100-0.

Personalized recommendations

News recommendation has always faced several challenges, including the rapidly changing dynamics of news, users who tire easily, and a click rate that cannot reflect the user retention rate. Guanjie et al. applied RL to the news recommendation system in a paper entitled DRN: A Deep Reinforcement Learning Framework for News Recommendation to tackle these problems.

In practice, they built four categories of features, namely: A) user features and B) context features as state features, and C) user-news features and D) news features as action features. The four sets of features were fed into a Deep Q-Network (DQN) to calculate the Q value. A news list was chosen for recommendation based on the Q values, and the user's click on the news was part of the reward the RL agent received.
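As a rough illustration of the mechanics, not the actual DRN network, the Q value can be thought of as a score computed over the concatenated state and action features, with the top-scoring candidate recommended. The linear "Q-network", the weights, and every feature value below are invented:

```python
# Toy stand-in for the DRN idea: score each candidate news item with a Q function
# over concatenated feature groups, then recommend the highest-scoring item.

def q_value(user_feats, context_feats, user_news_feats, news_feats, weights):
    # State features (user, context) + action features (user-news, news).
    x = user_feats + context_feats + user_news_feats + news_feats
    return sum(w * xi for w, xi in zip(weights, x))

user = [0.2, 0.7]   # invented user features (state)
context = [1.0]     # invented context features (state)
candidates = {      # per item: (user-news features, news features) -- all invented
    "sports":  ([0.1], [0.9, 0.1]),
    "finance": ([0.8], [0.2, 0.6]),
    "weather": ([0.3], [0.4, 0.4]),
}
weights = [0.5, 0.5, 0.2, 1.0, 0.3, 0.3]  # one weight per concatenated feature

scores = {
    name: q_value(user, context, un, news, weights)
    for name, (un, news) in candidates.items()
}
recommended = max(scores, key=scores.get)
print(recommended)  # the item with the highest Q value
```

In the real system a deep network replaces the linear scorer, and whether the user clicks the recommended item feeds back in as part of the reward.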

The authors also employed other techniques to solve other challenging problems, including experience replay, survival models, Dueling Bandit Gradient Descent, and so on.

Resource management in computer clusters

Designing algorithms to allocate limited resources to different tasks is challenging and requires human-generated heuristics.

The article Resource management with deep reinforcement learning explains how to use RL to automatically learn how to allocate and schedule computer resources for jobs on hold to minimize the average job (task) slowdown.

The state space was formulated as the current resource allocation and the resource profile of jobs. For the action space, they used a trick to allow the agent to choose more than one action at each time step. The reward was the sum of (-1 / job duration) across all jobs in the system. They then combined the REINFORCE algorithm with a baseline value to calculate the policy gradients and find the policy parameters giving the probability distribution over actions that minimizes the objective.
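The heart of REINFORCE with a baseline can be sketched in a few lines: compute discounted returns for an episode, subtract a baseline, and use the resulting advantages to weight the log-probability gradients. The episode rewards and discount factor below are illustrative, not taken from the paper:

```python
# Core of REINFORCE with a baseline, shown for one episode.

def discounted_returns(rewards, gamma):
    """G_t = r_t + gamma * G_{t+1}, computed backwards over the episode."""
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    return returns[::-1]

rewards = [0.0, 0.0, 1.0]  # e.g. reward arrives only at the end of the episode
returns = discounted_returns(rewards, gamma=0.9)

baseline = sum(returns) / len(returns)      # simple average-return baseline
advantages = [g - baseline for g in returns]

# In the full algorithm, each advantage scales grad(log pi(a_t | s_t)).
print([round(g, 3) for g in returns])       # [0.81, 0.9, 1.0]
print(round(abs(sum(advantages)), 6))       # ~0: the baseline centers the returns
```

Subtracting the baseline does not change the expected gradient, but it reduces its variance, which is why the combination converges faster than plain REINFORCE.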

Traffic light control

In the article Multi-agent system based on reinforcement learning to control network traffic signals, the researchers tried to design a traffic-light controller to solve the congestion problem. Tested only in a simulated environment, their method showed results superior to traditional methods and shed light on multi-agent RL's possible uses in traffic system design.

Five agents were placed in the five-intersection traffic network, with an RL agent at the central intersection to control traffic signaling. The state was defined as an eight-dimensional vector, with each element representing the relative traffic flow of each lane. Eight options were available to the agent, each representing a combination of phases, and the reward function was defined as the reduction in delay compared to the previous step. The authors used DQN to learn the Q value of {state, action} pairs.


There is incredible work on applying RL in robotics. We recommend reading this paper on the results of RL research in robotics. In other work, researchers trained a robot to learn policies mapping raw video images to the robot's actions: the RGB images were fed into a CNN, and the outputs were the motor torques. The RL component was guided policy search, which generates training data from its own state distribution.

Web systems configuration

There are more than 100 configurable parameters in a web system, and the process of adjusting them requires a qualified operator and many trial-and-error tests.

The article on a reinforcement learning approach to the self-configuration of online web systems showed the first attempt in the domain at autonomously reconfiguring parameters in multi-tier web systems in dynamic VM-based environments.

The reconfiguration process can be formulated as a finite MDP. The state-space was the system configuration; the action space was {increase, decrease, maintain} for each parameter. The reward was defined as the difference between the intended response time and the measured response time. The authors used the Q-learning algorithm to perform the task.
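To make that formulation concrete, here is a single tabular Q-learning update for this MDP; the parameter name, step size, and response times are invented for illustration and are not from the paper:

```python
# One Q-learning step for the web-configuration MDP sketched above.
# State: a configuration; actions: increase / decrease / maintain a parameter;
# reward: intended response time minus measured response time.

ACTIONS = ("increase", "decrease", "maintain")

def apply_action(config, param, action, step=16):
    new = dict(config)
    if action == "increase":
        new[param] += step
    elif action == "decrease":
        new[param] = max(step, new[param] - step)
    return new

def reward(intended_ms, measured_ms):
    return intended_ms - measured_ms  # positive when we beat the target

Q = {}  # maps (config-as-tuple, action) -> value
alpha, gamma = 0.5, 0.9

state = {"MaxClients": 128}              # hypothetical tunable parameter
action = "increase"
next_state = apply_action(state, "MaxClients", action)
r = reward(intended_ms=200, measured_ms=150)  # measured 150 ms vs 200 ms target

key = (tuple(sorted(state.items())), action)
next_best = max(
    Q.get((tuple(sorted(next_state.items())), a), 0.0) for a in ACTIONS
)
Q[key] = Q.get(key, 0.0) + alpha * (r + gamma * next_best - Q.get(key, 0.0))

print(next_state["MaxClients"], Q[key])  # 144 25.0
```

Repeating this update as the system runs gradually steers the configuration toward settings whose measured response time beats the target.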

Although the authors used some other techniques, such as policy initialization, to remedy the large state space and the computational complexity of the problem, rather than the full combination of RL and neural networks, it is believed that this pioneering work prepared the way for future research in the area.


RL can also be applied to optimize chemical reactions. Researchers have shown that their model has outdone a state-of-the-art algorithm and generalized to different underlying mechanisms in the article Optimizing chemical reactions with deep reinforcement learning.

Combined with an LSTM to model the policy function, the RL agent optimized the chemical reaction using a Markov decision process (MDP) characterized by {S, A, P, R}, where S was the set of experimental conditions (such as temperature, pH, etc.), A was the set of all possible actions that can change the experimental conditions, P was the probability of transition from the current experimental condition to the next, and R was the reward, a function of the state.
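The {S, A, P, R} decomposition maps naturally onto code. Everything below, the conditions, the actions, and the toy "yield" reward, is invented for illustration and is not the paper's actual setup:

```python
# Toy encoding of the MDP {S, A, P, R} for reaction optimization.

state = {"temperature_C": 25.0, "pH": 7.0}           # S: experimental conditions

actions = [                                          # A: changes to the conditions
    {"temperature_C": +10.0}, {"temperature_C": -10.0},
    {"pH": +0.5}, {"pH": -0.5},
]

def transition(state, action):                       # P: here deterministic
    nxt = dict(state)
    for k, dv in action.items():
        nxt[k] += dv
    return nxt

def reward(state):                                   # R: toy yield, peaked at 60 C / pH 8
    return -abs(state["temperature_C"] - 60.0) - 10.0 * abs(state["pH"] - 8.0)

# Greedy one-step lookahead: pick the action whose next state scores best.
best = max(actions, key=lambda a: reward(transition(state, a)))
print(best)  # {'temperature_C': 10.0}
```

The real agent learns a policy over many such steps rather than doing one-step lookahead, but the same four ingredients define the problem.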

The application is excellent for demonstrating how RL can reduce time-consuming trial-and-error work in a relatively stable environment.

Auctions and advertising

Researchers at Alibaba Group published the article Real-time bidding with multi-agent reinforcement learning in display advertising. They stated that their distributed cluster-based multi-agent bidding solution (DCMAB) has achieved promising results, and they therefore plan to test it live on the Taobao platform.

Generally speaking, the Taobao ad platform is a place for marketers to bid to show ads to customers. This is a multi-agent problem because merchants bid against each other, and their actions are interrelated. In the article, merchants and customers were clustered into different groups to reduce computational complexity. The agents' state space reflected the agents' cost-revenue status, the action space was the (continuous) bid, and the reward was the revenue of the customer cluster.

Deep learning

Recently, more and more attempts to combine RL with other deep learning architectures have appeared, with impressive results.

One of RL's most influential works is DeepMind's pioneering effort to combine CNNs with RL. In doing so, the agent can see the environment through high-dimensional sensors and then learn to interact with it.

RNNs with RL are another combination people use to try new ideas. An RNN is a type of neural network that has memory; when combined with RL, it offers agents the ability to memorize things. For example, researchers combined an LSTM with RL to create a deep recurrent Q-network (DRQN) for playing Atari 2600 games. They also used an LSTM with RL to solve problems in optimizing chemical reactions.

DeepMind showed how to use generative models and RL to generate programs. In the model, the adversarially trained agent used the signal as a reward for improving actions, rather than propagating gradients to the input space as in GAN training. Incredible, isn't it?

Reinforcement is done with rewards according to the decisions made; the agent can learn continuously from its interactions with the environment at all times. Each correct action brings a positive reward, and incorrect decisions bring penalties. In industry, this type of learning can help optimize processes, simulations, monitoring, maintenance, and the control of autonomous systems.

Some criteria can be used in deciding where to use reinforcement learning:

In addition to industry, reinforcement learning is used in various fields such as education, health, finance, and image and text recognition.

This article was written by Jair Ribeiro and was originally published on Towards Data Science. You can read it here.

Published October 27, 2020 10:49 UTC

Originally posted here:

What the hell is reinforcement learning and how does it work? - The Next Web

Written by admin

November 2nd, 2020 at 1:56 am

Posted in Alphago
