Last week, Google announced the results of its new AlphaZero (aka AlphaZ) AI program that may revolutionize the use of AI in all fields including law.
Here’s the deal: we all remember Google’s AlphaGo, an AI program Google developed that beat the world’s human champion of Go, which is often called the most complicated game yet developed. Unlike, say, chess, Go is a very difficult, open-ended game.
Google did this by feeding AlphaGo game after Go game until the machine trained itself and learned how to play. AlphaGo literally had to look at thousands of human amateur and professional games before it could competently play. Gradually, it evaluated positions and began selecting moves using its neural networks. These deep neural networks were trained by supervised learning from human expert moves, and by reinforcement learning from self-play. This process took months and months. A similar but more rudimentary type of learning was employed by Watson and Deep Blue to win at Jeopardy and chess.
But Google then took AlphaGo and evolved it. Instead of feeding its new program, called AlphaZ, countless games from which it could learn, Google gave it only the rules of Go and told it to figure things out and teach itself. Once it had the rules, AlphaZ played itself millions of games a second, starting with random play. In doing so, within 8 hours it taught itself, surpassed the human level of play, and defeated AlphaGo (which bested the human champion) by 100 games to 0.
Last week, Google published a paper in Nature discussing its findings. Google says AlphaZ was able to defeat the previous version so quickly by using a “novel form of reinforcement learning.” The system starts off with a neural network that knows nothing about the game of Go. It then takes the rules it is fed and plays games against itself, combining its neural network with a powerful search algorithm. As it plays, the neural network tunes and updates itself to predict moves and the eventual winner of the games.
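For readers who want to see the self-play idea in miniature, here is a rough sketch in Python. It is not Google's method: a simple lookup table stands in for the neural network and the search algorithm, and the game is a toy stone-taking game (take 1 to 3 stones, whoever takes the last stone wins) rather than Go. The function names and settings are made up for illustration. The program is given only the rules, starts out knowing nothing, and improves purely by playing against itself.

```python
import random
from collections import defaultdict


def legal_moves(stones):
    """The rules: a player may take 1, 2 or 3 stones, never more than remain."""
    return [m for m in (1, 2, 3) if m <= stones]


def best_move(q, stones):
    """Pick the move the table currently rates highest for this pile size."""
    return max(legal_moves(stones), key=lambda m: q[(stones, m)])


def train_by_self_play(start=10, games=20000, explore=0.2, rate=0.1, seed=0):
    """Learn move values from nothing but the rules, by playing both sides."""
    rng = random.Random(seed)
    q = defaultdict(float)  # (stones, move) -> estimated value for the mover
    for _ in range(games):
        stones = start
        while stones > 0:
            moves = legal_moves(stones)
            # Mostly play the best known move, sometimes try a random one.
            if rng.random() < explore:
                move = rng.choice(moves)
            else:
                move = best_move(q, stones)
            remaining = stones - move
            if remaining == 0:
                target = 1.0  # taking the last stone wins
            else:
                # Whatever is good for the opponent next turn is bad for us.
                target = -max(q[(remaining, m)] for m in legal_moves(remaining))
            q[(stones, move)] += rate * (target - q[(stones, move)])
            stones = remaining
    return q


if __name__ == "__main__":
    table = train_by_self_play()
    for pile in (5, 6, 7, 10):
        print(pile, "stones -> take", best_move(table, pile))
```

No human games are fed in at any point. The only inputs are the rules in `legal_moves` and the fact that taking the last stone wins, which is the same division of labor the paragraph above describes, at a vastly smaller scale.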
By the way, after it mastered Go, for kicks I suppose, AlphaZ taught itself how to play chess within 4 hours to where it could beat the world’s best computer chess program, Stockfish.
Why’s this important? For AI to truly work as hyped, it must learn apart from humans, and quickly, to be effective. Thus far we have generally trained computers by feeding them examples and real situations. The computer looks at many, many questions and correct answers based on these examples to learn the correlation between the two. Once it does that, it then applies that correlation to questions it has not seen before.
This is a time-consuming, cumbersome and somewhat complicated process. What Google did with AlphaZ is entirely different because the program looks only at a set of rules and then trains itself, not from examples but based on the rules. As can be seen from the difference in learning times between AlphaGo and AlphaZ, once a computer can teach itself, it can work at a much faster pace and reach valid conclusions much more quickly. In essence, AlphaZ has completed the transition from rules-based or even data-based learning to something entirely different.
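For contrast, the older example-driven approach described above can be sketched in a few lines of Python. This is a deliberately crude illustration, not how any real review product works: the four labeled "documents" are invented, and the word-counting scheme is a stand-in for far more sophisticated models. The point is only that a human must first supply examples with the right answers attached.

```python
from collections import Counter


def train(examples):
    """Learn a weight per word from (text, is_relevant) pairs labeled by a human."""
    weights = Counter()
    for text, is_relevant in examples:
        for word in set(text.lower().split()):
            weights[word] += 1 if is_relevant else -1
    return weights


def predict(weights, text):
    """Apply the learned correlation to a document the program has not seen."""
    score = sum(weights[word] for word in set(text.lower().split()))
    return score > 0


# Hypothetical training set: a human reviewer has already marked each one.
labeled_examples = [
    ("merger agreement draft attached", True),
    ("revised merger terms for review", True),
    ("lunch order for friday", False),
    ("office party friday afternoon", False),
]

if __name__ == "__main__":
    model = train(labeled_examples)
    print(predict(model, "please review the merger draft"))
    print(predict(model, "friday lunch plans"))
```

Everything the program knows comes from the human-labeled list. With too few examples, or badly labeled ones, its answers degrade, which is why this kind of training takes so much human time.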
If nothing else, the sheer speed at which the computations can be made could revolutionize automation functions and reduce the time humans must spend on training. So far, the relatively slow and cumbersome process of machine learning has impeded progress in the AI field. AlphaZero may have just solved that, resulting in AI becoming more mainstream. And yes, AlphaZ has so far tackled only games. But many present AI programs were based first on programs that could successfully play games and then moved on to tackle more significant problems.
What could this mean for the legal profession? Hard to know, but it will certainly accelerate the use of AI in all sorts of applications. Think about ediscovery. We started with manual review of documents, with huge investments of human time. We then moved to search capabilities, with a computer recognizing and finding certain words and phrases. Then to TAR (technology-assisted review). But even for TAR to work, humans, with all their faults and “imperfections,” had to spend substantial time teaching the machine what to look for and what was relevant. Imagine, though, using AlphaZ’s AI in ediscovery. Instead of spending hours teaching a machine to select relevant documents by giving it countless examples, the machine could train itself based on some rules it was provided. Instead of TAR, we may have TR. Even less human time.
And imagine the impact on legal research where instead of learning how the relevant law might apply to situations, over months if not years, the computer could teach itself the requisite research capabilities in any area of law in a few hours. This would (or will) accelerate the time to market for these AI programs.
Where could this lead? Perhaps, shudder (maybe) to think, could we give the computer some rules and let it preliminarily decide motions based on precedent it finds, much like a magistrate does now? Or could it even serve as a decision maker in low-level disputes, acting in a cheap and effective manner and helping provide access to justice to those who don’t now have it? Could it serve as an aid in mediation? Could it outline the benefits and impacts of a transaction? Could it help craft an argument that would be more appealing? Reducing or eliminating the constraints of time to market, human involvement and computational time opens many, many doors.
But AlphaZero also raises questions. Because AlphaZ taught itself, the code that runs it was essentially developed by itself and may not be easily human readable: in essence, AlphaZ, not humans, developed the code. This introduces a whole host of legal questions. How can we determine why it decided the way it did without knowing and understanding the code? Because AlphaZero teaches itself, who’s to blame if it decides something the wrong way? Can a computer be liable for a mistake? Do concepts of respondeat superior apply to computers? Will AlphaZ someday have to testify in court and explain its decision about something?
And there is the potential societal disruption from the loss of jobs, autonomy and privacy that computers like AlphaZ could threaten.
Suffice it to say, though, we will be hearing more about AlphaZ and reinforcement learning as AI and its capabilities disrupt law and law practice. Get ready.
Photo Attribution: Photo 1: Many Wonderful Artists via flickr
Photo 2: Kemming Wang via flickr
Photo 3: SEO via flickr