Automated Problem Solving with Genetic Algorithms
I recently made the claim that software development is in the midst of major changes and disruption. The disruption is coming because of Neural Networks powered via Genetic Algorithms. Merged together, these two techniques allow us to create insanely autonomous and accurate systems without coding imperative instructions, or even fully knowing the answer to the problem we want to solve. Experimenting with these techniques has completely changed the way I think about coding.
I had to put these algorithms to the test in some of my own projects so that I could see what the big deal was. After several experiments of my own, I am convinced that we live in extremely exciting yet disruptive times.
If you are an Elixir/Erlang developer, you have an edge over other technology ecosystems because our ecosystem is designed from the ground up to support this new style of development. Other ecosystems have to force their tech to fit this new AI paradigm, while we get much of the support built in via OTP. The next few posts will show examples of AI systems built in Elixir with the ability to learn in many different ways. But first, we will break down genetic algorithms, because they are at the heart of how a system solves problems on its own.
Gene (in biology): a unit of heredity that is transferred from a parent to offspring and is held to determine some characteristic of the offspring.
In order to wrap my head around what the hell was going on in the Artificial Life field, I had to start to understand the principles behind the origins of human information known as the Gene.
Without our own personal genetic sequence, we don't have a unique identity. Without genes, there is no life. So if you want to create life-like systems it is important to understand the gene!
We talked in our previous posts on ATF about Neural Networks. However, Neural Networks are only responsible for allowing a system to learn. Our ultimate goal is to have automated systems that can learn and solve problems just like humans do, only better. This behavior of learning and problem solving can be achieved with Neural Networks and Genetic Algorithms working together toward a specific goal. This goal can be thought of as the TARGET. In this post, GOAL and TARGET will be used synonymously.
EVOLVING TOWARDS A GOAL/TARGET
At conception, a human begins to develop from a dormant egg all the way to a fully formed human being with two hands, two feet, eyes, mouth, and all the other characteristics that make us human. But how did this happen? Surely there was no overlord sneaking into the womb of a human mother to program the instructions on how to create the human step by step. Instead, this information was passed on via the two parents' genetic information, which in turn was passed to the parents from previous ancestors. This information was refined over a very long history of evolution and trial and error. This refinement of the human genome eventually led to these genes figuring out what worked and what didn't. Multiple trials of doing these mutations over and over eventually led to what seems to us like... perfection.
The point to understand is that genetic evolutionary ideas can be copied over into the software development process to produce systems that start with random sequences, then select and iterate over the chosen sequences until they find the best solution! To replicate these ideas in our software, we can give a system a goal and then allow the system to use a large amount of data to create experiments with. These experiments are then ranked or scored. When the software finds promising experiments in the data, it can mutate those promising results into a new experimental set. The idea is to continue this process over and over until a desirable solution is reached, using the best outcome across all the experiments.
A GENETIC ALGORITHM'S 5 NEEDS
A genetic algorithm needs 5 things to be successful. Having these 5 essential pieces embedded in a deep learning algorithm allows for creating intelligent autonomous systems. Here are the common things I've found in these algorithms:
- TIME/EPOCHS - In programming terms, this is a loop or iteration.
- RANDOMIZATION - Random generation is extremely important; otherwise there is a huge risk of getting stuck and never finding a solution because the same sequence is being considered over and over. (BAD FOR BUSINESS)
- DATA - Information is everything. Without it, an algorithm doesn't have anything to work with.
- FITNESS - A key indicator to let the algorithm know how well it is performing.
- GOAL/TARGET - The overall point the system is working toward. This can be set by either the developer or the system on its own. Either way, a goal is needed; otherwise, there is no point to the algorithm's existence.
In order to help drive the concept home, I've developed a small system called SPELLER that uses genetic evolutionary techniques to spell any word it is asked. Here is an example of SPELLER doing its thing below...
SPELLER is equipped with all the letters of the alphabet plus some punctuation marks such as periods, exclamation points, and commas. This is the base data it needs to get its work done. You can also see from the image that its goal is identified, which is an important item needed for a successful genetic algorithm. An important thing to note is that for each sequence it makes, it determines a new fitness score for that sequence. The closer it gets to the goal, the higher the fitness score. Here are some more examples
How was this possible? SPELLER figured out on its own how to spell the target word by using its own feedback, called its fitness. A fitness function, in evolutionary terms, is essentially what evolutionists call the selection process. In natural selection, the best and the strongest win the day and are allowed to proceed with their existence. In evolution, selecting the best organisms makes it more likely that the selected organism reaches its goal. In life, our goal is survival of the human species. In computation, our goal is whatever the human desires! The theory of evolution says that when you apply evolution to human biology, over time you get us. The idea is to apply the same genetic evolutionary concepts to our software problems.
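To make the idea of a fitness score concrete, here is a minimal sketch in Python. SPELLER's actual code is not shown in this post, so the function name `fitness` and the scoring rule (one point for every position where the candidate matches the target) are assumptions chosen for illustration.

```python
def fitness(candidate: str, target: str) -> int:
    """Hypothetical scoring rule: count the positions where the
    candidate holds the same character as the target."""
    return sum(1 for c, t in zip(candidate, target) if c == t)


print(fitness("hxllo", "hello"))  # 4 of 5 positions match
print(fitness("hello", "hello"))  # 5, the target is reached
```

With a rule like this, the maximum possible score equals the length of the target, which gives the algorithm a clear signal for when to stop.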
Applying Evolutionary concepts to SPELLER's algorithm looked like this...
- Fetch the desired goal from the user and set that as the target.
- Fetch all the data necessary. (This data can be considered the Gene Pool. Some even call it the population. It is everything the algorithm would need to use to solve the problem. In this case it's the letters of the alphabet.)
- Generate a random sequence of genes that is the same length as the desired target. (This is where the need for random generation comes in. The more random the better.)
- Score the generated gene sequence in comparison to the target. If the fitness is higher than SPELLER's current fitness, it should keep that gene set while continuing to march toward its ultimate goal. If, however, it finds that the new gene set is not good enough, it discards it and generates a new gene set. Then it starts the whole process all over again until the target is reached.
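The steps above can be sketched in a few lines of Python. This is not SPELLER's real implementation: the names `GENE_POOL`, `random_sequence`, `mutate`, and `spell` are hypothetical, and the gene pool is assumed to be ASCII letters plus a space and a few punctuation marks. One deliberate difference from the last step: instead of regenerating a whole new random sequence each round, this sketch changes a single random gene of the best sequence found so far (the "mutate promising results" idea from earlier), which reaches the target far sooner for longer words.

```python
import random
import string

# Assumed gene pool: letters plus a space and some punctuation.
GENE_POOL = string.ascii_letters + " .,!"


def fitness(candidate, target):
    # One point per position that already matches the target.
    return sum(1 for c, t in zip(candidate, target) if c == t)


def random_sequence(length, rng):
    # A random gene sequence of the requested length.
    return "".join(rng.choice(GENE_POOL) for _ in range(length))


def mutate(parent, rng):
    # Replace one randomly chosen gene with a random gene from the pool.
    index = rng.randrange(len(parent))
    return parent[:index] + rng.choice(GENE_POOL) + parent[index + 1:]


def spell(target, seed=None, max_epochs=100_000):
    rng = random.Random(seed)
    best = random_sequence(len(target), rng)
    best_fitness = fitness(best, target)
    epochs = 0
    # Loop (TIME/EPOCHS) until the target is spelled or we give up.
    while best_fitness < len(target) and epochs < max_epochs:
        epochs += 1
        child = mutate(best, rng)
        child_fitness = fitness(child, target)
        if child_fitness > best_fitness:
            # Keep the fitter gene set; otherwise discard the child.
            best, best_fitness = child, child_fitness
    return best, epochs


word, epochs = spell("Hello, World!")
print(word, "found after", epochs, "epochs")
```

The target may only contain characters that exist in the gene pool; otherwise the loop can never reach full fitness and stops at `max_epochs`.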
The fitness function is the heart of any genetic algorithm. It is the algorithm's only feedback on how well it's performing. This is a big deal because applying these principles allows for truly autonomous problem solving. Now imagine coupling this technique with neural networks. Sure, SPELLER is trivial, but the same concepts can be expanded to solve a much harder class of problems. In fact, genetic programming will probably be the thing that replaces the need for manual programming as a whole really soon.
WHY ELIXIR/ERLANG IS IMPORTANT
Evolution is a process. It takes a large amount of time to come up with a fit organism. Some even argue that evolution is never really complete. Within our computer systems, we want them to get to a solution within our lifetime. Depending on the type of AI system we might need it to respond with results in milliseconds! Time is of the essence.
Genetic Algorithms and Neural Networks REQUIRE sufficient processing power and parallelization. Concurrency is necessary because it's ideal to have a system explore multiple possible solutions at the same time. This allows it to find the best choice within a given search space. The search space can be 1 in 10 or 1 in 1 million; however large or small the problem is, the system still needs to process all those possibilities quickly and accurately. Concurrent and parallelized thinking is what makes AI systems as powerful as they are.
Elixir/Erlang makes us think in terms of concurrency. This is a good thing because Machine Learning and genetic algorithms only make sense in a concurrent, distributed world. In the next few posts, we will explore merging these genetic algorithm concepts with neural networks to create self-learning systems that are capable of problem solving on their own. We will also explore several ways I've experimented with making them perform much faster using the Erlang ecosystem.
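As a rough illustration of exploring several candidates at once, the sketch below scores a whole generation through a worker pool. It is written in Python with `concurrent.futures.ThreadPoolExecutor` purely to show the shape of the idea. The names `best_of_generation` and `GENE_POOL` are hypothetical, and in CPython threads will not actually speed up CPU-bound scoring; on the Erlang VM each candidate could instead be handed to its own lightweight process.

```python
import random
import string
from concurrent.futures import ThreadPoolExecutor

# Assumed gene pool for this sketch: lowercase letters only.
GENE_POOL = string.ascii_lowercase


def fitness(candidate, target):
    # One point per position that matches the target.
    return sum(1 for c, t in zip(candidate, target) if c == t)


def best_of_generation(target, population_size=50, seed=None):
    rng = random.Random(seed)
    # Build a generation of random candidates, each as long as the target.
    population = [
        "".join(rng.choice(GENE_POOL) for _ in target)
        for _ in range(population_size)
    ]
    # Score every candidate through the pool; map() preserves input order.
    with ThreadPoolExecutor(max_workers=8) as pool:
        scores = list(pool.map(lambda c: fitness(c, target), population))
    # Pick the candidate with the highest score.
    best_score, best_candidate = max(zip(scores, population))
    return best_candidate, best_score


candidate, score = best_of_generation("elixir")
print(candidate, score)
```

Because each candidate is scored independently of the others, the scoring step has no shared state to coordinate, which is what makes this kind of work easy to spread across many workers.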