How to Teach Yourself About Algorithms
Have you ever thrown around the word algorithm without knowing what it means? When people complain about the Facebook algorithm, the Netflix algorithm, or the Google search algorithm, they don’t always understand what it is, even if they understand the effects. But as algorithms gain increasing importance in our lives, it’s critical for everyone to have a very basic understanding of what they are.
At their heart, algorithms are instructions for how to solve a problem. They are analogous to blueprints for a building—a programmer would be the contractor who actually builds it. Just like you can understand and appreciate architecture without formal training, so it goes with algorithms. If you want to develop basic algorithmic literacy, you can do so in a few basic steps: learn some common algorithmic components, recognize common algorithmic challenges, and try creating some algorithms yourself.
The basic elements of algorithms
Every algorithm has inputs and outputs. The output is what you usually see—it’s the result of an algorithm running. That could be Google Maps finding your route home, TurboTax calculating your refund, or Netflix recommending a show. Algorithms also have inputs. The input may be simple, like a number, or it may be much larger than it seems. For example, an algorithm that finds a route from point A to point B takes those points you supply as inputs, but it also uses a whole database of roads, other drivers’ routes, etc. When you think about inputs, consider everything that might be used to solve the problem.
Once they receive inputs, algorithms perform a series of steps to generate outputs. Most algorithms build on other simple processes. In fact, every complex thing you do on a computer could be done by a little machine that reads ones and zeros off a strip of paper, looks something up in a table, writes a new digit, and moves along the strip. (This is known as a Turing machine, a foundation of computer science.) Common processes include sorting lists, searching for items in lists, and doing some mathematical manipulations—all of which support many bigger algorithms.
Common algorithmic challenges
Computers are always searching for the right thing to do and sorting lists of options to help find it. Since these processes are so common, we want to be able to do them quickly.
Let’s take a look at a couple of search algorithms. Say I give you a deck of cards. It’s sorted first by card value (twos, then threes, then fours, and so on) and then by suit. So we’d have the two of clubs, two of hearts, two of diamonds, two of spades, three of clubs, three of hearts, etc.
We want an algorithm to find a specific card in that deck. What process would you use? Algorithms need to be very explicit, so you can’t just say “flip through the deck until you see it.”
One algorithm might start at the front and check each card to see if it’s our card. If it is, we’re done! If not, we go on to the next card. In the worst case, if I ask for the ace of spades, you have to go through all 52 cards to find it. In the best case, I say the two of clubs and you find it immediately. Averaged over all the cards I might ask for, you have to go through about half the deck, or 26 cards, before finding it.
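That front-to-back scan can be written as a few lines of Python. This is a minimal sketch, assuming we represent the sorted deck simply as the numbers 0 through 51 (the exact card encoding doesn't matter for the idea):

```python
def linear_search(deck, target):
    """Check each card from the front; return its position, or -1 if absent."""
    for position, card in enumerate(deck):
        if card == target:
            return position
    return -1

# A sorted 52-card deck, represented just by the numbers 0-51.
deck = list(range(52))
print(linear_search(deck, 51))  # worst case: scans all 52 cards, prints 51
print(linear_search(deck, 0))   # best case: found immediately, prints 0
```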
A better algorithm is called binary search. Since our deck is sorted, let’s start by splitting it in half and looking at the center card (this is the 26th card). Check if that’s the card we’re looking for. If it is, great! If not, we determine whether the card we want would be earlier or later in the deck. Let’s say it’s earlier. Then we throw out the whole half of the deck that’s later. We take the front half of the deck, split that in half (to the 13th card), and see if that’s our card. Repeat that process until we find the card we want. Each time, we split the deck in half. That means it will take us at most six comparisons to find our card—much faster than our first method.
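Here is the halving strategy as a sketch in Python, again assuming the deck is just the sorted numbers 0 through 51. The counter confirms the "at most six comparisons" claim for a 52-card deck:

```python
def binary_search(deck, target):
    """Repeatedly halve the sorted deck; return (position, comparisons made)."""
    lo, hi = 0, len(deck) - 1
    comparisons = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if deck[mid] == target:
            return mid, comparisons
        elif deck[mid] < target:
            lo = mid + 1   # our card is later; throw out the front half
        else:
            hi = mid - 1   # our card is earlier; throw out the back half
    return -1, comparisons

deck = list(range(52))
position, steps = binary_search(deck, 41)
# steps is never more than 6 for a 52-card deck
```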
Simple searching and sorting algorithms like this are basic building blocks that go together to solve complex problems. For example, one way to search the Web (assuming you have access to every Web page) is to search for every page that has your term. Then do a little math to calculate a score for how well they match the search term. Then sort those pages by the score and return them in best-to-worst order.
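The search-score-sort pipeline can be sketched in a few lines. This is a toy illustration, not how any real search engine works: it scores a page simply by how many times the term appears, then sorts best-first:

```python
def toy_search(pages, term):
    """Score every page by how often the term appears, then sort best-first."""
    scored = [(page.lower().count(term.lower()), page) for page in pages]
    scored = [(score, page) for score, page in scored if score > 0]
    scored.sort(reverse=True)  # highest score first
    return [page for score, page in scored]

pages = ["cats and dogs", "all about cats: cats are great", "dog training"]
print(toy_search(pages, "cats"))
# prints ['all about cats: cats are great', 'cats and dogs']
```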
Recognizing these basic building blocks and how they fit together is a great start toward algorithmic literacy. But there are also some really hard problems you should know about because they show up everywhere.
These are called NP-complete problems. We’ll skip the formal definition here, but basically, they are problems where you have to check a lot of combinations to find the best answer. For instance, there’s one NP-complete problem called the Knapsack Problem. I give you a knapsack that can hold a given amount of weight (say 20 pounds). Then I give you a pile of objects. Each one has a weight and a value. You want to fit the most valuable combination of items into your knapsack, but you can’t exceed its weight limit. Which collection of items do you pick?
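To see why the Knapsack Problem gets hard, here is a brute-force sketch in Python that tries every possible combination of items (the item weights and values below are made up for illustration). With n items there are 2^n combinations, which is exactly why this approach stops being practical as the pile grows:

```python
from itertools import combinations

def best_knapsack(items, capacity):
    """Try every combination of (weight, value) items; keep the most
    valuable one that fits under the weight limit."""
    best_value, best_combo = 0, ()
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            weight = sum(w for w, v in combo)
            value = sum(v for w, v in combo)
            if weight <= capacity and value > best_value:
                best_value, best_combo = value, combo
    return best_value, best_combo

# Hypothetical (weight, value) pairs; the knapsack holds 20 pounds.
items = [(12, 40), (7, 30), (11, 35), (8, 24)]
print(best_knapsack(items, 20))  # prints (70, ((12, 40), (7, 30)))
```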
Another NP-complete problem is the Traveling Salesman Problem. In this one, you have a list of cities, a list of flights between cities, and the cost of each flight. A traveling salesman wants to visit every city on the cheapest route, and he won’t go through any city more than once.
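A brute-force sketch for the Traveling Salesman Problem looks much the same: try every ordering of the cities and keep the cheapest. The city names and prices here are invented for illustration, and with n cities there are n! orderings to check, which is what makes the problem explode:

```python
from itertools import permutations

def cheapest_tour(cities, cost):
    """Try every ordering of cities; cost maps (city_a, city_b) pairs
    to a flight price. Returns the cheapest price and its route."""
    best_price, best_route = float("inf"), None
    for route in permutations(cities):
        price = sum(cost[(route[i], route[i + 1])]
                    for i in range(len(route) - 1))
        if price < best_price:
            best_price, best_route = price, route
    return best_price, best_route

cities = ["A", "B", "C"]
cost = {("A", "B"): 5, ("B", "A"): 5, ("A", "C"): 8,
        ("C", "A"): 8, ("B", "C"): 2, ("C", "B"): 2}
print(cheapest_tour(cities, cost))  # prints (7, ('A', 'B', 'C'))
```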
NP-complete problems are hard (we think). How hard? If I gave you 100 cities in the Traveling Salesman problem, it could take longer than the known age of the universe to solve it. A fast algorithm could exist, but we computer scientists think there probably isn’t one (for more on this, learn about the P vs. NP problem). More interesting is that if we have a fast algorithm for any one of these problems, that algorithm would solve all of them. If you come up with a fast way to do the knapsack problem, you could directly apply that algorithm to the Traveling Salesman problem and vice versa. (Want a proof? Check out these slides). Recognizing something that looks like an NP-complete problem is an important step on your algorithmic literacy journey.
Once you know about these algorithmic basics, you can look for them in your everyday life. As an example, think about a music site recommending songs. What are the inputs? The site knows what songs I’ve listened to just once or repeatedly, and which I liked (or disliked) in the past. They know the genre, artist, and other information about those songs. They could search their list of songs for ones that are similar, calculate a similarity score for each song, sort the list, and show me the best ones.
They also know what songs other people listened to and liked. If that is part of the input, the algorithm could find the most popular songs among people with tastes similar to mine. That involves searching through all its records to find information about who listened to what, scoring my similarity with those people, then scoring the songs, and sorting the list of songs by score. The algorithm could even combine the information from other people’s preferences with the artist and genre information to come up with a score. Those inputs influence the algorithm’s output. Sometimes people talk about algorithmic bias—that’s something that can begin with biased inputs. In this case, those might include poor genre labels or a set of users who have unusual listening habits.
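A toy version of that score-and-sort recommendation idea might look like this. This is purely illustrative (real services use far richer inputs and models): it scores each song by how many of its genre labels overlap with genres the listener already likes:

```python
def recommend(liked_genres, catalog):
    """Score each song by genre overlap with the listener's tastes,
    then sort best-first -- a toy search, score, and sort pipeline."""
    scored = []
    for title, genres in catalog.items():
        score = len(set(genres) & set(liked_genres))
        if score > 0:
            scored.append((score, title))
    scored.sort(reverse=True)
    return [title for score, title in scored]

catalog = {
    "Song A": ["rock", "indie"],
    "Song B": ["jazz"],
    "Song C": ["rock", "jazz"],
}
print(recommend(["rock", "jazz"], catalog))  # "Song C" scores highest
```

Note how a bad input, such as a mislabeled genre, would directly skew the scores and thus the output: that is one simple route by which bias enters an algorithm.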
The key here is not to figure out exactly how the algorithm works, but to recognize what it might be using as inputs, what operations it might be doing with those inputs, and how it could generate outputs.
Making your own algorithm
Finally, you should try creating your own algorithms. It’s a good way to develop literacy just like reading is a good way to become a better writer.
Algorithm designers generally follow the process of looking at the big problem and breaking it down into smaller steps. Then we break those smaller steps down further until we start to see patterns or basic steps we need to perform. In essence, it's solving puzzles.
If you listen to the Car Talk Puzzler, a lot of those puzzles are classic examples we computer scientists think about when we are learning to build algorithms. Let’s consider one of those puzzlers:
We’re going to play a little card game. I’m going to deal out 21 cards from an ordinary deck of cards to a pile in front of us. We’re going to alternate taking cards from the pile.
Here are the rules:
When it’s your turn, you can take as many as three cards, but you must take at least one. So, you can take one, two, or three cards from the pile. The winner of the game is whoever picks up the last card, or cards, from the table.
For example, if there are six cards left and I pick up three, you will pick up the last three, because we’re alternating turns, and you would win. So, clearly if there were six cards left, I wouldn’t take three.
The question is, is there a strategy you could use that would guarantee you would win?
This puzzler is asking you to build an algorithm to play the game. Start by looking at the basic cases and then look for patterns. Here, we know that I will win if there are one, two, or three cards on the table since I could take all of them. I will lose if there are four cards on the table because, no matter what I do, my opponent will be able to take whatever is left.
So this tells us that it’s a winning proposition when there are one, two, or three cards on the table and a losing one if there are four. We can use this to build a pattern. If I can put my opponent into a losing position (e.g. make them see four cards), then I know I will win. I can do that if there are five cards (by taking one), six cards (by taking two), or seven cards (by taking three). If there are eight cards, there’s nothing I can take to make my opponent see four cards, so I will lose (unless my opponent makes a mistake).
Now we see a pattern. One, two, three are winning setups. Four is a loser. Five, six, seven are winners. Eight is a loser. If you follow this out, you see any multiples of four will be losers since anything you do will put your opponent into a winning configuration. Anything that’s not a multiple of four is a winner because you can make a move to get your opponent down to a losing position.
The input here is how many cards are on the table. Your algorithm will calculate how many cards to take to get the total to a multiple of four, and then make that move. The opponent goes, and then your algorithm repeats the process.
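The whole winning strategy fits in a few lines of Python. This sketch follows the multiple-of-four rule worked out above; when the count is already a multiple of four there is no winning move, so it just takes one card and hopes the opponent slips up:

```python
def cards_to_take(cards_on_table):
    """Take enough cards to leave the opponent a multiple of four.
    If the count is already a multiple of four, no move wins, so
    take one card and hope the opponent makes a mistake."""
    remainder = cards_on_table % 4
    return remainder if remainder != 0 else 1

print(cards_to_take(21))  # prints 1: take one card, leaving 20
print(cards_to_take(6))   # prints 2: take two cards, leaving 4
```

Since the puzzler deals 21 cards, and 21 is not a multiple of four, the player who moves first and follows this algorithm always wins.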
This puzzler is actually a version of a generic game called Nim, which computer science students often study. In Nim, you can change the rules regarding what you can take on each turn (for instance, say you can take only two or seven items on each turn) and how many piles there are. The patterns change, but there is always an algorithm for a winning strategy.
Most algorithms you run into in everyday life are extremely complex, but that doesn’t mean you can’t develop a basic literacy to help understand them. Analyze the inputs and try to understand all the data an algorithm might have to work with. Learn the building blocks of algorithms, like searching and sorting, which get combined with some math to make up a lot of the algorithms you see. Recognize common algorithmic problems that show up around you all the time. And, finally, practice making some. Though you won’t become a professional, you will learn to appreciate and critique these processes that influence so much of our daily lives.
This article is part of the algorithms installment of Futurography, a series in which Future Tense introduces readers to the technologies that will define tomorrow. Each month from January through June 2016, we’ll choose a new technology and break it down. Read more from Futurography about algorithms.
Future Tense is a collaboration among Arizona State University, New America, and Slate. To get the latest from Futurography in your inbox, sign up for the weekly Future Tense newsletter.