Choosing Fast with Dynamic Programming
Finally, the moment is here, you are moving to San Francisco to become a well-known data scientist! You will work for a big company and be famous soon! But, first things first. You will live in a small apartment, way smaller than your current house in the valley. So you have to decide which stuff to take with you. Because you are a data scientist, you will decide this in a smart and efficient way. Let’s find the optimal solution!
Problem Description
At the moment your house is 100 square meters. The area that your things occupy is 50 square meters. Your new apartment is 40 square meters and you want your stuff to occupy 20 square meters at most, because you like to have some free space on the floor. You list your items in a way that you assign to them the area they occupy and the value they add to your life. Your goal is to maximize the value on the 20 square meters.
Items
You start listing your items. You specify the area (in square decimeters) and also the value items add to your life (on a scale from 0 to 100). You end up with the following dictionary with 29 items:
All right! E.g. the bed takes 4 square meters (= 400 square decimeters) and the value it adds to your life is 100, the game table takes 1.5 square meters and adds a value of 30.
Maximizing Value
How can you maximize the value with this items dictionary? You could try all the possible combinations and calculate the total value for each of the combinations. By selecting the maximum value for a maximum area of 20 squared meters you would get the best solution. One problem, the 29 items give 2²⁹ possible combinations of selecting items! That’s over 536 million! That would take some time!
You think about other options. What if you start with the item that adds the most value to your life, then the next item with the second highest value, and so on, until the 20 square meters are full? That’s easy and fast! But will it give the optimal solution? You doubt it. How can you solve the problem fast and get the best solution?
_Note: The Algorithms mentioned above are called Brute Force and Greedy._
Dynamic Programming
Fortunately, a friend tells you a smarter way to handle the problem. He explains that you can divide the problem in smaller and simpler sub-problems in a recursive manner. If you store the results in a memoization table, you can find the optimal solution fast, because you can reuse those results. In this way you trade memory space for time. So you start with this method, called Dynamic Programming. And indeed, one hour later you solved the problem and you know which items you will bring to your new place!
Dynamic Programming Steps
The four steps you need to complete to get your optimal inventory list for the SF apartment are:
Step 1. Create a dictionary with items and area/value they contain.
This is the stuffdict with items mentioned earlier (see first codeblock).
Step 2. Create area and value lists from the dictionary.
You use the stuffdict to create two lists with areas and values.
Step 3. Use these lists to build the memoization table.
With the area list, the value list, the maximum area and the number of items it’s possible to build the memoization table.
Step 4. Get the items included in the last row of the memoization table.The last row of the memoization table contains the optimal solution. It’s possible that multiple optimal solutions exist. E.g. when you have two items with the same area and value, or when some items add up to the same area and value of another item.
Let’s dive deeper into these steps and check out the code.
Step 1. Create a dictionary with items and area/value they contain.This is the dictionary with items, areas and values:
Step 2. Create area and value lists from the dictionary.
You use the stuffdict to create two lists with areas and values. To do this you use the function below. The input is the stuffdict from step 1, and it returns two lists: one with the areas and one with the values.
Step 3. Use these lists to build the memoization table.Now you will create the memoization table to find the optimal solution. You need the area and value list from step two, the total area A and the total number of items n. The table has n+1 rows and A+1 columns. You start on the first row with zero items and you will calculate the value for every cell. You will do this row by row in a bottom-up manner. The value table V looks like this:
The creation of the table in code:
On the second and third line, the table is created. After the creation, you’re going to fill every cell of the table. Line six and seven show the base case: when the area or the number of items is zero, the value is zero. Line eight and nine mean, when the area of the current item is smaller or equal to the area of the cell, calculate the value of the cell according to:
You choose the maximum between the value of the current item plus the value of the previous row and the current area minus the area of the current item (1), and the value of the previous row with the area of the current cell (2). Line ten and eleven mean: if the area of the cell is smaller than the area of the current item you set the value of the cell equal to the value of the previous row with the same area.
Step 4. Get the items included in the last row of the memoization table.
You use the following code to find your perfect furniture combination:
This code returns:
[‘bike_stand’, ‘garbage_can’, ‘standing_lamp’, ‘chest_of_drawers’, ‘plant_3’, ‘plant_2’, ‘diner_table_with_chairs’, ‘bookshelf’, ‘armchair’, ‘table’, ‘desk’, ‘bed’, ‘couch_s’]
Now you know what you need to take with you! Let’s move to San Fransisco!
Note: watch out, this code also returns duplicates (items with the same area and value pair)! In this example that’s not the case.
Conclusion
Hopefully this article can help you to find optimal solutions fast and efficient. You can replace the area value pair with other parameters you want to optimize, like weight and value or time and improvement. An important take away is that by storing solutions of smaller sub-problems you can find the optimal solution faster, and that this technique is called dynamic programming.
Bonus: Heatmap of the Memoization Table
The following heatmap is a nice visualization of the memoization table. On the x-axis you see the area you have available (2000 square decimeters). The y-axis shows the items list. You start with an empty list, and then you start adding items. The colors represent the total value for an area with the items available. You can see how the value becomes higher and higher when you proceed.
Code for building this heatmap, using the memoization table and the stuffdict:
Don’t forget to subscribe if you’d like to get an email whenever I publish a new article.
The post Choosing Fast with Dynamic Programming appeared first on Towards Data Science.