Correlations in the AllRecipes Database
Explanation
This tool visualizes the correlations between ingredients in the AllRecipes database. I was inspired to make it by this Reddit post. It computes a correlation matrix for the selected ingredients and displays it as a heatmap. The color of each cell represents the correlation coefficient between the two ingredients.
How it was made
I first took the dataset from this internet archive that contains about 71,000 recipes scraped from AllRecipes. I then wrote a Python script that parsed the dataset and saved only the relevant information (recipe name, ingredients, category, rating) to a JSON file.
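As a rough illustration of that reduction step, here is a minimal sketch. The field names (`name`, `ingredients`, `category`, `rating`) and the shape of the raw records are assumptions made for the example; the archive's real layout may differ.

```python
import json

# Hypothetical field names; the real dataset's keys may differ.
RELEVANT_FIELDS = ("name", "ingredients", "category", "rating")


def slim_recipe(raw):
    """Keep only the fields the tool needs from one raw recipe record."""
    return {field: raw.get(field) for field in RELEVANT_FIELDS}


def slim_dataset(raw_recipes):
    """Reduce every raw record and serialize the result as a JSON string."""
    return json.dumps([slim_recipe(r) for r in raw_recipes])


# Made-up record, only to show that extra fields are dropped.
raw = [{
    "name": "Pancakes",
    "ingredients": ["1 cup of flour", "2 eggs"],
    "category": "Breakfast",
    "rating": 4.5,
    "author": "someone",
}]
slimmed = json.loads(slim_dataset(raw))
```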
                    
Then I downloaded a text file containing a list of ingredient names. However, that list contained a lot of junk items (like ingredients that included measurements), so I wrote another Python script that removed those items.
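One way such a cleanup could look is sketched below. The junk test used here (any digit, or a word from a small list of unit names) is an assumed heuristic for illustration, not necessarily the rule the original script applied.

```python
import re

# Assumed, non-exhaustive list of measurement words.
UNIT_WORDS = {
    "cup", "cups", "tablespoon", "tablespoons", "teaspoon", "teaspoons",
    "ounce", "ounces", "pound", "pounds",
}


def is_junk(item):
    """Treat an item as junk if it contains a digit or a measurement word."""
    words = re.findall(r"[a-z]+", item.lower())
    has_digit = re.search(r"\d", item) is not None
    return has_digit or any(word in UNIT_WORDS for word in words)


def clean_ingredient_list(items):
    """Normalize each item and drop empty or junk entries."""
    cleaned = []
    for item in items:
        name = item.strip().lower()
        if name and not is_junk(name):
            cleaned.append(name)
    return cleaned
```

For example, `clean_ingredient_list(["Flour", "2 cups sugar", "cup of milk", "brown sugar"])` keeps only `"flour"` and `"brown sugar"`.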
The recipe dataset's ingredients were written like "1 cup of flour", so I needed to isolate just the ingredient's name. I found a python library that could do that, but it wasn't perfect. So after running the ingredients through it, the Python script looks through the cleaned list of ingredients and picks the longest item from that list that is a substring of the ingredient text; that item becomes the ingredient. After running this, I had a list of recipes and their ingredients.
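The longest-substring rule described above can be sketched as follows. The function name `match_ingredient` and the sample list are hypothetical, and the sketch assumes the known names are already lowercase.

```python
def match_ingredient(text, known_ingredients):
    """Return the longest known ingredient that occurs in text, or None."""
    text = text.lower()
    candidates = [name for name in known_ingredients if name in text]
    return max(candidates, key=len) if candidates else None


known = ["sugar", "brown sugar", "flour"]
best = match_ingredient("1 cup of packed brown sugar", known)
# best is "brown sugar": both "sugar" and "brown sugar" match, the longer wins.
```

Preferring the longest match keeps "brown sugar" from collapsing into "sugar". A plain substring test ignores word boundaries, though (for example, "ice" is a substring of "rice"), which is one way this kind of matching can go wrong.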
                    
To calculate the correlation matrix, I wrote a JavaScript function (so it could be hosted on a static website) that takes a list of ingredients and an optional category (if you want to look at only a certain category of recipes), and then calculates the correlation matrix.
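The site's function is written in JavaScript; the sketch below uses Python only to stay consistent with the preprocessing scripts. It assumes the correlation is the Pearson coefficient between 0/1 "recipe contains this ingredient" vectors, which the text above does not confirm, and the record keys `category` and `ingredients` are hypothetical.

```python
from math import sqrt


def presence_vectors(recipes, ingredients, category=None):
    """Build one 0/1 vector per ingredient over the selected recipes."""
    selected = [r for r in recipes
                if category is None or r["category"] == category]
    return {ing: [1 if ing in r["ingredients"] else 0 for r in selected]
            for ing in ingredients}


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n == 0:
        return 0.0
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant vector; 0.0 is a placeholder.
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_matrix(recipes, ingredients, category=None):
    """Square matrix where cell [i][j] correlates ingredients i and j."""
    vectors = presence_vectors(recipes, ingredients, category)
    return [[pearson(vectors[a], vectors[b]) for b in ingredients]
            for a in ingredients]
```

With this definition, two ingredients that always appear together score 1, two that never share a recipe (but each appear somewhere) score below 0, and an ingredient that appears in every selected recipe or in none has no defined correlation.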
Then I wrote a JavaScript function that takes the correlation matrix and generates an image that represents it.
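To illustrate the idea of turning a matrix into heatmap colors, here is a minimal sketch in Python. The blue-white-red scale is an assumption chosen for the example; the color scheme the site actually uses may differ.

```python
def correlation_to_rgb(value):
    """Map a correlation in [-1, 1] to an (r, g, b) tuple.

    Assumed scale: blue at -1, white at 0, red at +1.
    """
    v = max(-1.0, min(1.0, value))  # clamp out-of-range input
    fade = round(255 * (1 - abs(v)))
    if v >= 0:
        return (255, fade, fade)
    return (fade, fade, 255)


def heatmap_pixels(matrix):
    """Convert a correlation matrix into a grid of RGB cells."""
    return [[correlation_to_rgb(v) for v in row] for row in matrix]
```

Each cell of the returned grid can then be drawn as a colored square, one per pair of ingredients.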
Finally, I wrote this webpage to display the image and let the user select the ingredients they want to analyze.
                
Limitations
The dataset is not perfect, and I can see some issues with it. The primary issue is that the data is fairly Western-focused, so the correlations will reflect that.