You are what you eat - Relating Demographic Data to Food Consumption Habits

Abstract

The original paper presents the Tesco Grocery 1.0 data set and validates the data by correlating the typical food product with the prevalence of different metabolic diseases. We are interested in the influence of demographics on food composition. More specifically, we want to predict the contents of the typical food product of each ward from demographic markers such as gender, age, race, and wealth. The UK government provides ward profiles with these markers, and we can merge the grocery data with the demographic data on the ward identifiers. To study the interplay between demographics and food consumption habits, we first explore the data and conduct a correlation analysis. Afterward, we build a model that, given demographic markers, predicts the contents of the typical food product consumed in a ward. Lastly, we examine the validity of our model. This analysis should help us better understand the consumption habits of different population groups.

Research questions

  1. What is the relation between each individual demographic marker and food consumption habits?
  2. How well can we predict food consumption habits from demographic markers?
  3. How does data representativeness affect our model performance?

Proposed datasets

  • Tesco Grocery 1.0 (from the paper): provides the nutrients of the typical food product at different spatial granularities.
  • Ward Atlas: provides several demographic features at the ward level. Specifically, we will use gender, wealth, age, and race.

Methods

Data collection: enrich the Tesco dataset with demographic data from the Ward Atlas dataset.
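The merge step could look roughly like the sketch below. It uses pandas on two toy tables; the column names (`area_id`, `ward_code`, `median_income`, `median_age`) and all values are illustrative assumptions, not the real schemas of either dataset. An outer merge with an indicator column is used first so that wards present in only one source show up in the sanity check before they are dropped.

```python
import pandas as pd

# Toy stand-ins for the two sources. Column names and values are
# illustrative assumptions, not the real schemas.
grocery = pd.DataFrame({
    "area_id": ["W01", "W02", "W03"],
    "protein": [5.2, 4.9, 5.6],
    "carb": [17.1, 18.4, 16.3],
})
wards = pd.DataFrame({
    "ward_code": ["W02", "W03", "W04"],
    "median_income": [38000, 52000, 41000],
    "median_age": [34, 41, 37],
})

# Outer merge with an indicator column: every ward from either source is kept,
# and "_merge" records whether it was found in both.
# validate="one_to_one" raises if a ward identifier is duplicated on either side.
check = grocery.merge(
    wards,
    left_on="area_id",
    right_on="ward_code",
    how="outer",
    indicator=True,
    validate="one_to_one",
)
unmatched = check[check["_merge"] != "both"]
print(f"{len(unmatched)} wards appear in only one source")

# Keep only wards that have both grocery and demographic data.
merged = (
    check[check["_merge"] == "both"]
    .drop(columns=["ward_code", "_merge"])
    .reset_index(drop=True)
)
print(merged)
```

Checking the unmatched rows before dropping them makes it easier to notice identifier mismatches, for example if the two sources use different ward boundary vintages.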

Data analysis: once the demographic and food datasets are merged, we will analyze the correlations between the groups of properties we are interested in, e.g. the dependence of protein consumption on median income. After exploring the correlations, we will build our model, which should predict the distribution of meal constituents from demographic data for every area.
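A correlation analysis of this kind can be sketched as follows. The table is synthetic and every column name is a hypothetical placeholder; protein is deliberately constructed to rise with income so that the example contains a visible signal. The sketch uses the Pearson coefficient, which is only one possible choice.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_wards = 300

# Synthetic stand-in for the merged table (all columns and values are made up).
income = rng.normal(40000, 8000, n_wards)
age = rng.normal(36, 4, n_wards)
merged = pd.DataFrame({
    "median_income": income,
    "median_age": age,
    # Protein is built to increase with income; carbohydrates are pure noise.
    "protein": 5.0 + 0.3 * (income - 40000) / 8000 + rng.normal(0, 0.3, n_wards),
    "carb": 17.0 + rng.normal(0, 1.0, n_wards),
})

demographic_cols = ["median_income", "median_age"]
nutrient_cols = ["protein", "carb"]

# Pearson correlation of every demographic marker with every nutrient:
# rows are demographic markers, columns are nutrients.
cross_corr = merged.corr().loc[demographic_cols, nutrient_cols]
print(cross_corr.round(2))
```

Restricting the full correlation matrix to the demographic rows and nutrient columns keeps the output focused on the relations the first research question asks about.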

Building the model: we will build a neural network to predict the typical food product’s ingredients. We are going to explore different model configurations, e.g. loss functions and activation functions.
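One way such a configuration sweep could look is sketched below. This is a minimal NumPy illustration on synthetic data, not the network used in the project: the architecture (one hidden layer), the candidate activations (ReLU, tanh), the candidate losses (mean squared and mean absolute error), and all hyperparameters are assumptions chosen to keep the example short.

```python
import numpy as np

# Activation functions paired with their derivatives (both take the pre-activation z).
ACTIVATIONS = {
    "relu": (lambda z: np.maximum(z, 0.0), lambda z: (z > 0).astype(float)),
    "tanh": (np.tanh, lambda z: 1.0 - np.tanh(z) ** 2),
}

# Loss functions paired with their gradients with respect to the prediction.
LOSSES = {
    "mse": (lambda p, y: np.mean((p - y) ** 2), lambda p, y: 2.0 * (p - y) / p.size),
    "mae": (lambda p, y: np.mean(np.abs(p - y)), lambda p, y: np.sign(p - y) / p.size),
}


def train_mlp(X, Y, activation, loss, hidden=16, lr=0.05, epochs=300, seed=0):
    """Train a one-hidden-layer network with full-batch gradient descent.

    Returns the training loss recorded at the start of every epoch.
    """
    rng = np.random.default_rng(seed)
    act, act_grad = ACTIVATIONS[activation]
    loss_fn, loss_grad = LOSSES[loss]
    W1 = rng.normal(0.0, np.sqrt(2.0 / X.shape[1]), size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, np.sqrt(1.0 / hidden), size=(hidden, Y.shape[1]))
    b2 = np.zeros(Y.shape[1])
    history = []
    for _ in range(epochs):
        # Forward pass.
        z1 = X @ W1 + b1
        h = act(z1)
        pred = h @ W2 + b2
        history.append(float(loss_fn(pred, Y)))
        # Backward pass (chain rule through the output and hidden layers).
        d_pred = loss_grad(pred, Y)
        d_W2 = h.T @ d_pred
        d_b2 = d_pred.sum(axis=0)
        d_z1 = (d_pred @ W2.T) * act_grad(z1)
        d_W1 = X.T @ d_z1
        d_b1 = d_z1.sum(axis=0)
        # Gradient descent update.
        W1 -= lr * d_W1
        b1 -= lr * d_b1
        W2 -= lr * d_W2
        b2 -= lr * d_b2
    return history


# Synthetic data: 4 demographic-like features, 3 nutrient-like targets.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
Y = np.tanh(X @ rng.normal(size=(4, 3))) + 0.05 * rng.normal(size=(200, 3))

# Train one network per (activation, loss) pair and compare the loss curves.
results = {
    (a, l): train_mlp(X, Y, activation=a, loss=l)
    for a in ACTIVATIONS
    for l in LOSSES
}
for (a, l), history in results.items():
    print(f"{a:>4} + {l}: loss {history[0]:.3f} -> {history[-1]:.3f}")
```

Losses of different types are not directly comparable with each other, so in practice each configuration would be scored on a common held-out metric rather than on its own training loss.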

Validation: we will study how the model’s loss depends on the representativeness of the training data.
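The validation idea can be illustrated with the following sketch. To stay short it swaps in an ordinary least-squares model for the neural network, and it runs on synthetic data in which less representative wards are given noisier targets; the representativeness score, the thresholds, and the train/test split are all assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_wards, n_features = 600, 4

X = rng.normal(size=(n_wards, n_features))
# Hypothetical per-ward representativeness score in (0, 1].
representativeness = rng.uniform(0.05, 1.0, size=n_wards)
true_weights = np.array([0.5, -0.3, 0.2, 0.1])
# Synthetic assumption: the less representative a ward, the noisier its target.
noise = rng.normal(size=n_wards) * 0.2 / np.sqrt(representativeness)
y = X @ true_weights + noise


def add_intercept(features):
    """Prepend a column of ones so the linear model can learn an offset."""
    return np.column_stack([np.ones(len(features)), features])


# Fixed split: the first 400 wards are candidates for training, the rest are held out.
train = np.arange(n_wards) < 400
test = ~train

results = {}
for threshold in [0.0, 0.25, 0.5, 0.75]:
    # Train only on wards whose representativeness reaches the threshold.
    keep = train & (representativeness >= threshold)
    coef, *_ = np.linalg.lstsq(add_intercept(X[keep]), y[keep], rcond=None)
    test_mse = float(np.mean((add_intercept(X[test]) @ coef - y[test]) ** 2))
    results[threshold] = (int(keep.sum()), test_mse)
    print(f"threshold {threshold:.2f}: {keep.sum():3d} training wards, test MSE {test_mse:.4f}")
```

Raising the threshold trades training set size against data quality, which is the trade-off the third research question asks about.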

Proposed timeline

  • Week 1: Download and merge the data sets and run a sanity check. Search for correlations and visualize properties.
  • Week 2: Build the model that predicts consumption habits from demographic markers.
  • Week 3: Study the performance of the resulting models and prepare the data story.

Organization within the team

  • Alex: data pre-processing (downloaded, cleaned, and transformed the data), correlation analysis, set up the data story, wrote the pre-processing part of the data story
  • Egor: found reliable nutrients to predict, built linear models, built gradient boosting models, compared the performance of the models, merged the notebook for the final submission, wrote the data story part about finding the most important features
  • Denis: built the feature selection procedure, built the neural net, built linear models, compared performance, did the representativeness analysis

Contributors ✨

  • Alex 💻
  • Egor 💻
  • Denis 💻

GitHub accounts of the contributors: alxglvckij, denispushkin, egorssed
