zuzannna / vango

🖼 ✨ VanGo: autocurating your art experience✨ 🎩 Art recommendation engine for The British Museum online collection created using NLP & PCA.

Home Page: http://vango.hopto.org

License: MIT License

Languages: Jupyter Notebook 48.88%, Python 2.43%, HTML 11.28%, CSS 37.28%, JavaScript 0.12%
Topics: data-science, python, flask-application, art, museum, pca-analysis, nlp-machine-learning, british-museum-collection

vango's Introduction

VanGo

While completing my PhD at NYU, I often spent my weekends at the numerous galleries and art museums in NYC, and I quickly realized that there is simply too much to see at any given time.

There should be a way to curate your trip depending on your personal preferences.

That is how the idea of VanGo was born. During the Insight Data Science Fellowship, a program that helps PhDs transition from academia to careers as data scientists, I got to spend three weeks developing VanGo from scratch. I started by building the database, then designed and implemented the algorithm in Python, and finally developed the front end using Flask and AWS. Let me walk you through the steps.

[Image: VanGo_main]

As a data source, I used curatorial descriptions of paintings and drawings, which I queried in SPARQL (a language for semantic databases, used by cultural heritage institutions) from the British Museum's collection. Their database is great and publicly available at http://collection.britishmuseum.org/sparql. After some web scraping and text processing (such as tokenizing and lemmatizing), I ran principal component analysis (PCA) to reduce the dimensionality of my dataset and describe every drawing and painting as a vector of numbers corresponding to the PCA components. When a user enters a word, it is projected onto the PCA components and the web interface brings up the art pieces with the highest cosine similarity to the input word. So, the analysis funnel looks something like this:
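The projection and ranking step can be sketched in a few lines of Python. This is a minimal illustration and not the code behind VanGo: the four toy descriptions, the plain word-count matrix, the one-hot encoding of the query word, and every function name below are assumptions made for the example. The real project may have weighted terms differently or used a library PCA.

```python
import numpy as np

# Hypothetical stand-ins for lemmatized curatorial descriptions.
DESCRIPTIONS = {
    "Landscape scroll": "mountain river landscape ink",
    "River view": "river boat landscape water",
    "Portrait of a lady": "portrait woman silk robe",
    "Court portrait": "portrait man court robe",
}


def build_term_matrix(descriptions):
    """Return (titles, vocab, counts); counts is n_artworks x n_terms."""
    titles = list(descriptions)
    vocab = sorted({w for text in descriptions.values() for w in text.split()})
    column = {w: j for j, w in enumerate(vocab)}
    counts = np.zeros((len(titles), len(vocab)))
    for i, title in enumerate(titles):
        for w in descriptions[title].split():
            counts[i, column[w]] += 1
    return titles, vocab, counts


def fit_pca(counts, n_components):
    """PCA via SVD of the mean-centred matrix. Returns (mean, components)."""
    mean = counts.mean(axis=0)
    _, _, vt = np.linalg.svd(counts - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(vectors, mean, components):
    """Map term-space vectors onto the principal components."""
    return (vectors - mean) @ components.T


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def recommend(word, titles, vocab, counts, mean, components, top_n=2):
    """Rank artworks by cosine similarity to a single query word."""
    if word not in vocab:
        return []  # words absent from the descriptions cannot be projected
    one_hot = np.zeros(len(vocab))
    one_hot[vocab.index(word)] = 1.0  # assumed query encoding
    query = project(one_hot, mean, components)
    artworks = project(counts, mean, components)
    scored = [(t, cosine(query, vec)) for t, vec in zip(titles, artworks)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


titles, vocab, counts = build_term_matrix(DESCRIPTIONS)
mean, components = fit_pca(counts, n_components=2)
for title, score in recommend("river", titles, vocab, counts, mean, components):
    print(f"{title}: {score:.2f}")
```

The query word goes through the same centring and projection as the artworks, so both live in the same component space before the cosine is taken. On this toy data, "river" ranks the two landscape descriptions above the two portraits.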

[Image: VanGo_analysis_funnel]

Importantly, it will only return matches for words that appear in the curatorial descriptions, so if you try anything like "cool" or "stuff" it won't work (finding a way to bridge these words with curatorial descriptions would be a whole other project :). In other words, the search results reflect similarities between artworks that were discovered by PCA, which don't necessarily reflect "the ground truth" but do reveal the structure of the data set.
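One way to make this vocabulary limit visible to the user is to check the input before projecting it. The sketch below is an assumption about how such a check could look, not part of the original app: `VOCABULARY` is a hypothetical sample, `check_query` is an illustrative name, and the spelling suggestions from the standard library's `difflib` only catch typos. They do not bridge everyday words like "cool" to curatorial language.

```python
from difflib import get_close_matches

# Hypothetical vocabulary; in practice it would be built from the
# processed curatorial descriptions.
VOCABULARY = {"landscape", "mountain", "portrait", "river", "robe", "silk"}


def check_query(word, vocabulary=VOCABULARY):
    """Return (is_known, suggestions) for a single search word."""
    normalized = word.strip().lower()
    if normalized in vocabulary:
        return True, []
    # Spelling-based suggestions only; this does not capture meaning.
    suggestions = get_close_matches(
        normalized, sorted(vocabulary), n=3, cutoff=0.8
    )
    return False, suggestions


print(check_query(" River "))
print(check_query("rivr"))
print(check_query("cool"))
```

A known word passes straight through, a near miss such as "rivr" gets a spelling suggestion, and an out-of-vocabulary word such as "cool" returns no suggestions, so the interface can tell the user to try a different term.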

For example, the majority of works in this data set originate from Asia, so the themes discovered by my algorithm reflect those favored by artists from that region. How cool is that? Anyway, try it yourself and happy exploring!

For a short presentation of the project, see my slides at http://www.slideshare.net/ZuzannaKyszejko/zuzannaklyszejkovango. If you want to try the app, go to http://vango.hopto.org/input
