rec-a-sketch's Introduction

Repo for building Sketchfab recommendations. Collecting data, training algorithms, and serving recommendations on a website will all be here.

This repo will likely not work with Python 2 due to various encoding issues.

For some of the crawling processes, Selenium is used. You must provide a path to your browser driver in config.yml for this to work. See here for links to download the driver binary.

Collecting data

Use crawl.py to crawl the Sketchfab site and collect data. It currently supports 4 processes, selected with the --type argument:

  • urls - Grab the url of every Sketchfab model with number of likes >= LIKE_LIMIT as defined in the config.
  • likes - Given collected model urls, collect users who have liked those models.
  • features - Given collected model urls, collect categories and tags associated with those models.
  • thumbs - Given collected model urls, collect 200x200 pixel thumbnails of each model.

Run it like this:

python crawl.py config.yml --type urls

I ran into lots of issues with timeouts when crawling features. To pick back up at a particular row of the urls file, pass --start row_number as an optional argument.
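As a rough illustration of the command-line interface described above, the following sketch parses a config path, a --type choice, and an optional --start row using argparse. It is not the code from crawl.py: the function names `build_parser` and `rows_to_crawl` are hypothetical, and treating --type as required and row_number as a 0-based index are assumptions.

```python
import argparse

# The four crawl processes listed above.
CRAWL_TYPES = ("urls", "likes", "features", "thumbs")


def build_parser():
    """Build a parser mirroring the documented arguments (illustrative only)."""
    parser = argparse.ArgumentParser(description="Crawl Sketchfab (sketch)")
    parser.add_argument("config", help="path to the YAML config file")
    parser.add_argument("--type", dest="crawl_type", choices=CRAWL_TYPES,
                        required=True,  # assumption: no default crawl type
                        help="which crawl process to run")
    parser.add_argument("--start", type=int, default=0,
                        help="row of the urls file to resume from")
    return parser


def rows_to_crawl(rows, start=0):
    """Skip rows already processed; assumes `start` is a 0-based row index."""
    return rows[start:]


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.config, args.crawl_type, args.start)
```

With a parser like this, resuming a features crawl after a timeout would look like `python crawl.py config.yml --type features --start 120`.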

anonymize.py is used to anonymize the user_id's in the likes data. Granted, one could probably back this out, but it serves as a small privacy barrier.

To run it, you must define a secret key for hashing the user_id's:

python anonymize.py unanonymized_likes.csv anonymized_likes.csv "SECRET KEY"
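The exact hashing scheme used by anonymize.py is not described here. As a minimal sketch of the general idea, assuming a keyed HMAC-SHA256 from the standard library (the function name `anonymize_user_id` is hypothetical), a user_id could be replaced by a digest that depends on the secret key:

```python
import hashlib
import hmac


def anonymize_user_id(user_id, secret_key):
    """Return a hex digest of user_id keyed by secret_key.

    Assumption: HMAC-SHA256 stands in for whatever hash the real script uses.
    """
    digest = hmac.new(secret_key.encode("utf-8"),
                      user_id.encode("utf-8"),
                      hashlib.sha256)
    return digest.hexdigest()


if __name__ == "__main__":
    # The same user_id and key always map to the same anonymized value,
    # so likes by one user stay grouped together after anonymization.
    print(anonymize_user_id("some_user", "SECRET KEY"))
```

Because the mapping is deterministic, anyone who learns the key can recompute the digest for a known user_id, which is why this is only a small barrier.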

The data

Model urls, likes, and features are all in the /data directory. These were collected around October 2016.

All data files are pipe-separated CSV files with headers. They can be read with pandas read_csv() using sep='|' and the keyword arguments quoting=csv.QUOTE_MINIMAL and escapechar='\\'
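A small sketch of loading one of these files with the settings described above. The helper name `load_pipe_csv` and the sample column names are hypothetical; the real files in /data may use different headers.

```python
import csv
import io

import pandas as pd


def load_pipe_csv(path_or_buffer):
    """Read a pipe-separated file with headers into a DataFrame."""
    return pd.read_csv(
        path_or_buffer,
        sep="|",                    # fields are separated by pipes
        quoting=csv.QUOTE_MINIMAL,  # quotes appear only where needed
        escapechar="\\",            # backslash escapes special characters
    )


if __name__ == "__main__":
    # In-memory stand-in for a file in /data (made-up columns and values).
    sample = io.StringIO("user_id|model_id\nu1|m1\nu2|m1\n")
    likes = load_pipe_csv(sample)
    print(likes.head())
```

Passing a file path instead of the in-memory buffer works the same way.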

rec-a-sketch's People

Contributors

ethanrosenthal
