techniche

Machine learning-based patent signals for technology decisions

Techniche is a recommendation engine-based decision support tool to help business users surface technology ideas from patent documents for machine learning inventions.

Business understanding

Technology decision-makers in engineering, people, and product roles require data to make choices in markets shaped by machine learning technologies. Techniche recommends technology ideas based on the pipeline of underlying machine learning technologies in patents.

Usage

Dependencies are specified in requirements.txt:

pip install -r requirements.txt

To run the notebooks locally, you will need to have Python installed, preferably through Anaconda.

You can then clone the techniche repository. From a command line, run:

git clone https://github.com/glmack/techniche.git

Move into the techniche directory:

cd techniche

Set up the software environment from the provided conda environment file:

conda env create -f environment.yml
conda activate techniche_env

Launch Jupyter in your web browser:

jupyter notebook

Contents

Walk-through notebooks are available in the model selection directory.

Data understanding

Techniche learns from public patent documents of the United States Patent and Trademark Office (USPTO). These are made available through the PatentsView API, dump files of the PatentsView backend database, and supplementary files containing full patent documents not available through the API. Users can explore the data used in Techniche via the PatentsView graphical user interface.

Data preparation

Natural language pre-processing techniques, such as word tokenization and punctuation cleaning, are applied to the raw text of patent titles and summary descriptions before it is passed to the models.
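As a rough illustration of this pre-processing step, the sketch below lowercases a patent title, replaces punctuation with spaces, and splits the result into word tokens. It uses only the Python standard library. The `preprocess` helper and the sample title are hypothetical, and the notebooks may rely on a different tokenizer.

```python
import string

# Hypothetical helper for illustration; not taken from the techniche notebooks.
# Maps every ASCII punctuation character to a space.
_PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text):
    """Lowercase, replace punctuation with spaces, and split into word tokens."""
    cleaned = text.lower().translate(_PUNCT_TABLE)
    return cleaned.split()


# Made-up patent title used only as sample input.
tokens = preprocess("Neural-network training: a method, system, and apparatus.")
print(tokens)
```

Replacing punctuation with spaces, rather than deleting it, keeps hyphenated terms such as "neural-network" from collapsing into a single token.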

Explore notebooks detailing data preparation and modeling in topic_model.ipynb and rec_system.ipynb of the model selection directory.

Modeling

Techniche predicts technology recommendations using a hybrid recommender system. At the current stage of development, a collaborative filtering component uses matrix factorization based on the Spark implementation of alternating least squares (ALS). A content-based component, currently under development, addresses the cold start problem of making predictions for new users and items. It will use text-based document (item) similarity metrics and will also elicit user preferences through the web app. Latent Dirichlet Allocation (LDA), an unsupervised topic model, is explored to generate the probable range of topics expressed in patent documents for machine learning-based inventions.
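To give a feel for how alternating least squares factorizes a ratings matrix, here is a toy sketch in pure Python. It is not the Spark implementation the project uses. It assumes a single latent factor (rank 1), so each least-squares update has a closed-form scalar solution. The function names and the (user, item, rating) triples are made up for illustration.

```python
import random


def objective(ratings, u, v, reg):
    """Regularized squared error over the observed ratings only."""
    err = sum((r - u[i] * v[j]) ** 2 for i, j, r in ratings)
    return err + reg * (sum(x * x for x in u) + sum(x * x for x in v))


def als_rank1(ratings, n_users, n_items, reg=0.1, n_iters=20, seed=0):
    """Toy rank-1 ALS: approximate each rating r_ij by u[i] * v[j]."""
    rng = random.Random(seed)
    u = [rng.uniform(0.5, 1.5) for _ in range(n_users)]
    v = [rng.uniform(0.5, 1.5) for _ in range(n_items)]
    history = [objective(ratings, u, v, reg)]
    for _ in range(n_iters):
        # Hold item factors fixed; solve the ridge problem for each user.
        for i in range(n_users):
            obs = [(j, r) for a, j, r in ratings if a == i]
            u[i] = sum(r * v[j] for j, r in obs) / (reg + sum(v[j] ** 2 for j, r in obs))
        # Hold user factors fixed; solve the ridge problem for each item.
        for j in range(n_items):
            obs = [(i, r) for i, b, r in ratings if b == j]
            v[j] = sum(r * u[i] for i, r in obs) / (reg + sum(u[i] ** 2 for i, r in obs))
        history.append(objective(ratings, u, v, reg))
    return u, v, history


# Made-up (user, patent, rating) triples, for illustration only.
ratings = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 2.0),
           (1, 2, 2.0), (2, 1, 6.0), (2, 2, 3.0)]
u, v, history = als_rank1(ratings, n_users=3, n_items=3)
print("objective per sweep:", [round(h, 3) for h in history[:5]])
print("predicted score for user 0, item 2:", round(u[0] * v[2], 3))
```

Each half-sweep minimizes the regularized objective exactly for one set of factors, so the recorded objective should not increase from one sweep to the next. A production system would use more latent factors and a distributed solver such as Spark's ALS.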

Evaluation

Recommendations are evaluated in terms of relevance to technology decision-makers. Intermediate intrinsic evaluation metrics, such as coherence and perplexity for LDA, provide additional diagnostics.
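For readers unfamiliar with perplexity, the sketch below computes it from per-token log-likelihoods using the standard definition: the exponential of the negative mean log-likelihood. The `perplexity` helper and the input values are hypothetical. In practice the log-likelihoods would come from a fitted topic model evaluated on held-out documents.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp of the negative mean per-token log-likelihood."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Illustrative input: a model that assigns probability 0.25 to every token.
uniform = [math.log(0.25)] * 8
print(perplexity(uniform))
```

A model that spreads probability uniformly over four choices has a perplexity of about 4, so lower values indicate that the model is less "surprised" by held-out text.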

Deployment

Techniche will be made available for user experimentation as a Flask web app with a search interface. As an intermediate demo step, users can input text strings describing technical areas and receive predicted topics and their associated word co-occurrences.
