Automated Literature Analysis

This repository shows an example of how to perform an automated analysis of academic literature using Jupyter notebooks and online citation databases such as Scopus, DBLP, and Semantic Scholar. This analysis detects the number of publications over time, popular authors, popular venues, popular affiliations, and popular "topics" that appear within the documents' abstracts (detected using natural language processing).
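To make the kind of aggregation described above concrete, here is a minimal sketch in plain Python. It is not the notebook's actual code: the record layout (`year`, `authors`, `venue`) and the helper `summarize` are assumptions for illustration, and the sample records are made up.

```python
from collections import Counter

# Hypothetical, made-up records; the notebook retrieves real ones
# from citation databases.
records = [
    {"year": 2018, "authors": ["A. Author", "B. Writer"], "venue": "Venue X"},
    {"year": 2019, "authors": ["A. Author"], "venue": "Venue Y"},
    {"year": 2019, "authors": ["C. Scholar", "A. Author"], "venue": "Venue X"},
]


def summarize(records, top_n=50):
    """Count publications per year and rank authors and venues by frequency."""
    per_year = Counter(r["year"] for r in records)
    authors = Counter(a for r in records for a in r["authors"])
    venues = Counter(r["venue"] for r in records)
    return {
        "per_year": dict(sorted(per_year.items())),
        "top_authors": authors.most_common(top_n),
        "top_venues": venues.most_common(top_n),
    }


summary = summarize(records)
print(summary["per_year"])
print(summary["top_authors"][0])
```

The same counting pattern would extend to affiliations; topic detection needs a separate natural language processing step and is not covered by this sketch.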

Requirements

The required Python packages can be found in requirements.txt. Creating a virtual Python environment is recommended (for example, virtualenv or conda). The notebook has been tested using Python 3.6.

Scopus is a citation database of peer-reviewed literature from scientific journals, books, and conference proceedings. To use the Scopus API, you (or your institute) need a Scopus subscription, and you must request an Elsevier Developer API key (see Elsevier Developers and Scopus Python API for more information).
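As a rough illustration of where the API key ends up, the sketch below builds (but does not send) a Scopus search request using only the standard library. The endpoint URL, the `X-ELS-APIKey` header, and the `count` parameter reflect our understanding of the Elsevier API and should be checked against the Elsevier Developers documentation. The function `build_scopus_request` and the environment variable `SCOPUS_API_KEY` are hypothetical names, and the notebook itself may access Scopus differently.

```python
import os
from urllib.parse import urlencode

# Assumed endpoint of the Scopus Search API; verify it against the
# Elsevier Developers documentation before relying on it.
SCOPUS_SEARCH_URL = "https://api.elsevier.com/content/search/scopus"


def build_scopus_request(query, api_key, count=25):
    """Return the URL and headers for a Scopus search request (nothing is sent)."""
    params = urlencode({"query": query, "count": count})
    headers = {"X-ELS-APIKey": api_key, "Accept": "application/json"}
    return f"{SCOPUS_SEARCH_URL}?{params}", headers


# SCOPUS_API_KEY is a hypothetical environment variable name.
api_key = os.environ.get("SCOPUS_API_KEY", "your-api-key")
url, headers = build_scopus_request(
    'title-abs-key("predictive maintenance")', api_key
)
print(url)
```

Reading the key from the environment rather than hard-coding it keeps it out of the notebook and out of version control.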

Running using virtualenv

Installation using virtualenv can be done with the following commands:

Create a virtualenv environment named myenv:

virtualenv myenv --python=`which python3`

Activate the virtual environment:

source ./myenv/bin/activate

Install the required dependencies:

pip3 install -r requirements.txt

Install a new Jupyter kernel:

ipython kernel install --user --name=myenv

Run Jupyter and select myenv as the kernel. The remaining instructions can be found within the notebook itself.

jupyter notebook literature_analysis.ipynb --MappingKernelManager.default_kernel_name=myenv

Examples

Below are examples of the notebook's output for the query title-abs-key("predictive maintenance").

Publications per year

Top 50 authors

Top 50 publication venues

Detected topics visualized as word clouds.

Publications embedded into 2D space based on text similarity. Each publication is labeled with its dominant topic.

Contributors

isazi, stijnh, henkdr, giuliostramondo
