Critical Temperature of Superconductors

Kaggle Competition available here.

The one-page/ folder contains single-page versions of the whole project, in HTML format (with hiplot tool interaction enabled) and in PDF format (without hiplot tool interaction).

Problem Description

The phenomenon of superconductivity (Wikipedia) was discovered by Heike Kamerlingh-Onnes in 1911.

Superconductivity is a property of certain substances and materials whose electrical resistance drops to zero when they are cooled below a characteristic temperature, called the critical temperature.

Many aspects of superconductivity are still poorly understood; in particular, it is unclear whether the critical temperature can be predicted from the chemical and physical properties of the material.

Objectives

  1. Develop ML models that accurately predict the critical temperature, given the chemical structure and physical properties of a substance
  2. Identify which features are the most relevant to the prediction

1. Dataset Description

The dataset comes from a database of superconducting materials compiled by Japan's National Institute for Materials Science (NIMS).

See 0_Data_Exploration notebook.

2. Models Training

Different models are trained:

  • Linear Regression
  • Random Forest
  • XGBoost
  • KNN
  • SVM

Several preprocessing configurations are applied, alone and in combination:

  • Removing highly correlated features
  • StandardScaler, MinMaxScaler
  • Normalizer (L1, L2, and max norms)
  • PCA
  • Training on only the Properties dataset or only the Formula dataset
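
The combinations listed above can be sketched as a small grid of scikit-learn pipelines. This is only an illustration under stated assumptions, not the code from the 1_Training notebook: the data is synthetic (generated with `make_regression`), the names `preprocessors`, `models`, and `results` are hypothetical, and SVM and XGBoost are left out to keep the sketch short and dependency-free.

```python
from sklearn.base import clone
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, Normalizer, StandardScaler

# Synthetic stand-in for the real features and critical temperatures (assumption).
X, y = make_regression(n_samples=200, n_features=10, n_informative=5,
                       noise=10.0, random_state=0)

# Hypothetical preprocessing configurations; each is a list of transformer steps.
preprocessors = {
    "none": [],
    "standard": [StandardScaler()],
    "minmax": [MinMaxScaler()],
    "l2_norm": [Normalizer(norm="l2")],
    "standard+pca": [StandardScaler(), PCA(n_components=5)],
}

# A subset of the model families listed above.
models = {
    "linear": LinearRegression(),
    "knn": KNeighborsRegressor(n_neighbors=5),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Cross-validated MSE for every (preprocessing, model) pair.
results = {}
for p_name, steps in preprocessors.items():
    for m_name, model in models.items():
        pipeline = make_pipeline(*[clone(s) for s in steps], clone(model))
        scores = cross_val_score(pipeline, X, y, cv=3,
                                 scoring="neg_mean_squared_error")
        results[(p_name, m_name)] = -scores.mean()

best = min(results, key=results.get)
print("lowest cross-validated MSE:", best, round(results[best], 2))
```

Wrapping each configuration in a pipeline keeps the scaler or PCA fitted on the training folds only, so the cross-validated score is not contaminated by the validation data.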

See 1_Training notebook.
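
As an illustration of the "removing highly correlated features" step, the sketch below drops any column whose absolute Pearson correlation with an already kept column exceeds a threshold. The helper name `drop_correlated` and the 0.95 threshold are assumptions made for this example; the notebook may use a different rule or cutoff.

```python
import numpy as np

def drop_correlated(X, threshold=0.95):
    """Return the indices of the columns to keep (hypothetical helper).

    A column is dropped when its absolute Pearson correlation with any
    previously kept column is above `threshold`.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return keep

# Toy data: column 1 is almost a rescaled copy of column 0, column 2 is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
X = np.column_stack([a, 2.0 * a + 0.01 * rng.normal(size=500), b])

kept = drop_correlated(X)
print(kept)  # column 1 is dropped because it duplicates column 0
```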

3. Relationship between Critical Temperature and other features

To investigate the relationship between the critical temperature and the other features, the following indicators are considered:

  • the coefficients of the Linear Regression model
  • the feature importance based on mean decrease in impurity for the Random Forest and XGBoost models
  • the feature importance based on feature permutation for the Random Forest and XGBoost models
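
The three indicators above can be computed with scikit-learn roughly as follows. This is a minimal sketch on synthetic data where only the first two features drive the target; the feature names `f0` to `f3` are hypothetical, and a Random Forest stands in for both tree-based models (XGBoost is omitted to avoid an extra dependency).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (assumption): f0 and f1 matter, f2 and f3 are pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))
y = 5.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.1, size=400)
names = ["f0", "f1", "f2", "f3"]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

linear = LinearRegression().fit(X_train, y_train)
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)

# 1. Linear Regression coefficients.
coefficients = dict(zip(names, linear.coef_))

# 2. Mean decrease in impurity, computed from the training data during fitting.
mdi = dict(zip(names, forest.feature_importances_))

# 3. Permutation importance: drop in score when one column is shuffled,
#    measured here on held-out data.
perm = permutation_importance(forest, X_test, y_test, n_repeats=10, random_state=0)
permutation = dict(zip(names, perm.importances_mean))

top_mdi = max(mdi, key=mdi.get)
top_perm = max(permutation, key=permutation.get)
print(top_mdi, top_perm)
```

Linear coefficients are only comparable across features when the features share a scale. Impurity-based importance is derived from the training data, whereas permutation importance can be measured on a held-out set, which is one reason to give it more weight.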

See 2_Features_Importance notebook.

4. Best Result

Best model: XGBoost
Preprocessing: none
MSE: 78.09
R^2: 0.931
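
For reference, the two metrics reported above can be computed as in the sketch below, which uses made-up values rather than the project's predictions. Note that the MSE is expressed in squared units of the target, so an MSE of 78.09 corresponds to a root mean squared error of about 8.84 in the units of the critical temperature.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up example values (not taken from the dataset).
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]

example_mse = mse(y_true, y_pred)  # (4 + 4 + 9 + 1) / 4 = 4.5
example_r2 = r2(y_true, y_pred)    # 1 - 18 / 500 = 0.964
print(example_mse, example_r2)
```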

Based mainly on the permutation importance of the XGBoost model, the most important features are: Cu, Ca, Ba, O, range_ThermalConductivity, and Valence.

See 3_Results notebook.

Contributors

riccardoevangelisti
