cmcl's Introduction

README

1 Overview

This README also serves as a design document. Any heading marked TODO may as well be wishful thinking.

cmcl is a high-level library for aggregating chemical datasets and computing a variety of features.

It includes a simple user interface for aggregating data from computational experiments and committing the results to a local database. It also includes tools for sharing this database with collaborators.

cmcl is built around pandas. Data therefore stays in dataframes, and is handled efficiently, as it passes through the queries, transformations, and mapping operations that the succinct and powerful pandas API provides.

2 TODO Data

  • [ ] VASP parser
  • [ ] sqlite rclone interface
  • [ ] cmcl database

3 NEXT Featurization

  • State “NEXT” from “TODO” [2022-02-13 Sun 16:02]
cmcl’s main offering is easy featurization of chemical datasets.

Tabular records of chemical formulas and associated observations need only be loaded as a dataframe; cmcl’s formula parser can then convert the formula strings into equivalent numerical descriptors.

  • [X] descriptors->relational queries via pandas
  • [ ] descriptors->Composition objects->pymatgen feature descriptors
  • [ ] structure object generation and feature calculations
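The conversion from formula strings to numerical descriptors could look roughly like the sketch below. It is not cmcl's parser: `parse_formula`, the regular expression, and the sample records are all assumptions made for illustration, and the sketch only handles flat formulas with no parentheses, charges, or organic abbreviations.

```python
import re

import pandas as pd

# Hypothetical pattern: an element symbol followed by an optional amount.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")


def parse_formula(formula):
    """Map each element symbol in a flat formula string to its summed amount."""
    counts = {}
    for symbol, amount in _TOKEN.findall(formula):
        # A missing amount means one atom, as in the "Pb" of "CsPbI3".
        counts[symbol] = counts.get(symbol, 0.0) + (float(amount) if amount else 1.0)
    return counts


# Placeholder records; the observation values are made up for the example.
records = pd.DataFrame(
    {"formula": ["CsPbI3", "Cs0.5Rb0.5PbBr3"], "observation": [1.0, 2.0]}
)

# One row per formula, one column per element, zero where an element is absent.
compositions = pd.DataFrame(
    [parse_formula(f) for f in records["formula"]],
    index=pd.Index(records["formula"], name="formula"),
).fillna(0.0)
```

Because the result is an ordinary dataframe, it can be filtered with the usual pandas relational idioms, for example `compositions[compositions["Br"] > 0]`.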

These compositions, themselves a dataframe, can then be further processed into other feature sets using a variety of libraries:

  • mendeleev cite:&mendeleev2014
  • Matminer
  • DScribe cite:&himanen-2020-dscrib
  • MEGNet cite:&chen-2019-graph-networ

3.1 Perovskite Property Computation

  • [ ] Compute SLME (spectroscopic limited maximum efficiency) for photovoltaic absorbers, using the SL3ME implementation of cite:&yu-2012-ident-poten by @Idwillia on GitHub
  • [ ]

4 Data Handling Workflow

4.1 TODO Data Aggregation

cmcl exposes a command-line tool for aggregating computational data from VASP and Quantum ESPRESSO experiment directory trees.
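Since this section is still marked TODO, the following is only a sketch of how such a tool might discover calculations in a directory tree. The names `find_vasp_runs` and `main` are hypothetical, and treating any directory that holds `vasprun.xml` or `OUTCAR` as one VASP run is an assumption of the sketch, not cmcl's rule.

```python
import argparse
from pathlib import Path

# Output files written by VASP; using them as run markers is an assumption.
VASP_MARKERS = ("vasprun.xml", "OUTCAR")


def find_vasp_runs(root):
    """Return, sorted, every directory under root holding a VASP marker file."""
    root = Path(root)
    hits = set()
    for marker in VASP_MARKERS:
        # rglob walks the whole tree below root looking for the file name.
        hits.update(path.parent for path in root.rglob(marker))
    return sorted(hits)


def main(argv=None):
    """Print each discovered run directory for the roots given on the command line."""
    parser = argparse.ArgumentParser(prog="aggregate-sketch")
    parser.add_argument("roots", nargs="+", help="experiment directories to scan")
    args = parser.parse_args(argv)
    for root in args.roots:
        for run_dir in find_vasp_runs(root):
            print(run_dir)
```

Calling `main(["/to/experiment/dir"])` lists the run directories found below that path; a real aggregator would parse each one and collect the results into a dataframe.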

5 TODO Formula generation

cmcl includes tools for randomly generating formulas from a set of rules. It is usually better to plan an experiment systematically, though.
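As a minimal sketch of both approaches, assume a single illustrative rule: an ABX3 formula takes one A-site species, one B-site species, and three identical X-site species. The `SITES` table and both function names are hypothetical, not cmcl's interface.

```python
import itertools
import random

# Illustrative candidate species per site; an assumption for this sketch.
SITES = {
    "A": ["Cs", "Rb", "K"],
    "B": ["Pb", "Sn", "Ge"],
    "X": ["I", "Br", "Cl"],
}


def random_formula(rng):
    """Draw one ABX3 formula at random from the candidate species."""
    a = rng.choice(SITES["A"])
    b = rng.choice(SITES["B"])
    x = rng.choice(SITES["X"])
    return f"{a}{b}{x}3"


def all_formulas():
    """Systematic alternative: enumerate every formula the rule allows."""
    return [
        f"{a}{b}{x}3"
        for a, b, x in itertools.product(SITES["A"], SITES["B"], SITES["X"])
    ]
```

Passing a seeded generator, such as `random_formula(random.Random(0))`, keeps random draws reproducible, while `all_formulas()` gives the full 27-member design space to plan against.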

6 Property Prediction

cmcl includes some pretrained models that can be used to infer the properties of chemistries.

6.1 TODO Methods for applying pretrained models

7 Installation

cmcl is very early in development.

7.1 Install by cloning the repository

cmcl can be installed into a standard Python environment. It is a Poetry project and may be installed using pip.

Then start your Python process or Jupyter kernel of choice and enjoy.

8 Contribution

Yes Please.

To create a clean development environment, simply fork or clone the repository; the poetry.lock file will take care of dependency management.

9 TODO Usage Examples

9.1 aggregating data

$ cd /to/experiment/dir
$ cmcl aggregate *

9.2 compute features

10 TODO Data Aggregation

10.1 TODO pymatgen assimilation library

For collecting VASP results.

10.2 TODO NOMAD?

Use NOMAD for metadata generation and more?

11 TODO Data Sharing

11.1 TODO Local DB

cmcl will create a local database upon a call to a dataframe’s cmclwrite method.

This database can then be freely populated with dataframes.
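The create-on-first-write behaviour described above can be approximated with pandas and the standard-library `sqlite3` module. This is a sketch, not the cmclwrite implementation: the function `write_table` and its parameters are hypothetical, and SQLite as the storage format is assumed from the Data checklist.

```python
import sqlite3
from contextlib import closing

import pandas as pd


def write_table(frame, table, db_path):
    """Append frame to table in a SQLite file, creating both if missing."""
    # sqlite3.connect creates the database file on first use.
    with closing(sqlite3.connect(db_path)) as con:
        # if_exists="append" adds rows and never drops an existing table.
        frame.to_sql(table, con, if_exists="append", index=False)
        con.commit()
```

Repeated calls with the same table name keep adding rows, so the database can be populated one dataframe at a time.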

11.2 TODO “Collaboration Remote”

cmcl also provides a “push” method that allows users to choose a remote host and share local tables with it. cmcl is of the philosophy that ALL data is good data, so “pull” is implicit: the database only ever grows, and nothing is ever overwritten.

$ rclone sync purduebox:/Mannodi_group_research_material/Perovskite\ Dataset/perovskites.db
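The rclone command above moves the database file itself; the grow-only rule could be enforced at the table level along the following lines. This is a sketch under stated assumptions: `push_rows` is a hypothetical name, both databases are SQLite files reachable as local paths, and the table already exists in both with an identical schema and a PRIMARY KEY.

```python
import sqlite3
from contextlib import closing


def push_rows(local_db, remote_db, table):
    """Copy rows from local_db into remote_db; return how many were added."""
    # The table name is interpolated into SQL, so it must come from trusted code.
    count = f'SELECT COUNT(*) FROM main."{table}"'
    with closing(sqlite3.connect(remote_db)) as con:
        con.execute("ATTACH DATABASE ? AS src", (local_db,))
        before = con.execute(count).fetchone()[0]
        # OR IGNORE skips any row whose key already exists, so nothing
        # already stored in the remote copy is overwritten.
        con.execute(
            f'INSERT OR IGNORE INTO main."{table}" SELECT * FROM src."{table}"'
        )
        con.commit()
        return con.execute(count).fetchone()[0] - before
```

Running the same function with the two paths swapped plays the role of the implicit “pull”, since either side only ever gains rows.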

11.3 TODO “Publish Remote”

cmcl implements OPTIMADE to provide an easy, universal query option and, where possible, a publish option for sharing your data with global platforms.

12 External Datasets

Compare model predictions to experimental results for validation:

  1. cite:&almora-2020-devic-perfor meta-analysis of Perovskite PV devices.
  2. more literature compounds.
  3. Materials Zone aggregate database.

13 Citations

bibliographystyle:authordate1 bibliography:~/org/bibliotex/bibliotex.bib

cmcl's People

Contributors

pmiam, mannodiarun
