Raman Data Generator

License: MIT

This project aims to offer a fast ⚡ and reliable data augmentation generator for Raman spectra.

Download

You can download the Python class by running the following command in your target directory:

wget https://raw.githubusercontent.com/fmolivato/RamanDataGenerator/master/raman_data_generator.py

Usage

from raman_data_generator import RamanDataGenerator

dataset = RamanDataGenerator(...)

for batch in dataset:
    print(batch.shape)    # do something with the batch

Arguments

Basic

| Param | Type | Description |
| --- | --- | --- |
| df | pandas.DataFrame | A pandas DataFrame with the Raman shift values as columns, plus a column called "labels" for the categories |
| batch_size | int | Number of samples per batch |
| max_classes | int | Number of categories in the labels |
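
A minimal sketch of the input layout described above, assuming pandas and NumPy are installed. The shift axis, the random intensities, and the integer class ids are made up for illustration; only the "labels" column name comes from the table, and whether the class expects integer or string labels is an assumption here.

```python
import numpy as np
import pandas as pd

# Hypothetical Raman shift axis: 400 to 1790 in steps of 10 (illustrative values)
shifts = np.arange(400, 1800, 10)

# Six fake spectra with random intensities, three per made-up class
rng = np.random.default_rng(0)
intensities = rng.random((6, shifts.size))

# One column per shift value, plus the "labels" column holding the category
df = pd.DataFrame(intensities, columns=shifts)
df["labels"] = [0, 0, 0, 1, 1, 1]  # assumption: integer class ids

print(df.shape)  # (6, 141)
```

With this toy frame, max_classes would presumably be 2, since the "labels" column holds two distinct categories.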

Advanced

The default parameters were validated on a Raman task; however, if you need more customization, you can still tweak them!

The augmentation process works as follows. For each $sample_i$ of the current batch, it randomly picks another sample of the same class, $sample_j$, and performs:

  1. a roll of $sample_j$ by some roll_factor (in Raman shift values). A roll is a horizontal shift; I use the term "roll" because a horizontal shift is easily confused with the Raman shift.

  2. a weighted sum governed by some probability variable $a$

$$ sample_k = a·sample_i + (1-a)·sample_j $$

This augmentation step is based on the assumption that two samples of the same class are semantically equal, differing only by natural class variability plus some sensor noise.

  3. a slope applied to $sample_k$ with some slope factor: a linear baseline error that emulates the fluorescence issue of some sensors.
  4. additive white Gaussian noise applied to $sample_k$.
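
The four steps can be sketched in NumPy as follows. This is an illustrative approximation, not the library's actual code: the function name `augment`, the default values, and the way the roll amount, slope, and noise level are sampled are all assumptions.

```python
import numpy as np

def augment(sample_i, sample_j, rng, roll_factor=5,
            slope_factor=0.1, noise_range=(0.0, 0.05)):
    """Hypothetical sketch of the augmentation steps for one pair of spectra."""
    n = sample_i.size

    # 1. Roll sample_j by a random number of columns in [-roll_factor, roll_factor]
    #    (assumption: the actual sampling rule may differ)
    k = int(rng.integers(-roll_factor, roll_factor + 1))
    rolled_j = np.roll(sample_j, k)

    # 2. Weighted sum: sample_k = a * sample_i + (1 - a) * sample_j
    a = rng.random()
    sample_k = a * sample_i + (1.0 - a) * rolled_j

    # 3. Linear baseline error (assumption: slope drawn in +/- slope_factor)
    slope = rng.uniform(-slope_factor, slope_factor)
    sample_k = sample_k + slope * np.linspace(0.0, 1.0, n)

    # 4. Additive white Gaussian noise (assumption: std drawn from noise_range)
    sigma = rng.uniform(*noise_range)
    sample_k = sample_k + rng.normal(0.0, sigma, size=n)
    return sample_k

# Two toy spectra of the same class: one Gaussian peak, slightly displaced
rng = np.random.default_rng(42)
x = np.linspace(0.0, 1.0, 200)
sample_i = np.exp(-((x - 0.50) ** 2) / 0.001)
sample_j = np.exp(-((x - 0.52) ** 2) / 0.001)

augmented = augment(sample_i, sample_j, rng)
print(augmented.shape)  # (200,)
```

Setting roll_factor, slope_factor, and noise_range to zero in this sketch reduces it to the plain weighted sum, which makes each step easy to inspect in isolation.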

| Param | Type | Description |
| --- | --- | --- |
| roll | bool | Enable/disable the roll step during the augmentation |
| roll_factor | int | Number of columns by which the signal is rolled (shifted horizontally) along the DataFrame columns. If the signal has a resolution of 10 Raman shift units, meaning that the columns increase by 10 shift units at a time, a roll factor of 5 actually shifts the signal by 10*5 = 50 shift units |
| slope | bool | Enable/disable the slope step during the augmentation |
| slope_factor | float | Slope angle of the linear baseline error |
| noise | bool | Enable/disable the noise step during the augmentation |
| noise_range | tuple | Range from which the noise factor is sampled, e.g. (min, max) |
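
The roll_factor arithmetic can be checked with a small NumPy example. The shift axis and the single-peak signal below are made up for illustration, and `np.roll` is used as a stand-in for whatever the class does internally.

```python
import numpy as np

# Hypothetical shift axis with a step of 10 between consecutive columns
shifts = np.arange(400, 1000, 10)

# Toy spectrum with a single peak at column 20 (shift value 600)
signal = np.zeros(shifts.size)
signal[20] = 1.0

roll_factor = 5
rolled = np.roll(signal, roll_factor)

# The peak moves 5 columns, i.e. 5 * 10 = 50 shift units
print(shifts[signal.argmax()], shifts[rolled.argmax()])  # 600 650
```

Note that `np.roll` wraps around: values pushed past the last column reappear at the first one, so large roll factors can move edge features to the opposite end of the spectrum.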

Requirements

The python libraries needed are:

random
dataclasses
pandas
numpy
tensorflow

The code itself is documented, if you need more detailed information 😉!

Contributors are welcome 👍

Contributors

fmolivato, molivato93
