agtc's Introduction

Note: to open links in a new tab, use CTRL+click (Windows and Linux) or CMD+click (macOS).

What is AgTC?

Agronomic Experiments Data Template Creator (AgTC) is a set of functions written in Python that let you create custom templates for collecting data in most typical agronomic experiments. The resulting template can also be used to load a "field" in the Field Book app.

How to run AgTC?

Prerequisites

Install the required libraries using pip, the package installer for Python.

Clone or download AgTC from the GitHub repository

  • Clone option

    1. Open a new Jupyter Notebook terminal (New > Terminal).

    2. Clone the GitHub repository:
        git clone https://github.com/Purdue-LuisVargas/AgTC.git
    
  • Download option

    1. Download AgTC from the GitHub repository: https://github.com/Purdue-LuisVargas/AgTC.
    2. Unzip the entire folder, then copy the extracted files (if running Jupyter locally) or upload them (if using the Jupyter Hub environment) to your Jupyter Notebook directory.

Upload the Initial Template file

Upload the Initial Template file to the ./input folder. It must be a CSV file containing the information that changes from plot to plot and that you want to keep in the New Template, such as plot number, repetition number, and genotype name. The table below is an example of the columns and rows that an Initial Template could have:

plot  genotype      repetition
1     PI594301      3
2     LD-07-3395bf  23
3     SA1730464     18
4     PI154189      14
5     PI594451      2
...   ...           ...
71    CR16-0042     20
72    SA1811280     25
73    E19517GT      24
74    LD-07-3395bf  23
75    PI6548362     8
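
Before uploading, it can help to confirm that the CSV header contains the columns you expect. The sketch below is not part of AgTC; it uses only Python's standard csv module, and the helper name `check_initial_template` and the inline sample data are hypothetical.

```python
import csv
import io

def check_initial_template(csv_text, required_columns):
    """Return the required columns that are missing from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    return [col for col in required_columns if col not in header]

# Hypothetical sample mirroring the first rows of the table above.
sample = "plot,genotype,repetition\n1,PI594301,3\n2,LD-07-3395bf,23\n"

missing = check_initial_template(sample, ["plot", "genotype", "repetition"])
print(missing)  # an empty list means every required column is present
```

To check a real file, pass its contents instead of `sample`, for example the text read from a file in the ./input folder.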

Understanding the Configuration File

The config.yml file is a YAML file that can be edited in Jupyter or in any text editor. The file is divided into six configuration blocks, each identified by a name in uppercase letters. A BLOCK COLLECTION can contain keys, values, and/or sequences.

For instance, the following data structure contains the most common items in the config.yml file.

TEMPLATE_INPUT: 
  Folder : ./input/
  Sample_name: 
    - A
    - B

Where:

TEMPLATE_INPUT = BLOCK COLLECTION

Folder = Key

Sample_name = Key

./input/ = Value

A, B = Sequences

The data structures in the config.yml file can take one of the following arrangements.

  • Case 1
    BLOCK COLLECTION:
       Key: Value
  • Case 2
    BLOCK COLLECTION:
       Key: 
          - Sequences
  • Case 3
    BLOCK COLLECTION:
       - Sequences 
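
For readers more familiar with Python than YAML, the three cases correspond to ordinary dictionaries and lists once the file is loaded. The sketch below assumes a standard YAML loader (for example PyYAML's `yaml.safe_load`); the source does not say which loader AgTC uses. The structures are written by hand so the example has no dependencies, and `describe_block` is a hypothetical helper.

```python
# Hand-written Python equivalents of the three cases (illustrative values).
case_1 = {"TEMPLATE_INPUT": {"Folder": "./input/"}}                # Key: Value
case_2 = {"SAMPLES_PER_PLOT": {"Sample_name": ["A", "B"]}}         # Key: [Sequences]
case_3 = {"COLUMNS_TEMPLATE": ["Plot", "Repetition", "Genotype"]}  # [Sequences]

def describe_block(block):
    """Classify the body of a BLOCK COLLECTION as one of the three cases."""
    if isinstance(block, list):
        return "case 3: sequences"
    holds_sequence = [isinstance(value, list) for value in block.values()]
    if all(holds_sequence):
        return "case 2: keys holding sequences"
    if not any(holds_sequence):
        return "case 1: keys holding values"
    return "mixed: cases 1 and 2"

for config in (case_1, case_2, case_3):
    for name, body in config.items():
        print(name, "->", describe_block(body))
```

A block that mixes plain values and sequences, like the TEMPLATE_INPUT example shown earlier, is reported as mixed.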

Update the Configuration File

You need to update the config.yml file using information about the experiment.

General rules

  1. Changing the names of BLOCK COLLECTION items is not recommended. If you do change them, update the corresponding names in the following line of code in the main.ipynb Jupyter file:
functions.create_new_template('config.yml', 'TEMPLATE_INPUT', 'COLUMNS_TEMPLATE', 'NEW_COLUMNS', 'SAMPLES_PER_PLOT', 'SAMPLE_IDENTIFIER', 'TEMPLATE_OUTPUT')
  2. You can modify the name of any Key item. For some BLOCK COLLECTION items, you should also add or delete Key items depending on the experiment information.

  3. Update Value items with the experiment information; each Key accepts only one Value.

  4. Delete, add, or modify Sequences items as needed.

  5. Keep the indentation correct. Indent with spaces, not tabs (YAML does not allow tab characters for indentation).
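
Tab characters are hard to spot by eye, so a small check can save a failed run. The following sketch is not part of AgTC; `find_tab_indented_lines` is a hypothetical helper that reports lines whose indentation contains a tab.

```python
def find_tab_indented_lines(text):
    """Return 1-based numbers of lines whose leading whitespace contains a tab."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is whatever lstrip() would remove.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            flagged.append(number)
    return flagged

spaces_only = "TEMPLATE_INPUT:\n  Folder : ./input/\n"
with_tab = "TEMPLATE_INPUT:\n\tFolder : ./input/\n"

print(find_tab_indented_lines(spaces_only))  # no lines flagged
print(find_tab_indented_lines(with_tab))     # line 2 is flagged
```

To check the real file, pass the text of config.yml to the function.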

TEMPLATE_INPUT

Used to define the path and name of the Initial Template file.

  • You can modify the name of the Key items.

  • You should update the Value items (e.g. trialInformation_PPAC_soybean_M2_y22.csv for the Input_template_file_name Key) with the name of your Initial Template file.

TEMPLATE_INPUT: 
  Folder : ./input/
  Input_template_file_name : trialInformation_PPAC_soybean_M2_y22.csv

COLUMNS_TEMPLATE

Specifies the names of the columns that will be selected from the Initial Template file.

  • Even if you want to maintain all the columns from the Initial Template file, you should write their names as Sequences. Add or delete as many Sequences as necessary.
COLUMNS_TEMPLATE:
  - Plot
  - Repetition
  - Genotype
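
Conceptually, this block acts as a column filter. The sketch below illustrates the idea on plain dictionaries; it is an assumption about the behavior, not AgTC's actual code, and `select_columns` is a hypothetical helper. It matches names exactly, including letter case; the source does not state whether AgTC does the same.

```python
def select_columns(rows, columns):
    """Keep only the listed columns, in the listed order; fail on unknown names."""
    selected = []
    for row in rows:
        missing = [name for name in columns if name not in row]
        if missing:
            raise KeyError(f"columns not found in Initial Template: {missing}")
        selected.append({name: row[name] for name in columns})
    return selected

# Hypothetical rows; "Notes" stands for a column you do not want to keep.
rows = [
    {"Plot": "1", "Genotype": "PI594301", "Repetition": "3", "Notes": "x"},
    {"Plot": "2", "Genotype": "LD-07-3395bf", "Repetition": "23", "Notes": "y"},
]

for row in select_columns(rows, ["Plot", "Repetition", "Genotype"]):
    print(row)
```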

NEW_COLUMNS

Defines the names and values of the new columns that will be added to the New Template.

  • You can add, delete or modify Key items based on your experiment information.

  • You should update the Value items based on your experiment information.

NEW_COLUMNS: 
  Experiment : ACRE-Biomass
  Season : y22
  Environment : Early planting date
  Measurment : Plant height
  Sampling_identifier : Sampling-2
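
Each Key/Value pair in this block becomes a column holding the same value in every row. A minimal sketch of that idea, assuming rows are plain dictionaries (`add_constant_columns` is a hypothetical helper, not an AgTC function):

```python
def add_constant_columns(rows, new_columns):
    """Return copies of the rows with every NEW_COLUMNS key/value added."""
    return [{**row, **new_columns} for row in rows]

new_columns = {"Experiment": "ACRE-Biomass", "Season": "y22"}
rows = [{"Plot": "1"}, {"Plot": "2"}]

for row in add_constant_columns(rows, new_columns):
    print(row)
```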

SAMPLES_PER_PLOT

This section specifies how many times a measurement is repeated on the same experimental unit (plot, pot, growth chamber, etc.). AgTC creates one row per experimental unit for each subsample name that you list.

  • You should specify the Sequences with the names that identify each subsample. Add, modify, or delete Sequences according to the characteristics of the measurement for which the Template will be created.
SAMPLES_PER_PLOT:
  Sample_name: 
    - A
    - B
    - C
    - D
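
With the four sample names above, every plot ends up with four rows. The sketch below shows that expansion on plain dictionaries; it illustrates the described behavior under that assumption and is not taken from AgTC (`expand_samples` is a hypothetical helper).

```python
def expand_samples(rows, column, sample_names):
    """Repeat each row once per sample name, storing the name under `column`."""
    return [{**row, column: name} for row in rows for name in sample_names]

rows = [{"Plot": "1"}, {"Plot": "2"}]
expanded = expand_samples(rows, "Sample_name", ["A", "B", "C", "D"])

print(len(expanded))  # 2 plots x 4 samples = 8 rows
print(expanded[0])
print(expanded[4])
```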

SAMPLE_IDENTIFIER

Specifies the names of the columns whose values will be combined to create a unique identifier for each subsample (row).

  • You should NOT add or delete Key items, but you can change the Key name. The Key name becomes the name of the identifier column in the New Template.

  • You should update the Sequences items according to the experiment information. Add or delete as many Sequences as necessary. Make sure each Sequences name exists as a Sequences item in COLUMNS_TEMPLATE or as a Key in NEW_COLUMNS or SAMPLES_PER_PLOT.

SAMPLE_IDENTIFIER :
  id_sample:
    - Plot
    - Sample_name
    - Experiment
    - Environment
    - Season
    - Measurment
    - Sampling_identifier
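
A mistyped name here is easy to miss, so the cross-check described above can be automated. The sketch below works on a hand-written dictionary that mirrors a shortened config.yml; `missing_identifier_names` is a hypothetical helper, not an AgTC function.

```python
def missing_identifier_names(config):
    """Return SAMPLE_IDENTIFIER names that no other block defines."""
    available = set(config["COLUMNS_TEMPLATE"])   # Sequences items
    available |= set(config["NEW_COLUMNS"])       # Key names
    available |= set(config["SAMPLES_PER_PLOT"])  # Key names
    missing = []
    for names in config["SAMPLE_IDENTIFIER"].values():
        missing.extend(name for name in names if name not in available)
    return missing

# Shortened, hand-written stand-in for the loaded config.yml.
config = {
    "COLUMNS_TEMPLATE": ["Plot", "Repetition", "Genotype"],
    "NEW_COLUMNS": {"Experiment": "ACRE-Biomass", "Season": "y22"},
    "SAMPLES_PER_PLOT": {"Sample_name": ["A", "B"]},
    "SAMPLE_IDENTIFIER": {"id_sample": ["Plot", "Sample_name", "Season", "Site"]},
}

print(missing_identifier_names(config))  # "Site" is not defined anywhere
```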

TEMPLATE_OUTPUT

Specifies the names of the columns whose values will be used to build the New Template file name.

  • You should update the Sequences items according to the experiment information. Add or delete as many Sequences as necessary. Make sure that each Sequences name exists as a Key in the NEW_COLUMNS item.
TEMPLATE_OUTPUT: 
  - Measurment
  - Sampling_identifier
  - Experiment
  - Environment
  - Season
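
The file name is assembled from the NEW_COLUMNS values in the order listed. The source does not state the separator or the file extension, so the underscore and `.csv` in this sketch are assumptions, and `build_output_name` is a hypothetical helper rather than an AgTC function.

```python
def build_output_name(new_columns, output_keys, separator="_", extension=".csv"):
    """Join the NEW_COLUMNS values named in TEMPLATE_OUTPUT into a file name.

    The separator and extension defaults are assumptions, not AgTC's documented format.
    """
    missing = [key for key in output_keys if key not in new_columns]
    if missing:
        raise KeyError(f"not defined in NEW_COLUMNS: {missing}")
    return separator.join(str(new_columns[key]) for key in output_keys) + extension

# Values copied from the NEW_COLUMNS example in this document.
new_columns = {
    "Experiment": "ACRE-Biomass",
    "Season": "y22",
    "Environment": "Early planting date",
    "Measurment": "Plant height",
    "Sampling_identifier": "Sampling-2",
}
output_keys = ["Measurment", "Sampling_identifier", "Experiment", "Environment", "Season"]

print(build_output_name(new_columns, output_keys))
```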

Run the Python functions

Open the main.ipynb file in Jupyter and execute the two blocks of instructions. The New Template is saved in the ./output folder.

Contact

Luis Vargas Rojas - [email protected]

Purdue University, Wang Lab dianewanglab.com
