The agtc from ivanprzol

Read the Spanish version

Note: to open links in a new tab, use Ctrl+click (Windows and Linux) or Cmd+click (macOS).

What is AgTC?

Agronomic Experiments Data Template Creator (AgTC) is a set of functions written in Python that let you create custom templates to collect data for most typical agronomic experiments. The resulting template can also be used to load a "field" in the Field Book app.

How to run AgTC?

Option 1
- Use a Jupyter Hub environment.
- For Purdue University members only: a Jupyter Hub is available at https://notebook.scholar.rcac.purdue.edu/. It requires BoilerKey Two-Factor Authentication.
Option 2
- Install either JupyterLab or Jupyter Notebook directly, or use an environment manager such as conda, mamba, or pipenv.

Prerequisites

Install the required libraries using pip, the package installer for Python.

PyYAML
```
    pip install pyyaml
```
Pandas
```
    pip install pandas
```

Clone or download AgTC from the GitHub repository

Clone option
1. Open a new Jupyter Notebook terminal (New > Terminal).
2. Clone the GitHub repository:
```
    git clone https://github.com/Purdue-LuisVargas/AgTC.git
```
Download option
1. Download AgTC from the GitHub repository: https://github.com/Purdue-LuisVargas/AgTC.
2. Unzip the entire folder, then copy the files (if running Jupyter locally) or upload them (if using the Jupyter Hub environment) to your Jupyter Notebook directory.

Upload the Initial Template file

Upload the Initial Template file to the ./input folder. It must be a CSV file containing the information that changes from plot to plot and that you want to keep in the New Template, such as plot number, repetition number, genotype name, etc. The table below is an example of the columns and rows that an Initial Template could have:

plot	genotype	repetition
1	PI594301	3
2	LD-07-3395bf	23
3	SA1730464	18
4	PI154189	14
5	PI594451	2
...	...	...
71	CR16-0042	20
72	SA1811280	25
73	E19517GT	24
74	LD-07-3395bf	23
75	PI6548362	8
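
Before running AgTC, it can help to confirm that the Initial Template has the columns you expect. The following is a minimal sketch using only the standard library `csv` module and an in-memory copy of the first rows of the example table. The helper name `read_initial_template` is hypothetical and is not part of AgTC.

```python
import csv
import io

# In-memory stand-in for a CSV file in ./input (first rows of the example table).
SAMPLE_CSV = """plot,genotype,repetition
1,PI594301,3
2,LD-07-3395bf,23
3,SA1730464,18
"""

def read_initial_template(handle, required_columns):
    """Read a CSV Initial Template and verify that the required columns exist."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in required_columns if name not in header]
    if missing:
        raise ValueError(f"Missing columns in Initial Template: {missing}")
    return list(reader)

rows = read_initial_template(io.StringIO(SAMPLE_CSV), ["plot", "genotype", "repetition"])
print(len(rows), rows[0]["genotype"])
```

For a real file, pass an open file handle (for example `open("./input/your_file.csv", newline="")`) instead of the `io.StringIO` object.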

Understanding the Configuration File

The config.yml file is a YAML file that can be edited in Jupyter or any text editor. It is divided into six blocks of configuration, each identified by a name in uppercase letters. A BLOCK COLLECTION can contain keys, values, and/or sequences.

For instance, the following data structure contains the most common items in the config.yml file.

TEMPLATE_INPUT: 
  Folder : ./input/
  Sample_name: 
    - A
    - B

Where:

TEMPLATE_INPUT = BLOCK COLLECTION

Folder = Key

Sample_name = Key

./input/ = Value

A, B = Sequences

The data structures in the config.yml file can have one of the following arrangements.

Case 1

    BLOCK COLLECTION:
       Key: Value

Case 2

    BLOCK COLLECTION:
       Key: 
          - Sequences

Case 3

    BLOCK COLLECTION:
       - Sequences
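
To see how the three arrangements look once loaded in Python, the sketch below parses a small configuration with PyYAML (installed in the Prerequisites step). The YAML text is an illustrative fragment built from block names used in this document, and the sketch only shows how PyYAML represents each case; it does not claim to reproduce how AgTC reads the file internally.

```python
import yaml  # provided by the PyYAML package

# Illustrative fragment covering the three arrangements.
CONFIG_TEXT = """
TEMPLATE_INPUT:
  Folder : ./input/
SAMPLES_PER_PLOT:
  Sample_name:
    - A
    - B
COLUMNS_TEMPLATE:
  - Plot
  - Genotype
"""

config = yaml.safe_load(CONFIG_TEXT)

# Case 1: Key with a single Value becomes a dict entry holding a string.
print(config["TEMPLATE_INPUT"])
# Case 2: Key with Sequences becomes a dict entry holding a list.
print(config["SAMPLES_PER_PLOT"])
# Case 3: Sequences directly under the block become a list.
print(config["COLUMNS_TEMPLATE"])
```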

Update the Configuration File

You need to update the config.yml file using information about the experiment.

General rules

Changing the names of the BLOCK COLLECTION items is NOT recommended. If you do change them, you must also update the names in the following line of code in the main.ipynb Jupyter file:

functions.create_new_template('config.yml', 'TEMPLATE_INPUT', 'COLUMNS_TEMPLATE', 'NEW_COLUMNS', 'SAMPLES_PER_PLOT', 'SAMPLE_IDENTIFIER', 'TEMPLATE_OUTPUT')

- You can modify the name of any Key item. For some BLOCK COLLECTION items, you should also add or delete Key items depending on the experiment information.
- You should update Value items with the experiment information, but each Key can hold only one Value.
- You should add, delete, or modify Sequences items as needed.
- Be sure to keep the correct indentation. Use spaces, not the Tab key.
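
Because YAML does not allow tab characters for indentation, a quick check of the configuration text can catch the problem before the file is loaded. The sketch below is one way to do it with plain Python; the function name `find_tab_indented_lines` is hypothetical and is not part of AgTC.

```python
def find_tab_indented_lines(text):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is whatever lstrip() would remove.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            flagged.append(number)
    return flagged

spaces_text = "NEW_COLUMNS:\n  Season : y22\n"
tabs_text = "NEW_COLUMNS:\n\tSeason : y22\n"

print(find_tab_indented_lines(spaces_text))
print(find_tab_indented_lines(tabs_text))
```

To check a real file, read it with `open("config.yml").read()` and pass the text to the function.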

TEMPLATE_INPUT

Used to define the path and name of the Initial Template file.

You can modify the name of the Key items.
You should update the Value items with the name of your Initial Template file (e.g. replace trialInformation_PPAC_soybean_M2_y22.csv in the Input_template_file_name Key).

TEMPLATE_INPUT: 
  Folder : ./input/
  Input_template_file_name : trialInformation_PPAC_soybean_M2_y22.csv

COLUMNS_TEMPLATE

Specifies the names of the columns that will be selected from the Initial Template file.

Even if you want to maintain all the columns from the Initial Template file, you should write their names as Sequences. Add or delete as many Sequences as necessary.

COLUMNS_TEMPLATE:
  - Plot
  - Repetition
  - Genotype

NEW_COLUMNS

Defines the names and values of the new columns that will be added to the New Template.

You can add, delete or modify Key items based on your experiment information.
You should update the Value items based on your experiment information.

NEW_COLUMNS: 
  Experiment : ACRE-Biomass
  Season : y22
  Environment : Early planting date
  Measurment : Plant height
  Sampling_identifier : Sampling-2
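
The combined effect of COLUMNS_TEMPLATE and NEW_COLUMNS can be pictured as follows: keep only the selected columns of each row, then append the new columns with the same value on every row. This is a sketch under that reading of the two blocks, using plain dictionaries and made-up rows (including the extra `Notes` column); `build_rows` is a hypothetical helper, not the AgTC implementation.

```python
# Rows as they might be read from an Initial Template (illustrative data).
rows = [
    {"Plot": "1", "Genotype": "PI594301", "Repetition": "3", "Notes": "x"},
    {"Plot": "2", "Genotype": "LD-07-3395bf", "Repetition": "23", "Notes": "y"},
]
columns_template = ["Plot", "Repetition", "Genotype"]          # COLUMNS_TEMPLATE
new_columns = {"Experiment": "ACRE-Biomass", "Season": "y22"}  # NEW_COLUMNS (subset)

def build_rows(rows, columns_template, new_columns):
    """Keep only the selected columns, then append the constant new columns."""
    return [
        {**{name: row[name] for name in columns_template}, **new_columns}
        for row in rows
    ]

result = build_rows(rows, columns_template, new_columns)
print(list(result[0].keys()))
```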

SAMPLES_PER_PLOT

This section specifies how many times a measurement is repeated on the same experimental unit (plot, pot, growth chamber, etc.). AgTC creates one row for each subsample name that you list.

List as Sequences the names that identify each subsample. Add, modify, or delete Sequences according to the characteristics of the measurement for which the Template will be created.

SAMPLES_PER_PLOT:
  Sample_name: 
    - A
    - B
    - C
    - D
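
The row expansion described above amounts to pairing every plot with every subsample name. The sketch below illustrates that idea with `itertools.product` on two made-up plots; it is an illustration of the expected result, not the code AgTC runs.

```python
from itertools import product

plots = [{"Plot": "1"}, {"Plot": "2"}]   # one dict per experimental unit
sample_names = ["A", "B", "C", "D"]      # Sequences under Sample_name

# One output row per (plot, subsample) pair, plots in their original order.
expanded = [
    {**plot, "Sample_name": sample}
    for plot, sample in product(plots, sample_names)
]
print(len(expanded))
```

With two plots and four subsample names, the expanded table has eight rows.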

SAMPLE_IDENTIFIER

Specifies the columns whose values are combined to create a unique identifier for each subsample (row).

You should NOT add or delete Key items, but you can rename the Key. The Key name becomes the name of the identifier column in the New Template.
You should update the Sequences items according to the experiment information. Add or delete as many Sequences as necessary. Make sure that each Sequences name exists as a column name in COLUMNS_TEMPLATE or as a Key in NEW_COLUMNS or SAMPLES_PER_PLOT.

SAMPLE_IDENTIFIER :
  id_sample:
    - Plot
    - Sample_name
    - Experiment
    - Environment
    - Season
    - Measurment
    - Sampling_identifier
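
As an illustration of how such an identifier can be assembled, the sketch below joins the listed column values of one row in order. The underscore separator and the helper name `make_identifier` are assumptions made for this example; the separator AgTC actually uses is not stated in this document.

```python
row = {
    "Plot": "1",
    "Sample_name": "A",
    "Experiment": "ACRE-Biomass",
    "Environment": "Early planting date",
    "Season": "y22",
}
id_columns = ["Plot", "Sample_name", "Experiment", "Season"]  # Sequences under id_sample

def make_identifier(row, columns, separator="_"):
    """Join the chosen column values in order (separator is an assumption)."""
    missing = [name for name in columns if name not in row]
    if missing:
        raise KeyError(f"Columns not found in row: {missing}")
    return separator.join(str(row[name]) for name in columns)

row["id_sample"] = make_identifier(row, id_columns)
print(row["id_sample"])
```

The explicit check for missing names mirrors the rule above: every name listed under the Key must already exist as a column.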

TEMPLATE_OUTPUT

Specifies the columns whose values are used to build the New Template file name.

You should update the Sequences items according to the experiment information. Add or delete as many Sequences as necessary. Make sure that each Sequences name exists as a Key in the NEW_COLUMNS item.

TEMPLATE_OUTPUT: 
  - Measurment
  - Sampling_identifier
  - Experiment
  - Environment
  - Season
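
A file name built this way could look like the result of the sketch below, which reads the listed Keys from NEW_COLUMNS and places the file in the ./output folder. The underscore separator, the `.csv` extension, and the helper name `output_path` are assumptions for illustration only; check the file AgTC actually writes for the real naming pattern.

```python
from pathlib import PurePosixPath

new_columns = {
    "Experiment": "ACRE-Biomass",
    "Season": "y22",
    "Measurment": "Plant height",
    "Sampling_identifier": "Sampling-2",
}
template_output = ["Measurment", "Sampling_identifier", "Experiment", "Season"]

def output_path(new_columns, template_output, folder="./output"):
    """Build a hypothetical output path from NEW_COLUMNS values."""
    stem = "_".join(new_columns[name] for name in template_output)  # assumed separator
    return PurePosixPath(folder) / f"{stem}.csv"                    # assumed extension

path = output_path(new_columns, template_output)
print(path.name)
```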

Run the Python functions

Open the main.ipynb file in Jupyter and execute the two blocks of instructions. You will find your New Template in the ./output folder.

Contact

Luis Vargas Rojas - [email protected]

Purdue University, Wang Lab dianewanglab.com
