Cancer Analysis Pipeline

This repository contains a pipeline for analyzing next-generation sequencing data from cancer and control patients. The pipeline consists of several scripts that run in sequence, from data preprocessing to neural network-based analysis. The analysis is memory-intensive, so your computer may not have enough memory to process the full dataset.

Function Descriptions

1. `metadata_treat.py`

This script reads the SraRunTable.txt metadata file, which can be downloaded from the NCBI SRA Run Selector. It builds a dataset of control and cancer patients together with their SRA run names.
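
As a rough illustration of this step, the sketch below splits run accessions into cancer and control groups using only the standard library. It assumes the Run Selector export is comma-separated and has a `Run` column; the `disease` column name, its values, and the function name `split_runs` are hypothetical and will differ per study.

```python
import csv
import io


def split_runs(table_text, label_column="disease", cancer_label="cancer"):
    """Split SRA run accessions into (cancer, control) lists.

    label_column and cancer_label are assumptions for illustration;
    check the header of your own SraRunTable.txt.
    """
    cancer, control = [], []
    for row in csv.DictReader(io.StringIO(table_text)):
        run = row["Run"]
        # Case-insensitive match so "Cancer" and "cancer" are treated alike.
        if row[label_column].strip().lower() == cancer_label:
            cancer.append(run)
        else:
            control.append(run)
    return cancer, control


# Hypothetical two-row table, not real accessions.
sample = "Run,disease\nSRR0000001,cancer\nSRR0000002,healthy\n"
cancer_runs, control_runs = split_runs(sample)
```

For a real file, pass the contents of SraRunTable.txt (for example from `open(...).read()`) instead of `sample`.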

2. `sra_script.py`

Using the SRA run names from the dataset generated by metadata_treat.py, this script downloads the SRA run data, aligns the reads to the human hg38 reference genome, and generates BED files. To process the df_control dataset instead, replace "cancer" with "control".
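
The download, alignment, and BED conversion for one run could look roughly like the command list built below. This is a sketch under several assumptions that the README does not confirm: the SRA Toolkit (`prefetch`, `fasterq-dump`) is installed, reads are paired-end, `bwa mem` is the aligner, and `hg38.fa` is a hypothetical path to an already indexed reference. The function only builds shell command strings; it does not run them.

```python
def build_commands(run, reference="hg38.fa", outdir="work"):
    """Return shell commands that turn one SRA run into a BED file.

    The aligner (bwa mem), file names, and paths are assumptions,
    not the repository's actual choices.
    """
    fastq_1 = f"{outdir}/{run}_1.fastq"
    fastq_2 = f"{outdir}/{run}_2.fastq"
    bam = f"{outdir}/{run}.sorted.bam"
    bed = f"{outdir}/{run}.bed"
    return [
        # Download the run, then convert it to paired FASTQ files.
        f"prefetch {run}",
        f"fasterq-dump {run} --split-files -O {outdir}",
        # Align to the reference and sort the alignments into a BAM file.
        f"bwa mem {reference} {fastq_1} {fastq_2} | samtools sort -o {bam} -",
        # Convert alignments to BED intervals.
        f"bedtools bamtobed -i {bam} > {bed}",
    ]


commands = build_commands("SRR0000001")  # hypothetical accession
```

Each string can be passed to `subprocess.run(cmd, shell=True, check=True)` once the tool choices match your setup.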

3. `Test_on_Script.py`

This script runs tests and quality checks on the BED files to confirm that the data is reliable before further analysis.
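
The README does not list the specific checks, so the following is only an example of the kind of per-record validation that could be applied to a BED file. The function name `check_bed_line` and the two checks it performs are illustrative assumptions.

```python
def check_bed_line(line):
    """Return a list of problems found in one BED record (empty if none)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        return ["fewer than 3 columns"]
    problems = []
    start, end = fields[1], fields[2]
    if not (start.isdigit() and end.isdigit()):
        problems.append("non-integer coordinates")
    elif int(start) >= int(end):
        # A sequenced fragment should span at least one base.
        problems.append("start is not less than end")
    return problems


ok = check_bed_line("chr1\t100\t267\n")
bad = check_bed_line("chr1\t300\t200\n")
```

Here `ok` is an empty list and `bad` reports the inverted coordinates.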

4. `Chrom_info.py`

This script extracts chromosome positioning information from the BED files. The fragment distribution histograms are built from this information.
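
A minimal sketch of this extraction, assuming standard tab-separated BED records whose first three columns are chromosome, start, and end. The function name `fragments_by_chrom` is hypothetical. Because it iterates line by line, it also works on an open file handle without loading the whole file at once.

```python
from collections import defaultdict


def fragments_by_chrom(bed_lines):
    """Map each chromosome to a list of (start, end) fragment coordinates."""
    positions = defaultdict(list)
    for line in bed_lines:
        # Skip blank lines and optional BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        positions[chrom].append((int(start), int(end)))
    return dict(positions)


bed_lines = [
    "chr1\t100\t267\tread1\n",
    "chr1\t500\t660\tread2\n",
    "chr2\t10\t190\tread3\n",
]
by_chrom = fragments_by_chrom(bed_lines)
```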

5. `histogram_creation.py`

Uses the chromosome positioning information to create histograms that provide insights into fragment distribution patterns within the dataset.
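
One possible reading of "fragment distribution" is the distribution of fragment lengths. Under that assumption, a histogram can be computed from (start, end) pairs as sketched below. The bin width, maximum length, and function name are illustrative choices, not values taken from the repository.

```python
def length_histogram(fragments, bin_width=10, max_length=500):
    """Count fragments per length bin.

    Bin i covers lengths in [i * bin_width, (i + 1) * bin_width).
    Fragments of max_length or longer are ignored.
    """
    counts = [0] * (max_length // bin_width)
    for start, end in fragments:
        length = end - start
        if 0 <= length < len(counts) * bin_width:
            counts[length // bin_width] += 1
    return counts


fragments = [(100, 267), (500, 660), (10, 190)]  # lengths 167, 160, 180
hist = length_histogram(fragments)
```

With these inputs, two fragments fall in the 160 to 169 bin and one in the 180 to 189 bin.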

6. `AI_simple_NN_WRST.py`

This script implements a small neural network that takes the histograms produced by histogram_creation.py as input for the analysis.
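
The README does not describe the network's architecture or framework. To show the general idea only, here is a dependency-free sketch of a one-hidden-layer classifier trained on normalized histogram vectors. The layer size, learning rate, toy data, and the name `train_tiny_network` are all assumptions; the actual script may use a different design and a library such as TensorFlow or PyTorch.

```python
import math
import random


def train_tiny_network(samples, labels, hidden=4, epochs=500, lr=0.5, seed=0):
    """Train a tanh-hidden, sigmoid-output network with plain SGD.

    Returns a function mapping an input vector to a probability of label 1.
    """
    rng = random.Random(seed)
    n_in = len(samples[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(w1, b1)]
        z = sum(w * hi for w, hi in zip(w2, h)) + b2
        return h, 1.0 / (1.0 + math.exp(-z))

    for _ in range(epochs):
        for x, y in zip(samples, labels):
            h, p = forward(x)
            dz = p - y  # cross-entropy gradient at the output pre-activation
            for j in range(hidden):
                # Backpropagate through tanh before updating w2[j].
                dh = dz * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * dz * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * dh * x[i]
                b1[j] -= lr * dh
            b2 -= lr * dz

    return lambda x: forward(x)[1]


# Toy normalized histograms (made up): label 1 = cancer, 0 = control.
samples = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.2, 0.7], [0.1, 0.3, 0.6]]
labels = [1, 1, 0, 0]
predict = train_tiny_network(samples, labels)
p_cancer = predict([0.7, 0.2, 0.1])
p_control = predict([0.1, 0.2, 0.7])
```

On real data, hold out part of the samples for evaluation rather than scoring the training inputs as done here.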

Important Notice

Because the analysis is complex and the datasets are large, your computer's memory may be insufficient for some steps of this pipeline. Make sure your system has enough memory and processing power before running the analysis.

Usage

Clone the repository to your local machine.

Create and activate a Conda environment to isolate dependencies for this pipeline:

conda create -n cancer_analysis_env python=<python_version>
conda activate cancer_analysis_env

Replace <python_version> with the desired Python version.

Install the required dependencies using Conda and Bioconda:

FastQC
Bedtools
Samtools

conda install -c bioconda fastqc bedtools samtools

Ensure you are using a Linux-based system, as the pipeline is designed to work best on this platform.

Run the scripts in the order listed above, providing the inputs and configuration each one needs.

Monitor memory usage during execution, and consider using a system with more memory if memory-related errors occur.
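
One lightweight way to monitor memory from inside a Python script on Linux is the standard `resource` module, as in the sketch below. The helper name `peak_memory_mb` is hypothetical, and the conversion assumes Linux, where `ru_maxrss` is reported in kilobytes.

```python
import resource


def peak_memory_mb():
    """Peak resident memory of the current process in MiB (Linux units)."""
    peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak_kib / 1024


# Allocate something noticeable, then report the peak so far.
data = [0] * 1_000_000
peak = peak_memory_mb()
print(f"peak memory so far: {peak:.1f} MiB")
```

Calling the helper after each pipeline step gives a rough idea of which step needs the most memory.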

Dependencies

FastQC
Bedtools
Samtools

Contributions

Contributions to this repository are welcome. If you encounter issues or have ideas for improvements, feel free to open an issue or submit a pull request.

Source repository: samuelbernard4 / epigenetic-modeling-for-cancer-detection