smashGC is a tool that works downstream of antiSMASH 7 results to look for evidence of horizontal gene transfer (HGT) events. It is built around correlating the GC content of antiSMASH-predicted biosynthetic gene clusters (BGCs) with the GC content of their host genomes.
The tool takes as input a folder containing one or more gbff files and a single tsv file describing the BGC regions within those genomes.
It then calculates the GC content of each BGC and of the whole genome.
The results can be plotted as the correlation between the two GC content values. As an example, antiSMASH was run on 616 enterococcal genomes, and the tool was then run on those genomes together with a tsv file describing the products.
- BGC Prediction Correlation: Correlates the GC content of antiSMASH-predicted BGCs with whole-genome GC content.
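The GC comparison described above can be illustrated with a minimal sketch. It is not the code in smashGC.py: the function names, the toy sequences, and the 0-based, end-exclusive coordinates are all assumptions made for the example, and it uses plain Python strings rather than Biopython records.

```python
def gc_fraction(seq):
    """Return the fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def bgc_vs_genome_gc(contigs, locus, start, end):
    """Compare the GC fraction of one region with that of the whole genome.

    contigs: dict mapping contig name -> sequence string
    locus:   name of the contig that holds the BGC
    start, end: 0-based, end-exclusive coordinates on that contig (assumed)
    """
    region = contigs[locus][start:end]
    genome = "".join(contigs.values())
    return gc_fraction(region), gc_fraction(genome)


# Toy genome with two contigs; the "BGC" is the GC-rich stretch on contig_1.
contigs = {
    "contig_1": "ATATATATGCGCGCGCATAT",
    "contig_2": "ATGCATGCAT",
}
bgc_gc, genome_gc = bgc_vs_genome_gc(contigs, "contig_1", 8, 16)
print(f"BGC GC: {bgc_gc:.2f}, genome GC: {genome_gc:.2f}")
```

A BGC whose GC fraction sits far from that of its genome, as in this toy case, is the kind of outlier that may hint at an HGT event.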
- Python 3.7.12
- antiSMASH (for generating initial data)
- Biopython
- Clone the repository to your local machine:
git clone https://github.com/DEHourigan/smashGC.git
- Navigate to the cloned directory:
cd smashGC
- Install the required dependencies:
conda env create -f smashGC.yml
conda activate smashGC
The tool comes with a Biopython script that extracts the start and end of each antiSMASH region in the context of the whole genome.
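As a rough illustration of where those coordinates can come from, the sketch below reads them from the structured comment that antiSMASH writes into its per-region GenBank files. This is an assumption about the file layout and not the bundled script itself: the function name and the example header are hypothetical, and plain text parsing stands in for Biopython.

```python
import re

# Matches lines such as "Orig. start  :: 12345" in the antiSMASH comment block.
_ORIG = re.compile(r"Orig\.\s+(start|end)\s+::\s+(\d+)")


def parse_orig_coords(genbank_text):
    """Return (orig_start, orig_end) from region GenBank text, or None if absent."""
    found = {kind: int(value) for kind, value in _ORIG.findall(genbank_text)}
    if "start" in found and "end" in found:
        return found["start"], found["end"]
    return None


# Made-up header in the assumed layout; the coordinates are illustrative only.
example_header = """\
COMMENT     ##antiSMASH-Data-START##
            Version      :: 7.0.0
            Orig. start  :: 10512
            Orig. end    :: 33840
            ##antiSMASH-Data-END##
"""
coords = parse_orig_coords(example_header)
print(coords)
```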
- Prepare your data:
  - Ensure your .gbff files are located within a single folder.
  - Create a .tsv file containing the headers product, assembly, orig_start, orig_end, locus, filename as described:
    - product: Predicted by antiSMASH, e.g. lanthipeptide-i
    - assembly: Accession of the genome.
    - orig_start and orig_end: Start and end of the antiSMASH region in the context of the whole genome.
    - locus: Contig name.
    - filename: Name of each individual .gbff file.
- Run smashGC:
  python smashGC.py -f /path/to/folder -t /path/to/file.tsv
  Replace /path/to/folder and /path/to/file.tsv with the actual paths to your .gbff files folder and .tsv file, respectively.
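Before running the tool, it may help to check that the tsv file has the expected headers. The sketch below is one possible check and is not part of smashGC: the function name is hypothetical, the example row is made up, and the rule that orig_start must be smaller than orig_end is an assumption.

```python
import csv
import io

REQUIRED = ["product", "assembly", "orig_start", "orig_end", "locus", "filename"]


def check_regions_tsv(handle):
    """Read a regions table and return its rows, raising ValueError on problems."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [name for name in REQUIRED if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    rows = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        start, end = int(row["orig_start"]), int(row["orig_end"])
        if start >= end:
            raise ValueError(f"line {line_no}: orig_start must be below orig_end")
        rows.append(row)
    return rows


# Illustrative row; the accession, contig and file names are made up.
example = io.StringIO(
    "product\tassembly\torig_start\torig_end\tlocus\tfilename\n"
    "lanthipeptide-i\tGCF_000000000.1\t10512\t33840\tcontig_1\tGCF_000000000.1.gbff\n"
)
rows = check_regions_tsv(example)
print(len(rows), rows[0]["product"])
```

For a real file, pass an open file handle instead of the in-memory example, for instance `open("/path/to/file.tsv", newline="")`.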
A tsv file containing the GC content of each BGC versus its genome. The correlation can be plotted from this file.
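As a starting point for that plot, the correlation itself can be computed with the standard library alone. This is only a sketch: the column names bgc_gc and genome_gc are hypothetical (check the header of your own output file), and the values are made up.

```python
import csv
import io
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up values; "bgc_gc" and "genome_gc" are hypothetical column names.
table = io.StringIO(
    "bgc_gc\tgenome_gc\n"
    "0.30\t0.36\n"
    "0.32\t0.37\n"
    "0.34\t0.38\n"
)
rows = list(csv.DictReader(table, delimiter="\t"))
bgc = [float(r["bgc_gc"]) for r in rows]
genome = [float(r["genome_gc"]) for r in rows]
r_value = pearson_r(bgc, genome)
print(f"Pearson r = {r_value:.3f}")
```

The two lists can be passed directly to a plotting library of your choice to draw the scatter plot.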
DEHourigan
For any queries, please reach out via GitHub issues or directly to [email protected].