primers

This is a small, straightforward tool for creating PCR primers. Its target use-case is DNA assembly.

Reasons to choose primers instead of Primer3 include its:

  • features: It is uniquely focused on DNA assembly flows like Gibson Assembly and Golden Gate cloning. You can design primers while adding sequence to the 5' ends of primers.
  • simplicity: It is a small and simple Python CLI/library with a single dependency (seqfold). It is easier to install and use.
  • interface: The Python library accepts and creates primers for Biopython Seq classes. It outputs JSON for easy integration with other applications.
  • license: It has a permissive, business-friendly license (MIT) instead of a copyleft GPL v2 license.

Installation

pip install primers

Usage

primers chooses pairs while optimizing for length, tm, GC ratio, secondary structure, and off-target binding. In the simplest case, you just pass the sequence you want to amplify:

$ primers create CTACTAATAGCACACACGGGGACTAGCATCTATCTCAGCTACGATCAGCATC
  dir    tm   ttm  gc     dg     p  seq
  FWD  63.6  63.6 0.5      0   2.6  CTACTAATAGCACACACGGG
  REV  63.2  63.2 0.5  -0.16  1.52  GATGCTGATCGTAGCTGAGATA

Additional sequence is added to the 5' end of primers via the add_fwd/add_rev args (-f/-r with CLI). By default, the entire additional sequence is prepended. To let primers choose the best subsequence to add to the 5' end (factoring in the features discussed below), give it a range of lengths to choose from via add_fwd_len/add_rev_len (-fl/-rl with CLI). Each primer has two tms: "tm", the melting temperature of the portion of the primer that binds to the template sequence, and "tm_total", the melting temperature of the entire primer including the additional sequence added to its 5' end.

Python

from primers import create

# add enzyme recognition sequences to FWD and REV primers: BsaI, BpiI
fwd, rev = create("AATGAGACAATAGCACACACAGCTAGGTCAGCATACGAAA", add_fwd="GGTCTC", add_rev="GAAGAC")
print(fwd.fwd)      # True
print(fwd.seq)      # GGTCTCAATGAGACAATAGCACACACA; 5' to 3'
print(fwd.tm)       # 62.4; melting temp
print(fwd.tm_total) # 68.6; melting temp with added seq (GGTCTC)
print(fwd.dg)       # -1.86; minimum free energy of the secondary structure

# add from a range of sequence to the FWD primer: [5, 12] bp
fwd, rev = create("AATGAGACAATAGCACACACAGCTAGGTCAGCATACGAAA", add_fwd="GGATCGAGCTTGA", add_fwd_len=(5, 12))
print(fwd.seq)      # AGCTTGAAATGAGACAATAGCACACACAGC (AGCTTGA added from add_fwd)
print(fwd.tm)       # 62.2
print(fwd.tm_total) # 70.0

CLI

$ primers create --help
usage: primers create [-h] [-f SEQ] [-fl INT INT] [-r SEQ] [-rl INT INT] [-t SEQ] [-j | --json | --no-json] SEQ

positional arguments:
  SEQ                   create primers to amplify this sequence

options:
  -h, --help            show this help message and exit
  -f SEQ                additional sequence to add to FWD primer (5' to 3')
  -fl INT INT           space separated min-max range for the length to add from '-f' (5' to 3')
  -r SEQ                additional sequence to add to REV primer (5' to 3')
  -rl INT INT           space separated min-max range for the length to add from '-r' (5' to 3')
  -t SEQ                sequence to check for off-target binding sites
  -j, --json, --no-json
                        write the primers to a JSON array

Table Output Format

By default, the primers are printed as a table with the columns dir, tm, ttm, gc, dg, p, and seq, where:

  • dir: FWD or REV
  • tm: the melting temperature of the annealing portion of the primer (Celsius)
  • ttm: the total melting temperature of the primer with added seq (Celsius)
  • gc: the GC ratio of the primer
  • dg: the minimum free energy of the primer (kcal/mol)
  • p: the primer's penalty score. Lower is better
  • seq: the sequence of the primer in the 5' to the 3' direction

$ primers create -f GGTCTC -r GAAGAC AATGAGACAATAGCACACACAGCTAGGTCAGCATACGAAA
  dir    tm   ttm  gc     dg     p  seq
  FWD  60.8  67.0 0.5  -1.86  5.93  GGTCTCAATGAGACAATAGCACACAC
  REV  60.8  65.8 0.5      0   3.2  GAAGACTTTCGTATGCTGACCTAG

JSON Output Format

The --json flag prints primers in JSON format with more details on scoring. The example below is truncated for clarity:

$ primers create CTACTAATAGCACACACGGGGACTAGCATCTATCTCAGCTACGATCAGCATC --json | jq
[
  {
    "seq": "CTACTAATAGCACACACGGG",
    "len": 20,
    "tm": 63.6,
    "tm_total": 63.6,
    "gc": 0.5,
    "dg": 0,
    "fwd": true,
    "off_target_count": 0,
    "scoring": {
      "penalty": 2.6,
      "penalty_tm": 1.6,
      "penalty_tm_diff": 0,
      "penalty_gc": 0,
      "penalty_len": 1,
      "penalty_dg": 0,
      "penalty_off_target": 0
    }
  },
...
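
The JSON output is straightforward to consume from a script. The following is a minimal sketch, assuming the output has the shape shown above. The summarize helper is hypothetical and not part of primers, and the embedded record is copied from the truncated example rather than produced by running the CLI.

```python
import json

# One record copied from the truncated example above (FWD primer only).
raw = """
[
  {
    "seq": "CTACTAATAGCACACACGGG",
    "len": 20,
    "tm": 63.6,
    "tm_total": 63.6,
    "gc": 0.5,
    "dg": 0,
    "fwd": true,
    "off_target_count": 0,
    "scoring": {
      "penalty": 2.6,
      "penalty_tm": 1.6,
      "penalty_tm_diff": 0,
      "penalty_gc": 0,
      "penalty_len": 1,
      "penalty_dg": 0,
      "penalty_off_target": 0
    }
  }
]
"""


def summarize(raw_json):
    """Reduce each primer record to direction, sequence, tm, and total penalty."""
    return [
        {
            "dir": "FWD" if primer["fwd"] else "REV",
            "seq": primer["seq"],
            "tm": primer["tm"],
            "penalty": primer["scoring"]["penalty"],
        }
        for primer in json.loads(raw_json)
    ]


for row in summarize(raw):
    print(row["dir"], row["seq"], row["tm"], row["penalty"])
```

In practice the raw string would come from the standard output of primers create ... --json instead of a literal.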

Algorithm

Choosing PCR primers requires optimizing for a few different characteristics. Ideally, pairs of primers for PCR amplification would have similar tms, GC ratios close to 0.5, high (close to zero) minimum free energies (dg), and a lack of off-target binding sites. In primers, like Primer3, choosing amongst those (sometimes competing) goals is accomplished with a linear function that penalizes undesirable characteristics. The primer pair with the lowest combined penalty is chosen.
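
The idea of picking the pair with the lowest combined penalty can be sketched as follows. This is an illustration only, not the code primers runs: the Candidate class and choose_pair function are hypothetical, the candidate values are invented for the example, and charging the tm-difference term to each primer of the pair is an assumption.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Candidate:
    name: str       # placeholder label standing in for a primer sequence
    tm: float       # melting temperature (Celsius)
    penalty: float  # penalty from the single-primer terms only


def choose_pair(fwd_candidates, rev_candidates, penalty_tm_diff=1.0):
    """Return the (fwd, rev) pair with the lowest combined penalty."""

    def combined(pair):
        fwd, rev = pair
        # Assumption: the tm-difference term is charged to each primer of the pair.
        tm_diff = abs(fwd.tm - rev.tm) * penalty_tm_diff
        return fwd.penalty + rev.penalty + 2 * tm_diff

    return min(product(fwd_candidates, rev_candidates), key=combined)


# Invented values: fwd_b and rev_d score best alone but have mismatched tms.
fwds = [Candidate("fwd_a", 63.6, 2.6), Candidate("fwd_b", 60.0, 1.5)]
revs = [Candidate("rev_c", 63.2, 1.5), Candidate("rev_d", 57.0, 1.0)]

best_fwd, best_rev = choose_pair(fwds, revs)
print(best_fwd.name, best_rev.name)  # fwd_a rev_c
```

With these made-up numbers, the primers with the lowest individual penalties are not chosen, because the difference between their tms outweighs what they gain elsewhere.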

Scoring

The penalty for each possible primer, p, is calculated as:

PENALTY(p) =
    abs(p.tm - optimal_tm) * penalty_tm +     // penalize each deg of suboptimal melting temperature
    abs(p.gc - optimal_gc) * penalty_gc +     // penalize each percentage point of suboptimal GC ratio
    abs(len(p) - optimal_len) * penalty_len + // penalize each bp of suboptimal length
    abs(p.tm - p.pair.tm) * penalty_tm_diff + // penalize each deg of melting temperature diff between primers
    abs(p.dg) * penalty_dg +                  // penalize each kcal/mol of free energy in secondary structure
    p.offtarget_count * penalty_offtarget     // penalize each off-target binding site

Each of the optimal (optimal_*) and penalty (penalty_*) parameters is adjustable in the primers.create() function. The defaults are below:

optimal_tm: float = 62.0
optimal_gc: float = 0.5
optimal_len: int = 22
penalty_tm: float = 1.0
penalty_gc: float = 0.2
penalty_len: float = 0.5
penalty_tm_diff: float = 1.0
penalty_dg: float = 2.0
penalty_offtarget: float = 20.0
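
As a rough illustration of the formula, the sketch below recomputes the penalty terms using the defaults listed above. The penalty function is hypothetical and is not the library's implementation. It assumes the GC difference is scaled to percentage points (multiplied by 100), which is inferred from the comment in the formula and from the penalty_gc values in the example outputs in this README.

```python
def penalty(
    tm,
    gc,
    length,
    dg,
    pair_tm,
    off_target_count=0,
    *,
    optimal_tm=62.0,
    optimal_gc=0.5,
    optimal_len=22,
    penalty_tm=1.0,
    penalty_gc=0.2,
    penalty_len=0.5,
    penalty_tm_diff=1.0,
    penalty_dg=2.0,
    penalty_offtarget=20.0,
):
    """Return each penalty term and their sum, rounded to one decimal place."""
    terms = {
        "penalty_tm": abs(tm - optimal_tm) * penalty_tm,
        "penalty_tm_diff": abs(tm - pair_tm) * penalty_tm_diff,
        # Assumption: GC ratio difference is converted to percentage points.
        "penalty_gc": abs(gc - optimal_gc) * 100 * penalty_gc,
        "penalty_len": abs(length - optimal_len) * penalty_len,
        "penalty_dg": abs(dg) * penalty_dg,
        "penalty_off_target": off_target_count * penalty_offtarget,
    }
    terms["penalty"] = sum(terms.values())
    return {name: round(value, 1) for name, value in terms.items()}


# Feature values taken from the primers score example in this README.
fwd_terms = penalty(tm=39.4, gc=0.4, length=18, dg=-1.86, pair_tm=59.0)
rev_terms = penalty(tm=59.0, gc=0.5, length=18, dg=0.0, pair_tm=39.4)
print(fwd_terms["penalty"], rev_terms["penalty"])  # 49.9 24.6
```

Under these assumptions the totals agree with the penalties reported by primers score for the same two primers.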

Scoring Existing Primers

If you already have primers and want to see their features and penalty scores, use the primers score command. The command below scores a FWD and a REV primer against the sequence (-s) that they were created to amplify:

$ primers score GGTCTCAATGAGACAATA TTTCGTATGCTGACCTAG -s AATGAGACAATAGCACACACAGCTAGGTCAGCATACGAAATTT --json | jq
[
  {
    "seq": "GGTCTCAATGAGACAATA",
    "len": 18,
    "tm": 39.4,
    "tm_total": 55,
    "gc": 0.4,
    "dg": -1.86,
    "fwd": true,
    "off_target_count": 0,
    "scoring": {
      "penalty": 49.9,
      "penalty_tm": 22.6,
      "penalty_tm_diff": 19.6,
      "penalty_gc": 2,
      "penalty_len": 2,
      "penalty_dg": 3.7,
      "penalty_off_target": 0
    }
  },
  {
    "seq": "TTTCGTATGCTGACCTAG",
    "len": 18,
    "tm": 59,
    "tm_total": 59,
    "gc": 0.5,
    "dg": 0,
    "fwd": false,
    "off_target_count": 0,
    "scoring": {
      "penalty": 24.6,
      "penalty_tm": 3,
      "penalty_tm_diff": 19.6,
      "penalty_gc": 0,
      "penalty_len": 2,
      "penalty_dg": 0,
      "penalty_off_target": 0
    }
  }
]

Off-target Binding Sites

Usually, off-target binding sites should be avoided. In primers, off-target binding sites are those with <= 1 mismatch in the last 10 base pairs of the primer's 3' end. This definition is experimentally supported by:

Wu, J. H., Hong, P. Y., & Liu, W. T. (2009). Quantitative effects of position and type of single mismatch on single base primer extension. Journal of microbiological methods, 77(3), 267-275

By default, primers are checked for off-targets within the seq parameter passed to create(seq). But the primers can be checked against another sequence if it is passed to the optional offtarget_check argument (-t for CLI). This is useful when PCR'ing a subsequence of a larger DNA sequence like a plasmid.

from primers import create

seq = "AATGAGACAATAGCACACACAGCTAGGTCAGCATACGAAA"
seq_parent = "ggaattacgtAATGAGACAATAGCACACACAGCTAGGTCAGCATACGAAAggaccagttacagga"

# primers are checked for offtargets in `seq_parent`
fwd, rev = create(seq, offtarget_check=seq_parent)

Contributors

guzmanvig, jjti

primers's Issues

Released package on PyPI doesn't contain `requirements.txt`, which makes installation fail when not using the `whl` file

To reproduce:

wget https://pypi.io/packages/source/p/primers/primers-0.5.4.tar.gz
tar zxvf primer*
cd primers-0.5.4
ls

This gave me:

LICENSE
PKG-INFO
primers
primers.egg-info
README.md
setup.cfg
setup.py
tests

As you can see, there is no requirements.txt, but setup.py reads it.

This results in:

......
  Installing build dependencies ... done
  Running command Getting requirements to build wheel
  <string>:3: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
  Traceback (most recent call last):
    File "/tmp/primers-0.5.4/.venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
      main()
    File "/tmp/primers-0.5.4/.venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
      json_out['return_val'] = hook(**hook_input['kwargs'])
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/tmp/primers-0.5.4/.venv/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
      return hook(config_settings)
             ^^^^^^^^^^^^^^^^^^^^^
    File "/tmp/pip-build-env-hoe006rw/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
      return self._get_build_requires(config_settings, requirements=['wheel'])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/tmp/pip-build-env-hoe006rw/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
      self.run_setup()
    File "/tmp/pip-build-env-hoe006rw/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 487, in run_setup
      super().run_setup(setup_script=setup_script)
    File "/tmp/pip-build-env-hoe006rw/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 311, in run_setup
      exec(code, locals())
    File "<string>", line 10, in <module>
  FileNotFoundError: [Errno 2] No such file or directory: 'requirements.txt'
  error: subprocess-exited-with-error
......

The reason I don't use the wheel is that I am packaging it for the bioconda conda channel. I am patching it as a temporary workaround.

Rename some of the parameters

opt_tm: float = 62.0
opt_gc: float = 0.5
opt_len: int = 22
penalty_tm: float = 1.0
penalty_gc: float = 3.0
penalty_len: float = 1.0
penalty_tm_diff: float = 1.0
penalty_dg: float = 2.0
penalty_offtarget: float = 20.0

to

optimal_tm: float = 62.0
optimal_gc: float = 0.5
optimal_len: int = 22
penalty_tm_delta: float = 1.0 # delta from optimal
penalty_gc_delta: float = 3.0
penalty_len_delta: float = 1.0
penalty_dg_delta: float = 2.0
penalty_offtarget: float = 20.0
penalty_pair_tm_delta: float = 1.0 # delta of tm between the pair's primers

How to create a pair of primers to amplify a plasmid?

Hello,

I am looking for guidance on how to design a pair of primers to amplify a specific plasmid sequence. I need to ensure that one primer extends downstream (forward primer) and the other extends upstream (reverse primer) from a specified starting position. The primers do not need to be of the same length. Can I achieve this using primers?

.seq property of reverse primer is str type while forward primer is Bio.Seq type

This is not a dealbreaker, but having the reverse complement method

def _rc(seq: str) -> str:

return str means that the underlying seq property for the reverse primer is also a string. This differs from the seq property of the forward primer, which is a Bio.Seq type.
I think the seq properties for both primers should be either a str or a Bio.Seq type.
It's not the end of the world, but it has made dealing with this property a little frustrating.

Standalone tool to evaluate PCR quality

It would be great to be able to score a PCR based on the primers that I specify. Something like this:

primers -t {template} -oligo1 {oligo1_seq} -oligo2 {oligo2_seq}

Thanks,
Joe

Error when running off-target example

I am getting the following error when trying to run the off-target example:

% primers AATGAGACAATAGCACACACAGCTAGGTCAGCATACGAAA -t ggaattacgtAATGAGACAATAGCACACACAGCTAGGTCAGCATACGAAAggaccagttacagga
Traceback (most recent call last):
  File "/opt/anaconda3/bin/primers", line 8, in <module>
    sys.exit(run())
  File "/opt/anaconda3/lib/python3.9/site-packages/primers/main.py", line 19, in run
    fwd, rev = primers(
  File "/opt/anaconda3/lib/python3.9/site-packages/primers/primers.py", line 258, in primers
    fwd_primers = _primers(
  File "/opt/anaconda3/lib/python3.9/site-packages/primers/primers.py", line 314, in _primers
    ot = offtargets(seq, offtarget_check)
  File "/opt/anaconda3/lib/python3.9/site-packages/primers/offtargets.py", line 34, in offtargets
    for m in mutate(check_seq[s : s + 10]):
  File "/opt/anaconda3/lib/python3.9/site-packages/primers/offtargets.py", line 29, in mutate
    for m in mutate_map[c]:
KeyError: 'g'
