sgRNAcas9-AI

Help Topics

Introduction

sgRNAcas9-AI is a web application that allows a user to upload target sequences, set specifications for SpCas9 and its variants (eSpCas9(1.1), HypaCas9, evoCas9, SpCas9-VRQR, Sniper-Cas9, SpCas9-HF1, SpCas9-NG and xCas9), and receive candidate sgRNAs for the target.

Using large-scale test data, we developed 9 deep-learning-based computational models that accurately predict the activity of Cas9 and its variants at any target sequence in human and animal cells. Optimal candidates are suggested by considering both predicted on-target efficiency and off-target effects.

1. How to use the Cas9 Variants tool?

Description: This tool provides only an on-target scoring method for CRISPR small guide RNAs (sgRNAs). The on-target cutting efficiency scoring method was developed by the Xie lab.

License: MIT + file LICENSE

Step 1 : Enter the query target sequences in FASTA format

Notes: In FASTA format, the line before the nucleotide sequence, called the FASTA definition line, must begin with a greater-than symbol (">"), followed by a unique SeqID (sequence identifier).

The SeqID must be unique for each nucleotide sequence and should not contain any spaces. Please limit the SeqID to 25 characters or less. The SeqID can only include letters, digits, hyphens (-), underscores (_), periods (.), colons (:), asterisks (*), and number signs (#).
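The SeqID rules above can be checked before submitting a query. The following is a minimal sketch, not the validation the web server itself performs: the function name check_fasta is hypothetical, and it assumes that everything after ">" on a definition line is the SeqID.

```python
import re

# Characters allowed in a SeqID, per the rules above, with the 25-character limit.
SEQID_PATTERN = re.compile(r"[A-Za-z0-9\-_.:*#]{1,25}")

def check_fasta(text):
    """Return a list of problems found in the definition lines (empty if none)."""
    problems = []
    seen = set()
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.startswith(">"):
            continue  # sequence lines are not checked in this sketch
        seq_id = line[1:].strip()
        if not SEQID_PATTERN.fullmatch(seq_id):
            problems.append(f"line {number}: invalid SeqID '{seq_id}'")
        elif seq_id in seen:
            problems.append(f"line {number}: duplicate SeqID '{seq_id}'")
        seen.add(seq_id)
    return problems

if __name__ == "__main__":
    query = ">gene_1\nACGTACGT\n>my gene\nTTGACCA\n"
    for problem in check_fasta(query):
        print(problem)
```

A SeqID that contains a space, exceeds 25 characters, or repeats an earlier SeqID is reported with its line number.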

Step 2 : Select the type of Cas9 variants

The following nucleases are supported: SpCas9, eSpCas9(1.1), HypaCas9, evoCas9, SpCas9-VRQR, Sniper-Cas9, SpCas9-HF1, SpCas9-NG and xCas9.

Step 3 : Click the button below to scan for sgRNAs (sgRNAs with higher scores are predicted to be more efficient)

On-target score

The on-target score of an sgRNA sequence is predicted by a deep learning method. The higher the score, the better the predicted activity. Guide scores should typically be used in a relative rather than an absolute manner. For example, if one guide is scored 3.9 and another guide is scored 1.9, the first guide would be considered better than the second guide. The score is based purely on on-target activity and does not incorporate off-target activity.

Retrieve Jobs: After your job has been submitted, you receive a job ID that can be used with Retrieve Jobs.

Description of Output table

First column - sgR_ID - unique identifier for the sgRNA sequence

#_a_1, "a" stands for a PAM located on the anti-sense strand

#_s_1, "s" stands for a PAM located on the sense strand

Second column - sgR_seq+PAM - nucleotide sequence of the sgRNA sequence (including PAM)

Third column - sgR_seq - only nucleotide sequence of the sgRNA sequence

Fourth column - PAM_motif - nucleotide sequence of the PAM motif

Fifth column - strand - strand of the target sequence in the chromosome (+/positive, -/minus)

Sixth column - start - start position of the sgRNA target site in given query sequence

Seventh column - end - end position of the sgRNA target site in given query sequence

Eighth column - GC% - GC content of the sgRNA sequence (or protospacer)

Ninth column - 4Ts_motif - indicates whether the sgRNA contains 4 or more consecutive T bases; such sgRNAs should be discarded

Tenth column - sgR_efficiency - sgRNA efficiency calculated by sgRNAcas9-AI program
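The GC% and 4Ts_motif columns above can be reproduced for any candidate sequence. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the rounding and the exact value the server prints in the 4Ts_motif column are not specified here.

```python
def gc_percent(sg_seq):
    """Percentage of G and C bases in the sgRNA sequence (PAM excluded)."""
    seq = sg_seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def has_4ts_motif(sg_seq):
    """True if the sequence contains 4 or more consecutive T bases."""
    return "TTTT" in sg_seq.upper()

if __name__ == "__main__":
    guide = "GACGTTTTAGCCATGCAAGC"  # made-up 20-nt example
    print(gc_percent(guide), has_4ts_motif(guide))
```

Any run of 4 or more T bases contains "TTTT", so a single substring test is enough.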

2. How to use the Custom PAM tool?

Custom PAM is a web service to help users design the optimal sgRNAs/crRNAs for Cas9, Cas12, Cas13, and Cas14 systems with a minimal number of off-target effects.

Step 1 : Enter the query target sequences in FASTA format

Step 2 : Upload the custom reference genome (FASTA format, < 8Mb)

Step 3 : Describe the PAM and sgRNA requirements

Custom PAM:

Enter your PAM - enter a user-defined PAM

NGG - SpCas9 from Streptococcus pyogenes - direction: 3’

NRG - SpCas9 from Streptococcus pyogenes - direction: 3’

NNAGAAW - StCas9 from Streptococcus thermophilus - direction: 3’

NNNNGMTT - NmCas9 from Neisseria meningitidis - direction: 3’

NNGRRT - SaCas9 from Staphylococcus aureus - direction: 3’

NNNRRT - SaCas9 KKH variant - direction: 3’

NGG (reduced NAG binding) - SpCas9 D1135E variant - direction: 3’

NGCG - SpCas9 VRER variant - direction: 3’

NGAG - SpCas9 EQR variant - direction: 3’

NGAN-NGNG - SpCas9 VQR variant - direction: 3’

NGG - FnCas9 from Francisella novicida - direction: 3’

YG - FnCas9 RHA variant - direction: 3’

TTTN - AsCas12 from Acidaminococcus, LbCas12 from Lachnospiraceae - direction: 5’

TTN - FnCas12 from Francisella novicida strain U112 - direction: 5’

CTA - FnCas12 from Francisella novicida strain U112 - direction: 5’

TTN-CTA - FnCas12 from Francisella novicida strain U112 - direction: 5’

TTN - C2c1 from four major taxa: Bacilli, Verrucomicrobia, α-proteobacteria, and δ-proteobacteria - direction: 5’

Code	Base	Code	Base
A	Adenine	K	G or T
C	Cytosine	M	A or C
G	Guanine	B	C or G or T
T	Thymine	D	A or G or T
R	A or G	H	A or C or T
Y	C or T	V	A or C or G
S	G or C	N	any base
W	A or T	-	-

Note that sgRNAcas9-AI allows mixed bases to account for the degeneracy in PAM sequences.
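The mixed-base codes in the table above map naturally onto regular-expression character classes. The following is a minimal sketch of that idea, not the server's own implementation: the function names pam_to_regex and pam_matches are hypothetical, and hyphenated entries such as TTN-CTA are not handled.

```python
import re

# Mixed-base codes from the table above, expanded to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "GC", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def pam_to_regex(pam):
    """Translate a PAM written with mixed-base codes into a regex string."""
    parts = []
    for code in pam.upper():
        bases = IUPAC[code]  # raises KeyError for an unknown code
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)

def pam_matches(pam, site):
    """True if the concrete sequence `site` satisfies the degenerate PAM."""
    return re.fullmatch(pam_to_regex(pam), site.upper()) is not None

if __name__ == "__main__":
    print(pam_to_regex("NNGRRT"))
    print(pam_matches("NGG", "TGG"))
```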

Orientation: The PAM can be placed on either the 5’ or the 3’ side of the targeting sequence.

Length of sgRNA: Length of a targeting sequence (crRNA sequence), not including PAM.
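The PAM, orientation, and length settings together determine which sites in a query sequence are candidates. The following is a minimal sketch of such a scan over both strands, under stated assumptions: find_sites and its parameters are hypothetical names, the query contains only A, C, G and T, and the scan the server actually runs may differ.

```python
# Mixed-base codes expanded to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "GC", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def revcomp(seq):
    """Reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def matches(pam, site):
    """True if every base of `site` is allowed by the code at that PAM position."""
    return len(pam) == len(site) and all(b in IUPAC[p] for p, b in zip(pam, site))

def find_sites(seq, pam="NGG", orientation="3", length=20):
    """Return (targeting_sequence, pam_site, strand) tuples found on both strands."""
    pam = pam.upper()
    window = length + len(pam)
    hits = []
    for strand, s in (("+", seq.upper()), ("-", revcomp(seq.upper()))):
        for i in range(len(s) - window + 1):
            chunk = s[i:i + window]
            if orientation == "3":   # PAM follows the targeting sequence
                proto, site = chunk[:length], chunk[length:]
            else:                    # PAM precedes the targeting sequence
                site, proto = chunk[:len(pam)], chunk[len(pam):]
            if matches(pam, site):
                hits.append((proto, site, strand))
    return hits

if __name__ == "__main__":
    for hit in find_sites("ACGTACGTACGTACGTACGTTGG"):
        print(hit)
```

The minus strand is scanned by taking the reverse complement of the query, so the returned sequences always read 5’ to 3’ on their own strand.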

Step 4 : Choose an off-target setting

Number of mismatches (M): The maximum number of mismatches allowed in the "sgRNA" region when performing whole-genome alignment. 'N' positions in the PAM sequence are not counted as mismatched bases.
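The mismatch setting can be illustrated by comparing an sgRNA sequence with a genomic site of the same length. The following is a minimal sketch, not the server's alignment procedure: the function names are hypothetical, and only the "sgRNA" region is compared, with the PAM assumed to be checked separately against its degenerate pattern.

```python
def count_mismatches(guide, site):
    """Number of positions at which the guide and the genomic site differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(1 for g, s in zip(guide.upper(), site.upper()) if g != s)

def within_limit(guide, site, max_mismatches):
    """True if the site would be reported under the chosen mismatch setting M."""
    return count_mismatches(guide, site) <= max_mismatches

if __name__ == "__main__":
    guide = "GACGTTCTAGCCATGCAAGC"  # made-up example
    site = "GACGTTCTAGCCATGCATGA"   # differs from the guide at two positions
    print(count_mismatches(guide, site), within_limit(guide, site, 3))
```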

Step 5 : Click the button below to scan for sgRNA

Retrieve Jobs: After your job has been submitted, you receive a job ID that can be used with Retrieve Jobs.

Description of Output table

First column - sgR_ID - unique identifier for the sgRNA sequence

#_a_1, "a" stands for a PAM located on the anti-sense strand

#_s_1, "s" stands for a PAM located on the sense strand

Second column - sgR_seq+PAM - nucleotide sequence of the sgRNA sequence (including PAM)

Third column - PAM_motif - nucleotide sequence of the PAM motif

Fourth column - strand - strand of the target sequence in the chromosome (+/positive, -/minus)

Fifth column - start - start position of the sgRNA target site in given query sequence

Sixth column - end - end position of the sgRNA target site in given query sequence

Seventh column - sgR_seq - only nucleotide sequence of the sgRNA sequence

Eighth column - GC% - GC content of the sgRNA sequence (or protospacer)

Ninth column - 0M - number of perfectly matched sites (0 mismatches) in the genome. 0M = 1 indicates a unique on-target site in the genome; 0M = 0 indicates that there is no perfectly matched site in the genome; if 0M > 1, check whether the target gene is a multi-copy gene, in which case the sgRNA may target the same sequence in each copy; otherwise, the additional sites may be perfectly matched off-target sites.

Tenth column - 1M - number of off-target sites with 1 mismatched base

Eleventh column - 2M - number of off-target sites with 2 mismatched bases

Twelfth column - 3M - number of off-target sites with 3 mismatched bases

Thirteenth column - 4M - number of off-target sites with 4 mismatched bases

Fourteenth column - 5M - number of off-target sites with 5 mismatched bases
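The reading of the 0M column described above can be written as a small helper for post-processing a downloaded result table. This is only a sketch of that rule: the function name interpret_0m and the returned wording are hypothetical.

```python
def interpret_0m(count):
    """Describe what a value in the 0M column means for one sgRNA."""
    if count < 0:
        raise ValueError("0M cannot be negative")
    if count == 0:
        return "no perfectly matched site in the genome"
    if count == 1:
        return "unique on-target site"
    # More than one perfect match: either a multi-copy gene or true off-targets.
    return "multiple perfect matches: check for a multi-copy gene or off-target sites"

if __name__ == "__main__":
    for value in (0, 1, 3):
        print(value, "->", interpret_0m(value))
```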

3. Frequently Asked Questions

3.1 Understanding Cas9 and its variants

Several SpCas9 variants have been developed to improve the enzyme’s specificity or to alter or broaden its protospacer-adjacent motif (PAM) compatibility, but selecting the optimal variant for a given target sequence and application remains difficult. To build computational models that predict the sequence-specific activity of SpCas9 variants (eSpCas9(1.1), HypaCas9, evoCas9, SpCas9-VRQR, Sniper-Cas9, SpCas9-HF1, SpCas9-NG and xCas9), we first assessed their cleavage efficiency at a large number of target sequences downloaded from online resources. Using these data, we developed 9 deep-learning-based computational models that accurately predict the activity of these variants at any target sequence.

3.2 Is the PAM sequence part of the sgRNA construct?

The PAM sequence is located on the non-complementary strand. In other words, it is on the DNA strand that carries the same sequence as the sgRNA targeting sequence. The PAM sequence should not be included in the design of the sgRNA.