Genotyping Analysis

Technical Details

We sequence each sample to very high depth with Oxford Nanopore long-read sequencing, then generate reads using the latest basecalling and polishing software. We then cluster these reads into phased alleles.

  • We construct an amplification-free long-read sequencing library using the newest v14 library prep chemistry.
  • We sequence the library with a primer-free protocol using the most accurate R10.4.1 flow cells.
  • We remove noise, phase alleles, and report frequencies only for alleles above 10%. Frequencies are adjusted so that reported alleles total 100%.
  • If we detect a sequence at a frequency below 10%, the pipeline assumes it is likely due to PCR or sequencing error and assigns those reads to the most similar “major” allele (one with an allele frequency above 10%).
    • Example: When a major allele is identified at 48% frequency and a closely related sequence is identified at 2%, the lower-frequency sequence is merged with the major allele. The final reported frequency of the major allele is therefore 50%.
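
The merge-and-renormalize step described above can be sketched in Python. This is a minimal illustration, not the production pipeline: the function names, the use of edit distance as the similarity measure, the treatment of an allele at exactly 10% as minor, and the toy sequences are all assumptions made for the example.

```python
from typing import Dict

THRESHOLD = 0.10  # assumption: alleles at or below this frequency count as minor


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def merge_minor_alleles(freqs: Dict[str, float],
                        threshold: float = THRESHOLD) -> Dict[str, float]:
    """Fold each minor allele into its closest major allele, then renormalize."""
    major = {seq: f for seq, f in freqs.items() if f > threshold}
    if not major:
        raise ValueError("no allele exceeds the threshold")
    for seq, f in freqs.items():
        if seq in major:
            continue
        # Hypothetical similarity rule: smallest edit distance wins.
        closest = min(major, key=lambda m: edit_distance(seq, m))
        major[closest] += f
    total = sum(major.values())
    # Rescale so the reported alleles sum to 100%.
    return {seq: f / total for seq, f in major.items()}


# Toy input mirroring the example above: 48% major, 2% near-identical minor.
observed = {
    "ACGTACGTAC": 0.48,
    "ACGTACGTAT": 0.02,  # one mismatch from the allele above
    "ACGTTTTTAC": 0.50,
}
merged = merge_minor_alleles(observed)
```

With this toy input, the 2% sequence is folded into the 48% allele, so two alleles are reported at roughly 50% each.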

Our pipeline phases variants into assembled sequences called alleles. If you provide a reference sequence, we align these alleles to your reference.

In some samples, your provided reference sequence may be identical to one of the observed alleles (for example, one allele is WT and one is a mutant). In that case, we label this allele with your reference name, e.g., Allele 1 (your reference sequence name).

In other samples, the provided reference may not match any of the observed alleles, for example when all alleles are mutants. In these cases, your provided reference appears only in the nucleotide and amino acid comparison view.

Finally, if you choose not to provide a reference sequence, we will align all sequences to the sequence with the highest frequency (denoted as Allele 1).

We have tested the genotyping analysis at a range of different allelic ratios for SNPs and small indels. In general, we find a strong correlation between experimentally prepared ratios and pipeline predictions (R² > 0.98). If you notice significant discrepancies between our analysis and your expected results, please reach out to us at support@plasmidsaurus.com. To learn more about our data deliverables, see the Genotyping Analysis product page.
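
If you want to run a similar check on your own mixing experiments, the sketch below compares expected and reported allele fractions. It assumes R² means the squared Pearson correlation, which the text above does not specify, and the numbers are placeholders for illustration only, not measured data.

```python
from typing import Sequence


def r_squared(expected: Sequence[float], observed: Sequence[float]) -> float:
    """Squared Pearson correlation between two equal-length series."""
    if len(expected) != len(observed) or len(expected) < 2:
        raise ValueError("need two equal-length series with at least two points")
    n = len(expected)
    mean_x = sum(expected) / n
    mean_y = sum(observed) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(expected, observed))
    var_x = sum((x - mean_x) ** 2 for x in expected)
    var_y = sum((y - mean_y) ** 2 for y in observed)
    if var_x == 0 or var_y == 0:
        raise ValueError("series must not be constant")
    return cov * cov / (var_x * var_y)


# Placeholder values: prepared mixing ratios vs. reported allele frequencies.
prepared = [0.10, 0.25, 0.50, 0.75, 0.90]
reported = [0.12, 0.24, 0.51, 0.73, 0.91]
score = r_squared(prepared, reported)
```

A score well below the range quoted above on your own controls may be worth raising with support.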
 

Premium PCR and Genotyping Analysis return several of the same core data files.

  • FASTQ files: These contain the raw reads generated from your sequencing run in FASTQ format.
  • Histograms and AB1 files: We provide read-length and coverage histograms as well as chromatogram files for each amplicon in your sample. Amplicons of substantially different length are managed separately.
  • Virtual gel: The virtual gel displays the distribution of read lengths across your sample and provides a quick check on the length of amplicons present.

How Genotyping Analysis output files differ from Premium PCR

Unlike Premium PCR, Genotyping Analysis does not generate or report a single consensus sequence. Instead, reads are segregated into distinct allelic groups that are analyzed separately. As a result, you will not receive the files associated with generating a consensus that accompany Premium PCR samples: the GenBank file, interactive feature map, comparison-results.tsv, and FASTA assembly files.

Genotyping Analysis files

Genotyping Analysis returns the following files, which are not returned by the Premium PCR service:

  • Per-base data (.tsv): Reports base calls and their frequencies for each allele.
  • Allele count tables (.tsv): Summarizes the total read counts supporting each reported allele.
  • FASTA files: FASTA files for your reference sequence and each additional reported allele. 
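
FASTA is a plain-text format, so the reference and allele sequences can be loaded with a few lines of Python. The sketch below is a minimal parser; the record names in the example (`allele_1`, `allele_2`) are hypothetical and may not match the names used in your delivered files.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each record name (first word after '>') to its full sequence."""
    records: Dict[str, str] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].split()
            if not header:
                raise ValueError("FASTA header has no record name")
            name = header[0]
            records[name] = ""
        elif name is None:
            raise ValueError("sequence data before the first header")
        else:
            # Sequences may wrap across several lines; join them.
            records[name] += line
    return records


# Hypothetical two-record file, given as a list of lines.
example = [">allele_1 example description", "ACGT", "TTGA", ">allele_2", "ACGA"]
alleles = parse_fasta(example)
```

To read a file from disk, pass an open file handle instead of a list, for example `parse_fasta(open(path))`, where `path` points to one of your FASTA files.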

If your sample includes amplicons that differ significantly in length, for example due to a large insertion or deletion on one allele, the pipeline analyzes each amplicon separately and reports allele frequencies for each amplicon separately. For example, for a simple heterozygous locus with a large indel, you will see two amplicon maps, one for each amplicon size, each with one allele at 100% allelic frequency.
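
The effect of per-amplicon reporting can be illustrated with a short sketch. It assumes reads have already been assigned to an amplicon and an allele; how the pipeline makes that assignment is not covered here. The labels and read counts are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def per_amplicon_frequencies(
    counts: List[Tuple[str, str, int]]
) -> Dict[str, Dict[str, float]]:
    """Normalize allele read counts within each amplicon, not across the sample."""
    totals: Dict[str, int] = defaultdict(int)
    for amplicon, _allele, n in counts:
        totals[amplicon] += n
    result: Dict[str, Dict[str, float]] = {}
    for amplicon, allele, n in counts:
        if totals[amplicon] == 0:
            raise ValueError(f"amplicon {amplicon!r} has no reads")
        result.setdefault(amplicon, {})[allele] = n / totals[amplicon]
    return result


# Hypothetical heterozygous locus with a large deletion on one allele:
# the two alleles produce amplicons of very different length.
counts = [
    ("amplicon_long", "wild_type", 900),
    ("amplicon_short", "deletion", 600),
]
freqs = per_amplicon_frequencies(counts)
```

Each amplicon contains a single allele here, so each is reported at 100% even though the two amplicons have different read counts.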

Genotyping Analysis and Premium PCR samples are processed using the native barcoding approach from Oxford Nanopore Technologies. This approach ligates the sequencing adapter to the ends of the molecules, preserving their full length, which makes them extremely easy to tell apart: because the sequencing platform reads single molecules, each read corresponds to one specific molecule. In contrast, we use Oxford Nanopore’s rapid barcoding approach for Linear/PCR. You can learn more about this approach in our FAQ titled “What’s the difference between how you prepare Linear/PCR and Premium PCR samples?”

Genotyping Analysis is initially offered with a target of 3,000 reads (our “Standard” category). The exact number of reads used to identify and count alleles varies based on our internal quality control metrics, but is typically over a thousand. In the future, we plan to offer Genotyping Analysis at higher sequencing depths.

You will receive FASTA files and TSV files of mutations for each allele.

Results for the Genotyping Analysis service are typically returned within 1-2 business days.


This is the average time it takes to return data to customers who are submitting orders from the US, UK and EU. Turnaround times for customers in APAC may be slightly longer due to the time required to ship samples to our lab in Singapore.
 

Low sample quality can also negatively impact turnaround time and the quality of your results. Please read the sample prep instructions carefully to make sure your samples meet input requirements. Because shipping logistics and sample quality vary in ways that are outside our control, we cannot guarantee turnaround times or sample success rates.