Genotyping Analysis

Technical Details

We sequence each sample to very high depth with Oxford Nanopore long reads, generating reads with the latest basecalling and polishing software. We then cluster these reads into phased alleles.

  • We construct an amplification-free long-read sequencing library using the newest v14 library prep chemistry.
  • We sequence the library with a primer-free protocol using the most accurate R10.4.1 flow cells.
  • We remove noise, phase alleles, and report frequencies only for alleles >5%. Frequencies are adjusted so that reported alleles total 100%.
  • If we detect a sequence at a frequency below 5%, the tool assumes it is likely due to PCR or sequencing error and assigns those reads to the “major” allele (one that has >5% allele frequency) that is most similar in sequence.
    • Example: When a major allele is identified at 48% frequency and a closely related sequence is identified at 2%, the lower-frequency sequence is merged with the major allele. The final reported frequency of the major allele is therefore 50%.
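
The merging and renormalization described above can be sketched roughly as follows. This is a minimal illustration and not the production pipeline: the function names, the use of `difflib.SequenceMatcher` as the similarity measure, and treating exactly 5% as a major allele are all assumptions made for the example. It also ignores the case where a divergent low-frequency sequence is left unassigned instead of being merged.

```python
from difflib import SequenceMatcher

MAJOR_THRESHOLD = 0.05  # assumption: alleles at or above 5% count as "major"


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score in [0, 1]; the real metric is not documented here."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def merge_minor_alleles(counts: dict[str, int]) -> dict[str, float]:
    """Fold sub-threshold sequences into the closest major allele.

    `counts` maps each candidate allele sequence to its read count.
    Returns the reported frequency of each major allele, summing to 1.0.
    """
    total = sum(counts.values())
    major = {seq: n for seq, n in counts.items() if n / total >= MAJOR_THRESHOLD}
    if not major:
        return {}  # nothing reaches the threshold, so no alleles are called

    merged = dict(major)
    for seq, n in counts.items():
        if seq in major:
            continue
        # Assign the minor sequence's reads to the most similar major allele.
        closest = max(major, key=lambda m: similarity(seq, m))
        merged[closest] += n

    merged_total = sum(merged.values())
    return {seq: n / merged_total for seq, n in merged.items()}


# Hypothetical read counts mirroring the 48% + 2% example above.
example_counts = {
    "ACGTACGTAC": 48,  # major allele
    "ACGTACGTAA": 2,   # one-base variant of the allele above, at 2%
    "TTTTGGGGCC": 50,  # second major allele
}
frequencies = merge_minor_alleles(example_counts)
print(frequencies["ACGTACGTAC"])  # 0.5
```

Because the 2% sequence differs from the first allele by a single base, its reads are added to that allele, giving 50% after renormalization.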

Genotyping Analysis is designed to identify discrete alleles present at frequencies ≥5%. In some cases, such as highly variable mixtures or repetitive sequences, it may not be possible to resolve distinct alleles or accurately estimate their frequencies. This can occur with high-complexity libraries or pooled CRISPR transfections. Genotyping Analysis is also limited to analyzing one locus at a time; it doesn’t support multiplexing different amplicons.
 
If we are unable to generate a Genotyping Analysis readout, we will attempt to process your FASTQ files through our Premium PCR pipeline to produce a consensus sequence and associated outputs. If a consensus sequence cannot be generated, we will provide the FASTQ files so you can perform a custom analysis.
 

Unassigned reads include low-frequency sequences (<5%), reads that could not be confidently assigned to a called allele, divergent alleles that could not be grouped with a major allele, and reads failing quality filters.

For Genotyping Analysis, a “failure” means we were unable to generate sequencing data of sufficient quality or quantity to process through our analysis pipelines. However, sample complexity can also result in the inability to call alleles and, for this reason, is not automatically considered a failure.

Even if your Genotyping Analysis sample doesn’t fail, you may still qualify for a rerun if the read depth falls below a minimum threshold. See the FAQ “Does my failed sample qualify for a re-run?” for more information.

To deliver our extremely fast turnaround times, we do not perform extensive pre-sequencing QC of your samples. However, by far the most common reasons for sample failure are:

  • Sample DNA concentration is lower than our specifications
    • The most common cause of this is using a NanoDrop to quantify DNA concentration. We strongly recommend using a Qubit or an equivalent fluorometric approach.
    • You may see evidence of this failure mode in the low amount of total data reported in the raw read length histogram.
  • Samples contain fragmented linear/PCR products and/or fragmented genomic DNA

To achieve optimal sequencing results, please follow our recommended sample prep instructions.

We have tested Genotyping Analysis at a range of different allelic ratios for SNPs and small indels. In general, we find a strong correlation between experimentally prepared ratios and pipeline predictions (R² > 0.98). If you notice significant discrepancies between our analysis and your expected results, please reach out to us at support@plasmidsaurus.com. To learn more about our data deliverables, see the Genotyping Analysis product page.
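
If you prepare your own control mixtures, a comparison like the one described above could be computed along these lines. This is only a sketch: it assumes R² means the square of the Pearson correlation between expected and reported frequencies, which may not be exactly how the figure above was calculated, and `r_squared` is a hypothetical helper name.

```python
def r_squared(expected: list[float], observed: list[float]) -> float:
    """Square of the Pearson correlation between two equal-length series."""
    if len(expected) != len(observed) or len(expected) < 2:
        raise ValueError("need two series of equal length with at least 2 points")

    n = len(expected)
    mean_x = sum(expected) / n
    mean_y = sum(observed) / n

    # Sums of squares and cross-products around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(expected, observed))
    sxx = sum((x - mean_x) ** 2 for x in expected)
    syy = sum((y - mean_y) ** 2 for y in observed)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a series is constant")

    return (sxy * sxy) / (sxx * syy)
```

Pass in the allele frequencies you prepared as `expected` and the frequencies from your allele count tables as `observed`; values close to 1 indicate close agreement.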
 

Premium PCR and Genotyping Analysis return several of the same core data files.

  • FASTQ files: These contain the raw reads generated from your sequencing run in FASTQ format.
  • Histograms:  For each sample, we provide coverage histograms for the total sequencing run. 
  • Virtual gel: The virtual gel displays the distribution of read lengths across your sample and provides a quick check on the length of amplicons present.
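
If you want to inspect read lengths directly from the FASTQ files, a minimal sketch in plain Python is shown below. It assumes standard four-line FASTQ records. The function names and the default bin size are illustrative choices and are not part of our deliverables.

```python
import io
from collections import Counter
from typing import Iterator, TextIO


def read_lengths(handle: TextIO) -> Iterator[int]:
    """Yield the length of each read in a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between or after records
        if not header.startswith("@"):
            raise ValueError(f"unexpected FASTQ header: {header!r}")
        sequence = handle.readline().rstrip("\r\n")
        handle.readline()  # '+' separator line
        handle.readline()  # quality string
        yield len(sequence)


def length_histogram(handle: TextIO, bin_size: int = 100) -> dict[int, int]:
    """Count reads per length bin; keys are the lower edge of each bin."""
    bins = Counter((n // bin_size) * bin_size for n in read_lengths(handle))
    return dict(sorted(bins.items()))


# Tiny in-memory example with two made-up reads of 4 and 10 bases.
demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGTACGTAC\n+\nIIIIIIIIII\n")
print(length_histogram(demo, bin_size=5))  # {0: 1, 10: 1}
```

For a gzip-compressed file, the same functions should work on a handle opened with `gzip.open(path, "rt")`.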

How Genotyping Analysis output files differ from Premium PCR

Unlike Premium PCR, Genotyping Analysis does not generate or report consensus assemblies. Instead, reads are segregated into distinct allelic sequences that are analyzed separately. As a result, you will not receive the consensus GenBank file, FASTA assemblies, chromatograms, interactive feature map, or comparison-results.tsv that are included with Premium PCR samples.

Genotyping analysis files

Genotyping analysis returns unique files that are not returned in the Premium PCR service:

  • Variants files (.tsv): Reports the variants and their frequencies for each allele.
  • Allele count tables (.tsv): Tables summarizing the total read counts and frequencies for each allele.
  • FASTA files (.fa): Files of your reference sequence and each reported allele.
  • Chromatogram (.ab1): Trace files for each allele.
  • GenBank file (.gbk): Annotated allelic sequence files.

If your sample includes alleles that are very dissimilar (<50% sequence identity), the pipeline will categorize reads from the lower-frequency sequence(s) as unassigned. If there is some homology between a sequence and the dominant, highest-frequency allele but our tool cannot confidently align the sequences, the alleles will be displayed in separate groups.
 

Genotyping Analysis and Premium PCR samples are processed using the native barcoding approach from Oxford Nanopore Technologies. This ligates the sequencing adapter to the ends of the molecules, preserving their full length, which makes them extremely easy to tell apart: the single-molecule nature of the sequencing platform means that each read corresponds to one specific molecule. In contrast, we use Oxford Nanopore’s rapid barcoding approach for Linear/PCR. You can learn more about this approach in our FAQ titled “What’s the difference between how you prepare Linear/PCR and Premium PCR samples?”

Genotyping Analysis is available in three tiers with target read depths of 3,000, 6,000, and 12,000 reads (“Standard,” “Big,” and “Huge”). While we cannot guarantee a particular depth, the vast majority of our customers receive reads at or exceeding these target read depths.

Please note that higher sequencing depth requires submission of additional sample mass and volume, as outlined on our order page.
 

Results for the Genotyping Analysis service are typically returned within 1-2 business days.


This is the average time it takes to return data to customers who are submitting orders from the US, UK and EU. Turnaround times for customers in APAC may be slightly longer due to the time required to ship samples to our lab in Singapore.
 

Low sample quality can also negatively impact turnaround time and the quality of your results. Please read the sample prep instructions carefully to make sure your samples meet input requirements. Because variability in shipping logistics and sample quality is outside our control, we cannot guarantee turnaround times or sample success rates.

We designed our analysis to be compatible with amplicons generated under a broad range of PCR conditions, so no specific amplification protocol is absolutely required.

However, we have evaluated polymerases and PCR conditions that reduce PCR-mediated recombination, which can complicate variant phasing. That is why for most genotyping applications, we recommend starting with a highly processive enzyme such as PrimeSTAR GXL DNA Polymerase from Takara, using a low cycle number (15 cycles) and a low template input mass (0.1 ng). Please refer to our technical documentation for additional details.

For Genotyping Analysis, we aim to provide target read depths of 3,000, 6,000, or 12,000 reads for our “Standard,” “Big,” and “Huge” tiers. However, the exact number of reads depends on the quantity and purity of the submitted sample. You will qualify for a complimentary rerun if you receive fewer than 1,000, 2,000, or 4,000 reads for the “Standard,” “Big,” or “Huge” tier, respectively. Reruns can be ordered for qualifying samples at the bottom of the results page.
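
The thresholds above can be summarized as a simple lookup. This is an illustrative sketch only: the `TIERS` table and the `qualifies_for_rerun` function are hypothetical names, and the check covers read depth alone, not the sample quality review that a rerun may also require.

```python
# Target depth and rerun cutoff per tier, taken from the paragraph above.
TIERS = {
    "Standard": {"target_reads": 3_000, "rerun_below": 1_000},
    "Big": {"target_reads": 6_000, "rerun_below": 2_000},
    "Huge": {"target_reads": 12_000, "rerun_below": 4_000},
}


def qualifies_for_rerun(tier: str, reads_received: int) -> bool:
    """Return True if the read count falls under the tier's rerun cutoff."""
    if tier not in TIERS:
        raise KeyError(f"unknown tier: {tier!r}")
    return reads_received < TIERS[tier]["rerun_below"]


print(qualifies_for_rerun("Standard", 850))  # True
print(qualifies_for_rerun("Big", 2_000))     # False
```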

We will evaluate whether your sample quality and quantity permits rerunning the sample (and we may also ask you to provide a reference sequence).

Sample quality checks may require any of the following: 

  • Quantifying concentration and/or purity
  • Cleaning up the sample using the Bead Clean Up Protocol ($5 per sample)
  • Normalizing concentration

Please note that sequencing reruns may require the above quality checks and, importantly, do not guarantee additional sequencing depth. Please see our “why did my amplicon sequencing fail” FAQ for more information.