
Lapidary DNA-protein matches for fastq read files




This site uses Lapidary to identify amino acid sequences in paired read files.

Samuel Bloomfield¹, Aldert Zomer², Alison Mather¹

¹ Quadram Institute Bioscience, Research Park, Rosalind Franklin Rd, Norwich NR4 7UQ, United Kingdom

² Utrecht University, Heidelberglaan 8, 3584 CS Utrecht, Netherlands


Introduction:

Genome and metagenome comparisons rely on identifying the genetic elements that differ between samples and those that the samples share. These genetic elements can be identified by assembling sequenced reads and searching the assembly for the element, or by aligning the reads to the nucleotide sequence of a reference genetic element. The first approach relies on the complete assembly of the genetic element of interest, and the second relies on a reference sequence represented in nucleotides. The first approach is particularly challenging with metagenome data, where assemblies are often highly fragmented. A common approach with metagenomes is therefore to map reads against reference nucleotide sequences and extract the depth and coverage for those reference sequences. However, no software currently exists to identify genetic elements in metagenomes using DNA-protein alignments. We have developed the software Lapidary to identify the presence, depth, coverage and most likely sequence of amino acid sequences from genome and metagenome read files. We tested the effectiveness of the method against simulated, genomic and metagenomic read datasets. Lapidary is more sensitive than assembly methods for metagenomic data, which often have fragmented assemblies, but is less sensitive when assemblies are more complete, as is the case with genomic data.
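To make the terms "depth" and "coverage" concrete, here is a minimal Python sketch of how they could be computed for one reference amino acid sequence from a set of read alignments. It is an illustration only, not Lapidary's implementation: the function name, the 0-based half-open interval convention, and the definitions used (coverage as the fraction of reference positions hit by at least one alignment, depth as the mean number of alignments per position) are all assumptions made for this example.

```python
from typing import Iterable, Tuple


def coverage_and_depth(
    protein_length: int, alignments: Iterable[Tuple[int, int]]
) -> Tuple[float, float]:
    """Return (coverage, mean depth) for one reference protein.

    Hypothetical helper: each alignment is a (start, end) pair in amino
    acid coordinates, 0-based and half-open, i.e. it covers start..end-1.
    """
    if protein_length <= 0:
        raise ValueError("protein_length must be positive")

    # Count how many alignments overlap each reference position.
    per_position = [0] * protein_length
    for start, end in alignments:
        # Clip intervals that run past either end of the reference.
        for pos in range(max(start, 0), min(end, protein_length)):
            per_position[pos] += 1

    covered = sum(1 for depth in per_position if depth > 0)
    coverage = covered / protein_length
    mean_depth = sum(per_position) / protein_length
    return coverage, mean_depth


# Two overlapping alignments on a 10-residue reference.
example_coverage, example_depth = coverage_and_depth(10, [(0, 5), (3, 8)])
```

In this example, positions 0 to 7 are hit by at least one alignment and positions 8 and 9 by none, so the coverage is 0.8. The two alignments span ten residues in total, so the mean depth is 1.0.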


Methods:

Lapidary https://github.com/samuelbloomfield/lapidary

Terms and conditions



Instructions:

Please fill in the sample name (no spaces; only letters, numbers and dashes are allowed) and add your email address. Select the amino acid fasta file and the forward and reverse fastq.gz files from your computer, and then press "Upload and run".
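The sample name rule above can be checked locally before uploading. The following is a minimal Python sketch, assuming the rule means one or more ASCII letters, digits or dashes; the function name and pattern are hypothetical and are not taken from the site's own code.

```python
import re

# Assumed reading of the rule: one or more ASCII letters, digits or dashes.
SAMPLE_NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def is_valid_sample_name(name: str) -> bool:
    """Return True if the whole name uses only letters, digits and dashes."""
    return SAMPLE_NAME_PATTERN.fullmatch(name) is not None
```

For instance, `is_valid_sample_name("sample-01")` is accepted, while `"sample 01"` (contains a space) and `"sample_01"` (contains an underscore) are rejected.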




Metadata










AA database:



Forward:

Reverse:







Copyright Quadram Institute Bioscience 2024