Traditional 'bottom-up' proteomics approaches use proteolytic digestion, LC-MS/MS and database searching to elucidate peptide identities and their parent proteins. Protein sequences absent from the database cannot be identified, and even if present in the database, complete sequence coverage is rarely achieved even for the most abundant proteins in the sample. Thus, sequencing of unknown proteins such as antibodies or constituents of metaproteomes remains a challenging problem. To date, there is no available method for full-length protein sequencing, independent of a reference database, in high throughput. Here we present Database Independent Protein Sequencing (DiPS), a method for unambiguous, rapid, database independent, full-length protein sequencing. The method is a novel combination of non-enzymatic, semi-random cleavage of the protein, LC-MS/MS analysis, peptide de novo sequencing, extraction of peptide tags, and their assembly into a consensus sequence using an algorithm named "Peptide Tag Assembler" (pTA). As proof-of-concept, the method was applied to samples of three known proteins representing three size classes and to a previously un-sequenced, clinically relevant, monoclonal antibody. Excluding leucine/isoleucine and glutamic-acid/deamidated glutamine ambiguities, end-to-end, full-length de novo sequencing was achieved with 99-100% accuracy for all benchmarking proteins and the antibody light chain. Accuracy of the sequenced antibody heavy chain, including the entire variable region, was also 100% but there was a 23 residue gap in the constant region sequence.
from #AlexandrosSfakianakis via Alexandros G.Sfakianakis on Inoreader http://ift.tt/2ntKcII