NOmESS is a homology-driven assembly tool that creates a NOn-redundant protEin Sequence Set for mass spectrometry. The software helps to overcome the limitations of proteomic studies of poorly characterized organisms. The automated pipeline takes all available amino acid sequences from one organism as input (e.g. translated DNA sequences derived from ESTs or RNA deep sequencing) and aligns them to those of a closely related, fully sequenced organism. In a step-by-step process, the input sequences are clustered, assembled and joined; finally, representatives of each cluster are selected, resulting in a non-redundant reference set that represents the maximal available amino acid sequence information. This set can then be used as the search database in a MaxQuant run.
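The clustering and representative-selection idea can be illustrated with a minimal Python sketch. This is not the NOmESS implementation (which is written in C# and also assembles and joins sequences); it only assumes that each input sequence is assigned to the reference protein of its best BLASTp hit and that the longest member of each cluster is kept. The function names, the use of BLAST tabular output (`-outfmt 6`) and the "longest sequence wins" rule are all assumptions made for illustration.

```python
from collections import defaultdict


def parse_fasta(text):
    """Parse FASTA-formatted text into a dict {sequence id: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The id is the first whitespace-delimited token of the header.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def best_hits(blast_tabular):
    """Map each query id to the subject id with the highest bit score.

    Expects the default 12-column BLAST tabular layout (-outfmt 6),
    where column 1 is the query, column 2 the subject and column 12
    the bit score.
    """
    best = {}
    for line in blast_tabular.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def select_representatives(sequences, hits):
    """Cluster queries by their best-hit subject; keep the longest per cluster.

    Assumption for illustration only: ties in length are broken by
    sequence id so that the result is deterministic.
    """
    clusters = defaultdict(list)
    for query, subject in hits.items():
        if query in sequences:
            clusters[subject].append(query)
    return {
        subject: max(members, key=lambda m: (len(sequences[m]), m))
        for subject, members in clusters.items()
    }
```

Given the input sequences as FASTA text and the BLASTp hits against the reference organism as tabular text, `select_representatives(parse_fasta(fasta_text), best_hits(blast_text))` returns one input sequence id per reference protein.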
The software is implemented in C# and is freely available. The executable and data (including the data sets from the publication) can be downloaded here. We provide amino acid sequences in FASTA format from Xenopus laevis and its homologous organism Xenopus tropicalis, as well as sequences from Coturnix japonica and its homologous organism Gallus gallus. For both pairs of organisms we also include the NOmESS results. Running NOmESS requires BLASTp and cd-hit to be installed. The NOmESS source code can be downloaded from GitHub.
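Before running the pipeline it can be useful to confirm that both external tools are installed. The following Python sketch checks whether the executables can be found on the `PATH`. The executable names `blastp` and `cd-hit` are the usual ones for these tools, but they may differ on some systems, and how NOmESS itself locates the programs is not described here, so treat this as an assumption.

```python
import shutil


def missing_tools(tool_names):
    """Return the names from tool_names that cannot be found on PATH."""
    return [name for name in tool_names if shutil.which(name) is None]


# Assumed executable names; adjust them if your installation differs.
REQUIRED_TOOLS = ["blastp", "cd-hit"]

if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```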
Temu, T., Mann, M., Raeschle, M. and Cox, J. Homology-driven assembly of Non-redundant protein Sequence Sets (NOmESS) for mass spectrometry. Bioinformatics, 2016, btv756.