COMPare Analysis of Sequences with Software
The COMprehensive Protein Allergen REsource (COMPARE) database is a peer-reviewed, publicly accessible database of allergen amino acid sequences. A key feature of the COMPARE allergen database is that it supports the assessment of potential allergy health hazards through comparative sequence software.
COMPARE now includes built-in comparative sequence search software: COMPASS (COMPare Analysis of Sequences with Software), which incorporates the open-source FASTA software package (FASTA v36). On this tool page, COMPARE users can search the COMPARE database through the website, in real time, to produce amino acid sequence alignments (between two or more amino acid sequences). The tool is oriented to regulatory safety review of novel foods and feeds, and to the implementation of criteria that help establish whether relevant alignments have occurred. The criteria are oriented to identifying amino acid sequence alignments that might be observed for any sequence compared with the COMPARE database.
COMPASS Intended use:
The built-in COMPASS tool is intended to let users assess the degree of sequence similarity shared between an amino acid sequence of interest ("query sequence") and allergen sequences within the COMPARE database. To this end, the user can perform three different types of sequence searches, using either default parameters or manually adjusted parameters.
The selection of search features available in the database tool was informed by FAO/WHO (2001) and CODEX Alimentarius guidelines on the testing of genetically modified plants for allergenicity (2003, 2009). Users are strongly encouraged to consult those guidelines for full details.
Search options in COMPASS:
Users can choose to perform three independent types of sequence comparisons:
- Full Length Sequence search
- 80-mer sliding window FASTA search
- 8-mer FASTA search
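To make the 8-mer option more concrete, here is a minimal Python sketch. It assumes the 8-mer search looks for exact, contiguous matches of eight amino acids between the query and an allergen sequence; that reading, the function names (`kmers`, `shared_kmers`), and the toy sequences are illustrative assumptions, not the COMPASS implementation, which runs FASTA v36.

```python
def kmers(seq, k=8):
    """Return the set of all contiguous k-residue segments in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmers(query, allergen, k=8):
    """Return the k-mers found in both sequences, sorted alphabetically."""
    return sorted(kmers(query, k) & kmers(allergen, k))


# Toy sequences (hypothetical, not taken from the COMPARE database).
hits = shared_kmers("MKTAYIAKQRQ", "GGTAYIAKQRGG")
print(hits)
```

A sequence shorter than eight residues yields no 8-mers, so it can never produce a hit under this assumption.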
FAO/WHO (2001) recommended that IgE cross-reactivity between a particular protein and an allergen be considered when there is greater than 35% identity over a sliding "window" of 80 amino acids. The 35% identity threshold was based on data indicating protein cross-reactivity occurring between Bet v 1 and vegetable proteins at approximately 40% protein identity (Scheurer et al., 1999). The 80 amino acid window was selected to represent the size of a 'typical' protein domain.
Additional educational materials regarding the use of bioinformatics for risk assessment, FASTA, E-value, criteria for significance, the sliding window search process, as well as all references cited, are available on the "About" page (top left tab).
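The sliding-window criterion can be sketched as follows. This is an illustration only: it assumes the window advances one residue at a time and that percent identity is computed against the full 80-residue window length. The names `sliding_windows` and `exceeds_threshold` are hypothetical, and the alignment itself (done by FASTA in COMPASS) is not reproduced here.

```python
WINDOW = 80          # window length in amino acids
THRESHOLD_PCT = 35   # identity must be strictly greater than this


def sliding_windows(seq, size=WINDOW):
    """Yield (start, segment) for each overlapping window of `size` residues.

    A sequence no longer than `size` is yielded once, unchanged.
    """
    if len(seq) <= size:
        yield 0, seq
        return
    for start in range(len(seq) - size + 1):
        yield start, seq[start:start + size]


def exceeds_threshold(identities, window=WINDOW, pct=THRESHOLD_PCT):
    """True when identities / window is strictly greater than pct percent.

    Integer arithmetic avoids floating-point edge cases at exactly 35%.
    """
    return identities * 100 > pct * window


# A 100-residue query produces 100 - 80 + 1 = 21 windows.
print(sum(1 for _ in sliding_windows("A" * 100)))
# 28 of 80 identical residues is exactly 35%, so it does not exceed the threshold.
print(exceeds_threshold(28), exceeds_threshold(29))
```

Under these assumptions, at least 29 identical residues within an 80-residue window are needed to exceed the 35% threshold.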