Using Phenotype Ontologies in GVF
This page develops a set of best practices for using phenotype ontologies in GVF.
The first draft of these recommendations was written by Chris Mungall (Cjmungall). Other opinions are available; feel free to edit.
Using IDs vs Term Names vs Comments
GVF allows you to use either IDs or label strings to indicate the phenotype. I recommend using the ID at all times.
If the ontology term is not granular enough, use the comment field to enter a free-text description.
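As a minimal sketch of this recommendation, the helper below builds a pragma line from a term ID and rejects bare label strings. The function name `phenotype_pragma` is hypothetical, the `Comment` key is assumed to be how the comment field is spelled in the pragma, and the comment text in the usage line is made up for illustration.

```python
def phenotype_pragma(term_id, ontology_url=None, comment=None):
    """Build a ##phenotype-description line from a prefixed term ID."""
    # Insist on an ID such as "HP:0000957" rather than a label string.
    prefix, sep, local = term_id.partition(":")
    if not (sep and prefix and local):
        raise ValueError(f"expected an ID like 'HP:0000957', got {term_id!r}")
    fields = [f"Term={term_id}"]
    if ontology_url:
        fields.append(f"Ontology={ontology_url}")
    if comment:
        # ';' separates fields, so this sketch refuses it inside a comment.
        if ";" in comment:
            raise ValueError("comment must not contain ';'")
        # Assumption: the comment field is written with the key 'Comment'.
        fields.append(f"Comment={comment}")
    return "##phenotype-description " + ";".join(fields)


print(phenotype_pragma(
    "HP:0000957",
    "http://purl.obolibrary.org/obo/hp.obo",
    comment="multiple spots on trunk",  # illustrative free text
))
```

Passing a label such as "situs inversus" raises a `ValueError`, which makes the ID-only policy explicit at the point where files are written.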
Which Phenotype Ontology to use?
If we take "phenotype" to be inclusive of traits, diseases, pathological features, etc., then there are a number of choices. The main ontologies are listed below, together with examples.
Human Phenotype Ontology (HP)
In OBO Library? YES
Scope:
Any human phenotype. Currently focuses on morphological abnormalities, but is being extended to cover other domains
Example:
An individual with "Cafe-au-lait" spots (http://purl.obolibrary.org/obo/HP_0000957):
##phenotype-description Term=HP:0000957;Ontology=http://purl.obolibrary.org/obo/hp.obo
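The example above pairs the short ID `HP:0000957` with the URL `http://purl.obolibrary.org/obo/HP_0000957`. A minimal sketch of that conversion, assuming the same pattern (colon replaced by underscore under the `purl.obolibrary.org/obo/` base) applies to the other OBO Library ontologies on this page; the function names are hypothetical. It does not apply to non-OBO vocabularies such as SNOMED-CT.

```python
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"


def curie_to_purl(term_id):
    """Turn 'HP:0000957' into 'http://purl.obolibrary.org/obo/HP_0000957'."""
    prefix, sep, local = term_id.partition(":")
    if not (sep and prefix and local):
        raise ValueError(f"not a prefixed ID: {term_id!r}")
    return f"{OBO_PURL_BASE}{prefix}_{local}"


def purl_to_curie(purl):
    """Inverse of curie_to_purl for URLs under the OBO PURL base."""
    if not purl.startswith(OBO_PURL_BASE):
        raise ValueError(f"not an OBO PURL: {purl!r}")
    prefix, sep, local = purl[len(OBO_PURL_BASE):].partition("_")
    if not (sep and prefix and local):
        raise ValueError(f"cannot split prefix and local ID: {purl!r}")
    return f"{prefix}:{local}"


print(curie_to_purl("HP:0000957"))
print(purl_to_curie("http://purl.obolibrary.org/obo/HP_0000957"))
```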
Mammalian Phenotype Ontology (MP)
In OBO Library? YES
Scope:
MP is primarily focused on mouse phenotypes
Example:
An individual with situs inversus
##phenotype-description Term=MP:0002766;Ontology=http://purl.obolibrary.org/obo/mp.obo
Mammalian Pathology Ontology (MPATH)
In OBO Library? YES
Scope:
Pathological physical entities and processes
Example:
An individual with truncoconal septal defect
##phenotype-description Term=MPATH:619;Ontology=http://purl.obolibrary.org/obo/mpath.obo
Note that there is some overlap with HP and MP. MPATH focuses more on the actual pathological entity, and may be most suitable for genotyping of pathological tissue samples.
Disease Ontology (DO)
In OBO Library? YES
Scope:
All human diseases
Example:
An individual with Parkinson's disease
##phenotype-description Term=DOID:14330;Ontology=http://purl.obolibrary.org/obo/doid.obo
TODO: document DO vs MESH
OMIM
OMIM may be more suited to recording whether the individual has a particular genetic disorder.
SNOMED-CT
In OBO Library? NO
SNOMED-CT has "findings" and "disorder" sub-hierarchies, as well as disease, which can be used to indicate the phenotype. Many electronic health care systems use SNOMED-CT, so there might be phenotype records available for individuals already using this vocabulary.
Note that SNOMED-CT is not open, although there are ongoing talks about SNOMED transitioning to an open system. For now, bear in mind that using SNOMED may restrict the ability of some people to do analyses that make use of the ontology structure (though mappings to HPO are available from the HPO team).
If you use SNOMED, use the SCTID ID space.
Example:
An individual with male hypogonadism
##phenotype-description Term=SCTID:48723006
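The examples so far all share one shape: the pragma name, a space, then `key=value` fields separated by semicolons, with `Ontology` sometimes absent (as in the SNOMED-CT line). A minimal parsing sketch under that assumption follows; the function name is hypothetical and no escaping rules are handled.

```python
PRAGMA = "##phenotype-description"


def parse_phenotype_pragma(line):
    """Split a ##phenotype-description line into a dict of its fields."""
    line = line.strip()
    if not line.startswith(PRAGMA):
        raise ValueError(f"not a phenotype-description pragma: {line!r}")
    body = line[len(PRAGMA):].strip()
    fields = {}
    for part in body.split(";"):
        if not part:
            continue  # tolerate a trailing ';'
        # Split on the first '=' only, so values may contain '='.
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"field without '=': {part!r}")
        fields[key.strip()] = value.strip()
    return fields


fields = parse_phenotype_pragma("##phenotype-description Term=SCTID:48723006")
print(fields["Term"], fields.get("Ontology"))
```

Using `fields.get("Ontology")` rather than indexing keeps the caller working for vocabularies that have no ontology URL in the pragma.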
ICD-9
In OBO Library? NO
Like SNOMED, this is frequently used in EHRs.
For describing human phenotypes, HPO may be more suitable
Non-mammalian Organisms
We recommend the following ontologies for the following taxa
(See http://obofoundry.org for details)
- FYPO - fission yeast
- APO - budding yeast
- WormPhenotype - C. elegans
- TO - plants
- Flybase vocabulary - D. melanogaster
The Vertebrate Trait Ontology (VT) covers vertebrate traits, and is primarily aimed at agriculturally important traits.
Generic Phenotypic Qualities (PATO)
In OBO Library? YES
Scope:
A collection of high-level characteristics. PATO is intended to be used compositionally, e.g. in conjunction with an anatomy ontology such as the FMA.
Example:
A sterile individual
##phenotype-description Term=PATO:0000956;Ontology=http://purl.obolibrary.org/obo/pato.obo
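When a file mixes several ontologies it is easy to pair a term with the wrong `Ontology` URL. The sketch below is a heuristic check only: it assumes the OBO Library file is named after the lower-cased ID prefix (as in `hp.obo`, `mp.obo`, `mpath.obo`), which need not hold for every ontology. The function name is hypothetical.

```python
import posixpath
from urllib.parse import urlparse


def prefix_matches_ontology(term_id, ontology_url):
    """Heuristic: does the ID prefix agree with the ontology file name?"""
    prefix = term_id.partition(":")[0].lower()
    # "http://purl.obolibrary.org/obo/hp.obo" -> "hp.obo" -> "hp"
    filename = posixpath.basename(urlparse(ontology_url).path)
    stem = filename.split(".", 1)[0].lower()
    return bool(prefix) and prefix == stem


print(prefix_matches_ontology("PATO:0000956", "http://purl.obolibrary.org/obo/pato.obo"))
print(prefix_matches_ontology("PATO:0000956", "http://purl.obolibrary.org/obo/mp.obo"))
```

The first call reports a match and the second a mismatch; a mismatch is worth a manual look rather than an automatic rejection.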
Compositional Descriptions of Phenotypes
Many phenotyping groups describe phenotypes using post-composition - the simple case is where a general characteristic (e.g. "hypoplastic", from PATO) is combined with an anatomical term (e.g. "kidney", from FMA). This is in contrast to ontologies such as the MP, which pre-compose phenotype descriptions, with a single ID for each description. For more details see:
- Mungall, C. J., Gkoutos, G. V., Smith, C. L., Haendel, M. A., Lewis, S. E., and Ashburner, M. (2010). Integrating phenotype ontologies across multiple species. Genome Biology 11, R2.
At this time GVF does not support post-composition. Note that post-composed descriptions can always be translated into a phenotype ontology outside of GVF. However, some groups might still feel the need to use post-composition within GVF. The developers are open to extending it in this direction in the future - contact them on the mail list below.
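The translation step mentioned above, from a post-composed description to a single pre-composed term outside of GVF, might be sketched as a lookup. Everything here is illustrative: the class `PostComposed`, the table, and all three IDs are placeholders rather than real ontology terms, and a real mapping would come from the ontology's logical definitions rather than a hand-written dict.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PostComposed:
    """An entity-quality pair, e.g. an anatomy term plus a PATO term."""
    entity: str   # anatomy term ID (placeholder values below)
    quality: str  # PATO term ID (placeholder values below)


# Hypothetical mapping from post-composed pairs to pre-composed term IDs.
PRECOMPOSED = {
    PostComposed(entity="FMA:0000000", quality="PATO:0000000"): "HP:0000000",
}


def to_precomposed(desc, table=PRECOMPOSED) -> Optional[str]:
    """Return the pre-composed term ID for desc, or None if unmapped."""
    return table.get(desc)


term = to_precomposed(PostComposed("FMA:0000000", "PATO:0000000"))
if term is not None:
    print(f"##phenotype-description Term={term}")
```

Returning `None` for unmapped pairs leaves the decision (pick a broader term, add a comment, or record the phenotype elsewhere) to the caller.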
One way in which compositional descriptions are useful is for describing measurements.
Quantitative description of phenotypes
Pre-composed phenotype ontologies typically use qualitative descriptions, such as "increased width of face". Sometimes it's useful to record actual measurements - this could be a tuple of entity - characteristic - unit - value.
Currently GVF doesn't support this, but support could be added in the future.
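Since GVF has no slot for this, such a tuple would have to live in a separate file. A minimal sketch of the entity - characteristic - unit - value record follows; the class and function names are hypothetical, the tab-separated layout is an arbitrary choice, and the measurement itself is made up. Plain labels are used for readability; in practice each of the first three fields would hold an ontology ID.

```python
from typing import NamedTuple


class Measurement(NamedTuple):
    """One quantitative phenotype observation."""
    entity: str          # what was measured, e.g. an anatomy term
    characteristic: str  # which quality, e.g. a PATO term
    unit: str            # unit of the value
    value: float


def as_row(m):
    """Serialise a Measurement as one tab-separated line."""
    return "\t".join([m.entity, m.characteristic, m.unit, f"{m.value:g}"])


m = Measurement(entity="face", characteristic="width", unit="cm", value=13.5)
print(as_row(m))
```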
See:
Hancock, J. M., Mallon, A.-M., Beck, T., Gkoutos, G. V., Mungall, C., and Schofield, P. N. (2009). Mouse, man, and meaning: bridging the semantics of mouse phenotype and human disease. Mammalian Genome 20, 457-461.
Including Phenotype Annotations outside GVF
Of course, it is always possible to include phenotype annotations outside GVF. This has the advantage of allowing for a more expressive representation, including evidence, assay details, temporal information, measurements, complex compositional descriptions etc.
However, as yet there is no one single format that suits all requirements. There is a growing body of best practice on how to record these using OWL - contact the obo-phenotype mail list below if you are interested.
In the absence of a single recommended supplementary phenotype information file, we recommend including as much phenotype information as is appropriate directly in the GVF file using one of the ontologies above.
Questions?
You can ask questions about GVF on the song-devel mail list (song-devel AT lists DOT sourceforge DOT net)
Questions about phenotype ontologies can be asked on the obo-phenotype mail list (obo-phenotype AT lists DOT sourceforge DOT net)