SOBA, the Sequence Ontology Bioinformatics Analysis tool, provides a high-level overview of the features in a GFF3 sequence annotation file. Although GFF3, the standard file format for genome annotation, is simple to produce and work with, whole-genome annotation data still form a large and complex dataset. SOBA automatically calculates and displays some common statistics and graphics used when working with GFF3 files:
- Summary counts and statistics of feature types and attributes used
- Histograms of feature lengths
- Graphs of Sequence Ontology terms used
- Histograms of intron density
- Suggestions to improve SO compliance for invalid terms
Having ready access to summary details helps annotators and others working with the data rapidly evaluate the quality and completeness of an annotation set and compare it with other genome annotations.
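To make the first two items in the list above concrete, here is a minimal Python sketch of how feature-type counts and mean feature lengths could be derived from GFF3 lines. It is not SOBA's implementation; the function name `summarize_gff3` and the sample records are hypothetical, made up for illustration. It relies only on the standard GFF3 layout of nine tab-separated columns (type in column 3, start and end in columns 4 and 5, with 1-based inclusive coordinates).

```python
from collections import Counter, defaultdict
from statistics import mean


def summarize_gff3(lines):
    """Count features per type and compute their mean length (hypothetical helper)."""
    type_counts = Counter()
    lengths = defaultdict(list)
    for line in lines:
        line = line.rstrip("\r\n")
        # An optional ##FASTA directive marks the start of embedded sequence data.
        if line.startswith("##FASTA"):
            break
        # Skip blank lines, directives, and comments.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        feature_type = cols[2]
        start, end = int(cols[3]), int(cols[4])
        type_counts[feature_type] += 1
        # GFF3 coordinates are 1-based and inclusive, hence the +1.
        lengths[feature_type].append(end - start + 1)
    mean_lengths = {t: mean(v) for t, v in lengths.items()}
    return type_counts, mean_lengths


# Made-up sample records, for illustration only.
sample = [
    "##gff-version 3",
    "chr1\texample\tgene\t1000\t1999\t.\t+\t.\tID=gene1",
    "chr1\texample\tmRNA\t1000\t1999\t.\t+\t.\tID=mrna1;Parent=gene1",
    "chr1\texample\texon\t1000\t1299\t.\t+\t.\tID=exon1;Parent=mrna1",
    "chr1\texample\texon\t1700\t1999\t.\t+\t.\tID=exon2;Parent=mrna1",
]

counts, mean_lengths = summarize_gff3(sample)
for feature_type, n in counts.most_common():
    print(feature_type, n, mean_lengths[feature_type])
```

For a real file, the same function could be called with an open file handle instead of the sample list, since it only iterates over lines.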
Use the file browser below to select and upload one or more GFF3 files and run an analysis. Please limit the total size of uploaded files to 1.5 GB.
Documentation for SOBA can be found on the Sequence Ontology Wiki.