format-version: 1.2 date: 30:08:2007 13:40 saved-by: kareneilbeck auto-generated-by: OBO-Edit 1.101 subsetdef: biosapiens "biosapiens protein feature ontology" subsetdef: SOFA "SO feature annotation" default-namespace: sequence remark: autogenerated-by\: DAG-Edit version 1.417\nsaved-by\: eilbeck\ndate\: Tue May 11 15\:18\:44 PDT 2004\nversion\: $Revision\: 1.45 $ [Term] id: SO:0000000 name: Sequence_Ontology subset: SOFA [Term] id: SO:0000001 name: region def: "A sequence_feature with an extent greater than zero." [SO:ke] subset: SOFA synonym: "sequence" EXACT [] is_a: SO:0000110 ! sequence_feature [Term] id: SO:0000006 name: PCR_product def: "A region amplified by a PCR reaction." [SO:ke] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA synonym: "amplicon" RELATED [] synonym: "PCR product" EXACT [] is_a: SO:0000695 ! reagent [Term] id: SO:0000007 name: read_pair def: "A pair of sequencing reads in which the two members of the pair are related by originating at either end of a clone insert." [SO:ls] subset: SOFA is_a: SO:0000143 ! assembly_component relationship: part_of SO:0000149 ! contig [Term] id: SO:0000013 name: scRNA def: "Any one of several small cytoplasmic RNA molecules present in the cytoplasm and sometimes nucleus of a eukaryote." [http://www.ebi.ac.uk/embl/WebFeat/align/scRNA_s.html] subset: SOFA synonym: " small cytoplasmic RNA" EXACT [] is_a: SO:0000655 ! ncRNA [Term] id: SO:0000038 name: match_set def: "A collection of match parts." [SO:ke] subset: SOFA is_obsolete: true [Term] id: SO:0000039 name: match_part def: "A part of a match, for example an hsp from blast isa match_part." [SO:ke] subset: SOFA is_a: SO:0000001 ! region relationship: part_of SO:0000343 ! match [Term] id: SO:0000050 name: gene_part def: "A part of a gene, that has no other route in the ontology back to region. This concept is necessary for logical inference as these parts must have the properties of region. 
It also allows us to associate all the parts of genes with a gene." [SO:ke] subset: SOFA is_obsolete: true [Term] id: SO:0000057 name: operator def: "A regulatory element of an operon to which activators or repressors bind, thereby affecting translation of genes in that operon." [SO:ma] subset: SOFA synonym: "operator segment" EXACT [] is_a: SO:0000752 ! gene_group_regulatory_region [Term] id: SO:0000101 name: transposable_element def: "A transposon or insertion sequence. An element that can insert in a variety of DNA sequences." [http://www.sci.sdsu.edu/~smaloy/Glossary/T.html] subset: SOFA synonym: " transposon" EXACT [] [Term] id: SO:0000102 name: expressed_sequence_match def: "A match to an EST or cDNA sequence." [SO:ke] subset: SOFA is_a: SO:0000347 ! nucleotide_match [Term] id: SO:0000103 name: clone_insert_end def: "The end of the clone insert." [SO:ke] subset: SOFA is_a: SO:0000699 ! junction [Term] id: SO:0000104 name: polypeptide def: "A sequence of amino acids linked by peptide bonds which may lack appreciable tertiary structure and may not be liable to irreversible denaturation." [SO:ma] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA is_a: SO:0000001 ! region relationship: derives_from SO:0000316 ! CDS [Term] id: SO:0000109 name: sequence_variant_obs def: "A sequence_variant is a non exact copy of a sequence_feature or genome exhibiting one or more sequence_alteration." [SO:ke] subset: SOFA synonym: "mutation" RELATED [] is_obsolete: true [Term] id: SO:0000110 name: sequence_feature def: "An extent of biological sequence." [SO:ke] subset: SOFA synonym: "located sequence feature" RELATED [] synonym: "located_sequence_feature" EXACT [] is_a: SO:0000000 ! Sequence_Ontology [Term] id: SO:0000112 name: primer def: "A short preexisting polynucleotide chain to which new deoxyribonucleotides can be added by DNA polymerase."
[http://www.ornl.gov/TechResources/Human_Genome/publicat/primer2001/glossary.html] subset: SOFA synonym: "primer oligonucleotide" EXACT [] synonym: "primer polynucleotide" EXACT [] synonym: "primer sequence" EXACT [] is_a: SO:0000696 ! oligo [Term] id: SO:0000113 name: proviral_region def: "A viral sequence which has integrated into a host genome." [SO:ke] subset: SOFA synonym: "proviral sequence" RELATED [] [Term] id: SO:0000114 name: methylated_C def: "A methylated deoxy-cytosine." [SO:ke] subset: SOFA synonym: "methylated C" EXACT [] synonym: "methylated cytosine" EXACT [] synonym: "methylated cytosine base" EXACT [] synonym: "methylated cytosine residue" EXACT [] is_a: SO:0000306 ! methylated_base_feature [Term] id: SO:0000120 name: protein_coding_primary_transcript def: "A primary transcript that, at least in part, encodes one or more proteins." [SO:ke] comment: May contain introns. subset: SOFA synonym: "pre mRNA" RELATED [] is_a: SO:0000185 ! primary_transcript [Term] id: SO:0000139 name: ribosome_entry_site def: "Region in mRNA where ribosome assembles." [SO:ke] comment: Gene:. subset: SOFA is_a: SO:0000837 ! UTR_region [Term] id: SO:0000140 name: attenuator def: "A sequence segment located within the five prime end of an mRNA that causes premature termination of translation." [SO:as] subset: SOFA synonym: "attenuator sequence" EXACT [] is_a: SO:0005836 ! regulatory_region relationship: part_of SO:0000234 ! mRNA [Term] id: SO:0000141 name: terminator def: "The sequence of DNA located either at the end of the transcript that causes RNA polymerase to terminate transcription." [http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html] subset: SOFA synonym: "terminator sequence" EXACT [] is_a: SO:0005836 ! regulatory_region [Term] id: SO:0000143 name: assembly_component def: "A region of sequence which may be used to manufacture a longer assembled, sequence." [SO:ke] subset: SOFA is_a: SO:0000001 ! 
region [Term] id: SO:0000147 name: exon def: "A region that codes for a portion of spliced messenger RNA (SO:0000234); may contain 5'-untranslated region (SO:0000204), all open reading frames (SO:0000236) and 3'-untranslated region (SO:0000205)." [http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA is_a: SO:0000833 ! transcript_region [Term] id: SO:0000148 name: supercontig def: "One or more contigs that have been ordered and oriented using end-read information. Contains gaps that are filled with N's." [SO:ls] subset: SOFA synonym: "scaffold" RELATED [] is_a: SO:0000353 ! assembly relationship: part_of SO:0000719 ! ultracontig [Term] id: SO:0000149 name: contig def: "A contiguous sequence derived from sequence assembly. Has no gaps, but may contain N's from unavailable bases." [SO:ls] subset: SOFA is_a: SO:0000143 ! assembly_component is_a: SO:0000353 ! assembly relationship: part_of SO:0000148 ! supercontig [Term] id: SO:0000150 name: read def: "A sequence obtained from a single sequencing experiment. Typically a read is produced when a base calling program interprets information from a chromatogram trace file produced from a sequencing machine." [SO:rd] subset: SOFA is_a: SO:0000143 ! assembly_component relationship: part_of SO:0000149 ! contig [Term] id: SO:0000151 name: clone def: "A piece of DNA that has been inserted in a vector so that it can be propagated in E. coli or some other organism." [http://www.geospiza.com/community/support/glossary/] subset: SOFA is_a: SO:0000695 ! reagent [Term] id: SO:0000159 name: deletion def: "The point at which a deletion occurred." [SO:ke] subset: SOFA synonym: "deleted_sequence" EXACT [] is_a: SO:0000001 ! region [Term] id: SO:0000161 name: methylated_A def: "A methylated adenine."
[SO:ke] subset: SOFA synonym: "methylated A" EXACT [] synonym: "methylated adenine" EXACT [] synonym: "methylated adenine base" EXACT [] synonym: "methylated adenine residue" EXACT [] is_a: SO:0000306 ! methylated_base_feature [Term] id: SO:0000162 name: splice_site def: "The position where intron is excised." [SO:ke] subset: SOFA is_a: SO:0000835 ! primary_transcript_region [Term] id: SO:0000163 name: five_prime_splice_site def: "The junction between the 3 prime end of an exon and the following intron." [http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html] subset: SOFA synonym: "5' splice site" EXACT [] synonym: "donor" RELATED [] synonym: "donor splice site" EXACT [] synonym: "splice donor site" EXACT [] is_a: SO:0000162 ! splice_site [Term] id: SO:0000164 name: three_prime_splice_site def: "The junction between the 3 prime end of an intron and the following exon." [http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html] subset: SOFA synonym: "3' splice site" RELATED [] synonym: "acceptor" RELATED [] synonym: "acceptor splice site" EXACT [] synonym: "splice acceptor site" EXACT [] is_a: SO:0000162 ! splice_site [Term] id: SO:0000165 name: enhancer def: "A cis-acting sequence that increases the utilization of (some) eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter." [http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html] subset: SOFA relationship: part_of SO:0000234 ! mRNA [Term] id: SO:0000167 name: promoter def: "A regulatory_region composed of the TSS(s) and binding sites for TF_complexes of the basal transcription machinery." [SO:regcreative] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology.\nThe region on a DNA molecule involved in RNA polymerase binding to initiate transcription. 
subset: SOFA synonym: "promoter sequence" EXACT [] [Term] id: SO:0000177 name: cross_genome_match def: "A nucleotide match against a sequence from another organism." [SO:ma] subset: SOFA is_a: SO:0000347 ! nucleotide_match [Term] id: SO:0000178 name: operon def: "A group of contiguous genes transcribed as a single (polycistronic) mRNA from a single regulatory region." [SO:ma] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA is_a: SO:0005855 ! gene_group [Term] id: SO:0000179 name: clone_insert_start def: "The start of the clone insert." [SO:ke] subset: SOFA is_a: SO:0000699 ! junction [Term] id: SO:0000181 name: translated_nucleotide_match def: "A match against a translated sequence." [SO:ke] subset: SOFA is_a: SO:0000347 ! nucleotide_match [Term] id: SO:0000183 name: non_transcribed_region def: "A region of the gene which is not transcribed." [SO:ke] subset: SOFA synonym: "non-transcribed sequence" EXACT [] synonym: "nontranscribed region" EXACT [] synonym: "nontranscribed sequence" EXACT [] is_a: SO:0000842 ! gene_component_region [Term] id: SO:0000185 name: primary_transcript def: "A transcript that in its initial state requires modification to be functional." [SO:ma] subset: SOFA synonym: "precursor RNA" EXACT [] is_a: SO:0000673 ! transcript [Term] id: SO:0000187 name: repeat_family def: "A group of characterized repeat sequences." [SO:ke] subset: SOFA is_obsolete: true [Term] id: SO:0000188 name: intron def: "A segment of DNA that is transcribed, but removed from within the transcript by splicing together the sequences (exons) on either side of it." [http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA is_a: SO:0000835 ! 
primary_transcript_region [Term] id: SO:0000193 name: RFLP_fragment def: "A polymorphism detectable by the size differences in DNA fragments generated by a restriction enzyme." [PMID:6247908] subset: SOFA synonym: "restriction fragment length polymorphism" EXACT [] synonym: "RFLP" EXACT [] is_a: SO:0000412 ! restriction_fragment [Term] id: SO:0000203 name: UTR def: "Messenger RNA sequences that are untranslated and lie five prime and three prime to sequences which are translated." [SO:ke] subset: SOFA synonym: "untranslated region" EXACT [] is_a: SO:0000836 ! mRNA_region [Term] id: SO:0000204 name: five_prime_UTR def: "A region at the 5' end of a mature transcript (preceding the initiation codon) that is not translated into a protein." [http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html] subset: SOFA synonym: "5' UTR" EXACT [] synonym: "five_prime_untranslated_region" EXACT [] is_a: SO:0000203 ! UTR [Term] id: SO:0000205 name: three_prime_UTR def: "A region at the 3' end of a mature transcript (following the stop codon) that is not translated into a protein." [http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html] subset: SOFA synonym: "three prime untranslated region" EXACT [] is_a: SO:0000203 ! UTR [Term] id: SO:0000233 name: processed_transcript def: "A transcript which has undergone the necessary modifications for its function. In eukaryotes this includes, for example, processing of introns, cleavage, base modification, and modifications to the 5' and/or the 3' ends, other than addition of bases. In bacteria functional mRNAs are usually not modified." [SO:ke] comment: A processed transcript cannot contain introns. subset: SOFA is_a: SO:0000673 ! transcript relationship: derives_from SO:0000185 ! primary_transcript [Term] id: SO:0000234 name: mRNA def: "Messenger RNA is the intermediate molecule between DNA and protein. It includes UTR and coding sequences. It does not contain introns." 
[SO:ma] comment: An mRNA does not contain introns as it is a processed_transcript. The equivalent kind of primary_transcript is protein_coding_primary_transcript (SO:0000120) which may contain introns. This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA synonym: "messenger RNA" EXACT [] is_a: SO:0000233 ! processed_transcript [Term] id: SO:0000235 name: TF_binding_site def: "A region of a molecule that binds a TF complex [GO:0005667]." [SO:ke] subset: SOFA synonym: "transcription factor binding site" EXACT [] is_a: SO:0005836 ! regulatory_region [Term] id: SO:0000236 name: ORF def: "The inframe interval between the stop codons of a reading frame which when read as sequential triplets, has the potential of encoding a sequential string of amino acids. TER(NNN)nTER." [SO:ma, SO:rb] comment: The definition was modified by Rama. This term is now basically the same as a CDS. This must be revised. This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA synonym: "open reading frame" EXACT [] is_a: SO:0000717 ! reading_frame [Term] id: SO:0000239 name: flanking_region def: "The DNA sequences extending on either side of a specific locus." [http://biotech.icmb.utexas.edu/search/dict-search.mhtml] subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:0000252 name: rRNA def: "RNA that comprises part of a ribosome, and that can provide both structural scaffolding and catalytic activity." [http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html, ISBN:0198506732] subset: SOFA synonym: " ribosomal ribonucleic acid" EXACT [] synonym: "ribsomal RNA" EXACT [] is_a: SO:0000655 ! ncRNA [Term] id: SO:0000253 name: tRNA def: "Transfer RNA (tRNA) molecules are approximately 80 nucleotides in length. Their secondary structure includes four short double-helical elements and three loops (D, anti-codon, and T loops). Further hydrogen bonds mediate the characteristic L-shaped molecular structure.
Transfer RNAs have two regions of fundamental functional importance: the anti-codon, which is responsible for specific mRNA codon recognition, and the 3' end, to which the tRNA's corresponding amino acid is attached (by aminoacyl-tRNA synthetases). Transfer RNAs cope with the degeneracy of the genetic code in two manners: having more than one tRNA (with a specific anti-codon) for a particular amino acid; and 'wobble' base-pairing, i.e. permitting non-standard base-pairing at the 3rd anti-codon position." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00005, ISBN:0198506732] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA synonym: "transfer ribonucleic acid" RELATED [] synonym: "transfer RNA" RELATED [] is_a: SO:0000655 ! ncRNA [Term] id: SO:0000274 name: snRNA def: "Small non-coding RNA in the nucleoplasm. A small nuclear RNA molecule involved in pre-mRNA splicing and processing." [ems:WB, http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html, PMID:11733745] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA synonym: "small nuclear RNA" RELATED [] is_a: SO:0000655 ! ncRNA [Term] id: SO:0000275 name: snoRNA def: "Small nucleolar RNAs (snoRNAs) are involved in the processing and modification of rRNA in the nucleolus. There are two main classes of snoRNAs: the box C/D class, and the box H/ACA class. U3 snoRNA is a member of the box C/D class. Indeed, the box C/D element is a subset of the six short sequence elements found in all U3 snoRNAs, namely boxes A, A', B, C, C', and D. The U3 snoRNA secondary structure is characterised by a small 5' domain (with boxes A and A'), and a larger 3' domain (with boxes B, C, C', and D), the two domains being linked by a single-stranded hinge. Boxes B and C form the B/C motif, which appears to be exclusive to U3 snoRNAs, and boxes C' and D form the C'/D motif. 
The latter is functionally similar to the C/D motifs found in other snoRNAs. The 5' domain and the hinge region act as a pre-rRNA-binding domain. The 3' domain has conserved protein-binding sites. Both the box B/C and box C'/D motifs are sufficient for nuclear retention of U3 snoRNA. The box C'/D motif is also necessary for nucleolar localization, stability and hypermethylation of U3 snoRNA. Both box B/C and C'/D motifs are involved in specific protein interactions and are necessary for the rRNA processing functions of U3 snoRNA." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00012] subset: SOFA synonym: "small nucleolar RNA" EXACT [] is_a: SO:0000655 ! ncRNA [Term] id: SO:0000276 name: miRNA def: "Small, ~22-nt, RNA molecule that is the endogenous transcript of a miRNA gene. Micro RNAs are produced from precursor molecules (SO:0000647) that can form local hairpin structures, which ordinarily are processed (via the Dicer pathway) such that a single miRNA molecule accumulates from one arm of a hairpin precursor molecule. Micro RNAs may trigger the cleavage of their target molecules or act as translational repressors." [PMID:12592000] subset: SOFA synonym: "micro RNA" EXACT [] is_a: SO:0000370 ! small_regulatory_ncRNA [Term] id: SO:0000289 name: microsatellite def: "A very short unit sequence of DNA (2 to 4 bp) that is repeated multiple times in tandem." [http://www.informatics.jax.org/silver/glossary.shtml] subset: SOFA synonym: "microsatellite locus" EXACT [] synonym: "microsatellite marker" EXACT [] synonym: "VNTR" EXACT [] is_a: SO:0000705 ! tandem_repeat [Term] id: SO:0000294 name: inverted_repeat def: "The sequence is complementarily repeated on the opposite strand. It is a palindrome, and it may, or may not be hyphenated. Examples: GCTGATCAGC, or GCTGA-----TCAGC." [SO:ke] subset: SOFA synonym: "inverted repeat" EXACT [] synonym: "inverted repeat sequence" EXACT [] is_a: SO:0000657 !
repeat_region [Term] id: SO:0000296 name: origin_of_replication def: "The origin of replication; starting site for duplication of a nucleic acid molecule to give two identical copies." [http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html] subset: SOFA synonym: " ori" EXACT [] is_a: SO:0000001 ! region [Term] id: SO:0000303 name: clip def: "Part of the primary transcript that is clipped off during processing." [SO:ke] subset: SOFA is_a: SO:0000835 ! primary_transcript_region [Term] id: SO:0000305 name: modified_base_site def: "A modified nucleotide, i.e. a nucleotide other than A, T, C. G or (in RNA) U." [http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html] comment: Modified base:. subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:0000306 name: methylated_base_feature def: "A nucleotide modified by methylation." [SO:ke] subset: SOFA is_a: SO:0000305 ! modified_base_site [Term] id: SO:0000307 name: CpG_island def: "Regions of a few hundred to a few thousand bases in vertebrate genomes that are relatively GC and CpG rich; they are typically unmethylated and often found near the 5' ends of genes." [SO:rd] subset: SOFA synonym: "CG island" EXACT [] is_a: SO:0000001 ! region [Term] id: SO:0000314 name: direct_repeat def: "A repeat where the same sequence is repeated in the same direction. Example: GCTGA-----GCTGA." [SO:ke] subset: SOFA is_a: SO:0000657 ! repeat_region [Term] id: SO:0000315 name: transcription_start_site def: "The site where transcription begins." [SO:ke] subset: SOFA synonym: "TSS" EXACT [] is_a: SO:0000699 ! junction is_a: SO:0000835 ! primary_transcript_region [Term] id: SO:0000316 name: CDS def: "A contiguous sequence which begins with, and includes, a start codon and ends with, and includes, a stop codon." [SO:ma] subset: SOFA synonym: "coding sequence" EXACT [] is_a: SO:0000836 ! mRNA_region [Term] id: SO:0000318 name: start_codon def: "First codon to be translated by a ribosome." 
[SO:ke] subset: SOFA synonym: "initiation codon" EXACT [] is_a: SO:0000360 ! codon [Term] id: SO:0000319 name: stop_codon def: "In mRNA, a set of three nucleotides that indicates the end of information for protein synthesis." [SO:ke] subset: SOFA is_a: SO:0000360 ! codon [Term] id: SO:0000324 name: tag def: "A nucleotide sequence that may be used to identify a larger sequence." [SO:ke] subset: SOFA is_a: SO:0000695 ! reagent [Term] id: SO:0000326 name: SAGE_tag def: "A short diagnostic sequence tag, serial analysis of gene expression (SAGE), that allows the quantitative and simultaneous analysis of a large number of transcripts." [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=7570003&dopt=Abstract] subset: SOFA is_a: SO:0000324 ! tag [Term] id: SO:0000330 name: conserved_region def: "Region of sequence similarity by descent from a common ancestor." [SO:ke] subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:0000331 name: STS def: "Short (typically a few hundred base pairs) DNA sequence that has a single occurrence in a genome and whose location and base sequence are known." [http://www.biospace.com] subset: SOFA synonym: "sequence tag site" EXACT [] is_a: SO:0000324 ! tag [Term] id: SO:0000332 name: coding_conserved_region def: "Coding region of sequence similarity by descent from a common ancestor." [SO:ke] subset: SOFA is_a: SO:0000330 ! conserved_region [Term] id: SO:0000333 name: exon_junction def: "The boundary between two exons in a processed transcript." [SO:ke] subset: SOFA is_a: SO:0000699 ! junction relationship: part_of SO:0000233 ! processed_transcript [Term] id: SO:0000334 name: nc_conserved_region def: "Non-coding region of sequence similarity by descent from a common ancestor." [SO:ke] subset: SOFA synonym: "noncoding conserved region" EXACT [] is_a: SO:0000330 ! 
conserved_region [Term] id: SO:0000336 name: pseudogene def: "A sequence that closely resembles a known functional gene, at another locus within a genome, that is non-functional as a consequence of (usually several) mutations that prevent either its transcription or translation (or both). In general, pseudogenes result from either reverse transcription of a transcript of their \"normal\" paralog (SO:0000043) (in which case the pseudogene typically lacks introns and includes a poly(A) tail) or from recombination (SO:0000044) (in which case the pseudogene is typically a tandem duplication of its \"normal\" paralog)." [http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html] subset: SOFA is_a: SO:0000462 ! pseudogenic_region relationship: non_functional_homolog_of SO:0000704 ! gene [Term] id: SO:0000337 name: RNAi_reagent def: "A double stranded RNA duplex, at least 20bp long, used experimentally to inhibit gene function by RNA interference." [SO:rd] subset: SOFA is_a: SO:0000696 ! oligo [Term] id: SO:0000340 name: chromosome def: "Structural unit composed of a nucleic acid molecule which controls its own replication through the interaction of specific proteins at one or more origins of replication." [SO:ma] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:0000341 name: chromosome_band def: "A cytologically distinguishable feature of a chromosome, often made visible by staining, and usually alternating light and dark." [SO:ma] subset: SOFA synonym: "cytoband" EXACT [] synonym: "cytological band" EXACT [] is_a: SO:0000830 ! chromosome_part [Term] id: SO:0000343 name: match def: "A region of sequence, aligned to another sequence with some statistical significance, using an algorithm such as BLAST or SIM4." [SO:ke] subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:0000344 name: splice_enhancer def: "Region of a transcript that regulates splicing." 
[SO:ke] subset: SOFA [Term] id: SO:0000345 name: EST def: "Expressed Sequence Tag: The sequence of a single sequencing read from a cDNA clone or PCR product; typically a few hundred base pairs long." [http://genomics.phrma.org/lexicon/e.html] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA synonym: "expressed sequence tag" EXACT [] is_a: SO:0000695 ! reagent relationship: derives_from SO:0000234 ! mRNA [Term] id: SO:0000347 name: nucleotide_match def: "A match against a nucleotide sequence." [SO:ke] subset: SOFA is_a: SO:0000343 ! match [Term] id: SO:0000349 name: protein_match def: "A match against a protein sequence." [SO:ke] subset: SOFA is_a: SO:0000343 ! match [Term] id: SO:0000353 name: assembly def: "A sequence of nucleotides that has been algorithmically derived from an alignment of two or more different sequences." [SO:ma] subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:0000360 name: codon def: "A set of (usually) three nucleotide bases in a DNA or RNA sequence, which together signify a unique amino acid or the termination of translation." [http://genomics.phrma.org/lexicon/c.html] subset: SOFA is_a: SO:0000836 ! mRNA_region [Term] id: SO:0000366 name: insertion_site def: "The junction where an insertion occurred." [SO:ke] subset: SOFA is_a: SO:0000699 ! junction [Term] id: SO:0000368 name: transposable_element_insertion_site def: "The junction in a genome where a transposable_element has inserted." [SO:ke] subset: SOFA is_a: SO:0000366 ! insertion_site [Term] id: SO:0000370 name: small_regulatory_ncRNA def: "A non-coding RNA, usually with a specific secondary structure, that acts to regulate gene expression." [SO:ma] subset: SOFA is_a: SO:0000655 ! ncRNA [Term] id: SO:0000372 name: enzymatic_RNA def: "A non-coding RNA, usually with a specific secondary structure, that acts to regulate gene expression." [SO:ma] subset: SOFA is_a: SO:0000655 ! 
ncRNA [Term] id: SO:0000374 name: ribozyme def: "An RNA with catalytic activity." [SO:ma] subset: SOFA is_a: SO:0000372 ! enzymatic_RNA [Term] id: SO:0000375 name: rRNA_5.8S def: "5.8S ribosomal RNA (5.8S rRNA) is a component of the large subunit of the eukaryotic ribosome. It is transcribed by RNA polymerase I as part of the 45S precursor that also contains 18S and 28S rRNA. Functionally, it is thought that 5.8S rRNA may be involved in ribosome translocation. It is also known to form covalent linkage to the p53 tumour suppressor protein. 5.8S rRNA is also found in archaea." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00002] subset: SOFA synonym: "5.8S rRNA" EXACT [] is_a: SO:0000651 ! large_subunit_rRNA [Term] id: SO:0000380 name: hammerhead_ribozyme def: "A small catalytic RNA motif that catalyzes self-cleavage reaction. Its name comes from its secondary structure which resembles a carpenter's hammer. The hammerhead ribozyme is involved in the replication of some viroid and some satellite RNAs." [http://rnaworld.bio.ukans.edu/class/RNA/RNA00/RNA_World_3.html] subset: SOFA is_a: SO:0000374 ! ribozyme [Term] id: SO:0000385 name: RNase_MRP_RNA def: "The RNA molecule essential for the catalytic activity of RNase MRP, an enzymatically active ribonucleoprotein with two distinct roles in eukaryotes. In mitochondria it plays a direct role in the initiation of mitochondrial DNA replication. In the nucleus it is involved in precursor rRNA processing, where it cleaves the internal transcribed spacer 1 between 18S and 5.8S rRNAs." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00030] subset: SOFA is_a: SO:0000372 ! enzymatic_RNA [Term] id: SO:0000386 name: RNase_P_RNA def: "The RNA component of Ribonuclease P (RNase P), a ubiquitous endoribonuclease, found in archaea, bacteria and eukarya as well as chloroplasts and mitochondria.
Its best characterised activity is the generation of mature 5 prime ends of tRNAs by cleaving the 5 prime leader elements of precursor-tRNAs. Cellular RNase Ps are ribonucleoproteins. RNA from bacterial RNase Ps retains its catalytic activity in the absence of the protein subunit, i.e. it is a ribozyme. Isolated eukaryotic and archaeal RNase P RNA has not been shown to retain its catalytic function, but is still essential for the catalytic activity of the holoenzyme. Although the archaeal and eukaryotic holoenzymes have a much greater protein content than the bacterial ones, the RNA cores from all the three lineages are homologous. Helices corresponding to P1, P2, P3, P4, and P10/11 are common to all cellular RNase P RNAs. Yet, there is considerable sequence variation, particularly among the eukaryotic RNAs." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00010] subset: SOFA is_a: SO:0000374 ! ribozyme [Term] id: SO:0000390 name: telomerase_RNA def: "The RNA component of telomerase, a reverse transcriptase that synthesises telomeric DNA." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00025] subset: SOFA is_a: SO:0000372 ! enzymatic_RNA [Term] id: SO:0000391 name: U1_snRNA def: "U1 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). Its 5' end forms complementary base pairs with the 5' splice junction, thus defining the 5' donor site of an intron. There are significant differences in sequence and secondary structure between metazoan and yeast U1 snRNAs, the latter being much longer (568 nucleotides as compared to 164 nucleotides in human). Nevertheless, secondary structure predictions suggest that all U1 snRNAs share a 'common core' consisting of helices I, II, the proximal region of III, and IV." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00003] subset: SOFA is_a: SO:0000274 ! snRNA [Term] id: SO:0000392 name: U2_snRNA def: "U2 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). 
Complementary binding between U2 snRNA (in an area lying towards the 5' end but 3' to hairpin I) and the branchpoint sequence (BPS) of the intron results in the bulging out of an unpaired adenine, on the BPS, which initiates a nucleophilic attack at the intronic 5' splice site, thus starting the first of two transesterification reactions that mediate splicing." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00004] subset: SOFA is_a: SO:0000274 ! snRNA [Term] id: SO:0000393 name: U4_snRNA def: "U4 small nuclear RNA (U4 snRNA) is a component of the major U2-dependent spliceosome. It forms a duplex with U6, and with each splicing round, it is displaced from U6 (and the spliceosome) in an ATP-dependent manner, allowing U6 to refold and create the active site for splicing catalysis. A recycling process involving protein Prp24 re-anneals U4 and U6." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00015] subset: SOFA is_a: SO:0000274 ! snRNA [Term] id: SO:0000394 name: U4atac_snRNA def: "An snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U6atac_snRNA (SO:0000397)." [PMID:=12409455] subset: SOFA is_a: SO:0000274 ! snRNA [Term] id: SO:0000395 name: U5_snRNA def: "U5 RNA is a component of both types of known spliceosome. The precise function of this molecule is unknown, though it is known that the 5' loop is required for splice site selection and p220 binding, and that both the 3' stem-loop and the Sm site are important for Sm protein binding and cap methylation." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00020] subset: SOFA is_a: SO:0000274 ! snRNA [Term] id: SO:0000396 name: U6_snRNA def: "U6 snRNA is a component of the spliceosome which is involved in splicing pre-mRNA. The putative secondary structure consensus base pairing is confined to a short 5' stem loop, but U6 snRNA is thought to form extensive base-pair interactions with U4 snRNA." 
[http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00015] subset: SOFA is_a: SO:0000274 ! snRNA [Term] id: SO:0000397 name: U6atac_snRNA def: "An snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U4atac_snRNA (SO:0000394)." [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=retrieve&db=pubmed&list_uids=12409455&dopt=Abstract] subset: SOFA is_a: SO:0000274 ! snRNA [Term] id: SO:0000398 name: U11_snRNA def: "U11 snRNA plays a role in splicing of the minor U12-dependent class of eukaryotic nuclear introns. Similar to U1 snRNA in the major class spliceosome, it base pairs to the conserved 5' splice site sequence." [PMID:9622129] subset: SOFA is_a: SO:0000274 ! snRNA [Term] id: SO:0000399 name: U12_snRNA def: "The U12 small nuclear RNA (snRNA), together with U4atac/U6atac, U5, and U11 snRNAs and associated proteins, forms a spliceosome that cleaves a divergent class of low-abundance pre-mRNA introns." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00007] subset: SOFA is_a: SO:0000274 ! snRNA [Term] id: SO:0000403 name: U14_snRNA def: "U14 small nucleolar RNA (U14 snoRNA) is required for early cleavages of eukaryotic precursor rRNAs. In yeasts, this molecule possesses a stem-loop region (known as the Y-domain) which is essential for function. A similar structure, but with a different consensus sequence, is found in plants, but is absent in vertebrates." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00016] subset: SOFA is_a: SO:0000274 ! snRNA [Term] id: SO:0000404 name: vault_RNA def: "A family of RNAs found as part of the enigmatic vault ribonucleoprotein complex. The complex consists of a major vault protein (MVP), two minor vault proteins (VPARP and TEP1), and several small untranslated RNA molecules. It has been suggested that the vault complex is involved in drug resistance." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00006] subset: SOFA is_a: SO:0000655 ! 
ncRNA [Term] id: SO:0000405 name: Y_RNA def: "Y RNAs are components of the Ro ribonucleoprotein particle (Ro RNP), in association with Ro60 and La proteins. The Y RNAs and Ro60 and La proteins are well conserved, but the function of the Ro RNP is not known. In humans the RNA component can be one of four small RNAs: hY1, hY3, hY4 and hY5. These small RNAs are predicted to fold into a conserved secondary structure containing three stem structures. The largest of the four, hY1, contains an additional hairpin." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00019] subset: SOFA is_a: SO:0000655 ! ncRNA [Term] id: SO:0000407 name: rRNA_18S def: "A large polynucleotide which functions as a part of the small subunit of the ribosome." [SO:ke] subset: SOFA synonym: "18S rRNA" RELATED [] is_a: SO:0000650 ! small_subunit_rRNA [Term] id: SO:0000409 name: binding_site alt_id: BS:00033 def: "A region on the surface of a molecule that may interact with another molecule. When applied to polypeptides: Amino acids involved in binding or interactions. It can also apply to an amino acid bond which is represented by the positions of the two flanking amino acids." [EBIBS:GAR, SO:ke] comment: Discrete. subset: biosapiens subset: SOFA synonym: "binding_or_interaction_site" EXACT [] synonym: "site" RELATED [] is_a: SO:0000001 ! region [Term] id: SO:0000412 name: restriction_fragment def: "Any of the individual polynucleotide sequences produced by digestion of DNA with a restriction endonuclease." [http://www.agron.missouri.edu/cgi-bin/sybgw_mdb/mdb3/Term/119] subset: SOFA is_a: SO:0000695 ! reagent [Term] id: SO:0000413 name: sequence_difference def: "A region where the sequence differs from that of a specified sequence." [SO:ke] subset: SOFA is_a: SO:0000700 ! remark [Term] id: SO:0000418 name: signal_peptide alt_id: BS:00159 def: "The signal_peptide is a short region of the peptide located at the N-terminal that directs the protein to be secreted or part of membrane components." 
[http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html] comment: Old def before biosapiens:The sequence for an N-terminal domain of a secreted protein; this domain is involved in attaching nascent polypeptide to the membrane leader sequence. subset: biosapiens subset: SOFA synonym: "signal" RELATED [] synonym: "signal peptide coding sequence" EXACT [] synonym: "signal_peptide" EXACT [] [Term] id: SO:0000419 name: mature_protein_region alt_id: BS:00149 def: "The extent of a polypeptide chain in the mature protein." [EBIBS:GAR, http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html] comment: This term mature peptide, merged with the biosapiens term mature protein region and took that to be the new name. Old def: The coding sequence for the mature or final peptide or protein product following post-translational modification. subset: biosapiens subset: SOFA synonym: "chain" EXACT [] synonym: "mature peptide" RELATED [] synonym: "mature_protein_region" EXACT [] is_a: SO:0000839 ! polypeptide_region [Term] id: SO:0000436 name: ARS def: "A sequence that can autonomously replicate, as a plasmid, when transformed into a bacterial host." [SO:ma] subset: SOFA synonym: "autonomously replicating sequence" EXACT [] is_a: SO:0000296 ! origin_of_replication [Term] id: SO:0000454 name: rasiRNA def: "A small, 17-28-nt, small interfering RNA derived from transcripts of repetitive elements." [http://www.developmentalcell.com/content/article/abstract?uid=PIIS1534580703002284] subset: SOFA synonym: "repeat associated small interfering RNA" EXACT [] is_a: SO:0000655 ! ncRNA [Term] id: SO:0000462 name: pseudogenic_region def: "A non-functional descendent of a functional entity." [SO:cjm] subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:0000464 name: decayed_exon def: "A non-functional descendent of an exon." [SO:ke] subset: SOFA is_a: SO:0000462 ! pseudogenic_region relationship: non_functional_homolog_of SO:0000147 ! 
exon [Term] id: SO:0000468 name: golden_path_fragment def: "One of the pieces of sequence that make up a golden path." [SO:rd] subset: SOFA is_a: SO:0000143 ! assembly_component relationship: part_of SO:0000688 ! golden_path [Term] id: SO:0000472 name: tiling_path def: "A set of regions which overlap with minimal polymorphism to form a linear sequence." [CJM:SO] subset: SOFA is_a: SO:0000353 ! assembly [Term] id: SO:0000474 name: tiling_path_fragment def: "A piece of sequence that makes up a tiling_path (SO:0000472)." [SO:ke] subset: SOFA is_a: SO:0000143 ! assembly_component relationship: part_of SO:0000472 ! tiling_path [Term] id: SO:0000483 name: nc_primary_transcript def: "A primary transcript that is never translated into a protein." [SO:ke] subset: SOFA synonym: "noncoding primary transcript" EXACT [] is_a: SO:0000185 ! primary_transcript [Term] id: SO:0000499 name: virtual_sequence def: "A continuous piece of sequence similar to the 'virtual contig' concept of the Ensembl database." [SO:ke] subset: SOFA is_a: SO:0000353 ! assembly [Term] id: SO:0000502 name: transcribed_region def: "A region of sequence that is transcribed. This region may cover the transcript of a gene, it may encompass the sequence covered by all of the transcripts of an alternately spliced gene, or it may cover the region transcribed by a polycistronic transcript. A gene may have 1 or more transcribed regions and a transcribed_region may belong to one or more genes." [SO:ke] comment: This concept came about as a direct result of the SO meeting August 2004.\nThe exact nature of the relationship between transcribed_region and gene is still up for discussion. We are going with 'associated_with' for the time being. subset: SOFA is_obsolete: true [Term] id: SO:0000551 name: polyA_signal_sequence def: "The recognition sequence necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation; consensus=AATAAA." 
[http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html] subset: SOFA synonym: "poly(A) signal" EXACT [] is_a: SO:0005836 ! regulatory_region [Term] id: SO:0000553 name: polyA_site def: "The site on an RNA transcript to which will be added adenine residues by post-transcriptional polyadenylation." [http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html] subset: SOFA is_a: SO:0000699 ! junction relationship: part_of SO:0000233 ! processed_transcript [Term] id: SO:0000577 name: centromere def: "A region of chromosome where the spindle fibers attach during mitosis and meiosis." [SO:ke] subset: SOFA is_a: SO:0000628 ! chromosomal_structural_element [Term] id: SO:0000581 name: cap def: "A structure consisting of a 7-methylguanosine in 5'-5' triphosphate linkage with the first nucleotide of an mRNA. It is added post-transcriptionally, and is not encoded in the DNA." [http://seqcore.brcf.med.umich.edu/doc/educ/dnapr/mbglossary/mbgloss.html] subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:0000587 name: group_I_intron def: "Group I catalytic introns are large self-splicing ribozymes. They catalyse their own excision from mRNA, tRNA and rRNA precursors in a wide range of organisms. The core secondary structure consists of 9 paired regions (P1-P9). These fold to essentially two domains, the P4-P6 domain (formed from the stacking of P5, P4, P6 and P6a helices) and the P3-P9 domain (formed from the P8, P3, P7 and P9 helices). Group I catalytic introns often have long ORFs inserted in loop regions." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00028] subset: SOFA is_a: SO:0000188 ! intron [Term] id: SO:0000588 name: autocatalytically_spliced_intron def: "A self spliced intron." [SO:ke] subset: SOFA is_a: SO:0000188 ! intron is_a: SO:0000374 ! ribozyme [Term] id: SO:0000590 name: SRP_RNA def: "The signal recognition particle (SRP) is a universally conserved ribonucleoprotein. 
It is involved in the co-translational targeting of proteins to membranes. The eukaryotic SRP consists of a 300-nucleotide 7S RNA and six proteins: SRPs 72, 68, 54, 19, 14, and 9. Archaeal SRP consists of a 7S RNA and homologues of the eukaryotic SRP19 and SRP54 proteins. In most eubacteria, the SRP consists of a 4.5S RNA and the Ffh protein (a homologue of the eukaryotic SRP54 protein). Eukaryotic and archaeal 7S RNAs have very similar secondary structures, with eight helical elements. These fold into the Alu and S domains, separated by a long linker region. Eubacterial SRP is generally a simpler structure, with the M domain of Ffh bound to a region of the 4.5S RNA that corresponds to helix 8 of the eukaryotic and archaeal SRP S domain. Some Gram-positive bacteria (e.g. Bacillus subtilis), however, have a larger SRP RNA that also has an Alu domain. The Alu domain is thought to mediate the peptide chain elongation retardation function of the SRP. The universally conserved helix which interacts with the SRP54/Ffh M domain mediates signal sequence recognition. In eukaryotes and archaea, the SRP19-helix 6 complex is thought to be involved in SRP assembly and stabilizes helix 8 for SRP54 binding." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00017] subset: SOFA synonym: "7S RNA" RELATED [] synonym: "signal recognition particle RNA" RELATED [] is_a: SO:0000655 ! ncRNA [Term] id: SO:0000602 name: guide_RNA def: "A short 3'-uridylated RNA that can form a duplex (except for its post-transcriptionally added oligo_U tail (SO:0000609)) with a stretch of mature edited mRNA." [http://www.rna.ucla.edu/index.html] subset: SOFA synonym: "gRNA" EXACT [] is_a: SO:0000655 ! ncRNA [Term] id: SO:0000603 name: group_II_intron def: "Group II introns are found in rRNA, tRNA and mRNA of organelles in fungi, plants and protists, and also in mRNA in bacteria. They are large self-splicing ribozymes and have 6 structural domains (usually designated dI to dVI). 
A subset of group II introns also encode essential splicing proteins in intronic ORFs. The length of these introns can therefore be up to 3kb. Splicing occurs in almost identical fashion to nuclear pre-mRNA splicing with two transesterification steps. The 2' hydroxyl of a bulged adenosine in domain VI attacks the 5' splice site, followed by nucleophilic attack on the 3' splice site by the 3' OH of the upstream exon. Protein machinery is required for splicing in vivo, and long range intron-intron and intron-exon interactions are important for splice site positioning. Group II introns are further sub-classified into groups IIA and IIB which differ in splice site consensus, distance of bulged A from 3' splice site, some tertiary interactions, and intronic ORF phylogeny." [http://www.sanger.ac.uk/Software/Rfam/browse/index.shtml] subset: SOFA is_a: SO:0000188 ! intron [Term] id: SO:0000605 name: intergenic_region def: "The region between two known genes." [SO:ke] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology. subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:0000610 name: polyA_sequence def: "Sequence of about 100 nucleotides of A added to the 3' end of most eukaryotic mRNAs." [SO:ke] subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:0000611 name: branch_site def: "A pyrimidine rich sequence near the 3' end of an intron to which the 5'end becomes covalently bound during nuclear splicing. The resulting structure resembles a lariat." [SO:ke] subset: SOFA synonym: "branch point" EXACT [] synonym: "branch_point" EXACT [] is_a: SO:0000841 ! spliceosomal_intron_region [Term] id: SO:0000612 name: polypyrimidine_tract def: "The polypyrimidine tract is one of the cis-acting sequence elements directing intron removal in pre-mRNA splicing." [http://nar.oupjournals.org/cgi/content/full/25/4/888] subset: SOFA is_a: SO:0000841 ! 
spliceosomal_intron_region [Term] id: SO:0000616 name: transcription_end_site def: "The site where transcription ends." [SO:ke] subset: SOFA is_a: SO:0000699 ! junction is_a: SO:0000835 ! primary_transcript_region [Term] id: SO:0000624 name: telomere def: "A specific structure at the end of a linear chromosome, required for the integrity and maintenance of the end." [SO:ma] subset: SOFA is_a: SO:0000628 ! chromosomal_structural_element [Term] id: SO:0000625 name: silencer def: "Combination of short DNA sequence elements which suppress the transcription of an adjacent gene or genes." [http://www.brunel.ac.uk/depts/bio/project/old_hmg/gloss3.htm] subset: SOFA [Term] id: SO:0000627 name: insulator def: "A transcriptional cis regulatory region that when located between a CRM and a gene's promoter prevents the CRM from modulating that gene's expression." [SO:regcreative] subset: SOFA synonym: "insulator element" EXACT [] [Term] id: SO:0000628 name: chromosomal_structural_element subset: SOFA is_a: SO:0000830 ! chromosome_part [Term] id: SO:0000643 name: minisatellite def: "A repetitive sequence spanning 500 to 20,000 base pairs (a repeat unit is 5 - 30 base pairs)." [http://www.rerf.or.jp/eigo/glossary/minisate.htm] subset: SOFA is_a: SO:0000705 ! tandem_repeat [Term] id: SO:0000644 name: antisense_RNA def: "Antisense RNA is RNA that is transcribed from the coding, rather than the template, strand of DNA. It is therefore complementary to mRNA." [SO:ke] subset: SOFA is_a: SO:0000655 ! ncRNA [Term] id: SO:0000645 name: antisense_primary_transcript def: "The reverse complement of the primary transcript." [SO:ke] subset: SOFA is_a: SO:0000185 ! primary_transcript [Term] id: SO:0000646 name: siRNA def: "A small RNA molecule that is the product of a longer exogenous or endogenous dsRNA, which is either a bimolecular duplex or very long hairpin, processed (via the Dicer pathway) such that numerous siRNAs accumulate from both strands of the dsRNA. 
siRNAs trigger the cleavage of their target molecules." [PMID:12592000] subset: SOFA synonym: "small interfering RNA" EXACT [] is_a: SO:0000655 ! ncRNA [Term] id: SO:0000649 name: stRNA def: "Non-coding RNAs of about 21 nucleotides in length that regulate temporal development; first discovered in C. elegans." [PMID:11081512] subset: SOFA synonym: "small temporal RNA" EXACT [] is_a: SO:0000655 ! ncRNA [Term] id: SO:0000650 name: small_subunit_rRNA subset: SOFA is_a: SO:0000252 ! rRNA [Term] id: SO:0000651 name: large_subunit_rRNA subset: SOFA is_a: SO:0000252 ! rRNA [Term] id: SO:0000652 name: rRNA_5S def: "5S ribosomal RNA (5S rRNA) is a component of the large ribosomal subunit in both prokaryotes and eukaryotes. In eukaryotes, it is synthesised by RNA polymerase III (the other eukaryotic rRNAs are cleaved from a 45S precursor synthesised by RNA polymerase I). In Xenopus oocytes, it has been shown that fingers 4-7 of the nine-zinc finger transcription factor TFIIIA can bind to the central region of 5S RNA. Thus, in addition to positively regulating 5S rRNA transcription, TFIIIA also stabilises 5S rRNA until it is required for transcription." [http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00001] subset: SOFA synonym: "5S rRNA" EXACT [] is_a: SO:0000651 ! large_subunit_rRNA [Term] id: SO:0000653 name: rRNA_28S def: "A component of the large ribosomal subunit." [SO:ke] subset: SOFA synonym: "28S rRNA" EXACT [] is_a: SO:0000651 ! large_subunit_rRNA [Term] id: SO:0000655 name: ncRNA def: "An RNA transcript that does not encode a protein; rather, the RNA molecule is the gene product." [SO:ke] comment: A ncRNA is a processed_transcript, so it may not contain parts such as transcribed_spacer_regions that are removed in the act of processing. For the corresponding primary_transcripts, please see term SO:0000483 nc_primary_transcript. subset: SOFA synonym: "noncoding RNA" EXACT [] is_a: SO:0000233 ! 
processed_transcript [Term] id: SO:0000657 name: repeat_region def: "A region of sequence containing one or more repeat units." [SO:ke] subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:0000658 name: dispersed_repeat def: "A repeat that is located at dispersed sites in the genome." [SO:ke] subset: SOFA synonym: "interspersed repeat" EXACT [] is_a: SO:0000657 ! repeat_region [Term] id: SO:0000662 name: spliceosomal_intron def: "An intron which is spliced by the spliceosome." [SO:ke] subset: SOFA is_a: SO:0000188 ! intron [Term] id: SO:0000667 name: insertion def: "A region of sequence that has been inserted." [SO:ke] subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:0000668 name: EST_match def: "A match against an EST sequence." [SO:ke] subset: SOFA is_a: SO:0000102 ! expressed_sequence_match [Term] id: SO:0000673 name: transcript def: "An RNA synthesized on a DNA or RNA template by an RNA polymerase." [SO:ma] subset: SOFA is_a: SO:0000831 ! gene_member_region [Term] id: SO:0000684 name: nuclease_sensitive_site def: "A region of nucleotide sequence targeted by a nuclease enzyme." [SO:ma] subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:0000687 name: deletion_junction def: "The space between two bases in a sequence which marks the position where a deletion has occurred." [SO:ke] subset: SOFA is_a: SO:0000699 ! junction [Term] id: SO:0000688 name: golden_path def: "A set of subregions selected from sequence contigs which when concatenated form a nonredundant linear sequence." [SO:ls] subset: SOFA is_a: SO:0000353 ! assembly [Term] id: SO:0000689 name: cDNA_match def: "A match against cDNA sequence." [SO:ke] subset: SOFA is_a: SO:0000102 ! expressed_sequence_match [Term] id: SO:0000694 name: SNP def: "SNPs are single base pair positions in genomic DNA at which different sequence alternatives (alleles) exist in normal individuals in some population(s), wherein the least frequent allele has an abundance of 1% or greater." 
[http://www.cgr.ki.se/cgb/groups/brookes/Articles/essence_of_snps_article.pdf] subset: SOFA synonym: "single nucleotide polymorphism" EXACT [] is_a: SO:1000002 ! substitution [Term] id: SO:0000695 name: reagent def: "A sequence used in experiment." [SO:ke] comment: Requested by Lynn Crosby, jan 2006. subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:0000696 name: oligo def: "A short oligonucleotide sequence, of length on the order of 10's of bases; either single or double stranded." [SO:ma] subset: SOFA synonym: "oligonucleotide" EXACT [] is_a: SO:0000695 ! reagent [Term] id: SO:0000699 name: junction def: "A sequence_feature with an extent of zero." [SO:ke] comment: A junction is a boundary between regions. A boundary has an extent of zero. subset: SOFA synonym: "boundary" EXACT [] is_a: SO:0000110 ! sequence_feature [Term] id: SO:0000700 name: remark def: "A comment about the sequence." [SO:ke] subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:0000701 name: possible_base_call_error def: "A region of sequence where the validity of the base calling is questionable." [SO:ke] subset: SOFA is_a: SO:0000413 ! sequence_difference [Term] id: SO:0000702 name: possible_assembly_error def: "A region of sequence where there may have been an error in the assembly." [SO:ke] subset: SOFA is_a: SO:0000413 ! sequence_difference [Term] id: SO:0000703 name: experimental_result_region def: "A region of sequence implicated in an experimental result." [SO:ke] subset: SOFA is_a: SO:0000700 ! remark [Term] id: SO:0000704 name: gene def: "A region (or regions) that includes all of the sequence elements necessary to encode a functional transcript. A gene may include regulatory regions, transcribed regions and/or other functional sequence regions." [SO:immuno_workshop] comment: This term is mapped to MGED. Do not obsolete without consulting MGED ontology.\nA gene may be considered as a unit of inheritance. subset: SOFA is_a: SO:0000001 ! region relationship: member_of SO:0005855 ! 
gene_group [Term] id: SO:0000705 name: tandem_repeat def: "Two or more adjacent copies of a DNA sequence." [http://www.sci.sdsu.edu/~smaloy/Glossary/T.html] subset: SOFA is_a: SO:0000657 ! repeat_region [Term] id: SO:0000706 name: trans_splice_acceptor_site def: "The process that produces mature transcripts by combining exons of independent pre-mRNA molecules. The acceptor site lies on the 3' of these molecules." [SO:ke] subset: SOFA is_a: SO:0000164 ! three_prime_splice_site [Term] id: SO:0000714 name: nucleotide_motif def: "A region of nucleotide sequence corresponding to a known motif." [SO:ke] subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:0000717 name: reading_frame def: "A nucleic acid sequence that when read as sequential triplets, has the potential of encoding a sequential string of amino acids. It need not contain the start or stop codon." [SO:rb] comment: This term was added after a request by SGD. August 2004. Modified after SO meeting in Cambridge to not include start or stop. subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:0000719 name: ultracontig def: "An ordered and oriented set of scaffolds based on somewhat weaker sets of inferential evidence such as one set of mate pair reads together with supporting evidence from ESTs or location of markers from SNP or microsatellite maps, or cytogenetic localization of contained markers." [FB:WG] subset: SOFA is_a: SO:0000353 ! assembly [Term] id: SO:0000724 name: oriT def: "A region of a DNA molecule where transfer is initiated during the process of conjugation or mobilization." [http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html] subset: SOFA synonym: "origin of transfer" EXACT [] is_a: SO:0000296 ! origin_of_replication [Term] id: SO:0000725 name: transit_peptide alt_id: BS:00055 def: "The transit_peptide is a short region at the N-terminal of the peptide that directs the protein to an organelle (chloroplast, mitochondrion, microbody or cyanelle)." 
[http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html] comment: Added to bring SO in line with the embl ddbj genbank feature table. Old definition before biosapiens: The coding sequence for an N-terminal domain of a nuclear-encoded organellar protein. This domain is involved in post translational import of the protein into the organelle. subset: biosapiens subset: SOFA synonym: "signal" RELATED [] synonym: "transit_peptide" EXACT [] [Term] id: SO:0000730 name: gap def: "A gap in the sequence of known length. The unknown bases are filled in with N's." [SO:ke] subset: SOFA is_a: SO:0000143 ! assembly_component relationship: part_of SO:0000353 ! assembly [Term] id: SO:0000752 name: gene_group_regulatory_region subset: SOFA is_a: SO:0005836 ! regulatory_region relationship: member_of SO:0005855 ! gene_group [Term] id: SO:0000777 name: pseudogenic_rRNA comment: Added Jan 2006 to allow the annotation of the pseudogenic rRNA by flybase. subset: SOFA is_a: SO:0000462 ! pseudogenic_region [Term] id: SO:0000778 name: pseudogenic_tRNA comment: Added Jan 2006 to allow the annotation of the pseudogenic tRNA by flybase. subset: SOFA is_a: SO:0000462 ! pseudogenic_region [Term] id: SO:0000830 name: chromosome_part def: "A region of a chromosome." [SO:ke] comment: This is a manufactured term, that serves the purpose of allowing the parts of a chromosome to have an is_a path to the root. subset: SOFA is_a: SO:0000001 ! region relationship: part_of SO:0000340 ! chromosome [Term] id: SO:0000831 name: gene_member_region def: "A region of a gene." [SO:ke] comment: A manufactured term used to allow the parts of a gene to have an is_a path to the root. subset: SOFA is_a: SO:0000001 ! region relationship: member_of SO:0000704 ! gene [Term] id: SO:0000833 name: transcript_region def: "A region of a transcript." [SO:ke] comment: This term was added to provide a grouping term for the region parts of transcript, thus giving them an is_a path back to the root. 
subset: SOFA is_a: SO:0000001 ! region relationship: part_of SO:0000673 ! transcript [Term] id: SO:0000834 name: processed_transcript_region def: "A region of a processed transcript." [SO:ke] comment: A manufactured term to collect together the parts of a processed transcript and give them an is_a path to the root. subset: SOFA is_a: SO:0000833 ! transcript_region [Term] id: SO:0000835 name: primary_transcript_region def: "A region of a primary transcript." [SO:ke] comment: This term was added to provide a grouping term for the region parts of primary_transcript, thus giving them an is_a path back to the root. subset: SOFA is_a: SO:0000833 ! transcript_region relationship: part_of SO:0000185 ! primary_transcript [Term] id: SO:0000836 name: mRNA_region comment: This term was added to provide a grouping term for the region parts of mRNA, thus giving them an is_a path back to the root. subset: SOFA is_a: SO:0000834 ! processed_transcript_region relationship: part_of SO:0000234 ! mRNA [Term] id: SO:0000837 name: UTR_region def: "A region of UTR." [SO:ke] comment: A region of UTR. This term is a grouping term to allow the parts of UTR to have an is_a path to the root. subset: SOFA is_a: SO:0000836 ! mRNA_region relationship: part_of SO:0000203 ! UTR [Term] id: SO:0000839 name: polypeptide_region alt_id: BS:00124 alt_id: BS:00331 def: "Biological sequence region that can be assigned to a specific subsequence of a protein." [SO:GAR, SO:ke] comment: Added to allow the polypeptide regions to have is_a paths back to the root. subset: biosapiens subset: SOFA synonym: "positional" EXACT [] synonym: "positional polypeptide feature" EXACT [] synonym: "region or site annotation" EXACT [] is_a: SO:0000001 ! region relationship: part_of SO:0000104 ! polypeptide [Term] id: SO:0000841 name: spliceosomal_intron_region def: "A region within an intron." [SO:ke] comment: A term added to allow the parts of introns to have is_a paths to the root. subset: SOFA is_a: SO:0000835 ! 
primary_transcript_region relationship: part_of SO:0000662 ! spliceosomal_intron [Term] id: SO:0000842 name: gene_component_region subset: SOFA is_a: SO:0000001 ! region relationship: part_of SO:0000704 ! gene [Term] id: SO:0000851 name: CDS_region subset: SOFA is_a: SO:0000836 ! mRNA_region relationship: part_of SO:0000316 ! CDS [Term] id: SO:0001000 name: rRNA_16S def: "A large polynucleotide which functions as a part of the small subunit of the ribosome." [SO:ke] subset: SOFA synonym: "16S rRNA" RELATED [] is_a: SO:0000650 ! small_subunit_rRNA [Term] id: SO:0001001 name: rRNA_23S def: "A component of the large ribosomal subunit." [SO:ke] subset: SOFA synonym: "23S rRNA" EXACT [] is_a: SO:0000651 ! large_subunit_rRNA [Term] id: SO:0001002 name: rRNA_25S subset: SOFA synonym: "25S rRNA" EXACT [] is_a: SO:0000651 ! large_subunit_rRNA [Term] id: SO:0005836 name: regulatory_region def: "A DNA sequence that controls the expression of a gene." [http://www.genpromag.com/scripts/glossary.asp?LETTER=R] subset: SOFA is_a: SO:0000831 ! gene_member_region [Term] id: SO:0005855 name: gene_group def: "A collection of related genes." [SO:ma] subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:1000002 name: substitution def: "Any change in genomic DNA caused by a single event." [http://www.ebi.ac.uk/mutations/recommendations/mutevent.html] subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:1000005 name: complex_substitution def: "When no simple or well defined DNA mutation event describes the observed DNA change, the keyword \"complex\" should be used. Usually there are multiple equally plausible explanations for the change." [http://www.ebi.ac.uk/mutations/recommendations/mutevent.html] subset: SOFA is_a: SO:1000002 ! substitution [Term] id: SO:1000008 name: point_mutation def: "A single nucleotide change which has occurred at the same position of a corresponding nucleotide in a reference sequence." [SO:immuno_workshop] subset: SOFA is_a: SO:1000002 ! 
substitution [Term] id: SO:1000036 name: inversion def: "A continuous nucleotide sequence is inverted in the same position." [http://www.ebi.ac.uk/mutations/recommendations/mutevent.html] subset: SOFA is_a: SO:0000001 ! region [Term] id: SO:1001284 name: regulon def: "A group of genes, whether linked as a cluster or not, that respond to a common regulatory signal." [ISBN:0198506732] subset: SOFA is_a: SO:0005855 ! gene_group [Term] id: SO:2000061 name: databank_entry def: "The sequence referred to by an entry in a databank such as Genbank or SwissProt." [SO:ke] subset: SOFA synonym: "accession" RELATED [] is_a: SO:0000695 ! reagent [Typedef] id: adjacent_to name: adjacent_to def: "A geometric operator, specified in Egenhofer 1989. Two features meet if they share a junction on the sequence." [SO:ke] subset: SOFA domain: SO:0000110 ! sequence_feature range: SO:0000110 ! sequence_feature [Typedef] id: associated_with name: associated_with comment: This relationship is vague and up for discussion. is_symmetric: true [Typedef] id: derives_from name: derives_from subset: SOFA is_transitive: true [Typedef] id: genome_of name: genome_of [Typedef] id: has_genome_location name: has_genome_location is_obsolete: true [Typedef] id: has_origin name: has_origin [Typedef] id: has_part name: has_part [Typedef] id: has_quality name: has_quality comment: The relationship between a feature and an attribute. [Typedef] id: homologous_to name: homologous_to subset: SOFA is_symmetric: true is_a: similar_to ! similar_to [Typedef] id: member_of name: member_of comment: A subtype of part_of. Inverse is collection_of. Winston, M, Chaffin, R, Herrmann: A taxonomy of part-whole relations. Cognitive Science 1987, 11:417-444. subset: SOFA is_transitive: true [Typedef] id: non_functional_homolog_of name: non_functional_homolog_of def: "A relationship between a pseudogenic feature and its functional ancestor." [SO:ke] subset: SOFA is_a: homologous_to ! 
homologous_to [Typedef] id: orthologous_to name: orthologous_to subset: SOFA is_symmetric: true is_a: homologous_to ! homologous_to [Typedef] id: paralogous_to name: paralogous_to subset: SOFA is_symmetric: true is_a: homologous_to ! homologous_to [Typedef] id: part_of name: part_of namespace: BS subset: SOFA is_transitive: true [Typedef] id: position_of name: position_of [Typedef] id: regulated_by name: regulated_by is_obsolete: true [Typedef] id: sequence_of name: sequence_of [Typedef] id: similar_to name: similar_to subset: SOFA is_symmetric: true [Typedef] id: variant_of name: variant_of def: "A' is a variant (mutation) of A = definition every instance of A' is either an immediate mutation of some instance of A, or there is a chain of immediate mutation processes linking A' to some instance of A." [SO:immuno_workshop] comment: Added to SO during the immunology workshop, June 2007. This relationship was approved by Barry Smith.