Mapping of the feature table terms and qualifiers to SO


Index:
Description of file
Features and definitions

Description of file


This file lists each Feature Table term and its equivalent Sequence Ontology term. The definition of each term is given as listed in its own vocabulary. Terms that are not directly mapped to the Sequence Ontology are colored silver. A tab-delimited version of this document can be downloaded here: FT_SO.txt
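
As a minimal sketch of how this mapping might be consumed programmatically, the Python below parses entries in the plain-text layout used in this document (a feature key line, followed by "SO term:", "Feature Table Definition:" and "Sequence Ontology Definition:" lines). It does not read FT_SO.txt, whose column layout is not described here. The function name parse_entries and the shape of the returned dictionary are illustrative assumptions, not part of either vocabulary.

```python
import re

# Matches lines such as "SO term: minus_10_signal SO:0000175"
# and also "SO term: undefined" (no identifier present).
SO_LINE = re.compile(r"^SO term:\s*(?P<name>\S+)(?:\s+(?P<id>SO:\d{7}))?\s*$")


def parse_entries(text):
    """Return {feature_key: (so_name, so_id)}; unmapped keys get (None, None).

    Assumes the layout used in this document: the feature key is the last
    non-blank line before each "SO term:" line.
    """
    mapping = {}
    previous = None  # last non-blank line seen so far
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = SO_LINE.match(line)
        if match and previous is not None:
            name, so_id = match.group("name"), match.group("id")
            if name == "undefined":
                mapping[previous] = (None, None)
            else:
                mapping[previous] = (name, so_id)
        previous = line
    return mapping


sample = """\
-10_signal

SO term: minus_10_signal SO:0000175
Feature Table Definition: Pribnow box;
Sequence Ontology Definition: A conserved region.

C_region

SO term: undefined
Feature Table Definition: constant region
Sequence Ontology Definition:
"""

mapping = parse_entries(sample)
print(mapping["-10_signal"])  # ('minus_10_signal', 'SO:0000175')
print(mapping["C_region"])    # (None, None)
```

Returning (None, None) for unmapped keys keeps them visible to callers instead of silently dropping them.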

Features and definitions


-

SO term: located_sequence_feature SO:0000110
Feature Table Definition: "-" is a placeholder for no key; should be used when the need is merely to mark region in order to comment on it or to use it in another feature's location;
Sequence Ontology Definition: A biological feature that can be attributed to a region of biological sequence.

-10_signal

SO term: minus_10_signal SO:0000175
Feature Table Definition: Pribnow box; a conserved region about 10 bp upstream of the start point of bacterial transcription units which may be involved in binding RNA polymerase; consensus=TAtAaT [1,2,3,4];
Sequence Ontology Definition: A conserved region about 10-bp upstream of the start point of bacterial transcription units which may be involved in binding RNA polymerase; consensus=TAtAaT.

-35_signal

SO term: minus_35_signal SO:0000176
Feature Table Definition: a conserved hexamer about 35 bp upstream of the start point of bacterial transcription units; consensus=TTGACa or TGTTGACA;
Sequence Ontology Definition: A conserved hexamer about 35-bp upstream of the start point of bacterial transcription units; consensus=TTGACa or TGTTGACA.

3'UTR

SO term: three_prime_UTR SO:0000205
Feature Table Definition: region at the 3' end of a mature transcript (following the stop codon) that is not translated into a protein;
Sequence Ontology Definition: A region at the 3' end of a mature transcript (following the stop codon) that is not translated into a protein.

3'clip

SO term: three_prime_clip SO:0000557
Feature Table Definition: 3'-most region of a precursor transcript that is clipped off during processing;
Sequence Ontology Definition: 3'-most region of a precursor transcript that is clipped off during processing.

5'UTR

SO term: five_prime_UTR SO:0000204
Feature Table Definition: region at the 5' end of a mature transcript (preceding the initiation codon) that is not translated into a protein;
Sequence Ontology Definition: A region at the 5' end of a mature transcript (preceding the initiation codon) that is not translated into a protein.

5'clip

SO term: five_prime_clip SO:0000555
Feature Table Definition: 5'-most region of a precursor transcript that is clipped off during processing;
Sequence Ontology Definition: 5' most region of a precursor transcript that is clipped off during processing.

CAAT_signal

SO term: CAAT_signal SO:0000172
Feature Table Definition: CAAT box; part of a conserved sequence located about 75 bp upstream of the start point of eukaryotic transcription units which may be involved in RNA polymerase binding; consensus=GG(C or T)CAATCT [1,2].
Sequence Ontology Definition: Part of a conserved sequence located about 75-bp upstream of the start point of eukaryotic transcription units which may be involved in RNA polymerase binding; consensus=GG(C|T)CAATCT.

CDS

SO term: CDS SO:0000316
Feature Table Definition: coding sequence; sequence of nucleotides that corresponds with the sequence of amino acids in a protein (location includes stop codon); feature includes amino acid conceptual translation.
Sequence Ontology Definition: A contiguous sequence which begins with, and includes, a start codon and ends with, and includes, a stop codon.

C_region

SO term: undefined
Feature Table Definition: constant region of immunoglobulin light and heavy chains, and T-cell receptor alpha, beta, and gamma chains; includes one or more exons depending on the particular chain
Sequence Ontology Definition:

D-loop

SO term: D_loop SO:0000297
Feature Table Definition: displacement loop; a region within mitochondrial DNA in which a short stretch of RNA is paired with one strand of DNA, displacing the original partner DNA strand in this region; also used to describe the displacement of a region of one strand of duplex DNA by a single stranded invader in the reaction catalyzed by RecA protein
Sequence Ontology Definition: Displacement loop; a region within mitochondrial DNA in which a short stretch of RNA is paired with one strand of DNA, displacing the original partner DNA strand in this region; also used to describe the displacement of a region of one strand of duplex DNA by a single stranded invader in the reaction catalyzed by RecA protein.

D_segment

SO term: D_gene SO:0000458
Feature Table Definition: Diversity segment of immunoglobulin heavy chain, and T-cell receptor beta chain;
Sequence Ontology Definition: germline genomic DNA including D-region with 5' UTR and 3' UTR, also designated as D-segment.

GC_signal

SO term: GC_rich_region SO:0000173
Feature Table Definition: GC box; a conserved GC-rich region located upstream of the start point of eukaryotic transcription units which may occur in multiple copies or in either orientation; consensus=GGGCGG;
Sequence Ontology Definition: A conserved GC-rich region located upstream of the start point of eukaryotic transcription units which may occur in multiple copies or in either orientation; consensus=GGGCGG.

J_segment

SO term: undefined
Feature Table Definition: joining segment of immunoglobulin light and heavy chains, and T-cell receptor alpha, beta, and gamma chains;
Sequence Ontology Definition:

LTR

SO term: long_terminal_repeat SO:0000286
Feature Table Definition: long terminal repeat, a sequence directly repeated at both ends of a defined sequence, of the sort typically found in retroviruses;
Sequence Ontology Definition: A sequence directly repeated at both ends of a defined sequence, of the sort typically found in retroviruses.

N_region

SO term: undefined
Feature Table Definition: extra nucleotides inserted between rearranged immunoglobulin segments.
Sequence Ontology Definition:

RBS

SO term: ribosome_entry_site SO:0000139
Feature Table Definition: ribosome binding site;
Sequence Ontology Definition: Region in mRNA where ribosome assembles.

STS

SO term: STS SO:0000331
Feature Table Definition: sequence tagged site; short, single-copy DNA sequence that characterizes a mapping landmark on the genome and can be detected by PCR; a region of the genome can be mapped by determining the order of a series of STSs;
Sequence Ontology Definition: Short (typically a few hundred base pairs) DNA sequence that has a single occurrence in a genome and whose location and base sequence are known.

S_region

SO term: undefined
Feature Table Definition: switch region of immunoglobulin heavy chains; involved in the rearrangement of heavy chain DNA leading to the expression of a different immunoglobulin class from the same B-cell;
Sequence Ontology Definition:

TATA_signal

SO term: TATA_box SO:0000174
Feature Table Definition: TATA box; Goldberg-Hogness box; a conserved AT-rich septamer found about 25 bp before the start point of each eukaryotic RNA polymerase II transcript unit which may be involved in positioning the enzyme for correct initiation; consensus=TATA(A or T)A(A or T) [1,2];
Sequence Ontology Definition: A conserved AT-rich septamer found about 25-bp before the start point of many eukaryotic RNA polymerase II transcript units; may be involved in positioning the enzyme for correct initiation; consensus=TATA(A|T)A(A|T).
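
The consensus notation above translates directly into a regular expression, with each (A|T) alternative written as a character class. The Python below is a minimal sketch of that translation; the function name find_consensus is hypothetical, and a bare consensus match is not by itself evidence of a functional TATA box.

```python
import re

# consensus=TATA(A|T)A(A|T), wrapped in a lookahead so that
# overlapping occurrences are reported as well.
TATA_CONSENSUS = re.compile(r"(?=TATA[AT]A[AT])")


def find_consensus(sequence):
    """Return the 0-based start offset of every consensus match."""
    return [m.start() for m in TATA_CONSENSUS.finditer(sequence.upper())]


print(find_consensus("ggcTATAAATAgcc"))  # [3]
```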

V_region

SO term: undefined
Feature Table Definition: variable region of immunoglobulin light and heavy chains, and T-cell receptor alpha, beta, and gamma chains; codes for the variable amino terminal portion; can be composed of V_segments, D_segments, N_regions, and J_segments;
Sequence Ontology Definition:

V_segment

SO term: undefined
Feature Table Definition: variable segment of immunoglobulin light and heavy chains, and T-cell receptor alpha, beta, and gamma chains; codes for most of the variable region (V_region) and the last few amino acids of the leader peptide;
Sequence Ontology Definition:

attenuator

SO term: attenuator SO:0000140
Feature Table Definition: 1) region of DNA at which regulation of termination of transcription occurs, which controls the expression of some bacterial operons; 2) sequence segment located between the promoter and the first structural gene that causes partial termination of transcription
Sequence Ontology Definition: A sequence segment located between the promoter and a structural gene that causes partial termination of transcription.

conflict

SO term: undefined
Feature Table Definition: independent determinations of the "same" sequence differ at this site or region; Or /compare=[accession-number.sequence-version]
Sequence Ontology Definition:

enhancer

SO term: enhancer SO:0000165
Feature Table Definition: a cis-acting sequence that increases the utilization of (some) eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter;
Sequence Ontology Definition: A cis-acting sequence that increases the utilization of (some) eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter.

exon

SO term: exon SO:0000147
Feature Table Definition: region of genome that codes for portion of spliced mRNA, rRNA and tRNA; may contain 5'UTR, all CDSs and 3' UTR;
Sequence Ontology Definition: A region of the genome that codes for portion of spliced messenger RNA (SO:0000234); may contain 5'-untranslated region (SO:0000204), all open reading frames (SO:0000236) and 3'-untranslated region (SO:0000205).

gap

SO term: gap SO:0000730
Feature Table Definition: gap in the sequence
Sequence Ontology Definition: A gap in the sequence of known length. The unknown bases are filled in with N's.

gene

SO term: gene SO:0000704
Feature Table Definition: region of biological interest identified as a gene and for which a name has been assigned;
Sequence Ontology Definition: A locatable region of genomic sequence, corresponding to a unit of inheritance, which is associated with regulatory regions, transcribed regions and/or other functional sequence regions

iDNA

SO term: iDNA SO:0000723
Feature Table Definition: intervening DNA; DNA which is eliminated through any of several kinds of recombination;
Sequence Ontology Definition: Genomic sequence removed from the genome, as a normal event, by a process of recombination.

intron

SO term: intron SO:0000188
Feature Table Definition: a segment of DNA that is transcribed, but removed from within the transcript by splicing together the sequences (exons) on either side of it;
Sequence Ontology Definition: A segment of DNA that is transcribed, but removed from within the transcript by splicing together the sequences (exons) on either side of it.

mRNA

SO term: mRNA SO:0000234
Feature Table Definition: messenger RNA; includes 5'untranslated region (5'UTR), coding sequences (CDS, exon) and 3'untranslated region (3'UTR);
Sequence Ontology Definition: messenger RNA is the intermediate molecule between DNA and protein. It includes UTR and coding sequences. It does not contain introns.

mat_peptide

SO term: mature_peptide SO:0000419
Feature Table Definition: mature peptide or protein coding sequence; coding sequence for the mature or final peptide or protein product following post-translational modification; the location does not include the stop codon (unlike the corresponding CDS);
Sequence Ontology Definition: The coding sequence for the mature or final peptide or protein product following post-translational modification.

misc_RNA

SO term: transcript SO:0000673
Feature Table Definition: any transcript or RNA product that cannot be defined by other RNA keys (prim_transcript, precursor_RNA, mRNA, 5'clip, 3'clip, 5'UTR, 3'UTR, exon, CDS, sig_peptide, transit_peptide, mat_peptide, intron, polyA_site, rRNA, tRNA, scRNA, and snRNA);
Sequence Ontology Definition: An RNA synthesized on a DNA or RNA template by an RNA polymerase.

misc_binding

SO term: binding_site SO:0000409
Feature Table Definition: site in nucleic acid which covalently or non-covalently binds another moiety that cannot be described by any other binding key (primer_bind or protein_bind);
Sequence Ontology Definition: A region on the surface of a molecule that may interact with another molecule.

misc_difference

SO term: sequence_difference SO:0000413
Feature Table Definition: feature sequence is different from that presented in the entry and cannot be described by any other Difference key (conflict, unsure, old_sequence, variation, or modified_base);
Sequence Ontology Definition: A region where the sequence differs from that of a specified sequence.

misc_feature

SO term: region SO:0000001
Feature Table Definition: region of biological interest which cannot be described by any other feature key; a new or rare feature;
Sequence Ontology Definition: Continuous sequence.

misc_recomb

SO term: recombination_feature SO:0000298
Feature Table Definition: site of any generalized, site-specific or replicative recombination event where there is a breakage and reunion of duplex DNA that cannot be described by other recombination keys or qualifiers of source key (/insertion_seq, /transposon, /proviral);
Sequence Ontology Definition:

misc_signal

SO term: regulatory_region SO:0005836
Feature Table Definition: any region containing a signal controlling or altering gene function or expression that cannot be described by other signal keys (promoter, CAAT_signal, TATA_signal, -35_signal, -10_signal, GC_signal, RBS, polyA_signal, enhancer, attenuator, terminator, and rep_origin).
Sequence Ontology Definition: A DNA sequence that controls the expression of a gene.

misc_structure

SO term: sequence_secondary_structure SO:0000002
Feature Table Definition: any secondary or tertiary nucleotide structure or conformation that cannot be described by other Structure keys (stem_loop and D-loop);
Sequence Ontology Definition: A folded sequence.

modified_base

SO term: modified_base_site SO:0000305
Feature Table Definition: the indicated nucleotide is a modified nucleotide and should be substituted for by the indicated molecule (given in the mod_base qualifier value)
Sequence Ontology Definition: A modified nucleotide, i.e. a nucleotide other than A, T, C, G or (in RNA) U.

old_sequence

SO term: undefined
Feature Table Definition: the presented sequence revises a previous version of the sequence at this location; Or /compare=[accession-number.sequence-version]
Sequence Ontology Definition:

operon

SO term: operon SO:0000178
Feature Table Definition: region containing polycistronic transcript containing genes that encode enzymes that are in the same metabolic pathway and regulatory sequences
Sequence Ontology Definition: A group of contiguous genes transcribed as a single (polycistronic) mRNA from a single regulatory region.

oriT

SO term: origin_of_transfer SO:0000724
Feature Table Definition: origin of transfer; region of a DNA molecule where transfer is initiated during the process of conjugation or mobilization
Sequence Ontology Definition: A region of a DNA molecule where transfer is initiated during the process of conjugation or mobilization.

polyA_signal

SO term: polyA_signal_sequence SO:0000551
Feature Table Definition: recognition region necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation; consensus=AATAAA [1];
Sequence Ontology Definition: The recognition sequence necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation; consensus=AATAAA.

polyA_site

SO term: polyA_site SO:0000553
Feature Table Definition: site on an RNA transcript to which will be added adenine residues by post-transcriptional polyadenylation;
Sequence Ontology Definition: The site on an RNA transcript to which will be added adenine residues by post-transcriptional polyadenylation.

precursor_RNA

SO term: primary_transcript SO:0000185
Feature Table Definition: any RNA species that is not yet the mature RNA product; may include 5' clipped region (5'clip), 5' untranslated region (5'UTR), coding sequences (CDS, exon), intervening sequences (intron), 3' untranslated region (3'UTR), and 3' clipped region (3'clip);
Sequence Ontology Definition: The primary (initial, unprocessed) transcript; includes five_prime_clip (SO:0000555), five_prime_untranslated_region (SO:0000204), open reading frames (SO:0000236), introns (SO:0000188) and three_prime_untranslated_region (three_prime_UTR), and three_prime_clip (SO:0000557).

prim_transcript

SO term: primary_transcript SO:0000185
Feature Table Definition: primary (initial, unprocessed) transcript; includes 5' clipped region (5'clip), 5' untranslated region (5'UTR), coding sequences (CDS, exon), intervening sequences (intron), 3' untranslated region (3'UTR), and 3' clipped region (3'clip);
Sequence Ontology Definition: The primary (initial, unprocessed) transcript; includes five_prime_clip (SO:0000555), five_prime_untranslated_region (SO:0000204), open reading frames (SO:0000236), introns (SO:0000188) and three_prime_untranslated_region (three_prime_UTR), and three_prime_clip (SO:0000557).
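
Because precursor_RNA and prim_transcript both map to primary_transcript (SO:0000185), the mapping from feature keys to SO terms is many-to-one, so an inverse lookup has to return a list of keys rather than a single key. A minimal Python sketch using three entries from this document follows; the variable and function names are illustrative only.

```python
from collections import defaultdict

# Three feature-key -> SO-id pairs taken from the entries in this document.
FT_TO_SO = {
    "mRNA": "SO:0000234",
    "precursor_RNA": "SO:0000185",
    "prim_transcript": "SO:0000185",
}


def invert(ft_to_so):
    """Group feature keys by the SO id they map to (keys sorted by name)."""
    so_to_ft = defaultdict(list)
    for key, so_id in ft_to_so.items():
        so_to_ft[so_id].append(key)
    return {so_id: sorted(keys) for so_id, keys in so_to_ft.items()}


SO_TO_FT = invert(FT_TO_SO)
print(SO_TO_FT["SO:0000185"])  # ['precursor_RNA', 'prim_transcript']
```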

primer_bind

SO term: primer_binding_site SO:0005850
Feature Table Definition: non-covalent primer binding site for initiation of replication, transcription, or reverse transcription; includes site(s) for synthetic e.g., PCR primer elements;
Sequence Ontology Definition: Non-covalent primer binding site for initiation of replication, transcription, or reverse transcription.

promoter

SO term: promoter SO:0000167
Feature Table Definition: region on a DNA molecule involved in RNA polymerase binding to initiate transcription;
Sequence Ontology Definition: The region on a DNA molecule involved in RNA polymerase binding to initiate transcription.

protein_bind

SO term: protein_binding_site SO:0000410
Feature Table Definition: non-covalent protein binding site on nucleic acid;
Sequence Ontology Definition: A region of a molecule that binds to a protein.

rRNA

SO term: rRNA SO:0000252
Feature Table Definition: mature ribosomal RNA; RNA component of the ribonucleoprotein particle (ribosome) which assembles amino acids into proteins.
Sequence Ontology Definition: RNA that comprises part of a ribosome, and that can provide both structural scaffolding and catalytic activity.

repeat_region

SO term: repeat_region SO:0000657
Feature Table Definition: region of genome containing repeating units;
Sequence Ontology Definition: A region of sequence containing one or more repeat units.

repeat_unit

SO term: repeat_unit SO:0000726
Feature Table Definition: single repeat element;
Sequence Ontology Definition: A single repeat element.

satellite

SO term: satellite_DNA SO:0000005
Feature Table Definition: many tandem repeats (identical or related) of a short basic repeating unit; many have a base composition or other property different from the genome average that allows them to be separated from the bulk (main band) genomic DNA;
Sequence Ontology Definition: The many tandem repeats (identical or related) of a short basic repeating unit; many have a base composition or other property different from the genome average that allows them to be separated from the bulk (main band) genomic DNA.

scRNA

SO term: scRNA SO:0000013
Feature Table Definition: small cytoplasmic RNA; any one of several small cytoplasmic RNA molecules present in the cytoplasm and (sometimes) nucleus of a eukaryote;
Sequence Ontology Definition: Any one of several small cytoplasmic RNA molecules present in the cytoplasm and sometimes nucleus of a eukaryote.

sig_peptide

SO term: signal_peptide SO:0000418
Feature Table Definition: signal peptide coding sequence; coding sequence for an N-terminal domain of a secreted protein; this domain is involved in attaching nascent polypeptide to the membrane leader sequence;
Sequence Ontology Definition: The sequence for an N-terminal domain of a secreted protein; this domain is involved in attaching nascent polypeptide to the membrane leader sequence.

snRNA

SO term: snRNA SO:0000274
Feature Table Definition: small nuclear RNA molecules involved in pre-mRNA splicing and processing
Sequence Ontology Definition: Small non-coding RNA in the nucleoplasm.

snoRNA

SO term: snoRNA SO:0000275
Feature Table Definition: small nucleolar RNA molecules mostly involved in rRNA modification and processing;
Sequence Ontology Definition: Small nucleolar RNAs (snoRNAs) are involved in the processing and modification of rRNA in the nucleolus. There are two main classes of snoRNAs: the box C/D class, and the box H/ACA class. U3 snoRNA is a member of the box C/D class. Indeed, the box C/D element is a subset of the six short sequence elements found in all U3 snoRNAs, namely boxes A, A', B, C, C', and D. The U3 snoRNA secondary structure is characterised by a small 5' domain (with boxes A and A'), and a larger 3' domain (with boxes B, C, C', and D), the two domains being linked by a single-stranded hinge. Boxes B and C form the B/C motif, which appears to be exclusive to U3 snoRNAs, and boxes C' and D form the C'/D motif. The latter is functionally similar to the C/D motifs found in other snoRNAs. The 5' domain and the hinge region act as a pre-rRNA-binding domain. The 3' domain has conserved protein-binding sites. Both the box B/C and box C'/D motifs are sufficient for nuclear retention of U3 snoRNA. The box C'/D motif is also necessary for nucleolar localization, stability and hypermethylation of U3 snoRNA. Both box B/C and C'/D motifs are involved in specific protein interactions and are necessary for the rRNA processing functions of U3 snoRNA.

source

SO term: databank_entry SO:2000061
Feature Table Definition: identifies the biological source of the specified span of the sequence; this key is mandatory; more than one source key per sequence is allowed; every entry/record will have, as a minimum, either a single source key spanning the entire sequence or multiple source keys which together span the entire sequence. /mol_type="genomic DNA", "genomic RNA", "mRNA", "tRNA", "rRNA", "snoRNA", "snRNA", "scRNA", "pre-RNA", "other RNA", "other DNA", "unassigned DNA", "unassigned RNA"
Sequence Ontology Definition: The sequence referred to by an entry in a databank such as Genbank or SwissProt.

stem_loop

SO term: stem_loop SO:0000313
Feature Table Definition: hairpin; a double-helical region formed by base-pairing between adjacent (inverted) complementary sequences in a single strand of RNA or DNA.
Sequence Ontology Definition: A double-helical region of nucleic acid formed by base-pairing between adjacent (inverted) complementary sequences.

tRNA

SO term: tRNA SO:0000253
Feature Table Definition: mature transfer RNA, a small RNA molecule (75-85 bases long) that mediates the translation of a nucleic acid sequence into an amino acid sequence;
Sequence Ontology Definition: Transfer RNA (tRNA) molecules are approximately 80 nucleotides in length. Their secondary structure includes four short double-helical elements and three loops (D, anti-codon, and T loops). Further hydrogen bonds mediate the characteristic L-shaped molecular structure. tRNAs have two regions of fundamental functional importance: the anti-codon, which is responsible for specific mRNA codon recognition, and the 3' end, to which the tRNA's corresponding amino acid is attached (by aminoacyl-tRNA synthetases). tRNAs cope with the degeneracy of the genetic code in two manners: having more than one tRNA (with a specific anti-codon) for a particular amino acid; and 'wobble' base-pairing, i.e. permitting non-standard base-pairing at the 3rd anti-codon position.

terminator

SO term: terminator SO:0000141
Feature Table Definition: sequence of DNA located either at the end of the transcript that causes RNA polymerase to terminate transcription;
Sequence Ontology Definition: The sequence of DNA located either at the end of the transcript that causes RNA polymerase to terminate transcription.

transit_peptide

SO term: transit_peptide SO:0000725
Feature Table Definition: transit peptide coding sequence; coding sequence for an N-terminal domain of a nuclear-encoded organellar protein; this domain is involved in post-translational import of the protein into the organelle;
Sequence Ontology Definition: The coding sequence for an N-terminal domain of a nuclear-encoded organellar protein: this domain is involved in post translational import of the protein into the organelle.

unsure

SO term: undefined
Feature Table Definition: author is unsure of exact sequence in this region;
Sequence Ontology Definition:

variation

SO term: sequence_variant SO:0000109
Feature Table Definition: a related strain contains stable mutations from the same gene (e.g., RFLPs, polymorphisms, etc.) which differ from the presented sequence at this location (and possibly others);
Sequence Ontology Definition: A region of sequence where variation has been observed.