MindMap Gallery Biochemistry and Molecular Biology - Eukaryotic Genes and Genomes
People's Medical Publishing House, Ninth Edition "Biochemistry and Molecular Biology" Chapter 11 Eukaryotic genes and genomes, including the genome structural characteristics of prokaryotes, the structure and function of eukaryotic genes, the structural characteristics of eukaryotic genomes, etc.
Edited at 2023-11-07 11:35:28
Eukaryotic genes and genomes
Introduction
Gene concept
The basic unit that carries genetic information and encodes a protein, RNA, or other product with a specific function.
Usually refers to a DNA sequence of a chromosome or genome
coding sequence
Exon
spacer sequence between individual coding sequences
intron
genome concept
The sum of all genetic information in an organism
composition
nuclear chromosomal DNA
mitochondrial DNA
Genome structure characteristics of prokaryotes
Usually consists of a single circular double-stranded DNA molecule (plus plasmids)
There is only one origin of replication in the genome, with an operon structure
No introns, genes are usually contiguous
There are very few repetitive sequences in the genome, and most of the structural genes encoding proteins are single-copy genes.
The genome contains recognition regions with various functions
Replication origin region
Replication termination region
Transcription initiation region
Transcription termination region
other
The coding region is larger than the non-coding region, and the non-coding region is mainly regulatory sequences
Mobile DNA sequences are present in the genome
insertion sequence
transposon
other
Structure and function of eukaryotic genes
Basic structure of eukaryotic genes
coding region
Encoding protein or RNA
Exon
The sequence in a gene sequence that appears on the mature mRNA molecule
non-coding region
spacer sequence between individual coding sequences
intron
Located between exons; the corresponding spacer sequence is removed during mRNA splicing
The 5'-untranslated region and the 3'-untranslated region of the gene after the transcription start point
Features
Each gene has one fewer intron than it has exons.
The number of exons of different genes is different and can vary greatly
The number of exons is one of the important features describing the structure of a gene
Most protein-coding genes in higher eukaryotes have introns (except for histone-coding genes)
The number and size of introns largely determines the size of higher eukaryotic genes
Genes encoding rRNA and some tRNA also have introns
There is a highly conserved sequence at the junction of exons and introns
Most of the 5'-ends of introns start with GT
Most of the 3'-ends of introns end with AG
Recognition signals for RNA splicing
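The two features above (introns usually begin with GT and end with AG, and a gene has one fewer intron than exons) can be sketched in a few lines of Python. This is only an illustration: the function names and the toy intron are hypothetical, and real splice-site recognition depends on more sequence context than the two terminal dinucleotides.

```python
def follows_gt_ag_rule(intron: str) -> bool:
    """Return True if the intron starts with GT (5'-end) and ends with AG (3'-end)."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def intron_count(exon_count: int) -> int:
    """A gene with n exons has n - 1 introns."""
    if exon_count < 1:
        raise ValueError("a gene needs at least one exon")
    return exon_count - 1


# Hypothetical toy intron, not taken from a real gene.
toy_intron = "GTAAGTCTTTCCTTCAG"
print(follows_gt_ag_rule(toy_intron))
print(intron_count(3))
```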
Terminology
upstream
5'-end of a gene
downstream
3'-end of a gene
+1
The base corresponding to the first nucleotide in a gene sequence that initiates RNA synthesis.
Sequences upstream of this base are recorded as negative numbers
Sequences downstream of this base are recorded as positive numbers
Zero is not used to mark base positions
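Because there is no position 0, converting between array offsets and gene positions is a common source of off-by-one errors. The sketch below shows one way to handle it; the function names are hypothetical, and the offset is assumed to be 0-based from the first transcribed base.

```python
def position_label(offset_from_tss: int) -> str:
    """Convert a 0-based offset from the transcription start site to a position label.

    Offset 0 is the first transcribed base (+1); offset -1 is the base
    immediately upstream (-1). No base is labelled 0.
    """
    if offset_from_tss >= 0:
        return f"+{offset_from_tss + 1}"
    return str(offset_from_tss)


def distance_between(pos_a: int, pos_b: int) -> int:
    """Signed number of bases from pos_a to pos_b, skipping the missing zero."""
    if pos_a == 0 or pos_b == 0:
        raise ValueError("zero is not a valid base position")
    distance = pos_b - pos_a
    if (pos_a < 0) != (pos_b < 0):  # the interval crosses the -1/+1 boundary
        distance -= 1 if distance > 0 else -1
    return distance


print(position_label(0))
print(position_label(-1))
print(distance_between(-25, 1))
```

With this convention, a site at -25 lies 25 bases upstream of the +1 base, not 26.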
Split gene
Lecture slide (PPT) definition
Eukaryotic genes consist of several coding and non-coding regions that are separated from one another yet arranged as a continuous mosaic, and together they encode a complete protein made of a continuous chain of amino acids; such a gene is called a split gene
Medical Encyclopedia
Eukaryotic structural genes consist of several coding and non-coding regions that are spaced apart yet arranged as a continuous mosaic. After the non-coding regions are removed and the coding regions are rejoined, a complete protein made of a continuous chain of amino acids can be translated. These genes are called split genes.
In genetics, genes that encode proteins are usually called structural genes. The structural genes of eukaryotes are split genes.
gene coding sequence
Transcription product
mRNA
Translation produces polypeptide chain
specific RNA
rRNA
tRNA
small RNA
other
Features
The coding sequence of a gene determines the sequence and function of its product
The primary structure of DNA determines the primary structure of its transcription product, RNA molecule
A mutation or change in one base in the coding sequence may cause important changes in gene function
Codons are degenerate
Changes in the transcription start site or the presence of alternative splicing in the mRNA can affect the protein polypeptide chain
encode different protein polypeptide chains
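Codon degeneracy explains why some single-base changes leave the protein unchanged while others alter it. The minimal sketch below illustrates this with four codons from the standard genetic code; the table is deliberately partial and the function name is hypothetical.

```python
# Partial codon table (standard genetic code); only the codons used below.
CODON_TABLE = {
    "GAA": "Glu",
    "GAG": "Glu",
    "GTA": "Val",
    "GTG": "Val",
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Label a codon change as synonymous (same amino acid) or missense."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    return "synonymous" if ref_aa == alt_aa else "missense"


# Third-base change: both codons encode Glu, so the product is unchanged.
print(classify_substitution("GAG", "GAA"))
# Second-base change: Glu becomes Val, so the polypeptide changes.
print(classify_substitution("GAG", "GTG"))
```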
gene regulatory sequences
definition
Regions flanking the transcribed region of a gene that regulate gene expression. Because they are DNA sequences immediately adjacent to the gene, they are also called flanking sequences or cis-acting elements.
type
Promoter
definition
The sequence on the DNA molecule that mediates the binding of RNA polymerase and the formation of the transcription initiation complex
Features
Usually located upstream of the transcription start site and not transcribed
The promoters of tRNA genes are located downstream of the transcription start site and are transcribed
There are three main types of promoters in eukaryotes
Corresponds to three different RNA polymerases and related proteins
Class I promoter
Genes with class I promoters are mainly genes encoding rRNA
Includes 5.8S, 18S, and 28S rRNA
Features
Rich in GC base pairs
composition
core promoter
upstream promoter element
Class II promoter
Genes with class II promoters are mainly genes that can transcribe mRNA and encode proteins and some snRNA genes.
Features
Has the characteristic structure of TATA box
composition
TATA box
Around -25 bp; core sequence TATA(A/T)A(A/T); binds the TATA-binding protein (TBP) to start gene transcription
Upstream regulatory elements (such as enhancers and initiation elements)
Some class II promoters also have characteristic upstream sequences such as the CAAT box and GC box, which together make up the promoter.
GC box
Around -50 bp; core sequence CCGCC; binds the transcription factor SP1 to promote transcription
CAAT box
Around -70 bp; core sequence GGNCAATCT; binds C/EBP to regulate transcription efficiency
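The TATA box core sequence TATA(A/T)A(A/T) maps directly onto a regular expression. The sketch below scans a promoter region and reports each hit relative to the transcription start site. The toy sequence, the function name, and the `tss_index` parameter (the 0-based index of the +1 base) are all hypothetical, and a consensus match alone does not show that a site is functional.

```python
import re

# TATA(A/T)A(A/T) written as a regular expression.
TATA_BOX = re.compile(r"TATA[AT]A[AT]")


def find_tata_boxes(sequence: str, tss_index: int) -> list[tuple[int, str]]:
    """Return (position, matched sequence) pairs, with positions relative to +1.

    tss_index is the 0-based index of the +1 base; there is no position 0.
    """
    hits = []
    for match in TATA_BOX.finditer(sequence.upper()):
        offset = match.start() - tss_index
        position = offset + 1 if offset >= 0 else offset
        hits.append((position, match.group()))
    return hits


# Hypothetical toy promoter: 30 upstream bases followed by transcribed bases.
upstream = "GCGCG" + "TATAAAA" + "GGCCGCGGCAGCGCCGGC"
transcribed = "ACTGCTGA"
print(find_tata_boxes(upstream + transcribed, tss_index=len(upstream)))
```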
Class III promoter
Genes with class III promoters include those for 5S rRNA, tRNA, U6 snRNA (small nuclear RNA), etc.
composition
Contains A, B and C boxes
upstream regulatory elements
enhancer, silencer
enhancer
definition
Cis-acting elements that enhance the activity of eukaryotic promoters. They are the most important regulatory sequences of eukaryotic genes and determine the expression level of each gene in the cell.
Features
Can act in either orientation and at any position
Mostly located upstream
Different enhancers bind different regulatory proteins
super enhancer
definition
It is a type of cis-regulatory element with strong transcription activation properties.
It is a large cluster of transcriptionally active enhancers that is highly enriched in key transcription factors, cofactors, etc. Genes regulated by super enhancers are therefore expressed at much higher levels than genes regulated by ordinary enhancers, and they are more sensitive to the blocking of transcription factors.
Silencer
definition
Specific DNA sequences that can inhibit gene transcription, combine with trans-acting factors to repress gene transcription, and silence genes.
insulator
definition
An element that plays an important role in transcriptional regulation, either by blocking the action of enhancers on promoters or by protecting genes from nearby chromatin environments.
possible mechanism
By affecting the three-dimensional structure of chromatin, such as DNA bending or forming loops
Tailing signal (polyadenylation signal)
definition
There is a conserved AATAAA sequence in the last exon of the structural gene, and there is a GT-rich region downstream of this site. These two sequences together form a poly(A) tailing signal.
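The two parts of the tailing signal can be located with a simple scan: find AATAAA, then measure how GT-rich the sequence just downstream is. This is a minimal sketch; the function name and the window length are hypothetical choices, and the source gives no numeric cutoff for "GT-rich", so the code reports the G+T fraction rather than deciding.

```python
POLY_A_HEXAMER = "AATAAA"


def find_tailing_signal(sequence: str, window: int = 30):
    """Locate the first AATAAA and the G+T fraction of the bases downstream of it.

    Returns (index of AATAAA, G+T fraction of the downstream window),
    or None if no AATAAA is present. `window` is an illustrative assumption.
    """
    seq = sequence.upper()
    index = seq.find(POLY_A_HEXAMER)
    if index == -1:
        return None
    downstream = seq[index + len(POLY_A_HEXAMER): index + len(POLY_A_HEXAMER) + window]
    if not downstream:
        return index, 0.0
    gt_fraction = sum(base in "GT" for base in downstream) / len(downstream)
    return index, gt_fraction


# Hypothetical toy sequence: AATAAA followed by a run of G and T.
print(find_tailing_signal("CCC" + "AATAAA" + "GTGTGTTGTG", window=10))
```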
cell signaling response element
definition
It is a type of DNA sequence that can mediate the response of genes to certain signals outside the cell.
Example
glucocorticoid response element
other
Structure and function of eukaryotic genomes
The concept of genome
The sum of all genetic information in an organism
Structural features of eukaryotic genomes
The proportion of gene coding sequences is much smaller than that of non-coding sequences
Coding sequences occupy only about 1% of the genome
Contains a large number of repetitive sequences
In humans, repetitive sequences can make up more than 50% of the genome
Classification based on repetition frequency of repeated sequences
highly repetitive sequence
Thousands to millions of copies
Classified according to structural characteristics
inverted repeat sequence
Complementary copies of the same sequence are arranged in reverse order on the DNA strand
300bp or slightly shorter, accounting for 5% of the human genome
Scattered rather than clustered throughout the genome
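An inverted repeat can be recognized computationally: one arm is the reverse complement of the other, with or without a spacer between them. The sketch below checks for such a pair. The function names, the arm length, and the spacer length are hypothetical, and the toy sequences are far shorter than the roughly 300 bp repeats described above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_inverted_repeat(seq: str, arm_length: int, spacer_length: int) -> bool:
    """Return True if some arm is followed, after the spacer, by its reverse complement."""
    seq = seq.upper()
    span = 2 * arm_length + spacer_length
    for start in range(len(seq) - span + 1):
        left = seq[start:start + arm_length]
        right = seq[start + arm_length + spacer_length:start + span]
        if right == reverse_complement(left):
            return True
    return False


# Hypothetical toy sequences: the first is an inverted repeat, the second a direct repeat.
print(has_inverted_repeat("TTACGAAAACGTAA", arm_length=5, spacer_length=4))
print(has_inverted_repeat("TTACGAAAATTACG", arm_length=5, spacer_length=4))
```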
satellite DNA
Classification
satellite DNA
minisatellite DNA
Microsatellite DNA
Features
Separates from the main DNA band during density gradient centrifugation
Low GC content in base composition
Have different buoyant densities
Distributed in the centromere region of chromosomes
Accounts for more than 10% of the human genome
usually not transcribed
Function
Participates in regulation at the level of replication
Inverted repeats are binding sites for enzymes involved in the initiation of DNA replication
Involved in the regulation of gene expression
Hairpin structures (which some inverted repeats can form) help stabilize RNA molecules
Participate in chromosome pairing
Synapsis between homologous chromosomes may rely on specific satellite DNA sequences with chromosomal specificity
related to evolution
Highly repetitive nucleotide sequences are species specific
Moderately repetitive sequence
Tens to thousands of copies
Classification based on length of repeat sequence
Short interspersed nuclear elements (SINEs; short interspersed repeats)
Short repetitive sequences scattered throughout the genome
The average length is 300-500 bp, and the copy number can reach hundreds of thousands.
Examples include the Alu, KpnI, and Hinf families
Long interspersed nuclear elements (LINEs; long interspersed repeats)
Long repetitive sequences scattered throughout the genome
The average length is more than 1000 bp
Often have transposition activity
DNA transposition is a rearrangement of genetic material mediated by a mobile element (transposable element).
Function
Does not encode a protein and functions like a highly repetitive sequence
rRNA genes in eukaryotes are moderately repetitive sequences
rRNA genes usually exist in clusters rather than scattered throughout the genome
Such regions are called rDNA regions
For example, the nucleolar organizing region of a chromosome is the rDNA region.
Low-repetitive sequences (single-copy sequences)
once or several times
Features
Most protein-coding genes fall into this category
Usually arranged alternately with repeated sequences
Approximately 60-65% of the sequences in the human genome fall into this category
A huge amount of genetic information is stored in low-repetitive sequences, which encode proteins with various functions.
Existence of multigene families and pseudogenes
multigene family
definition
It is a group of structurally similar and functionally related genes produced by duplication and mutation of an ancestral gene.
Classification
distributed on one chromosome
Function simultaneously to synthesize certain proteins
Example
Histone gene family
distributed on different chromosomes
Different members encode a group of functionally closely related proteins
Example
human globin gene family
alpha globin gene cluster
beta globin gene cluster
gene superfamily
definition
A larger gene family derived from a common ancestral gene through various mutations, composed of a large number of genes with roughly the same structure but different functions.
Example
Immunoglobulin gene superfamily
Many cell-surface molecules and certain other proteins in the body have a polypeptide folding pattern similar to that of Ig, and they show high homology with the Ig V region or C region at both the DNA and amino acid sequence levels. The products encoded by this gene superfamily are called the immunoglobulin superfamily (IGSF)
Pseudogene
definition
A DNA sequence present in the genome that is very similar to a normal gene but is generally not expressed
Classification
Unprocessed pseudogene
Contains introns
processed pseudogene
no introns
Speculated source
The mature mRNA generated by gene transcription is reverse transcribed to produce cDNA, which is then integrated into chromosomal DNA and may become a pseudogene.
Alternative splicing is widespread
60% of genes undergo alternative splicing after transcription
80% of alternative splicing events change the protein sequence
Mitochondrial DNA (mtDNA)
Structure similar to prokaryotic DNA
Circular molecule
human mtDNA
Full length 16,569 bp
Encodes 37 genes
13 polypeptide genes encoding respiratory chain multi-enzyme systems
22 mt-tRNA genes
2 mt-rRNA genes
The human genome contains approximately 20,000 protein-coding genes