LWBK274-FM_i-xiv.qxd 06/02/2009 04:57 PM Page xiv Aptara
1
A. The human genome refers to the haploid set of chromosomes (nuclear plus mitochondrial), which is divided into the very complex nuclear genome and the relatively simple mitochondrial genome (discussed in Chapter 6).
B. The human nuclear genome consists of 24 different chromosomes (22 autosomes plus the X and Y sex chromosomes). The human nuclear genome codes for about 30,000 genes (the precise number is uncertain), which make up about 2% of the human nuclear genome.
C. There are about 27,000 protein-coding genes (i.e., they follow the central dogma of molecular biology: DNA is transcribed into RNA → mRNA is translated into protein).
D. There are about 3,000 RNA-coding genes (i.e., they do not follow the central dogma of molecular biology: DNA is transcribed into RNA → the RNA is not translated into protein).
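The distinction drawn in items C and D can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy sequence, the function names, and the deliberately partial codon table are hypothetical and are not taken from the text. A protein-coding gene passes through both `transcribe` and `translate`, whereas for an RNA-coding gene the output of `transcribe` is already the final product.

```python
# Minimal sketch of the central dogma for a protein-coding gene.
# The codon table is deliberately partial (illustration only).
CODON_TABLE = {
    "AUG": "Met",  # start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "AAA": "Lys",
    "UAA": None,   # stop codons map to None
    "UAG": None,
    "UGA": None,
}


def transcribe(coding_strand: str) -> str:
    """Return the RNA matching the DNA coding strand (T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Translate an mRNA from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon: nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon} is not in this partial table")
        residue = CODON_TABLE[codon]
        if residue is None:  # stop codon reached
            break
        peptide.append(residue)
    return peptide


toy_gene = "ATGTTTGGCAAATAA"  # hypothetical coding-strand sequence
mrna = transcribe(toy_gene)
print(mrna)             # AUGUUUGGCAAAUAA
print(translate(mrna))  # ['Met', 'Phe', 'Gly', 'Lys']
```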
E. The fact that the roughly 30,000 genes make up only about 2% of the human nuclear genome means that about 2% of the human nuclear genome consists of coding DNA and about 98% of the human nuclear genome consists of noncoding DNA.
F. When the Human Genome Project identified about 30,000 genes, it was somewhat of a surprise to find such a low number, especially when compared with the genome of the roundworm (Caenorhabditis elegans). The roundworm genome codes for 19,100 protein-coding genes and 1,000 RNA-coding genes. This means that there is no correspondence between the biological complexity of a species and the number of protein-coding genes and RNA-coding genes (i.e., biological complexity is not proportional to the amount of coding DNA). However, there is a correspondence between the biological complexity of a species and the amount of noncoding DNA (i.e., biological complexity is proportional to the amount of noncoding DNA).
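The comparison in item F is simple arithmetic, sketched below in Python. The gene counts are the approximate figures quoted in the text; the dictionary layout and the function name `total_genes` are illustrative assumptions.

```python
# Approximate gene counts quoted in the text (not exact values).
GENE_COUNTS = {
    "human": {"protein_coding": 27_000, "rna_coding": 3_000},
    "roundworm": {"protein_coding": 19_100, "rna_coding": 1_000},
}

# Approximate share of the human nuclear genome that is coding DNA.
CODING_PERCENT = 2
NONCODING_PERCENT = 100 - CODING_PERCENT


def total_genes(species: str) -> int:
    """Sum protein-coding and RNA-coding genes for one species."""
    counts = GENE_COUNTS[species]
    return counts["protein_coding"] + counts["rna_coding"]


human_total = total_genes("human")     # 30000
worm_total = total_genes("roundworm")  # 20100
ratio = human_total / worm_total

print(f"human genes: {human_total}, roundworm genes: {worm_total}")
print(f"human/roundworm gene ratio: {ratio:.2f}")
print(f"human coding DNA: {CODING_PERCENT}%, noncoding DNA: {NONCODING_PERCENT}%")
```

On these figures the human gene count is only about 1.5 times that of the roundworm, which is the "low number" the text describes as surprising.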
G. In order to fully understand how heritable traits (both normal and disease related) are passed down, it is important to understand three aspects of the human nuclear genome, which include the following:
1. Protein-coding genes.
For decades, protein-coding genes were enshrined as the sole repository of heritable traits. A mutation in a protein-coding gene caused the formation of an abnormal protein and hence an altered trait or disease. Today, we know that protein-coding genes are not the sole repository of heritable traits and that the situation is more complicated.
2. RNA-coding genes.
RNA-coding genes produce active RNAs that can profoundly alter normal gene expression and hence produce an altered trait or disease.
3. Epigenetic control.
Epigenetic control involves chemical modification of DNA (e.g., methylation) and chemical modification of histones (e.g., acetylation, phosphorylation, addition of ubiquitin), both of which can profoundly alter normal gene expression and hence produce an altered trait or disease.
A. Size.
The size of protein-coding genes varies considerably, from the 1.7 kb insulin gene → 45 kb LDL receptor gene → 2,400 kb dystrophin gene.
B. Exon-Intron Organization.
Exons (expression sequences) are coding regions of a gene with an average size of 200 bp. Introns (intervening sequences) are noncoding regions of a gene with a huge variation in size. A small number of human genes (generally small genes of <10 kb) consist only of exons (i.e., no introns). However, most genes are composed of exons and introns. There is a direct correlation between gene size and intron size (i.e., large genes tend to have large introns).
C. Repetitive DNA Sequences.
Repetitive DNA sequences may be found in both exons and introns.
D. Classic Gene Family.
A classic gene family is a group of genes that exhibit a high degree of
sequence homology over most of the gene length.
BRS Genetics
[Figure: Pie chart indicating the organization of the human nuclear genome. Noncoding DNA: transposons ≈45%, other ≈44%, heterochromatin ≈7%. Coding DNA: ≈2% (≈30,000 genes: ≈27,000 protein-coding genes and ≈3,000 RNA-coding genes).]
E. Gene Superfamily.
A gene superfamily is a group of genes that exhibit a low degree of sequence homology over most of the gene length. However, there is relatedness in the protein function and structure. Examples of gene superfamilies include the immunoglobulin superfamily, globin superfamily, and the G-protein receptor superfamily.
F. Organization of Genes in Gene Families.
1. Single cluster.
Genes are organized as a tandem repeated array; close clustering (where the genes are controlled by a single expression control locus); and compound clustering (where related and unrelated genes are clustered), all on a single chromosome.
2. Dispersed.
Genes are organized in a dispersed fashion at two or more different chromosome locations all on a single chromosome.
3. Multiple clusters.
Genes are organized in multiple clusters at various chromosome locations and on different chromosomes.
G. Unprocessed Pseudogenes, Truncated Genes, Internal Gene Fragments.
1. Gene families are typically characterized by the presence of unprocessed pseudogenes (i.e., defective copies of genes that are not transcribed into mRNA); truncated genes (i.e., portions of genes lacking 5′ or 3′ ends); or internal gene fragments (i.e., internal portions of genes), which are formed by tandem gene duplication.
2. In humans, there is strong selection pressure to maintain the sequence of important genes. So, in order to propagate evolutionary changes, there is a need for gene duplication.
3. The surplus duplicated genes can diverge rapidly, acquire mutations, and either degenerate into nonfunctional pseudogenes or mutate to produce a functional protein that is evolutionarily advantageous.
H. Processed Pseudogenes.
Processed pseudogenes are transcribed into mRNA, converted to cDNA
by reverse transcriptase, and then the cDNA is integrated into a chromosome. A processed
pseudogene is typically not expressed as protein because it lacks a promoter sequence.
I. Retrogene.
A retrogene is a processed pseudogene where the cDNA integrates into a chromosome near a promoter sequence by chance. If this happens, then the processed pseudogene will express protein. If selection pressure ensures the continued expression of the processed pseudogene, then the processed pseudogene is considered a retrogene.
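The distinction between a processed pseudogene and a retrogene in items H and I comes down to two conditions, which can be sketched as a small decision function. This is a hedged illustration only: the function name, parameter names, and returned labels are hypothetical, and the sketch assumes the copy has already arisen by the mRNA → cDNA → integration route described above.

```python
from itertools import product


def classify_processed_copy(near_promoter: bool, maintained_by_selection: bool) -> str:
    """Classify an integrated cDNA copy using the two conditions in the text.

    near_promoter: the cDNA happened to integrate near a promoter sequence.
    maintained_by_selection: selection pressure ensures continued expression.
    """
    if not near_promoter:
        # No promoter nearby: the copy is typically not expressed as protein.
        return "processed pseudogene (not expressed)"
    if maintained_by_selection:
        # Expressed and kept by selection: considered a retrogene.
        return "retrogene"
    # Expressed by chance, but not (yet) maintained by selection.
    return "processed pseudogene (expressed)"


# Enumerate every combination of the two conditions.
for near, kept in product([False, True], repeat=2):
    label = classify_processed_copy(near, kept)
    print(f"near_promoter={near}, maintained_by_selection={kept} -> {label}")
```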
J. The Human Proteome.
The Human Genome Project has allowed the construction of a number of databases based on the DNA sequences that are shared by multiple proteins and indicate common functions. These databases have been organized into various protein families, protein domains, molecular functions, and biological processes.
1. Protein families.
The largest protein family consists of rhodopsin-like G protein-coupled receptors. The second largest protein family consists of protein kinases.
2. Protein domains.
The most abundant protein domain is a zinc finger C2H2 type domain.
3. Molecular functions.
The most common molecular function of a protein is ligand binding. The second most common molecular function of a protein is enzymatic activity.
4. Biological processes.
The most common biological process that proteins are involved in is protein metabolism. The second most common biological process that proteins are involved in is DNA, RNA, and other metabolic processes.
A. 45S and 5S Ribosomal RNA (rRNA) Genes.
1. The rRNA genes encode rRNAs that are used in protein synthesis.
2. The nucleolar organizing regions are the portions of the short arm of five pairs of chromosomes (i.e., 13, 14, 15, 21, and 22) that contain about 200 copies of rRNA genes, which code for 45S rRNA.
Chapter 1
The Human Nuclear Genome
3. The rRNA genes are arranged in tandem repeated clusters (i.e., the repeated genes are located next to each other).
4. RNA polymerase I catalyzes the formation of 45S rRNA.
5. Another set of rRNA genes located outside of the nucleolus is transcribed by RNA polymerase III to form 5S rRNA.
B. Transfer RNA (tRNA) Genes.
1. The tRNA genes encode tRNAs that are used in protein synthesis.
2. There are 497 tRNA genes.
3. The 497 tRNA genes are classified into 49 families based on their anticodon specificity.
C. Small Nuclear RNA (snRNA) Genes.
1. The snRNA genes encode snRNAs that are components of the major GU-AG spliceosome and minor AU-AC spliceosome used in RNA splicing during protein synthesis.
2. The snRNAs are uridine-rich and are named accordingly (i.e., U1snRNA is the first snRNA to be classified).
3. There are 70 snRNA genes that encode U1snRNA, U2snRNA, U4snRNA, U5snRNA, and U6snRNA, which are components of the major GU-AG spliceosome.
D. Small Nucleolar RNA (snoRNA) Genes.
1. The snoRNA genes encode snoRNAs that direct site-specific base modifications in rRNA.
2. The C/D box snoRNAs direct the 2′-O-ribose methylation in rRNA.
3. The H/ACA snoRNAs direct site-specific pseudouridylation (uridine is isomerized to pseudouridine) of rRNA.
E. Regulatory RNA Genes.
1. The regulatory RNA genes encode RNAs that resemble mRNA because they are transcribed by RNA polymerase II, 7-methylguanosine capped, and polyadenylated.
2. The SRA-1 (steroid receptor activator) RNA gene encodes SRA-1 RNA, which functions as a co-activator of several steroid receptors.
3. The XIST gene encodes XIST RNA, which functions in X chromosome inactivation.
F. XIST Gene.
1. X chromosome inactivation is a process whereby either the maternal X chromosome (X^M) or paternal X chromosome (X^P) is inactivated, resulting in a heterochromatin structure called the Barr body, which is located along the inside of the nuclear envelope in female cells. This inactivation process overcomes the sex difference in X gene dosage.
2. Males have one X chromosome and are therefore constitutively hemizygous, but females have two X chromosomes.
3. Gene dosage is important because many X-linked proteins interact with autosomal proteins in a variety of metabolic and developmental pathways, so there needs to be tight regulation of the amount of protein for key dosage-sensitive genes.
4. X chromosome inactivation makes females functionally hemizygous. X chromosome inactivation begins early in embryological development at about the late blastula stage. Whether the X^M or the X^P becomes inactivated is a random and irreversible event. However, once a progenitor cell inactivates the X^M, for example, all the daughter cells within that cell lineage will also inactivate the X^M (the same is true for the X^P). This is called clonal selection and means that all females are mosaics comprising mixtures of cells in which either the X^M or X^P is inactivated.
5. X chromosome inactivation does not inactivate all the genes; 20% of the total genes on the X chromosome escape inactivation. This 20% of genes that remain active includes those genes that have a functional homolog on the Y chromosome (gene dosage is not affected in this case) or those genes where gene dosage is not important.
6. The mechanism of X chromosome inactivation involves two cis-acting DNA sequences called Xic (X-inactivation center) and Xce (X-controlling element).
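The mosaicism described in item 4 above (a random choice in each progenitor cell, then clonal inheritance by every daughter cell) can be illustrated with a small simulation. This is a minimal sketch under loud assumptions that are not in the text: each progenitor chooses between the maternal and paternal X with equal probability, every clone undergoes the same number of divisions, and the cell numbers are arbitrary. The function name `simulate_x_inactivation` is hypothetical.

```python
import random
from collections import Counter


def simulate_x_inactivation(n_progenitors: int, divisions: int, seed: int = 0) -> Counter:
    """Count cells by which X chromosome is inactive.

    Assumptions (illustrative only): a 50/50 random choice per progenitor
    and identical clone sizes of 2**divisions cells.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_progenitors):
        # Random event, decided once per progenitor cell.
        inactive = rng.choice(["X^M", "X^P"])
        # Irreversible: all daughter cells in the lineage keep the same choice.
        clone_size = 2 ** divisions
        counts[inactive] += clone_size
    return counts


counts = simulate_x_inactivation(n_progenitors=100, divisions=5)
total = sum(counts.values())
for chromosome, n_cells in sorted(counts.items()):
    print(f"{chromosome} inactive: {n_cells} cells ({n_cells / total:.0%})")
```

Because the choice is made per progenitor and then copied to the whole clone, the result is a patchwork of cell populations, which is the sense in which the text calls all females mosaics.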