
LWBK274-FM_i-xiv.qxd  06/02/2009  04:57 PM  Page xiv Aptara



Chapter 1

The Human Nuclear Genome

I. GENERAL FEATURES (Figure 1-1)

A. The human genome refers to the haploid set of chromosomes (nuclear plus mitochondrial), which is divided into the very complex nuclear genome and the relatively simple mitochondrial genome (discussed in Chapter 6).

B. The human nuclear genome consists of 24 different chromosomes (22 autosomes; X and Y sex chromosomes). The human nuclear genome codes for about 30,000 genes (the precise number is uncertain), which make up 2% of the human nuclear genome.

C. There are 27,000 protein-coding genes (i.e., they follow the central dogma of molecular biology: DNA is transcribed into mRNA, and the mRNA is translated into protein).

D. There are 3,000 RNA-coding genes (i.e., they do not follow the central dogma of molecular biology: DNA is transcribed into RNA, but the RNA is not translated into protein).

E. The fact that the 30,000 genes make up only 2% of the human nuclear genome means that 2% of the human nuclear genome consists of coding DNA and 98% of the human nuclear genome consists of noncoding DNA.

F. When the Human Genome Project identified 30,000 genes, it was somewhat of a surprise to find such a low number, especially when compared to the genome of the roundworm (Caenorhabditis elegans). The roundworm genome codes for 19,100 protein-coding genes and 1,000 RNA-coding genes. This means that there is no correspondence between the biological complexity of a species and the number of protein-coding genes and RNA-coding genes (i.e., biological complexity does not track the amount of coding DNA). However, there is correspondence between the biological complexity of a species and the amount of noncoding DNA (i.e., biological complexity tracks the amount of noncoding DNA).

G. In order to fully understand how heritable traits (both normal and disease related) are passed down, it is important to understand three aspects of the human nuclear genome, which include the following:

1. Protein-coding genes. For decades, protein-coding genes were enshrined as the sole repository of heritable traits. A mutation in a protein-coding gene caused the formation of an abnormal protein and hence an altered trait or disease. Today, we know that protein-coding genes are not the sole repository of heritable traits and that the situation is more complicated.

2. RNA-coding genes. RNA-coding genes produce active RNAs that can profoundly alter normal gene expression and hence produce an altered trait or disease.


3. Epigenetic control. Epigenetic control involves chemical modification of DNA (e.g., methylation) and chemical modification of histones (e.g., acetylation, phosphorylation, addition of ubiquitin), both of which can profoundly alter normal gene expression and hence produce an altered trait or disease.
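
The gene counts and percentages quoted in items B through F can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: it uses the figures exactly as quoted in the text (which notes that the precise human gene number is uncertain), and every variable and function name in it is hypothetical.

```python
# Gene counts as quoted in the text; the precise human number is uncertain.
HUMAN = {"protein_coding": 27_000, "rna_coding": 3_000}
ROUNDWORM = {"protein_coding": 19_100, "rna_coding": 1_000}

# Percentage of the human nuclear genome that is coding DNA, as quoted in the text.
HUMAN_CODING_PERCENT = 2


def total_genes(counts):
    """Sum the protein-coding and RNA-coding gene counts (hypothetical helper)."""
    return sum(counts.values())


human_total = total_genes(HUMAN)
roundworm_total = total_genes(ROUNDWORM)
human_noncoding_percent = 100 - HUMAN_CODING_PERCENT
gene_ratio = human_total / roundworm_total

print(f"Human genes: {human_total}")
print(f"Roundworm genes: {roundworm_total}")
print(f"Human-to-roundworm gene ratio: {gene_ratio:.2f}")
print(f"Human noncoding DNA: {human_noncoding_percent}%")
```

With these figures the human genome carries only about 1.5 times as many genes as the roundworm genome, which is the modest difference that item F describes as surprising.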

II. PROTEIN-CODING GENES

A. Size. The size of protein-coding genes varies considerably, from the 1.7 kb insulin gene to the 45 kb LDL receptor gene to the 2,400 kb dystrophin gene.

B. Exon-Intron Organization. Exons (expression sequences) are coding regions of a gene with an average size of 200 bp. Introns (intervening sequences) are noncoding regions of a gene with a huge variation in size. A small number of human genes (generally small genes under 10 kb) consist only of exons (i.e., no introns). However, most genes are composed of exons and introns. There is a direct correlation between gene size and intron size (i.e., large genes tend to have large introns).

C. Repetitive DNA Sequences. Repetitive DNA sequences may be found in both exons and introns.

D. Classic Gene Family. A classic gene family is a group of genes that exhibit a high degree of sequence homology over most of the gene length.


BRS Genetics

FIGURE 1-1. Pie chart indicating the organization of the human nuclear genome. Noncoding DNA: transposons (45%), other (44%), heterochromatin (7%). Coding DNA (2%): 30,000 genes (27,000 protein-coding genes and 3,000 RNA-coding genes).


E. Gene Superfamily. A gene superfamily is a group of genes that exhibit a low degree of sequence homology over most of the gene length. However, there is relatedness in the protein function and structure. Examples of gene superfamilies include the immunoglobulin superfamily, globin superfamily, and the G-protein receptor superfamily.

F. Organization of Genes in Gene Families.

1. Single cluster. Genes are organized as a tandem repeated array, by close clustering (where the genes are controlled by a single expression control locus), or by compound clustering (where related and unrelated genes are clustered), all on a single chromosome.

2. Dispersed. Genes are organized in a dispersed fashion at two or more different chromosome locations, all on a single chromosome.

3. Multiple clusters. Genes are organized in multiple clusters at various chromosome locations and on different chromosomes.

G. Unprocessed Pseudogenes, Truncated Genes, Internal Gene Fragments.

1. Gene families are typically characterized by the presence of unprocessed pseudogenes (i.e., defective copies of genes that are not transcribed into mRNA); truncated genes (i.e., portions of genes lacking 5′ or 3′ ends); or internal gene fragments (i.e., internal portions of genes), which are formed by tandem gene duplication.

2. In humans, there is strong selection pressure to maintain the sequence of important genes. So, in order to propagate evolutionary changes, there is a need for gene duplication.

3. The surplus duplicated genes can diverge rapidly, acquire mutations, and either degenerate into nonfunctional pseudogenes or mutate to produce a functional protein that is evolutionarily advantageous.

H. Processed Pseudogenes. Processed pseudogenes arise when a gene is transcribed into mRNA, the mRNA is converted to cDNA by reverse transcriptase, and the cDNA is then integrated into a chromosome. A processed pseudogene is typically not expressed as protein because it lacks a promoter sequence.

I. Retrogene. A retrogene is a processed pseudogene where the cDNA integrates into a chromosome near a promoter sequence by chance. If this happens, then the processed pseudogene will express protein. If selection pressure ensures the continued expression of the processed pseudogene, then the processed pseudogene is considered a retrogene.

J. The Human Proteome. The Human Genome Project has allowed the construction of a number of databases based on the DNA sequences that are shared by multiple proteins and indicate common functions. These databases have been organized by protein family, protein domain, molecular function, and biological process.

1. Protein families. The largest protein family consists of rhodopsin-like G protein-coupled receptors. The second largest protein family consists of protein kinases.

2. Protein domains. The most abundant protein domain is a zinc finger C2H2 type domain.

3. Molecular functions. The most common molecular function of a protein is ligand binding. The second most common molecular function of a protein is enzymatic activity.

4. Biological processes. The most common biological process that proteins are involved in is protein metabolism. The second most common biological process that proteins are involved in is DNA, RNA, and other metabolic processes.

III. RNA-CODING GENES

A. 45S and 5S Ribosomal RNA (rRNA) Genes.

1. The rRNA genes encode rRNAs that are used in protein synthesis.

2. The nucleolar organizing regions are the portions of the short arm of five pairs of chromosomes (i.e., 13, 14, 15, 21, and 22) that contain about 200 copies of rRNA genes, which code for 45S rRNA.


3. The rRNA genes are arranged in tandem repeated clusters (i.e., the repeated genes are located next to each other).

4. RNA polymerase I catalyzes the formation of 45S rRNA.

5. Another set of rRNA genes located outside of the nucleolus is transcribed by RNA polymerase III to form 5S rRNA.

B. Transfer RNA (tRNA) Genes.

1. The tRNA genes encode tRNAs that are used in protein synthesis.

2. There are 497 tRNA genes.

3. The 497 tRNA genes are classified into 49 families based on their anticodon specificity.

C. Small Nuclear RNA (snRNA) Genes.

1. The snRNA genes encode snRNAs that are components of the major GU-AG spliceosome and minor AU-AC spliceosome used in RNA splicing during protein synthesis.

2. The snRNAs are uridine-rich and are named accordingly (e.g., U1 snRNA was the first snRNA to be classified).

3. There are 70 snRNA genes that encode U1 snRNA, U2 snRNA, U4 snRNA, U5 snRNA, and U6 snRNA, which are components of the major GU-AG spliceosome.

D. Small Nucleolar RNA (snoRNA) Genes.

1. The snoRNA genes encode snoRNAs that direct site-specific base modifications in rRNA.

2. The C/D box snoRNAs direct the 2′-O-ribose methylation in rRNA.

3. The H/ACA snoRNAs direct site-specific pseudouridylation (uridine is isomerized to pseudouridine) of rRNA.

E. Regulatory RNA Genes.

1. The regulatory RNA genes encode RNAs that are likened to mRNA because they are transcribed by RNA polymerase II, 7-methylguanosine capped, and polyadenylated.

2. The SRA-1 (steroid receptor activator) RNA gene encodes SRA-1 RNA, which functions as a co-activator of several steroid receptors.

3. The XIST gene encodes XIST RNA, which functions in X chromosome inactivation.

F. XIST Gene.

1. X chromosome inactivation is a process whereby either the maternal X chromosome (X^M) or the paternal X chromosome (X^P) is inactivated, resulting in a heterochromatin structure called the Barr body, which is located along the inside of the nuclear envelope in female cells. This inactivation process overcomes the sex difference in X gene dosage.

2. Males have one X chromosome and are therefore constitutively hemizygous, but females have two X chromosomes.

3. Gene dosage is important because many X-linked proteins interact with autosomal proteins in a variety of metabolic and developmental pathways, so the amount of protein produced by key dosage-sensitive genes needs to be tightly regulated.

4. X chromosome inactivation makes females functionally hemizygous. X chromosome inactivation begins early in embryological development at about the late blastula stage. Whether the X^M or the X^P becomes inactivated is a random and irreversible event. However, once a progenitor cell inactivates the X^M, for example, all the daughter cells within that cell lineage will also inactivate the X^M (the same is true for the X^P). This is called clonal selection and means that all females are mosaics comprising mixtures of cells in which either the X^M or X^P is inactivated.

5. X chromosome inactivation does not inactivate all the genes; 20% of the total genes on the X chromosome escape inactivation. This 20% of genes that remain active includes those genes that have a functional homolog on the Y chromosome (gene dosage is not affected in this case) or those genes where gene dosage is not important.

6. The mechanism of X chromosome inactivation involves two cis-acting DNA sequences called Xic (X-inactivation center) and Xce (X-controlling element).
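
The random choice followed by clonal inheritance described in item 4 can be illustrated with a toy simulation. The Python sketch below is not a model of the molecular mechanism. It assumes, purely for illustration, that each progenitor cell picks the X^M or the X^P with equal probability, that there are 16 progenitor cells, and that each undergoes 5 rounds of division. None of these numbers comes from the text, and all function and variable names are hypothetical.

```python
import random
from collections import Counter


def inactivate_randomly(n_progenitors, rng):
    """Each progenitor cell independently inactivates 'XM' or 'XP'.

    Equal probability for the two outcomes is an assumption of this sketch.
    """
    return [rng.choice(["XM", "XP"]) for _ in range(n_progenitors)]


def expand_clones(progenitors, divisions):
    """Grow each progenitor into a clone of 2**divisions cells.

    Every daughter cell inherits the progenitor's inactive X unchanged,
    reflecting the irreversible, clonally inherited choice described in item 4.
    """
    tissue = []
    for inactive_x in progenitors:
        tissue.extend([inactive_x] * (2 ** divisions))
    return tissue


rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
progenitors = inactivate_randomly(16, rng)
tissue = expand_clones(progenitors, divisions=5)
counts = Counter(tissue)

print("Inactive X per progenitor:", progenitors)
print("Cells with each X inactivated:", dict(counts))
```

Each clone is uniform, but the tissue as a whole can contain both kinds of clone, which is the mosaic pattern the text attributes to all females.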
