File: Genetics [BRS].pdf


Chapter 1

The Human Nuclear Genome

I. GENERAL FEATURES (Figure 1-1)

The human genome refers to the haploid set of chromosomes (nuclear plus mitochondrial), which is divided into the very complex nuclear genome and the relatively simple mitochondrial genome (discussed in Chapter 6).

The human nuclear genome consists of 24 different chromosomes (22 autosomes; X and Y sex chromosomes). The human nuclear genome codes for ≈30,000 genes (precise number is uncertain), which make up 2% of the human nuclear genome.

There are ≈27,000 protein-coding genes (i.e., they follow the central dogma of molecular biology: DNA is transcribed into RNA → mRNA is translated into protein).

There are ≈3,000 RNA-coding genes (i.e., they do not follow the central dogma of molecular biology: DNA is transcribed into RNA → the RNA is not translated into protein).
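The difference between the two gene classes can be sketched in a few lines of Python. This is a toy illustration only: the DNA sequence, the four-entry codon table, and the function names are assumptions made up for the example, and steps such as splicing are not modeled.

```python
# Hypothetical toy example: the sequence, the partial codon table, and the
# function names are illustrative assumptions, not taken from the text.
CODON_TABLE = {
    "AUG": "Met",  # start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "UAA": None,   # stop codon
}


def transcribe(coding_strand_dna: str) -> str:
    """Return the RNA matching the DNA coding strand (T replaced by U)."""
    return coding_strand_dna.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Read codons from position 0 until a stop codon or the end of the RNA."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:  # stop codon reached
            break
        protein.append(amino_acid)
    return protein


mrna = transcribe("ATGTTTGGCTAA")  # protein-coding gene: transcription ...
protein = translate(mrna)          # ... followed by translation
```

For an RNA-coding gene, the process would stop after `transcribe`: the RNA itself is the functional product and `translate` is never applied.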

The fact that the ≈30,000 genes make up only 2% of the human nuclear genome means that 2% of the human nuclear genome consists of coding DNA and 98% of the human nuclear genome consists of noncoding DNA.

When the Human Genome Project identified ≈30,000 genes, it was somewhat of a surprise to find such a low number, especially when compared to the genome of the roundworm (Caenorhabditis elegans). The roundworm genome codes for 19,100 protein-coding genes and 1,000 RNA-coding genes. This means that there is no correspondence between the biological complexity of a species and the number of protein-coding genes and RNA-coding genes (i.e., biological complexity does not track the amount of coding DNA). However, there is correspondence between the biological complexity of a species and the amount of noncoding DNA (i.e., biological complexity tracks the amount of noncoding DNA).
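The size of the gap between the two species can be checked with simple arithmetic. The sketch below uses only the approximate counts quoted above; the dictionary layout and the names are illustrative choices.

```python
# Approximate gene counts as quoted in the text.
GENE_COUNTS = {
    "human": {"protein_coding": 27_000, "rna_coding": 3_000},
    "roundworm": {"protein_coding": 19_100, "rna_coding": 1_000},
}


def total_genes(species: str) -> int:
    """Sum the protein-coding and RNA-coding gene counts for one species."""
    counts = GENE_COUNTS[species]
    return counts["protein_coding"] + counts["rna_coding"]


human_total = total_genes("human")
worm_total = total_genes("roundworm")
ratio = human_total / worm_total  # how many times more genes humans have
```

With these figures the human gene count is only about one and a half times that of the roundworm, which is why the low number was unexpected.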

In order to fully understand how heritable traits (both normal and disease related) are
passed down, it is important to understand three aspects of the human nuclear genome,
which include the following:

1. Protein-coding genes. For decades, protein-coding genes were enshrined as the sole repository of heritable traits. A mutation in a protein-coding gene caused the formation of an abnormal protein and hence an altered trait or disease. Today, we know that protein-coding genes are not the sole repository of heritable traits and that the situation is more complicated.

2. RNA-coding genes. RNA-coding genes produce active RNAs that can profoundly alter normal gene expression and hence produce an altered trait or disease.


3. Epigenetic control. Epigenetic control involves chemical modification of DNA (e.g., methylation) and chemical modification of histones (e.g., acetylation, phosphorylation, addition of ubiquitin), both of which can profoundly alter normal gene expression and hence produce an altered trait or disease.

II. PROTEIN-CODING GENES

A. Size. The size of protein-coding genes varies considerably, from the 1.7 kb insulin gene → 45 kb LDL receptor gene → 2,400 kb dystrophin gene.
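To put the range in perspective, the fold difference between the smallest and largest of these three genes can be computed directly. The sizes are the ones quoted above; the variable names are illustrative.

```python
# Gene sizes in kilobases (kb), as quoted in the text.
GENE_SIZES_KB = {"insulin": 1.7, "LDL receptor": 45, "dystrophin": 2_400}

smallest = min(GENE_SIZES_KB, key=GENE_SIZES_KB.get)
largest = max(GENE_SIZES_KB, key=GENE_SIZES_KB.get)

# Ratio of the largest gene size to the smallest gene size.
fold_range = GENE_SIZES_KB[largest] / GENE_SIZES_KB[smallest]
```

On these numbers the dystrophin gene is more than 1,400 times the size of the insulin gene.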

B. Exon-Intron Organization. Exons (expression sequences) are coding regions of a gene with an average size of 200 bp. Introns (intervening sequences) are noncoding regions of a gene with a huge variation in size. A small number of human genes (generally small genes, <10 kb) consist only of exons (i.e., no introns). However, most genes are composed of exons and introns. There is a direct correlation between gene size and intron size (i.e., large genes tend to have large introns).

C. Repetitive DNA Sequences. Repetitive DNA sequences may be found in both exons and introns.

D. Classic Gene Family. A classic gene family is a group of genes that exhibit a high degree of sequence homology over most of the gene length.

FIGURE 1-1. Pie chart indicating the organization of the human nuclear genome. Coding DNA: ≈30,000 genes (≈27,000 protein-coding genes; ≈3,000 RNA-coding genes). Noncoding DNA: transposons (≈45%), other (≈44%), and heterochromatin.


E. Gene Superfamily. A gene superfamily is a group of genes that exhibit a low degree of sequence homology over most of the gene length. However, there is relatedness in the protein function and structure. Examples of gene superfamilies include the immunoglobulin superfamily, globin superfamily, and the G-protein receptor superfamily.

F. Organization of Genes in Gene Families.

1. Single cluster. Genes are organized as a tandem repeated array; close clustering (where the genes are controlled by a single expression control locus); and compound clustering (where related and unrelated genes are clustered) all on a single chromosome.

2. Dispersed. Genes are organized in a dispersed fashion at two or more different chromosome locations all on a single chromosome.

3. Multiple clusters. Genes are organized in multiple clusters at various chromosome locations and on different chromosomes.

G. Unprocessed Pseudogenes, Truncated Genes, Internal Gene Fragments.

Gene families are typically characterized by the presence of unprocessed pseudogenes (i.e., defective copies of genes that are not transcribed into mRNA); truncated genes (i.e., portions of genes lacking 5′ or 3′ ends); or internal gene fragments (i.e., internal portions of genes), which are formed by tandem gene duplication.

In humans, there is strong selection pressure to maintain the sequence of important genes. So, in order to propagate evolutionary changes, there is a need for gene duplication.

The surplus duplicated genes can diverge rapidly, acquire mutations, and either degenerate into nonfunctional pseudogenes or mutate to produce a functional protein that is evolutionarily advantageous.

H. Processed Pseudogenes. Processed pseudogenes arise when a gene is transcribed into mRNA, the mRNA is converted to cDNA by reverse transcriptase, and the cDNA is then integrated into a chromosome. A processed pseudogene is typically not expressed as protein because it lacks a promoter sequence.

I. Retrogene. A retrogene is a processed pseudogene where the cDNA integrates into a chromosome near a promoter sequence by chance. If this happens, then the processed pseudogene will express protein. If selection pressure ensures the continued expression of the processed pseudogene, then the processed pseudogene is considered a retrogene.

J. The Human Proteome. The Human Genome Project has allowed the construction of a number of databases based on the DNA sequences that are shared by multiple proteins and indicate common functions. These databases have been organized into various protein families, protein domains, molecular function, and biological process.

1. Protein families. The largest protein family consists of rhodopsin-like G protein-coupled receptors. The second largest protein family consists of protein kinases.

2. Protein domains. The most abundant protein domain is a zinc finger C2H2 type domain.

3. Molecular functions. The most common molecular function of a protein is ligand binding. The second most common molecular function of a protein is enzymatic.

4. Biological processes. The most common biological process that proteins are involved in is protein metabolism. The second most common biological process that proteins are involved in is DNA, RNA, and other metabolic processes.

III. RNA-CODING GENES

A. 45S and 5S Ribosomal RNA (rRNA) Genes.

The rRNA genes encode for rRNAs that are used in protein synthesis.

The nucleolar organizing regions are the portions of the short arm of five pairs of chromosomes (i.e., 13, 14, 15, 21, and 22) that contain about 200 copies of rRNA genes, which code for 45S rRNA.


The rRNA genes are arranged in tandem repeated clusters (i.e., the repeated genes are
located next to each other).

RNA polymerase I catalyzes the formation of 45S rRNA.

Another set of rRNA genes located outside of the nucleolus is transcribed by RNA polymerase III to form 5S rRNA.

B. Transfer RNA (tRNA) Genes.

The tRNA genes encode for tRNAs that are used in protein synthesis.

There are 497 tRNA genes.

The 497 tRNA genes are classified into 49 families based on their anticodon specificity.

C. Small Nuclear RNA (snRNA) Genes.

The snRNA genes encode for snRNAs that are components of the major GU-AG spliceosome and minor AU-AC spliceosome used in RNA splicing during protein synthesis.

The snRNAs are uridine-rich and are named accordingly (i.e., U1snRNA is the first snRNA to be classified).

There are 70 snRNA genes that encode for U1snRNA, U2snRNA, U4snRNA, U5snRNA, and U6snRNA, which are components of the major GU-AG spliceosome.

D. Small Nucleolar RNA (snoRNA) Genes.

The snoRNA genes encode for snoRNAs that direct site-specific base modifications in rRNA.

The C/D box snoRNAs direct the 2′-O-ribose methylation in rRNA.

The H/ACA snoRNAs direct site-specific pseudouridylation (uridine is isomerized to pseudouridine) of rRNA.

E. Regulatory RNA Genes.

The regulatory RNA genes encode for RNAs that are likened to mRNA because they are transcribed by RNA polymerase II, 7-methylguanosine capped, and polyadenylated.

The SRA-1 (steroid receptor activator) RNA gene encodes for SRA-1 RNA that functions as a co-activator of several steroid receptors.

The XIST gene encodes for XIST RNA that functions in X chromosome inactivation.

F. XIST Gene.

X chromosome inactivation is a process whereby either the maternal X chromosome (X^M) or the paternal X chromosome (X^P) is inactivated, resulting in a heterochromatin structure called the Barr body, which is located along the inside of the nuclear envelope in female cells. This inactivation process overcomes the sex difference in X gene dosage.

Males have one X chromosome and are therefore constitutively hemizygous, but females have two X chromosomes.

Gene dosage is important because many X-linked proteins interact with autosomal proteins in a variety of metabolic and developmental pathways, so there needs to be tight regulation of the amount of protein for key dosage-sensitive genes.

X chromosome inactivation makes females functionally hemizygous.

X chromosome inactivation begins early in embryological development at about the late blastula stage.

Whether the X^M or the X^P becomes inactivated is a random and irreversible event. However, once a progenitor cell inactivates the X^M, for example, all the daughter cells within that cell lineage will also inactivate the X^M (the same is true for the X^P). This is called clonal selection and means that all females are mosaics comprising mixtures of cells in which either the X^M or X^P is inactivated.
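The two properties described above (a random choice in each progenitor cell, then faithful inheritance of that choice within the lineage) can be illustrated with a small simulation. This is a simplified sketch, not a biological model: the function name, the equal 50/50 probability, and the cell numbers are assumptions chosen for the example.

```python
import random


def simulate_mosaic(n_progenitors: int, n_divisions: int, seed: int = 0):
    """Return one list of cells per lineage; each cell is labeled with the
    X chromosome ("maternal" or "paternal") that it has inactivated."""
    rng = random.Random(seed)  # seeded so the example is reproducible
    lineages = []
    for _ in range(n_progenitors):
        # Random event: each progenitor picks which X to inactivate
        # (equal probability is an assumption of this sketch).
        inactive_x = rng.choice(["maternal", "paternal"])
        # Irreversible and clonal: every daughter cell keeps the same choice.
        lineages.append([inactive_x] * (2 ** n_divisions))
    return lineages


lineages = simulate_mosaic(n_progenitors=8, n_divisions=3, seed=1)
```

Each lineage is uniform, while the tissue as a whole is a mixture of lineages, which is the sense in which females are described as mosaics.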

X chromosome inactivation does not inactivate all the genes; 20% of the total genes on the X chromosome escape inactivation. This 20% of genes that remain active includes those genes that have a functional homolog on the Y chromosome (gene dosage is not affected in this case) or those genes where gene dosage is not important.

The mechanism of X chromosome inactivation involves two cis-acting DNA sequences called Xic (X-inactivation center) and Xce (X-controlling element).
