Body Expression Map of Human Genome
the 5′-end until it finds a 12mer that fully matches the genome; a matching 12mer is not always found immediately because of errors in the EST sequences. Similarly, the algorithm scans the EST from the 3′-end.
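The end-scanning step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the genome fits in memory as a plain string and that seeds must match exactly. The names `build_kmer_index`, `first_seed_from_5prime`, and `first_seed_from_3prime` are hypothetical.

```python
from collections import defaultdict

K = 12  # seed length mentioned in the text


def build_kmer_index(genome, k=K):
    """Map every k-mer in the genome to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(genome) - k + 1):
        index[genome[i:i + k]].append(i)
    return index


def first_seed_from_5prime(est, index, k=K):
    """Scan the EST from its 5' end and return (est_offset, genome_positions)
    for the first k-mer that matches the genome exactly, or None."""
    for off in range(len(est) - k + 1):
        hits = index.get(est[off:off + k])  # .get() does not create entries
        if hits:
            return off, hits
    return None


def first_seed_from_3prime(est, index, k=K):
    """Same scan, but starting at the 3' end and moving inward."""
    for off in range(len(est) - k, -1, -1):
        hits = index.get(est[off:off + k])
        if hits:
            return off, hits
    return None
```

Because a sequencing error near an end simply makes the first few 12mers miss, the scan moves inward until an exact hit appears. A seed may map to several genome positions, which is why each function returns the whole list of candidates.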
The third step is to align the intervening sequence, that is, the sequence between the 12mers at the start and end. The lower part of the figure illustrates a special case in which our algorithm fractionates an EST into two exons and aligns the exons to the DNA sequence. In general, the method can also process ESTs that have more than two exons as well as intronless ESTs. When there are multiple candidate positions at the start and end, the intervening sequences are examined for every combination of ends. At the intron boundary between two exons, it is necessary to shift the windows of the two exons so that the exon/intron junctions obey the so-called "GT...AG" rule, under which an intron begins with GT and ends with AG.
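The window-shifting idea for a two-exon EST can be sketched as below. This is an illustration under simplifying assumptions, not the authors' algorithm: both exons must match the genome exactly, the genomic start of the first exon (`g_start`) and the exclusive genomic end of the second exon (`g_end`) are already known from the seeds, and the function name `find_gt_ag_splits` is hypothetical.

```python
def find_gt_ag_splits(genome, est, g_start, g_end):
    """Try every cut point in a two-exon EST and keep the cuts whose
    implied intron starts with GT and ends with AG.

    Returns a list of (cut, donor, acceptor) tuples, where est[:cut] aligns
    to genome[g_start:donor] and est[cut:] aligns to genome[acceptor:g_end].
    """
    n = len(est)
    intron_len = (g_end - g_start) - n
    if intron_len < 4:  # no room for a GT...AG intron
        return []
    splits = []
    for cut in range(1, n):
        donor = g_start + cut          # first intron base
        acceptor = donor + intron_len  # first base of exon 2
        if genome[g_start:donor] != est[:cut]:
            continue
        if genome[acceptor:g_end] != est[cut:]:
            continue
        intron = genome[donor:acceptor]
        if intron.startswith("GT") and intron.endswith("AG"):
            splits.append((cut, donor, acceptor))
    return splits
```

Several neighbouring cut points often give an equally good exact alignment, because bases at the end of one exon can repeat at the end of the intron. The GT...AG check is what selects among them.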
3.3 Resolution of EST Orientations by Alignments
Individual ESTs are aligned against both the plus and minus strands of the genome. However, even if an EST aligns to one strand, it might actually have been read from the other strand. Considering combinations of EST alignments on the genome is an effective way to resolve EST orientation. The standard test is to check whether the EST contains poly-A stretches or polyadenylation signals. A poly-T subsequence at the beginning of an EST indicates that the sequence should be reversed and complemented to obtain the correct orientation. However, this check does not always work, since the 3′-end sequences are lacking for many ESTs. Another useful rule of thumb is to look at the introns of each EST alignment and to infer the orientation from the "GT...AG" rule. For example, if the intron boundary is "CT...AC," the alignment needs to be reversed and complemented. However, these two rules cannot resolve the frequently encountered intronless EST alignments that have neither poly-A stretches nor polyadenylation signals. In this case, it is necessary to see whether the alignment overlaps another EST alignment whose strand orientation has already been confirmed, so that the ambiguous EST alignment can be assigned to the confirmed strand.
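The three orientation checks can be chained as in the following sketch. It is only an illustration: the minimum run length of 10 for a poly-A or poly-T stretch is an assumed threshold, the polyadenylation-signal test is omitted, and all function names are hypothetical. A result of "+" means the EST is kept as read, "-" means it should be reverse-complemented, and None means the orientation is still unresolved.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient_by_polya(est, min_run=10):
    """Leading poly-T suggests '-', trailing poly-A suggests '+'.
    min_run is an assumed threshold, not a value from the text."""
    s = est.upper()
    if s.startswith("T" * min_run):
        return "-"
    if s.endswith("A" * min_run):
        return "+"
    return None


def orient_by_introns(introns):
    """GT...AG introns vote '+', CT...AC introns vote '-'.
    Returns a strand only if all informative introns agree."""
    votes = set()
    for intron in introns:
        s = intron.upper()
        if s.startswith("GT") and s.endswith("AG"):
            votes.add("+")
        elif s.startswith("CT") and s.endswith("AC"):
            votes.add("-")
    return votes.pop() if len(votes) == 1 else None


def orient_by_overlap(interval, confirmed):
    """Borrow the strand of overlapping alignments whose orientation is
    already confirmed. confirmed holds (start, end, strand) tuples with
    half-open genome coordinates."""
    start, end = interval
    strands = {st for (s, e, st) in confirmed if s < end and start < e}
    return strands.pop() if len(strands) == 1 else None


def resolve_orientation(est, introns, interval, confirmed):
    """Apply the checks in the order given in the text."""
    return (orient_by_polya(est)
            or orient_by_introns(introns)
            or orient_by_overlap(interval, confirmed))
```

The overlap check comes last because it is the fallback for intronless alignments without a poly-A stretch, and it depends on other alignments having been resolved first.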
4 Use of Human Genome for Observing Gene Expression Patterns
Sequences such as cDNAs, mRNAs, ESTs, and partial fragments of the human genome were the primary sources for designing primers and oligomers for observing gene expression patterns before the human genome was elucidated. Because the human genome contains all of this sequence information, making full use of it may yield novel methods for designing primers and oligomers.
4.1 Identification of Less-frequent Subsequences
Traditional methods of primer design make it difficult to select primers that hybridize at only one position. Although RepBase encompasses well-known repetitive sequences, such as Alu, LTR, and LINE, its coverage of less-frequent repetitive sequences is incomplete. Fortunately,