The 11th Naito Conference on
Structural Genomics
- Passage to Drug Development -
October 13 - 16, 1999
Shonan Village Center
Kanagawa, Japan

Functional Genomics

Yoshihide Hayashizaki      "Mouse Genome Encyclopedia"

New Technologies - Genome Encyclopedia

Genomic Sciences Center

High throughput sequencing - 40,000 samples / day
Full length cDNA library
        1.    Elongation method
        2.    Selection method
        3.
Secondary (2°) structure of mRNA blocks RT at 42°C.  Increasing temp to 60°C
        + trehalose gives full-length transcripts at 60°C with fewer (or no) partial cDNAs

           Trehalose is a carbohydrate that stabilizes the RT at high temperature.

Suppress abundant cDNA by hybridization with biotinylated RNA and selection with magnetic beads. - subtraction

        Have 71,000 species of mouse genes (~71% of total genome) - 10,000 full-length clones have been sequenced.

Making full-length cDNA array.  7500 species of genes on each block.

    Cell free screening system for Protein - Protein Interactions
        cDNA -> 35S-labeled proteins
                   - attach unlabeled versions to beads
                   - analyze protein-protein interactions for many proteins

Daniel Shoemaker (Rosetta Inpharmatics)        "Using Expression Profiles to Follow Target Activities and
                                                                     Enable Drug Discovery"

Founders: Friend/Hartwell/Hood/Rine

Use genomics/informatics for drug discovery

Standard array technologies
        Affymetrix chips vs. cDNA spotting

        cDNA spotting has lower feature density, but provides two-color detection.  Spotting is laborious, even with robotics.

Rosetta Array Technologies -
        uses inkjet oligonucleotide synthesizer

Put 50,000 oligos down on chip in a few hours.  Could put C. elegans genome down on a chip in several hours.  High flexibility, low cost.

Uses two-color strategy with cell lines that are green/red - get yellow when both bind.
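
A minimal sketch of how a two-color spot might be scored, assuming background-corrected red and green intensities for each spot. The function names, the intensity floor, and the 2-fold cutoff are illustrative assumptions, not anything presented in the talk.

```python
import math

def log2_ratio(red, green, floor=1.0):
    """Return log2(red/green); intensities are floored to avoid log of zero."""
    return math.log2(max(red, floor) / max(green, floor))

def spot_color(red, green, cutoff=1.0):
    """Label a spot by the dominant channel.

    cutoff is in log2 units, so the assumed default of 1.0 means a 2-fold
    difference. Spots with both channels at similar levels are "yellow".
    """
    ratio = log2_ratio(red, green)
    if ratio >= cutoff:
        return "red"
    if ratio <= -cutoff:
        return "green"
    return "yellow"

# Made-up intensities, for illustration only.
print(spot_color(4000, 1000))
print(spot_color(1000, 1000))
print(spot_color(500, 2000))
```

Working in log2 units keeps up- and down-regulation symmetric around zero, which is why the cutoff is applied on both sides.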

Showed chip with entire C. elegans genome on it.  16,332 genes - chip generated in 2 days.  Comparing developmental stages.

Look at differential expression patterns;
    - look at different disease states
    - screen drugs to get similar expression patterns

Try to correlate effects of genetic disruptions with patterns generated by drugs.  Pattern matching.
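
One way to picture the pattern matching is a correlation search: score a drug's expression profile against the profile of each genetic disruption and report the closest one. This is a sketch only. Pearson correlation is my assumption for the similarity measure, and the gene names and log ratios below are made up.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def best_match(drug_profile, disruption_profiles):
    """Return (name, correlation) of the disruption profile closest to the drug."""
    scores = {name: pearson(drug_profile, profile)
              for name, profile in disruption_profiles.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical log2 expression ratios over the same five genes.
disruptions = {
    "geneA_deletion": [2.0, -1.0, 0.0, 1.5, -0.5],
    "geneB_deletion": [-1.0, 2.0, 0.5, -1.5, 0.0],
}
drug = [1.8, -0.9, 0.1, 1.2, -0.4]

name, score = best_match(drug, disruptions)
print(name, round(score, 3))
```

A high correlation with one disruption profile would suggest, but not prove, that the drug acts on that gene's pathway.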

Hierarchical clustering reveals co-regulated genes and conditions which regulate them.
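
A small sketch of hierarchical clustering on expression profiles, to make the idea concrete. Average linkage and a distance of 1 minus Pearson correlation are assumptions on my part (the talk did not say which were used), and the four gene profiles are invented.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def cluster(profiles):
    """Agglomerative clustering; returns the merges in the order they happen.

    Each merge is (members_of_group_1, members_of_group_2, distance).
    """
    def dist(a, b):
        # Average linkage: mean pairwise distance between the two groups.
        pairs = [(i, j) for i in a for j in b]
        return sum(1 - pearson(profiles[i], profiles[j]) for i, j in pairs) / len(pairs)

    groups = [[name] for name in profiles]
    merges = []
    while len(groups) > 1:
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                d = dist(groups[i], groups[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((tuple(groups[i]), tuple(groups[j]), d))
        merged = groups[i] + groups[j]
        groups = [g for k, g in enumerate(groups) if k not in (i, j)] + [merged]
    return merges

# Hypothetical expression levels across four conditions.
genes = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [8.0, 6.0, 4.5, 2.0],
}
for left, right, d in cluster(genes):
    print(left, "+", right, "at distance", round(d, 3))
```

Genes that rise and fall together merge first at small distances; the last merge joins the two anti-correlated groups.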

Microarray analysis in combination with genetic approaches and informatics is a powerful way to correlate drug activities and expression patterns.

Sanjay Tyagi           "Detecting SN Variations with Molecular Beacons"

[Sketch: target strand + molecular beacon.  The beacon is a hairpin probe with a donor (D) and acceptor (A) at the two ends of the stem - a fluorescence energy transfer pair.  When the probe hybridizes to the target, the hairpin opens and D and A end up at opposite ends.]

Literally any fluorophore can be used as donor - each provides a different color, so can look for different sequences together

Can inject into live cells
Injected a red molecular beacon for bicoid - red fluorescence at one end of egg; another green probe for oskar RNA gives green color at other end of egg.  Can detect RNA localization in cell.

Could look for spec
[Sketch: four molecular beacons, one each for T, C, A, and G at the variable position, each with its own donor/acceptor (D/A) pair.  Note to self: draw beacons like this.]

Can specifically recognize single-base substitutions in a polynucleotide, i.e., single-nucleotide polymorphisms
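
A sketch of how readout from four base-specific beacons might be turned into an allele call, assuming one fluorescence reading per beacon. The function name, the normalization to the strongest channel, and the 0.5 threshold are hypothetical choices for illustration, not the speaker's method.

```python
def call_alleles(signals, threshold=0.5):
    """Return the bases whose beacon signal is at least `threshold` times
    the strongest signal.

    signals: dict mapping base ("A", "C", "G", "T") to fluorescence.
    One base returned suggests a homozygote, two suggest a heterozygote.
    An empty list means no beacon gave any signal.
    """
    peak = max(signals.values())
    if peak <= 0:
        return []
    return sorted(base for base, s in signals.items() if s / peak >= threshold)

# Made-up readings in arbitrary fluorescence units.
print(call_alleles({"A": 950, "C": 30, "G": 20, "T": 25}))
print(call_alleles({"A": 500, "C": 20, "G": 480, "T": 15}))
```

Normalizing to the strongest channel avoids needing an absolute fluorescence scale, at the cost of not flagging weak reactions on its own.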

Detecting generation of PCR Product

Made molecular beacon for sequence inside PCR product - can do accurate quantification of cDNAs.

* Should consider this as alternate technology for looking for maternal RNAs *
        - But I guess it is expensive to make many probes

Claim 0.2% error rate for detecting single-nucleotide polymorphisms; comparable to Sanger sequencing

Assay for drug resistance in M. tuberculosis (Tb).  Look for specific mutations that cause drug resistance using five different molecular beacons (each of different color) -- can tell
    1) do you have drug-resistant Tb;
    2) which drug resistant mutant do you have?

Can get reprints from him

For in vivo experiments, use non-natural nucleotides in probes since otherwise the resulting heteroduplex will be recognized by RNase H and destroyed.

Molecular beacons can be purchased from several commercial services
    ~$500/probe.  As low as $100/beacon

Arthur Sands        "Saturation Gene Trapping, Functional Genomics, and Drug Target Validation"

Founder of Lexicon
 

htp gene trapping
- entire genome can be saturated
- analysis of thousands of sequences indicated that ~50% are not in any public db
    Based on engineered retroviral vectors; gene traps insert into genome, can engineer in
        mutagenic events, sequence tag (for gene expression),
        eg htp gene trap vector for human sequence acquisition

    Human Gene Trap Database
        ~130,000 clones

    Also working in pigs, dogs, cats, etc.

Target Protein Family Screening - use unknown protein domains, membrane proteins, etc. to select targets

Talked about target validation using transgenic knockout mouse models

    Phenotype - Driven Drug Target Validation

Lexgen.com = Internet & Genome

will put 70,000 known gene clones on internet - people will be able to sift through these data.
 

From Genome to Structure
 
Chris Sander            "Completing the Map of the Protein Universe"

How many structures do we need to solve?

                 % seq id                     # structures

    Need to solve 1:10 structures / 35% sequence identity
            ~10,000 structures in 5 years to solve all protein structures directly or by homology modeling

Can achieve substantial savings by this "representative" approach

With right kind of strategy, can get substantial return by focusing on families.

    www.structuralgenomics.org
    www.genome3d.org

    Can register work in progress
    Can select targets
    Avoid duplication

    Focused on value of diversity of targets

Sung-Hou Kim            "Structural genomics of a hyperthermophile: a Pilot test"

10^3 - 10^5 genes / organism
20 - 80% of ORFs have unknown functions
25 - 35% of ORFs code for membrane proteins
8 - 20% of functional annotation needs to be checked

Structural genomics - can provide next level of fundamental data
    Showed nice phylogenetic trees of various classes of organisms

Two classes of examples -
    1) discovering molecular function
        when no function is known - hypothetical proteins
            MJ0577 - 18 kD - forms dimer, ATP bound.  New ATP-binding motif

            Showed that biochemical activity of ATP hydrolysis requires binding partner

            MJ0226 - small part looks like tRNA synthetase - found that protein binds various nucleotides -
                could eventually find seq. alignment in context of 3D structure.

    3rd example - 2 domains. One domain is a new fold - second domain looks like an S-adenosylmethionine methyltransferase.

    2nd Class- cellular function known, molecular function not known

        1. Small heat shock protein.

            Structure total surprise - looks like soccer ball - hollow inside - 24-mer multimer.  500,000 MW -
            looks like a chaperone-like protein.

S. Kuramitsu        "Structural and Functional Genomics of Extreme Thermophile"

Study of DNA Repair Enzymes from T. thermophilus.
3 3D structures
Several different proteins have been crystallized.

Proteins stable, easily crystallized.

Whole cell Project
        - sequence determination
        - overproduction
        - structural analysis
        understanding
 
 

Stage                          Number
Synthesize PCR primer          460
Correct ampl. by PCR           194
Overproduction in E. coli      74
Purification                   45
Crystals                       33
3D structure                   24
(unlabeled)                    10
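
The counts above read as a pipeline with attrition at each step, so a quick calculation of step-to-step yield may help. This is a sketch that assumes each label pairs with the number that follows it in the notes. The last, unlabeled "10" is left out because its stage was not recorded.

```python
# Counts per stage, paired label/number in the order they appear in the notes.
# Assumption: each number belongs to the label directly before it.
stages = [
    ("Synthesize PCR primer", 460),
    ("Correct ampl. by PCR", 194),
    ("Overproduction in E. coli", 74),
    ("Purification", 45),
    ("Crystals", 33),
    ("3D structure", 24),
]

def stage_yields(stages):
    """Return (stage, count, percent of previous stage) for each stage after the first."""
    rows = []
    for (_, prev), (name, count) in zip(stages, stages[1:]):
        rows.append((name, count, round(100.0 * count / prev, 1)))
    return rows

def overall_yield(stages):
    """Percent of the first-stage count that reaches the last stage."""
    return round(100.0 * stages[-1][1] / stages[0][1], 1)

for name, count, pct in stage_yields(stages):
    print(f"{name:28s}{count:5d}{pct:7.1f}%")
print("overall:", overall_yield(stages), "%")
```

Under that pairing, the largest losses are at the first two steps (amplification and overproduction), while later steps retain most of what reaches them.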

In working out purification - use "column scouting"

Doing high-level expression and purification.
    combine with other notes.

Will distribute expression plasmids and purified proteins.

G. Montelione            "Automated Analysis of Protein NMR Spectra: Prognosis for Structural and
                                Functional Genomics"

No Notes

H. Nakamura            "3D Modeling and the Applications"

Homology Modeling does not work well for:
    1) weakly homologous regions
    2) loops
    3) other point?

Structural Database - empirical statistical properties
Free Energy Calculations -
 
 

Structural Biology of Infectious Agents
Wayne Hendrickson       "Crystallographic Study of HIV gp120: Viral Evasion Mechanism of the Immune
                                     Response"
Variational crystallization - approach to structural genomics

        John Kendrew did this by getting myoglobin from many species at 200
            got sperm whale myoglobin

        Same approach used here to get crystals of gp120

Had to eliminate some loops to get crystals

Conserved interfacial cavities (2) between gp120 and CD4.  One is filled with water molecules.

Calorimetry shows that thermodynamics is similar for the fully intact glycosylated complex and for the complex used in crystallography.

Entropic penalty is large - indicates significant conformational change upon complex formation.

Conformational flexibility helps gp120 avoid the immune system.

Peter Colman          "New Drugs for Influenza: An Example of Drug Design Against a Moving Target"

Structure-based design of new drugs for influenza

Influenza, like HIV, has a high error rate, making it readily capable of developing drug resistance

Neuraminidase -     only residues in the catalytic site are strain invariant.  Essentially all other residues in
                             protein can change
GG167 - drug that interferes with neuraminidase binding.
            60% ready -     8 days     placebos
                                  4 days    on drug
 

David Stuart        "Crystallographic Study of HIV Reverse Transcriptase: Towards Drug Design"

HIV-1 RT

> 70 structures at 2.15 - 3.1 Å
> 40 compounds
~ 14 mutants

Interdomain flexibility evident by comparing structures from labs of Ox-Well, Rutgers (Arnold), and Yale

Discussed structures of drug-resistant RTs

K. Wüthrich        "New NMR Techniques for Studies of Prion Proteins"

Prion Protein        PrPc - ubiquitous cellular form
                          PrPsc - infectious scrapie form
 
 

Proteins that Bind DNA
Steven Kliewer          "Orphan Nuclear Receptors in Drug Receptors"

Orphan receptors - nuclear receptors homologous to steroid hormone receptors, with no known ligands

PPAR receptors
        peroxisome proliferator-activated receptors
PPARγ (gamma) ligand-binding domain
        - like other nuclear receptors
        - huge cavity
PPARδ - 3D crystal structure
6 new pathways identified by "reverse endocrinology"
 
 
