Functional Genomics
Yoshihide Hayashizaki "Mouse Genome Encyclopedia"
From Genome to Structure
New Technologies - Genome Encyclopedia
Genomic Sciences Center
High throughput sequencing - 40,000 samples / day
Full length cDNA library
1. Elongation method
2. Selection method
3.
2° structure of mRNA blocks RT at 42°C. Increasing temp to 60°C + trehalose: get full-length transcripts at 60°C with fewer (or no) partial cDNAs.
Trehalose is a carbohydrate which stabilizes the RT at high temperature.
Suppress abundant cDNA by hybridization with biotinylated RNA and selection with magnetic beads - subtraction.
Have 71,000 species of mouse gene (~71% of total genome) - full-length sequences of 10,000 clones have been done.
Making full-length cDNA array. 7500 species of genes on each block.
Cell free screening system for Protein - Protein Interactions
cDNA -> 35S-labeled proteins
- attach unlabeled versions to beads
- analyze protein-protein interactions for many proteins
Daniel Shoemaker (Rosetta Inpharmatics) "Using Expression Profiles to Follow Target Activities and Enable Drug Discovery"
Founders: Friend/Hartwell/Hood/Rine
Use genomics/informatics for drug discovery
Standard array technologies
Affymetrix chips vs. cDNA spotting
cDNA spotting has lower feature density, but provides two-color detection. Spotting is laborious, even with robotics.
Rosetta Array Technologies -
uses inkjet oligonucleotide synthesizer
Put 50,000 oligos down on chip in a few hours. Could put C. elegans genome down on a chip in several hours. High flexibility, low cost.
Uses two-color strategy with cell lines that are green/red - get yellow when both bind.
Showed chip with entire C. elegans genome on it. 16,332 genes - chip generated in 2 days. Comparing developmental stages.
Look at differential expression patterns;
- look at different disease states
- screen drugs to get similar expression patterns
Try to correlate effects of genetic disruptions with patterns generated by drugs. Pattern matching.
Hierarchical clustering reveals co-regulated genes and the conditions which regulate them.
Microarray analysis in combination with genetic approaches and informatics is a powerful way to correlate drug activities and expression patterns.
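The pattern-matching idea above (compare a drug-induced expression profile against profiles from genetic disruptions) can be sketched in a few lines. This is only an illustration, not Rosetta's method: it uses plain Pearson correlation rather than hierarchical clustering, and the profile names and numbers are made up.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def best_match(drug_profile, knockout_profiles):
    """Return (name, r) of the disruption whose profile correlates best with the drug."""
    scored = [(name, pearson(drug_profile, prof))
              for name, prof in knockout_profiles.items()]
    return max(scored, key=lambda item: item[1])

# Hypothetical log-ratio profiles over the same five genes (invented numbers).
knockouts = {
    "geneA_deletion": [1.2, -0.8, 0.1, 2.0, -1.5],
    "geneB_deletion": [-0.5, 0.9, 1.1, -1.2, 0.3],
}
drug = [1.0, -0.6, 0.0, 1.8, -1.1]

name, r = best_match(drug, knockouts)
print(name, round(r, 3))
```

A drug whose profile correlates strongly with one disruption is a candidate for acting on that gene product or pathway; real data would need normalization and many more genes.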
Sanjay Tyagi "Detecting SN Variations with Molecular Beacons"
_________________ Target
+
_
( )
|_|
|_|
|_| Molecular Beacon - fluorescence energy transfer pair
|_|
| |
D A
___________________
| | | | | | | | |
/ \
D A
Literally any fluorophore can be used as donor - each provides a different color, so can look for different sequences together.
Can inject into live cells.
Injected a red molecular beacon for bicoid - red fluorescence at one end of egg; another green probe for oskar RNA gives green color at other end of egg. Can detect RNA localization in cell.
Could look for spec
T C A G
__ __ __ __ <- Draw B like this
( ) ( ) ( ) ( )
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
D A D A D A D A
Can specifically recognize very specific nucleotide substitutions, i.e. single nucleotide polymorphisms.
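The four-beacon sketch above (one beacon per base, T/C/A/G) suggests a simple readout: the beacon that lights up identifies the base at the variable position. A minimal sketch of such a call follows. The function name, the signal values, and the `min_ratio` cutoff are all hypothetical and were not given in the talk.

```python
def call_base(signals, min_ratio=3.0):
    """Call the base whose beacon fluoresces most strongly.

    signals: dict mapping base ('T', 'C', 'A', 'G') to background-corrected
             fluorescence of the beacon specific for that base.
    min_ratio: hypothetical cutoff; the top signal must exceed the runner-up
               by this factor, otherwise the call is ambiguous (None).
    """
    ranked = sorted(signals.items(), key=lambda item: item[1], reverse=True)
    (top_base, top), (_, second) = ranked[0], ranked[1]
    if top <= 0:
        return None          # nothing hybridized
    if second > 0 and top / second < min_ratio:
        return None          # two beacons lit up; no clean call
    return top_base

clear = call_base({"T": 5, "C": 820, "A": 12, "G": 9})
unclear = call_base({"T": 400, "C": 420, "A": 10, "G": 8})
print(clear, unclear)
```

An ambiguous result is returned as `None` rather than guessed, so it can be flagged for a repeat or for sequencing.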
Detecting generation of PCR Product
Made molecular beacon for sequence inside PCR product - can do accurate quantification of cDNAs.
* Should consider this as alternate technology for looking for maternal RNAs *
- But I guess it is expensive to make many probes
Claim 0.2% error rate for detecting single-nucleotide polymorphisms; comparable to Sanger sequencing.
Assay for drug resistance in M. tuberculosis. Look for specific mutations that cause drug resistance using five different molecular beacons (each of a different color) -- can tell
1) do you have drug-resistant Tb;
2) which drug-resistant mutant do you have?
Can get reprints from him.
For in vivo experiments, use non-natural nucleotides in probes since otherwise the resulting heteroduplex will be recognized by RNase H and destroyed.
Molecular beacons can be purchased from several commercial services
~$500/probe. As low as $100/beacon.
Arthur Sands "Saturation Gene Trapping, Functional Genomics, and Drug Target Validation"
Founder of Lexicon
High-throughput (htp) gene trapping
Target Protein Family Screening - use unknown protein domains, membrane proteins, etc. to select targets
- entire genome can be saturated
Based on engineered retroviral vectors; gene traps insert into genome, can engineer in mutagenic events, sequence tag (for gene expression),
- analysis of thousands of sequences indicated that ~50% are not in any public db
e.g. htp gene trap vector for human sequence acquisition
Human Gene Trap Database
~130,000 clones
Also working in pig, dogs, cats, etc.
Talked about target validation using transgenic knockout mouse models
Phenotype - Driven Drug Target Validation
Lexgen.com = Internet & Genome
will put 70,000 known gene clones on internet - people will be able to sift through these data.
Chris Sander "Completing the Map of the Protein Universe"
Structural Biology of Infectious Agents
How many structures do we need to solve?
(plot: % seq id vs. # structures)
Need to solve 1:10 structures / 35% sequence identity ~ 10,000 structures in 5 years to solve all protein structures directly or by homology modeling
Can achieve substantial savings by this "representative" approach
With right kind of strategy, can get substantial return by focusing on families.
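The "representative" approach above amounts to solving one structure per family and modeling the rest by homology. A toy sketch of picking representatives at a 35% identity cutoff follows. It assumes the sequences are already aligned to equal length and uses a simple greedy pass; the sequences and names are invented, and real target selection would use proper alignments and clustering tools.

```python
def percent_identity(a, b):
    """Percent identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)

def pick_representatives(seqs, threshold=35.0):
    """Greedy selection: a sequence becomes a new structure target only if it is
    below `threshold` percent identity to every representative chosen so far."""
    reps = []
    for name, seq in seqs.items():
        if all(percent_identity(seq, seqs[r]) < threshold for r in reps):
            reps.append(name)
    return reps

# Invented toy sequences: p2 resembles p1, p4 resembles p3.
families = {
    "p1": "ACDEFGHIKL",
    "p2": "ACDEFGHIKV",
    "p3": "MNPQRSTVWY",
    "p4": "MNPQRSAAAA",
}

targets = pick_representatives(families)
print(targets)
```

Only the selected targets would need an experimental structure; the others fall within modeling distance of a representative.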
www.structuralgenomics.org
www.genome3d.org
Can register work in progress
Can select targets
Avoid duplication
Focused on value of diversity of targets
Sung-Hou Kim "Structural genomics of a hyperthermophile: a Pilot test"
10^3 - 10^5 genes / organism
20 - 80% of ORFs have unknown functions
25 - 35% of ORFs code for membrane protein
8 - 20% of functional annotation needs to be checked
Structural genomics - can provide next level of fundamental data
Showed nice phylogenetic trees of various classes of organisms
Two classes of examples -
1) discovering molecular function
when no function is known - hypothetical proteins
MJ0577 - 18 kD - forms dimer - ATP bound. New ATP-binding motif.
Showed that biochemical activity of ATP hydrolysis requires binding partner
MJ0226 - small part looks like tRNA synthetase - found that protein binds various nucleotides -
could eventually find seq. alignment in context of 3D structure.
3rd example - 2 domains. One domain new fold - second domain looks like S-adenosyl methyl transferase.
2nd Class- cellular function known, molecular function not known
1. Small heat shock protein.
Structure total surprise - looks like soccer ball - hollow inside - 24-mer multimer. 500,000 MW -
looks like a chaperone-like protein.
S. Kuramitsu "Structural and Functional Genomics of Extreme Thermophile"
Study of DNA Repair Enzymes from T. thermophilus.
3 3D structures
Several different proteins have been crystallized.
Proteins stable, easily crystallized.
Whole cell Project
- sequence determination
- overproduction
- structural analysis
understanding
Step                           Number
Synthesize PCR primer          460
Correct ampl. by PCR           194
Overproduction in E. coli      74
Purification                   45
Crystals                       33
3D structure                   24 10
In working out purification - use "column scouting"
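The pipeline counts above make the attrition at each step easy to work out. The short calculation below does that for the stages whose counts are unambiguous in these notes; the 3D-structure stage is left out because two numbers were recorded for it.

```python
# Stage counts as recorded in the notes (3D structure omitted: count ambiguous).
stages = [
    ("Synthesize PCR primer", 460),
    ("Correct amplification by PCR", 194),
    ("Overproduction", 74),
    ("Purification", 45),
    ("Crystals", 33),
]

def stage_yields(stages):
    """Return (stage, count, % of previous stage, % of first stage) for each stage."""
    first = stages[0][1]
    rows = []
    prev = first
    for name, count in stages:
        rows.append((name, count, 100.0 * count / prev, 100.0 * count / first))
        prev = count
    return rows

rows = stage_yields(stages)
for name, count, step_pct, overall_pct in rows:
    print(f"{name:30s} {count:4d} {step_pct:6.1f}% {overall_pct:6.1f}%")
```

The step-wise column shows where targets are lost (the largest drops are at PCR and overproduction), while the overall column shows how few starting targets reach crystals.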
Doing high-level expression and purification.
combine with other notes.
Will distribute expression plasmids and purified proteins.
G. Montelione "Automated Analysis of Protein NMR Spectra: Prognosis for Structural and Functional Genomics"
No Notes
H. Nakamura "3D Modeling and the Applications"
Homology Modeling does not work well for:
1) weakly homologous regions
2) loops
3) other point?
Structural Database - empirical statistical properties
Free Energy Calculations -
Wayne Hendrickson "Crystallographic Study of HIV gp120: Viral Evasion Mechanism of the Immune Response"
Variational crystallization - approach to structural genomics
Proteins that Bind DNA
John Kendrew did this by getting myoglobin from many species at 200
got sperm whale myoglobin
Same approach used here to get crystals of gp120
Had to eliminate some loops to get crystals
Conserved interfacial cavities (2) between gp120 and CD4. One is filled with water molecules.
Calorimetry shows that thermodynamics is similar for the fully intact glycosylated complex and the complex used in crystallography.
Entropic penalty is large - indicates significant conformational change upon complex formation.
Conformational flexibility helps gp120 avoid the immune system.
Peter Colman "New Drugs for Influenza: An Example of Drug Design Against a Moving Target"
Structure-based design of new drugs for influenza
Influenza, like HIV, has high error rates, making both readily capable of drug resistance
Neuraminidase - only residues in catalytic site are strain-invariant. Essentially all other residues in
protein can change
GG167 - drug that interferes with neuraminidase binding.
60% ready - 8 days placebos
4 days on drug
David Stuart "Crystallographic Study of HIV Reverse Transcriptase: Towards Drug Design"
HIV-1 RT
> 70 structures at 2.15 - 3.1 Å
> 40 compounds
~ 14 mutants
Interdomain flexibility evident by comparing structures from labs of Ox-Well, Rutgers (Arnold), and Yale
Discussed structures of drug-resistant RTs
K. Wüthrich "New NMR Techniques for Studies of Prion Proteins"
Prion Protein PrPc - ubiquitous cellular form
PrPsc - infectious scrapie form
Steven Kliewer "Orphan Nuclear Receptors in Drug Receptors"
Handouts:
Orphan receptors - nuclear receptors homologous to steroid hormone receptors with no known ligands
PPAR receptors
peroxisome proliferator-activated receptors. gamma
PPARg ligand - binding domain
- like other nuclear receptors
- huge cavity
PPARd - 3D crystal structure
6 new pathways identified by "reverse endocrinology"