Connecting expression data with Entrez
Steve Gullans
connect with Medline and other data, including 3D protein structure data
The problem with a database is that it must be constantly maintained/updated
novel, important - throw away words
Medical subject heading
Full text terms
Can use gene seed - get PubMed UIDs which mention that gene
Neighboring finds "related" papers (very relevant to listing potential functions)
seed gene --> gene neighbor
get clusters of functionally related genes
Only 19% of automatically identified neighbors were included in expert lists
(Could consider doing this analysis on each protein in PDB - or each DALI domain)
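A minimal sketch of the seed gene --> gene neighbor idea noted above, assuming a toy mapping from genes to the PubMed UIDs that mention them and a toy table of "related" papers. All gene names, UIDs, and function names here are invented placeholders, not Entrez data or an Entrez API. The last function computes the kind of overlap fraction behind the 19% figure.

```python
from collections import Counter

# Hypothetical toy data: gene -> PubMed UIDs mentioning it (placeholders only).
GENE_TO_PMIDS = {
    "geneA": [1, 2],
    "geneB": [2, 3],
    "geneC": [4],
    "geneD": [9],
}
# Hypothetical "related paper" table: UID -> UIDs of neighboring papers.
RELATED_PMIDS = {1: [4], 2: [3]}


def gene_neighbors(seed, gene_to_pmids, related_pmids):
    """Rank other genes by papers shared with the seed's papers and their neighbors."""
    seed_papers = set(gene_to_pmids[seed])
    expanded = set(seed_papers)
    for pmid in seed_papers:
        # Add the "related" papers of every paper that mentions the seed gene.
        expanded |= set(related_pmids.get(pmid, ()))
    counts = Counter()
    for gene, pmids in gene_to_pmids.items():
        if gene == seed:
            continue
        shared = len(expanded & set(pmids))
        if shared:
            counts[gene] = shared
    return [gene for gene, _ in counts.most_common()]


def fraction_in_expert_list(neighbors, expert_list):
    """Fraction of automatically found neighbors that an expert list also contains."""
    if not neighbors:
        return 0.0
    return sum(1 for gene in neighbors if gene in expert_list) / len(neighbors)


neighbors = gene_neighbors("geneA", GENE_TO_PMIDS, RELATED_PMIDS)
overlap = fraction_in_expert_list(neighbors, {"geneB"})
print(neighbors, overlap)
```

A low overlap here does not by itself say whether the automated neighbors or the expert list is incomplete; it only measures agreement between the two.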
HuFL Gene Chip
Jim Eberwine
6606 genes currently in HuGE Index (for 5000 - full length cDNA info available)
- LocusLink IDs
- gene functions - only a fraction (a few hundred) have annotated functions; trying to develop this classification
Most of the genes are expressed at very low levels in various tissues.
Would be interesting to know which of those have structures in PDB.
look at Locus Link ID
now 5 chips; 30,000 human genes, mostly unannotated ESTs
Some people believe that schizophrenia is a developmental disease; can we correlate genes differentially expressed in brain tissue from schizophrenia patients with developmental pathways?
Barbara Dunn (Pat Brown/Botstein groups)
~6000 ORFs
Gerry Rubin
~7000 intergenic regions
1000 experiments X 6000 genes - much of this is on the Pat Weber website - effects of stress etc.
e.g. response to stress, look at gene induction, gene repression
Common Stress Response
Response of cell is common (up or down) for many different stresses, ranging from gamma irradiation, to N2, to hydrogen peroxide, etc...
Need to figure out what the genes are - how they are clustered
Gene Ontology - a language for annotation of genes - www.geneontology.org
Process, Function, Cell Component
Now can figure out in clusters of genes why they are clustered
Process: electron transport
Function: cytic reductase
Cell Component: mitochondrion ? membrane
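A minimal sketch of how the three Gene Ontology aspects (Process, Function, Cell Component) could be used to ask why genes in a cluster group together. The record type, gene names, and placeholder terms are hypothetical illustrations, not the Gene Ontology file format or any official tool; only "electron transport" and "mitochondrion" come from the notes.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class GOAnnotation:
    """One gene with a term for each of the three aspects (hypothetical record)."""
    gene: str
    process: str
    function: str
    component: str


# Hypothetical toy annotations; gene names and placeholder terms are invented.
ANNOTATIONS = [
    GOAnnotation("g1", "electron transport", "placeholder function 1", "mitochondrion"),
    GOAnnotation("g2", "electron transport", "placeholder function 2", "mitochondrion"),
    GOAnnotation("g3", "placeholder process", "placeholder function 3", "placeholder component"),
]


def shared_terms(cluster, annotations, aspect):
    """Count how often each term of one aspect occurs among the annotated genes of a cluster."""
    by_gene = {a.gene: a for a in annotations}
    # Genes without any annotation (the majority, per the notes) are skipped.
    return Counter(getattr(by_gene[g], aspect) for g in cluster if g in by_gene)


cluster = ["g1", "g2", "g4"]  # g4 stands in for an unannotated gene
print(shared_terms(cluster, ANNOTATIONS, "process"))
print(shared_terms(cluster, ANNOTATIONS, "component"))
```

A term that dominates one aspect within a cluster is a candidate explanation for the clustering, to be checked rather than taken as proof.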
Quantitative Linguistics - guy at EBI started this
almost done ?? but only ~2000 see function annotated
Full length cDNAs from fly; then clone into vectors that allow rapid transfer to other vectors
Marc Vidal
Vector: Gateway system; Ed Harlow's lab most expert in this.
Generate a unique set of full-ORF clones of half of all Drosophila genes.
12,000 clones - distributed by Research Genetics
Have ~7000 unique cDNA clones
Programs using HMMs work better than others such as GRAIL:
- Compared gene finding programs - GASP
very successful experiment to compare gene prediction programs
- New - validating the gene predictions by PCR-ing the cDNAs.
Celera - whole genome shotgun (10X coverage) + 28 Mb finished sequence from BDGP
By end of year - will have ?
sequence - Dec. 1999
paper - Feb. 2000
annotations - Feb 2000 distributed
Half of sequence done in last 2 weeks
Initial assemblies look good - working well
Same technology will be used for human
Affymetrix will make a fly chip.
Genome Sequence -----------------------------------> Signal Transduction
Standardized Functional Assays
From microarray analysis - 500 genes in sporulation
Stuart Kim
Large scale 2-hybrid analysis; focused on C. elegans. Other supporting evidence: GST pull-down, related phenotypes, etc.
All functional genomics methods provide a way to formulate hypotheses.
All these methods require ORFs cloned into specific open reading frames - want to clone open reading frame into "universal" vector.
ORFeome - not proteome, since proteins can be phosphorylated, etc.
- Bacteriophage Lambda Recombination in Phase Landing
- Full Length cDNAs from C. elegans
Gateway PCR Cloning
Select for integrating
Have very extensive cDNA library.
Pick gene from ACeDB
PCR success rate ~80%
Gateway success rate ~97%
Fold induction typically > 100. Of 35 clonings, 20 (ie 100%) gave correct inserts. Sequencing these clones is BIG effort - corresponds to 1/4 of genome sequencing project
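A quick worked calculation for the PCR and Gateway success rates noted above, assuming the two steps fail independently so that the rates multiply. That independence, and the use of a 96-well plate as the batch size, are assumptions for illustration; the function names are hypothetical.

```python
def combined_success(pcr_rate, gateway_rate):
    """Overall per-ORF success if PCR and Gateway cloning succeed independently (assumption)."""
    return pcr_rate * gateway_rate


def expected_clones(wells, pcr_rate, gateway_rate):
    """Expected number of successful clones from one plate of `wells` attempts."""
    return wells * combined_success(pcr_rate, gateway_rate)


overall = combined_success(0.80, 0.97)    # rates from the notes: ~80% and ~97%
per_plate = expected_clones(96, 0.80, 0.97)
print(f"overall success: {overall:.1%}")  # about 77.6%
print(f"expected clones per 96-well plate: {per_plate:.1f}")
```

Under these assumptions the PCR step, not the Gateway step, is what limits overall yield.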
Using these for RNAs, 2-hybrid - focus on ? development
30 (cloned proteins) X 30 2-hybrid analysis ---> Paper just accepted - data not yet out on web, maybe can get preprint
55% of interactions in the literature confirmed
2 new ones
The C. elegans ORFeome project
Throughput: One 96 - well plate / day
Goal: 80% of ORFs in one year
Collaborators: Research Genetics (Primer)
Life Technologies (Gateway)
Priorities: 1) cloned genes
2) ORFs from EST projects
3) ORFs predicted by GeneFinder - seems these predictions are not very good
Doing PCR on 96-well plate:
Microarray - Stuart Kim
RNAi - Tony Hyman 2000 RNAi expert - cell div. early lethals
Deletions - Alan Coulson
2-Hybrids - Marc Vidal
Vectors will be in public domain.
Appears that there are no proprietary issues related to using these.
Research Genetics will have distribution rights to these vectors
DNA microarrays with 11,990 genes on chip
Handouts:
Have 150 microarrays with 12,000 genes. Once you can make 1, you can make tons of chips; has full genome chips. Can send RNA to S. Kim - ~40 labs have signed up to do this analysis with Kim
Looking at germ line development. Using these chips to look at changes in gene expression during development. Done 5X each - so you have good statistics.
653 sperm-line genes expressed
258 oocyte genes expressed
intrinsic genes - expressed in both
Sperm-line specific expression 14% kinases/phosphatases (RNAi does not work in sperm - may help to explain mechanism)
Oocytes (Good targets for RNAi; good target for structure analysis) - 258 total showed 4 cell embryo notch, others in notch pathway ? - other ??? pathway
Have done RNAi on J; they have function
oocyte genes - on all chromosomes
sperm genes - few on X - on other chromosomes
intrinsic genes - hardly any on X - on other chromosomes
On web site (VALUABLE DATA FOR SG PROJECT) - find which worm genes are co-regulated and which genes are differentially expressed.
Connection with Rosetta - Stew Scherer (Rosetta) - helping with bioinformatics
Yeast and worm the furthest ahead in this ? of array data and RNAi.
also collab. with Steve Jones (Sanger Center), Pat Brown (Stanford)
This grant is a service grant (should be able to get S. Kim to run maternal RNA analysis - but some competition is possible) - 40 investigating labs need to screen against his chip.
Should be able to make MGRI grant that leverages this investment by MGRI - Tom Caskey very interested
S. Tilham asked when chips would be available - S. Kim said that oligo printers (Rosetta, HP) will open this up.
Research Genetics has primers - perhaps these would soon be released. Everyone has asked him for primers, but this is something he cannot amplify - so the ? is very valuable.