DAY 1 WEDNESDAY APRIL 4
International Organization -- Tom Terwilliger
* Proposes that we create an International Structural Genomics Organization
* Later in the meeting, the group identified T. Terwilliger, S. Yokoyama, and U.
Heineman to head this Organization.
* Heineman is also organizing next Intl Structural Genomics Meeting for Nov
2002 in Berlin.
Report of Task Force: 1. Data Capture (Helen Berman)
* Task Force report attached
* Aim to capture in the PDB all the information provided in the Methods and
Materials section of a good protein crystallography or NMR paper,
including extensive information about protein production
* A complete set of data items for archiving will be developed by Helen's
committee by May 2001.
Report of Task Force: 2. Target Tracking (Steve Bryant)
* Task Force report attached
* Open exchange of target lists and progress on each target
* Data exchange must be simple
* Each SG Project would provide the following information in a common format
- lab assigned target name
- lab name
- date of creation, date of last modification
- target sequence
- status of work
* Discussion later in the meeting (including S. Bryant, A Godzik, G Montelione,
M Gerstein, S Brenner and others) refined this list and the "status" definitions.
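The common-format target record listed above could be sketched as follows. This is only an illustration: the field names, the JSON encoding, and in particular the status vocabulary are assumptions made for the example, since the notes do not record the refined "status" definitions.

```python
from dataclasses import dataclass, asdict
from datetime import date
import json

# Hypothetical status vocabulary; the actual definitions agreed at the
# meeting are not recorded in these notes.
STATUSES = ("selected", "cloned", "expressed", "purified",
            "crystallized", "structure solved")


@dataclass
class TargetRecord:
    target_name: str     # lab-assigned target name
    lab_name: str        # lab name
    created: date        # date of creation
    last_modified: date  # date of last modification
    sequence: str        # target sequence (one-letter amino acid code)
    status: str          # status of work

    def __post_init__(self):
        # Reject records that would be ambiguous to other centers.
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")
        if self.last_modified < self.created:
            raise ValueError("last_modified precedes created")

    def to_json(self) -> str:
        # Dates are written as ISO 8601 strings to keep the exchange simple.
        record = asdict(self)
        record["created"] = self.created.isoformat()
        record["last_modified"] = self.last_modified.isoformat()
        return json.dumps(record, sort_keys=True)


# Example with made-up values.
example = TargetRecord("TM0001", "Example Lab", date(2001, 1, 15),
                       date(2001, 4, 4), "MKTAYIAKQR", "expressed")
print(example.to_json())
```

A flat record like this keeps the exchange simple, as the task force asked: each project only has to emit one small object per target.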
Report of Task Force: 3. Data Quality Assurance (Randy Read)
* Task Force report attached
* Is automatic triggering feasible? For X-ray, maybe. For NMR, not yet.
Methods to validate structure (e.g. R_free) are still in flux. Recommend
against "automatic-triggered release"
* Proposes some guidelines for when structure is "done", but the decision on
when the structure is complete must be left to the investigators.
* Encourage rapid (< 3 weeks) deposition of structure into public domain once
it is "complete". In some cases, can put 6 month hold on structure to
evaluate scientific and/or IP issues.
* Recommend depositing also raw data (diffraction data, NMR FIDs). Also
recommend depositing chemical shift data as soon as it is available.
Project Bottlenecks (W Hendrickson, M Linial, W Studier, chairs)
I. Organization / Administration
- Process Management
I.1 -- Process Management
- Automated Progress Reports
- Identify bottlenecks
- Improve process
- Tom Terwilliger described Integrated database for coordination
- A Joachimiak and Montelione/Gerstein described similar db's
- BC Wang requires weekly written progress reports
I.2. -- Administration
- Project coordination
- motivation of key scientists
- physical coordination; control and remote sites
- some people concerned about long-term personal motivation
- tracking of credit for people's roles in each structure is important
II. Protein Expression / Solubility
Some comments made in the discussion:
- Small project with Thermotoga maritima. 7 ORFs gave good expression,
- Success on each project is enhanced by expending a lot of time on each target
- Berlin group - every SG project should be on the same floor in the same building
-- when in separate institutions, not enough communication; need to put
people as close together as possible to ensure success
- ~ 20% of proteins provide crystals; but only a portion of these can be
optimized for data collection
Relationships Between Industry and Academia 1. The Structural
Consortium (B Skeen, Burroughs Wellcome Foundation)
Structural Genomics Consortium - not yet a reality.
Idea - 3D structures of important human targets could be shared in a
2000 ORFs -> 1000 protein samples -> 200 structures in 5 years
~ 15 companies at ~ $1 million each = ~ $15 million total budget
Most activities will be in a single dedicated center
Unlikely to make target lists available
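The funnel and budget figures quoted above imply some simple ratios. The short calculation below is a sketch only: it takes the round numbers at face value and assumes the ~ $15 million is the whole budget behind the 200 structures, which the notes do not state explicitly.

```python
# Round figures as quoted in the notes.
orfs, samples, structures = 2000, 1000, 200
companies, contribution_musd = 15, 1.0   # ~15 companies at ~$1 million each

sample_yield = samples / orfs            # fraction of ORFs giving protein samples
structure_yield = structures / samples   # fraction of samples giving structures
overall_yield = structures / orfs        # fraction of ORFs giving structures

budget_musd = companies * contribution_musd

# Naive division; assumes the full budget is attributable to the 200 structures.
cost_per_structure_kusd = budget_musd * 1000 / structures

print(f"ORF -> sample yield:       {sample_yield:.0%}")
print(f"sample -> structure yield: {structure_yield:.0%}")
print(f"overall yield:             {overall_yield:.0%}")
print(f"total budget:              ${budget_musd:.0f} million")
print(f"cost per structure:        ${cost_per_structure_kusd:.0f}k")
```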
Michelle Browner (Roche, Palo Alto) -- said that Roche wants to partner
with any structural genomics efforts that can enhance drug discovery
Relationships Between Industry and Academia 1. Structural
Aim - get more structures so that more good drugs can be developed
Can access NMR as needed (Peter Wright on SAB)
Strategy - express many homologues from the family
His Tags, Ni affinity / gel filtration. Not removing tags prior to crystallization
New SGX beamline, next door to COM-CAT
- on line in November
- $8 MM cost
4.5 people at SGX developing "joining" software to integrate data analysis efforts.
8 hrs from data collection to structure (result of S Burley)
SGX acquires Prospect Genomics
- ab initio modeling by David Baker
- MODBASE by Sali
- docking software by Tac Kuntz
- Dan Santi also involved
- Bill Rutter will join board. Philip Chambon (Sprout) already on board.
- want to forward integrate to drug design
- nuclear receptors
- ion channels, GPCRs
- metabolic disease
- infectious disease (pathogens)
Have generated ~ 20 X-ray crystal structures. Have learned more about function
from these structures than we expected to
- at least 40 sequences completed
- over 100 in progress
- express targets from 5 different genomes per target. Get at least one
- 1852 bacterial targets to date
- Cystic Fibrosis Foundation
- 5 yr, $13 MM agreement to solve CFTR structure
- Caliper Technologies
- joint development of HT microfluidic systems
- Yale - collaboration on membrane protein structure determinations
- Argonne - beam line construction
Business development plan:
- $85 MM in private capital; Argonne on steroids.
Partnerships with NIH Structural Genomics Centers are in progress
Discussion - how to improve synergies between commercial and academic activities.
- Syrrx representative -- crystallization robot (~ $4 MM) will be
shared with academic group at Scripps.
- Marv Cassman (head of NIH General Medical Sciences): "Process of
Academic and industrial efforts will converge. The distinction is the
DAY 2 THURSDAY APRIL 5
Breakfast Meeting of P50 PIs
Marv Cassman (head of NIH General Medical Sciences)
- we will need to put a lot more money into the centers in the coming years.
- the goal of the Protein Structure Initiative is "completeness"
John Norvell (head of NIH Structural Genomics Initiative)
- Needs summary of Organizational Process, Mid Year Report. Due April 23.
- The goal is to get representative structure from families with no known
NIH Common Public Archive of All Target Sequences
Items - for each target
- Center id
- Target id
- Canonical sequence
- Date stamp
- Hot link to the group
- GenBank link if appropriate
Guy to send around Mark's site to all center PIs. NIH will ask PDB to provide
public archive of all targets.
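As an illustration of the per-target items listed above, one possible flat-file layout is sketched below. The column names, the tab-delimited format, and the example values are all assumptions made for this sketch; the notes do not specify a file format. The GenBank link is treated as optional because the list says "if appropriate".

```python
import csv
import io

# Hypothetical column names for the items listed in the notes.
FIELDS = ["center_id", "target_id", "sequence", "date_stamp",
          "group_url", "genbank_link"]
REQUIRED = FIELDS[:-1]  # everything except the optional GenBank link


def write_archive(rows):
    """Return the target entries as tab-delimited text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    for row in rows:
        missing = [name for name in REQUIRED if not row.get(name)]
        if missing:
            raise ValueError(f"missing required items: {missing}")
        writer.writerow({name: row.get(name, "") for name in FIELDS})
    return buf.getvalue()


# Example with made-up values.
print(write_archive([{
    "center_id": "CENTER-A",
    "target_id": "T0001",
    "sequence": "MKTAYIAKQR",
    "date_stamp": "2001-04-05",
    "group_url": "https://example.org/center-a",
}]))
```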
Intellectual Property in Structural Genomics (Joseph Strauss)
Joseph Strauss, Patent Lawyer and Professor, George Washington University
The decision of whether to put coordinates into the public domain may
have important legal consequences.
USA - "First to Invent", based on laboratory documentation
Most of World - "First to File"
Some things that can be patented:
* Gene Sequences.
- identification of function is necessary - but it does not need to be
a biological function
- there are more than 3000 patents on human gene sequences
* Amino Acid Sequences
* Proteins - if isolated, technically produced, modified, etc
* Methods of isolation
* Uses of these products, if it is an inventive step
- "who first made the invention and when". Priority is based on date invention was made, based on laboratory documentation, not on the date of filing.
- also applies to patent rights in the US for all member states of World Trade Organization
- priority is a matter of proof - laboratory notebooks, witnesses, publications
Rest of World
- priority date is determined by date of patent filing
Paris Convention Right of Priority - can file in all states within 12 months; does not control "first to invent" principle
1. Novelty - generally an invention is novel if it does not form part of the state-of-the-art
Relevant state of prior art:
USA: - 12 month grace period. Can apply for patent within 12 months
- use or oral disclosure abroad do not constitute prior art
- do not have a standard for disclosure on the internet
Europe - everything, no matter where, made available to the public before the
filing date forms part of the prior art
Any substance (e.g. protein) comprised in the state-of-the-art, for use in
therapeutic, diagnostic, or surgical methods, is considered "new" if
such USE is not comprised in the state-of-the-art.
Grace period in the "first to file" system provides immunity only against
OWN disclosures, not immunity against 3rd party disclosures. But - in
the US - no matter what others disclose, you have priority based on date
Novelty examination -- comparison 1:1 (all features must correspond)
2. Obviousness - whether, in view of the prior art, it is obvious to try the
invention with reasonable chance of success.
A "surprising" function identification would not be obvious, even if
methods used to provide that functional information or structure are trivial.
3. Sufficiency of Disclosure - must disclose in a way that allows a person
skilled in the art to review the disclosed invention at will.
Types of Patents
1. Product Patents
- if the product meets patentability requirements
- one indicated use is sufficient
- first patent applicant to describe ANY first medical use, gets
product patent on ALL medical uses
2. Process Patents - cover not only the process, but also what results
from the process.
3. Use Patents - additional uses of products or patents.
Scope of Protection
Patents claiming genetic information cover "anything derived" using
Special Dependency Rule: If the overlapping sequence is not essential
invention, the two patents will be regarded as independent Relative to
Research Exemption: Statutory in Europe. In USA, not clear.
freedom to use the product for R&D work prior to the time when you start
to commercialize a product. Provides right to use inventions to improve
them, but not to use them as research tools.
National laws override these intellectual property agreements.
Conclusion: Want to get product patents on a medical use of each
structure, but to do this need to identify biological function. Perhaps,
academic groups should try to at least guess a function and medical use
for each protein structure released.
Report of Task Force: 4. Intellectual Property Rights (Marv Cassman)
Marv Cassman - The goal of the program is to get the coordinates to
the community as soon as possible. You want to characterize function
for IP, you do it on your own time.
If it is not high throughput - it is nothing.
Stevenson-Wydler Act and Technology Transfer Act of US - require efforts
to protect intellectual property. Should look carefully at SNP consortium
to see how well the pre-competitive strategy of free data release works.
Pointed out that there are lost opportunity costs to investigators and
institutes if patents are not pursued. Brought up concept of "software
patent" - patenting coordinates as "machine readable code for drug discovery".
Report of Task Force: 5. Publication (Guy Dodson)
* Task Force report attached
* Encourage publication of at least a short "Structure Note", in format
like Acta Cryst. C uses for small molecules. Several structural biology
journals will support these.
* People expect 100 - 200 structures from SG projects by end of 2001
* People will not be obliged to publish as they release coordinates. Could
hold back publication until enough data is available for full paper - but
would have to release coordinates in 3 wks - 6 months after "completing"
cDNA Repository (Josh Labaer, Harvard Medical School)
Complete (extensive?) set of cDNAs for:
Prevalidated and sequenced. Enable rapid transfer to GateWay vectors
with tags at either end, no tags, etc.
Automated NMR Structure Determination and Refinement (M. Nilges)
Excellent progress on automated NOESY analysis.
Proposes all data analysis could be done in 1 day.
HMMs based on SCOP (Cyrus Chothia)
Can identify folds for 45% of bacterial genomes and 30% of metazoan
Claims 1% false positive rate.
Summary of International Structural Genomics Projects (3 - 10 min reports)
I Bertini (Italy) - provided nice summary of SG around the world.
A Joachimiak (USA) - developed web-based data base for accessing SG data
S Burley (USA) - claims 80% of targets were soluble. Developed "automated
bioinformatics pipeline" to generate targets. Test - place protein at 1 mg/mL
in low salt - proteins which are monodisperse under these conditions have 70%
likelihood to provide crystals
G Montelione / M Gerstein - ~ 20 structures from NESG so far. SPINE db
provides approach for integrating efforts across project and for data mining.
CryoProbes and automated analysis methods provide resonance assignments
for BPTI in 4 hrs of data collection plus 2 hrs of processing -- can expect
major breakthrough in NMR for SG using cryoprobes
BC Wang (USA) aiming to have 3D structure in 30 min. Focus on single-
wavelength anomalous dispersion using S-S or S groups in proteins on home
Ian Wilson (USA) - effort at Scripps is a close collaboration with Syrrx. No
structures yet. Impressive robotization effort. Pilot project on expressing
proteins in yeast in progress at Salk.
Berlin Structure Factory (Germany) - expressing in both E coli and yeast.
Nice effort in HTP crystallization with robotics. Have effort in SAR-by-NMR.
RIKEN (Japan) - see progress report at www.rsgi.riken.go.jp. Found much better
expression in pET11a without IPTG induction (this is funny??). Using "normal
L broth". Get much better solubilization of some proteins at pH 6. Suggests
it helps to try solubilization with different pH values as some proteins have
pH dependent solubility. Claims to now have 15,000 human cDNAs. They are
putting ~ 1000 of these into GateWay. Sequence data is being released in
PDBJ? (check this and provide info to Burkhard).
Conclusions of Meeting
1. A policy statement outlining policy conclusions will be released as a Press
Release and Document in mid April.
2. Policy calls for rapid release of protein structures determined in
funded SG efforts. In general, release into PDB would follow soon (~ 3
weeks) after completion of the structure. Some of these may be put "on
hold" for up to 6 months to evaluate scientific and IP issues.
3. Policy discourages patenting of coordinates without a clear use.
like a vague statement, since use is a requirement for patenting.
4. Policy encourages relationships between publicly funded structural
centers and private entities. This is a turnaround from previous statements.
5. Major obstacles to structural genomics remain protein production