DAY 1 WEDNESDAY APRIL 4
International Organization -- Tom Terwilliger
* Proposes that we create an International Structural Genomics Organization
* Later in the meeting, the group identified T. Terwilliger, S. Yokoyama, and U. Heinemann to head this Organization.
* Heinemann is also organizing the next Intl Structural Genomics Meeting for Nov 2002 in Berlin.
Report of Task Force: 1. Data Capture (Helen Berman)
* Task Force report attached
* Aim to capture in the PDB all the information provided in the Methods and Materials section of a good protein crystallography or NMR paper, including extensive information about protein production
* A complete set of data items for archiving will be developed by Helen's committee by May 2001.
Report of Task Force: 2. Target Tracking (Steve Bryant)
* Task Force report attached
* Open exchange of target lists and progress on each target
* Data exchange must be simple
* Each SG Project would provide the following information in a common format
- lab assigned target name
- lab name
- date of creation, date of last modification
- target sequence
- status of work
* Discussion later in the meeting (including S. Bryant, A Godzik, G Montelione, M Gerstein, S Brenner and others) refined this list and the "status" definitions.
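A minimal sketch of what one such target record might look like, in Python. The field names, the status vocabulary, and the tab-delimited layout are all assumptions made for illustration; the task force named only the five items listed above, and the "status" definitions were still being refined.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical status vocabulary; the real definitions were still under discussion.
STATUSES = ("selected", "cloned", "expressed", "purified", "crystallized", "in PDB")


@dataclass
class TargetRecord:
    target_name: str  # lab-assigned target name
    lab_name: str
    created: date     # date of creation
    modified: date    # date of last modification
    sequence: str     # target sequence, one-letter amino acid code
    status: str       # status of work

    def __post_init__(self):
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")
        if self.modified < self.created:
            raise ValueError("last modification precedes creation")

    def to_line(self) -> str:
        """Serialize as one tab-delimited line, to keep exchange simple."""
        return "\t".join([
            self.target_name,
            self.lab_name,
            self.created.isoformat(),
            self.modified.isoformat(),
            self.sequence,
            self.status,
        ])


# Made-up example values, not a real target.
example = TargetRecord("XX0001", "Example Lab", date(2001, 1, 15),
                       date(2001, 4, 4), "MKTAYIAKQR", "expressed")
print(example.to_line())
```

A flat one-line-per-target layout is one way to meet the "data exchange must be simple" requirement; any common format that carries the same five items would serve.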
Report of Task Force: 3. Data Quality Assurance (Randy Read)
* Task Force report attached
* Is automatic triggering of release feasible? For X-ray, maybe. For NMR, not yet. Methods to validate structure (e.g. R_free) are still in flux. Recommend against "automatic-triggered release".
* Proposes some guidelines for when a structure is "done", but we must leave it to investigators to decide when the structure is complete.
* Encourage rapid (< 3 weeks) deposition of structure into public domain once it is "complete". In some cases, can put 6 month hold on structure to evaluate scientific and/or IP issues.
* Recommend depositing also raw data (diffraction data, NMR FIDs). Also recommend depositing chemical shift data as soon as it is available.
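The deposition timing recommended above can be sketched as a small date calculation. This is an illustration only: the function names are hypothetical, and it assumes that both the 3-week window and the 6-month hold are counted from the date the investigator declares the structure complete, which the notes do not specify.

```python
import calendar
from datetime import date, timedelta

RELEASE_WINDOW = timedelta(weeks=3)  # "rapid (< 3 weeks) deposition"
MAX_HOLD_MONTHS = 6                  # optional hold for scientific/IP review


def add_months(d: date, months: int) -> date:
    """Return d shifted by whole calendar months, clamping the day if needed."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def release_deadline(completed: date, on_hold: bool = False) -> date:
    """Latest release date under the assumed reading of the guideline."""
    if on_hold:
        return add_months(completed, MAX_HOLD_MONTHS)
    return completed + RELEASE_WINDOW


completed = date(2001, 4, 5)
print(release_deadline(completed))                # 2001-04-26
print(release_deadline(completed, on_hold=True))  # 2001-10-05
```

The completion date itself stays an input, in line with the recommendation that investigators, not an automatic trigger, decide when a structure is done.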
Project Bottlenecks (W Hendrickson, M Linial, W Studier, chairs)
General bottlenecks:
I. Organization / Administration
- Process Management
- Administration
I.1 -- Process Management
- LIMS
- Automated Progress Reports
- Identify bottlenecks
- Improve process
- Tom Terwilliger described Integrated database for coordination
- A Joachimiak and Montelione/Gerstein described similar db's
- BC Wang requires weekly written progress reports
I.2. -- Administration
- Project coordination
- hiring
- motivation of key scientists
- physical coordination; control and remote sites
- some people concerned about long-term personal motivation
- tracking of credit for people's roles in each structure is important
II. Protein Expression / Solubility
Some comments made in the discussion:
- Small project with Thermotoga maritima. 7 ORFs gave good expression, 5 provided crystals
- Success on each project is enhanced by expending a lot of time on each target
- Berlin group - every SG project should be on the same floor in the same building -- when in separate institutions, not enough communication; need to put people as close together as possible to ensure success
- ~ 20% of proteins provide crystals; but only a portion of these can be optimized for data collection
Relationships Between Industry and Academia 1. The Structural Genomics Consortium (B Skeen, Burroughs Wellcome Foundation)
Structural Genomics Consortium - not yet a reality.
Idea - 3D structures of important human targets could be shared in a pre-competitive way
2000 ORFs -> 1000 protein samples -> 200 structures in 5 years
~ 15 companies at ~ $1 million each = ~ $15 million total budget
Most of activities will be in a single dedicated center
Unlikely to make target lists available
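The Consortium figures above imply some simple ratios, worked through here as a rough sketch. The variable names are illustrative, and the per-structure cost assumes the ~ $15 million is the total for the whole five-year plan; if it is an annual figure, the cost per structure would be five times higher.

```python
# Figures as quoted in the presentation (all approximate).
orfs = 2000
protein_samples = 1000
structures = 200
companies = 15
contribution_usd = 1_000_000  # per company

budget_usd = companies * contribution_usd

print(f"ORF -> protein sample: {protein_samples / orfs:.0%}")        # 50%
print(f"sample -> structure:   {structures / protein_samples:.0%}")  # 20%
print(f"ORF -> structure:      {structures / orfs:.0%}")             # 10%
print(f"total budget:          ${budget_usd:,}")                     # $15,000,000
# Assumes the budget covers the full five years.
print(f"budget per structure:  ${budget_usd / structures:,.0f}")     # $75,000
```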
Michelle Browner (Roche, Palo Alto) -- expressed that Roche wants to partner with any structural genomics efforts that can enhance drug discovery
Relationships Between Industry and Academia 2. Structural GenomiX (Tim Harris)
Aim - get more structures so that more good drugs can be developed
Can access NMR as needed (Peter Wright on SAB)
Strategy - express many homologues from the family
His Tags, Ni affinity / gel filtration. Not removing tags prior to crystallization
New SGX beamline, next door to COM-CAT
- on line in November
- $8 MM cost
4.5 people at SGX developing "joining" software to integrate data analysis efforts.
8 hrs from data collection to structure (result of S Burley)
SGX acquires Prospect Genomics
- ab initio modeling by David Baker
- MODBASE by Sali
- docking software by Tac Kuntz
- Dan Santi also involved
- Bill Rutter will join board. Philip Chambon (Sprout) already on board.
- want to forward integrate to drug design
Targets
- nuclear receptors
- kinases
- phosphatases
- proteases
- ion channels, GPCRs
Therapeutic Areas
- cancer
- inflammation
- metabolic disease
- infectious disease (pathogens)
Have generated ~ 20 X-ray crystal structures. Have learned more about function from these structures than we expected to
Bacterial Genomes
- at least 40 sequences completed
- over 100 in progress
- express targets from 5 different genomes per target. Get at least one soluble protein
- 1852 bacterial targets to date
Business partners:
- Cystic Fibrosis Foundation
- 5 yr, $13 MM agreement to solve CFTR structure
- Caliper Technologies
- joint development of HT microfluid systems
- Yale - collaboration on membrane protein structure determinations
- Argonne - beam line construction
Business development plan:
- $85 MM in private capital; Argonne on steroids.
Partnerships with NIH Structural Genomics Centers are in progress
Discussion - how to improve synergies between commercial and academic
activities.
- Syrrx representative -- crystallization robot (~ $4 MM) will be shared with academic group at Scripps.
- Marv Cassman (head of NIH General Medical Sciences): "Process of academic and industrial efforts will converge. The distinction is the 'target selection'."
DAY 2 THURSDAY APRIL 5
Breakfast Meeting of P50 PIs
Marv Cassman (head of NIH General Medical Sciences)
- we will need to put a lot more money into the centers in the coming years.
- the goal of the Protein Structure Initiative is "completeness"
John Norvell (head of NIH Structural Genomics Initiative)
- Needs summary of Organizational Process, Mid Year Report. Due April 23.
- The goal is to get a representative structure from families with no known structure
NIH Common Public Archive of All Target Sequences
Items - for each target
- Center id
- Target id
- Canonical sequence
- Date stamp
- Hot link to the group
- GenBank link if appropriate
Guy to send around Mark's site to all center PIs. NIH will ask PDB to provide public archive of all targets.
Intellectual Property in Structural Genomics (Joseph Strauss)
Joseph Strauss, Patent Lawyer and Professor, George Washington University
The decision of whether to put coordinates into the public domain may have important legal consequences.
USA - "First to Invent", based on laboratory documentation
Most of World - "First to File"
Some things that can be patented:
* Gene Sequences.
- identification of function is necessary - but it does not need to be a biological function
- there are more than 3000 patents on human gene sequences
* Amino Acid Sequences
* Proteins - if isolated, technically produced, modified, etc
* Methods of isolation
* Uses of these products, if it is an inventive step
USA
- "who first made the invention and when". Priority is based on date invention was made, based on laboratory documentation, not on the date of filing.
- also applies to patent rights in the US for all member states of World Trade Organization
- priority is a matter of proof - laboratory notebooks, witnesses, publications
Rest of World
- priority date is determined by date of patent filing
Paris Convention Right of Priority - can file in all states within 12 months; does not control "first to invent" principle
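The difference between the two priority rules can be sketched in a few lines of Python. This is a deliberately simplified illustration: the `Claim` class, the function name, and the dates are hypothetical, and the sketch ignores grace periods, disputes over proof, and Paris Convention priority claims.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Claim:
    applicant: str
    invented: date  # date supported by lab notebooks, witnesses, publications
    filed: date     # date the patent application was filed


def priority_holder(claims, system):
    """Return the applicant with priority under the given (simplified) rule."""
    if system == "first_to_invent":   # USA, as described in the talk
        return min(claims, key=lambda c: c.invented).applicant
    if system == "first_to_file":     # most of the rest of the world
        return min(claims, key=lambda c: c.filed).applicant
    raise ValueError(f"unknown system: {system!r}")


# Made-up example: Lab A invents first, Lab B files first.
claims = [
    Claim("Lab A", invented=date(2000, 3, 1), filed=date(2001, 2, 1)),
    Claim("Lab B", invented=date(2000, 6, 1), filed=date(2000, 9, 1)),
]
print(priority_holder(claims, "first_to_invent"))  # Lab A
print(priority_holder(claims, "first_to_file"))    # Lab B
```

The same pair of claims gives a different winner under each rule, which is why dated laboratory documentation matters under the US system.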
Patentability Requirements
1. Novelty - generally an invention is novel if it does not form part of the state-of-the-art
Relevant state of prior art:
USA:
- 12 month grace period. Can apply for patent within 12 months of invention
- use or oral disclosure abroad do not constitute prior art
- do not have a standard for disclosure on the internet
Europe:
- everything made available to the public before the filing date, no matter where, forms part of the prior art
Any substance (e.g. protein) comprised in the state-of-the-art is considered "new" for use in therapeutic, diagnostic, or surgical methods if such USE is not comprised in the state-of-the-art.
Grace period in the "first to file" system provides immunity only against OWN disclosures, not immunity against 3rd party disclosures. But - in the US - no matter what others disclose, you have priority based on date of invention.
Novelty examination -- comparison 1:1 (all features must correspond)
2. Obviousness - whether, in view of the prior art, it is obvious to try the invention with reasonable chance of success.
A "surprising" function identification would not be obvious, even if the methods used to provide that functional information or structure are trivial.
3. Sufficiency of Disclosure - must disclose in a way that allows a person skilled in the art to review the disclosed invention at will.
Types of Patents
1. Product Patents
- if the product meets patentability requirements
- one indicated use is sufficient
- first patent applicant to describe ANY first medical use gets product patent on ALL medical uses
2. Process Patents - cover not only the process, but also what you get from the process.
3. Use Patents - additional uses of products or patents.
Scope of Protection
Patents claiming genetic information cover "anything derived" using that genetic information
Special Dependency Rule: If the overlapping sequence is not essential to the invention, the two patents will be regarded as independent. Relevant to splicing issues.
Research Exemption: Statutory in Europe. In USA, not clear. Provides freedom to use the product for R&D work prior to the time when you start to commercialize a product. Provides right to use inventions to improve them, but not to use them as research tools.
National laws override these intellectual property agreements.
Conclusion: Want to get product patents on a medical use of each protein structure, but to do this need to identify biological function. Perhaps academic groups should try to at least guess a function and medical use for each protein structure released.
Report of Task Force: 4. Intellectual Property Rights (Marv Cassman)
Marv Cassman - The goal of the program is to get the coordinates out to the community as soon as possible. You want to characterize function for IP, you do it on your own time.
If it is not high throughput - it is nothing.
J Strauss:
Stevenson-Wydler Act and Technology Transfer Act of US - require efforts to protect intellectual property. Should look carefully at SNP consortium to see how well the pre-competitive strategy of free data release works.
S Burley:
Pointed out that there are lost opportunity costs to investigators and institutes if patents are not pursued. Brought up concept of "software patent" - patenting coordinates as "machine readable code for drug discovery and design".
Report of Task Force: 5. Publication (Guy Dodson)
* Task Force report attached
* Encourage publication of at least a short "Structure Note", in format like Acta Cryst. C uses for small molecules. Several structural biology journals will support these.
* People expect 100 - 200 structures from SG projects by end of 2001
* People will not be obliged to publish as they release coordinates. Could hold back publication until enough data is available for full paper - but would have to release coordinates in 3 wks - 6 months after "completing" structure.
cDNA Repository (Josh Labaer, Harvard Medical School)
Complete (extensive?) set of cDNAs for:
- human
- yeast
- fly
Prevalidated and sequenced. Enable rapid transfer to GateWay vectors with tags at either end, no tags, etc.
Automated NMR Structure Determination and Refinement (M. Nilges)
Excellent progress on automated NOESY analysis.
Proposes all data analysis could be done in 1 day.
HMMs based on SCOP (Cyrus Chothia)
Can identify folds for 45% of bacterial genomes and 30% of metazoan genomes.
Claims 1% false positive rate.
Server: stash.mrc-lmb.cam.ac.uk/superfamily
Summary of International Structural Genomics Projects (3 - 10 min reports)
I Bertini (Italy) - provided nice summary of SG around the world.
A Joachimiak (USA) - developed web-based data base for accessing SG data
S Burley (USA) - claims 80% of targets were soluble. Developed "automated bioinformatics pipeline" to generate targets. Test - place protein at 1 mg/mL in low salt - proteins which are monodisperse under these conditions have 70% likelihood to provide crystals
G Montelione / M Gerstein - ~ 20 structures from NESG so far. SPINE db provides approach for integrating efforts across project and for data mining. CryoProbes and automated analysis methods provide resonance assignments for BPTI in 4 hrs of data collection plus 2 hrs of processing -- can expect major breakthrough in NMR for SG using cryoprobes
BC Wang (USA) - aiming to have 3D structure in 30 min. Focus on single-wavelength anomalous dispersion using S-S or S groups in proteins on home source
Ian Wilson (USA) - effort at Scripps is a close collaboration with Syrrx. No structures yet. Impressive robotization effort. Pilot project on expressing proteins in yeast in progress at Salk.
Berlin Structure Factory (Germany) - expressing in both E coli and yeast. Nice effort in HTP crystallization with robotics. Have effort in SAR-by-NMR.
RIKEN (Japan) - see progress report at www.rsgi.riken.go.jp. Found much better expression in pET11a without IPTG induction (this is funny??). Using "normal L broth". Get much better solubilization of some proteins at pH 6. Suggests it helps to try solubilization with different pH values as some proteins have pH dependent solubility. Claims to now have 15,000 human cDNAs. They are putting ~ 1000 of these into GateWay. Sequence data is being released in PDBJ? (check this and provide info to Burkhard).
Conclusions of Meeting
1. A policy statement outlining policy conclusions will be released as Press Release and Document in mid April.
2. Policy calls for rapid release of protein structures determined in publicly funded SG efforts. In general, release into PDB would follow soon (~ 3 weeks) after completion of the structure. Some of these may be put "on hold" for up to 6 months to evaluate scientific and IP issues.
3. Policy discourages patenting of coordinates without a clear use. This seems like a vague statement, since use is a requirement for patenting.
4. Policy encourages relationships between publicly funded structural genomic centers and private entities. This is a turnaround from previous statements.
5. Major obstacles to structural genomics remain protein production and crystallization.
| Agenda | Handout 1 | Handout 2 | Handout 3 | Handout 4 | Handout 5 | Handout 6 |