Lab Documents
Development of Spectral Archive of NMR Data

The Montelione lab is currently working to automate the process of NMR
analysis. Automated NMR analysis techniques will play an essential role
in the field of structural genomics. As a member of the Northeast
Structural Genomics Consortium, the Montelione lab will be working to
solve protein structures for a representative of every protein family in
the human genome. As one can imagine, a project of this magnitude will
generate an enormous amount of spectral data. The purpose of the SAND
project is to design and implement a database to store the lab's spectral
data.

The Spectral Archive of NMR Data (SAND) was developed using ORACLE 8i on a
Linux 6.2 platform. The primary job of the database is to track NMR
spectral files as well as related experimental data. One of the main
concerns was whether the database could hold the large volume of
information anticipated. Because of this concern, it was decided to store
the spectral files themselves in the file system. To organize the
directory structure, a dynamic directory creation system was implemented.
SAND is unique in that it creates directories on the fly within the file
system, each corresponding to a particular NMR file. The program then
physically moves the file from the directory it was uploaded to into the
corresponding directory that was just created. This allows users to
access data either by querying the database or by simply browsing a
logically organized directory structure. The dynamic directory creation
and file migration are coded in Java servlets, JavaServer Pages (JSP),
PL/SQL, SQL, and Perl.
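
The create-then-move step described above can be illustrated with a short
sketch. This is not the SAND code, which is written in Java servlets, JSP,
PL/SQL, and Perl; it is a minimal Python illustration of the same idea. The
function name archive_file and the protein / experiment type / experiment ID
directory levels are hypothetical, chosen only for the example.

```python
import shutil
import tempfile
from pathlib import Path


def archive_file(uploaded: Path, archive_root: Path, *parts: str) -> Path:
    """Create archive_root/parts... if needed and move `uploaded` into it.

    The directory levels passed in `parts` are hypothetical; the real
    SAND layout is not described in this document.
    """
    target_dir = archive_root.joinpath(*parts)
    # Create the directory "on the fly"; existing directories are reused.
    target_dir.mkdir(parents=True, exist_ok=True)
    destination = target_dir / uploaded.name
    if destination.exists():
        # Refuse to overwrite a file that has already been archived.
        raise FileExistsError(f"{destination} is already archived")
    # Physically move the file out of the upload directory.
    shutil.move(str(uploaded), str(destination))
    return destination


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        upload_dir = root / "upload"
        upload_dir.mkdir()
        fid = upload_dir / "fid"
        fid.write_bytes(b"\x00\x01")
        moved = archive_file(fid, root / "archive", "proteinA", "hsqc", "exp001")
        print(moved.relative_to(root))
```

The returned path is what a database row would record, so the same file can
be reached either through a query or by browsing the directory tree.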

SAND features a fully web-based user interface, which can be accessed at
SANDContents.html. To enter data, the user simply enters the directory
where their experimental data is stored. SAND then AutoFills most of the
data entry fields and moves the user's experimental files to the
appropriate directory. The interface allows one to enter new spectral
data or query the database for existing datasets. Every table in the
database can be queried, and queries can use relational operators. For
example, users can search for particular experiments conducted at specific
temperatures.
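
A query of this kind might look like the following sketch. It uses Python's
built-in sqlite3 module as a stand-in for ORACLE, and the table name
experiment, its columns, and the sample rows are all hypothetical; the actual
SAND schema is not given here.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with a hypothetical experiment table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE experiment (
               experiment_id   INTEGER PRIMARY KEY,
               experiment_type TEXT NOT NULL,
               temperature_k   REAL NOT NULL,
               fid_path        TEXT NOT NULL)"""
    )
    # Made-up rows, for illustration only.
    conn.executemany(
        "INSERT INTO experiment (experiment_type, temperature_k, fid_path) "
        "VALUES (?, ?, ?)",
        [
            ("HSQC", 298.0, "archive/proteinA/hsqc/exp001/fid"),
            ("HSQC", 283.0, "archive/proteinA/hsqc/exp002/fid"),
            ("NOESY", 298.0, "archive/proteinA/noesy/exp003/fid"),
        ],
    )
    return conn


def find_experiments(conn, experiment_type, min_temp_k, max_temp_k):
    """Return (experiment_id, fid_path) rows within a temperature range."""
    rows = conn.execute(
        "SELECT experiment_id, fid_path FROM experiment "
        "WHERE experiment_type = ? "
        "AND temperature_k >= ? AND temperature_k <= ? "
        "ORDER BY experiment_id",
        (experiment_type, min_temp_k, max_temp_k),
    )
    return rows.fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for row in find_experiments(demo, "HSQC", 290.0, 300.0):
        print(row)
```

The relational operators (=, >=, <=) are passed bound parameters rather than
values pasted into the SQL string, which is the safer choice when the search
terms come from a web form.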

Lastly, users can search the procpar variable values for the FIDs. The
procpar file is parsed by a Perl program, which uses the DBI module to
connect to the database and insert the procpar variables along with their
values. To call the Perl program from within ORACLE, it is invoked from a
Java program "wrapped" in PL/SQL, which makes a Linux system call.
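
The parse-and-insert step can be sketched as follows. This is a Python
illustration rather than the lab's Perl/DBI program, and it does not cover
the Java/PL/SQL system call. It assumes each procpar parameter is a header
line beginning with the parameter name, then a values line beginning with a
count, then an enumeration line; that layout is an assumption and should be
checked against real files. The sample text, the table procpar_value, and all
function names are hypothetical, and sqlite3 stands in for ORACLE.

```python
import shlex
import sqlite3

# Made-up sample in the assumed header / values / enumeration layout.
SAMPLE = '''sfrq 7 1 1e+09 0 0 2 1 11 1 64
1 499.843
0
seqfil 2 2 8 0 0 2 1 9 1 64
1 "s2pul"
0
temp 1 1 200 -150 0.1 2 1 9 1 64
1 25
0
'''


def parse_procpar(text):
    """Return {name: [values as strings]} under the assumed layout."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    params = {}
    i = 0
    while i < len(lines):
        name = lines[i].split()[0]          # first field of the header line
        fields = shlex.split(lines[i + 1])  # count, then (quoted) values
        count = int(fields[0])
        values = fields[1:]
        i += 2
        # Assumed: extra values of an array continue on following lines.
        while len(values) < count:
            values.extend(shlex.split(lines[i]))
            i += 1
        i += 1                              # skip the enumeration line
        params[name] = values
    return params


def create_schema(conn):
    conn.execute(
        """CREATE TABLE procpar_value (
               experiment_id INTEGER NOT NULL,
               name          TEXT NOT NULL,
               value_index   INTEGER NOT NULL,
               value         TEXT NOT NULL)"""
    )


def store_procpar(conn, experiment_id, params):
    """Insert one row per value; return the number of rows inserted."""
    rows = [
        (experiment_id, name, index, value)
        for name, values in params.items()
        for index, value in enumerate(values)
    ]
    conn.executemany("INSERT INTO procpar_value VALUES (?, ?, ?, ?)", rows)
    return len(rows)


def find_by_procpar(conn, name, value):
    """Return the IDs of experiments whose procpar has name == value."""
    cursor = conn.execute(
        "SELECT DISTINCT experiment_id FROM procpar_value "
        "WHERE name = ? AND value = ? ORDER BY experiment_id",
        (name, value),
    )
    return [row[0] for row in cursor]
```

Storing each variable as its own row is what makes every procpar value
searchable with an ordinary query, for example
find_by_procpar(conn, "seqfil", "s2pul").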

Future directions for SAND include a multi-AutoFill tool for inserting
multiple experiments at once, a tool to repopulate the database from the
directory structure, a tight security system, automated backup and
recovery procedures, an interface between SAND and the lab's reagents
database, and possibly functionality to store structures.


