Azara, v2.7, copyright (C) 1993-2002 Wayne Boucher and Department of Biochemistry, University of Cambridge.
Date help created: 28 Dec 1993
Date last updated: 30 Oct 2002
If accessing this information from the azara program: When the help pauses, hit <carriage return> to continue, or any character followed by <carriage return> to stop.
e-mail address (bugs, etc.): azara@bioc.cam.ac.uk
Azara is a suite of programs to process and view NMR data. Copies of the source code are available from the above address. See the LICENSE for the terms and conditions of use. See the INSTALL notes for installation information.
The programs are available via anonymous ftp. See README-2.7 in the top-level directory. CHANGES are occasionally made to the release code.
The problem of support still needs to be worked out, but one statement definitely holds: no LICENSE, no support.
First, a quick guide to the programs currently available. [Motif] means that the Motif libraries are needed to compile the program, and an X server is needed to run it. [GL] means that the GL and X libraries are needed to compile the program, and an X server is needed to run it.
process :
A general multi-dimensional NMR processing program. It can be used just to convert unblocked data to blocked data, for use in the other programs.
plot2 : [Motif]
Allows contouring and viewing of (2-dimensional) planes from one or more data files, with hardcopy output also available. Also allows (approximate) phasing of (1-dimensional) slices (rows or columns) of the planes.
plot1 : [Motif]
Allows processing and viewing of 1-dimensional data, with 'real-time' control over arbitrary parameters. Hardcopy output is also available.
connect :
Matches crosspeaks to one or more pairs of shifts.
contours :
Contours (two-dimensional) planes from multi-dimensional data. The contours are output in a format suitable for use by Per Kraulis' program Ansig.
viewer : [GL]
Allows viewing (but nothing else) of contour files created by the 'contours' program.
Finds extrema in a spectrum, and optionally allows a simple parabolic fit of the extrema centers.
peak_fit :
Fits extrema (magnitude, phase, center and linewidth) in a spectrum using process scripts.
combine :
Combines two or more separate data sets, e.g. by adding them together. Only a couple of combining functions are currently defined.
project :
Projects multi-dimensional data onto chosen dimensions. It can also be used to permute the ordering of the dimensions of data. In particular, any 2 dimensions of a multi-dimensional data file can be transposed.
extract :
Extracts (hyper)planes from multi-dimensional data. Useful for testing process on smaller data sets.
deflate :
Compresses data by zeroing all data below a specified level (in absolute value), and then using 'run-length' encoding. Can be used as input to Per Kraulis' program Ansig.
reflate :
Uncompresses data compressed using deflate.
slides :
Allows multiple Postscript files to be combined into one Postscript file.
Calculates the principal components of a group of spectra.
unblock :
Converts blocked data to unblocked (i.e., sequential) data. This provides a possible route to importing data into other programs.
Each program has (most of) its source code in its own directory. To find out more about a given program, type
<program> help
There are some other directories.
global :
Contains source code that is used by more than one program.
utility :
Contains miscellaneous utility programs (e.g. 'bin2asc', which converts binary data to ascii). See the README file in the utility directory for a description of programs.
bin :
Contains copies of (links to) the programs (executables).
help :
Contains the source (text) for all the help files.
html :
Contains HTML files for use with Web browsers. This is the recommended way for viewing the help files.
azara :
Typing 'azara help' prints out this information.
The normal entry point into the suite is via the program process. All the other programs assume that the data has a 'blocked' structure. process automatically creates blocked data from unblocked (sequential) data. process accepts data that is blocked or unblocked for input.
Blocked data files do not have headers as part of the data file. Instead, associated with every data file is a so-called par file which describes the data. This par file must also exist for unblocked data files. The par file must be created by hand for unblocked data files. All the other par files needed will be created by the programs (except the referencing may need changing).
The par files are plain text and so can be edited, but beware: only the referencing and the file name should be edited.
The processing programs all have an associated 'script' file, which specifies the input par file, the output data file, and whatever other parameters are needed. An output par file will be created, if that makes sense (it does not for contours or unblock, for example). Thus, a typical script file will look like
input <par file of input data file>
output <output data file>
[other parameters]
The output par file will have the name '<output data file>.par' and will appear in the directory in which the program is run, unless it cannot be created, in which case it will appear in the same directory as the <output data file>.
If another name for the output par file is desired then the following script can be used instead
input <par file of input data file>
output <output data file>
par <par file of output data file>
[other parameters]
For explicit examples of script files for a given program, type
<program> help
The general structure of these script files, and also par files, is that each line will have the form (except for comments)
<keyword> <one or more parameter values>
All parameter values must be given explicitly for every keyword (i.e. there are no implicit default values), but some keywords are optional (such as par above).
Comments in (non-data) files are everything in a line following an occurrence of the character '!'. Blank lines are allowed. White space separates parameters in a line.
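The line format described above (keyword, then values, with '!' starting a comment) can be read with a few lines of code. The following is a minimal sketch in Python, not the parser used by the Azara programs; the function name and the file names in the example are made up for illustration.

```python
def parse_keyword_lines(text):
    """Parse script/par-style text into a list of (keyword, values) pairs.

    Illustrative sketch only; this is not Azara's own parser.
    """
    entries = []
    for raw_line in text.splitlines():
        # Everything from the first '!' to the end of the line is a comment.
        line = raw_line.split("!", 1)[0]
        # White space separates the keyword and its parameter values.
        fields = line.split()
        if not fields:
            continue  # blank line, or a line holding only a comment
        entries.append((fields[0], fields[1:]))
    return entries


# Hypothetical script text, for illustration only.
example = """\
! a comment line
input  spectrum.bin.par   ! par file of the input data file
output spectrum.spc

"""

for keyword, values in parse_keyword_lines(example):
    print(keyword, values)
```

Keeping the result as an ordered list (rather than a dictionary) matters for par files, where the same keywords recur once per dimension.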
Dimensions of data always present a problem with conventions. In Azara the dimension of data that is 'fastest' on disk is 'dim 1', the dimension that is second fastest is 'dim 2', etc. Thus the acquisition dimension in NMR experiments will be dimension 1.
Point counting is another place where there is a problem with conventions. In par files, points are counted in real points, even for complex data. This is to avoid having to specify whether the data is real or complex in the given dimensions. Thus a dimension with 16 complex points would have 32 points in the par file. However, in the program process, commands that need points assume that the count is given in terms of complex points for complex data.
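The two counting conventions differ only by a factor of two for complex data. A small Python sketch of the conversion follows; the function names are illustrative and are not part of Azara.

```python
def par_npts(n_points, is_complex):
    """Value to write after 'npts' in a par file (always counted in real points)."""
    return 2 * n_points if is_complex else n_points


def complex_points(npts_in_par):
    """Complex point count for a complex dimension, given its par file 'npts'."""
    if npts_in_par % 2 != 0:
        raise ValueError("a complex dimension needs an even number of real points")
    return npts_in_par // 2


print(par_npts(16, is_complex=True))  # 16 complex points -> 32 in the par file
print(complex_points(32))             # 32 in the par file -> 16 complex points
```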
All binary data files exported from the processing programs have the data as 4-byte floating point (with exceptions of the programs contours and deflate, which have some integer data).
For more information about blocked data (and the casual user will not need to know any more), type
azara help blocked
and for more information about par files (and every user will need to know more), type
azara help par
It is easiest to describe blocked data by considering an example. The corresponding statements are true no matter what the dimension.
Let N1, N2, N3 be the number of (real) points in the three dimensions of a three-dimensional data set.
A sequential ordering of the data has N3 sets of (N2 sets of N1 data points). A blocked data file chops up this 'cube' of data into sub-cubes. This makes for faster access of the data in dimensions 2 and 3.
Let B1, B2, B3 be the number of points in the three dimensions of a block (here, sub-cube). Then B = B1 x B2 x B3 is the size of one block.
The first B points in the blocked data file correspond to the first sub-cube of the cube of the sequential data file, the next B points correspond to the second sub-cube, etc. The ordering of the data in a sub-cube is inherited from the ordering of the sequential data file.
A block may be specified by its position in the (blocked) cube in the same way a point may be specified by its position in the (sequentially ordered) cube. This position may either be specified as a 3-vector (thinking of the data geometrically) or as a single number (thinking of the sequential ordering).
Blocked data files always have an integral number of blocks, even if N1 (resp. N2, N3) is not a multiple of B1 (resp. B2, B3). This padding of data can waste a bit of disk space, but such is life. Let M1 (resp. M2, M3) be the smallest multiple of B1 (resp. B2, B3) that is >= N1 (resp. N2, N3). Then the blocked data file is actually of size M1 x M2 x M3.
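The padded sizes are simple to compute. The sketch below, in Python, rounds each N up to the next multiple of B and estimates the size of the blocked file, assuming 4-byte floating point data and no compression. The function names and the example sizes are chosen for illustration only.

```python
def padded_sizes(npts, block):
    """Smallest multiple of each block size that is >= the number of points."""
    return [((n + b - 1) // b) * b for n, b in zip(npts, block)]


def blocked_file_bytes(npts, block, bytes_per_point=4):
    """Size in bytes of the blocked data file (assumes 4-byte points, no header)."""
    total_points = 1
    for m in padded_sizes(npts, block):
        total_points *= m
    return total_points * bytes_per_point


# Illustrative sizes: N1 = 500 is not a multiple of B1 = 64, so it is padded.
npts = [500, 256, 60]
block = [64, 16, 4]
print(padded_sizes(npts, block))        # [512, 256, 60]
print(blocked_file_bytes(npts, block))  # 31457280
```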
As an example, consider the (3-vector) point (x1, x2, x3) in the cube. This is position x1 + x2*N1 + x3*N1*N2 in the sequentially ordered file. It is also point (x1 % B1, x2 % B2, x3 % B3) in block (x1/B1, x2/B2, x3/B3). (Here, % means remainder, and x/B means the integral part of the quotient.) Conversely, point (y1, y2, y3) in block (b1, b2, b3) corresponds to the point (y1 + b1*B1, y2 + b2*B2, y3 + b3*B3) in the cube.
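The index arithmetic above can be written out as follows. This is a sketch in Python, not Azara's block access code. It assumes 0-based points, and it assumes that blocks are stored one after another in sequential order with dimension 1 fastest, as the description of block positions suggests. All function names are illustrative.

```python
def point_to_block(point, block):
    """Split a 0-based point into (block position, position within the block)."""
    block_pos = tuple(x // b for x, b in zip(point, block))
    in_block = tuple(x % b for x, b in zip(point, block))
    return block_pos, in_block


def block_to_point(block_pos, in_block, block):
    """Inverse of point_to_block: y + b*B in each dimension."""
    return tuple(y + p * b for y, p, b in zip(in_block, block_pos, block))


def sequential_index(position, sizes):
    """x1 + x2*N1 + x3*N1*N2 + ... with dimension 1 fastest."""
    index, stride = 0, 1
    for x, n in zip(position, sizes):
        index += x * stride
        stride *= n
    return index


def blocked_index(point, npts, block):
    """Position (in points) of a point within the blocked file.

    Assumes blocks are ordered sequentially, dimension 1 fastest.
    """
    blocks_per_dim = [(n + b - 1) // b for n, b in zip(npts, block)]
    block_pos, in_block = point_to_block(point, block)
    block_size = 1
    for b in block:
        block_size *= b
    return (sequential_index(block_pos, blocks_per_dim) * block_size
            + sequential_index(in_block, block))


npts, block = (512, 256, 64), (64, 16, 4)
point = (100, 20, 5)
print(point_to_block(point, block))        # ((1, 1, 1), (36, 4, 1))
print(sequential_index(point, npts))       # 665700
print(blocked_index(point, npts, block))   # 562468
```

Multiplying the blocked index by 4 gives a byte offset for 4-byte data.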
In Azara, B1, B2 and B3 are powers of 2. This is for the convenience of typical NMR processing. However, all of the block access routines are written so that B1, B2 and B3 could be anything. Again, the corresponding statements are true no matter what the dimension.
Blocked files do not have headers, they are just rearrangements of sequential data files. In place of headers there are par files. To find out more information about par files, type
azara help par
A par file is used to describe the dimensions, referencing, etc., of a data set. It is a text file, hence can be edited. A par file must have the following at the very least:
ndim <number of dimensions of associated data set>
file <file name of associated data set>
and then for each dimension
dim <dimension number, from 1 to number of dimensions>
npts <number of (real) points for this dimension>
Optionally, before the occurrence of the first 'dim', there may be one or more of
head <length of header of data file, in (4-byte) words>
int ! integer (i.e. not floating point) data
swap ! data has wrong byte ordering
big_endian ! data file has big endian byte ordering
little_endian ! data file has little endian byte ordering
deflate <level> ! data has been compressed at <level>
reflate <level> ! data has been compressed at <level>
                ! and then uncompressed
blocks <desired block sizes for each of the dimensions>
varian <dimension ordering>
and for each dimension there may optionally be one or more of
block <block size for this dimension>
sw <spectral width in Hz, e.g. 8065>
sf <spectrometer frequency in MHz, e.g. 600>
refppm <ref. ppm of ref. point for this dim., e.g. 4.72>
refpt <reference point for this dimension, e.g. 512.5>
nuc <nucleus for this dimension, e.g. 1H or 13C or 15N>
and for one dimension there may optionally be
params ! list of length npts parameters
If 'block' occurs for one dimension it must occur for all. If it does not occur then the associated data file is assumed to be unblocked (i.e. sequential). Only process allows the data to be unblocked.
If 'blocks' occurs then the data file must be unblocked, and hence is only relevant for data files that are used as input to process. The specified block sizes are then used for the output data file (otherwise process determines the block sizes).
'varian' should be used for data acquired on Varian spectrometers, and in this case <dimension ordering> specifies the ordering of the data. For example, for a 3D experiment where the data for each FID in the acquisition dimension is ordered as RR, IR, RI, II (R = real, I = imaginary) then 'varian 2 3' would be used, whereas if the data is ordered as RR, RI, IR, II then 'varian 3 2' would be used. Currently this is only allowed (and should only be needed) for process.
'params' is currently only used in plot2 in the fitting module. It can specify, for example, relaxation time or temperature or pH that is being varied from one plane to the next in a 3D experiment.
Any of the referencing information that is missing will be given a default (which will be wrong of course). Only some of the programs make use of the referencing information.
It is suggested that the correct referencing be put in the initial par file (i.e., for the unblocked data), but it should be remembered that process does not modify this information in any way. Alternatively, the referencing in the par file of the spectrum should be edited.
SPECIAL WARNING: Incorrect referencing can be a source of error. You have been warned.
The par file must be created by hand for unblocked data files. All the other par files needed will be created by the programs.
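Since the initial par file is written by hand, a small script can save typing when many data sets share a layout. The Python sketch below builds the text of a minimal par file using only the keywords described above. The function name and the file path are hypothetical, and the result should still be checked by eye.

```python
def minimal_par_text(data_file, npts, integer_data=False, swap=False):
    """Return the text of a minimal par file for an unblocked data file.

    npts lists the number of (real) points per dimension, dimension 1 first.
    """
    lines = [f"ndim {len(npts)}", f"file {data_file}"]
    # Optional keywords must come before the first 'dim'.
    if integer_data:
        lines.append("int")
    if swap:
        lines.append("swap")
    for dim, n in enumerate(npts, start=1):
        lines.append(f"dim {dim}")
        lines.append(f"npts {n}")
    return "\n".join(lines) + "\n"


# Hypothetical path; the use of the full path name is recommended.
print(minimal_par_text("/path/to/data.bin", [1024, 256, 64],
                       integer_data=True, swap=True))
```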
A typical par file for input to process might be (if the data was collected on a Bruker AMX and processed on a Silicon Graphics Indigo, or other Unix machine with the same byte ordering)
! /usr/people/wb104/edl387/edl387_5.bin.par

ndim 3          ! data is 3 dimensional
file /usr/people/wb104/edl387/edl387_5.bin      ! name of data file
                ! use of the full path name is recommended
int             ! data is integer
swap            ! data has wrong byte ordering

dim 1           ! dimension 1 parameters
npts 1024       ! 1024 (real) points

dim 2           ! dimension 2 parameters
npts 256        ! 256 (real) points

dim 3           ! dimension 3 parameters
npts 64         ! 64 (real) points
and using this, a typical par file output by process might be
! /usr/people/wb104/edl387/edl387_5.spc.par
ndim 3
file /usr/people/wb104/edl387/edl387_5.spc

dim 1
npts 512
block 64
sw 1000.00
sf 500.00
refppm 1.00
refpt 1.0
nuc 1H

dim 2
npts 256
block 16
sw 1000.00
sf 500.00
refppm 1.00
refpt 1.0
nuc 1H

dim 3
npts 64
block 4
sw 1000.00
sf 500.00
refppm 1.00
refpt 1.0
nuc 1H
and this might be then correctly referenced to read
! /usr/people/wb104/edl387/edl387_5.spc.par
ndim 3
file /usr/people/wb104/edl387/edl387_5.spc

dim 1
npts 512
block 64
sw 4032.8
sf 600.1
refppm 4.72
refpt 512.5
nuc 1H

dim 2
npts 256
block 16
sw 8065.5
sf 600.1
refppm 4.72
refpt 128.5
nuc 1H

dim 3
npts 64
block 4
sw 1016.7
sf 60.82
refppm 117.4
refpt 32.5
nuc 15N
This correct referencing could have been put in the original par file.
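To see what the referencing values imply, the sketch below converts a point number to ppm in Python. This help text does not state the formula, so the one used here is an assumption: the common NMR convention in which ppm decreases as the point number increases, with the point 'refpt' lying at 'refppm'. Check it against the programs before relying on it. The function name is illustrative.

```python
def point_to_ppm(point, npts, sw, sf, refppm, refpt):
    """Convert a point number to ppm.

    ASSUMPTION: ppm decreases with increasing point number and the point
    'refpt' lies at 'refppm'. This convention is not taken from the help text.
    sw is in Hz, sf is in MHz, so sw / sf is the spectral width in ppm.
    """
    ppm_per_point = sw / (npts * sf)
    return refppm + (refpt - point) * ppm_per_point


# Dimension 1 of the correctly referenced example above.
dim1 = dict(npts=512, sw=4032.8, sf=600.1, refppm=4.72, refpt=512.5)
print(point_to_ppm(512.5, **dim1))  # the reference point itself
print(point_to_ppm(1.0, **dim1))    # the first point of the dimension
```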
Azara help: azara / W. Boucher / azara@bioc.cam.ac.uk