Date help created: 25 Sep 1995
Date last updated: 18 Oct 1996
'peak_fit' fits peaks in multi-dimensional NMR spectra. The companion program peak_find finds the extrema to fit.
The input data must be blocked and must be real.
To run the program type
peak_fit [<memory in Mwords>] <peak_fit script file>
The <memory in Mwords> is optional. By default 2 Mwords (8 Mbytes) are allocated for the main storage. In general, the more storage that is allocated, the less i/o to and from disk is required.
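The default quoted above (2 Mwords being 8 Mbytes) implies 4 bytes per word. The following is a minimal sketch, under that assumption, of how a memory budget in Mbytes could be turned into the command line; the function names and the script file name 'fit.scr' are hypothetical and are not part of peak_fit.

```python
# Assumption: 4 bytes per word, inferred from "2 Mwords (8 Mbytes)".
BYTES_PER_WORD = 4


def mwords_for_mbytes(mbytes):
    """Largest whole number of Mwords that fits in the given Mbytes."""
    if mbytes < BYTES_PER_WORD:
        raise ValueError("need at least %d Mbytes for 1 Mword" % BYTES_PER_WORD)
    return mbytes // BYTES_PER_WORD


def command_line(script_file, mbytes=None):
    """Argument list for: peak_fit [<memory in Mwords>] <script file>."""
    args = ["peak_fit"]
    if mbytes is not None:
        # The memory argument is optional and comes before the script file.
        args.append(str(mwords_for_mbytes(mbytes)))
    args.append(script_file)
    return args


print(" ".join(command_line("fit.scr", mbytes=64)))
```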
The program is still in its early stages of development (even more so than the companion program peak_find). It can be used in conjunction with Per Kraulis' Ansig program by using the macros written by Colin Hardman.
The program works by starting with an (ideal) oscillator in time-space, and using a specified process script to determine how this oscillator is transformed into frequency-space. The transformed data is compared with the actual data, and the oscillator parameters adjusted until an acceptable fit (if one exists) is obtained.
It is expected that the script will be similar to that used to process the spectrum in the first place, but, for example, there is normally no phasing required starting from the ideal oscillator even if the actual spectrum was phased.
Only peaks that satisfy a quality criterion are output.
More details of the fitting procedure may be obtained by typing
peak_fit help fitting
The peak_fit script file has dimension-independent key words followed by, for zero or more dimensions, dimension-dependent key words. The dimension-independent key words must appear before the dimension-dependent ones.
There must be no more than one key word per line.
Below <...> represents an argument for a key word and [...] represents a key word that is optional.
The syntax for the dimension-independent key words is
input_data <par file of input data file>
input_peak <input peak file>
[ output_peak <output peak file> ]
[ output_ansig <output peak file, Ansig variant> ]
[ output_ideal <output ideal file> ]
[ output_rest <output rest file> ]
[ par_ideal <par file of output ideal file> ]
[ par_rest <par file of output rest file> ]
chisq <maximum chisq>
[ subtract ]
[ group ]
[ deflate <level> ]
The syntax for the dimension-dependent key words (the non-optional ones must appear for any dimension that is fitted) is
dim <dimension>
width <half-width in points>
npts <number of points for oscillator>
[ complex ]
[ script <process script file> ]
[ script_com ]
[ freq <zero-frequency point> <fraction of sw> ]
[ range <first point> <last point> ]
[ phase <fixed phase> ]
[ decay <fixed decay> ]
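The ordering rules above (one key word per line, input_data first, dimension-independent key words before dimension-dependent ones, dim first within each dimension) can be illustrated with a small script generator. This is only a sketch: the function name, the dictionary layout and all file names and values used with it are hypothetical, and only a subset of the key words is covered.

```python
def build_script(input_par, input_peak, max_chisq, dims,
                 output_peak=None, subtract=False, group=False):
    """Return the text of a peak_fit script file, one key word per line.

    dims is a list of dicts with keys 'dim', 'width', 'npts', 'script'
    and optionally 'complex' (a layout chosen for this sketch only).
    """
    # input_data must be the first key word in the script file.
    lines = ["input_data %s" % input_par, "input_peak %s" % input_peak]
    if output_peak:
        lines.append("output_peak %s" % output_peak)
    lines.append("chisq %s" % max_chisq)
    if subtract:
        lines.append("subtract")
    if group:
        lines.append("group")
    # Dimension-dependent key words follow, with dim first for each dimension.
    for d in dims:
        lines.append("dim %s" % d["dim"])
        lines.append("width %s" % d["width"])
        lines.append("npts %s" % d["npts"])
        if d.get("complex"):
            lines.append("complex")
        lines.append("script %s" % d["script"])
    return "\n".join(lines) + "\n"


# Hypothetical file names and values, for illustration only.
print(build_script("spec.par", "peaks.in", 2.0,
                   [{"dim": 1, "width": 3, "npts": 256,
                     "script": "proc1.scr", "complex": True}],
                   output_peak="peaks.out", subtract=True))
```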
A description of the key words may be obtained by typing
peak_fit help <key word>
A description of the format of the input peak file (the output peak file for peak_find) may be obtained by typing
peak_find help peak_format
A description of the format of the output peak file may be obtained by typing
peak_fit help peak_format
or
peak_fit help peak_ansig
for the Ansig variant.
The peak file produced by the program has an ascii tab-separated format, with two header lines followed by one line (record; row) per extremum. The first header line contains the column titles. The second header line contains an 'N' in each column.
The first column contains the extremum value (title 'extr') and the remaining columns contain the extremum position in points (title 'pntD') and ppm (title 'ppmD') in each dimension, D, then the same data adjusted for the previous peaks if their fitting is to be subtracted (titles 'extrs', 'pntsD' and 'ppmsD'), then the magnitude of the fitted peak (title 'magn'), which can be negative, the chi squared of the fit (title 'chisq'), and finally the phase (title 'phaseD'), decay (title 'decayD') and fitted position in points (title 'pntfD') and ppm (title 'ppmfD') in each dimension.
The value of 'magn' is the magnitude of the fitted oscillator. This thus depends on the process script used to do the fitting.
The value of 'chisq' is a scaled fitted chisq.
The value of 'phaseD' is in degrees, and represents the amount by which the peak should have been phased in addition to whatever it (and all else) is phased in the process script.
The value of 'decayD' represents the amount to which the oscillator has decayed at its last point relative to its first point, and is the same as that used in 'decay' in process and plot1.
As an example, in 2D if both dimensions are fitted, the titles would be, if the fitting of previous peaks is subtracted,
extr pnt1 ppm1 pnt2 ppm2 extrs pnts1 ppms1 pnts2 ppms2 magn chisq phase1 decay1 pntf1 ppmf1 phase2 decay2 pntf2 ppmf2
and, if the fitting of previous peaks is not subtracted,
extr pnt1 ppm1 pnt2 ppm2 magn chisq phase1 decay1 pntf1 ppmf1 phase2 decay2 pntf2 ppmf2
Note, these titles all appear on one (the first) line.
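The layout described above (tab-separated, a title line, a line of 'N' entries, then one record per extremum) can be read with a few lines of Python. This is a minimal sketch that assumes every data field is numeric; the function name is hypothetical and the sample values are made up purely to show the shape of the file.

```python
import csv
import io


def read_peak_file(text):
    """Parse peak file text into (titles, list of {title: value} records)."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    titles = rows[0]          # first header line: the column titles
    # rows[1] is the second header line ('N' in each column); it is skipped.
    peaks = []
    for row in rows[2:]:
        if not row:           # tolerate blank lines
            continue
        # Assumption: all data fields are numeric.
        peaks.append({t: float(v) for t, v in zip(titles, row)})
    return titles, peaks


# Made-up 1D example (no subtraction), for illustration only.
sample = (
    "extr\tpnt1\tppm1\tmagn\tchisq\tphase1\tdecay1\tpntf1\tppmf1\n"
    "N\tN\tN\tN\tN\tN\tN\tN\tN\n"
    "1.5\t10\t4.70\t2.0\t0.1\t0.0\t0.5\t10.2\t4.69\n"
)
titles, peaks = read_peak_file(sample)
print(titles)
print(peaks[0]["chisq"])
```

Records whose 'chisq' is of interest can then be selected with an ordinary list comprehension over the returned dictionaries.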
More details of the fitting procedure may be obtained by typing
peak_fit help fitting
There is an alternative peak file produced by the program for import into Ansig. This has multiple lines for the header and also for each extremum (record; row) because Ansig has a limit of 80 characters per line. The data has an ascii tab-separated format. The data output is a subset of the normal peak_format output.
The first line (of the header and each extremum) contains the extremum value (title 'extr'), the magnitude of the fitted peak (title 'magn'), which can be negative, the chi squared of the fit (title 'chisq') and finally the adjusted extremum value (title 'extrs') if the subtract option is used.
The second line contains the extremum position in ppm (title 'ppmD') in each dimension, D.
The third line contains the adjusted extremum position in ppm (title 'ppmsD') in each dimension, D, if the subtract option is used.
The final line contains the fitted extremum position in ppm (title 'ppmfD') in each dimension, D.
As an example, in 3D if all dimensions are fitted, the titles would be, if the fitting of previous peaks is subtracted,
extr magn chisq extrs
ppm1 ppm2 ppm3
ppms1 ppms2 ppms3
ppmf1 ppmf2 ppmf3
and, if the fitting of previous peaks is not subtracted,
extr magn chisq
ppm1 ppm2 ppm3
ppmf1 ppmf2 ppmf3
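The line-by-line layout of the Ansig variant described above can be written down as a small function that produces the title lines. This is a sketch only: it assumes that all dimensions are fitted, and the function name is hypothetical.

```python
def ansig_title_lines(ndim, subtract):
    """Title lines of the Ansig variant peak file, one string per line."""
    dims = range(1, ndim + 1)
    # First line: extremum value, fitted magnitude, chisq, and the
    # adjusted extremum value only if the subtract option is used.
    first = ["extr", "magn", "chisq"]
    if subtract:
        first.append("extrs")
    lines = [first, ["ppm%d" % d for d in dims]]
    if subtract:
        # The adjusted positions get a line of their own.
        lines.append(["ppms%d" % d for d in dims])
    # Final line: fitted positions in ppm.
    lines.append(["ppmf%d" % d for d in dims])
    return ["\t".join(line) for line in lines]


for line in ansig_title_lines(3, subtract=True):
    print(line)
```

Each record (and the header) therefore takes four lines with the subtract option and three lines without it.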
More details of the fitting procedure may be obtained by typing
peak_fit help fitting
The program assumes that in an N-dimensional spectrum, the time-space representation of each peak is a product of N decaying oscillators, one oscillator for each dimension. Each oscillator can be specified by three parameters: the frequency, the (constant) phase and the decay. Including an overall magnitude means that the time-space representation is specified by 3N+1 parameters. This time-space ideal data is transformed to frequency-space via a user-specified script, and the parameters are adjusted until an acceptable fit is obtained (if one exists).
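The time-space model described above can be sketched as follows. This is not the program's code: it assumes an exponential envelope, a frequency given in cycles per point and a phase in degrees, and it takes 'decay' to be the amplitude at the last point relative to the first. All function names are hypothetical.

```python
import cmath
import math


def oscillator(npts, freq, phase_deg, decay):
    """One decaying complex oscillator of npts points.

    Assumptions of this sketch: freq is in cycles per point, the envelope
    is exponential, and decay is the amplitude at the last point relative
    to the first point.
    """
    phase = math.radians(phase_deg)
    points = []
    for k in range(npts):
        envelope = decay ** (k / (npts - 1)) if npts > 1 else 1.0
        points.append(envelope * cmath.exp(1j * (2 * math.pi * freq * k + phase)))
    return points


def ideal_point(magnitude, oscillators, index):
    """Value of the N-dimensional ideal data at one multi-index:
    the overall magnitude times one oscillator value per dimension."""
    value = magnitude
    for osc, k in zip(oscillators, index):
        value *= osc[k]
    return value


def n_parameters(ndim):
    """Frequency, phase and decay per dimension, plus overall magnitude."""
    return 3 * ndim + 1


osc = oscillator(5, 0.0, 0.0, 0.5)
print(abs(osc[-1]), n_parameters(2))
```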
The script is similar to that used in process. However, currently there are some restrictions that do not occur in process, namely there can only be one script for each dimension, and there is no maximum entropy in two or three dimensions.
Currently the fitting is done using the data in a fixed-size hypercube around the extremum.
Not all dimensions have to be fitted, but it is generally recommended that they are.
input_data <par file of input data file>
This specifies the input data file, and must be the first key word in the script file.
input_peak <input peak file>
This specifies the input peak file. A description of the format (the output peak file for peak_find) may be obtained by typing
peak_find help peak_format
[ output_peak <output peak file> ]
This specifies the output peak file. A description of the format may be obtained by typing
peak_fit help peak_format
[ output_ansig <output peak file, Ansig variant> ]
This specifies the Ansig variant output peak file. A description of the format may be obtained by typing
peak_fit help peak_ansig
[ output_ideal <output ideal file> ]
This specifies the file of the ideal fitted data, which is the sum of the contributions due to all of the peak fittings. This is useful as a means of checking the fittings. However, this can be a bit slow to calculate, so be warned.
[ output_rest <output rest file> ]
This specifies the file of the 'remaining' data, which is the difference between the input data and ideal fitted data.
[ par_ideal <par file of output ideal file> ]
This specifies the par file of the output ideal file.
[ par_rest <par file of output rest file> ]
This specifies the par file of the output rest file.
chisq <maximum chisq>
This gives the value of chi-squared below which a fitting of a given extremum is considered to be good.
[ subtract ]
This specifies that the 'previously' fitted peaks are subtracted from the input data, before the given extremum is itself fitted. By default this is not done.
[ group ]
This specifies that nearby extrema (listed in the input peak file) should be grouped before they are fitted. In particular, two extrema, e(1) and e(M), are grouped if there are extrema e(2), ..., e(M-1) such that each pair of consecutive extrema, e(i), e(i+1), has intersecting fitting boxes (size as specified by the width). By default each extremum is fitted by itself.
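The grouping rule above is a transitive closure over pairs of extrema with intersecting fitting boxes. One way to compute it is with a union-find structure, sketched below. This is not necessarily how peak_fit does it; the sketch assumes that boxes are centred on the extremum point, extend the half-width either side in each dimension, and intersect if they share at least one point. The function names are hypothetical.

```python
def boxes_intersect(p, q, half_widths):
    """True if the boxes around points p and q overlap in every dimension.

    Assumption: a box spans [x - w, x + w] in each dimension, so two boxes
    share a point when the centres are at most 2*w apart.
    """
    return all(abs(a - b) <= 2 * w for a, b, w in zip(p, q, half_widths))


def group_extrema(points, half_widths):
    """Return groups (lists of indices into points) of chained extrema."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parents to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if boxes_intersect(points[i], points[j], half_widths):
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Extrema 0 and 2 do not overlap directly but are chained through 1.
print(group_extrema([(10, 10), (14, 10), (18, 10), (40, 40)], (2, 2)))
```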
[ deflate <level> ]
This specifies that the output ideal spectrum should be compressed, via deflate, using the specified <level>. The output rest spectrum is not compressed. By default the output ideal spectrum is not compressed.
dim <dimension>
If any dimension-dependent key word is used for a given dimension then dim must be the first such key word. A given dimension will appear if and only if the data is fitted in that dimension.
width <half-width in points>
This specifies the half-width of the fitting in points, with a half-width of w meaning that (in general) the 2*w+1 points around the extremum point are used to do the fitting. The methodology for choosing the fitting width might change in future (e.g. depend on the extremum).
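As a small illustration of the half-width, the range of points used in one dimension could be computed as below. The sketch counts points from 1 and assumes that the "in general" above refers to the box being cut off at the edge of the data, which is a guess; the function name is hypothetical.

```python
def fit_range(extremum_point, half_width, npoints):
    """First and last point (counted from 1) of the fitting box in one
    dimension. Assumption: the box is clipped at the edges of the data."""
    first = max(1, extremum_point - half_width)
    last = min(npoints, extremum_point + half_width)
    return first, last


first, last = fit_range(10, 3, 64)
print(first, last, last - first + 1)
```

Away from the edges the number of points is 2*w+1, as stated above; in N fitted dimensions the fitting box is the product of these ranges.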
npts <number of points for oscillator>
This is the number of (complex) data points of the oscillator in the given dimension. In principle this could be determined from the process script, but for ease of coding it must be specified independently. This might change in future.
[ complex ]
This specifies that the process script transforms the oscillator into complex data. By default it is assumed that the script transforms the oscillator into real data. This can be determined from the process script, but for now this is used as an independent check. Complex transformed data requires the script to be run twice for each iteration of the fitting algorithm, whereas real transformed data requires the script to be run four times, hence it is recommended that this key word be used. Note that the input data must be real, and generally the script that is used for the fitting can be obtained from that which produced the input data (spectrum) by removing the reduction to real data in the latter script.
[ script <process script file> ]
This specifies the process script to be used in the fitting. Either this or script_com must appear for any dimension that is being fitted. For more information about process scripts, type
process help
[ script_com ]
This indicates the start of the commands used in the process script for the fitting. The end of the commands is indicated by 'end_script'. Either this or script must appear for any dimension that is being fitted. For more information, type
process help
[ freq <zero-frequency point> <fraction of sw> ]
This determines what the <zero-frequency point> is, as well as the <fraction of sw> represented in the input data file. By default, if there are 2P real points in the input data file, then the <zero-frequency point> is at point P (starting the count at 1) and the entire spectral width is present, so the <fraction of sw> is 1.
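The default described above can be written out as a short function. This is a sketch only: the function name is hypothetical, and rejecting an odd number of points is a choice made here, since the text only covers the case of 2P points.

```python
def default_freq(n_real_points):
    """Default (<zero-frequency point>, <fraction of sw>) for a dimension
    with 2P real points: point P (counting from 1) and the full width."""
    if n_real_points % 2:
        raise ValueError("the stated default assumes an even number of points")
    return n_real_points // 2, 1.0


print(default_freq(512))
```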
[ range <first point> <last point> ]
This restricts the range of points in the given dimension for the output of the ideal and rest spectra. By default there is no restriction.
[ phase <fixed phase> ]
This specifies that the phase for the spectrum of the ideal oscillator in the given dimension should be set to be the <fixed phase>. In particular the phase is not fit. By default the phase is not fixed.
[ decay <fixed decay> ]
This specifies that the decay for the spectrum of the ideal oscillator in the given dimension should be set to be the <fixed decay>. In particular the decay is not fit. By default the decay is not fixed.
Azara help: peak_fit / W. Boucher / azara@bioc.cam.ac.uk