Help for 'peak_fit' program.

Date help created:  25 Sep 1995
Date last updated:  18 Oct 1996

'peak_fit' fits peaks in multi-dimensional NMR spectra. The companion program peak_find finds extrema to fit.

The input data must be blocked and must be real.

To run the program type

	peak_fit [<memory in Mwords>] <peak_fit script file>

The <memory in Mwords> is optional. By default 2 Mwords (8 Mbytes) are allocated for the main storage. In general, the more storage that is allocated, the less i/o to and from disk is required.
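
The default figures above imply 4 bytes per word. As a rough guide when choosing the memory argument, the conversion can be sketched in Python as follows. The function name is illustrative only and is not part of the program.

```python
# Implied by "2 Mwords (8 Mbytes)" in the help text: 4 bytes per word.
BYTES_PER_WORD = 4


def mwords_to_mbytes(mwords):
    """Convert a memory size in Mwords to Mbytes, assuming 4-byte words."""
    return mwords * BYTES_PER_WORD


print(mwords_to_mbytes(2))  # the default allocation
```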

The program is still in its early stages of development (even more so than the companion program peak_find). It can be used in conjunction with Per Kraulis' Ansig program by using the macros written by Colin Hardman.

The program works by starting with an (ideal) oscillator in time-space, and using a specified process script to determine how this oscillator is transformed into frequency-space. The transformed data is compared with the actual data, and the oscillator parameters adjusted until an acceptable fit (if one exists) is obtained.

It is expected that the script will be similar to that used to process the spectrum in the first place, but, for example, there is normally no phasing required starting from the ideal oscillator even if the actual spectrum was phased.

Only peaks that satisfy a quality criterion are output.

More details of the fitting procedure may be obtained by typing

	peak_fit help fitting

The peak_fit script file has dimension-independent key words followed by, for zero or more dimensions, dimension-dependent key words. The dimension-independent key words must appear before the dimension-dependent ones.

There must be no more than one key word per line.

Below <...> represents an argument for a key word and [...] represents a key word that is optional.

The syntax for the dimension-independent key words is

	input_data <par file of input data file>
	input_peak <input peak file>
	[ output_peak <output peak file> ]
	[ output_ansig <output peak file, Ansig variant> ]
	[ output_ideal <output ideal file> ]
	[ output_rest <output rest file> ]
	[ par_ideal <par file of output ideal file> ]
	[ par_rest <par file of output rest file> ]
	chisq <maximum chisq>
	[ subtract ]
	[ group ]
	[ deflate <level> ]

The syntax for the dimension-dependent key words (the non-optional ones must appear for any dimension that is fitted) is

	dim <dimension>
	width <half-width in points>
	npts <number of points for oscillator>
	[ complex ]
	[ script <process script file> ]
	[ script_com ]
	[ freq <zero-frequency point> <fraction of sw> ]
	[ range <first point> <last point> ]
	[ phase <fixed phase> ]
	[ decay <fixed decay> ]
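
The ordering rules above (input_data first, dimension-independent key words before dimension-dependent ones, dim first within each dimension, one key word per line) can be illustrated with a small Python sketch that writes out a script file. This is only a sketch: the function name, the dictionary keys, all file names and all numerical values are hypothetical, and only key words listed above are used.

```python
def build_peak_fit_script(par_file, peak_in, peak_out, max_chisq, dims):
    """Return the text of a peak_fit script, one key word per line.

    dims is a list of dicts with the (hypothetical) keys 'dim', 'width',
    'npts', 'script' and, optionally, 'complex'.
    """
    # Dimension-independent key words come first; input_data leads.
    lines = [
        f"input_data {par_file}",
        f"input_peak {peak_in}",
        f"output_peak {peak_out}",
        f"chisq {max_chisq}",
    ]
    # Dimension-dependent key words follow, each block opened by dim.
    for d in dims:
        lines.append(f"dim {d['dim']}")
        lines.append(f"width {d['width']}")
        lines.append(f"npts {d['npts']}")
        if d.get("complex", False):
            lines.append("complex")
        lines.append(f"script {d['script']}")
    return "\n".join(lines) + "\n"


# Hypothetical 2D example; none of these names or values come from the program.
script = build_peak_fit_script(
    "spectrum.par", "peaks_found.pk", "peaks_fitted.pk", 1.0,
    [
        {"dim": 1, "width": 2, "npts": 512, "script": "fit1.scr", "complex": True},
        {"dim": 2, "width": 2, "npts": 128, "script": "fit2.scr", "complex": True},
    ],
)
print(script)
```

The printed text could then be saved to a file and passed to peak_fit as the script file argument.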

A description of the key words may be obtained by typing

	peak_fit help <key word>

A description of the format of the input peak file (the output peak file for peak_find) may be obtained by typing

	peak_find help peak_format

A description of the format of the output peak file may be obtained by typing

	peak_fit help peak_format

or

	peak_fit help peak_ansig

for the Ansig variant.

peak_format

The peak file produced by the program has an ascii tab-separated format, with two header lines followed by one line (record; row) per extremum. The first header line contains the column titles. The second header line contains an 'N' in each column.

The first column contains the extremum value (title 'extr'). The following columns contain the extremum position in points (title 'pntD') and ppm (title 'ppmD') in each dimension, D. If the fitting of previous peaks is subtracted, the same data adjusted for the previous peaks follow (titles 'extrs', 'pntsD' and 'ppmsD'). Next come the magnitude of the fitted peak (title 'magn'), which can be negative, and the chi squared of the fit (title 'chisq'). Finally come the phase (title 'phaseD'), decay (title 'decayD') and fitted position in points (title 'pntfD') and ppm (title 'ppmfD') in each dimension.

The value of 'magn' is the magnitude of the fitted oscillator. It therefore depends on the process script used to do the fitting.

The value of 'chisq' is a scaled fitted chisq.

The value of 'phaseD' is in degrees, and represents the amount by which the peak should have been phased in addition to whatever phasing is applied to it (and to everything else) in the process script.

The value of 'decayD' represents the amount to which the oscillator has decayed at its last point relative to its first point, and is the same as that used in 'decay' in process and plot1.

As an example, in 2D if both dimensions are fitted, the titles would be, if the fitting of previous peaks is subtracted,

	extr	pnt1	ppm1	pnt2	ppm2
	extrs	pnts1	ppms1	pnts2	ppms2
	magn	chisq
	phase1	decay1	pntf1	ppmf1
	phase2	decay2	pntf2	ppmf2

and, if the fitting of previous peaks is not subtracted,

	extr	pnt1	ppm1	pnt2	ppm2
	magn	chisq
	phase1	decay1	pntf1	ppmf1
	phase2	decay2	pntf2	ppmf2

Note that these titles all appear on one line (the first).
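
As an illustration of the layout described above (a title line, a line of 'N' entries, then one tab-separated record per extremum), a peak file could be read with a Python sketch along the following lines. The function name and the sample values are hypothetical, and the sketch assumes that every field in a record is numeric.

```python
def parse_peak_file(text):
    """Parse peak file text into (titles, records).

    Each record is a dict mapping a column title to a float.
    Assumes every data field is numeric (an assumption of this sketch).
    """
    lines = [ln for ln in text.split("\n") if ln != ""]
    titles = lines[0].split("\t")
    markers = lines[1].split("\t")
    # The second header line should contain an 'N' in each column.
    if markers != ["N"] * len(titles):
        raise ValueError("second header line should hold 'N' in each column")
    records = []
    for ln in lines[2:]:
        fields = ln.split("\t")
        if len(fields) != len(titles):
            raise ValueError("record has the wrong number of columns")
        records.append({t: float(v) for t, v in zip(titles, fields)})
    return titles, records


# Hypothetical 1D file without the subtract option; the values are made up.
sample = (
    "extr\tpnt1\tppm1\tmagn\tchisq\tphase1\tdecay1\tpntf1\tppmf1\n"
    "N\tN\tN\tN\tN\tN\tN\tN\tN\n"
    "1500.0\t100\t4.70\t1480.5\t0.8\t2.5\t0.3\t100.2\t4.699\n"
)
titles, records = parse_peak_file(sample)
print(titles)
print(records[0]["chisq"])
```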

More details of the fitting procedure may be obtained by typing

	peak_fit help fitting

peak_ansig

There is an alternative peak file produced by the program for import into Ansig. This has multiple lines for the header and also for each extremum (record; row) because Ansig has a limit of 80 characters per line. The data has an ascii tab-separated format. The data output is a subset of the normal peak_format output.

The first line (of the header and each extremum) contains the extremum value (title 'extr'), the magnitude of the fitted peak (title 'magn'), which can be negative, the chi squared of the fit (title 'chisq') and finally the adjusted extremum value (title 'extrs') if the subtract option is used.

The second line contains the extremum position in ppm (title 'ppmD') in each dimension, D.

The third line contains the adjusted extremum position in ppm (title 'ppmsD') in each dimension, D, if the subtract option is used.

The final line contains the fitted extremum position in ppm (title 'ppmfD') in each dimension, D.

As an example, in 3D if all dimensions are fitted, the titles would be, if the fitting of previous peaks is subtracted,

	extr	magn	chisq	extrs
		ppm1	ppm2	ppm3
		ppms1	ppms2	ppms3
		ppmf1	ppmf2	ppmf3

and, if the fitting of previous peaks is not subtracted,

	extr	magn	chisq
		ppm1	ppm2	ppm3
		ppmf1	ppmf2	ppmf3
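
The rule for which title lines appear can be sketched in Python as below. The function name is hypothetical. The sketch returns the titles as lists, one list per header line, and deliberately does not try to reproduce the exact tab layout of the file.

```python
def ansig_titles(ndim, subtract):
    """Return the Ansig-variant header titles, one list per header line."""
    dims = range(1, ndim + 1)
    # First line: extremum value, fitted magnitude, chisq, then the
    # adjusted extremum value only if the subtract option is used.
    first = ["extr", "magn", "chisq"]
    if subtract:
        first.append("extrs")
    rows = [first, [f"ppm{d}" for d in dims]]
    # The adjusted positions appear only with the subtract option.
    if subtract:
        rows.append([f"ppms{d}" for d in dims])
    # The fitted positions are always the final line.
    rows.append([f"ppmf{d}" for d in dims])
    return rows


for row in ansig_titles(3, subtract=True):
    print(row)
```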

More details of the fitting procedure may be obtained by typing

	peak_fit help fitting

fitting

The program assumes that in an N-dimensional spectrum, the time-space representation of each peak is a product of N decaying oscillators, one oscillator for each dimension. Each oscillator can be specified by three parameters: the frequency, the (constant) phase and the decay. Including an overall magnitude means that the time-space representation is specified by 3N+1 parameters. This time-space ideal data is transformed to frequency-space via a user-specified script, and the parameters are adjusted until an acceptable fit is obtained (if one exists).
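
The time-space model described above can be sketched in Python as follows. This is not the program's own code. It assumes an exponential decay envelope, a frequency given in cycles per point, and a phase in degrees; the function names are hypothetical. The decay parameter is taken, as in peak_format, to be the amplitude at the last point relative to the first.

```python
import cmath
import math


def oscillator(npts, freq, phase_deg, decay):
    """One decaying oscillator with npts complex points (npts >= 2).

    freq is in cycles per point (assumption of this sketch), phase_deg is
    a constant phase in degrees, and decay is the amplitude at the last
    point relative to the first (exponential envelope assumed).
    """
    if npts < 2:
        raise ValueError("npts must be at least 2")
    phase = math.radians(phase_deg)
    values = []
    for k in range(npts):
        envelope = decay ** (k / (npts - 1))
        values.append(envelope * cmath.exp(1j * (2 * math.pi * freq * k + phase)))
    return values


def ideal_peak(magnitude, oscillators, index):
    """Value of the product of oscillators at one time-space index.

    index holds one integer (counted from 0 here) per dimension.
    """
    value = magnitude
    for osc, k in zip(oscillators, index):
        value *= osc[k]
    return value


def n_parameters(ndim):
    """Frequency, phase and decay per dimension, plus one overall magnitude."""
    return 3 * ndim + 1


osc1 = oscillator(8, 0.125, 0.0, 0.5)
osc2 = oscillator(4, 0.25, 0.0, 0.8)
print(n_parameters(2))
print(ideal_peak(2.0, [osc1, osc2], (0, 0)))
```

In the program itself this ideal data would then be passed through the process script for each dimension before being compared with the spectrum.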

The script is similar to that used in process. However, currently there are some restrictions that do not occur in process, namely there can only be one script for each dimension, and there is no maximum entropy in two or three dimensions.

Currently the fitting is done using the data in a fixed-size hypercube around the extremum.

Not all dimensions have to be fitted, but it is generally recommended that they are.

input_data

input_data <par file of input data file>

	This specifies the input data file, and must be the
	first key word in the script file.

input_peak

input_peak <input peak file>

	This specifies the input peak file.  A description of the
	format (the output peak file for peak_find) may be
	obtained by typing

		peak_find help peak_format

output_peak

[ output_peak <output peak file> ]

	This specifies the output peak file.  A description of the
	format may be obtained by typing

		peak_fit help peak_format

output_ansig

[ output_ansig <output peak file, Ansig variant> ]

	This specifies the Ansig variant output peak file.  A
	description of the format may be obtained by typing

		peak_fit help peak_ansig

output_ideal

[ output_ideal <output ideal file> ]

	This specifies the file of the ideal fitted data, which is
	the sum of the contributions due to all of the peak fittings.
	This is useful as a means of checking the fittings.
	However, this can be a bit slow to calculate, so be warned.

output_rest

[ output_rest <output rest file> ]

	This specifies the file of the 'remaining' data, which is the
	difference between the input data and ideal fitted data.
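
The relation between the input, ideal and rest data can be sketched in Python as below, treating each spectrum as a flat list of values. The function name and the numbers are hypothetical; the program itself works on blocked data files.

```python
def ideal_and_rest(input_data, contributions):
    """Return (ideal, rest) for flat lists of spectrum values.

    ideal is the sum of the contributions from all of the peak fittings;
    rest is the input data minus the ideal data.
    """
    ideal = [0.0] * len(input_data)
    for contribution in contributions:
        for i, value in enumerate(contribution):
            ideal[i] += value
    rest = [a - b for a, b in zip(input_data, ideal)]
    return ideal, rest


# Made-up values: two fitted peaks contributing to a 3-point spectrum.
ideal, rest = ideal_and_rest(
    [1.0, 5.0, 2.0],
    [[0.0, 4.0, 1.0], [0.5, 0.0, 0.0]],
)
print(ideal)
print(rest)
```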

par_ideal

[ par_ideal <par file of output ideal file> ]

	This specifies the par file of the output ideal file.

par_rest

[ par_rest <par file of output rest file> ]

	This specifies the par file of the output rest file.

chisq

chisq <maximum chisq>

	This gives the value of chi-squared below which a fitting
	of a given extremum is considered to be good.

subtract

[ subtract ]

	This specifies that the 'previously' fitted peaks are
	subtracted from the input data, before the given extremum
	is itself fitted.  By default this is not done.

group

[ group ]

	This specifies that nearby extrema (listed in the input
	peak file) should be grouped before they are fitted.
	In particular two extrema, e(1) and e(M), are grouped if
	there are extrema e(2), ..., e(M-1) such that each pair
	of consecutive extrema, e(i) and e(i+1), has intersecting
	fitting boxes (size as specified by the width).
	By default each extremum is fitted by itself.
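
The grouping rule amounts to taking connected sets of extrema under the relation "fitting boxes intersect". A minimal Python sketch, using a union-find structure, is given below. It is not the program's own algorithm, the function names are hypothetical, and it assumes that boxes which merely share a boundary point count as intersecting.

```python
def boxes_intersect(p, q, widths):
    """True if the fitting boxes around points p and q overlap.

    Each box extends widths[d] points either side of its centre in
    dimension d. Touching boxes count as intersecting (assumption).
    """
    return all(abs(a - b) <= 2 * w for a, b, w in zip(p, q, widths))


def group_extrema(positions, widths):
    """Return a sorted list of groups, each a list of extremum indices."""
    parent = list(range(len(positions)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if boxes_intersect(positions[i], positions[j], widths):
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(positions)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up 2D positions: the first three chain together, the last is alone.
positions = [(10, 10), (13, 10), (16, 10), (40, 40)]
print(group_extrema(positions, (2, 2)))
```

In this example extrema 0 and 2 do not have intersecting boxes themselves, but they are grouped because each intersects the box of extremum 1.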

deflate

[ deflate <level> ]

	This specifies that the output ideal spectrum should be
	compressed, via deflate, using the specified <level>.
	The output rest spectrum is not compressed.
	By default the output ideal spectrum is not compressed.

dim

dim <dimension>

	If any dimension-dependent key word is used for a given
	dimension then dim must be the first such key word.
	A given dimension will appear if and only if the data is
	fitted in that dimension.

width

width <half-width in points>

	This specifies the half-width of the fitting in points,
	with a half-width of w meaning that (in general) the
	2*w+1 points around the extremum point are used to do
	the fitting.  The methodology for choosing the fitting
	width might change in future (e.g. depend on the extremum).
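
The range of points used in one dimension can be sketched in Python as below. The function name is hypothetical. Points are counted from 1, and the sketch assumes that the range is simply clipped at the edges of the data, which is one possible reading of "in general" above.

```python
def fitting_range(extremum, half_width, npoints):
    """Return (first, last) points used for the fit in one dimension.

    Points are counted from 1. Clipping at the data edges is an
    assumption of this sketch, not a documented behaviour.
    """
    first = max(1, extremum - half_width)
    last = min(npoints, extremum + half_width)
    return first, last


first, last = fitting_range(100, 2, 512)
print(first, last, last - first + 1)  # away from the edges: 2*w+1 points
```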

npts

npts <number of points for oscillator>

	This is the number of (complex) data points of the
	oscillator in the given dimension.
	In principle this could be determined from the process
	script, but for ease of coding it must be specified
	independently.  This might change in future.

complex

[ complex ]

	This specifies that the process script transforms the
	oscillator into complex data.  By default it is assumed
	that the script transforms the oscillator into real data.
	This can be determined from the process script, but
	for now this is used as an independent check.
	Complex transformed data requires the script to be run twice
	for each iteration of the fitting algorithm, whereas real
	transformed data requires the script to be run four times,
	hence it is recommended that this key word be used.
	Note that the input data must be real, and generally the
	script that is used for the fitting can be obtained from
	that which produced the input data (spectrum) by removing
	the reduction to real data in the latter script.

script

[ script <process script file> ]

	This specifies the process script to be used in the
	fitting.  Either this or script_com must appear for any
	dimension that is being fitted.  For more information about
	process scripts, type

		process help

script_com

[ script_com ]

	This indicates the start of the commands used in the
	process script for the fitting.  The end of the
	commands is indicated by 'end_script'.  Either this or
	script must appear for any dimension that is being
	fitted.  For more information, type

		process help

freq

[ freq <zero-frequency point> <fraction of sw> ]

	This determines what the <zero-frequency point> is, as
	well as the <fraction of sw> represented in the input
	data file.  By default, if there are 2P real points in
	the input data file, then the <zero-frequency point> is
	at point P (starting the count at 1) and the entire
	spectral width is present, so the <fraction of sw> is 1.

range

[ range <first point> <last point> ]

	This restricts the range of points in the given
	dimension for the output of the ideal and rest spectra.
	By default there is no restriction.

phase

[ phase <fixed phase> ]

	This specifies that the phase for the spectrum of the ideal
	oscillator in the given dimension should be set to be the
	<fixed phase>.  In particular the phase is not fitted.
	By default the phase is not fixed.

decay

[ decay <fixed decay> ]

	This specifies that the decay for the spectrum of the ideal
	oscillator in the given dimension should be set to be the
	<fixed decay>.  In particular the decay is not fitted.
	By default the decay is not fixed.

Azara help: peak_fit / W. Boucher / azara@bioc.cam.ac.uk