RNMRTK Tutorial
What is the Rowland NMR ToolKit (RNMRTK)?
The Rowland NMR Toolkit (RNMRTK) is a software package for processing multi-dimensional NMR data. The main RNMRTK web site is located at http://rnmrtk.uchc.edu. In addition to efficient algorithms for traditional signal processing (FT and linear prediction methods), RNMRTK implements a powerful and general algorithm for computing maximum entropy (MaxEnt) reconstructions. MaxEnt reconstructions may be performed on traditional uniformly collected NMR data as well as data collected from a non-uniform sample schedule. RNMRTK has some built-in tools for visualizing NMR data, but is best used in conjunction with a more powerful analysis tool.
Note about the document
This document is a step-by-step tutorial illustrating how to use the RNMRTK program suite to load data, process data with conventional techniques (FT and linear prediction), apply maximum entropy reconstruction methods, and process non-uniformly sampled data, along with some general notes about interoperability with other programs. Common problems that users encounter are illustrated. In addition, a section covers how to install and configure RNMRTK. The on-line tool (http://sbtools.uchc.edu/nmr/nmr_toolkit) for generating processing scripts is used throughout the document. The nmrDraw component of the nmrPipe processing package is used for data visualization in addition to the visualization tools built into RNMRTK. A text editor capable of saving text files in Unix format is also strongly encouraged. The editors vi, emacs, TextWrangler, nedit, and BBEdit all do this. The standard Windows and OSX editors notepad, wordpad, and textedit do not.
In the tutorial, commands entered by the user are bold and underlined and in Courier font as shown below. Note that on some browsers the underline can obscure an underscore in a filename.
cd new command
Installation
Installation is composed of four steps
OSX Notes
In addition to the above steps the following must also be done on OSX systems
Linux Notes
In addition to the above steps g77 must be installed.
Installation script
Inside the rnmrtk.v3.1.tgz tar ball is an installation script that will perform installation of files, update configuration files, and adjust shared memory parameters. It can also be found on the sbtools website.
To run the script simply place the rnmrtk.v3.1.tgz file in the same folder as the installer and execute with the command
./rnmrtk_installer.com
Follow the instructions on the screen.
Shared memory and the section command
All programs in the RNMRTK program suite access data stored in a shared memory section. As the name implies the memory space is shared and different programs can access the same data in memory at the same time. Shared memory sections persist until they are deleted or until the computer is restarted. It is therefore good practice to remove them when you are finished processing.
The RNMRTK tool, section, is used to build shared memory sections.
To illustrate how to create a shared memory section with the section program, change to the 3d_noesyhsqc.fid directory
cd 3d_noesyhsqc.fid
Inside the folder there is a perl program called procpar.prl. This program extracts information needed for processing from the Varian procpar file. To run the program type the program name, the file to be parsed, and the 1H reference at the center of the spectrum like this:
./procpar.prl procpar 4.772
The program generates a file called procpar.txt and also displays the file to the screen. In other sections of the tutorial the procpar.txt file will be already present.
A similar tool for Bruker data sets, bruker2sbtools.prl, extracts the relevant information from Bruker acqus files. These tools can be downloaded at http://sbtools.uchc.edu/downloads/nmr/nmr_toolkit.
To view the file simply use the more command like this:
more procpar.txt
This experiment was collected with 512 x 110 x 32 complex points. In addition, like most modern NMR data, the data is 32 bits (4 bytes per data point). Thus when we use the section command to create a shared memory section we need to create it large enough to contain the whole data set.
Next create a shared memory section with the section command as follows:
section -c 512 110 32
After issuing the command a message should appear stating "Created shared memory section".
Now load the NMR data with the loadvnmr command as follows:
rnmrtk loadvnmr ./fid
This command loads the Varian data into the shared memory section. Notice that an "Insufficient Memory" error occurred. This error occurred because the shared memory section that was created was too small to hold the whole NMR data set. Note that while the program encountered and reported an error, it still loaded at least a subset of the data. From the output notice that it read in 512 x 110 x 4 hypercomplex data points.
Why was the shared memory section too small? The section was created large enough to hold only real data. Each complex point needs two values (real and imaginary), so the size of each dimension must be doubled for the section to hold complex data in all three dimensions.
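The doubling rule can be worked through with plain shell arithmetic. The sketch below is only an illustration: the variable names are made up for this example, the point counts are the 512 x 110 x 32 complex points quoted above, and the section command is printed rather than executed so that the sketch runs without RNMRTK installed.

```shell
#!/bin/sh
# Complex points per dimension, as quoted from procpar.txt in this tutorial.
# The variable names are illustrative; they are not RNMRTK parameters.
cpx_first=512
cpx_second=110
cpx_third=32

# Each complex point needs two values (real + imaginary), so double each dimension.
size_first=$((cpx_first * 2))
size_second=$((cpx_second * 2))
size_third=$((cpx_third * 2))

# Print the command instead of executing it.
section_cmd="section -c $size_first $size_second $size_third"
echo "$section_cmd"
```

If processing later zero-fills or extends a dimension, the same arithmetic applies to the largest size that dimension reaches.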
With many commands in the RNMRTK program suite, errors will NOT abort the program or script that they are being executed from, but rather the programs and scripts will continue to operate, but often in an unpredictable manner. It is thus important to pay close attention to messages that are reported back to the screen.
Now let's create a shared memory section which is large enough to hold the entire data set.
section -c 1024 220 64
rnmrtk loadvnmr ./fid
This time all 32 complex planes in t2 were loaded and no insufficient memory errors occurred. Note that because a shared memory section already existed when the section -c command was issued, the existing section was deleted before the new one was created.
Note: In this example the shared memory section is being created just large enough to load the raw time domain data. During processing the data may be zero-filled or linear predicted which would cause the data size to be larger than the initial time domain data. In this case the shared memory section should be created large enough to hold the data at its largest size in the NMR data processing pathway. Also note that it is common to shrink dimensions and to delete imaginaries both of which reduce the size of the data.
The shared memory section can be created larger than what is needed for NMR data processing. Also, the creation of a shared memory section does not by itself automatically use the amount of memory equal to the size of the shared memory section. The only memory being used by the system is for holding the NMR data. Thus if a shared memory section was created that was 1 GB in size and NMR data was loaded that consumed 256 MB of memory then only 256 MB of system memory would be in use, not 1 GB.
More about the section command
The section command creates a shared memory section whose size in bytes is the product of all values entered after the command, times 4. The size is multiplied by 4 because section assumes the NMR data is in 32 bit (4 byte) format. In addition, a small amount of extra space is added by default for the shared memory section header.
Thus the commands "section -c 1024 220 64" and "section -c 14417920" are equivalent. It is just generally easier to know the size of each of the dimensions in your experiment than the total size of the whole data set.
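As a quick check of this equivalence, the product can be computed in the shell. This is a small worked example that uses only the numbers from the text; the variable names are illustrative.

```shell
#!/bin/sh
# Total number of points requested by "section -c 1024 220 64".
points=$((1024 * 220 * 64))
# Data size in bytes, assuming 4 bytes (32 bits) per point as stated above.
bytes=$((points * 4))
echo "points: $points"
echo "bytes:  $bytes (section header not included)"
```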
Shared memory sections can be deleted with the command
section -d
This command deletes the existing shared memory section, if there is one. Note that when the command section -c is executed, the first thing the program does is issue section -d to delete any existing shared memory section. Thus if you have unsaved data in a shared memory section and you create a new section with section -c, the unsaved data will be lost.
Re-create the shared memory section for the tutorial
section -c 1024 220 64
The section command can also be issued without the -c or -d options. Run the command
section
When the program is started in this manner it checks whether any shared memory sections exist and, if they do, reports their size and ID number. In this example a shared memory section exists with an ID number of 500; it is owned by the user markm and has a size of 112641 pages (your system will report a different ID number and user). Note that there are 512 bytes per page. Therefore if you multiply 112641 by 512 the product will be close to 1024 x 220 x 64 x 4. The actual value will be slightly larger as section adds a small area for the shared memory section header.
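The page arithmetic can be checked directly. The sketch below uses only the figures quoted above (112641 pages, 512 bytes per page, a 1024 x 220 x 64 section of 4-byte points); the variable names are illustrative, and the page count on your system may differ.

```shell
#!/bin/sh
page_bytes=512                          # bytes per page, as stated above
pages=112641                            # page count reported by section in this example
data_bytes=$((1024 * 220 * 64 * 4))     # space requested for the NMR data
section_bytes=$((pages * page_bytes))   # total size of the section
extra_bytes=$((section_bytes - data_bytes))
echo "data:    $data_bytes bytes"
echo "section: $section_bytes bytes"
echo "extra:   $extra_bytes bytes"
```

With these numbers the difference is 512 bytes, that is, one page beyond the data itself.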
The section program waits for a response to see if the user wants to remove the existing shared memory section. Enter 'y' or 'n' to delete or keep the existing shared memory section. For the tutorial enter 'y'
y
The section program is stopped and the shared memory section is deleted.
Issue the section command again
section
Now the program finds 0 shared memory sections. The program waits for the size of the shared memory section to be created to be entered. Again the size of the shared memory section will be the product of all the values entered. Enter the following:
1024 220 64
After entering the values choose 'n' to keep the existing shared memory section.
n
What if you want to have two shared memory sections open at the same time?
By default the shared memory section ID number is the user's UID number. In the above example it is 500, which is the UID of the user markm. However, this default can be changed with the environment variable RNMRTK_UID. To do this, open a different shell. Inside the new shell, set the environment variable RNMRTK_UID to some different value. Run the section command in the new shell and another shared memory section will be created. Note that rnmrtk commands meant for this second section should be issued from the new shell, where RNMRTK_UID is defined.
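In a Bourne-type shell the steps might look like the sketch below. The value chosen here (the numeric UID plus 1000) is an arbitrary assumption for illustration; the text above only requires that it differ from the default. Users of csh or tcsh would use setenv instead of export. The section and rnmrtk lines are left commented out so that the sketch runs without RNMRTK installed.

```shell
#!/bin/sh
# Run these in a NEW shell, separate from the one that holds the first section.
default_id=$(id -u)                  # the default section ID is the user's UID
RNMRTK_UID=$((default_id + 1000))    # hypothetical alternative ID; any different value is the idea
export RNMRTK_UID
echo "default ID: $default_id, this shell will use: $RNMRTK_UID"

# Then, still in this same shell:
# section -c 1024 220 64
# rnmrtk loadvnmr ./fid
```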
This is an advanced feature and will not be pursued further in the tutorial.
General notes on the RNMRTK program
RNMRTK is the main program for processing NMR data using conventional processing methods, loading and saving data, and other common tasks in NMR data processing.
Command-line feature of rnmrtk
If rnmrtk is invoked with a command on the command-line, it will carry out that command and exit immediately. As an example try the following commands
section -c 1024 220 64
rnmrtk loadvnmr ./fid
Notice that the command prompt returns after issuing the command.
If no command is present on the command-line, rnmrtk enters interactive mode. In interactive mode the command prompt changes to an rnmrtk prompt and sub-commands may be invoked one after another. Details of each of these commands can be found at http://rnmrtk.uchc.edu/rnmrtk/Commands.html.
Executing commands in interactive mode
To enter interactive mode type rnmrtk without any commands afterwards.
rnmrtk
Notice that the command prompt changes to RNMRTK>
In this interactive mode commands can be entered one after the other.
RNMRTK>loadvnmr ./fid
RNMRTK>seepar
RNMRTK>exit
The seepar command shows parameters in the shared memory section header.
The exit command exits out of interactive mode.
Should rnmrtk commands be upper or lower case?
The sub-commands inside the rnmrtk program are case insensitive. Thus the following two commands are equivalent.
RNMRTK>loadvnmr ./fid
RNMRTK>LOADVNMR ./fid
Note that in both cases the ./fid is lowercase as it is a filename and thus case sensitive.
What happens to data when we exit and re-enter rnmrtk?
To see, let's start rnmrtk again and check with seepar.
rnmrtk
RNMRTK>seepar
RNMRTK>exit
Notice that even though we exited the rnmrtk program with the exit command, when we re-entered the rnmrtk program the data was still present. This is because the data is stored in the shared memory section which is persistent as long as the computer remains on and the shared memory section is not deleted or altered by another program. The data in the shared memory section will even survive a user logout. However, I recommend saving your data at appropriate times.
Executing commands one at a time from the command line
We have now observed that rnmrtk commands can be executed one at a time from the command-line by typing rnmrtk with a command afterwards. For example, "rnmrtk seepar". Commands can also be issued in interactive mode as we were just doing. Let's try some more commands directly from the command prompt to illustrate some more concepts. Type the following commands.
rnmrtk setpar SF1 125.10
rnmrtk seepar (Note that the Carrier for T1 was changed)
rnmrtk fft
The loadvnmr command loaded the data, the setpar command set the t1 dimension to have a spectrometer frequency of 125.1 for referencing, and the fft command failed to perform a FFT of the data. The reason the FFT command failed is that the dimension that the FFT was to be performed along was not defined. To define the active dimension use the dim command as follows.
rnmrtk dim t3
rnmrtk fft
Notice that the rnmrtk dim t3 command selected the t3 (acquisition) dimension as the active dimension. However, the fft command still failed with the "The current active dimension is not set!" error. Why?
The reason is that the dim variable is only set while the rnmrtk program is active. Once the program is exited the dim variable is lost. Because of this it is often more useful to use the rnmrtk program in interactive mode. Other commands in the rnmrtk program suite allow the dimension to be acted upon to be entered on the command line; the rnmrtk program, however, needs to have the dim variable set. An even better method for using rnmrtk is to build scripts that run rnmrtk in interactive mode and redirect input into it. I strongly recommend scripts as they give you a written record of the processing steps.
Scripts
For most of the tutorial we will be using scripts. Feel free to create the scripts yourself, but to speed the demonstration up most of the scripts are provided already created.
Running scripts
Scripts can be run with a command such as this
sh script_name.com
or by changing the script to be executable and then running the script directly as such
chmod u+x script_name.com
./script_name.com
Note that the ./ forces the script in the current directory to be run rather than searching in the search path. I recommend doing this to avoid any issues with accidentally running a different script with the same name that may appear in the system path. Also some shells do not include the present working directory as part of the path and the ./ fixes that issue.
Use the command "ls -l" to determine if a script has execute permissions. A script with -rw-r--r-- would not be executable by anyone, permissions set to -rwxr--r-- would be executable by the owner of the file, and permissions set to -rwxr-xr-x would be executable by any user.
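The same check can be made from within the shell using the standard test operator -x instead of reading the ls -l output by eye. The sketch below creates a throwaway script in a temporary directory (the file contents and directory are made up for this example), shows its state before and after chmod u+x, and then runs it through sh, which only needs read permission.

```shell
#!/bin/sh
# Create a throwaway script in a temporary directory.
workdir=$(mktemp -d)
script="$workdir/script_name.com"
printf '#!/bin/sh\necho "script ran"\n' > "$script"

# A newly created file normally has no execute permission.
if [ -x "$script" ]; then before=yes; else before=no; fi

chmod u+x "$script"       # give the owner execute permission
if [ -x "$script" ]; then after=yes; else after=no; fi

echo "executable before chmod: $before, after chmod: $after"
ls -l "$script"           # the permission string now starts with -rwx

# Running through sh works with or without the execute bit.
output=$(sh "$script")
echo "$output"
rm -rf "$workdir"
```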
A note about text editors
Note that different operating systems and text editors have different hidden characters that they use to define the end of a line. Especially with Mac OSX it is common to create a valid script but to have it fail when you attempt to execute it. It is possible to fix this problem by using tr, sed, or awk to change the hidden characters to different hidden characters which will work. However, I find it better to use a text editor that will not cause the problem in the first place. The unix text editors vi and emacs always will work. The X11 editor nedit works very well. Generic text editors such as Apple's textedit generally will not work. For OSX systems I strongly recommend a great FREE text editor called TextWrangler. TextWrangler, and its non-free cousin BBEdit, in addition to being very powerful yet easy to use editors allow the user to define the type of character used at the end of lines.
At the bottom of the TextWrangler window is a pull down menu allowing the user to change between Classic Mac (CR), Unix (LF), Windows (CRLF), and Unicode characters to act as new line feeds. As long as you choose Unix (LF) your scripts will work fine. In the image below you can see that Unix (LF) is selected.
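For a script that has already been saved with the wrong line endings, the tr approach mentioned above can be sketched as follows. The file names are made up for this example, and the two "broken" files are generated with printf so that the sketch is self-contained: one with Windows (CRLF) endings and one with Classic Mac (CR) endings.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Windows (CRLF) endings: delete every carriage return, leaving Unix (LF) endings.
printf 'echo one\r\necho two\r\n' > "$workdir/crlf.com"
tr -d '\r' < "$workdir/crlf.com" > "$workdir/unix_from_crlf.com"

# Classic Mac (CR only) endings: translate each carriage return into a line feed.
printf 'echo one\recho two\r' > "$workdir/cr.com"
tr '\r' '\n' < "$workdir/cr.com" > "$workdir/unix_from_cr.com"

# Both converted scripts now run as ordinary two-line shell scripts.
crlf_fixed=$(sh "$workdir/unix_from_crlf.com")
cr_fixed=$(sh "$workdir/unix_from_cr.com")
echo "$crlf_fixed"
echo "$cr_fixed"
rm -rf "$workdir"
```

Always write the converted text to a new file, as here; redirecting tr back onto its own input file would empty it.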
How to create a rnmrtk script in interactive mode
Try creating this simple script in your favorite text editor and save it as script1.com then execute the script.
sh script1.com
This script runs. Data is loaded with loadvnmr, the carrier frequency is set to 125.1, experimental parameters are displayed with seepar, and the fft failed with the "The current active dimension is not set!" error again because rnmrtk was not run in interactive mode. To fix this, edit the script1.com script as such and save as script2.com. Then execute the script.
sh script2.com
This time the script should complete all the steps including the FFT command. What is happening here is that rnmrtk is being started in interactive mode. All commands after the << EOF are passed one at a time to the interactive rnmrtk program up until the last EOF. Note that the EOF, while a common nomenclature, can be replaced with other characters. There is nothing special about EOF. The only thing that matters is that the characters after the << match the characters at the end of the processing section.
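The here-document mechanism can be tried without RNMRTK. In the sketch below a small shell function stands in for the interactive rnmrtk program (it is purely a stand-in that echoes each line it receives with a prompt in front), and the delimiter END_OF_COMMANDS is used in place of EOF to show that the word itself is arbitrary. In a real processing script the function call would be replaced by rnmrtk.

```shell
#!/bin/sh
# Stand-in for interactive rnmrtk: reads commands from standard input,
# one per line, and echoes each with a prompt. It does no processing.
interactive_standin() {
    while IFS= read -r line; do
        echo "RNMRTK>$line"
    done
}

# Everything between the two END_OF_COMMANDS markers is fed to the program
# on standard input, exactly as if it had been typed at the prompt.
interactive_standin << END_OF_COMMANDS
loadvnmr ./fid
setpar SF1 125.10
seepar
dim t3
fft
exit
END_OF_COMMANDS
```

Because all six lines reach a single running program, a setting such as dim t3 is still in effect when fft is read, which is the point of running rnmrtk this way.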
Scripts can be broken into multiple interactive sessions, which allows other commands to be executed inside the script and separates the processing into distinct steps that are easy to read. Below is an example. This script is created already. Let's run it now as the result is used later in the tutorial.
./script3.com
Note that the script is already executable so it could be run directly without the sh command. Blank lines are ignored.
We will come back to viewing the processed data shortly.
The rnmrtk parser. Classes of parameters, order of precedence, and default parameters
There are four classes of parameters that RNMRTK programs use: integers, floats, text, and filenames.
Order of precedence
The order of precedence is critical inside RNMRTK, but only within a given class. For example, if a command accepts three integer arguments, it is critical that the three integer values be entered in the correct order. Likewise for floats, text, and filenames. However, when a command takes parameters from more than one class, only the order within each class matters; parameters of different classes can be interleaved freely.
The following hypothetical command would be valid because the parameters within each class were entered in order.
command integer1 float1 integer2 filename float2 text1 float3 text2
This command would be invalid because the floats were not in order.
command integer1 float3 integer2 filename float2 text1 float1 text2
Default values
Many rnmrtk commands have default values. For example the command zerofill can be used with an argument passed to it or by itself. If no argument is passed then it will zerofill to the next Fourier number by default. In most of the scripts we will use, the parameters will be entered explicitly. I prefer this method as we get a written record of exactly what the processing is doing rather than relying on default values.
Examples of entering parameters, order of precedence, and default values with the SSTDC (solvent suppression) command.
Manual entry for SSTDC
section -c 1024 220 64
rnmrtk
RNMRTK>loadvnmr ./fid
RNMRTK>dim t3
RNMRTK>sstdc
The SSTDC command executes with default parameters of width=16, endpoints=20, and freq=0.0
RNMRTK>sstdc 10 25 0.0
The SSTDC command executes with parameters of width=10, endpoints=25, and freq=0.0
RNMRTK>sstdc 12 0.0
The SSTDC command executes with parameters of width=12, endpoints=20(Default), and freq=0.0
RNMRTK>sstdc 10.0 15 30
The SSTDC command executes with parameters of width=15, endpoints=30, and freq=10.0
RNMRTK>sstdc 10 30. 15.
The SSTDC command executes with parameters of width=10, endpoints=20(Default), freq=30.0 and the 15. is ignored.
Note: Some commands will fail if incorrect arguments are passed to the command while others, like sstdc in this last example, work fine and simply ignore the incorrect parameter. It is thus very important to be sure that you do not have syntax errors in your scripts.
RNMRTK>exit
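The behaviour seen in this session can be mimicked with a short shell function. This is only a sketch of the rule as described in this tutorial, not RNMRTK's actual parser: it assumes that any token containing a decimal point counts as a float and anything else as an integer, that sstdc takes two integers (width, then endpoints) and one float (freq), and that surplus values in a class are silently ignored. The function name parse_sstdc is made up for this example.

```shell
#!/bin/sh
# Assign sstdc arguments by class: integers in order, floats in order.
parse_sstdc() {
    width=16; endpoints=20; freq=0.0      # defaults quoted in the text
    n_int=0; n_float=0
    for tok in "$@"; do
        case $tok in
            *.*)
                n_float=$((n_float + 1))
                # one float is used (freq); further floats are ignored
                if [ "$n_float" -eq 1 ]; then freq=$tok; fi
                ;;
            *)
                n_int=$((n_int + 1))
                # integers fill width, then endpoints; extras are ignored
                if [ "$n_int" -eq 1 ]; then width=$tok; fi
                if [ "$n_int" -eq 2 ]; then endpoints=$tok; fi
                ;;
        esac
    done
    echo "width=$width endpoints=$endpoints freq=$freq"
}

parse_sstdc                 # all defaults
parse_sstdc 10 25 0.0
parse_sstdc 12 0.0          # endpoints falls back to its default
parse_sstdc 10.0 15 30      # the float comes first but is still freq
parse_sstdc 10 30. 15.      # second float is dropped
```

Each call prints the same assignment of width, endpoints, and freq that the session above reports for the corresponding sstdc command.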
Processing a uniformly sampled 2D spectrum
Change to the 2d_linear_ft.fid directory
cd 2d_linear_ft.fid
touch proc.com; chmod u+x proc.com
Open proc.com in your favorite text editor and add the following minus the comments. The file proc2.com with the script pre built is also present if you do not want to create the script by hand. A file proc_comments.com is also present which is the script below with comments. Note that the script with comments will not execute properly.
Once complete save the script and execute it.
./proc.com
Viewing 1D NMR data with seepln
seepln is the 1D graphical display utility in the RNMRTK software package. seepln will display whatever is currently in the current shared memory section. It is useful for observing what happens to 1D slices of your spectrum as various processing steps are applied. Typically a processing script is created that performs all processing steps at one time, but in some cases, such as determining phase values, checking whether an apodization function is set well, or diagnosing problems, it is useful to graphically observe intermediate processing steps.
Using the 2D HNCO that was processed earlier we will demonstrate some of the features of seepln.
First, let's create a shared memory section, load some data, and start seepln. Also, let's use two terminals (or tabs): one for issuing rnmrtk commands and a second one for seepln commands. That way we can process the spectra and observe changes at the same time.
cd 2d_linear_ft.fid (In terminal 1, if not already there)
section -c 2048 2048 (In terminal 1)
rnmrtk load time.sec (In terminal 1)
seepln (In terminal 2. Note that it does not matter what directory the terminal is located in)
Note: The command prompt will change to SEEPLN(T2)> after starting seepln as well as the first FID being displayed.
There are several commands that can be issued from the seepln command prompt. I will discuss only a few here and leave it to you to read the manual.
Row command
To see additional rows of FIDs use the row command. You can enter the row you want to see or simply type row to cycle through the FIDs by hitting the "return" key. NOTE: In addition to displaying each of the FIDs, additional information is shown on the command prompt.
SEEPLN(T2)>row 2
SEEPLN(T2)>row
Exit command
Type "exit" to exit any mode in seepln and return to the SEEPLN> prompt
SEEPLN>exit
Go back to the terminal where rnmrtk is running and type the following commands
RNMRTK>dim t2
RNMRTK>sstdc
RNMRTK>gm 20.0 20.0
RNMRTK>zerofill 1024
RNMRTK>fft 0.5
Clear and Reset Commands
The clear command erases the contents of the seepln window and reset re-draws the spectrum.
Now go back to the seepln terminal and type clear and reset. You should now observe the frequency spectrum. Of course, seepln could have been used to show what had happened at every processing step if desired. Let's use seepln to phase the spectrum.
SEEPLN>clear
SEEPLN>reset
SEEPLN>phase (exit phase mode after playing around)
In phase mode, use the number keys (top of keyboard, not number pad) to phase. Numbers 1-4 alter the phase in one direction, 6-9 in the other direction. Keys 5 and 0 switch between altering phase0 (constant) and phase1 (linear). Numbers near 5 alter the phase by a small amount, while numbers further away from 5 alter it by a larger amount. The mouse can be used to set a pivot point if needed for a linear phase correction. Hit "return/enter" to exit phase mode. You should find a phase value close to -152.0.
NOTE: Phase values determined from vnmrj or nmrDraw are negated relative to rnmrtk. Therefore, if you use nmrDraw to phase a spectrum and get a value of say 152.0, then you would want to use -152.0 in rnmrtk when processing with a command such as "phase -152.0 0.0". In addition, if a first order phase correction is needed and values were determined in nmrDraw, then you need to use the FIRST argument when phasing in rnmrtk with a command such as "phase FIRST -152.0 35.0". The FIRST forces rnmrtk to use the same definition as nmrDraw for where the pivot is defined.
SEEPLN>clear
SEEPLN>reset
Peak command
It is often desirable to see where a peak is located or what its amplitude is. To do that in seepln use the peak command. The peak command, like the phase command, uses the number keys to move a cursor on the spectrum defining its location. Use "return/enter" to exit peak mode.
SEEPLN>peak
Move the cursor around to see how the peak command works. Note the position of a strong peak, such as point 168.
Scale command
The scale command will draw a ppm scale on the spectrum.
SEEPLN>scale
Col command
To see interferograms along the t1 dimension use the col command. Like the row command you can select a column initially or use the col command alone to cycle through columns. Note that when switching to column mode for the first time a specific column must be chosen.
SEEPLN>col 1
This column is mostly noise as there is no signal along column 1.
SEEPLN>col 168
Now we see a strong interferogram.
SEEPLN>col
Allows you to move sequentially through the interferograms by hitting the return/enter key.
SEEPLN>exit
RNMRTK>exit
Viewing nD NMR data with contour
The program contour is used to visualize 2D planes from nD data sets.
Let's work from a 3D data set (the one we used earlier), but the same commands will work with a 2D spectrum as well.
cd 3d_noesyhsqc.fid
section -c 256 64 512
rnmrtk load nnoesyhsqc.sec
NOTE: If the file nnoesyhsqc.sec does not exist execute the command ./script3.com
rnmrtk seepar
From seepar we can see that F1 is a 1H dimension, F2 is 15N, and F3 is 1H. Typically for 3D NOESY data the 1H-1H planes are viewed. In this case that would correspond to F1 F3 planes.
contour
This opens the contour window and changes the command prompt in the terminal to CONTOUR (F2-F3)>
There are many commands that can be used with contour. See the manual for a complete listing. In this tutorial we will focus on some of the more common commands.
CONTOUR (F2-F3)>level 7.0 mul 1.3 20
CONTOUR (F2-F3)>go
You should now see the first F2-F3 plane corresponding to F1=1
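To get a feel for what the level command requested, the contour heights can be listed with awk. This sketch rests on an assumption that is not spelled out in this tutorial: that "level 7.0 mul 1.3 20" means 20 levels, starting at 7.0, each 1.3 times the previous one. Check the contour manual for the exact definition.

```shell
#!/bin/sh
# Assumed meaning of "level 7.0 mul 1.3 20": base level, multiplier, number of levels.
levels=$(awk -v base=7.0 -v factor=1.3 -v count=20 'BEGIN {
    level = base
    for (i = 1; i <= count; i++) {
        printf "%.2f\n", level   # height of contour i
        level *= factor          # next level is a fixed multiple of this one
    }
}')
echo "$levels"
```

Under that assumption the levels grow geometrically, so most of them are spent on the strongest peaks, while the lowest level (7.0 here) decides how much noise is drawn.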
CONTOUR (F2-F3)>plane
Plane:(hit return to cycle through planes or type a number to jump to that plane)
Plane:quit
This exits plane mode. Note that in plane mode the screen is refreshed each time a new plane is displayed.
Normally one would view 1H-1H planes in a 15N edited noesy, which corresponds to F1-F3 planes in this experiment. To change the orientation use the following command
CONTOUR (F2-F3)>dim f1 f3
Notice that the prompt changes to reflect the dimension change.
CONTOUR (F1-F3)>plane
Plane:28
You should now see something like this
Plane:quit
CONTOUR (F1-F3)>clear
This clears the screen. Some commands such as plane clear the screen by default, but most others do not. Thus if you do not clear the screen you will see the overlay of multiple contours.
CONTOUR (F1-F3)>color 3
CONTOUR (F1-F3)>go
This draws the same contours, but in green. See the manual for color choices.
CONTOUR (F1-F3)>slice
This will display a horizontal slice through zero frequency. The command prompt shows details about the position of the slice.
Using the instructions below play around with slice for a few moments.
Use the number keys 1-4 and 6-9 to alter the amplitude of the slice and the slice position. Use the number key 5 to toggle between changing the amplitude of the slice or altering the slice location.
Use the keys x and y to change between horizontal and vertical slices.
Use the s key to autoscale the slice.
Hit enter (return) to exit slice mode.
CONTOUR (F1-F3)>peak
This command enters peak mode. The peak cursor is displayed in the contour window and information is displayed in the command prompt. As with the slice command, use the keyboard keys 1-4 and 6-9 to move the peak cursor to your desired location. Use the x and y keys to change between altering the horizontal or vertical position.
Use the enter / return key to exit peak mode.
What if we want to draw negative contours?
CONTOUR (F1-F3)>clear
CONTOUR (F1-F3)>level -7.0 mul 1.3 20
CONTOUR (F1-F3)>go
The negative contours are drawn in green as color 3 is still selected.
CONTOUR (F1-F3)>color 2
CONTOUR (F1-F3)>level 7.0 mul 1.3 20
CONTOUR (F1-F3)>go
Now we see both the positive and negative contour levels. Notice that without the clear command between the two go commands the screen is not refreshed so we can see both negative and positive contours at the same time.
Another way to draw positive and negative contours.
CONTOUR (F1-F3)>clear
CONTOUR (F1-F3)>clevel 2 7.0 mul 1.3 20
CONTOUR (F1-F3)>clevel 4 -7.0 mul 1.3 20
CONTOUR (F1-F3)>go
You should now see red and blue contours. Note that no go was needed between the two clevel commands, but it would work fine if the go command was used as well.
Let's zoom
CONTOUR (F1-F3)>zoom
Once the zoom command is entered move the mouse to the contour window, click on a corner and drag the mouse to make a zoom box. Once you are happy with the box size and position hit the mouse button again and the spectrum is zoomed.
CONTOUR (F1-F3)>unzoom
unzoom unzooms all the way out. Even if multiple zooms were performed, unzoom zooms all the way back to the original size.
CONTOUR (F1-F3)>exit
Exits the contour program.
Processing a uniformly sampled 2D data set with maximum entropy reconstruction in the indirect dimension using msa
cd 2d_linear_msa.fid
We are going to start by processing the direct dimension with a conventional FT. Let's look at a pre-built script called process_f2.com
more process_f2.com
A basic script that builds the section file, loads the data, sets referencing information, does a basic FFT with conventional commands, and saves the data as f2_proc.sec
Also note that we end with the commands: dim t1, zerofill 512. This is because, like the flip (linear prediction) program, the msa program needs to have the final size of the spectrum after maximum entropy reconstruction defined prior to issuing the msa command.
./process_f2.com
Maximum entropy reconstruction using msa, msa2d, or msa3d can be run in one of two modes; constant aim mode and constant lambda mode. In constant aim mode there are two user adjustable parameters that must be set; def and aim. In constant lambda mode there are also two user adjustable parameters; def and lambda. Setting these parameters with reasonable values is critical in getting good results with maximum entropy reconstructions.
It is typical to use a FT when processing the direct dimension even when using maximum entropy reconstruction in the indirect dimensions, although maximum entropy reconstruction may be used in the direct dimension as well. When the direct dimension is processed with the FT, and thus not all dimensions are processed together in a single maximum entropy calculation, it is important that the constant lambda mode is used. If all dimensions of a spectrum are processed together with a single maximum entropy calculation then it is typical to use the constant aim mode.
Thus for a 2D data set we have two options:
For a 3D data set we also have two options:
For a 4D data set we only have a single option:
How do we estimate values for def, aim, and lambda? One reasonable method for selecting def and aim is to use a value near the RMS noise level. We have a tool called noisecalc which can be used in an automated way to do just that, but more on that in a bit. Estimating a value for lambda is a bit trickier. One way to do it is to run an msa calculation with constant aim, with aim and def set near the RMS noise level, and then examine the output to see what lambda value was converged upon. The msa calculation can then be repeated with the same def value and the lambda value that the msa program converged to.
Back to processing our 2D HNCO spectrum with msa. At this point we have processed the F2 dimension with a conventional FFT approach and we need to select parameters for def, aim, and lambda. We are initially going to set def and aim to a value near the RMS noise level. We are going to use the command seepln to do this.
seepln
After issuing seepln the command prompt will change to SEEPLN(F2)> and we see a 1D spectrum with peaks on the left hand part of the spectrum. Use the peak command and the number keys 1-4 and 6-9 to move the peak cursor to the location of a reasonably sized signal, such as index point 335, and make a note of where the peak is located. Also move the peak cursor to the right to a region where no signals are located, somewhere around index point 590, and make a note of that point as well.
SEEPLN(F2)>peak
After exiting peak mode by hitting return / enter use the RMS command to measure the RMS noise level from point 590 to 1024 (the last point).
SEEPLN(F2)>RMS 590 1024
The output should show an RMS value around 132, a maximum value near 428, and a S/N near 3.2.
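As a rough illustration of what these numbers mean, the sketch below computes an RMS, a maximum, and their ratio over a range of points. It is not the seepln implementation: the function name region_stats and the tiny trace are made up for the example, and it assumes a plain RMS with no baseline subtraction and a maximum taken from the same region. The quoted values are at least consistent with S/N being maximum divided by RMS (428 / 132 is about 3.2).

```python
import math

def region_stats(points, first, last):
    """Return (rms, maximum, maximum / rms) for points first..last.

    first and last are 1-based and inclusive, mirroring 'RMS 590 1024'.
    Assumption: plain RMS with no baseline (mean) subtraction.
    """
    region = points[first - 1:last]
    rms = math.sqrt(sum(p * p for p in region) / len(region))
    maximum = max(region)
    return rms, maximum, maximum / rms

# Tiny made-up trace: points 3..6 stand in for the signal-free region.
trace = [900.0, 40.0, 3.0, -4.0, 3.0, -4.0]
rms, maximum, ratio = region_stats(trace, 3, 6)
print(f"rms={rms:.3f} max={maximum:.1f} ratio={ratio:.3f}")

# Ratio implied by the values quoted in the tutorial (maximum / RMS).
print(round(428.0 / 132.0, 1))
```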
SEEPLN(F2)>exit
Now let's use the msa program to perform a maximum entropy reconstruction in constant aim mode:
msa t1 300 48 132.0 132.0 0.0 0.0 0.0 0.0 2 single | tee aim.txt
The msa calculation is performed and the output saved to a file aim.txt. Here is a brief description of the msa command
Now that the msa calculation is finished let's look at the spectrum.
contour
CONTOUR(F1-F2)>level 300. mul 1.3 20
CONTOUR(F1-F2)>go
CONTOUR(F1-F2)>exit
rnmrtk save msa_aim.sec
In constant aim mode each column converges to a different lambda value. Lambda acts as a scaling factor and hence each column will have a slightly different scaling and thus the contours will show distortions. In this example the distortions are not strong, but they are present.
To eliminate the scaling distortions we need to recalculate the spectrum in constant lambda mode. The best way to set lambda is to examine the output from the constant aim calculation for a column where the signal was strong (say column 335 which was noted earlier in seepln as having a strong signal) to determine what lambda value was converged to. To do this either use more and scroll through the output of aim.txt or use grep
more aim.txt or grep 'Chunk  335' aim.txt (2 spaces after Chunk)
For Chunk 335 (The location of the strong signal we found with seepln a bit earlier) the msa calculation converged in 23 loops and converged to a lambda value of 2.04907 (L = 2.04907).
Let's run the msa calculation again in constant lambda mode using the def from earlier (132.0) and the lambda value 2.05.
First we have to reload the f2_proc.sec file.
rnmrtk load f2_proc.sec
msa t1 300 48 132.0 2.05 LAMBDA 0.0 0.0 0.0 0.0 2
contour
CONTOUR(F1-F2)>level 300. mul 1.3 20
CONTOUR(F1-F2)>go
CONTOUR(F1-F2)>exit
rnmrtk save msa_lambda.sec
Processing a uniformly sampled 3D data set with maximum entropy reconstruction in the indirect dimensions using msa2d (top)
cd 3d_linear_msa2d.fid
This is a uniformly sampled 3D experiment with 38 and 32 increments in the t1 and t2 indirect dimensions and 512 complex points in each FID.
more process_ft.com
To start we are going to process the experiment with conventional FT methods with linear prediction in both indirect dimensions. Note that the t1 dimension is transformed without LP, and then re-processed with LP later so that the LP is always performed when two of the dimensions are in the frequency domain.
Also note that after processing the F3 dimension the transformed file is saved twice, as noisecalc.sec and f3_proc.sec. The noisecalc.sec file is saved prior to deleting the imaginaries and shrinking the amide region, while the f3_proc.sec file is saved after these steps. These two files will be used shortly when processing the data using maximum entropy reconstruction with the msa2d program.
./process_ft.com
contour [Alternative, use nmrDraw and the ft_lp_f3f1.ft3 file]
CONTOUR(F2-F3)>dim f1 f3
CONTOUR(F1-F3)>level 15.0 mul 1.3 20
CONTOUR(F1-F3)>go
CONTOUR(F1-F3)>plane
Look through planes to confirm that the spectrum processed properly. Plane 54 is shown here.
Plane:quit
CONTOUR(F1-F3)>exit
Now let's move on to performing a maximum entropy calculation to process the two indirect dimensions. Like the 2D example earlier we need to make a judgment on what def, aim, and lambda values to use.
Initially we will process a 2D plane in constant aim mode (with def and aim set) and then we will process the whole experiment in constant lambda mode (with def and lambda set). Like before we will set aim and def to a value based on the rms noise of the experiment. Unlike before where we just measured the RMS from within seepln this time we will use a utility called noisecalc. noisecalc measures the RMS noise and performs scaling depending on the number of samples that were collected and what the final data set size will be.
To run noisecalc we need a single 1D spectrum in the frequency domain, preferably with a small signal if possible. In a uniformly sampled data set this would likely be the last FID, the one with the largest t1 and t2 evolution delays. The data also must be complex, which is why we saved the noisecalc.sec file earlier prior to deleting imaginaries or shrinking the amide region. The following commands perform a little trick to load the last 1D spectrum from the 3D dataset.
rnmrtk
RNMRTK>load noisecalc.sec
RNMRTK>seepar
RNMRTK>dim t2
RNMRTK>shrink 1 32 (shrinks the t2 dimension to a single plane starting from plane 32, the last plane)
RNMRTK>seepar
RNMRTK>setpar dom f1 (Sets the dimensionality to be 1D)
RNMRTK>seepar
RNMRTK>exit
seepln (View the 1D)
SEEPLN(F1)>exit
At this point we should have a single 1D spectrum with 512 complex points. You can view the 1D if you desire with seepln, but it is not necessary.
Now let's run noisecalc
noisecalc 20.0 512 1216 1 32768 1 1 noisecalc1.txt
more noisecalc1.txt
We can see from this file that noisecalc determined a value for aim of 17.468 and def of 3.365.
How do we enter parameters for msa2d?
Unlike msa where the parameters were passed on the command line, msa2d (and msa3d) have their arguments passed via a parameter file. In this example I have created the msa2d parameter file and named it msa2d_aim.param.
more msa2d_aim.param
This parameter file has a value for AIM (and no value for lambda) so it will process in constant aim mode. There is also a NUSE parameter which is used when processing uniformly sampled data. There is a SCHED parameter which will be used instead of NUSE for non-uniformly sampled data which we will explore shortly.
We could process the whole experiment in constant aim mode, view which planes have strong signals, and then determine the lambda value which was converged upon for processing the whole data set in constant lambda mode. However, to speed things up we will only process a single 2D plane, plane 123, which has strong signals.
rnmrtk load f3_proc.sec dim f3 t1 t2 num 1 start 123
This load command loads in a t1 t2 plane starting at F3 plane number 123. The number of planes to be loaded is 1 (num 1).
Now let's run the msa2d program.
msa2d t1 t2 ./msa2d_aim.param | tee msa2d_aim.txt
The maximum entropy calculation should finish quickly. Let's take a look at the output.
contour
CONTOUR(F2-F3)>dim f1 f2
CONTOUR(F1-F2)>level 30.0 mul 1.3 20
CONTOUR(F1-F2)>go
CONTOUR(F1-F2)>exit
Now let's see what lambda value the msa2d program converged to.
more msa2d_aim.txt
Move to the bottom of the file and we can see that the calculation converged in about 28 steps and converged to a lambda value of 0.32855.
Now let's process the whole experiment in constant lambda mode with a lambda value set to 0.329. There is an msa2d parameter file already created with the correct parameters. AIM has been replaced with LAMBDA.
more msa2d_lambda.param
It is possible to load all the 2D planes at a single time and execute the msa2d program. However, that could be memory intensive so it is generally better to process the data in smaller chunks. In this example the chunk size will be set to 8 planes. To handle this we need a script that will loop through each of the chunks and process them and then some way to combine all the data back together.
Note: If you have multiple CPUs in your system and the environment variable MP_SET_NUMTHREADS is set equal to the number of CPUs, then the msa2d program will distribute the chunk over all of your CPUs to speed up the calculation.
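The chunking idea can be sketched independently of the actual script. The example below only works out which planes belong to each chunk. The helper name plane_chunks and the total of 20 planes are made up for illustration, and the printed load line simply reuses the num and start keywords seen in the single-plane load command earlier. It is not a copy of process_msa2d.com.

```python
def plane_chunks(total_planes, chunk_size=8):
    """Yield (start, count) pairs, with 1-based start, covering every plane."""
    start = 1
    while start <= total_planes:
        # The last chunk may hold fewer planes than chunk_size.
        count = min(chunk_size, total_planes - start + 1)
        yield start, count
        start += count

# Hypothetical total of 20 F3 planes, processed 8 planes at a time.
for start, count in plane_chunks(20, chunk_size=8):
    print(f"rnmrtk load f3_proc.sec dim f3 t1 t2 num {count} start {start}")
```

Each chunk would then be processed with msa2d and saved before the next one is loaded, with the pieces combined at the end.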
The script to process all the planes and combine them is called process_msa2d.com
more process_msa2d.com
Note that the process_msa2d.com script calls another script, combine.sh.
more combine.sh
Now let's process the 3D data set.
./process_msa2d.com
contour [Alternative, use nmrDraw and the msa2d_f3f1.ft3 file]
CONTOUR(F2-F3)>dim f1 f3
CONTOUR(F1-F3)>level 30.0 mul 1.3 20
CONTOUR(F1-F3)>go
CONTOUR(F1-F3)>plane
Look through planes to confirm that the spectrum processed properly. Plane 54 is shown here.
Plane:quit
CONTOUR(F1-F3)>exit
Processing a uniformly sampled 3D data set as if it was collected non-uniformly with msa2d (top)
The msa2d program can process uniformly sampled data as an alternative to conventional FT techniques as shown in the earlier part of the demo. In addition the msa2d program can process uniformly sampled data as if it was collected non-uniformly. In this mode the uniformly sampled data is loaded, but when the data is processed a sample schedule is used. The msa2d program will remove all data that is not in the sample schedule and process the data with only the data in the schedule. In this manner one can collect a uniformly sampled data set and experiment with different types of sample schedules.
Let's process an HNCACB experiment non-uniformly with a sample schedule.
cd hncacb.fid
more procpar.txt
This experiment was collected uniformly with 512 complex points in each FID and a total of 64 and 32 complex points in the t1 and t2 dimensions.
Let's start by processing the data with conventional FT methods using the script process_ft.com.
./process_ft.com
If you wish, you can view the data with the following commands:
contour [Alternative - use nmrDraw and the file hncacb_ft.ft3]
CONTOUR(F2-F3)>dim f1 f3
CONTOUR(F1-F3)>clevel 2 10.0 mul 1.3 20
CONTOUR(F1-F3)>clevel 3 -10.0 mul 1.3 20
CONTOUR(F1-F3)>go
CONTOUR(F1-F3)>plane
Examine multiple planes. Plane 53 with a slice drawn is shown here
CONTOUR(F1-F3)>exit
Now let's process the data non-uniformly using a sample schedule, schedule.scd. First let's look at the sample schedule.
more schedule.scd
Notice that the sample schedule is a simple text file with two columns. Column 1 represents t1 and column 2 represents t2. There are 300 lines in the sample schedule, thus when we process the data with this sample schedule only 300 of the 2048 actual FIDs will be used (~15%). The integer values in the sample schedule represent which evolution delays will be used. For example the 11th line in the file is "13 1" which means that the FID which corresponds to the 13th increment along t1 and the 1st increment along t2 will be used. It is often useful to view the sample schedules graphically as shown here:
Each red dot represents one of the 300 data points out of 2048 that will be used when processing the data. Note that the schedule is skewed toward early t1 and t2 evolution times where the signal is stronger. Also note that the maximum increments along t1 and t2 are 64 and 32. These are limited by the largest t1 and t2 evolution times that were collected in the uniformly sampled experiment. If the experiment had actually been collected non-uniformly, these values would likely have been set higher to achieve better resolution.
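To make the arithmetic concrete, here is a minimal sketch that reads a two-column schedule and reports how much of the uniform grid it covers. The function name schedule_coverage and the three-line sample text are made up for the example, and the increments are taken to be 1-based, as the "13 1" example above implies.

```python
def schedule_coverage(schedule_text, n_t1, n_t2):
    """Return (unique_points, fraction_of_grid) for a two-column schedule."""
    points = set()
    for line in schedule_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        t1, t2 = int(fields[0]), int(fields[1])
        if not (1 <= t1 <= n_t1 and 1 <= t2 <= n_t2):
            raise ValueError(f"increment {t1} {t2} lies outside the collected grid")
        points.add((t1, t2))
    return len(points), len(points) / (n_t1 * n_t2)

# Made-up three-line schedule checked against the 64 x 32 grid of this data set.
sample = "1 1\n13 1\n2 3\n"
print(schedule_coverage(sample, 64, 32))

# The figure quoted above: 300 of 64 * 32 = 2048 FIDs.
print(f"{300 / (64 * 32):.1%}")
```

Running the same function on the contents of schedule.scd would also flag any entry that points at an increment which was never collected.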
Let's examine the processing script process_msa2d.com.
more process_msa2d.com
The script is broken into a few sections.
Let's run the script
./process_msa2d.com
View the data with contour
contour [Alternative - use nmrDraw and the file hncacb_msa2d.ft3]
CONTOUR(F2-F3)>dim f1 f3
CONTOUR(F1-F3)>clevel 2 5.0 mul 1.3 20
CONTOUR(F1-F3)>clevel 3 -5.0 mul 1.3 20
CONTOUR(F1-F3)>go
CONTOUR(F1-F3)>plane
Examine multiple planes. Plane 53 with a slice drawn is shown here
CONTOUR(F1-F3)>exit
In this example we processed the HNCACB while throwing away 85% of the data points and were able to achieve good sensitivity and resolution as compared to conventional FT methods for the complete uniformly sampled experiment.
Web based generation of processing scripts and automatic parameter selection using SBTOOLS (top)
We have created a web-based tool for generating RNMRTK processing scripts. The tool currently handles 2D and 3D data sets. Processing scripts using FT or MSA methods can be generated. In addition, an auto mode exists for maximum entropy calculations that will automatically determine reasonable values for def, aim, and lambda. The processing scripts are very verbose with lots of comments and error checking. The sbtools web site is located at http://sbtools.uchc.edu/nmr/nmr_toolkit.
Let's create a maximum entropy processing script for a non-uniformly collected 3D HNCO.
cd 3d_nonlinear_msa2d.fid
more procpar.txt
Note the important information needed for processing:
It is also important to know what the maximum increment in each of the t1 and t2 dimensions is so that appropriate choices can be made as to the final size of the processed spectrum. One easy way to do this is with sort:
sort -n -k 1 3d_test.scd (25 should be the maximum value for t1)
sort -n -k 2 3d_test.scd (44 should be the maximum value for t2)
wc -l 3d_test.scd (326 lines in the sample schedule)
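The same three numbers can be gathered in one pass. The sketch below is just an alternative to the sort and wc commands above. The function name schedule_summary and the sample text are made up, and it assumes the schedule holds two whitespace-separated integer columns.

```python
def schedule_summary(schedule_text):
    """Return (max_t1, max_t2, number_of_lines) for a two-column schedule."""
    rows = []
    for line in schedule_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            rows.append((int(fields[0]), int(fields[1])))
    max_t1 = max(t1 for t1, _ in rows)
    max_t2 = max(t2 for _, t2 in rows)
    return max_t1, max_t2, len(rows)

# Made-up sample; for the real file pass open("3d_test.scd").read() instead.
sample = "1 1\n25 3\n7 44\n2 2\n"
print(schedule_summary(sample))
```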
Go to the sbtools site and enter the following information into the form. Note that there are two images below showing how the form should look when properly filled out. Some of the information comes from the parameters which were used to setup the experiment, some are set automatically as you fill out the form, and some are user defined choices.
After filling out the form with the values from above select Create Script to generate the processing script.
Once the script is created cut and paste it into the file process.com. Note that on macs there can be issues with newline characters. If this is an issue use vi or a text editor that allows the newline characters to be defined such as TextWrangler.
After saving the script execute it
./process.com
section -c 128 128 256
rnmrtk load test.sec
Use contour to view the data
contour [Alternative - use nmrDraw and the file test_f3f1.ft3]
CONTOUR(F2-F3)>dim f1 f3
CONTOUR(F1-F3)>level 10.0 mul 1.3 20
CONTOUR(F1-F3)>go
CONTOUR(F1-F3)>plane
Plane:quit
CONTOUR(F1-F3)>exit
Processing with maximum entropy reconstruction with linewidth deconvolution (top)
As an inverse processing method maximum entropy reconstruction is ideally suited to being able to stably deconvolve spectra. Built into msa, msa2d, and msa3d is the ability to deconvolve J-couplings, allowing virtual decoupling, and linewidth, allowing linewidths to be narrowed.
We will use a 2D 15N-HSQC as an example and perform linewidth deconvolution in the indirect dimension. The same principles apply for 3D data sets. Linewidth deconvolution makes calculation times longer, which is why we demonstrate it on a 2D.
cd hsqc_msa_lw.com
Inside the directory is the raw NMR data, the sample schedule (schedule.scd) and two processing scripts. The script process.com is a generic script for processing the data in auto mode without any linewidth deconvolution and is only there for reference. The process_lw.com script will perform the same processing as process.com with the addition of linewidth deconvolution. To run the script you must enter the linewidth to be deconvolved and a base filename.
Try processing the spectra with various linewidths deconvolved. As an example I have chosen 0, 5, 10, 15, 25, and 50, but try any values you like. Make sure you enter the value as a float (with a decimal point).
./process_lw.com 0.0 data_001
./process_lw.com 5.0 data_002
./process_lw.com 10.0 data_003
./process_lw.com 15.0 data_004
./process_lw.com 25.0 data_005
./process_lw.com 50.0 data_006
Once the spectra are processed open them with nmrDraw and see the results (Use "data_%03d.ft2 1 6" as the template to load all of them at once). If you need to view the data in contour use the following commands.
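The template is a printf-style pattern followed by the first and last index. The short sketch below shows which file names that pattern stands for, assuming the template is expanded in the usual printf way. The helper name expand_template is made up for the example.

```python
def expand_template(template, first, last):
    """Expand a printf-style file template for indices first..last inclusive."""
    return [template % index for index in range(first, last + 1)]

# The six spectra written by the process_lw.com runs above.
for name in expand_template("data_%03d.ft2", 1, 6):
    print(name)
```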
contour
CONTOUR(F1-F2)>level 25.0 mul 1.3 20
It is easiest to view the differences in the spectra by zooming in and using a vertical slice through the t1 dimension.
0.0 Hz Linewidth deconvolution
5.0 Hz linewidth deconvolution
10.0 Hz linewidth deconvolution
50.0 Hz linewidth deconvolution
As can be seen from the images above the peaks narrow as we increase the amount of linewidth that is deconvolved from the spectra. At a certain point however the deconvolution causes problems. In this case 50.0 Hz is far too aggressive.
When using linewidth deconvolution it is critical that you do not deconvolve too large a value (values that approach the natural linewidth) and that the final output size is sufficiently large. If the output size is not large enough then truncation artifacts will appear in the narrowed spectrum.
Generating sample schedules with ScheduleTool and sampsched2d (top)
There are two tools distributed with the rnmrtk software package. One is a command line tool, sampsched2d, that is built into the rnmrtk exe directory. The other tool is a graphical java based tool which is located in the folder ScheduleTool in the rnmrtk installation folder. It is also located in the ScheduleTool folder of this tutorial. To run the graphical version you need to have a semi-recent version of java installed (java 1.5 or later) and it should be a Sun version of java, not the GNU version. You can check what version of java is installed with java -version. It is also helpful to place the ScheduleTool.jar file and the ScheduleTool or ScheduleTool-mac script to launch the jar file somewhere in your path, such as /usr/local/bin.
In this part of the tutorial we will explore how to generate sample schedules with these two tools and show you how to apply those sample schedules to a large hnco dataset. We leave it up to you to try various sample schedules and process the data.
Generally when creating a sample schedule you want to choose maximum increment values that will give you the desired resolution, but select a greater number of points with short time increments where the signal is strong. Both these tools allow the user to enter a decay rate that is used to skew the distribution of selected points to the beginning where the signal is strong. For constant time experiments a decay rate of 0.0 can be entered which will cause a random distribution to be selected.
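The idea of skewing a schedule toward short evolution times can be sketched as weighted random sampling without replacement. This is not the algorithm used by sampsched2d or ScheduleTool. The function name make_schedule, the choice of 300 points, and the decay rate of 20.0 (taken here to be in units of 1/s) are all assumptions made for illustration. Only the sweep widths and the limit of 128 increments come from this data set.

```python
import math
import random

def make_schedule(n_points, max_t1, max_t2, sw1, sw2, decay1, decay2, seed=0):
    """Pick n_points unique 1-based (t1, t2) increments.

    Each grid point is weighted by exp(-decay1 * t1_time - decay2 * t2_time),
    so a decay rate of 0.0 in both dimensions gives a uniform random pick.
    """
    rng = random.Random(seed)
    keyed = []
    for i in range(1, max_t1 + 1):
        for j in range(1, max_t2 + 1):
            t1_time = (i - 1) / sw1  # evolution time in seconds
            t2_time = (j - 1) / sw2
            weight = math.exp(-decay1 * t1_time - decay2 * t2_time)
            # Weighted sampling without replacement: keep the largest keys.
            key = math.log(1.0 - rng.random()) / weight
            keyed.append((key, i, j))
    keyed.sort(reverse=True)
    return sorted((i, j) for _, i, j in keyed[:n_points])

schedule = make_schedule(300, 128, 128, 1050.0, 930.0, 20.0, 20.0)
for t1, t2 in schedule[:5]:
    print(t1, t2)
```

Very large decay rates would drive the weights toward zero and break the division, so a production tool would need to guard against that.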
cd hnco_large.fid
more procpar.txt
When building a sample schedule it is important to know the nucleus type and sweep width in the indirect dimensions. For this experiment the sweep widths are 1050.0 and 930.0 Hz and the nuclei are 13CO and 15N.
First, sampsched2d
sampsched2d runs like other rnmrtk programs from the command line. Once executed values can be entered line by line.
sampsched2d
Once sampsched2d is started enter the values as they appear in the figure below. Note that sweep widths, decay constants, and J-coupling values must be entered as a float (with a decimal).
There should now be a file called schedule3.scd in the folder. Note that you can try entering any parameters you want, you don't have to use the defaults from this demo. However, as we are applying this schedule to data that has already been collected we are restricted to 128 as the maximum sample delay number in each dimension.
Alternatively a script can be created to generate the sample schedule. It would look something like this
Now let's try ScheduleTool
cd Schedule_Tool
ScheduleTool (linux)
ScheduleTool-mac (OSX)
A window like this should appear
I will give a brief description here on how to use the tool and refer you to the manual for detailed information.
Besides being graphical, the ScheduleTool differs in concept from sampsched2d in that it attempts to predict reasonable values for the parameters needed to create the sample schedule. The user is supposed to enter parameters for molecular weight, field, nucleus, and sweep widths, and then hit the Compute Defaults button. This will attempt to predict a maximum increment which is equivalent in time to 1.26 times the T2 relaxation time and choose a decay rate that is appropriate. In addition the user can force initial points to be sure that the earliest time points, where the signal is strong, are collected. Options for J-resolved experiments and oversampling are also included, but will not be discussed in this tutorial. Lastly, the total number of points can be entered. A conservative value is chosen by default, but like all the parameters, it can be adjusted by the user.
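The relationship between T2, sweep width, and maximum increment is simple arithmetic, sketched below. How ScheduleTool actually predicts T2 from molecular weight and field is not covered here, so the T2 of 50 ms is a made-up value and max_increment is a hypothetical helper. Only the 1.26 factor and the two sweep widths come from the tutorial.

```python
def max_increment(t2_seconds, sweep_width_hz, factor=1.26):
    """Number of increments whose evolution time spans about factor * T2.

    Successive increments are spaced 1 / sweep_width_hz apart in time.
    """
    return round(factor * t2_seconds * sweep_width_hz)

T2_GUESS = 0.05  # hypothetical T2 of 50 ms, for illustration only
for sweep_width in (1050.0, 930.0):
    print(f"sw={sweep_width} Hz -> about {max_increment(T2_GUESS, sweep_width)} increments")
```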
Note: While it is intended to use the ScheduleTool in graphical mode for most users, it does have a command line option for scripting.
Fill out the form with various values and experiment with how it works. Choose create schedule when you are happy (do not choose maximum increment values greater than 128).
After hitting the create schedule button you can view three windows:
Once you have created a suitable sample schedule from sampsched2d or ScheduleTool save it to the hnco_large folder. From the ScheduleTool this can be done by hitting File - Save from the Schedule Window.
Process the data
./process_f3.com
This will process the acquisition dimension in a suitable manner to process the indirect dimensions with msa2d
more process_msa2d.com
This script creates the shared memory section, builds the msa2d parameter file (using as input the sample schedule name that you provide), processes the data, and then combines the data into a single file. The script needs the schedule name and a base output name to be entered on the command line.
./process_msa2d.com schedule_filename output_filename
Example: ./process_msa2d.com schedule3.scd schedule3
Feel free to try multiple sample schedules and see how the spectrum responds.
Measuring the response of the spectrum to altering DEF (top)
In this section we will use an auto script from the SBTOOLS web site that determines def, aim, and lambda automatically. However, the value for def will be scaled by dividing it by various values. In each case aim will be held constant and lambda will be determined in an automatic fashion.
cd hsqc_def.fid
more run_all.com
more process.com
The run_all.com script simply executes the process.com script seven times with different values to divide Def by.
Feel free to edit the run_all.com script to try any value of dividing def by that you like.
Execute the run_all.com script
./run_all.com
After it is complete run the intensities.com script. This script will extract what def and lambda values were used to process each of the spectra and then calculate the ratio of a weak to strong peak. The ratio from a linearly sampled spectrum processed using conventional FT methods is listed for comparison.
more intensities.com
./intensities.com
Note that as the def value is decreased the converged lambda value also decreases and as the def value is increased the converged lambda value also increases.
At the bottom of the output is the ratio of a weak to a strong peak, along with the actual ratio of 0.363 from an FT spectrum that was linearly sampled. Note that for low lambda values the ratio is smaller, and the ratio increases as lambda increases. It would thus seem like a good idea to set lambda to a large value to reduce the non-linearity of the maximum entropy reconstruction. However, below are selected spectra processed with increasing lambda values, and it is obvious that a large lambda value is not the regime in which optimal results will be obtained. NOTE: This is the regime in which forward maximum entropy reconstruction works.
Def = 0.005890, Aim = 3.02, Lambda = 0.05
Def = 0.5890, Aim = 3.02, Lambda = 1.91
Def = 5.89, Aim = 3.02, Lambda = 8.27
Calibrating nonlinearities in peak intensities by injecting synthetic peaks (top)
When performing maximum entropy reconstructions the peak intensities become non-linear. Thus peak intensities and volumes cannot be accurately determined from direct measurement. In most cases this issue does not cause any problems, as in many NMR spectra it is only the location of a peak that is critical. However, for some experiments, especially relaxation studies, it is critical to have accurate peak intensities.
One way to accomplish that with maximum entropy reconstruction is to synthetically inject peaks into your spectrum and measure the resulting peak intensities as compared to the intensities that they were injected with.
Let's see what this would look like
cd 2d_correct.fid
Inside this folder are two scripts
process.com - Processes the 2D spectrum with msa and injects 10 peaks into the spectrum synthetically. Shown here is the section of the script that injects the peaks.
See the manual for detailed information on the inject function, but in general we simply define whether the peaks will be injected based on frequency or ppm, set up linewidths in the two dimensions, set the phase to be whatever phase value is used when processing the spectrum, and then use the peak command to inject peaks, defining the intensity and frequencies.
The script is designed to run in constant lambda mode where the lambda value is defined on the command line. To run the script enter:
./process.com 1.0 (NOTE, use any lambda value you want. Try different values from near zero to over 100. Make sure to enter lambda as a float)
Take a look at the spectrum with nmrDraw or contour. A good level for contour is 100.0. The filename will be called test with the lambda value used in processing appended at the end.
There is another script called intensities.com. This script measures the intensities at the location of the 10 synthetically injected peaks. It reports 5 columns of numbers.
To run the script type
./intensities.com test.1.0.sec (Use whatever lambda value you entered when processing.)
Examine the values for different lambda values used in processing. In general as lambda increases the non-linearity is reduced, but as in the def example earlier, the quality of the spectra deteriorates.
Below are three plots of measured intensities versus actual intensities (blue lines). A normalized linear line is also shown to illustrate the non-linearity. The top plot was processed with Lambda = 0.1 and thus the non-linearity is significant. The central plot used Lambda = 1.0 and the bottom plot used Lambda = 10.0.
Note that in the plots it may appear that the non-linearity improves for low intensity peaks. This is not true, and the actual non-linearity is greater the weaker the peak, as illustrated by the ratios.
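One way the injected peaks could be used as a calibration curve is sketched below: a measured intensity is mapped back to an estimated true intensity by piecewise-linear interpolation between the injected peaks. This is not part of the tutorial's scripts. The function name calibrate and every number in the table are hypothetical, chosen only to show the mechanics.

```python
def calibrate(measured, pairs):
    """Estimate the true intensity for a measured one.

    pairs is a list of (measured_intensity, injected_intensity) tuples taken
    from the synthetic peaks; measured values must be distinct.
    """
    points = sorted(pairs)
    for (m0, a0), (m1, a1) in zip(points, points[1:]):
        if m0 <= measured <= m1:
            fraction = (measured - m0) / (m1 - m0)
            return a0 + fraction * (a1 - a0)
    raise ValueError("measured intensity is outside the calibrated range")

# Hypothetical calibration table: weaker peaks are attenuated more strongly.
table = [(50.0, 100.0), (150.0, 200.0), (350.0, 400.0), (760.0, 800.0)]
print(calibrate(100.0, table))
```

The injected intensities should bracket the range of real peak intensities, since the sketch refuses to extrapolate beyond the calibrated range.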
Processing 3D Bruker Data (top)
Processing Bruker data is in principle no different than processing Varian data, but there are a few differences:
cd Bruker_3d
The first thing we need to do is extract experimental parameters. This can be done by examining the acqu* files, but I have a perl script to aid in the process. To run the script type:
./bruker2sbtools.prl (Enter 1, 298, and 4.772 when prompted)
There is no loadbruker command like there is loadvnmr. However, loading Bruker data is in general quite easy. In order to load the data a parameter file must be created which describes the layout of the Bruker dataset. It is also convenient to include additional information such as spectrometer frequencies, sweep widths, and referencing information, although this information can also be set later.
Let's look at a script that will load the Bruker data into RNMRTK
more convert.com
The script builds a file, ser.par, setting information about the format such as big-endian and int-32, information about the number of dimensions set with the DOM command, information such as number of points, SW, SF, PPM, QUAD in each of the dimensions. It also has a layout line which describes the layout of the file.
More about the layout line
In this example the layout line is: LAYOUT T2:80 T1:60 T3:1024
This states that the T3 dimension is the fastest collected dimension (direct dimension), the T1 dimension is the second fastest dimension collected (first of the two indirect dimensions to be incremented), and T2 is the slowest dimension (second of the two indirect dimensions to be incremented). From a file layout point of view it states that the first 1024 points belong to the first FID. There are then 60 blocks of 1024 points for the T1 dimension. There are then 80 blocks of 60x1024 data points for the T2 dimension.
The nice thing about Bruker data is that there are no extraneous headers or padding between data, the data always is in int-32 and big-endian format, and the layout line is generally very simple to create.
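The layout line translates directly into file offsets. The sketch below shows that arithmetic and reads one FID from a small synthetic buffer. It assumes that each T3 point is stored as one big-endian 32-bit integer with no header or padding, as described above. The function names fid_offset and read_fid are made up, and this is not how the RNMRTK loader is implemented.

```python
import struct

BYTES_PER_POINT = 4  # int-32

def fid_offset(t2_index, t1_index, n_t1, n_t3):
    """Byte offset of an FID; indices are 0-based, T3 is the fastest dimension."""
    return (t2_index * n_t1 + t1_index) * n_t3 * BYTES_PER_POINT

def read_fid(raw, t2_index, t1_index, n_t1, n_t3):
    """Unpack the n_t3 big-endian int-32 points of one FID from raw bytes."""
    start = fid_offset(t2_index, t1_index, n_t1, n_t3)
    return struct.unpack(f">{n_t3}i", raw[start:start + n_t3 * BYTES_PER_POINT])

# Offsets for LAYOUT T2:80 T1:60 T3:1024
print(fid_offset(0, 1, 60, 1024))       # second FID in the file
print(fid_offset(1, 0, 60, 1024))       # first FID of the second T2 block
print(80 * 60 * 1024 * BYTES_PER_POINT) # expected size of the file in bytes

# Synthetic miniature layout T2:2 T1:3 T3:4 filled with the values 0..23.
raw = struct.pack(">24i", *range(24))
print(read_fid(raw, 1, 2, 3, 4))
```

Comparing the expected size with the actual size of the ser file is a quick sanity check on a hand-written layout line.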
At the bottom of the convert.com script we perform two more functions. We take the complex conjugate of the acquisition dimension. This reverses the spectrum putting it in the correct orientation for the RNMRTK. We also throw away the first 71 points by using the shrink command to eliminate artifacts from the Bruker digital filtering.
Now let's convert some data.
./convert.com
seepln
SEEPLN(T3)>row
Use seepln and the row command to make sure the data looks correct.
Back to the bruker2sbtools.prl program. When this script ran a few different files were created.
Let's try running the rnmrtk.com script
more ./rnmrtk.com
The rnmrtk.com script loads the data, fixes the digital filtering artifacts, processes the data, and saves the data in rnmrtk and nmrPipe formats.
./rnmrtk.com
Let's look at the data in either nmrDraw or contour
nmrDraw (open file hncogp3d_f3f2.ft3)
or
section -c 1024 128 128
rnmrtk load hncogp3d.sec
contour
CONTOUR(F2-F3)>level 200.0 mul 1.3 20
CONTOUR(F2-F3)>go
CONTOUR(F2-F3)>plane
In either nmrDraw or contour look at various planes. Below is an image of plane 53. Note that the peaks are near the edge of the spectrum and the central part of the spectrum is devoid of peaks. This is because some Bruker data sets need to have a quadfix performed on them inside rnmrtk and this is an example of an experiment that needs such a fix. In this experiment the problem is in the t2 dimension. To fix the problem the real and imaginary components along the t2 dimension need to be negated.
Using an editor open the rnmrtk script and add the following two lines after the SHRINK 441 72 line
and rerun the script
./rnmrtk.com
View the data again and notice that it is now correct.
In addition to conventional FT processing there is a processing script sbtools_msa2d.com that can also be executed to process the same spectra with MaxEnt. This script was generated by copying and pasting (from a text editor) the sbtools.input text file into the sbtools web site and generating a script. The only things that were changed on the site were the number of CPUs, the processing type (switched to maximum entropy in t1 and t2), and the addition of a quad fix in t2.
Feel free to run the sbtools_msa2d.com script or better yet attempt to build the script yourself from scratch or from the sbtools web site.
Processing 4D data (top)
cd 4D/4dCNnoesy.fid
Inside this folder there are three scripts.
From the procpar.txt file and the pulse sequence we can see parameters used to collect the data
more procpar.txt
more ft.com
This script will process the 4D experiment and save the result as final.sec. Currently forward linear prediction is only turned on along the t1 dimension, but it can be turned on along t2 and t3 by removing the # from the flip lines. Better yet would be to add new sections below the t1 dimension to IFFT the t3/t2 dimension, perform linear prediction, and then do an FFT again. This way the linear prediction is only performed on a time domain dimension when the other three dimensions are in the frequency domain. I leave it to you to edit the script in a suitable manner to do this.
./ft.com
View the data with the instructions below
more maxent.com
The maxent.com script loads the data, does some referencing, processes the t4 dimension with FFT, and then processes the t1, t2, t3 cubes using msa3d in chunks of 8 cubes per chunk. Even though it is a 4D the script is not that different from the 3D examples earlier. Note that the msa3d.par file has an extra column for NUSE and NOUT.
./maxent.com
./combine.com (Saves the data as final_msa3d.sec)
View the data with the instructions below
more maxent_sched.com
This script is identical to maxent.com except that it processes the data with msa3d using a sample schedule, as if the data had been collected non-uniformly.
./maxent_sched.com
./combine.com (Saves the data as final_msa3d_sched.sec)
View the data with the instructions below.
I leave it up to you to try the script. Feel free to edit the script or create your own sample schedule. Note that because the data has already been collected, any sample schedule is limited to a maximum increment of 64, 16, and 16 for the t1, t2, and t3 dimensions, respectively.
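As a starting point for building your own schedule, here is a minimal Python sketch that draws a random set of unique (t1, t2, t3) increments inside the 64 x 16 x 16 grid that was actually collected. The function names random_schedule and is_valid are hypothetical, and the sketch assumes increments are numbered from 1 and written one triple per line. Compare the printed lines against the schedule file used by maxent_sched.com before relying on that layout.

```python
import random


def random_schedule(n_points, max_incr=(64, 16, 16), seed=0):
    """Return n_points unique, sorted (t1, t2, t3) increments within max_incr.

    Assumption: increments are 1-based, so t1 runs from 1 to max_incr[0].
    """
    n1, n2, n3 = max_incr
    total = n1 * n2 * n3
    if n_points > total:
        raise ValueError("cannot sample more points than were collected")
    rng = random.Random(seed)
    schedule = []
    # Sample flat indices without replacement, then unfold them into triples.
    for flat in rng.sample(range(total), n_points):
        t1 = flat // (n2 * n3)
        rem = flat % (n2 * n3)
        t2 = rem // n3
        t3 = rem % n3
        schedule.append((t1 + 1, t2 + 1, t3 + 1))
    return sorted(schedule)


def is_valid(schedule, max_incr=(64, 16, 16)):
    """Check that every increment lies inside the collected grid."""
    return all(
        1 <= t <= limit
        for point in schedule
        for t, limit in zip(point, max_incr)
    )


if __name__ == "__main__":
    sched = random_schedule(500)
    assert is_valid(sched)
    for t1, t2, t3 in sched[:5]:
        print(t1, t2, t3)
```

A purely uniform random draw is used here only for simplicity; other sampling distributions can be swapped in as long as the result still passes the bounds check.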
Visualizing 4D data
Assuming the data is loaded into memory here are the instructions for viewing 4D data in contour.
contour
CONTOUR(F2-F3)> dim f1 f4 (This will show 1H - 1HN planes)
CONTOUR(F2-F3)>level 4.0 mul 1.3 20 (Note, you may need to try different contour levels. Use clear between tries when looking for a good level).
CONTOUR(F2-F3)>go
CONTOUR(F2-F3)>cube
select a cube in the middle somewhere
CONTOUR(F2-F3)>plane
Now look through the planes to see the quality of the spectrum.
The RNMRTK LOAD command (top)
The RNMRTK program has a universal loader command (load). It is capable of loading data in many different formats, with one caveat: you must know the data type and the layout of the file. Most users will not know these details. Fortunately, there is a loadvnmr command, as we have seen earlier, and while there is no loadbruker command, Bruker data is generally easier to load manually than Varian data, as we will see shortly. To load data into the rnmrtk program with the load command, two things must be present: the raw NMR data file, which must have an extension, and a parameter file with the same base name and a .par extension.
NOTE: For Bruker data the ser filename must be renamed or copied to a name with an extension such as ser.dat.
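The naming rule above can be checked before calling load. The following Python sketch is only an illustration of that rule: the helper par_path_for is hypothetical and not part of RNMRTK. It returns the parameter file name that load would expect for a given data file, and rejects a data file that has no extension (such as a raw Bruker ser file).

```python
from pathlib import Path


def par_path_for(data_file):
    """Return the .par file name expected alongside data_file.

    Raises ValueError if data_file has no extension, since the load
    command requires one (e.g. copy ser to ser.dat first).
    """
    data = Path(data_file)
    if data.suffix == "":
        raise ValueError(
            f"{data.name} has no extension; copy or rename it first"
        )
    # Same base name, with the extension swapped for .par
    return data.with_suffix(".par")


if __name__ == "__main__":
    for name in ("fid.sec", "data.sec", "ser"):
        try:
            print(name, "->", par_path_for(name))
        except ValueError as err:
            print("problem:", err)
```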
The parameter file must contain at least these four lines (FORMAT, DOM, N, and LAYOUT), each of which is described below
The parameter file may also contain additional lines defining experimental parameters
FORMAT: The format line is a single line that can have up to six arguments:
For Varian data the following are typical parameters.
For Bruker Data the following are typical parameters.
DOM: The DOM line defines the number of dimensions and defines the order in memory that each of the dimensions will be.
N: The N line defines the number of points in each of the dimensions defined by DOM and defines whether the data is real (R) or complex (C). Data is ordered t1 t2 t3.
LAYOUT: The LAYOUT line describes the layout of data in the file, which for Varian data matches the way the data was arrayed.
For example if the Varian array for a 3D was set to phase2,phase the true array is actually
as np, ni, and ni2 are implied. Arrays go from the outside in, in nested loops. Essentially the line is read backwards as compared to the way the actual file is laid out.
The LAYOUT line has three parts for each arrayed element:
Varian data and the load command
cd 3d_noesyhsqc.fid
more fid.par
cp fid fid.sec
section -c 1024 220 64
rnmrtk loadvnmr ./fid (Load data with loadvnmr first for comparison)
rnmrtk seepar
rnmrtk load fid.sec
rnmrtk seepar
Most of the parameters are the same and the data is loaded in the same way. The loadvnmr command by itself did not calculate the carriers and referencing information properly, but that is easily fixed with the setpar command.
Let's see if the data looks good with seepln
seepln
SEEPLN(T3)>exit
If you want to test that the data is truly loaded correctly you can process the experiment with the script test_load.com which uses the load command rather than loadvnmr to load the data.
more test_load.com
./test_load.com
More about the loadvnmr command (top)
The loadvnmr command does a good job at parsing the procpar and header file for information, but it cannot guess which dimension goes with which channel (transmitter/decoupler). It also assumes that the d2 delay goes with t1 and d3 goes with t2. To deal with these issues the loadvnmr command has two additional arguments
In the noesyhsqc experiment t1 and t3 are 1H (transmitter) and t2 is 15N (decoupler 2).
Thus for the loadvnmr command we can enter
rnmrtk loadvnmr TR020 D23 ./fid
rnmrtk seepar
Now we see that the carrier is properly selected. Note that referencing information still needs to be entered manually.
Bruker data and the load command (top)
There is no loadbruker equivalent to the loadvnmr command. However, it is quite easy to build a parameter file to load Bruker data with the load command. This is because the layout line does not need to deal with sub-dimensions, there are no headers or padding to deal with, and the format is always the same.
Let's look at the Bruker 3D HNCO from earlier
cd Bruker_3d_hnco
cp ser data.sec
more data.par
Note that the FORMAT line is always the same. The LAYOUT line is simple to write because no sub-dimensions need to be defined: the Bruker layout and the rnmrtk layout are the same. This simple parameter file can be used as a template for just about any Bruker data set.
Let's load the data
section -c 1025 60 80
rnmrtk load data.sec
rnmrtk seepar
Let's look at the data
seepln
SEEPLN(T3)>peak
Move the cursor with the number keys 1-4 and 6-9 to find the first real point of the FID. Note that, because of phasing, it can be difficult to tell exactly which point is the first real point this way. In this example the first real point is point 72.
SEEPLN(T3)>exit
To fix the DSP data we need to shift the FID to the left. This is done with the shrink command
rnmrtk
RNMRTK>dim t3
RNMRTK>shrink 441 72
RNMRTK>CONJ
RNMRTK>exit
View the result with seepln
seepln
Note: the conj command reverses the acquisition dimension, which would otherwise appear backwards.
Alternative ways to determine how many points to shift the FID to fix the DSP artifacts:
Look at the parameter GRPDLY. If it exists and is non-zero, then GRPDLY is equal to the number of points to shift. If GRPDLY does not exist or is equal to zero, then look up the DECIM and DSPFVS parameters and use the table below.
Note: These values are floating-point numbers, but the shift must be an integer. Simply drop the digits after the decimal point (do not round up). The zero-order phase correction can then be calculated with the formula
phase0 = 360-((shift_from_table - integer_shift)*360)
Alternatively one can use the bruker2sbtools.prl program talked about in the Bruker section of this tutorial.
DECIM | DSPFVS 10 | DSPFVS 11 | DSPFVS 12 | DSPFVS 13 |
2 | 44.7500 | 46.0000 | 46.311 | 2.750 |
3 | 33.5000 | 36.5000 | 36.530 | 2.833 |
4 | 66.6250 | 48.0000 | 47.870 | 2.875 |
6 | 59.0833 | 50.1667 | 50.229 | 2.917 |
8 | 68.5625 | 53.2500 | 53.289 | 2.938 |
12 | 60.3750 | 69.5000 | 69.551 | 2.958 |
16 | 69.5313 | 72.2500 | 71.600 | 2.969 |
24 | 61.0208 | 70.1667 | 70.184 | 2.979 |
32 | 70.0156 | 72.7500 | 72.138 | 2.984 |
48 | 61.3438 | 70.5000 | 70.528 | 2.989 |
64 | 70.2578 | 73.0000 | 72.348 | 2.992 |
96 | 61.5052 | 70.6667 | 70.700 | 2.995 |
128 | 70.3789 | 72.5000 | 72.524 | |
192 | 61.5859 | 71.3333 | 71.3333 | |
256 | 70.4395 | 72.2500 | 72.2500 | |
384 | 61.6263 | 71.6667 | 71.6667 | |
512 | 70.4697 | 72.1250 | 72.1250 | |
768 | 61.6465 | 71.8333 | 71.8333 | |
1024 | 70.4849 | 72.0625 | 72.0625 | |
1536 | 61.6566 | 71.9167 | 71.9167 | |
2048 | 70.4924 | 72.0313 | 72.0313 |
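The shift and phase rule described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of RNMRTK: the function names choose_group_delay and dsp_shift_and_phase are hypothetical, and the table value must be looked up by hand from the DECIM and DSPFVS parameters.

```python
import math


def choose_group_delay(grpdly, table_value):
    """Use GRPDLY if it exists and is non-zero, otherwise the table value.

    grpdly may be None when the parameter is absent from the data set.
    """
    if grpdly is not None and grpdly != 0:
        return grpdly
    return table_value


def dsp_shift_and_phase(delay):
    """Split a fractional delay into an integer shift and a phase0 in degrees.

    The shift is truncated (never rounded up), and
    phase0 = 360 - ((delay - integer_shift) * 360).
    """
    integer_shift = math.floor(delay)
    phase0 = 360.0 - (delay - integer_shift) * 360.0
    return integer_shift, phase0


if __name__ == "__main__":
    # DECIM 24 with DSPFVS 11 gives 70.1667 in the table above.
    delay = choose_group_delay(None, 70.1667)
    shift, phase0 = dsp_shift_and_phase(delay)
    print(shift, round(phase0, 3))
```

For a table value with no fractional part, such as 46.0000, the formula returns 360 degrees, which is equivalent to no zero-order correction.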