SIMPSON v 4.1.1 is a bug fix release. The major changes concern the acq_block command. Nov 25, 2014
SIMPSON v 4.1 is a minor release. The major changes are
- Compiled with Tcl 8.6
- Fixed bug in storage of unit propagator
- Fixed bugs in fint
We are pleased to release a major update of simpson. The simulations are significantly faster than previously, and many new features are added. Use the links on the right to download your flavour or download the source if you want to compile SIMPSON yourself. Please note that the new version of SIMPSON uses two new libraries, fftw3 and nfft3.
dipole_ave N1 N2 bIS eta alpha beta gamma - for motionally averaged dipole-dipole interactions this allows to define asymmetry parametereta
method value1 value2 value3 ... - defines calculation methods for fsimpson.
value | feature |
direct, idirect | both values equivalent, uses direct evaluation of pulseq, possibly with COMPUTE method if acq_block is used*) |
gcompute, igcompute | both values equivalent, uses gamma-COMPUTE method, either in original pulseq syntax or in new syntax with acq_block – allows for manipulation before acquisition)*) |
time | forces calculations to be done in time domain |
frequency | forces calculations to be done in frequency domain – can be used only in combination with acq_block (both direct or gcompute) |
block_diag | tries to use block-diagonal Hamiltonians whenever it is possible; considerable speedup for free evolution periods and when rf not applied on all channels |
FWTinterpolation | triggers FWT interpolation method to be used for both amplitudes and frequencies – works with gcompute only |
FWT2interpolation | triggers FWT interpolation method to be used just for frequencies, amplitudes are taken the same as in the source orientational set (nearest) – works with gcompute only |
ASGingterpolation | triggers ASG interpolation method used after spectrum generation - works with gamma-COMPUTE in frequency domain only |
FWTASGinterpolation, FWT2ASGinterpolation | combinations of the interpolation methods |
sparse | uses sparse matix algorithms, needs simpson linked with Intel MKL libraries |
diag, pade, chebyshev, taylor, lanczos | experimenting with different methods to calculate propagators (matrix exponentials) |
*) For a description of possible combinations of method instructions see below. |
num_cores N – defines number of threads to be used for parallel calculations of parameter averages (mainly crystallite orientations). If not defined then all processor cores are used. Number of threads can also be limited by environment variable SIMPSON_NUM_CORES
crystal_file filename N1 N2 – optional arguments N1 and N2 define index of the first and the last orientation to be read from the filename
crystal_file filename1 filename2 – used for FWT interpolation, defines source set of orientations (filename1) and larger target set of orientations (filename2). At the moment, only hemisphere Lebedev sets are allowed.
rfprof_file filename
rfprof_file filename N1 N2 – optional arguments N1 and N2 define index of the first and the last parameter to be read from the filename containing rf inhomogeneity profiles
averaging_file filename
averaging_file filename N1 N2 – optional arguments N1 and N2 define index of the first and the last parameter to be read from the filename containing set of parameter values over which the calculation should average. The filename is formatted as follows: first line defines parameters in the same way as accepted for the fsimpson command, other lines define their values. Example:
shift_1_iso | dipole_1_2_aniso | weight |
-10p | -10000 | 0.2 |
+10p | -10000 | 0.2 |
0 | -10000 | 0.2 |
0 | -8000 | 0.2 |
0 | -12000 | 0.2 |
point_per_cycle N – used for direct method in frequency domain, defines number of points to which the one cycle of acquisition Hamiltonian should be split
ED_symmetry val – controls how to deal with possible excitation-detection symmetry: if val is -1 then it is not considered, if val is 0 then it is internally checked and used, if val is 1 then it is not checked and it is directly considered (user responsibility). Default is -1.
oc_grad_level N – defines accuracy level of optimal control gradient, default is 1 but L-BFGS methods requires at least 2 or higher
oc_method – defines optimal control method, can be CG for conjugated gradients (default), or L-BFGS
acq_block {…} – allows easy definition of pulse manipulations during acquisition. The pulse sequence defined in parenthesis should contain one period of the total Hamiltonian. For gamma-COMPUTE the total length has to be the rotor period divided by the number of gamma-angles. For COMPUTE the pulse sequence is repeated such that it fills up multiple of rotor periods.
gcompute can be used whenever the pulse sequence is synchronized over a period equal to the rotor period divided by the number of gamma angles. Otherwise, the following methods are possible
static | spinning | gcompute |
direct | direct | gcompute |
direct freq | direct freq | gcompute time FWTinterpolation |
direct freq ASGinterpolation | gcompute freq | |
direct freq FWTinterpolation | gcompute freq ASGinterpolation | |
direct freq FWTASGinterpolation | gcompute freq FWTinterpolation | |
gcompute freq FWTASGinterpolation | ||
gcompute freq FWT2interpolation | ||
gcompute freq FWT2ASGinterpolation |
all of these methods can be combined with your choice of block-diagonalization, sparse matrix handling, acq_block, etc.
Version 3.1 is a bug-fix release to version 3.0. The bugs were in the OC routines and for the second-order quadrupolar interaction during delays.
This section describes add-ons to the functionality of SIMPSON. All previous features are kept without any change except everything executes much faster for larger spin systems (more than 2 spins). Such speed up was gained thanks to replacing original matrix manipulations hard coded in SIMPSON by full use of BLAS and LAPACK pre-built (and performance optimized) libraries. Extra speed is achieved by implementing fast acquisition algorithms and parallelization.
There are implemented two ways of parallel calculations. The first one makes use of fork mechanism supported on Linux and Mac (not MS windows). The program automatically detects number of processor cores and starts appropriate number of instances. This number can be modified by setting the environment variable SIMPSON_NUM_CORES to a given number.
The second way is based on MPI and needs SIMPSON properly built from sources. Before installing form sources, make sure the following prerequisites are installed:
According to the specific versions and flavours of these prerequisites, it may be necessary to adjust the makefile. The makefile directives below show how to include and link to these packages. To enable MPI, add '-I/path/to/mpi/include' to the include list, '-lmpi' to the library list, and '-DMPI' to the flags list, i.e.:
INCLUDES = -I/usr/include/tcl8.5 -I/usr/lib/openmpi/include
LIBRARIES = -lm -lgslcblas -llapack -ltcl8.5 -lmpi
FLAGS = -DMPI -c -O3
Now, type make to build SIMPSON. Once built, SIMPSON can be run with MPI using mpirun. Make sure to start as many MPI processes as there are processor cores. For instance, in a cluster setup with 8 machines, each equipped with 4 processor cores, one should start 8*4 processes (the fork mechanism from above is not used when using MPI):
mpirun -np 32 simpson <inputfile>
Works for static as well as MAS simulations. Pulse sequence includes new command
acq_block {...}
which defines the whole acquisition period of the simulated experiment. It calculates the whole fid at once. In the curly braces one can define any pulse sequence applied during the acquisition. The sequence is coded using standard simpson commands but must obey certain rules.
In the input file, use coding like this:
par {
method idirect
...
}
proc pulseq {} {
global par
...
pulse ...
...
acq_block {delay $dw}
for decoupling on some channel, or
acq_block {
pulse $tp1 ...
pulse $tp2 ...
delay $td3 ...
...
}
for some periodic rf pulse sequence repeated every two dwell-time periods (and that would work just for MAS at the moment). Here one should be careful with synchronization, when the pulse sequence has to be repeated many times to match MAS, efficiency is lost.
The original simpson did not allow for preparation of density matrix before acquisition. The spin state had to be independent of the g-angle when acquisition starts. This is now released, again using acq_block {} command. Pulse sequence in parenthesis has to have duration of one rotor period divided by the number of g-angles (this originates from ideas behind g-COMPUTE, only then efficient averaging can be used). However, spectral width does not need to be spin_rate*gamma_angles - it just needs to stay simply synchronized like in the example. The following code is possible
par {
...
method igcompute
gamma_angles 10
spin_rate 12000
sw spin_rate*gamma_angles/3
...
}
proc pulseq {} {
global par
...
pulse ...
...
acq_block {
delay [expr 1.0e6/$par(spin_rate)/$par(gamma_angles)]
}
}
Some tricks cutting unnecessary operations has been used and this implementation is not only more general but also slightly faster (20% for 5 spins) than the original gcompute code.