A Python package for extracting and simulating diffuse scattering of X-rays in protein crystals
If you use this software, please cite the following paper: TBC
- phenix/cctbx: runs in cctbx.python (2.7.x / 3.x) and depends on cctbx modules
- dials/dxtbx: used for reading crystallographic image files
- numpy
- scipy
- matplotlib
Make sure cctbx.python is available and the dxtbx package is installed.
Clone the repository
git clone https://github.com/jvdhorn/decaf
Update PATH and PYTHONPATH
source source_env.sh
Optional: install package
cctbx.python -m pip install .
Command-line parameters can be provided using Phil syntax. For example:
schimpy pdb_in=structure.pdb tls_in=optimised_model.json
Run any of the modules without arguments to get an overview of the available parameters for that module.
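To picture how `name=value` arguments such as the ones above are split, here is a minimal sketch. It is not the Phil parser that the modules use; `parse_assignments` is a hypothetical helper that only handles flat assignments and ignores scopes, types and defaults.

```python
import shlex


def parse_assignments(command_line):
    """Split a command line into the program name and a {name: value} dict.

    Hypothetical helper for illustration only; every value is kept as a
    plain string.
    """
    tokens = shlex.split(command_line)
    program, args = tokens[0], tokens[1:]
    params = {}
    for arg in args:
        name, sep, value = arg.partition("=")
        if not sep:
            raise ValueError("expected name=value, got %r" % arg)
        params[name] = value
    return program, params


program, params = parse_assignments(
    'schimpy pdb_in=structure.pdb tls_in=optimised_model.json sc_size="5 5 10"'
)
print(program, params)
```

Quoting a multi-word value, as in `sc_size="5 5 10"`, keeps it together as a single argument.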
- schimpy - SuperCell Hierarchical Model
- stimpy - Statistical Image-processing
- slicemtz - Viewing utility for mtz-file
- submtz - Subtract two mtz-files
- plotmtz - Plot intensity distributions in mtz-files
- ccmtz - Calculate correlation coefficient and R1-values between mtz-files
- rmbragg - Remove supercell voxels around Bragg positions in mtz-file
- mtzstats - Show various statistics including R-int for mtz-file
- stimpy3d - Apply the stimpy-procedure to resolution bins in mtz-file
- patterson - Calculate Patterson map from mtz-file
- pattsize - Estimate the size of the Patterson origin-peak
- mtz2txt - Extract raw intensities from mtz-file
- mtz2map - Convert mtz to ccp4 map-file
- map2mtz - Convert ccp4 map-file to mtz
- filter_mtz - Apply a kernel-filter to an mtz-file
- qdep - Find power law for intensity decay around Bragg positions in mtz-file
- pdbrad - Estimate the size of a pdb-object
- pdbdist - Plot C-alpha covariance matrix for multistate pdb
- covbydist - Plot C-alpha covariances by distance and fit decay function
- ensemble2adp - Convert multistate pdb to (anisotropic) ADPs
- subadp - Subtract ADPs from two pdb-files
- btrace - Plot C-alpha B-factor trace
Parameters for schimpy:
- pdb_in - refined structure file (pdb)
- tls_in - result of the ECHT B-factor distribution (json)
  - if not provided, TLS-matrices are extracted from pdb_in (if available)
- sc_size - size of the supercell (e.g. "5 5 10")
- correlate - enable or disable correlation of TLS-groups (default True)
- use_pbc - enable or disable periodic boundary conditions (default True)
- intra_only - ignore intermolecular interactions for these levels (e.g. "2")
- stretch - stretch parameter for the anchor points (default 0.25)
- cutoff - distance cutoff for intermolecular interactions (default 3.0)
- weights - power for the number of interactions (default 1.5)
- max_level - highest level of the TLS-hierarchy to include (default 1)
- resolution - resolution limit (default 2.0)
- k_sol and b_sol - bulk solvent parameters (default 0.35 and 50.0)
- tls_multipliers - multipliers for all input tls matrices (e.g. "1.0 0.0 0.0")
- skip - skip correlation (but not displacements!) for these levels (e.g. "2 3 4")
- randomize_after - randomly swap positions of all molecules after this level (e.g. 2)
- reverse - start procedure at highest level of the hierarchy (default True)
- remove_waters - remove all water molecules from the input (default True)
- swap_frac - end correlation prematurely after this fraction of swaps (default 1.0)
- energy_percentile - only allow swaps for which local energy exceeds this percentile (default 0.0)
- single_mtz - write MTZ file with phases after first supercell (default False)
- processes - number of parallel simulations (default 1, watch memory usage!)
- interval - number of seconds between consecutive simulations (default 1.0)
- n_models - number of supercells to simulate (default 128)
Parameters for stimpy:
- image - raw image file
- radial - polarization-corrected radial average
- bin_counts - bin regions within this range of counts (default 1.0)
- N - expected number of independent rotations (default 1.0)
- mode - Noisy Wilson distribution treatment (continuous, discrete, gain or dual, default continuous)
- gain - estimate for the gain of the detector (default 1.0)
- median_size - kernel size of the median filter (default 9)
- dilation_size - kernel size of the mask dilation (default 5)
Parameters for slicemtz:
- mtz - input mtz or map-file
- lbl - array of interest (default IDFF)
- slice - desired slice to plot (default hk0)
- save - save image as high-res PNG instead of plotting (default False)
- depth - additional depth of the slice on both sides (default 0)
- sc_size - size of the supercell for drawing Bragg positions (e.g. "5 5 10")
- overlay - colour of axes and Bragg position indicators (default black)
- log - plot log10 of the intensities (default False)
- min and max - minimum and maximum values to plot
- autoscale - decrease upper bound for more detail in skewed slices (default True)
- zoom - zoom in around a given coordinate ("[x] [y] [pad]", e.g. "65 65 27")
- inset - inset zoom around a given coordinate ("[x] [y] [pad]", e.g. "65 65 27")
- contours - plot this many contours instead of raw values (e.g. 127)
- projection - construct a cylindrical projection (gp, eq or mercator)
- center - shift grid to put the corner in the center (default False)
Parameters for submtz:
- mtz_1 - first input mtz-file
- mtz_2 - second input mtz-file (can be multiple)
- lbl_1 - first array of interest (default IDFF)
- lbl_2 - second array of interest (default IDFF)
- mode - use a different operator (sub, add, mul, div or merge, default sub)
- scale - scale factor for second mtz-file (0 for autoscale, default 1.0)
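The element-wise operators selected by `mode` can be pictured with the sketch below. It assumes that `scale` multiplies the second array before the operator is applied and that both arrays are already matched reflection by reflection; neither point is stated here, so treat them as assumptions. The `merge` mode and autoscaling (`scale=0`) are left out because their behaviour is not described. The function `combine` is hypothetical.

```python
import operator

# Operator names follow the `mode` parameter; `merge` is omitted on purpose.
OPERATORS = {
    "sub": operator.sub,
    "add": operator.add,
    "mul": operator.mul,
    "div": operator.truediv,
}


def combine(first, second, mode="sub", scale=1.0):
    """Combine two equally long intensity lists element by element.

    Assumption: the second list is multiplied by `scale` first.
    """
    op = OPERATORS[mode]
    return [op(a, scale * b) for a, b in zip(first, second)]


result = combine([10.0, 20.0, 30.0], [1.0, 2.0, 3.0], mode="sub", scale=2.0)
print(result)
```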
Parameters for plotmtz:
- mtz - input mtz-file (can be multiple)
- lbl - array of interest (default IDFF)
- log - plot logarithmic distributions (default True)
- byres - plot average intensity in resolution bins (default False)
- resolution - low and high resolution (e.g. "3.6 3.4")
Parameters for ccmtz:
- mtz_1 - first input mtz-file
- mtz_2 - second input mtz-file
- lbl_1 - first array of interest (default IDFF)
- lbl_2 - second array of interest (default IDFF)
- bins - number of resolution shells (default 10)
- hlim and klim and llim - limit h, k, and l (e.g. "-20 20")
- resolution - low and high resolution (e.g. "3.6 3.4")
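As a rough guide to the two statistics compared between arrays, the sketch below computes a Pearson correlation coefficient and an R1-value for two matched lists. The R1 form used, sum(|a - b|) / sum(|a|), is the conventional definition applied directly to the values; how the arrays are scaled or binned beforehand is not described here, so that part is an assumption. Both function names are hypothetical.

```python
import math


def pearson_cc(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def r1_value(reference, other):
    """R1 = sum(|reference - other|) / sum(|reference|) (assumed form)."""
    numerator = sum(abs(a - b) for a, b in zip(reference, other))
    denominator = sum(abs(a) for a in reference)
    return numerator / denominator


i_1 = [1.0, 2.0, 3.0, 4.0]
i_2 = [2.0, 4.0, 6.0, 8.0]  # same shape, different scale
cc = pearson_cc(i_1, i_2)
r1 = r1_value(i_1, i_2)
print(cc, r1)
```

The example shows why both numbers are useful: the correlation is insensitive to an overall scale factor, while R1 is not.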
Parameters for rmbragg:
- mtz - input mtz-file
- sc_size - supercell size (e.g. "5 5 10")
- box - number of voxels to remove around every Bragg position (default "1 1 1")
- fraction - remove this fraction of highest intensities in every box (e.g. 0.05)
- subtract - subtract common intensities in resolution shells (mean or min)
- keep - invert selection (default False)
Parameters for mtzstats:
- mtz - input mtz-file
- lbl - array of interest (default IDFF)
- sg - space group for R-int (symbol or number, e.g. P43212 or 96)
- resolution - low and high resolution (e.g. "3.6 3.4")
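For reference, R-int is conventionally defined as sum(|I - <I>|) / sum(I), where <I> is the mean over each group of symmetry-equivalent reflections. The sketch below applies that formula to intensities that are assumed to be grouped already; deriving the groups from the space group is not shown, and `r_int` is a hypothetical name.

```python
def r_int(groups):
    """Conventional R-int over groups of symmetry-equivalent intensities."""
    numerator = 0.0
    denominator = 0.0
    for intensities in groups:
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


# Each inner list holds the observations of one unique reflection.
groups = [[10.0, 12.0], [5.0, 5.0], [8.0, 6.0, 10.0]]
value = r_int(groups)
print(value)
```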
Parameters for stimpy3d:
- mtz - input mtz-file
- lbl - array of interest (default IDFF)
- bins - number of resolution shells (default 50)
- mode - Noisy Wilson distribution treatment (continuous or dual, default continuous)
- N - expected number of independent rotations (default 1.0)
Parameters for patterson:
- mtz - input mtz-file
- lbl - array of interest (default IDFF)
- bins - divide into this many shells and subtract the mean from each one (e.g. 50)
- sample - sampling of the grid in 1/Angstrom, higher is finer (default 3.0)
- use_intensities - calculate Patterson using intensities instead of F (default False)
- center - place the origin in the center of the map (default True)
- limit - real space limit in Angstrom of the output map (e.g. 10.0)
- resolution - low and high resolution (e.g. "3.6 3.4")
Parameters for pattsize:
- map - input Patterson map
- binsize - size of radial bins in Angstrom (default 1.0)
- sigma - sigma cutoff to determine size (default 2.0)
Parameters for mtz2txt:
- mtz - input mtz-file
- lbl - array of interest (default IDFF)
- resolution - low and high resolution (e.g. "3.6 3.4")
Parameters for mtz2map:
- mtz - input mtz-file
- lbl - array of interest (default IDFF)
- fill - fill value for missing reflections (default -1000)
- resolution - low and high resolution (e.g. "3.6 3.4")
Parameters for map2mtz:
- map - input map-file
- resolution - resolution limit (e.g. 2.0)
Parameters for filter_mtz:
- mtz - input mtz-file
- lbl - array of interest (default IDFF)
- size - filter size (default 1)
- filter - filter type (gaussian or uniform, default gaussian)
- interpolate - interpolate missing intensities using a normalized convolution (default False)
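A normalized convolution divides the filtered data by the filtered mask of valid points, so missing values do not drag the result down and gaps are filled from their neighbours. The sketch below shows the idea in one dimension with a uniform kernel. Treating `size` as the number of neighbours on either side is an assumption, the helper name is hypothetical, and real intensity data would be filtered on a three-dimensional grid.

```python
def normalized_uniform_filter(values, size=1):
    """1-D uniform filter that ignores missing entries (None).

    Each output is the sum of the valid values within `size` points on
    either side, divided by the number of valid values in that window.
    """
    filtered = []
    for i in range(len(values)):
        lo = max(0, i - size)
        hi = min(len(values), i + size + 1)
        window = [v for v in values[lo:hi] if v is not None]
        # A window without any valid value stays missing.
        filtered.append(sum(window) / len(window) if window else None)
    return filtered


intensities = [1.0, None, 3.0, 5.0]
smoothed = normalized_uniform_filter(intensities, size=1)
print(smoothed)
```

The missing second entry is replaced by the mean of its two valid neighbours rather than being treated as zero.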
Parameters for qdep:
- mtz - input mtz-file
- lbl - array of interest (default IDFF)
- sc_size - supercell size (e.g. "5 5 10")
- strong - consider only this number of strongest reflections (default 100)
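A power law I(q) = A * q^alpha becomes a straight line in log-log space, so its exponent can be estimated with an ordinary least-squares fit. The sketch below shows that generic approach on synthetic data; it is an assumption that a fit of this kind is what happens internally, and `fit_power_law` is a hypothetical name.

```python
import math


def fit_power_law(q, intensity):
    """Least-squares fit of log(I) = log(A) + alpha * log(q).

    Returns (A, alpha). All inputs must be positive.
    """
    xs = [math.log(v) for v in q]
    ys = [math.log(v) for v in intensity]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), slope


# Synthetic decay away from a Bragg position with exponent -2.
q = [0.01, 0.02, 0.04, 0.08]
intensity = [5.0 * v ** -2 for v in q]
amplitude, exponent = fit_power_law(q, intensity)
print(amplitude, exponent)
```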
Parameters for pdbrad:
- pdb - input multistate pdb-file
Parameters for pdbdist:
- pdb - input multistate pdb-file (can be multiple)
- mode - ensemble treatment (cov, cc, std, var or mean, default cov)
- combine - multiple input treatment (both, sub, div, add or mul, default both)
- lines - plot lines at these x and y-positions (e.g. "25.5 75.5")
Parameters for covbydist:
- pdb - input multistate pdb-file
- include_neighbours - include neighbouring molecules in the analysis (default True)
Parameters for ensemble2adp:
- pdb - input multistate pdb-file
- models - limit number of models from the input pdb (e.g. 100)
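An anisotropic ADP can be read as the 3x3 covariance matrix U of an atom's positions across the models of an ensemble, with the equivalent isotropic B-factor given by 8 * pi^2 * trace(U) / 3. The sketch below computes both for a single atom. Dividing by the number of models (population covariance) is an assumption, and both function names are hypothetical.

```python
import math


def anisotropic_u(positions):
    """3x3 covariance matrix of one atom's (x, y, z) positions over models.

    Assumption: population covariance, i.e. division by the model count.
    """
    n = len(positions)
    mean = [sum(p[k] for p in positions) / n for k in range(3)]
    return [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in positions) / n
             for j in range(3)]
            for i in range(3)]


def b_iso(u):
    """Equivalent isotropic B-factor: 8 * pi^2 * trace(U) / 3."""
    return 8.0 * math.pi ** 2 * (u[0][0] + u[1][1] + u[2][2]) / 3.0


# Positions (in Angstrom) of the same atom in four models.
models = [(0.0, 0.0, 0.0), (0.2, 0.0, 0.0), (0.0, 0.2, 0.0), (0.2, 0.2, 0.0)]
u = anisotropic_u(models)
print(u, b_iso(u))
```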
Parameters for subadp:
- pdb_1 - first input pdb-file
- pdb_2 - second input pdb-file (can be multiple)
- mode - use a different operator (sub or add, default sub)
Parameters for btrace:
- input - input pdb or json (can be multiple)
- lines - plot vertical lines at these x-positions (e.g. "25.5 75.5")
This software relies on methods described in the following papers:
- Chapman, Henry N., et al. "Continuous diffraction of molecules and disordered molecular crystals." Journal of Applied Crystallography 50.4 (2017): 1084-1103.
- Pearce, Nicholas M., and Piet Gros. "A method for intuitively extracting macromolecular dynamics from structural disorder." Nature Communications 12.1 (2021): 5493.
- Urzhumtsev, Alexandre, et al. "From deep TLS validation to ensembles of atomic models built from elemental motions. Addenda and corrigendum." Acta Crystallographica Section D: Structural Biology 72.9 (2016): 1073-1075.