cctbx / dxtbx

This project forked from dials/dxtbx

Diffraction Experiment Toolbox

License: BSD 3-Clause "New" or "Revised" License

Python 72.45% C++ 26.79% CMake 0.71% C 0.01% Gherkin 0.03%

dxtbx's Introduction

This repository uses submodules to track the versions of different repositories.

This is used to build the source tarball for cctbx releases and to test the conda packages from conda-forge and cctbx-nightly.

This repository is not for general use.

dxtbx's People

ndevenish exafel antonyvam jblaschke graeme-winter tiankunzhou toastisme jbeilstenedmands lifehasorder jamesrhester christianbecke flexxbeamline

dxtbx's Issues

dxtbx.to_xds broken

cs03r-sc-serv-16 to_xds :) $ dials.import ../aps-zw-dials-bad/*cbf
DIALS (2018) Acta Cryst. D74, 85-97. https://doi.org/10.1107/S2059798317017235
DIALS 2.dev.905-g8220f3939
The following parameters have been modified:

input {
  experiments = <image files>
}

--------------------------------------------------------------------------------
  format: <class 'dxtbx.format.FormatCBFMiniEiger.FormatCBFMiniEiger'>
  num images: 3600
  num sweeps: 1
  num stills: 0
--------------------------------------------------------------------------------
Writing experiments to imported.expt
cs03r-sc-serv-16 to_xds :) $ dxtbx.to_xds imported.expt 
Traceback (most recent call last):
  File "/dls/science/users/gw56/svn/cctbx/build/../modules/dxtbx/command_line/to_xds.py", line 35, in <module>
    run(sys.argv[1:])
  File "/dls/science/users/gw56/svn/cctbx/build/../modules/dxtbx/command_line/to_xds.py", line 29, in run
    sweep = ImageSetFactory.new(file_names)[0]
  File "/dls/science/users/gw56/svn/cctbx/modules/dxtbx/imageset.py", line 331, in new
    iset = ImageSetFactory._create_imageset(filelist, check_headers)
  File "/dls/science/users/gw56/svn/cctbx/modules/dxtbx/imageset.py", line 434, in _create_imageset
    return format_class.get_imageset(filenames, as_imageset=True)
AttributeError: 'NoneType' object has no attribute 'get_imageset'

replace format registry metaclass with entry points

The downside of the metaclass approach is that every class needs to be imported for registration. It also apparently requires a kludge for Python 3. We might as well get rid of it and use the same mechanism that we use in dlstbx for service registration, or in dxtbx/dials for profile and scaling models.

Machine-dependent ordering of format classes

If multiple format classes match a given image, then the particular format that "wins out" appears to be machine-dependent, which can cause subtle bugs resulting in apparent differences in behaviour of downstream programs (spotfinding, indexing, etc...).

EIGER pixels as unsigned int?

I'm just noting a user-reported bug (I have not checked this myself).

The dxtbx class for EIGER treats pixel values as unsigned int. Thus, invalid pixels which contain -2 become huge numbers.
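The reported effect can be reproduced without dxtbx. This is a minimal sketch using only the standard library; the 32-bit width is an assumption, since the report does not state the pixel depth, and reinterpret_as_unsigned is a name made up for illustration:

```python
import struct


def reinterpret_as_unsigned(value: int) -> int:
    """Pack a signed 32-bit integer and read the same four bytes back as unsigned."""
    return struct.unpack("<I", struct.pack("<i", value))[0]


invalid_pixel = -2  # marker value for an invalid pixel, as in the report
print(reinterpret_as_unsigned(invalid_pixel))  # 4294967294
```

Valid non-negative counts are unchanged by the reinterpretation, so only the negative marker values are affected.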

Mosaic crystal.change_basis returns non-mosaic crystal

Currently MosaicCrystalSauter2014.change_basis returns a Crystal model instead of a MosaicCrystalSauter2014. Likely MosaicCrystalSauter2014 needs a proper copy constructor or change_basis needs to respect the polymorphism.

See xfailing test test_change_basis_mosaic_crystal in dxtbx/tests/model/test_crystal_model.py
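To illustrate what "respecting the polymorphism" could look like, here is a pure-Python stand-in. The real models are C++ classes, so this is only a sketch of the idea; the class names, the setting attribute and the mosaicity attribute are all hypothetical:

```python
import copy


class Crystal:
    """Toy crystal model that stores only a setting label."""

    def __init__(self, setting="a,b,c"):
        self.setting = setting

    def change_basis(self, op):
        # Copy via the runtime type so that subclass instances stay subclasses
        # and keep their extra attributes.
        new = copy.copy(self)
        new.setting = op
        return new


class MosaicCrystal(Crystal):
    """Toy subclass carrying one extra attribute."""

    def __init__(self, setting="a,b,c", mosaicity=0.0):
        super().__init__(setting)
        self.mosaicity = mosaicity


reindexed = MosaicCrystal(mosaicity=0.1).change_basis("b,c,a")
print(type(reindexed).__name__, reindexed.mosaicity)
```

The point of the sketch is that the base-class method never names Crystal explicitly when building its return value.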

Saturation / underload values for HDF5 / NeXus

                # Get the trusted range of pixel values
                underload = (
                    float(nx_detector.handle["underload_value"][()])
                    if "underload_value" in nx_detector.handle
                    else -400
                )
                overload = (
                    float(nx_detector.handle["saturation_value"][()])
                    if "saturation_value" in nx_detector.handle
                    else 90000
                )

seems arbitrary at best

Grey-Area work :( $ h5dump -d /entry/instrument/detector/saturation_value ../i03-0013_2_4_master.h5 
HDF5 "../i03-0013_2_4_master.h5" {
DATASET "/entry/instrument/detector/saturation_value" {
   DATATYPE  H5T_STD_I64LE
   DATASPACE  SCALAR
   DATA {
   (0): 65535
   }
}
}

is inconsistent with the actual data type; presumably the data type should win?

Possible to create invalid Scan with negative oscillation

This comes from a user, who has SMV format images with OSC_RANGE=-0.10; in the header, which ends up as a scan with a negative oscillation (rather than a positive oscillation and an inverted rotation axis, which is the usual way of handling this).

Scan:
    image range:   {0,1060}
    oscillation:   {63.39,-0.1}
    exposure time: 0.096288

It seems that scans with a negative oscillation are invalid. For example, make a scan with negative oscillation:

from dxtbx.model import ScanFactory
scan=ScanFactory.make_scan(image_range = (1,90), oscillation=(0, -1.0),
  exposure_times=0.1, epochs=range(90), deg=True)

Now, use scan.is_angle_valid to check the range of valid angles between -360 and +360 degrees. What you find is that [-360,-90] ∪ [0, 270] is reported as valid. So it looks like a 270 degree scan rather than a 90 degree scan.

By contrast, set up a normal scan with positive oscillation:

scan=ScanFactory.make_scan(image_range = (1,90), oscillation=(0, 1.0),
  exposure_times=0.1, epochs=range(90), deg=True)

Now the is_angle_valid check tells us that [-360, -270] ∪ [0, 90] is valid, as expected.
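The "usual way" mentioned above relies on a rotation by phi about an axis being the same as a rotation by -phi about the inverted axis. A minimal sketch of that normalisation, in plain Python rather than the dxtbx API (normalise_scan is a hypothetical helper, and the axis (1, 0, 0) is an assumed example):

```python
def normalise_scan(axis, oscillation):
    """Return (axis, oscillation) with a non-negative oscillation width.

    A negative width is traded for an inverted rotation axis; the start
    angle changes sign as well so the same orientations are described.
    """
    start, width = oscillation
    if width >= 0:
        return tuple(axis), (start, width)
    flipped = tuple(0.0 - a for a in axis)
    return flipped, (-start, -width)


# Values from the user's scan above; the axis is an assumed example.
axis, oscillation = normalise_scan((1.0, 0.0, 0.0), (63.39, -0.1))
print(axis, oscillation)
```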

Errors importing gzipped cbf

I'm seeing errors when trying to import the gzipped version of cbf files that work perfectly on their own. Example file: /dls/science/users/mep23677/cbfgz_dxtbx72_0001.cbf. Still poking this myself but it shows in 1.14 and 2.0. Curiously, taking the tutorial C2Sum betalactamase data doesn't exhibit this; even decompressing and recompressing works with those, so maybe something to do with FormatCBFFullPilatus?

$ dials.import cbfgz_dxtbx72_0001.cbf
DIALS (2018) Acta Cryst. D74, 85-97. https://doi.org/10.1107/S2059798317017235
DIALS 1.14.9-g0c59d74b8-release
...
Writing datablocks to datablock.json

$ gzip cbfgz_dxtbx72_0001.cbf
$ dials.import cbfgz_dxtbx72_0001.cbf.gz
DIALS (2018) Acta Cryst. D74, 85-97. https://doi.org/10.1107/S2059798317017235
DIALS 1.14.9-g0c59d74b8-release
CBFlib: warning input line 1 (1) -- invalid character
CBFlib: warning input line 1 (2) -- invalid character
CBFlib: warning input line 1 (3) -- invalid character
CBFlib: warning input line 1 (4) -- invalid character
CBFlib: warning input line 1 (6) -- invalid character
CBFlib: warning input line 1 (9) -- invalid character
CBFlib: warning input line 1 (23) -- invalid character
CBFlib: warning input line 1 (24) -- invalid character
CBFlib: warning input line 1 (26) -- invalid character
CBFlib: warning input line 1 (28) -- invalid character
CBFlib: warning input line 1 (30) -- invalid character
CBFlib: warning input line 1 (1) -- no data block
CBFlib: error input line 1 (1) -- syntax error
CBFlib: warning input line 1 (1) -- data block (null) ends with no content
Traceback (most recent call last):
  File "/dls_sw/apps/dials/dials-v1-14-9/build/../modules/dials/command_line/dials_import.py", line 883, in <module>
    halraiser(e)
  File "/dls_sw/apps/dials/dials-v1-14-9/build/../modules/dials/command_line/dials_import.py", line 881, in <module>
    script.run()
  File "/dls_sw/apps/dials/dials-v1-14-9/build/../modules/dials/command_line/dials_import.py", line 738, in run
    params, options = self.parser.parse_args(args=args, show_diff_phil=False)
  File "/dls_sw/apps/dials/dials-v1-14-9/modules/dials/util/options.py", line 899, in parse_args
    quick_parse=quick_parse,
  File "/dls_sw/apps/dials/dials-v1-14-9/modules/dials/util/options.py", line 592, in parse_args
    format_kwargs=format_kwargs,
  File "/dls_sw/apps/dials/dials-v1-14-9/modules/dials/util/options.py", line 233, in __init__
    format_kwargs,
  File "/dls_sw/apps/dials/dials-v1-14-9/modules/dials/util/options.py", line 305, in try_read_datablocks_from_images
    format_kwargs=format_kwargs,
  File "/dls_sw/apps/dials/dials-v1-14-9/modules/cctbx_project/dxtbx/datablock.py", line 1091, in from_filenames
    format_kwargs=format_kwargs,
  File "/dls_sw/apps/dials/dials-v1-14-9/modules/cctbx_project/dxtbx/datablock.py", line 545, in __init__
    format_kwargs=format_kwargs,
  File "/dls_sw/apps/dials/dials-v1-14-9/modules/cctbx_project/dxtbx/datablock.py", line 605, in _extract_file_metadata
    fmt = format_class(filename, **format_kwargs)
  File "/dls_sw/apps/dials/dials-v1-14-9/modules/cctbx_project/dxtbx/format/FormatCBFFullPilatus.py", line 41, in __init__
    FormatCBFFull.__init__(self, image_file, **kwargs)
  File "/dls_sw/apps/dials/dials-v1-14-9/modules/cctbx_project/dxtbx/format/FormatCBFFull.py", line 45, in __init__
    FormatCBF.__init__(self, image_file, **kwargs)
  File "/dls_sw/apps/dials/dials-v1-14-9/modules/cctbx_project/dxtbx/format/FormatCBF.py", line 58, in __init__
    Format.__init__(self, image_file, **kwargs)
  File "/dls_sw/apps/dials/dials-v1-14-9/modules/cctbx_project/dxtbx/format/Format.py", line 203, in __init__
    self.setup()
  File "/dls_sw/apps/dials/dials-v1-14-9/modules/cctbx_project/dxtbx/format/Format.py", line 216, in setup
    goniometer_instance = self._goniometer()
  File "/dls_sw/apps/dials/dials-v1-14-9/modules/cctbx_project/dxtbx/format/FormatCBFFull.py", line 68, in _goniometer
    return self._goniometer_factory.imgCIF_H(self._get_cbf_handle())
  File "/dls_sw/apps/dials/dials-v1-14-9/modules/cctbx_project/dxtbx/format/FormatCBFFull.py", line 62, in _get_cbf_handle
    self._cbf_handle.read_widefile(self._image_file, pycbf.MSG_DIGEST)
  File "/dls_sw/apps/dials/dials-v1-14-9/build/lib/pycbf.py", line 3285, in read_widefile
    return _pycbf.cbf_handle_struct_read_widefile(self, filename, headers)
Exception: CBFlib Error(s): CBF_FORMAT
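The traceback shows the file name being handed straight to read_widefile, and the "invalid character" warnings on line 1 would be consistent with compressed bytes being parsed as text. That is a guess, not a diagnosis. For reference, a minimal standard-library sketch of detecting gzip data by its magic bytes before reading (open_maybe_gzipped is a hypothetical helper, not something dxtbx provides):

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_maybe_gzipped(path):
    """Open path for binary reading, decompressing on the fly if it is gzip data."""
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == GZIP_MAGIC:
        return gzip.open(path, "rb")
    return open(path, "rb")
```

Checking the content rather than the .gz extension means renamed files are still handled.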

Limitation of one scan per file

As far as I can tell there is a fundamental assumption in dxtbx of no more than one scan per input file:

https://github.com/cctbx/cctbx_project/blob/cc0cfc8ea0d3a8b1a230feac88b2226163695f62/dxtbx/datablock.py#L520-L533

For screening images from EIGER detectors this may not be the case, as a single .h5 file may contain e.g. 3 images as 0, 45, 90, which should be interpreted as 3 scans separated by 45 degrees, i.e. 1 file -> 3 scan objects.

E.g. /dls/i04/data/2018/cm19645-5/Eiger/screening/SeThaumatin_8_1_hacked_master.h5
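One possible way to derive scans from per-image start angles is to start a new scan wherever consecutive angles are not contiguous. A rough sketch, not tied to the datablock code; split_into_scans is a hypothetical helper and the 0.1 degree width is an assumed value:

```python
def split_into_scans(angles, width, tol=1e-6):
    """Group image start angles into contiguous scans.

    Two consecutive images share a scan when the second starts where the
    first ends (start + width). Returns 0-based (first, last) index pairs.
    """
    if not angles:
        return []
    scans = []
    first = 0
    for i in range(1, len(angles)):
        if abs(angles[i] - (angles[i - 1] + width)) > tol:
            scans.append((first, i - 1))
            first = i
    scans.append((first, len(angles) - 1))
    return scans


print(split_into_scans([0, 45, 90], width=0.1))  # [(0, 0), (1, 1), (2, 2)]
```

With the screening example above this gives three single-image scans, i.e. 1 file -> 3 scan objects.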

Errors importing miniCBF from dlsnxs2cbf

import.zip

File attached, from dlsnxs2cbf. It used to import fine; it now gives:

Grey-Area dlsnxs2cbf :( $ dials.import therm_0001.cbf 
DIALS (2018) Acta Cryst. D74, 85-97. https://doi.org/10.1107/S2059798317017235
DIALS 2.dev.757-gacb12b9ec
Traceback (most recent call last):
  File "/Users/graeme/svn/cctbx/build/../modules/dials/command_line/dials_import.py", line 902, in <module>
    script.run()
  File "/Users/graeme/svn/cctbx/build/../modules/dials/command_line/dials_import.py", line 747, in run
    params, options = self.parser.parse_args(args=args, show_diff_phil=False)
  File "/Users/graeme/svn/cctbx/modules/dials/util/options.py", line 853, in parse_args
    quick_parse=quick_parse,
  File "/Users/graeme/svn/cctbx/modules/dials/util/options.py", line 565, in parse_args
    load_models=load_models,
  File "/Users/graeme/svn/cctbx/modules/dials/util/options.py", line 231, in __init__
    load_models,
  File "/Users/graeme/svn/cctbx/modules/dials/util/options.py", line 304, in try_read_experiments_from_images
    load_models=load_models,
  File "/Users/graeme/svn/cctbx/modules/dxtbx/model/experiment_list.py", line 506, in from_filenames
    format_kwargs=format_kwargs,
  File "/Users/graeme/svn/cctbx/modules/dxtbx/datablock.py", line 1030, in from_filenames
    format_kwargs=format_kwargs,
  File "/Users/graeme/svn/cctbx/modules/dxtbx/datablock.py", line 487, in __init__
    format_kwargs=format_kwargs,
  File "/Users/graeme/svn/cctbx/modules/dxtbx/datablock.py", line 543, in _extract_file_metadata
    fmt = format_class(filename, **format_kwargs)
  File "/Users/graeme/svn/cctbx/modules/dxtbx/format/FormatCBFMiniEigerDLS16MSN160.py", line 50, in __init__
    super(FormatCBFMiniEigerDLS16MSN160, self).__init__(image_file, **kwargs)
  File "/Users/graeme/svn/cctbx/modules/dxtbx/format/FormatCBFMini.py", line 66, in __init__
    super(FormatCBFMini, self).__init__(image_file, **kwargs)
  File "/Users/graeme/svn/cctbx/modules/dxtbx/format/FormatCBF.py", line 65, in __init__
    super(FormatCBF, self).__init__(str(image_file), **kwargs)
  File "/Users/graeme/svn/cctbx/modules/dxtbx/format/Format.py", line 146, in __init__
    self.setup()
  File "/Users/graeme/svn/cctbx/modules/dxtbx/format/Format.py", line 162, in setup
    detector_instance = self._detector()
  File "/Users/graeme/svn/cctbx/modules/dxtbx/format/FormatCBFMiniEiger.py", line 114, in _detector
    for f0, f1, s0, s1 in determine_eiger_mask(detector):
  File "/Users/graeme/svn/cctbx/modules/dxtbx/format/FormatPilatusHelpers.py", line 212, in determine_eiger_mask
    n_fast, remainder = divmod(size[0], detector.module_size_fast)
AttributeError: Please report this error to [email protected]: 'NoneType' object has no attribute 'module_size_fast'

Suspect this is related to the logical cleanup from @rjgildea.

cbf file in zip file attached.

Registry: pass along objects from one understand method to the next

Currently if one understand method from a parent Format class wishes to pass information to children Format classes, there is no way to do this. For example, if FormatCBF's understand method reads the header and returns True, then each of the children classes (mini, full, etc) re-opens the file and re-reads the header to see if it can understand it. Instead, if the understand method returned a tuple:

True, anyobject

False, None

Then the registry could pass on anyobject to the children. Perhaps anyobject is a header dictionary from FormatSMV, or a detector address interpreted by FormatXTC. It's implementation specific.

This would reduce file reads and allow communication of derived information to children classes.

It could be done with a minimum amount of intrusion by updating registry.py to look at what the understand method returned. If it's a bool, just call the children's understand methods. If it's a tuple, pass along the second object too.
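A rough sketch of how the lookup could treat both return styles. This is not the current registry.py: the toy classes, the extra inherited argument on understand, and read_header are all made up for illustration:

```python
HEADER_READS = []  # records every simulated header read


def read_header(image_file):
    """Stand-in for an expensive file read (hypothetical helper)."""
    HEADER_READS.append(image_file)
    return {"flavour": "mini"}


class FormatParent:
    children = []

    @staticmethod
    def understand(image_file, inherited=None):
        header = read_header(image_file)
        return True, header  # new style: (understood, object for children)


class FormatChild:
    children = []

    @staticmethod
    def understand(image_file, inherited=None):
        # Old-style bool return; reuses the parent's header when one is given.
        header = inherited if inherited is not None else read_header(image_file)
        return header.get("flavour") == "mini"


FormatParent.children = [FormatChild]


def find_format(format_class, image_file, inherited=None):
    """Return the most specific class that understands image_file, or None."""
    result = format_class.understand(image_file, inherited)
    if isinstance(result, tuple):
        understood, passed_on = result
    else:
        understood, passed_on = bool(result), inherited
    if not understood:
        return None
    for child in format_class.children:
        match = find_format(child, image_file, passed_on)
        if match is not None:
            return match
    return format_class
```

In this toy, find_format(FormatParent, "x.cbf") resolves to FormatChild with a single entry in HEADER_READS, which is the saving the proposal is after.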

Thoughts?

FormatNeXusStill picks up non-still images

Got Jungfrau data from SLS which is rotation data, but FormatNeXusStill claims it as its own, then fails.

Panel pedestal does not work?

I've just come across the panel pedestal attribute. This seems to be a fairly recent innovation (this year at least: f96a16d) but as far as I can tell it does not work.

The "pedestal" should be subtracted as a dark image here:

dxtbx/imageset.h

Lines 680 to 685 in d95a922

    // Apply dark
    if (p.size() > 0) {
      for (std::size_t j = 0; j < r.size(); ++j) {
        c[j] = c[j] - p[j];
      }
    }

however, even if panel.pedestal is set to something non-zero, this block is never called because p.size() == 0.

Use of numpy inside get_raw_data causes seg fault

This command fails for me:

  dials.find_spots $DIALS_REGRESSION/spotfinding_test_data/idx-s00-20131106040302615.cbf

on a fresh up-to-date build on Ubuntu 18.04 that started with bootstrap.py --use-conda. This command is what is tested by test_spotfinder.test_find_spots_with_xfel_stills. It fails with a seg fault where the head of the stack trace points here:

show_stack(1): /home/fcx32934/sw/cctbx/modules/dxtbx/format/FormatCBFMultiTileHierarchy.py(351) get_raw_data

which is a line calling numpy functions:

array = flex.double(numpy.frombuffer(array_string, numpy.float))

It has proven difficult to reproduce this outside of the image reading context - I suspect the issue is related to the fact that the Python code that imports numpy is being called already from inside C++ (dxtbx::ImageSet::get_raw_data).

Running python pickle within C++ causing trouble

On a Python 3 installation this

pytest --runxfail -v tests/model/test_experiment_list.py::test_experimentlist_dumper_dump_empty_sweep

fails with

self = <dxtbx_imageset_ext.ImageSweep object at 0x7f4eeed9db90>

    def params(self):
        """ Get the parameters """
>       return self.data().get_params()
E       UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

dxtbx/imageset.py:100: UnicodeDecodeError

It can be traced to these constructs:

dxtbx/boost_python/imageset_ext.cc

Lines 30 to 43 in f877b3f

    /**
     * Unpickle a python object from a string
     */
    boost::python::object pickle_loads(std::string x) {
      if (x == "") {
        return boost::python::object();
      }
      boost::python::object main = boost::python::import("__main__");
      boost::python::object global(main.attr("__dict__"));
      boost::python::object result =
        exec("def loads(x):import pickle; return pickle.loads(x)", global, global);
      boost::python::object loads = global["loads"];
      return loads(x);
    }
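The 0x80 at position 0 in the error message matches binary pickle data being decoded as text: pickles written with protocol 2 or later begin with that byte. A small standard-library sketch of the symptom only (it does not reproduce the Boost.Python string conversion itself):

```python
import pickle

payload = pickle.dumps({"a": 1}, protocol=2)
print(payload[:1])  # b'\x80'

try:
    payload.decode("utf-8")  # what treating the pickle as text amounts to
except UnicodeDecodeError as error:
    print(error)

# Kept as bytes, the same payload round-trips without trouble.
print(pickle.loads(payload))
```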

Move dxtbx tickets from cctbx_project here

dials.import does not import hdf5 screening images as a sequence

Underlying bug in dxtbx -

         DATASET "omega" {
            DATATYPE  H5T_IEEE_F64LE
            DATASPACE  SIMPLE { ( 3 ) / ( 3 ) }
            DATA {
            (0): 0, 45, 90
            }
            ATTRIBUTE "depends_on" {
               DATATYPE  H5T_STRING {
                  STRSIZE 2;
                  STRPAD H5T_STR_NULLTERM;
                  CSET H5T_CSET_ASCII;
                  CTYPE H5T_C_S1;
               }
               DATASPACE  SCALAR
               DATA {
               (0): "."
               }
            }

imports as

--------------------------------------------------------------------------------
  format: <class 'dxtbx.format.FormatNexus.FormatNexus'>
  num images: 3
  num sweeps: 1
  num stills: 0
--------------------------------------------------------------------------------
Writing datablocks to datablock.json

Experiment.is_sweep/is_still change broke I23 integration

See: dials/dials#759

This was definitely broken by dials@e7fc848

Experiments that are sweeps are being loaded with empty scans - possibly removed?

(actively working on this)

Beam.get_direction: proposed name change

The dxtbx Beam class has a method called Beam.get_direction, but unfortunately this invites confusion. Without looking into the details of Beam, a developer might reasonably expect this to return a unit vector pointing in the direction of propagation of the beam. In fact, it returns the inverse of that, i.e. the sample-to-source direction (the method Beam.get_unit_s0 does return what is expected). This trap has claimed at least one victim (cctbx/cctbx_project#133) and needs some care to disentangle.

The imgCIF coordinate system describes this direction using the 'source axis'.
http://www.iucr.org/__data/iucr/cifdic_html/2/cif_img.dic/Caxis.html

Axis 3 (Z): The Z-axis is derived from the source axis which goes from
the sample to the source. The Z-axis is the component of the source axis
in the direction of the source orthogonal to the X-axis in the plane
defined by the X-axis and the source axis.

Perhaps we could come up with a better method name that doesn't conceal an ambiguity. Maybe Beam.get_unit_source_axis or Beam.get_source_direction?
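To make the sign convention concrete, a plain-Python sketch of the relation as I understand it. It assumes s0 has length 1/wavelength and points along the propagation direction; the function names are illustrative and are not the Beam API:

```python
import math


def unit(vector):
    """Return vector scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return tuple(x / norm for x in vector)


def s0_from_source_direction(sample_to_source, wavelength):
    """Incident beam vector s0 from the sample-to-source direction.

    Assumption: s0 points source -> sample with length 1 / wavelength.
    """
    direction = unit(sample_to_source)
    return tuple(0.0 - x / wavelength for x in direction)


source_axis = (0.0, 0.0, 1.0)  # sample -> source, as in imgCIF
s0 = s0_from_source_direction(source_axis, wavelength=2.0)
print(s0)        # (0.0, 0.0, -0.5)
print(unit(s0))  # (0.0, 0.0, -1.0)
```

The unit s0 is the exact opposite of the source axis, which is the ambiguity the proposed rename is meant to remove.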

Eiger module sizes / mask etc. a mess

Due to (i) changes in writing of fast, slow and (ii) Eiger 2X being different to Eiger X the whole area is a mess. Looking into this properly now (for 4, 9, 16M) x (Eiger X, Eiger 2X)

For info from web pages:

16M: 4150 x 4371
9M: 3110 x 3269
4M: 2070 x 2167

2X 16M: 4148 x 4362
2X 9M: 3108 x 3262
2X 4M: 2068 x 2162
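The sizes above can be reproduced from a module size, a gap and a module count. The module and gap values below are assumptions on my part (they are not in the web-page numbers quoted above), chosen because they reproduce all six sizes:

```python
# Assumed geometry in pixels: (module fast, module slow, gap fast, gap slow)
EIGER_X = (1030, 514, 10, 37)
EIGER_2X = (1028, 512, 12, 38)

# Assumed module layout (count along fast, count along slow) per model
LAYOUT = {"4M": (2, 4), "9M": (3, 6), "16M": (4, 8)}


def extent(n_modules, module_size, gap):
    """Pixels spanned by n_modules modules separated by n_modules - 1 gaps."""
    return n_modules * module_size + (n_modules - 1) * gap


def detector_size(geometry, model):
    """Return (fast, slow) image size in pixels for the given model."""
    module_fast, module_slow, gap_fast, gap_slow = geometry
    n_fast, n_slow = LAYOUT[model]
    return (
        extent(n_fast, module_fast, gap_fast),
        extent(n_slow, module_slow, gap_slow),
    )


for name, geometry in (("X", EIGER_X), ("2X", EIGER_2X)):
    for model in ("16M", "9M", "4M"):
        print(name, model, detector_size(geometry, model))
```

Under these assumptions the first number of each pair above is the fast dimension and the second the slow one.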

investigate passing tests

These tests passed on Python 3, which in itself is fine.
Unfortunately, they passed even though the code under test was broken (fixed in 3f129f1), because the test ran through a try: ... except: pass block (removed in 08541ea).
Had the tests actually exercised the code under test, they would have caught that the code was never run. There is therefore an opportunity here to improve the tests.

format/test_cbf_mini_as_file.py::test_cbf_writer[image_examples/ALS_831/q315r_lyso_001.img]
format/test_cbf_mini_as_file.py::test_cbf_writer[image_examples/DLS_I02/X4_wide_M1S4_1_0001.cbf]
tests/test_datablock.py::test_create_single_sweep
tests/test_datablock.py::test_create_multiple_sweeps
tests/test_datablock.py::test_from_null_sweep
tests/model/test_experiment_list.py::test_experimentlist_factory_from_datablock
tests/model/test_experiment_list.py::test_load_models
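For anyone unfamiliar with the failure mode, a toy illustration of how a blanket try/except can make a check pass while the code under test is broken. The function names are invented and this is not the actual test code:

```python
def broken_writer(path):
    """Stand-in for code under test that fails immediately."""
    raise RuntimeError("writer is broken")


def check_hides_failure():
    try:
        broken_writer("out.cbf")
        assert False, "never reached"
    except Exception:
        pass  # swallows the RuntimeError, and would swallow a failed assert too


def check_exposes_failure():
    broken_writer("out.cbf")  # any exception propagates to the test runner
```

check_hides_failure() returns normally, so a runner would report a pass; check_exposes_failure() raises RuntimeError.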

Set gain to sensible values for CCD detector Formats

As discussed here

dead link

Hi, moving this to an issue. @dermen and I found a dead url linked to from this line:
https://github.com/cctbx/dxtbx/blob/master/model/crystal.h#L306

Dead url:
http://goo.gl/H3p1s

Appears to redirect to
http://www-bio3d-igbmc.u-strasbg.fr/~mgsb/biophys/rx/biblio/19_06_cowtan_coordinate_frames.pdf

Looks like it should be:
https://www.iucr.org/__data/assets/pdf_file/0009/7011/19_06_cowtan_coordinate_frames.pdf

Should we copy this pdf somewhere and ensure it's hosted nicely? Looks like a private location on IUCr.

@dagewa suggests hosting it on a CCP4 site.

Format needs get_static_mask

Following discussion in #65, Format needs a get_static_mask method. This will:

Return either a mask or None
Be set in the ImageSet during instantiation (imageset.external_lookup.mask.data = ImageBool(mask_flex_array_or_tuple_of_flex_arrays))
Allow merging of a user-provided mask with the mask from the format class
Be exercised by the dxtbx or DIALS tests
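A minimal sketch of the merging step, using plain Python lists in place of flex arrays. It assumes one flat mask per panel with True meaning a valid pixel; merge_masks is a hypothetical helper:

```python
def merge_masks(static_mask, user_mask):
    """Combine two per-panel masks; a pixel stays valid only if both agree.

    Each mask is a tuple of flat bool sequences (one per panel), or None
    when no mask is available.
    """
    if static_mask is None:
        return user_mask
    if user_mask is None:
        return static_mask
    if len(static_mask) != len(user_mask):
        raise ValueError("masks cover different numbers of panels")
    return tuple(
        [a and b for a, b in zip(static_panel, user_panel)]
        for static_panel, user_panel in zip(static_mask, user_mask)
    )


static = ([True, True, False, True],)
user = ([True, False, True, True],)
print(merge_masks(static, user))  # ([True, False, False, True],)
```

Returning the other mask unchanged when one side is None keeps the "mask or None" contract from the first bullet.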

Improve Bruker format readers

The dxtbx handling of Bruker file formats is limited. FormatBruker relies on iotbx.detectors.BrukerImage. This makes some restrictive assumptions, such as 1024*1024 pixel images, which are not correct for detectors such as the new PHOTON-II. FormatBrukerPhotonII avoids using BrukerImage to work around the restrictions, but fails on datasets using Bruker's own compression scheme.

Arnaud Basle has been in touch with Bruker representatives, and obtained the latest frame file description:
BISFrameFileFormats.zip. We may also be able to see the source for FrmUtility, which is Bruker's own reference example for how to read their .sfrm format.

Ideally, this knowledge should be incorporated into FormatBruker, so that this can read any Bruker image, without unwarranted assumptions.

nexus.py uses incident_wavelength incorrectly

Our code for reading NeXus files looks at incident_wavelength and, if it is an array, assumes that each value corresponds to a different image:
https://github.com/dials/dxtbx/blob/master/format/nexus.py#L933-L938

However, I think we are mis-interpreting incident_wavelength:
http://download.nexusformat.org/doc/html/classes/applications/NXmx.html
"In the case of a polychromatic beam this is an array of the wavelengths with the relative weights in incident_wavelength_weight."

I have opened an issue in NeXus about how to specify a per-shot wavelength (nexusformat/definitions#667). In the meantime, do we have NeXus files created either for XFELs or for Eiger that use incident_wavelength as an array?

For XFELs the answer is no. We are using mean energies and specifying only one:
https://github.com/cctbx/cctbx_project/blob/master/xfel/euxfel/agipd_cxigeom2nexus.py#L101
https://github.com/cctbx/cctbx_project/blob/master/xfel/swissfel/jf16m_cxigeom2nexus.py#L104
So for XFELs, however nexusformat/definitions#667 gets resolved, it won't affect existing files.

Support for the Oxford Diffraction file format

Format descriptions and source code received, with thanks to Dr. Mathias Meyer at Rigaku Oxford Diffraction.

Old (2003) format CrysAlisImageFormat01-07-2003.zip
New (2016) format CAP_image_format2016.zip

Supporting information from Andreas Förster:

CrysAlisPro is considered by many the gold standard for small-molecule processing. Comparing DIALS against data collected with CAP (the software also controls ROD diffractometers) should help you improve the algorithms for small-molecule data. The compression algorithm that CAP uses is based on CCP4 bitwise, by the way.

Support dials-data datasets in dxtbx image tests

This came up because some data were added to dials-data that @dagewa wanted to use in the image tests.

Some WIP going on in https://github.com/cctbx/dxtbx/tree/extend-regression-tests but thought I'd make a ticket to allow it to have somewhere to not be forgotten.

Planning to work on this start of next week.

ImageSet broken with unicode paths

Input:

from __future__ import absolute_import, division, print_function

import os
import dials_regression
template = unicode(os.path.join(dials_regression.__path__[0], 'centroid_test_data', 'centroid_####.cbf'))

from dxtbx.datablock import DataBlockTemplateImporter
importer = DataBlockTemplateImporter([template])
imageset = importer.datablocks[0].extract_imagesets()[0]
print(imageset.get_path(1))

Output:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
    print(imageset.get_path(1))
TypeError: No registered converter was able to produce a C++ rvalue of type std::string from this Python object of type unicode

Guess what you can't pickle? dxtbx_masking_ext.GoniometerShadowMasker

xia2/xia2#316 is actually caused by

Error: Pickling of "dxtbx_masking_ext.GoniometerShadowMasker" instances is not enabled

Traceback (most recent call last):
  File "/Users/graeme/svn/cctbx/build/../modules/xia2/command_line/xia2_main.py", line 372, in run
    xia2_main()
  File "/Users/graeme/svn/cctbx/build/../modules/xia2/command_line/xia2_main.py", line 54, in xia2_main
    CommandLine = get_command_line()
  File "/Users/graeme/svn/cctbx/modules/xia2/Applications/xia2_main.py", line 115, in get_command_line
    CommandLine.set_xinfo(xinfo)
  File "/Users/graeme/svn/cctbx/modules/xia2/Handlers/CommandLine.py", line 523, in set_xinfo
    self._xinfo = XProject(xinfo)
  File "/Users/graeme/svn/cctbx/modules/xia2/Schema/XProject.py", line 36, in __init__
    self.setup_from_xinfo_file(xinfo_file)
  File "/Users/graeme/svn/cctbx/modules/xia2/Schema/XProject.py", line 348, in setup_from_xinfo_file
    excluded_regions=sweep_info.get("excluded_regions", []),
  File "/Users/graeme/svn/cctbx/modules/xia2/Schema/XWavelength.py", line 267, in add_sweep
    excluded_regions=excluded_regions,
  File "/Users/graeme/svn/cctbx/modules/xia2/Schema/XSweep.py", line 231, in __init__
    self._imageset = copy.deepcopy(imagesets[0])
  File "/Users/graeme/svn/cctbx/base/Python.framework/Versions/2.7/lib/python2.7/copy.py", line 190, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/Users/graeme/svn/cctbx/base/Python.framework/Versions/2.7/lib/python2.7/copy.py", line 328, in _reconstruct
    args = deepcopy(args, memo)
  File "/Users/graeme/svn/cctbx/base/Python.framework/Versions/2.7/lib/python2.7/copy.py", line 163, in deepcopy
    y = copier(x, memo)
  File "/Users/graeme/svn/cctbx/base/Python.framework/Versions/2.7/lib/python2.7/copy.py", line 237, in _deepcopy_tuple
    y.append(deepcopy(a, memo))
  File "/Users/graeme/svn/cctbx/base/Python.framework/Versions/2.7/lib/python2.7/copy.py", line 190, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/Users/graeme/svn/cctbx/base/Python.framework/Versions/2.7/lib/python2.7/copy.py", line 328, in _reconstruct
    args = deepcopy(args, memo)
  File "/Users/graeme/svn/cctbx/base/Python.framework/Versions/2.7/lib/python2.7/copy.py", line 163, in deepcopy
    y = copier(x, memo)
  File "/Users/graeme/svn/cctbx/base/Python.framework/Versions/2.7/lib/python2.7/copy.py", line 237, in _deepcopy_tuple
    y.append(deepcopy(a, memo))
  File "/Users/graeme/svn/cctbx/base/Python.framework/Versions/2.7/lib/python2.7/copy.py", line 182, in deepcopy
    rv = reductor(2)
RuntimeError: Pickling of "dxtbx_masking_ext.GoniometerShadowMasker" instances is not enabled (http://www.boost.org/libs/python/doc/v2/pickle.html)

Guess should be trivial to mock up a failing test

For format objects for multi panel detectors, provide method to give 2D representation

i.e. if you want to show a picture, show results plotted as x, y or whatever for detectors which are made up from a collection of 2D panels, provide a mechanism to plot these on some larger 2D array.

See dials/dials#575 for background

Add batch_number_offset to dxtbx scan

To allow data tracking when more than one data set is put into an MTZ file.

treat images in local time or UTC?

While working on cfb6cc4 I noticed that FormatSMVRigakuSaturn uses calendar.timegm while FormatSMVNOIR, FormatSMVRigakuEiger and FormatSMVRigakuPilatus use time.mktime.

The former expects UTC, the latter local time.

Was this a conscious choice or is this more of a cargo cult thing and nobody really knows why one was picked over the other?

current grep timegm\|mktime:

FormatCBFMiniADSCHF4M.py:            return calendar.timegm(struct_time)
FormatSMVJHSim.py:                epoch = calendar.timegm(time.strptime(date_str, format_string))
FormatPYmultitile.py:from calendar import timegm
FormatPYmultitile.py:        epoch = timegm(strptime(str_min, "%Y-%m-%dT%H:%M%Z")) + float(str_sec)
FormatCBFMiniPilatusHelpers.py:            return calendar.timegm(struct_time) + float("0." + milliseconds)
FormatSMVADSC.py:                epoch = calendar.timegm(time.strptime(date_str, format_string))
doc/adding_new_formats.txt:            epoch = time.mktime(time.strptime(self._header_dictionary[
FormatSMVCMOS1.py:        epoch = calendar.timegm(time.strptime(date_record, "%a %b %d %Y %H:%M:%S"))
FormatTIFFRayonix.py:        epoch = time.mktime(self._get_rayonix_timestamp())
FormatRAXIS.py:        epoch = calendar.timegm(datetime.datetime(y, m, d, 0, 0, 0).timetuple())
FormatTIFFBruker.py:        epoch = time.mktime(self._get_bruker_timestamp())
FormatSMVTimePix_SU.py:                epoch = calendar.timegm(time.strptime(date_str, format_string))
FormatCBFMiniPilatusDLS12M.py:        # calendar.timegm(time.strptime('2016-04-01T00:00:00', '%Y-%m-%dT%H:%M:%S'))
FormatRAXISIVSpring8.py:        epoch = calendar.timegm(datetime.datetime(y, m, d, 0, 0, 0).timetuple())
FormatSMVRigaku.py:            epoch = time.mktime(epoch_time_struct)
FormatSMVRigaku.py:            epoch = calendar.timegm(epoch_time_struct)
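To show the practical difference between the two calls, a small standard-library sketch. The timestamp is the one from the FormatCBFMiniPilatusDLS12M comment in the grep output above:

```python
import calendar
import time

struct_time = time.strptime("2016-04-01T00:00:00", "%Y-%m-%dT%H:%M:%S")

as_utc = calendar.timegm(struct_time)  # interprets the fields as UTC
as_local = time.mktime(struct_time)    # interprets the fields as local time

print(as_utc)             # identical on every machine
print(as_local - as_utc)  # minus the local UTC offset in seconds; 0 on a UTC machine
```

So epochs computed with time.mktime depend on the time zone of the machine doing the import, while calendar.timegm gives the same value everywhere.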

Drop detectorbase from every format object

And instead provide a factory which will take a dxtbx format object which is fully populated and provide a detectorbase (if such a thing fits)

Would aid with things like non-square Bruker detectors (iotbx detectorbase insists they are square ISTR)
Would make debugging new detectors easier
Would mean that things which depend on detectorbase can do so with no new code, since (I assume) what they need is a subset of what we have in dxtbx

Also, as far as I am aware, detectorbase is used nowhere in dials or dxtbx, so having it there only adds cost. As pointed out in #84 (comment), many of these are imported and not used in e.g. dials.find_spots.

can't import snowflake CBF

$ dials.import /dls/i04/data/2019/mx19301-31/tmp/s2c/3d1ecaea-1703-45ba-b6f2-109a852d83a1/*1.cbf
DIALS (2018) Acta Cryst. D74, 85-97. https://doi.org/10.1107/S2059798317017235
DIALS 2.dev.545-g277d1717e
Traceback (most recent call last):
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/build/../modules/dials/command_line/dials_import.py", line 909, in <module>
    halraiser(e)
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/build/../modules/dials/command_line/dials_import.py", line 907, in <module>
    script.run()
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/build/../modules/dials/command_line/dials_import.py", line 751, in run
    params, options = self.parser.parse_args(args=args, show_diff_phil=False)
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/modules/dials/util/options.py", line 838, in parse_args
    quick_parse=quick_parse,
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/modules/dials/util/options.py", line 550, in parse_args
    format_kwargs=format_kwargs,
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/modules/dials/util/options.py", line 229, in __init__
    format_kwargs,
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/modules/dials/util/options.py", line 295, in try_read_experiments_from_images
    format_kwargs=format_kwargs,
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/modules/dxtbx/model/experiment_list.py", line 496, in from_filenames
    format_kwargs=format_kwargs,
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/modules/dxtbx/datablock.py", line 1027, in from_filenames
    format_kwargs=format_kwargs,
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/modules/dxtbx/datablock.py", line 484, in __init__
    format_kwargs=format_kwargs,
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/modules/dxtbx/datablock.py", line 540, in _extract_file_metadata
    fmt = format_class(filename, **format_kwargs)
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/modules/dxtbx/format/FormatCBFMiniEigerDLS16MSN160.py", line 54, in __init__
    super(FormatCBFMiniEigerDLS16MSN160, self).__init__(image_file, **kwargs)
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/modules/dxtbx/format/FormatCBFMiniEiger.py", line 52, in __init__
    FormatCBFMini.__init__(self, image_file, **kwargs)
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/modules/dxtbx/format/FormatCBFMini.py", line 70, in __init__
    FormatCBF.__init__(self, image_file, **kwargs)
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/modules/dxtbx/format/FormatCBF.py", line 66, in __init__
    Format.__init__(self, image_file, **kwargs)
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/modules/dxtbx/format/Format.py", line 146, in __init__
    self.setup()
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/modules/dxtbx/format/Format.py", line 162, in setup
    detector_instance = self._detector()
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/modules/dxtbx/format/FormatCBFMiniEiger.py", line 130, in _detector
    for f0, f1, s0, s1 in determine_eiger_mask(detector):
  File "/dls/science/groups/scisoft/DIALS/CD/now/build_dials/modules/dxtbx/format/FormatPilatusHelpers.py", line 212, in determine_eiger_mask
    n_fast, remainder = divmod(size[0], detector.module_size_fast)
AttributeError: Please report this error to [email protected]: 'NoneType' object has no attribute 'module_size_fast'

whereas

$ dials.import /dls/i04/data/2019/mx19301-31/tmp/s2c/3d1ecaea-1703-45ba-b6f2-109a852d83a1/*1.cbf
DIALS (2018) Acta Cryst. D74, 85-97. https://doi.org/10.1107/S2059798317017235
DIALS 1.14.8-ge45eb7d1a-release
The following parameters have been modified:

input {
  datablock = <image files>
}

--------------------------------------------------------------------------------
  format: <class 'dxtbx.format.FormatCBFMiniEigerDLS16MSN160.FormatCBFMiniEigerDLS16MSN160'>
  num images: 1
  num sweeps: 1
  num stills: 0
--------------------------------------------------------------------------------
Writing datablocks to datablock.json

DXTBX_ASSERT(start[i] + count[i] <= dataset_dims[i])

3 image screening dataset.
dials.import followed by dials.find_spots works for the first 2 images; on the 3rd image:

Traceback (most recent call last):
  File "/scratch/wra62962/files/dials/build/../modules/dials/command_line/find_spots.py", line 223, in <module>
    halraiser(e)
  File "/scratch/wra62962/files/dials/build/../modules/dials/command_line/find_spots.py", line 221, in <module>
    script.run()
  File "/scratch/wra62962/files/dials/build/../modules/dials/command_line/find_spots.py", line 152, in run
    reflections = flex.reflection_table.from_observations(experiments, params)
  File "/scratch/wra62962/files/dials/modules/dials/array_family/flex.py", line 211, in from_observations
    return find_spots(experiments)
  File "/scratch/wra62962/files/dials/modules/dials/algorithms/spot_finding/finder.py", line 770, in __call__
    table, hot_mask = self._find_spots_in_imageset(imageset)
  File "/scratch/wra62962/files/dials/modules/dials/algorithms/spot_finding/finder.py", line 867, in _find_spots_in_imageset
    r, h = extract_spots(imageset[j0:j1])
  File "/scratch/wra62962/files/dials/modules/dials/algorithms/spot_finding/finder.py", line 478, in __call__
    return self._find_spots(imageset)
  File "/scratch/wra62962/files/dials/modules/dials/algorithms/spot_finding/finder.py", line 586, in _find_spots
    result = function(task)
  File "/scratch/wra62962/files/dials/modules/dials/algorithms/spot_finding/finder.py", line 115, in __call__
    image = self.imageset.get_corrected_data(index)
  File "/scratch/wra62962/files/dials/modules/dxtbx/format/FormatMultiImage.py", line 31, in read
    return format_instance.get_raw_data(index)
  File "/scratch/wra62962/files/dials/modules/dxtbx/format/FormatNexus.py", line 131, in get_raw_data
    return self._raw_data[index]
  File "/scratch/wra62962/files/dials/modules/dxtbx/format/nexus.py", line 1591, in __getitem__
    (slice(i, i + 1, 1), slice(0, height, 1), slice(0, width, 1)),
RuntimeError: Please report this error to [email protected]: dxtbx Internal Error: /scratch/wra62962/files/dials/modules/dxtbx/format/boost_python/nexus_ext.cc(55): DXTBX_ASSERT(start[i] + count[i] <= dataset_dims[i]) failure.

presumably this is #19?

Max IV Eiger work - vertical axis, to work nicely with dxtbx

See e.g. cctbx/cctbx_project@66a61ba and conversation, ongoing work

dxtbx -> Python 3

Python 3 for dxtbx

As you are aware dxtbx is already python 3 syntax compatible, and has been for some time. With cctbx now being nominally python 3 compatible we can proceed with our work on dxtbx.

As discussed on dials-support we propose that this is done in a staged way, with open work and continuous testing to record progress.

Process

Before the work is started we will set up a Jenkins build job at Diamond following the same principle that we had with the python 3 syntax conversion, i.e. we will keep a list of tests expected to fail in python 3 alongside the code.
A python 3 Travis job will not be available initially but may be added at a later date.

Work will proceed in feature branches off master which will be submitted as pull requests, reviewed, and merged in a timely fashion, i.e. over a period of a couple of days.
This ensures we do not have long-running side branches leading to code conflicts and the need to merge from the master branch or rebase the feature branch.
The entire test suite is run on the iteratively updated master branch which means we avoid python 2 and - as the list of ignored tests shrinks - python 3 regressions.

Every feature branch will be concerned with a small number of topics. Initially these will be futurize stage 2 fixers, later these will be picked from remaining python 3 test failures. Commits in each pull request may fall into one of the following categories:

automatic code conversion
manual changes for test fixing commits
idiomatic updates (e.g. enumerate(), use of context managers, reliance on deprecated code, explicit list construction in places where generators would be more appropriate, ...)
cleanup of immediate surrounding code (i.e. cleaning campsite)
improving code coverage, including adding tests
flake8 compliance
addressing of LGTM issues.

There is no required flake8/LGTM compliance level for pull requests, and particularly at the start of the process pull requests may be merged with failing flake8 tests. As we refactor dxtbx and our code quality improves we should aim for more stringent flake8 compliance and even consider enforcing more flake8 test categories via the pre-commit hook. Pull requests should not introduce any new LGTM issues, and should ideally address existing issues in the touched files, although this may again not be practicable at the start of the process.

Benefits

At the end of the process dxtbx will be idiomatically modernized python 3 code that is fully python 2.7 compatible. This will improve the overall code quality, reduce technical debt and code maintenance cost, and reduce the barriers to entry for new or external developers.
The process will serve as a prototype for the upcoming work on the DIALS project.

Success criteria

No test failures on Travis (Python 2.7, possibly 3.6 and 3.7)
No test failures on DLS Jenkins (Python 2.7 and 3.6)
Fewer than 10 remaining LGTM alerts (current: 121) with an A+ code rating (current: A)
Fewer than 10 flake8 errors/warnings (current: 392)
Code coverage at or above current levels (100% packages, 83% files, 83% classes, 62% lines, 45% conditionals)
Delivery by mid-July 2019

Comments

Please discuss this process in the issue comments below. If this is agreeable we anticipate starting this work in the next week.

Support for non Diamond Eiger 2X 16M

DECTRIS Eiger 2 16M at SLS is currently giving:

DIALS (2018) Acta Cryst. D74, 85-97. https://doi.org/10.1107/S2059798317017235
Traceback (most recent call last):
  File "/Users/graeme/svn/cctbx/build/../modules/dials/command_line/show.py", line 629, in <module>
    run(sys.argv[1:])
  File "/Users/graeme/svn/cctbx/build/../modules/dials/command_line/show.py", line 192, in run
    params, options = parser.parse_args(show_diff_phil=True)
  File "/Users/graeme/svn/cctbx/modules/dials/util/options.py", line 853, in parse_args
    quick_parse=quick_parse,
  File "/Users/graeme/svn/cctbx/modules/dials/util/options.py", line 565, in parse_args
    load_models=load_models,
  File "/Users/graeme/svn/cctbx/modules/dials/util/options.py", line 231, in __init__
    load_models,
  File "/Users/graeme/svn/cctbx/modules/dials/util/options.py", line 304, in try_read_experiments_from_images
    load_models=load_models,
  File "/Users/graeme/svn/cctbx/modules/dxtbx/model/experiment_list.py", line 496, in from_filenames
    format_kwargs=format_kwargs,
  File "/Users/graeme/svn/cctbx/modules/dxtbx/datablock.py", line 1030, in from_filenames
    format_kwargs=format_kwargs,
  File "/Users/graeme/svn/cctbx/modules/dxtbx/datablock.py", line 474, in __init__
    fmt, filename, format_kwargs=format_kwargs
  File "/Users/graeme/svn/cctbx/modules/dxtbx/datablock.py", line 668, in _create_single_file_imageset
    return format_class.get_imageset(abspath(filename), format_kwargs=format_kwargs)
  File "/Users/graeme/svn/cctbx/modules/dxtbx/format/FormatMultiImage.py", line 159, in get_imageset
    reader = cls.get_reader()(filenames, num_images=num_images, **format_kwargs)
  File "/Users/graeme/svn/cctbx/modules/dxtbx/format/FormatMultiImage.py", line 23, in __init__
    self._num_images = self.read_num_images()
  File "/Users/graeme/svn/cctbx/modules/dxtbx/format/FormatMultiImage.py", line 39, in read_num_images
    format_instance = self.format_class.get_instance(self._filename, **self.kwargs)
  File "/Users/graeme/svn/cctbx/modules/dxtbx/format/Format.py", line 255, in get_instance
    Class._current_instance_ = Class(filename, **kwargs)
  File "/Users/graeme/svn/cctbx/modules/dxtbx/format/FormatNexus.py", line 22, in __init__
    FormatHDF5.__init__(self, image_file, **kwargs)
  File "/Users/graeme/svn/cctbx/modules/dxtbx/format/FormatHDF5.py", line 14, in __init__
    Format.__init__(self, image_file, **kwargs)
  File "/Users/graeme/svn/cctbx/modules/dxtbx/format/Format.py", line 146, in __init__
    self.setup()
  File "/Users/graeme/svn/cctbx/modules/dxtbx/format/Format.py", line 156, in setup
    self._start()
  File "/Users/graeme/svn/cctbx/modules/dxtbx/format/FormatNexus.py", line 34, in _start
    self._reader = reader = NXmxReader(self._image_file)
  File "/Users/graeme/svn/cctbx/modules/dxtbx/format/nexus.py", line 883, in __init__
    % (filename, "\n".join(self.errors))
RuntimeError: 
        Error reading NXmxfile: /Users/graeme/data/sls-eiger2/insu_15_master.h5
          No NXmx entries in file

        The following errors occurred:

        No NXbeam in /entry/sample
No NXsample in /entry

type errors in dials/master - should make them work or identify what is missing.

dxtbx_ext => dxtbx.__init__

We currently

from dxtbx_ext import *

in top level dxtbx.__init__

Suggest moving this to dxtbx.ext with a one-release grace period with deprecation warnings.

This would

avoid polluting the dxtbx.* namespace, and therefore
make clear when extension methods are used, and
skip the automatic import of the extensions on importing dxtbx, which may not always be required
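A rough sketch of how the grace period could be handled, assuming a hypothetical helper `deprecated_alias` and a made-up extension function; none of these names exist in dxtbx. The old top-level name keeps working for one release but warns on every call:

```python
import functools
import warnings


def deprecated_alias(func, old_name, new_name):
    """Wrap func so that calling it via the old name emits a warning."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            "%s is deprecated; use %s instead" % (old_name, new_name),
            DeprecationWarning,
            stacklevel=2,
        )
        return func(*args, **kwargs)

    return wrapper


def _example_extension_function(x):
    """Stand-in for something currently pulled in from dxtbx_ext."""
    return x * 2


# Hypothetical line for dxtbx/__init__.py during the grace period:
example_function = deprecated_alias(
    _example_extension_function,
    "dxtbx.example_function",
    "dxtbx.ext.example_function",
)
```

Note that while such wrappers are in place the extension is still imported with dxtbx, so the third benefit in the list above would only arrive once the grace period ends.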

Opinions?

Code modification emails

We don't have them yet.

Do we want them to go to dials-commit or cctbx-commit or to a new mailing list? Opinions?

A new one could be useful from a spam/quota point of view. But it does mean another thing to subscribe to.

Add reader for Bruker data exported to "mccd" format to dxtbx

Current status:

Graemes-MBP-5:tmp graeme$ dxtbx.print_header Collect_0001.mccd 
=== Collect_0001.mccd ===
Using header reader: FormatTIFFRayonix
Traceback (most recent call last):
  File "/Users/graeme/svn/cctbx/modules/cctbx_project/dxtbx/format/Format.py", line 136, in setup
    detector_instance = self._detector()
  File "/Users/graeme/svn/cctbx/modules/cctbx_project/dxtbx/format/FormatTIFFRayonix.py", line 126, in _detector
    assert(rotations[0] == 0.0)
AssertionError
No beam model found
No detector model found
Goniometer:
    Rotation axis:   {1,0,0}
    Fixed rotation:  {1,0,0,0,1,0,0,0,1}
    Setting rotation:{1,0,0,0,1,0,0,0,1}

No scan model found
Traceback (most recent call last):
  File "/Users/graeme/svn/cctbx/build/../modules/cctbx_project/dxtbx/command_line/print_header.py", line 47, in <module>
    print_header()
  File "/Users/graeme/svn/cctbx/build/../modules/cctbx_project/dxtbx/command_line/print_header.py", line 38, in print_header
    raw_data = i.get_raw_data()
  File "/Users/graeme/svn/cctbx/modules/cctbx_project/dxtbx/format/FormatTIFFRayonix.py", line 305, in get_raw_data
    assert(len(self.get_detector()) == 1)
TypeError: object of type 'NoneType' has no len()

some work needed

Format hierarchy assumption violated

We have the assumption that any one file must only be understood by a single (leaf node) format class.

This is currently violated for FormatCBFCspad.

For example dxtbx.show_matching_formats dials_regression/image_examples/LCLS_cspad_nexus/idx-20130301060858701.cbf is matched by these formats. Arrows denote 'is-subclass-of' relation. There are two leaf nodes, FormatCBFCspad and FormatCBFMultiTileStill.

I suggest changing the inheritance of FormatCBFMultiTileHierarchyStill from FormatStill to FormatCBFMultiTileStill.

The test we have in place to check for these cases does not catch this. I only noticed due to refactoring work related to #26.
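The single-leaf assumption can be checked mechanically. The sketch below is illustrative only: `leaf_formats` is a hypothetical helper, and the toy classes merely borrow names from this issue with a much simplified inheritance, so they are not the real dxtbx hierarchy.

```python
def leaf_formats(matching):
    """Return the classes in matching that no other matching class
    derives from, i.e. the leaf nodes of the matched hierarchy."""
    return [
        cls
        for cls in matching
        if not any(other is not cls and issubclass(other, cls) for other in matching)
    ]


# Toy hierarchy, simplified; names with a Current/Proposed suffix are invented.
class Format(object):
    pass


class FormatStill(Format):
    pass


class FormatCBFMultiTileStill(FormatStill):
    pass


class HierarchyStillCurrent(FormatStill):  # current parent: FormatStill
    pass


class CspadCurrent(HierarchyStillCurrent):
    pass


class HierarchyStillProposed(FormatCBFMultiTileStill):  # proposed parent
    pass


class CspadProposed(HierarchyStillProposed):
    pass


current = [Format, FormatStill, FormatCBFMultiTileStill,
           HierarchyStillCurrent, CspadCurrent]
proposed = [Format, FormatStill, FormatCBFMultiTileStill,
            HierarchyStillProposed, CspadProposed]

print(len(leaf_formats(current)))   # 2: the assumption is violated
print(len(leaf_formats(proposed)))  # 1: a single leaf remains
```

A regression test along these lines would run the matching formats for each example image through such a check and fail whenever more than one leaf comes back.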

Proposal: Re-label ImageSweep to ImageSequence

This will allow the type of sequence to be defined by the Scan object it has. For traditional sweeps this would involve no change of behaviour, but for e.g. raster scans or fixed-target still shot collection it would give the opportunity to define other Scan types which encode the relationship between frames j and j+1. N.B. this is different from an ImageSet, as we are being explicit that there are relationships between frames: common detectors, beams, goniometers etc.

The proposal for this change is to make the name more general, but not to change the behaviour at this time. Subsequent proposals will include changing of behaviour.

This will not touch anything which at this time does not deal with sweeps.

Test JF16M

I'd like to add https://zenodo.org/record/3352358#.XYJrbtNKj2I for regular testing and I think dials-data is the way to do it but I don't know how. It's a 4 GB dataset.

Specifically I want to ensure that multi-module NeXus datasets are tested. See #92.

Beam polarisation handling

Pointed out by @dagewa the other day: in FormatCBFFullPilatus.py we explicitly override the beam polarisation with synchrotron values:
https://github.com/cctbx/cctbx_project/blob/c8767224cd85600d3cc6c761741299cbac3448e4/dxtbx/format/FormatCBFFullPilatus.py#L54-L59

This was added 'for the moment' in 19a1020. Presumably we shouldn't be doing this? Or should be doing this better... @dagewa also added a ticket dials/dials#592 to allow this to be configured, but ignoring any file contents at the FullPilatus level seems to be the wrong way, at least.

Looking at some full-CBF files that came off I04, they seem to actually have this written in the MiniCBF metadata as 0.990, and in the full CBF's _diffrn_radiation.polarizn_source_ratio as 0.8. Should we be using these instead? Are we writing the full CBFs wrong? Lots of questions about this.

In DataBlockTemplateImporter allow for option to work around missing images

i.e. if I have a sequence of images

th_8_2_0310.cbf
th_8_2_0311.cbf
th_8_2_0312.cbf
th_8_2_0313.cbf
th_8_2_0314.cbf
th_8_2_0316.cbf
th_8_2_0317.cbf
th_8_2_0318.cbf
th_8_2_0319.cbf

let the caller have the option of dealing with this. Use case is where caller will subsequently be trimming the sweep anyway. Yes, could work around this by calculating a full list of images and trimming to what is wanted, but seems like a useful option.
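As an illustration of what the caller might be handed, here is a small sketch that groups templated file names into runs of consecutive image numbers and reports the gaps. `contiguous_runs` and `missing_image_numbers` are hypothetical helpers, not part of DataBlockTemplateImporter, and the sketch assumes the image number is the last run of digits before the file extension.

```python
import re

# Assumption: the image number is the final digit run before the extension.
_NUMBER = re.compile(r"(\d+)(?=\.[^.]+$)")


def _image_number(filename):
    match = _NUMBER.search(filename)
    if match is None:
        raise ValueError("no image number in %r" % filename)
    return int(match.group(1))


def contiguous_runs(filenames):
    """Split filenames into lists whose image numbers are consecutive."""
    runs = []
    previous = None
    for name in sorted(filenames, key=_image_number):
        number = _image_number(name)
        if previous is not None and number == previous + 1:
            runs[-1].append(name)
        else:
            runs.append([name])
        previous = number
    return runs


def missing_image_numbers(filenames):
    """Return the image numbers absent between the first and last image."""
    present = set(_image_number(name) for name in filenames)
    return sorted(set(range(min(present), max(present) + 1)) - present)


images = ["th_8_2_%04d.cbf" % i for i in (310, 311, 312, 313, 314, 316, 317, 318, 319)]
print(missing_image_numbers(images))  # [315]
print([len(run) for run in contiguous_runs(images)])  # [5, 4]
```

With something like this the importer could either raise, as now, or return the separate runs and leave the decision to the caller.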

Eiger 4m @ ESRF no longer works

[gw56@cs03r-sc-serv-16 esrf-eiger-4m]$ dials.import thaumatin28_w1_3_1_master.h5 
DIALS (2018) Acta Cryst. D74, 85-97. https://doi.org/10.1107/S2059798317017235
DIALS 2.dev.325-g14f9423
Traceback (most recent call last):
  File "/dls/science/users/gw56/svn/cctbx/build/../modules/dials/command_line/dials_import.py", line 909, in <module>
    halraiser(e)
  File "/dls/science/users/gw56/svn/cctbx/build/../modules/dials/command_line/dials_import.py", line 907, in <module>
    script.run()
  File "/dls/science/users/gw56/svn/cctbx/build/../modules/dials/command_line/dials_import.py", line 751, in run
    params, options = self.parser.parse_args(args=args, show_diff_phil=False)
  File "/dls/science/users/gw56/svn/cctbx/modules/dials/util/options.py", line 838, in parse_args
    quick_parse=quick_parse,
  File "/dls/science/users/gw56/svn/cctbx/modules/dials/util/options.py", line 550, in parse_args
    format_kwargs=format_kwargs,
  File "/dls/science/users/gw56/svn/cctbx/modules/dials/util/options.py", line 229, in __init__
    format_kwargs,
  File "/dls/science/users/gw56/svn/cctbx/modules/dials/util/options.py", line 295, in try_read_experiments_from_images
    format_kwargs=format_kwargs,
  File "/dls/science/users/gw56/svn/cctbx/modules/dxtbx/model/experiment_list.py", line 676, in from_filenames
    format_kwargs=format_kwargs,
  File "/dls/science/users/gw56/svn/cctbx/modules/dxtbx/datablock.py", line 1076, in from_filenames
    format_kwargs=format_kwargs,
  File "/dls/science/users/gw56/svn/cctbx/modules/dxtbx/datablock.py", line 534, in __init__
    fmt, filename, format_kwargs=format_kwargs
  File "/dls/science/users/gw56/svn/cctbx/modules/dxtbx/datablock.py", line 729, in _create_single_file_imageset
    return format_class.get_imageset(abspath(filename), format_kwargs=format_kwargs)
  File "/dls/science/users/gw56/svn/cctbx/modules/dxtbx/format/FormatMultiImage.py", line 188, in get_imageset
    reader = Class.get_reader()(filenames, num_images=num_images, **format_kwargs)
  File "/dls/science/users/gw56/svn/cctbx/modules/dxtbx/format/FormatMultiImage.py", line 14, in __init__
    self._num_images = self.read_num_images()
  File "/dls/science/users/gw56/svn/cctbx/modules/dxtbx/format/FormatMultiImage.py", line 30, in read_num_images
    format_instance = self.format_class.get_instance(self._filename, **self.kwargs)
  File "/dls/science/users/gw56/svn/cctbx/modules/dxtbx/format/Format.py", line 313, in get_instance
    Class._current_instance_ = Class(filename, **kwargs)
  File "/dls/science/users/gw56/svn/cctbx/modules/dxtbx/format/FormatHDF5EigerNearlyNexus.py", line 313, in __init__
    FormatHDF5.__init__(self, image_file, **kwargs)
  File "/dls/science/users/gw56/svn/cctbx/modules/dxtbx/format/FormatHDF5.py", line 18, in __init__
    Format.__init__(self, image_file, **kwargs)
  File "/dls/science/users/gw56/svn/cctbx/modules/dxtbx/format/Format.py", line 203, in __init__
    self.setup()
  File "/dls/science/users/gw56/svn/cctbx/modules/dxtbx/format/Format.py", line 213, in setup
    self._start()
  File "/dls/science/users/gw56/svn/cctbx/modules/dxtbx/format/FormatHDF5EigerNearlyNexus.py", line 369, in _start
    for f0, f1, s0, s1 in determine_eiger_mask(self._detector_model):
  File "/dls/science/users/gw56/svn/cctbx/modules/dxtbx/format/FormatPilatusHelpers.py", line 170, in determine_eiger_mask
    assert (n_fast - 1) * gap_size_fast == remainder
AssertionError: Please report this error to [email protected]:

reported by @dagewa

Incorrect handling of trusted_range in FormatSMVADSC

When IMAGE_PEDESTAL is in the SMV header, the images have this value subtracted from them:

dxtbx/format/FormatSMVADSC.py

Lines 266 to 268 in 52bb9cb

    # if we subtract PEDESTAL is this still raw?
    if "IMAGE_PEDESTAL" in self._header_dictionary:
        raw_data -= int(self._header_dictionary["IMAGE_PEDESTAL"])

but the trusted_range of (underload, overload) is set by:

dxtbx/format/FormatSMVADSC.py

Lines 180 to 186 in 52bb9cb

    if "IMAGE_PEDESTAL" in self._header_dictionary:
        pedestal = int(self._header_dictionary["IMAGE_PEDESTAL"])
    else:
        pedestal = 0

    overload = 65535 - pedestal
    underload = pedestal - 1

So, if IMAGE_PEDESTAL=100 and the image has minimum value 103, say, then the raw data will have values starting from 3 but values up to 99 will be masked by being outside the trusted_range.
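The arithmetic can be spelled out as follows. This is only a worked illustration: `is_trusted` is a hypothetical helper that assumes a pixel is trusted when it lies strictly between the two limits, and the shifted range at the end is one possible consistent choice, not necessarily the fix that should be adopted.

```python
pedestal = 100

# Limits as currently computed in the format class.
overload = 65535 - pedestal  # 65435
underload = pedestal - 1     # 99

# Smallest value stored in the image, and the same pixel after the
# pedestal has been subtracted in get_raw_data.
stored_minimum = 103
corrected_minimum = stored_minimum - pedestal  # 3


def is_trusted(value, trusted_range):
    """Assumed convention: trusted if strictly inside (low, high)."""
    low, high = trusted_range
    return low < value < high


# The corrected pixel falls below the underload and is masked.
print(is_trusted(corrected_minimum, (underload, overload)))  # False

# Assumption: express both limits in pedestal-subtracted units.
shifted_range = (underload - pedestal, overload)  # (-1, 65435)
print(is_trusted(corrected_minimum, shifted_range))  # True
```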

Make format_kwargs pass along arbitrary parameters

Currently two parameters are passed along to the format objects in the kwargs dictionary of their __init__ functions: dynamic_shadowing and multi_panel. These are defined in two locations:

A format phil blob in util/options.py
A try except block in util/options.py where they are read from the phil blob by the importer

If this phil blob could take arbitrary parameters, then format classes could take arbitrary parameters. This would be super helpful for metadata-poor data files and would allow less hardcoding of parameters in format objects.
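A minimal sketch of the idea, with the phil scope replaced by a plain dictionary: `collect_format_kwargs`, `ExampleFormat` and the `wavelength` parameter are all hypothetical and only show how unknown keywords could flow through to a format class that wants them.

```python
def collect_format_kwargs(params):
    """Forward every parameter that was explicitly set (i.e. is not None),
    instead of picking out dynamic_shadowing and multi_panel by name."""
    return dict((key, value) for key, value in params.items() if value is not None)


class ExampleFormat(object):
    """Hypothetical format class for a metadata-poor file type."""

    def __init__(self, image_file, **kwargs):
        self._image_file = image_file
        # The two parameters that are passed along today.
        self._dynamic_shadowing = kwargs.get("dynamic_shadowing", False)
        self._multi_panel = kwargs.get("multi_panel", False)
        # An arbitrary extra parameter, e.g. a value the header lacks.
        self._wavelength = kwargs.get("wavelength")


# Stand-in for the extracted phil parameters; None means "not set".
params = {"dynamic_shadowing": True, "multi_panel": None, "wavelength": 0.9795}
format_kwargs = collect_format_kwargs(params)
fmt = ExampleFormat("image_0001.img", **format_kwargs)
print(sorted(format_kwargs))  # ['dynamic_shadowing', 'wavelength']
```

Format classes that do not know a keyword simply ignore it, so adding a parameter for one format would not require touching the others.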

    // Apply dark
    if (p.size() > 0) {
      for (std::size_t j = 0; j < r.size(); ++j) {
        c[j] = c[j] - p[j];
      }
    }

    /**
     * Unpickle a python object from a string
     */
    boost::python::object pickle_loads(std::string x) {
      if (x == "") {
        return boost::python::object();
      }
      boost::python::object main = boost::python::import("__main__");
      boost::python::object global(main.attr("__dict__"));
      boost::python::object result =
        exec("def loads(x):import pickle; return pickle.loads(x)", global, global);
      boost::python::object loads = global["loads"];
      return loads(x);
    }
