Comments (15)
Link to a workaround:
from dxtbx.
I like this proposal, but I wonder how one would select just the first 9 images in the top example, i.e. where you do want to match just one digit. I would be tempted to bring ? into play, but that is recognised by shell glob so maybe not what we want.
In this case one has to use dials.import template=image_#.cbf scan_range=1,9 or dials.import image_?.cbf. In the latter case, ? is expanded by the shell so DIALS sees only 1 to 9. Thus, it can work out the template and set the scan range correctly.
The only other thing which occurs to me in discussing this is whether to move away from the current template to something more printf-like, which in fairness most people can understand, so you would have foo_%d.cbf or foo_%03d.cbf, i.e. someone else has already defined the grammar. This would seem like a sensible transition.
As a programmer, I like this. But for non-programmers, the printf-syntax might seem daunting.
OK, I agree with your viewpoint here on the single # use case. I ask that we also consider adding support for the printf case, since I think that is potentially very powerful (but I am happy that we add that as a second issue / piece of work).
from dxtbx.
Context: some Rigaku datasets (such as these ones: https://xrda.pdbj.org/entry/93) use an image incremental serial number that is not zero-padded.
Can we reasonably detect this automatically and import them properly?
At worst, perhaps we could provide an option at import that allows the template to be non-padded.
from dxtbx.
How about using a single # as a placeholder for non-zero-padded digit(s)?
This would be least disruptive.
If you have non-padded images, images_1.cbf, images_2.cbf, images_3.cbf, ... images_300.cbf:
- images_#.cbf matches all.
- images_###.cbf does not match.
If you have zero-padded images, images_001.cbf, images_002.cbf, images_003.cbf, ... images_300.cbf:
- images_#.cbf does not match.
- images_###.cbf matches all.
- images_00#.cbf matches the first nine images.
So this does not change the existing behaviour. Am I missing corner cases?
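The matching rules proposed in this comment can be sketched in Python as below. This is only an illustration of the proposal, not the dxtbx implementation: template_to_regex and matches_all are hypothetical helpers, and the sketch assumes that "non-padded" means digits with no leading zeros and that a run of several # characters means exactly that many digits. Note that images_###.cbf would still match individual three-digit names such as images_300.cbf, so "does not match" is read here as "does not match the whole set".

```python
import re


def template_to_regex(template):
    """Hypothetical helper: turn a '#' template into a regex under the proposed rule."""
    run = re.search(r"#+", template)
    if run is None:
        raise ValueError("template contains no '#' placeholder")
    prefix, suffix = template[: run.start()], template[run.end():]
    width = run.end() - run.start()
    if width == 1:
        # Single '#': one or more digits, no leading zeros (assumption)
        digits = r"(0|[1-9][0-9]*)"
    else:
        # Several '#': exactly that many digits, so zero-padded names match
        digits = r"([0-9]{%d})" % width
    return re.compile(re.escape(prefix) + digits + re.escape(suffix))


def matches_all(template, filenames):
    """True only if every filename matches the template."""
    pattern = template_to_regex(template)
    return all(pattern.fullmatch(name) for name in filenames)


unpadded = [f"images_{i}.cbf" for i in range(1, 301)]
padded = [f"images_{i:03d}.cbf" for i in range(1, 301)]

print(matches_all("images_#.cbf", unpadded))    # True
print(matches_all("images_###.cbf", unpadded))  # False
print(matches_all("images_#.cbf", padded))      # False
print(matches_all("images_###.cbf", padded))    # True

first_nine = [n for n in padded if template_to_regex("images_00#.cbf").fullmatch(n)]
print(len(first_nine))                          # 9
```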
from dxtbx.
I like this proposal, but I wonder how one would select just the first 9 images in the top example, i.e. where you do want to match just one digit. I would be tempted to bring ? into play, but that is recognised by shell glob so maybe not what we want.
Maybe however my concern does not matter because on balance what you suggest is far more useful than the current behaviour.
The only other thing which occurs to me in discussing this is whether to move away from the current template to something more printf-like, which in fairness most people can understand, so you would have foo_%d.cbf or foo_%03d.cbf, i.e. someone else has already defined the grammar. This would seem like a sensible transition.
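For readers less familiar with printf, the two example templates behave as follows. Python's % operator follows the same conventions, so it serves as a quick demonstration; expand_printf_template is a hypothetical helper written for this illustration and is not part of DIALS or dxtbx.

```python
def expand_printf_template(template, first, last):
    """Hypothetical helper: format each image number into a printf-style template."""
    return [template % i for i in range(first, last + 1)]


# %d writes the number as-is; %03d pads with zeros to a width of three
print(expand_printf_template("foo_%d.cbf", 9, 10))    # ['foo_9.cbf', 'foo_10.cbf']
print(expand_printf_template("foo_%03d.cbf", 9, 10))  # ['foo_009.cbf', 'foo_010.cbf']
```

The appeal is that the padding is stated explicitly in the template, rather than inferred from the number of placeholder characters.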
from dxtbx.
I like this proposal, but I wonder how one would select just the first 9 images in the top example. i.e. where you do want to match just one digit. I would be tempted to bring ? into play, but that is recognised by shell glob so maybe not what we want.
In this case one has to use dials.import template=image_#.cbf scan_range=1,9 or dials.import image_?.cbf. In the latter case, ? is expanded by the shell so DIALS sees only 1 to 9. Thus, it can work out the template and set the scan range correctly.
The only other thing which occurs to me in discussing this is whether to move away from the current template to something more printf-like which in fairness most people can understand so you would have foo_%d.cbf or foo_%03d.cbf i.e. someone else has already defined the grammar. This would seem like a sensible transition.
As a programmer, I like this. But for non-programmers, the printf-syntax might seem daunting.
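To see why the shell's ? gives only images 1 to 9, the glob can be emulated with Python's standard fnmatch module, which uses the same wildcard rules for ? (exactly one character). The file names here are made up for the illustration.

```python
import fnmatch

# Made-up, non-padded file names image_1.cbf ... image_300.cbf
names = [f"image_{i}.cbf" for i in range(1, 301)]

# '?' matches exactly one character, so only the single-digit names survive
single_digit = fnmatch.filter(names, "image_?.cbf")
print(single_digit[0], single_digit[-1], len(single_digit))  # image_1.cbf image_9.cbf 9
```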
from dxtbx.
I like this proposal, but I wonder how one would select just the first 9 images in the top example, i.e. where you do want to match just one digit. I would be tempted to bring ? into play, but that is recognised by shell glob so maybe not what we want.
It's a bit ugly but apparently this is currently possible:
dials.import template=../frames/exp_864_1_#.rodhypix image_range=1,5
So something similar could be used to select any range within the single digit numbers, even if # is made to match all images.
from dxtbx.
Ah, I see @biochem-fan already suggested similar, using scan_range rather than image_range (the fact we have multiple options here is also confusing).
from dxtbx.
See also dials/dials#2376 (comment)
from dxtbx.
I have been looking at this today and unfortunately I am stuck. The problem is that there is code used to expand a template to filenames, which does not know if the filenames should be zero-padded or not:
Lines 48 to 58 in 7964d4b
I can put a count == 1 special case to expand to a non-zero-padded list, but this will fail if the dataset is actually using a zero-padded template. That is, under the proposal above, dials.import template=$(dials.data get -q x4wide)/X4_wide_M1S4_2_#.cbf is supposed to match the files X4_wide_M1S4_2_[0001..0090].cbf, but that information is not available to this function, so the special case will expand to X4_wide_M1S4_2_[1..90].cbf instead and then fail later on.
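The difficulty can be illustrated with a small sketch. This is not the dxtbx code referenced above; expand_template is a hypothetical stand-in that, like the function described, sees only the template string and an image range, never the files on disk. The padding width therefore has to come from the number of # characters alone.

```python
def expand_template(template, first, last):
    """Hypothetical stand-in: expand a '#' template using only the template itself."""
    count = template.count("#")
    if count == 0:
        raise ValueError("template has no '#' placeholder")
    prefix = template[: template.index("#")]
    suffix = template[template.rindex("#") + 1:]
    # Width comes solely from the number of '#'; a single '#' gives unpadded names
    return [f"{prefix}{i:0{count}d}{suffix}" for i in range(first, last + 1)]


print(expand_template("X4_wide_M1S4_2_#.cbf", 1, 2))
# ['X4_wide_M1S4_2_1.cbf', 'X4_wide_M1S4_2_2.cbf']
print(expand_template("X4_wide_M1S4_2_####.cbf", 1, 2))
# ['X4_wide_M1S4_2_0001.cbf', 'X4_wide_M1S4_2_0002.cbf']
```

With a single # the sketch can only produce the unpadded names, which do not exist for a dataset whose files are actually zero-padded to four digits.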
from dxtbx.
Rather than overloading #, I feel it might be somewhat easier to add a new syntax for templates, where zero-padding is explicit. @graeme-winter's suggestion of printf-like syntax seems appealing.
from dxtbx.
Although, reviewing the above, I think I may have incorrectly implemented @biochem-fan's suggestion. I was working on the assumption that X4_wide_M1S4_2_#.cbf should match sequential image numbers whether zero-padded or not. However, #646 (comment) states that images_#.cbf "does not match" in this case, and now I can see why.
I'll revise my branch in the light of this and see if it can be rescued.
from dxtbx.
Ok, I think it works. PR incoming.
A remaining question is whether e.g. dials.import image_*.rodhypix can be made clever enough to figure out that there is a single scan of non-zero-padded images and correctly import as such. At the moment this fails for me with
ValueError: Please report this error to [email protected]: Could not determine filename template
from dxtbx.
I fixed the ValueError by expanding the special case single-digit # template regex to match multiple digits. This enables me to import a Rigaku dataset consisting of images with non-zero-padded indices in the normal way:
$ dials.import ../Tyrosine/exp_333/frames/exp_333_1_*.rodhypix
DIALS (2018) Acta Cryst. D74, 85-97. https://doi.org/10.1107/S2059798317017235
DIALS 3.dev.1101-g0f4a43535
The following parameters have been modified:
input {
experiments = <image files>
}
--------------------------------------------------------------------------------
format: <class 'dxtbx.format.FormatROD.FormatROD'>
template: /home/fcx32934/data/Rigaku/tyrosine/Tyrosine/exp_333/frames/exp_333_1_#.rodhypix:1:560
num images: 560
sequences:
still: 0
sweep: 1
num stills: 0
--------------------------------------------------------------------------------
Writing experiments to imported.expt
@graeme-winter, the single digit template was apparently contentious when it was added (dials/dials#972) and I have extended its application. However, I can't think of any realistic circumstances where this could go wrong (perhaps due to lack of imagination). Would appreciate your comments and review at #705
from dxtbx.
NB added a commit that changes tactic slightly. Rather than extending the single-digit pattern meant for images like image1.cbf or whatever, I added a new pattern that includes the _ character, which seems to be part of the Rigaku / CAP filename template.
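As a rough idea of what such a pattern might look like, here is a sketch that derives a single-# template from one file name. The regex and the guess_template helper are assumptions made for illustration and are not the patterns used in dxtbx; the sketch requires an underscore immediately before the image number and rejects numbers with leading zeros.

```python
import re

# Hypothetical pattern: anything ending in "_", a non-padded number, then the extension
UNPADDED = re.compile(r"^(.*_)(0|[1-9][0-9]*)(\.[^.]+)$")


def guess_template(filename):
    """Hypothetical helper: return (template, image number), or None if no match."""
    m = UNPADDED.match(filename)
    if m is None:
        return None
    prefix, digits, extension = m.groups()
    return prefix + "#" + extension, int(digits)


print(guess_template("exp_333_1_12.rodhypix"))  # ('exp_333_1_#.rodhypix', 12)
print(guess_template("image_0001.cbf"))         # None (zero-padded)
print(guess_template("image1.cbf"))             # None (no underscore before the number)
```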
from dxtbx.