image-rs / image-tiff
TIFF decoding and encoding library in pure Rust
License: MIT License
This is a small compatibility hazard: the only enum variants are the currently supported or introduced variants for these two (and potentially other) fields in tiff. Adding new variants breaks user code that matches on them. It would be customary to instead use an explicit __NonExhaustive variant that is not publicly documented and explicitly should not be matched against.
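On current Rust, the `#[non_exhaustive]` attribute is the language-level alternative to a hidden `__NonExhaustive` variant. The sketch below is only an illustration: the enum name and its three variants are chosen for the example and are not taken from the crate. Downstream crates matching on such an enum are forced to include a wildcard arm, so adding a variant later does not break them.

```rust
/// Illustrative enum; the variant set is an assumption made for this example.
#[non_exhaustive]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PhotometricInterpretation {
    WhiteIsZero,
    BlackIsZero,
    RGB,
}

/// Matches the way a downstream crate would have to: named arms for the
/// variants it cares about, plus a wildcard arm for everything else.
pub fn describe(p: PhotometricInterpretation) -> &'static str {
    match p {
        PhotometricInterpretation::WhiteIsZero => "white is zero",
        PhotometricInterpretation::BlackIsZero => "black is zero",
        // Covers RGB today and any variant added in a later release.
        _ => "other",
    }
}
```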
Would you be able to add support for 32bit float Grayscale Tiffs?
Currently, decoding an image allocates a new vector to store the output data. This is unfortunate because it can result in extra copies, for instance in the image crate:
It would be better if there was an API that took a mutable slice and wrote image data into that.
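A minimal sketch of what a slice-based API could look like, using only the standard library. The function names `read_samples_into` and `read_samples` are hypothetical and the "decoding" is just a raw copy; the point is that the allocating variant can be layered on top of the slice variant, while callers that already own a buffer avoid the extra copy.

```rust
use std::io::{self, Read};

/// Hypothetical slice-based entry point: fills `out` with exactly
/// `out.len()` bytes of sample data from `reader`, without allocating.
pub fn read_samples_into<R: Read>(reader: &mut R, out: &mut [u8]) -> io::Result<()> {
    reader.read_exact(out)
}

/// Allocating convenience wrapper built on top of the slice-based call.
pub fn read_samples<R: Read>(reader: &mut R, len: usize) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; len];
    read_samples_into(reader, &mut buf)?;
    Ok(buf)
}
```

A caller such as the image crate could then pass a sub-slice of its own frame buffer and skip the intermediate vector.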
The crate currently uses num-derive to add some num-traits::FromPrimitive implementations to a couple of enumerations. Since this is done via a proc macro, it requires sequencing in the compilation pipeline, compiling proc-macro2/syn first. All cases are simple enumerations such as:
pub enum Type {
BYTE = 1,
ASCII = 2,
SHORT = 3,
LONG = 4,
RATIONAL = 5,
}
The functionality could also be added with a standard macro in such simple cases; see this example in smoltcp, which adds very similar converters for std::convert::{From, Into}.
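As a rough illustration of the declarative-macro approach, the sketch below generates a checked `from_u16` conversion alongside the enum. The macro name `tiff_enum` and the method name `from_u16` are made up for this example; this is not the smoltcp macro, only the same general idea without any proc-macro dependency.

```rust
/// Declares a C-like enum together with a checked conversion from `u16`.
macro_rules! tiff_enum {
    ($(#[$meta:meta])* pub enum $name:ident { $($variant:ident = $value:expr),+ $(,)? }) => {
        $(#[$meta])*
        pub enum $name { $($variant = $value),+ }

        impl $name {
            /// Returns the variant whose discriminant equals `n`, if any.
            pub fn from_u16(n: u16) -> Option<Self> {
                $(if n == $value { return Some($name::$variant); })+
                None
            }
        }
    };
}

tiff_enum! {
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum Type {
        BYTE = 1,
        ASCII = 2,
        SHORT = 3,
        LONG = 4,
        RATIONAL = 5,
    }
}
```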
Feed one of the attached sample files to the standard input of the following code to trigger a panic:
extern crate afl;
extern crate tiff;

use std::io::Cursor;
use tiff::decoder::Decoder;

fn main() {
    afl::read_stdio_bytes(|data| {
        let cursor = Cursor::new(&data);
        if let Ok(mut decoder) = Decoder::new(cursor) {
            // Only panics matter to the fuzzer, so the result is discarded.
            let _ = decoder.read_image();
        }
    });
}
Samples triggering the panic: tiff-oor-panics.zip
Backtrace:
thread 'main' panicked at 'index 884062630 out of range for slice of length 23707', src/libcore/slice/mod.rs:2349:5
stack backtrace:
0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
at src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
1: std::sys_common::backtrace::_print
at src/libstd/sys_common/backtrace.rs:71
2: std::panicking::default_hook::{{closure}}
at src/libstd/sys_common/backtrace.rs:59
at src/libstd/panicking.rs:211
3: std::panicking::default_hook
at src/libstd/panicking.rs:227
4: std::panicking::rust_panic_with_hook
at src/libstd/panicking.rs:491
5: std::panicking::continue_panic_fmt
at src/libstd/panicking.rs:398
6: rust_begin_unwind
at src/libstd/panicking.rs:325
7: core::panicking::panic_fmt
at src/libcore/panicking.rs:95
8: core::slice::slice_index_len_fail
at src/libcore/slice/mod.rs:2349
9: <tiff::decoder::Decoder<R>>::expand_strip
10: <tiff::decoder::Decoder<R>>::read_image
11: std::panicking::try::do_call
12: __rust_maybe_catch_panic
at src/libpanic_unwind/lib.rs:102
13: afl::read_stdio_bytes
14: std::rt::lang_start::{{closure}}
15: std::panicking::try::do_call
at src/libstd/rt.rs:59
at src/libstd/panicking.rs:310
16: __rust_maybe_catch_panic
at src/libpanic_unwind/lib.rs:102
17: std::rt::lang_start_internal
at src/libstd/panicking.rs:289
at src/libstd/panic.rs:398
at src/libstd/rt.rs:58
18: main
19: __libc_start_main
20: _start
Aborted
Found with AFL.rs. Tested on version 0.2.2
From @TomasKralCZ in #125 (comment)
One thing I'm not sure about is why @fintelia removed color inversion for signed formats. Is that non-standard behaviour ? I remember there was a comment about how libtiff interprets that in a certain way but it got removed.
#128 wasn't intended to change this behavior, but looking more closely it seems that the library wasn't being consistent prior to the change. In particular, i32 and i64 grayscale images never did inversion, while i8 and i16 would if WhiteIsZero was specified.
We should figure out which is correct behavior and then do it consistently. This comment suggests that inversion might be correct:
Lines 1037 to 1038 in 9b94e9d
At least two formats, GeoTIFF and ScanImage, use a similar technique to include additional data in TIFFs without affecting the structure of the file for conformant, pure image readers. The usual process when reading the first frame is like so:
Note that, by adjusting all offsets accordingly, it is possible to 'hide' data inside the file in the sense that conformant image decoders will simply seek over the data. In particular by choosing an initial offset greater than 8 resp. 16 some number of bytes immediately after the tiff header will be skipped. This is referred to as a Ghost Header in GeoTIFF and used to provide additional data for 'Cloud compatibility', i.e. it contains a separate index that can be used to optimize reads by transferring less of the whole file which would otherwise be a single linked list in the form of IFDs. (See also #85).
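To make the offsets concrete, here is a small sketch that parses the fixed-size file header and reports how many bytes lie between the end of the header and the first IFD. It handles the classic layout (8-byte header, version 42) and the BigTIFF layout (16-byte header, version 43). The names `HeaderInfo` and `parse_header` are hypothetical, and a non-zero gap does not by itself prove that a ghost header is present, since ordinary files may place image data there.

```rust
use std::convert::TryInto;

/// Location of the first IFD plus the size of the gap after the header.
#[derive(Debug, PartialEq, Eq)]
pub struct HeaderInfo {
    pub big_tiff: bool,
    pub first_ifd: u64,
    /// Bytes between the end of the header and the first IFD.
    pub gap_after_header: u64,
}

/// Parses a classic TIFF or BigTIFF header; returns `None` if it is malformed.
pub fn parse_header(buf: &[u8]) -> Option<HeaderInfo> {
    // "II" means little-endian, "MM" means big-endian.
    let le = match buf.get(0..2)? {
        b"II" => true,
        b"MM" => false,
        _ => return None,
    };
    let u16_at = |i: usize| -> Option<u16> {
        let b: [u8; 2] = buf.get(i..i + 2)?.try_into().ok()?;
        Some(if le { u16::from_le_bytes(b) } else { u16::from_be_bytes(b) })
    };
    match u16_at(2)? {
        42 => {
            // Classic TIFF: a 32-bit offset directly follows the version.
            let b: [u8; 4] = buf.get(4..8)?.try_into().ok()?;
            let off = u64::from(if le { u32::from_le_bytes(b) } else { u32::from_be_bytes(b) });
            Some(HeaderInfo { big_tiff: false, first_ifd: off, gap_after_header: off.checked_sub(8)? })
        }
        43 => {
            // BigTIFF: offset size (8), a reserved zero, then a 64-bit offset.
            if u16_at(4)? != 8 || u16_at(6)? != 0 {
                return None;
            }
            let b: [u8; 8] = buf.get(8..16)?.try_into().ok()?;
            let off = if le { u64::from_le_bytes(b) } else { u64::from_be_bytes(b) };
            Some(HeaderInfo { big_tiff: true, first_ifd: off, gap_after_header: off.checked_sub(16)? })
        }
        _ => None,
    }
}
```

An opt-in accessor could hand the caller exactly the `gap_after_header` bytes and leave their interpretation to format-specific code.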
It seems interesting to provide access to this data in tiff in a way that is agnostic of the usage in the variants of the tiff format and in such a way that it is purely opt-in. The main questions are:
new.
The current size constraints of TIFF images (e.g. the size of decoding_buffer_size) should be increased.
I am working with TIFF files which are mostly between 150 and 300MiB. I think it is fair to assume that most others who deal with TIFF images also deal with larger images.
(In general I would argue that the library should not have any such restriction at all (at least below the max possible limit). And that these checks should be made by clients who use this library.)
For now, I propose two possible solutions:
Right now it points to https://github.com/PistonDevelopers/image-png.git while I guess it should point to https://github.com/PistonDevelopers/image-tiff.git
I think that the current API of this crate is somewhat messy and an overhaul would be nice. I think this crate should be structured for two use cases: low-level usage of the TIFF format with direct control over tags and data, and a mid-level baseline TIFF implementation with access to properties specified by baseline TIFF (and possibly some more). The high-level use case should be covered by the image crate.
Two things that I think would improve the API a lot would be to reorganize all the small helper types so that they fit together well with both the decoder and encoder, and to clean up the decoder API.
I propose something like this for the decoding api:
impl<R> TiffDecoder<R> {
    fn new(...) {...}
    fn next_directory<'a>(&'a mut self) -> TiffResult<Option<Directory<'a>>> {...}
    fn prev_directory<'a>(&'a mut self) -> TiffResult<Option<Directory<'a>>> {...}
    fn next_image<'a>(&'a mut self) -> TiffResult<Option<Image<'a>>> {...}
    fn prev_image<'a>(&'a mut self) -> TiffResult<Option<Image<'a>>> {...}
}

impl DirectoryDecoder {
    fn read_tag(&mut self, tag: Tag) -> Entry {...}
    fn read_data<T: TiffValue>(&mut self, offset: u32, buffer: &mut T) -> TiffResult<()> {...}
}

impl ImageDecoder {
    fn width(&self) -> u32 {...}
    fn height(&self) -> u32 {...}
    fn pixel_format(&self) -> PixelFormat {...}
    fn read_strip(&mut self) -> DecodingResult {...} // ???
    fn reader<'a>(&'a mut self) -> Reader<'a> {...}
    fn resolution_unit(&mut self) -> ResolutionUnit {...}
    fn x_resolution(&self) -> Rational {...}
    fn y_resolution(&self) -> Rational {...}
    // Other baseline tiff properties ...
}
I'm looking for a way to retrieve tag data as &[u8] so I can parse it myself.
For my use case, I need to decode TIFFs which contain 20+ channels. Currently, image-tiff only supports specific ColorTypes, although a previous (and slower) version of the crate did support ColorType::Other. I have an extremely messy patch at https://github.com/Masterchef365/image-tiff/tree/more_channels which works for my purposes, but other users of this crate might benefit from a more refined approach.
https://github.com/pdf-rs/fax
It looks like it can be possible to implement fax/group 4 compression.
Decoding these samples with Decoder.read_image() causes 100% CPU usage. I ran it for 10 minutes before giving up. The code is likely entering an infinite loop.
The exact reproduction code can be found in #28. Found via AFL.rs, tested on image-tiff version 0.2.2.
I am working on image-rs/image#858, but I am not entirely confident that my modified decoder is correct. Some more thorough test that would check if the decoded data is correct would be nice. There should also be tests for things like horizontal predictor.
I'm trying to write a DNG encoder based on this code. I expect to run into several places where the library will need to be made more flexible. The first of these is storing the possible tag values in an enum, as it doesn't appear possible to add additional Tag values (such as those appearing in the TIFF/EP or DNG standards).
Suggestions for an approach are appreciated; otherwise I'll find something that works and submit it in a merge request.
Thanks!
With some LZW compressed TIFF images, I am getting a "Format error: Inconsistent sizes encountered." in the expand_strip(...) method of decoder/mod.rs.
I uploaded 3 example files that show this problem here (~630MB).
The last time I tried to debug this, the only thing I noticed was that in LZW::new(...) the uncompressed.len() is larger than the buffer size. And iirc the len() was the image's width or length.
Anyways, for the three images I printed the bytes, buffer.byte_length() and the max_uncompressed_length right before the error is returned in expand_strip(...):
crater1_lzw.tif: bytes: 23714, buffer_len: 1, max_uncompressed_length: 23710
crater2_lzw.tif: bytes: 23714, buffer_len: 1, max_uncompressed_length: 23710
geotiff2.tif: bytes: 7710, buffer_len: 1, max_uncompressed_length: 2570
About the images: The first two are from the CTX data from the Mars Reconnaissance Orbiter. They should be public domain but I am not 100% sure.
The geotiff2.tif file is taken from the sample files of the geotiff library (and as far as I can tell the official geotiff organization) here.
One last thing about these samples, if you think that these provide perfect examples of geotiff, check their readme, specifically: "Just because a file is in this tree does not imply that it follows the standard correctly [...]"
warning: unused `std::result::Result` which must be used
--> src/decoder/mod.rs:276:9
|
276 | self.reader.read_to_string(&mut out);
Currently, the values of the tags "Compression", "SamplesPerPixel" and "SampleFormat" are not reset to the default values specified in the TIFF specification when "next_image" is called.
Is that "persistence across pages" correct? Unfortunately, I am not an expert in the TIFF format. However, it should at least be documented, because it could quickly lead to strange errors. I'm willing to contribute a pull request.
It's not quite clear how much we want to move into the main library as a feature or if it should live in a separate crate. The split could be similar to gif/gifski, where the former gives access to the format while the latter interprets and composes the individual parts with a high-level interface. In any case, GeoTIFF names a few custom tags which we definitely want to support:
// GeoTiff
pub const MODELPIXELSCALE: Tag = Tag::Unknown(33550);
pub const MODELTIEPOINT: Tag = Tag::Unknown(33922);
pub const MODELTRANSFORMATIONTAG: Tag = Tag::Unknown(34264);
pub const GEOKEYDIRECTORYTAG: Tag = Tag::Unknown(34735);
pub const GEODOUBLEPARAMSTAG: Tag = Tag::Unknown(34736);
pub const GEOASCIIPARAMSTAG: Tag = Tag::Unknown(34737);
// GDAL
pub const GDALMETADATA: Tag = Tag::Unknown(42112);
pub const GDALNODATA: Tag = Tag::Unknown(42113);
@Farkal Further discussion on the GeoTIFF specific problems here. I've hidden our comment chain in the other issue.
The specification says the following:
2 = ASCII 8-bit byte that contains a 7-bit ASCII code; the last byte must be NUL (binary zero)
We instead validate that the string is valid UTF-8, a superset of the allowed values.
Lines 478 to 480 in 679f4f0
This may not seem problematic at first glance but callers might assume that the string can be sliced at any byte. This would panic at runtime if the index turned out not to be a utf-8 character boundary. If values were validated to be truly ascii then all byte boundaries are valid character boundaries.
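A strict check along the lines of the quoted rule could look like the sketch below. The function name `parse_ascii_field` is hypothetical and this is not the crate's current code; it only uses `slice::is_ascii` and `std::str::from_utf8` from the standard library. Because every accepted byte is 7-bit ASCII, any byte index into the returned string is a valid character boundary.

```rust
/// Validates a TIFF ASCII field: only 7-bit bytes, terminated by a NUL.
/// Returns the text without the trailing NUL, or `None` if the rule is violated.
pub fn parse_ascii_field(bytes: &[u8]) -> Option<&str> {
    // The last byte must be the NUL terminator.
    let (last, text) = bytes.split_last()?;
    if *last != 0 || !text.is_ascii() {
        return None;
    }
    // ASCII is a subset of UTF-8, so this conversion cannot fail here.
    std::str::from_utf8(text).ok()
}
```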
Some JPEG-in-TIFF images aren't correctly decoded.
For example: (tested with Emulsion 8.0.0)
Output from tiffdump:
stripped-jpeg.tif:
Magic: 0x4949 <little-endian> Version: 0x2a <ClassicTIFF>
Directory 0: offset 94702 (0x171ee) next 0 (0)
ImageWidth (256) SHORT (3) 1<365>
ImageLength (257) SHORT (3) 1<547>
BitsPerSample (258) SHORT (3) 3<8 8 8>
Compression (259) SHORT (3) 1<7>
Photometric (262) SHORT (3) 1<2>
FillOrder (266) SHORT (3) 1<1>
StripOffsets (273) LONG (4) 1<8>
Orientation (274) SHORT (3) 1<1>
SamplesPerPixel (277) SHORT (3) 1<3>
RowsPerStrip (278) SHORT (3) 1<547>
StripByteCounts (279) LONG (4) 1<94694>
XResolution (282) RATIONAL (5) 1<72>
YResolution (283) RATIONAL (5) 1<72>
PlanarConfig (284) SHORT (3) 1<1>
ResolutionUnit (296) SHORT (3) 1<2>
PageNumber (297) SHORT (3) 2<0 1>
Whitepoint (318) RATIONAL (5) 2<0.3127 0.329>
PrimaryChromaticities (319) RATIONAL (5) 6<0.64 0.33 0.3 0.6 0.15 0.06>
JPEGTables (347) UNDEFINED (7) 289<0xff 0xd8 0xff 0xdb 00 0x43 00 0x8 0x6 0x6 0x7 0x6 0x5 0x8 0x7 0x7 0x7 0x9 0x9 0x8 0xa 0xc 0x14 0xd ...>
ICC Profile (34675) UNDEFINED (7) 3144<00 00 0xc 0x48 0x4c 0x69 0x6e 0x6f 0x2 0x10 00 00 0x6d 0x6e 0x74 0x72 0x52 0x47 0x42 0x20 0x58 0x59 0x5a 0x20 ...>
The same error appeared while I was trying to decode tile-based images.
With #32 applied, which fixed the early panics, the fuzzer could get further into the decoding code. Here are some samples discovered by AFL that trigger a panic: divide-by-zero.zip
Backtrace:
thread 'main' panicked at 'attempt to divide by zero', /home/shnatsel/Code/image-tiff/src/decoder/mod.rs:539:12
stack backtrace:
0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
at src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
1: std::sys_common::backtrace::_print
at src/libstd/sys_common/backtrace.rs:71
2: std::panicking::default_hook::{{closure}}
at src/libstd/sys_common/backtrace.rs:59
at src/libstd/panicking.rs:211
3: std::panicking::default_hook
at src/libstd/panicking.rs:227
4: std::panicking::rust_panic_with_hook
at src/libstd/panicking.rs:491
5: std::panicking::continue_panic_fmt
at src/libstd/panicking.rs:398
6: rust_begin_unwind
at src/libstd/panicking.rs:325
7: core::panicking::panic_fmt
at src/libcore/panicking.rs:95
8: core::panicking::panic
at src/libcore/panicking.rs:59
9: <tiff::decoder::Decoder<R>>::read_image
10: std::panicking::try::do_call
11: __rust_maybe_catch_panic
at src/libpanic_unwind/lib.rs:102
12: afl::read_stdio_bytes
13: std::rt::lang_start::{{closure}}
14: std::panicking::try::do_call
at src/libstd/rt.rs:59
at src/libstd/panicking.rs:310
15: __rust_maybe_catch_panic
at src/libpanic_unwind/lib.rs:102
16: std::rt::lang_start_internal
at src/libstd/panicking.rs:289
at src/libstd/panic.rs:398
at src/libstd/rt.rs:58
17: main
18: __libc_start_main
19: _start
Steps to reproduce are the same as in #28
The function decoder.next_image() is under-documented.
First off, it is not clear how this function relates to read_image() and what the difference between them is.
Second, the naive approach to using it does not compile:
fn decode(data: &[u8]) {
    let cursor = Cursor::new(data);
    if let Ok(mut decoder) = Decoder::new(cursor) {
        while decoder.more_images() {
            decoder.next_image();
        }
    }
}
This fails with the following error:
error[E0382]: borrow of moved value: `decoder`
--> src/bin/tiff-afl-multi.rs:11:19
|
11 | while decoder.more_images() {
| ^^^^^^^ value borrowed here after move
12 | decoder.next_image();
| ------- value moved here, in previous iteration of loop
|
= note: move occurs because `decoder` has type `tiff::decoder::Decoder<std::io::Cursor<&[u8]>>`, which does not implement the `Copy` trait
A code example that uses it correctly would be appreciated.
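The error message indicates that `next_image` takes the decoder by value. Assuming it also hands the decoder back on success (an assumption, since the signature is not shown here), the loop has to rebind the variable on every iteration. The sketch below demonstrates that pattern with a hypothetical stand-in type, `PageCursor`, rather than the real `Decoder`.

```rust
/// Hypothetical stand-in for a decoder whose `next_image` consumes `self`.
pub struct PageCursor {
    page: usize,
    pages: usize,
}

impl PageCursor {
    pub fn new(pages: usize) -> Self {
        PageCursor { page: 0, pages }
    }

    pub fn more_images(&self) -> bool {
        self.page + 1 < self.pages
    }

    /// Consumes the cursor and returns it positioned on the next page.
    pub fn next_image(mut self) -> Result<Self, String> {
        if !self.more_images() {
            return Err("no more images".to_string());
        }
        self.page += 1;
        Ok(self)
    }
}

/// Walks all pages, rebinding `cursor` after each consuming call.
pub fn count_pages(mut cursor: PageCursor) -> Result<usize, String> {
    let mut seen = 1;
    while cursor.more_images() {
        // Moving out of `cursor` is fine because it is reassigned immediately.
        cursor = cursor.next_image()?;
        seen += 1;
    }
    Ok(seen)
}
```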
Hi,
using what I think is the same TIFF file in a U8 and a U16 variant, the U16 variant throws an InconsistentSizesEncountered error while the U8 variant loads fine.
$ tiffinfo ASTGTMV003_N11E028_dem_strip.tif
TIFF Directory at offset 0xcae732 (13297458)
Image Width: 3601 Image Length: 3601
Bits/Sample: 16
Sample Format: signed integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 1
Planar Configuration: single image plane
Predictor: none 1 (0x1)
TIFF Directory at offset 0xfe96d6 (16684758)
Subfile Type: reduced-resolution image (1 = 0x1)
Image Width: 1801 Image Length: 1801
Bits/Sample: 16
Sample Format: signed integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 2
Planar Configuration: single image plane
Predictor: none 1 (0x1)
TIFF Directory at offset 0x1157b9a (18185114)
Subfile Type: reduced-resolution image (1 = 0x1)
Image Width: 1201 Image Length: 1201
Bits/Sample: 16
Sample Format: signed integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 3
Planar Configuration: single image plane
Predictor: none 1 (0x1)
TIFF Directory at offset 0x1224a88 (19024520)
Subfile Type: reduced-resolution image (1 = 0x1)
Image Width: 901 Image Length: 901
Bits/Sample: 16
Sample Format: signed integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 4
Planar Configuration: single image plane
Predictor: none 1 (0x1)
TIFF Directory at offset 0x12570e8 (19230952)
Subfile Type: reduced-resolution image (1 = 0x1)
Image Width: 451 Image Length: 451
Bits/Sample: 16
Sample Format: signed integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 9
Planar Configuration: single image plane
Predictor: none 1 (0x1)
TIFF Directory at offset 0x127ee7c (19394172)
Subfile Type: reduced-resolution image (1 = 0x1)
Image Width: 401 Image Length: 401
Bits/Sample: 16
Sample Format: signed integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 10
Planar Configuration: single image plane
Predictor: none 1 (0x1)
TIFF Directory at offset 0x128c038 (19447864)
Subfile Type: reduced-resolution image (1 = 0x1)
Image Width: 226 Image Length: 226
Bits/Sample: 16
Sample Format: signed integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 18
Planar Configuration: single image plane
Predictor: none 1 (0x1)
TIFF Directory at offset 0x128d100 (19452160)
Subfile Type: reduced-resolution image (1 = 0x1)
Image Width: 57 Image Length: 57
Bits/Sample: 16
Sample Format: signed integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 57
Planar Configuration: single image plane
Predictor: none 1 (0x1)
TIFF Directory at offset 0x128dbde (19454942)
Subfile Type: reduced-resolution image (1 = 0x1)
Image Width: 45 Image Length: 45
Bits/Sample: 16
Sample Format: signed integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 45
Planar Configuration: single image plane
Predictor: none 1 (0x1)
$ tiffinfo ASTGTMV003_N11E028_num_strip.tif
TIFF Directory at offset 0x2364b2 (2319538)
Image Width: 3601 Image Length: 3601
Bits/Sample: 8
Sample Format: unsigned integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 2
Planar Configuration: single image plane
Predictor: none 1 (0x1)
TIFF Directory at offset 0x2cd8a0 (2939040)
Subfile Type: reduced-resolution image (1 = 0x1)
Image Width: 1801 Image Length: 1801
Bits/Sample: 8
Sample Format: unsigned integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 4
Planar Configuration: single image plane
Predictor: none 1 (0x1)
TIFF Directory at offset 0x30ee2c (3206700)
Subfile Type: reduced-resolution image (1 = 0x1)
Image Width: 1201 Image Length: 1201
Bits/Sample: 8
Sample Format: unsigned integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 6
Planar Configuration: single image plane
Predictor: none 1 (0x1)
TIFF Directory at offset 0x33239c (3351452)
Subfile Type: reduced-resolution image (1 = 0x1)
Image Width: 901 Image Length: 901
Bits/Sample: 8
Sample Format: unsigned integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 9
Planar Configuration: single image plane
Predictor: none 1 (0x1)
TIFF Directory at offset 0x33b6e2 (3389154)
Subfile Type: reduced-resolution image (1 = 0x1)
Image Width: 451 Image Length: 451
Bits/Sample: 8
Sample Format: unsigned integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 18
Planar Configuration: single image plane
Predictor: none 1 (0x1)
TIFF Directory at offset 0x342c28 (3419176)
Subfile Type: reduced-resolution image (1 = 0x1)
Image Width: 401 Image Length: 401
Bits/Sample: 8
Sample Format: unsigned integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 20
Planar Configuration: single image plane
Predictor: none 1 (0x1)
TIFF Directory at offset 0x34570c (3430156)
Subfile Type: reduced-resolution image (1 = 0x1)
Image Width: 226 Image Length: 226
Bits/Sample: 8
Sample Format: unsigned integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 36
Planar Configuration: single image plane
Predictor: none 1 (0x1)
TIFF Directory at offset 0x345c00 (3431424)
Subfile Type: reduced-resolution image (1 = 0x1)
Image Width: 57 Image Length: 57
Bits/Sample: 8
Sample Format: unsigned integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 57
Planar Configuration: single image plane
Predictor: none 1 (0x1)
TIFF Directory at offset 0x345f82 (3432322)
Subfile Type: reduced-resolution image (1 = 0x1)
Image Width: 45 Image Length: 45
Bits/Sample: 8
Sample Format: unsigned integer
Compression Scheme: LZW
Photometric Interpretation: min-is-black
Orientation: row 0 top, col 0 lhs
Samples/Pixel: 1
Rows/Strip: 45
Planar Configuration: single image plane
Predictor: none 1 (0x1)
The error does not appear if the compression is removed using tiffcp:
tiffcp -c none -s ASTGTMV003_N11E028_dem_strip.tif ~/Downloads/ASTGTMV003_N11E028_dem_strip_nocomp.tif
Thus, the compression is likely the issue here.
How can I help resolve this?
Regards,
We should use the test cases pack provided by libtiff; the newest version (3.8) can be obtained here: http://simplesystems.org/libtiff/images.html (~5.5 MB unpacked).
I've tried to upgrade the upstream image crate to tiff 0.7.0, and it uncovered a regression bug: the upstream image mandrill.tiff used to decode with 0.6.0, but fails with 0.7.0 with:
Cannot create decoder: FormatError(Format("Neither strips nor tiles were found or both were used in the same file"))
I verified that this problem is reproducible with just image-tiff too, so it's not an issue with integration in the image crate itself. It's also an uncompressed TIFF, so it doesn't seem related to my deflate PR (#132) either.
Beyond that, I'm not that familiar with changes between releases, so leaving it up to others to debug this further.
It seems like all of the encoding code works for strip-based tiffs and there is currently no means of writing tile-based images.
I work with a lot of floating point TIFFs and would like to read them using this library. I'm currently trying to implement it myself in this library, but I'm new to Rust and not too familiar with image decoding, so I'm not making much progress so far.
Maybe this issue can serve as a central place to discuss what needs to be done to support it?
Forked from #70.
The horizontal predictor for floating points does not use float differences, as those would be lossy, but apparently uses an integer difference. Recheck this with the standard.
There is a dedicated floating point predictor which first re-orders some bytes: http://chriscox.org/TIFFTN3d1.pdf
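For reference, integer horizontal differencing is lossless because it uses wrapping arithmetic on the raw sample bits. The sketch below shows it for one row of 16-bit samples; the function names are hypothetical, and whether and how this applies to floating point samples is exactly what the issue asks to recheck against the standard. The byte reordering of the dedicated floating point predictor is not shown.

```rust
/// Encodes one row in place: each sample becomes the wrapping difference
/// to the previous sample of the same channel.
pub fn apply_horizontal_predictor_u16(row: &mut [u16], samples_per_pixel: usize) {
    // Walk backwards so that each difference uses the original neighbour.
    for i in (samples_per_pixel..row.len()).rev() {
        row[i] = row[i].wrapping_sub(row[i - samples_per_pixel]);
    }
}

/// Decodes one row in place by accumulating the differences again.
pub fn undo_horizontal_predictor_u16(row: &mut [u16], samples_per_pixel: usize) {
    for i in samples_per_pixel..row.len() {
        row[i] = row[i].wrapping_add(row[i - samples_per_pixel]);
    }
}
```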
In ImageEncoder::new() the strip size of the image data in the TIFF is limited to 8kB, with the remark that this is per the TIFF spec. But the TIFF spec of 1992(!!) only recommends 8kB as a mitigation for out-of-memory errors. As time has passed, 8kB has become ridiculously small.
I bumped into this when I faced performance issues in the Python Pillow library. We have an application retrieving rectangular parts from large images. I noticed that reading images generated via the image-tiff crate was ten times slower than reading images created by Pillow itself. It turned out that Pillow stored the image as a single strip, while image-tiff stored one line per strip.
I would like to see an option to store the image in a single strip, configure the max strip size or even get rid of the 8kB strip limit.
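One way to make this configurable is to derive the rows-per-strip value from a caller-supplied byte budget. The sketch below is only an illustration of that arithmetic; the function name and parameters are hypothetical and do not exist in the crate. With an 8kB budget a wide RGB row already exceeds the budget, which yields the one-line-per-strip layout described above, while a very large budget yields a single strip.

```rust
/// Chooses how many rows go into each strip so that a strip stays at or
/// below `target_strip_bytes`, but always contains at least one row.
pub fn rows_per_strip(width: u32, height: u32, bytes_per_pixel: u32, target_strip_bytes: u64) -> u32 {
    // Guard against a zero-sized row to avoid dividing by zero.
    let row_bytes = (u64::from(width) * u64::from(bytes_per_pixel)).max(1);
    let rows = (target_strip_bytes / row_bytes).max(1);
    // Never use more rows than the image has; the result fits in u32.
    rows.min(u64::from(height.max(1))) as u32
}
```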
https://github.com/image-rs/image-tiff/blob/master/src/encoder/mod.rs#L625
Once a solution is agreed on, I'll implement it myself.
It would be nice of you to release a new version with num-derive 0.3 so that downstream users can gradually get rid of syn 0.15 in their crate graph.
After updating from 0.5.0 to 0.6.0 I got a pattern matching error. It appears the tag XResolution in the decoder now gives a List, while before it did not:
decoder.find_tag(Tag::XResolution)?
Returns in 0.6.0:
Some(List([Rational(250000000, 250000000)]))
In 0.5.0 it returns:
Some(Rational(250000000, 250000000))
Is this intentional?
Currently only unsigned integers and floating point samples are handled. Signed integers used to be silently treated as unsigned, but now trigger an error.
Currently the tests fail with an index out of bounds error. This is probably a bug introduced in #26 that could only be discovered with the tests from #23.
running 1 test
test decoder::stream::test::test_packbits ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Running target/debug/deps/decode_images-564f01fb7ded3313
running 6 tests
test test_gray_u8 ... ok
test test_rgb_u8 ... ok
test test_string_tags ... ok
test test_gray_u16 ... ok
test test_decode_data ... ok
test test_rgb_u16 ... ok
test result: ok. 6 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Running target/debug/deps/encode_images-b9c0207f836596cd
running 5 tests
test test_gray_u8_roundtrip ... ok
test test_rgb_u8_roundtrip ... ok
test test_gray_u16_roundtrip ... ok
test encode_decode ... ok
test test_rgb_u16_roundtrip ... FAILED
failures:
---- test_rgb_u16_roundtrip stdout ----
thread 'test_rgb_u16_roundtrip' panicked at 'index out of bounds: the len is 19 but the index is 19', /rustc/9fda7c2237db910e41d6a712e9a2139b352e558b/src/libcore/slice/mod.rs:2463:10
note: Run with `RUST_BACKTRACE=1` for a backtrace.
failures:
test_rgb_u16_roundtrip
test result: FAILED. 4 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out
error: test failed, to rerun pass '--test encode_images'
It looks like at the moment the tiff crate can't read a large number of valid tiff files, specifically any geo data that I've found. I've attached a small sample file since it's unclear to me how to fix this at the moment without going deep into the tiff file format (or the image-tiff codebase).
The following code causes a panic in the encoder:
let input_data = vec![1u8, 2, 3];
let output = Vec::new();
let mut output_stream = Cursor::new(output);
if let Ok(mut tiff) = TiffEncoder::new(&mut output_stream) {
    // The buffer holds 3 samples, but a 50x50 RGB8 image needs 7500.
    let _ = tiff.write_image::<colortype::RGB8>(50, 50, &input_data);
}
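A length check before any indexing would turn this panic into an error. The sketch below shows the arithmetic with hypothetical helper names; it is not the encoder's actual code. For a 50x50 image with 3 samples per pixel it expects 7500 samples, the same number that appears in the panic message.

```rust
use std::convert::TryFrom;

/// Number of samples a `width` x `height` image needs, or `None` on overflow.
pub fn expected_len(width: u32, height: u32, samples_per_pixel: u32) -> Option<usize> {
    let n = u64::from(width)
        .checked_mul(u64::from(height))?
        .checked_mul(u64::from(samples_per_pixel))?;
    usize::try_from(n).ok()
}

/// Rejects buffers whose length does not match the image dimensions.
pub fn check_image_buffer(width: u32, height: u32, samples_per_pixel: u32, data: &[u8]) -> Result<(), String> {
    match expected_len(width, height, samples_per_pixel) {
        Some(n) if n == data.len() => Ok(()),
        Some(n) => Err(format!("expected {} samples, got {}", n, data.len())),
        None => Err("image dimensions overflow".to_string()),
    }
}
```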
Error message:
thread 'main' panicked at 'index 7500 out of range for slice of length 173', src/libcore/slice/mod.rs:2349:5
I'm trying to use the image crate to read TIFF images produced (compressed) by ImageMagick with -compress zip. My understanding is that the image crate relies on this one to do the actual reading, so I'm raising the issue here.
This is what Exiftool reports on those images:
> exiftool -compression pleiades4.tif
Compression : Adobe Deflate
The README suggests that decoding Deflate TIFFs should be supported:
However, when trying to read them using image::open, I'm getting:
The decoder for Tiff does not support the format features Compression method Deflate is unsupported
Am I holding it wrong? Or is there something specific to this "Adobe Deflate" that ImageMagick uses that makes it different from the Deflate supported by this crate?
Decoding any of the attached files triggers a crash with the following error message: memory allocation of 136902082592 bytes failedAborted
The exact reproduction code can be found in #28. Found via AFL.rs, tested on image-tiff version 0.2.2.
Most decoding libraries face this issue at some point. This is usually solved by limiting the amount of allocated memory to some sane default, and letting people override it if they're really dealing with enormous amounts of data. In Rust we can easily allow the API user to override these limits via the builder pattern.
See https://libpng.sourceforge.io/decompression_bombs.html for more info on how a similar issue was solved in libpng. See also the Limits struct from flif crate.
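The builder-style override mentioned above could be sketched as follows. The struct name `Limits` echoes the flif example, but the field, the 256 MiB default and the `check` method are assumptions made for this illustration, not an existing API of this crate.

```rust
/// Hypothetical decoding limits with a conservative default.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Limits {
    max_bytes: u64,
}

impl Default for Limits {
    fn default() -> Self {
        // Assumed default of 256 MiB; a real library would pick its own value.
        Limits { max_bytes: 256 * 1024 * 1024 }
    }
}

impl Limits {
    /// Builder-style override of the allocation budget.
    pub fn max_decoding_bytes(mut self, bytes: u64) -> Self {
        self.max_bytes = bytes;
        self
    }

    /// Returns the buffer size needed, or an error if it exceeds the budget.
    pub fn check(&self, width: u32, height: u32, bytes_per_pixel: u32) -> Result<u64, String> {
        let needed = u64::from(width)
            .checked_mul(u64::from(height))
            .and_then(|n| n.checked_mul(u64::from(bytes_per_pixel)))
            .ok_or_else(|| "image size overflows u64".to_string())?;
        if needed > self.max_bytes {
            Err(format!("{} bytes needed, limit is {}", needed, self.max_bytes))
        } else {
            Ok(needed)
        }
    }
}
```

Checking the budget before allocating means a malformed header yields an error instead of an aborted process.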
Somehow the attached image does not decode correctly; it only reads half of the row width.
2_1.zip
Can we get a new release with this, also for image? image itself is already using jpeg-decoder 0.2 and this is showing as a duplicate dependency for us.
Originally posted by @MarijnS95 in #155 (comment)
I noticed that CompressionMethod::JPEG has the value 6. According to Wikipedia, this is actually the "obsolete 'old-style' JPEG" that "... should never be written". For consistency with Deflate/OldDeflate, should we consider renaming this to OldJPEG?
I realize this is a breaking change, but since it seems like there is some work being done around this area it may be a good opportunity to standardize the names.
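For orientation, the sketch below maps the commonly used values of the Compression tag to names that follow the Deflate/OldDeflate convention. The enum and its variant names are hypothetical and are not the crate's `CompressionMethod`; the numeric codes are the standard tag values (6 for old-style JPEG, 7 for JPEG, 8 for Adobe Deflate, 32946 for the legacy Deflate code).

```rust
/// Hypothetical naming scheme for values of the TIFF Compression tag (259).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Compression {
    Uncompressed,
    Lzw,
    OldJpeg,
    Jpeg,
    Deflate,
    OldDeflate,
    PackBits,
    Unknown(u16),
}

impl Compression {
    /// Maps a raw tag value to a named compression scheme.
    pub fn from_code(code: u16) -> Self {
        match code {
            1 => Compression::Uncompressed,
            5 => Compression::Lzw,
            6 => Compression::OldJpeg,
            7 => Compression::Jpeg,
            8 => Compression::Deflate,
            32773 => Compression::PackBits,
            32946 => Compression::OldDeflate,
            other => Compression::Unknown(other),
        }
    }
}
```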
Using a * forces downstream users to be unable to use newer versions of crates. It is usually suggested to avoid a version operator at all. Currently, this crate has:
Instead, I would expect this to be written as:
byteorder = "1.2"
lzw = "0.10"
num-derive = "0.2"
num-traits = "0.2"
This would allow for byteorder 1.2.x and 1.3.x, a semver-compatible version. It would not allow num-derive 0.3 (which is not semver-compatible with 0.2).
See the Cargo doc for details on the operators.
The playground uses the newest version of crates. The image crate is popular enough to be included on the playground and it depends on tiff. This version restriction makes it so that tiff cannot be installed at the same time as byteorder 1.3. Until this issue is resolved, I will be preventing image (and thus tiff) from being available in the playground.
Thanks for listening!
Presumably using e.g. flate2 or similar should enable deflate support.
The CompressionMethod should come out as 32946.
I'm opening this issue to discuss the best way to achieve cloud compatibility.
What does it mean? Simply that we should be able to decode the header step by step. Because the IFD can be anywhere in the file, we should be able to decode IFDs one by one and set the data to decode each time.
Example:
If there were a way to dynamically add data to the cursor and fake the position, we should already be able to have something working, but I didn't find any way to do that. I'll wait for your advice about this use case before making a PR.
The unwrap in decoder/image.rs:L348 (image-tiff/src/decoder/image.rs, line 348 in 802949f)
I can provide a test file, if you need to test it. I would prefer to email or message it privately.
Regards,
Micah
Am I correct that there is currently no way to read (private) IFD tags like "GpsIFD"? However, the necessary parsing method is already implemented and even the offset of the tag is available (tiff::decoder::ifd::Value::Ifd(offset)).
By extending read_ifd, I would offer to add the missing function pub fn get_tag_ifd(&mut self, tag: Tag) -> TiffResult<tiff::decoder::ifd::Directory>.
Are there any opinions or criticism regarding that plan?
Hi,
I was considering contributing BigTiff support (decoding, for now) to the library. Since I'm nowhere near a Rust expert, I'm still debating with myself on the implementation, since the BigTiff spec (https://www.awaresystems.be/imaging/tiff/bigtiff.html) differs from the Tiff spec in several key ways.
For example, the third byte is always 43, as opposed to Tiff's always 42. This might limit the reusability of the current Decoder, forcing me to rewrite large portions of it, essentially creating a BigTiffRecorder. On the other hand, the user shouldn't care whether the Tiff is a BigTiff or a normal one; they just want it decoded.
How would you suggest I tackle this?