image-rs / image-tiff

TIFF decoding and encoding library in pure Rust

License: MIT License

Rust 100.00%

image-tiff's Issues

`CompressionMethod` and `PhotometricInterpretation` should be NonExhaustive

This is a small compatibility hazard: the only enum variants are the currently supported or introduced variants for these two (and potentially other) fields in TIFF. Adding new variants breaks user code that matches on them. It would be customary to instead use an explicit __NonExhaustive variant that is not publicly documented and explicitly should not be matched against.
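
A minimal sketch of the hidden-variant convention described above. The variant list is made up for illustration and is not the crate's actual enum; only the `__NonExhaustive` name comes from the issue text.

```rust
/// Hypothetical enum for illustration; the variants are not the crate's.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CompressionMethod {
    None,
    Lzw,
    PackBits,
    /// Hidden marker variant: downstream code should never name it.
    #[doc(hidden)]
    __NonExhaustive,
}

/// Downstream code keeps a wildcard arm by convention, so adding a new
/// variant to the enum later does not break this match.
pub fn describe(method: CompressionMethod) -> &'static str {
    match method {
        CompressionMethod::None => "uncompressed",
        CompressionMethod::Lzw => "lzw",
        _ => "other",
    }
}
```

On Rust 1.40 and later, the `#[non_exhaustive]` attribute on the enum achieves the same effect and is enforced by the compiler for downstream crates.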

Feature request: 32 bits per sample

Would you be able to add support for 32bit float Grayscale Tiffs?

Support in-place decoding

Currently, decoding an image allocates a new vector to store the output data. This is unfortunate because it can result in extra copies, for instance in the image crate:

https://github.com/image-rs/image/blob/6e0cd31a5287dd589d2e78ae33c1f720c77a6863/src/codecs/tiff.rs#L198

It would be better if there was an API that took a mutable slice and wrote image data into that.

`num-derive` dependency worth it?

The crate currently uses num-derive to add num-traits::FromPrimitive implementations to a couple of enumerations. Because this is done via a proc macro, it forces sequencing in the compilation pipeline: proc-macro2 and syn have to be compiled first. All cases are simple enumerations such as:

pub enum Type {
    BYTE = 1,
    ASCII = 2,
    SHORT = 3,
    LONG = 4,
    RATIONAL = 5,
}

The functionality could also be added with a standard macro in such simple cases; see this example in smoltcp, which adds very similar converters for std::convert::{From, Into}.
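
As a rough sketch of what such a declarative macro could look like: the macro name `enum_from_primitive!` and the method name `from_primitive` are invented here, and this is not the smoltcp macro itself.

```rust
/// Declares a C-like enum together with a fallible conversion from its
/// integer representation, without any procedural macro.
macro_rules! enum_from_primitive {
    (
        $(#[$meta:meta])*
        pub enum $name:ident : $repr:ty {
            $( $variant:ident = $value:expr ),+ $(,)?
        }
    ) => {
        $(#[$meta])*
        pub enum $name {
            $( $variant = $value ),+
        }

        impl $name {
            /// Returns `None` for values that match no variant.
            pub fn from_primitive(value: $repr) -> Option<Self> {
                match value {
                    $( v if v == $value => Some($name::$variant), )+
                    _ => None,
                }
            }
        }
    };
}

enum_from_primitive! {
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum Type: u16 {
        BYTE = 1,
        ASCII = 2,
        SHORT = 3,
        LONG = 4,
        RATIONAL = 5,
    }
}
```

With this in place, `Type::from_primitive(3)` yields `Some(Type::SHORT)` and unknown values yield `None`, with no dependency on syn or proc-macro2.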

Panic on malformed input: Index out of range in Decoder::expand_strip

Feed one of the attached sample files to the standard input of the following code to trigger a panic:

extern crate afl;
extern crate tiff;

use tiff::decoder::{Decoder};
use std::io::Cursor;

fn main() {
    afl::read_stdio_bytes(|data| {
        let cursor = Cursor::new(&data);
        if let Ok(mut decoder) = Decoder::new(cursor) {
            decoder.read_image();
        }
    });
}

Samples triggering the panic: tiff-oor-panics.zip

Backtrace:

thread 'main' panicked at 'index 884062630 out of range for slice of length 23707', src/libcore/slice/mod.rs:2349:5
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
             at src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
   1: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:71
   2: std::panicking::default_hook::{{closure}}
             at src/libstd/sys_common/backtrace.rs:59
             at src/libstd/panicking.rs:211
   3: std::panicking::default_hook
             at src/libstd/panicking.rs:227
   4: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:491
   5: std::panicking::continue_panic_fmt
             at src/libstd/panicking.rs:398
   6: rust_begin_unwind
             at src/libstd/panicking.rs:325
   7: core::panicking::panic_fmt
             at src/libcore/panicking.rs:95
   8: core::slice::slice_index_len_fail
             at src/libcore/slice/mod.rs:2349
   9: <tiff::decoder::Decoder<R>>::expand_strip
  10: <tiff::decoder::Decoder<R>>::read_image
  11: std::panicking::try::do_call
  12: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:102
  13: afl::read_stdio_bytes
  14: std::rt::lang_start::{{closure}}
  15: std::panicking::try::do_call
             at src/libstd/rt.rs:59
             at src/libstd/panicking.rs:310
  16: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:102
  17: std::rt::lang_start_internal
             at src/libstd/panicking.rs:289
             at src/libstd/panic.rs:398
             at src/libstd/rt.rs:58
  18: main
  19: __libc_start_main
  20: _start
Aborted

Found with AFL.rs. Tested on version 0.2.2

Color inversion for signed gray images

From @TomasKralCZ in #125 (comment)

One thing I'm not sure about is why @fintelia removed color inversion for signed formats. Is that non-standard behaviour ? I remember there was a comment about how libtiff interprets that in a certain way but it got removed.

#128 wasn't intended to change this behavior, but looking more closely it seems that the library wasn't being consistent prior to the change. In particular, i32 and i64 grayscale images never did inversion, while i8 and i16 would if WhiteIsZero was specified.

We should figure out which is correct behavior and then do it consistently. This comment suggests that inversion might be correct:

image-tiff/src/decoder/mod.rs

Lines 1037 to 1038 in 9b94e9d

 // The following conversions interpret the image as in libtiff. 

 // In particular, MIN is white and MAX is black and not Zero as the name would imply.

Ghost header, or: data outside IFDs and images

At least two formats, GeoTIFF and ScanImage, use a similar technique to include additional data in TIFF files without affecting the structure of the file for conformant, pure image readers. The usual process when reading the first frame is like so:

Read the header, 8 bytes or 16 bytes for normal TIFF and BigTIFF respectively. This header contains the offset of the first image directory.
Seek forward to the first image directory.
Find StripOffset tags or similar, and seek to image data accordingly.

Note that, by adjusting all offsets accordingly, it is possible to 'hide' data inside the file in the sense that conformant image decoders will simply seek over the data. In particular by choosing an initial offset greater than 8 resp. 16 some number of bytes immediately after the tiff header will be skipped. This is referred to as a Ghost Header in GeoTIFF and used to provide additional data for 'Cloud compatibility', i.e. it contains a separate index that can be used to optimize reads by transferring less of the whole file which would otherwise be a single linked list in the form of IFDs. (See also #85).
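
To make the byte range concrete, here is a small sketch that parses only the header and reports where such hidden data would sit. `GhostArea` and `ghost_area` are hypothetical names, not part of the crate; the header layout used (byte-order mark, version 42 or 43, then the first-IFD offset) is the standard TIFF/BigTIFF one.

```rust
/// Byte range between the end of the header and the first IFD.
#[derive(Debug, PartialEq)]
pub struct GhostArea {
    /// First byte after the header (8 for classic TIFF, 16 for BigTIFF).
    pub start: u64,
    /// Offset of the first image directory, as stored in the header.
    pub end: u64,
}

/// Parses a classic TIFF or BigTIFF header and returns the skipped range.
/// Returns `None` if the header is truncated or not recognised.
pub fn ghost_area(file: &[u8]) -> Option<GhostArea> {
    let little_endian = match file.get(0..2)? {
        b"II" => true,
        b"MM" => false,
        _ => return None,
    };
    // Reads an unsigned integer of up to 8 bytes in the file's byte order.
    let read_uint = |bytes: &[u8]| -> u64 {
        let fold = |acc: u64, b: &u8| (acc << 8) | u64::from(*b);
        if little_endian {
            bytes.iter().rev().fold(0, fold)
        } else {
            bytes.iter().fold(0, fold)
        }
    };
    let (start, end) = match read_uint(file.get(2..4)?) {
        // Classic TIFF: 4-byte offset directly after the version.
        42 => (8, read_uint(file.get(4..8)?)),
        // BigTIFF: offset size (must be 8), padding, then 8-byte offset.
        43 => {
            if read_uint(file.get(4..6)?) != 8 {
                return None;
            }
            (16, read_uint(file.get(8..16)?))
        }
        _ => return None,
    };
    if end < start {
        return None;
    }
    Some(GhostArea { start, end })
}
```

An empty range (`start == end`) means the first IFD follows the header directly, so there is no ghost data at that position.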

It seems interesting to provide access to this data in tiff in a way that is agnostic of the usage in the variants of the tiff format and in such a way that it is purely opt-in. The main questions are:

What extra methods are necessary in the decoder? Note that we skip the Ghost Header already in new.
How should the encoder be instructed to write such data?
How can we ensure that the internal state and the offsets stored in the encoder and decoder are consistent with the additional data? Are there any hard-coded offsets we need to adjust?

Increase size constraints

The current size constraints of TIFF images (e.g. the size of decoding_buffer_size) should be increased.

I am working with TIFF files which are mostly between 150 and 300MiB. I think it is fair to assume that most others who deal with TIFF images also deal with larger images.

(In general I would argue that the library should not have any such restriction at all (at least below the max possible limit). And that these checks should be made by clients who use this library.)

For now, I propose two possible solutions:

Either increase the size limit to 512MiB, which should be reasonable for most hardware in 2020 OR
Make this limit configurable when initializing the decoder.

Crate repository url is wrong in Cargo.toml

Right now it points to https://github.com/PistonDevelopers/image-png.git while I guess it should point to https://github.com/PistonDevelopers/image-tiff.git

API overhaul

I think that the current API of this crate is somewhat messy and an overhaul would be nice. I think this crate should be structured for two use cases: a low-level usage of the TIFF format with direct control over tags and data, and a mid-level baseline TIFF implementation with access to properties specified by baseline TIFF (and possibly some more). The high-level use case should be covered by the image crate.

Two things that I think would improve the api a lot would be to reorganize all the small helper types so that they fit together well with both the decoder and encoder and clean up decoder api.

I propose something like this for the decoding api:

impl TiffDecoder<R> {
    fn new(...) {...}

    fn next_directory<'a>(&'a mut self) -> TiffResult<Option<Directory<'a>>> {...}
    fn prev_directory<'a>(&'a mut self) -> TiffResult<Option<Directory<'a>>> {...}

    fn next_image<'a>(&'a mut self) -> TiffResult<Option<Image<'a>>> {...}
    fn prev_image<'a>(&'a mut self) -> TiffResult<Option<Image<'a>>> {...}
}

impl DirectoryDecoder {
    fn read_tag(&mut self, tag: Tag) -> Entry {...}
    fn read_data<T: TiffValue>(&mut self, offset: u32, buffer: &mut T) -> TiffResult<()> {...}
}

impl ImageDecoder {
    fn width(&self) -> u32 {...}
    fn height(&self) -> u32 {...}
    fn pixel_format(&self) -> PixelFormat {...}

    fn read_strip(&mut self) -> DecodingResult {...} // ???
    fn reader<'a>(&'a mut self) -> Reader<'a> {...}

    fn resolution_unit(&mut self) -> ResolutionUnit {...}
    fn x_resolution(&self) -> Rational {...}
    fn y_resolution(&self) -> Rational {...}
    // Other baseline tiff properties  ...
}

Allow returning a tag data as a raw slice

I'm looking for a way to retrieve a tag's data as &[u8] so I can parse it myself.

Support for arbitrary sample counts

For my use-case, I need to decode TIFFs which contain 20+ channels. Currently, image-tiff only supports specific ColorTypes, although a previous (and slower) version of the crate did support ColorType::Other. I have an extremely messy patch at https://github.com/Masterchef365/image-tiff/tree/more_channels which works for my purposes, but other users of this crate might benefit from a more refined approach.

Possible Group 4 decoding with fax crate?

https://github.com/pdf-rs/fax
It looks like it might be possible to implement fax/Group 4 compression with it.

Infinite loop on malformed input

Decoding these samples with Decoder::read_image() causes 100% CPU usage. I ran it for 10 minutes before giving up. The code is likely entering an infinite loop.

The exact reproduction code can be found in #28. Found via AFL.rs, tested on image-tiff version 0.2.2

Better tests

I am working on image-rs/image#858, but I am not entirely confident that my modified decoder is correct. More thorough tests that check whether the decoded data is correct would be nice. There should also be tests for things like the horizontal predictor.

Support additional tags

I'm trying to write a DNG encoder based on this code. I expect to run into several places where the library will need to be made more flexible. The first of these is storing the possible tag values in an enum, as it doesn't appear possible to add additional Tag values (such as those appearing in the TIFF/EP or DNG standards).

Suggestions for an approach are appreciated, otherwise I'll find something that works and submit in a merge request.

Thanks!

"Inconsistent sizes" when parsing LZW compressed images

With some LZW compressed TIFF images, I am getting a Format error: Inconsistent sizes encountered. in the expand_strip(...) method of decoder/mod.rs.

I uploaded 3 example files that show this problem here (~630MB).

The last time I tried to debug this, the only thing I noticed was that in LZW::new(...) the uncompressed.len() is larger than the buffer size. And iirc the len() was the image's width or length.

Anyways, for the three images I printed the bytes, buffer.byte_length() and the max_uncompressed_length right before the error is returned in expand_strip(...):

`crater1_lzw.tif`:

bytes: 23714, buffer_len: 1, max_uncompressed_length: 23710

`crater2_lzw.tif`

bytes: 23714, buffer_len: 1, max_uncompressed_length: 23710

`geotiff2.tif`

bytes: 7710, buffer_len: 1, max_uncompressed_length: 2570

About the images: The first two are from the CTX data from the Mars Reconnaissance Orbiter. They should be public domain but I am not 100% sure.
The geotiff2.tif is taken from the sample files of the geotiff library (and as far as I can tell the official geotiff organization) here.

One last thing about these samples, if you think that these provide perfect examples of geotiff, check their readme, specifically: "Just because a file is in this tree does not imply that it follows the standard correctly [...]"

Error case of `read_to_string` not handled

warning: unused `std::result::Result` which must be used
   --> src/decoder/mod.rs:276:9
    |
276 |         self.reader.read_to_string(&mut out);
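
For reference, a minimal sketch of handling that result by propagating it with `?`. The helper name `read_text` is made up here; how the error should map into the crate's own error type is left open.

```rust
use std::io::{self, Read};

/// Reads the remaining bytes of `reader` as UTF-8 text and propagates
/// any I/O or encoding error to the caller instead of discarding it.
pub fn read_text<R: Read>(reader: &mut R) -> io::Result<String> {
    let mut out = String::new();
    reader.read_to_string(&mut out)?;
    Ok(out)
}
```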

Reset values of tags on "next_image"?

Currently, the values of the tags "Compression", "SamplesPerPixel" and "SampleFormat" are not reset to the default values specified in the TIFF specification when "next_image" is called.

Is that "persistence across pages" correct? Unfortunately, I am not an expert in the TIFF format. However, it should at least be documented, because it could quickly lead to strange errors. I'm willing to contribute a pull request.

Tracking: GeoTIFF support

It's not quite clear how much we want to move into the main library as a feature or if it should live in a separate crate. The split could be similar to gif/gifski, where the former gives access to the format while the latter interprets and composes the individual parts with a high-level interface. In any case, GeoTIFF names a few custom tags which we definitely want to support:

// GeoTiff
pub const MODELPIXELSCALE: Tag = Tag::Unknown(33550);
pub const MODELTIEPOINT: Tag = Tag::Unknown(33922);
pub const MODELTRANSFORMATIONTAG: Tag = Tag::Unknown(34264);
pub const GEOKEYDIRECTORYTAG: Tag = Tag::Unknown(34735);
pub const GEODOUBLEPARAMSTAG: Tag = Tag::Unknown(34736);
pub const GEOASCIIPARAMSTAG: Tag = Tag::Unknown(34737);

// GDAL
pub const GDALMETADATA: Tag = Tag::Unknown(42112);
pub const GDALNODATA: Tag = Tag::Unknown(42113);

@Farkal Further discussion on the GeoTIFF specific problems here. I've hidden our comment chain in the other issue.

ASCII tag type is not validated enough

The specification says the following:

2 = ASCII 8-bit byte that contains a 7-bit ASCII code; the last byte must be NUL (binary zero)

We instead validate that the string is valid UTF-8, a superset of the allowed values.

image-tiff/src/decoder/ifd.rs

Lines 478 to 480 in 679f4f0

 self.r(bo).read_exact(&mut buf)?; 

 let v = String::from_utf8(buf)?; 

 let v = v.trim_matches(char::from(0));

This may not seem problematic at first glance but callers might assume that the string can be sliced at any byte. This would panic at runtime if the index turned out not to be a utf-8 character boundary. If values were validated to be truly ascii then all byte boundaries are valid character boundaries.
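
A sketch of what stricter validation could look like, assuming the field's raw bytes are already in a buffer. `parse_ascii` and `AsciiError` are hypothetical names, and unlike the current code this version keeps interior NUL bytes rather than trimming them.

```rust
/// Why an ASCII field was rejected.
#[derive(Debug, PartialEq)]
pub enum AsciiError {
    /// The field is empty or its last byte is not NUL.
    MissingNul,
    /// At least one byte has the high bit set.
    NotSevenBit,
}

/// Validates a TIFF ASCII field and returns it without the final NUL.
pub fn parse_ascii(buf: &[u8]) -> Result<&str, AsciiError> {
    let (last, body) = buf.split_last().ok_or(AsciiError::MissingNul)?;
    if *last != 0 {
        return Err(AsciiError::MissingNul);
    }
    if !body.is_ascii() {
        return Err(AsciiError::NotSevenBit);
    }
    // ASCII is a subset of UTF-8, so this conversion cannot fail here.
    Ok(std::str::from_utf8(body).expect("ASCII is valid UTF-8"))
}
```

Because every byte of the returned string is ASCII, slicing it at any byte index within bounds is safe.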

Erroneous JPEG decoding

Some JPEG-in-TIFF images aren't correctly decoded.

For example: (tested with Emulsion 8.0.0)

Output from tiffdump:

stripped-jpeg.tif:
Magic: 0x4949 <little-endian> Version: 0x2a <ClassicTIFF>
Directory 0: offset 94702 (0x171ee) next 0 (0)
ImageWidth (256) SHORT (3) 1<365>
ImageLength (257) SHORT (3) 1<547>
BitsPerSample (258) SHORT (3) 3<8 8 8>
Compression (259) SHORT (3) 1<7>
Photometric (262) SHORT (3) 1<2>
FillOrder (266) SHORT (3) 1<1>
StripOffsets (273) LONG (4) 1<8>
Orientation (274) SHORT (3) 1<1>
SamplesPerPixel (277) SHORT (3) 1<3>
RowsPerStrip (278) SHORT (3) 1<547>
StripByteCounts (279) LONG (4) 1<94694>
XResolution (282) RATIONAL (5) 1<72>
YResolution (283) RATIONAL (5) 1<72>
PlanarConfig (284) SHORT (3) 1<1>
ResolutionUnit (296) SHORT (3) 1<2>
PageNumber (297) SHORT (3) 2<0 1>
Whitepoint (318) RATIONAL (5) 2<0.3127 0.329>
PrimaryChromaticities (319) RATIONAL (5) 6<0.64 0.33 0.3 0.6 0.15 0.06>
JPEGTables (347) UNDEFINED (7) 289<0xff 0xd8 0xff 0xdb 00 0x43 00 0x8 0x6 0x6 0x7 0x6 0x5 0x8 0x7 0x7 0x7 0x9 0x9 0x8 0xa 0xc 0x14 0xd ...>
ICC Profile (34675) UNDEFINED (7) 3144<00 00 0xc 0x48 0x4c 0x69 0x6e 0x6f 0x2 0x10 00 00 0x6d 0x6e 0x74 0x72 0x52 0x47 0x42 0x20 0x58 0x59 0x5a 0x20 ...>

The same error appeared when I tried to decode tile-based images.

stripped-jpeg.zip

Panic on malformed input: attempt to divide by zero

With #32 applied, which fixed the early panics, the fuzzer could get further into the decoding code. Here are some samples discovered by AFL that trigger a panic: divide-by-zero.zip

Backtrace:

thread 'main' panicked at 'attempt to divide by zero', /home/shnatsel/Code/image-tiff/src/decoder/mod.rs:539:12
stack backtrace:
   0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
             at src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
   1: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:71
   2: std::panicking::default_hook::{{closure}}
             at src/libstd/sys_common/backtrace.rs:59
             at src/libstd/panicking.rs:211
   3: std::panicking::default_hook
             at src/libstd/panicking.rs:227
   4: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:491
   5: std::panicking::continue_panic_fmt
             at src/libstd/panicking.rs:398
   6: rust_begin_unwind
             at src/libstd/panicking.rs:325
   7: core::panicking::panic_fmt
             at src/libcore/panicking.rs:95
   8: core::panicking::panic
             at src/libcore/panicking.rs:59
   9: <tiff::decoder::Decoder<R>>::read_image
  10: std::panicking::try::do_call
  11: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:102
  12: afl::read_stdio_bytes
  13: std::rt::lang_start::{{closure}}
  14: std::panicking::try::do_call
             at src/libstd/rt.rs:59
             at src/libstd/panicking.rs:310
  15: __rust_maybe_catch_panic
             at src/libpanic_unwind/lib.rs:102
  16: std::rt::lang_start_internal
             at src/libstd/panicking.rs:289
             at src/libstd/panic.rs:398
             at src/libstd/rt.rs:58
  17: main
  18: __libc_start_main
  19: _start

Steps to reproduce are the same as in #28

Missing docs/examples for decoder.next_image()

The function decoder.next_image() is under-documented.

First off, it is not clear how this function relates to read_image() and what the difference between them is.

Second, the naive approach to using it does not compile:

fn decode(data: &[u8]) {
    let cursor = Cursor::new(data);
    if let Ok(mut decoder) = Decoder::new(cursor) {
        while decoder.more_images() {
            decoder.next_image();
        }
    }
}

This fails with the following error:

error[E0382]: borrow of moved value: `decoder`
  --> src/bin/tiff-afl-multi.rs:11:19
   |
11 |             while decoder.more_images() {
   |                   ^^^^^^^ value borrowed here after move
12 |                 decoder.next_image();
   |                 ------- value moved here, in previous iteration of loop
   |
   = note: move occurs because `decoder` has type `tiff::decoder::Decoder<std::io::Cursor<&[u8]>>`, which does not implement the `Copy` trait

A code example that uses it correctly would be appreciated.
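
The error message suggests that next_image() takes the decoder by value. Assuming it also hands the decoder back on success (an assumption, the real signature is not shown here), the usual pattern is to rebind the variable. The sketch below uses a hypothetical `MockDecoder` rather than the real `Decoder`, purely to show the ownership pattern.

```rust
/// Hypothetical stand-in for a decoder whose `next_image` consumes `self`.
pub struct MockDecoder {
    remaining: usize,
}

impl MockDecoder {
    /// Creates a mock positioned on the first of `pages` images.
    pub fn new(pages: usize) -> Self {
        MockDecoder { remaining: pages.saturating_sub(1) }
    }

    pub fn more_images(&self) -> bool {
        self.remaining > 0
    }

    /// Consumes the decoder and returns it positioned on the next image.
    pub fn next_image(mut self) -> Result<Self, String> {
        if self.remaining == 0 {
            return Err("no more images".to_string());
        }
        self.remaining -= 1;
        Ok(self)
    }
}

/// Walks all images, rebinding `decoder` after each consuming call.
pub fn count_pages(mut decoder: MockDecoder) -> Result<usize, String> {
    let mut pages = 1;
    while decoder.more_images() {
        // Assigning the result back keeps `decoder` valid for the next loop.
        decoder = decoder.next_image()?;
        pages += 1;
    }
    Ok(pages)
}
```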

InconsistentSizesEncountered error on LZW/U16 compressed file

Hi,

Using what I think is the same TIFF file in a U8 and a U16 variant, the U16 variant throws an InconsistentSizesEncountered error while the U8 variant loads fine.

$ tiffinfo ASTGTMV003_N11E028_dem_strip.tif 
TIFF Directory at offset 0xcae732 (13297458)
  Image Width: 3601 Image Length: 3601
  Bits/Sample: 16
  Sample Format: signed integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 1
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0xfe96d6 (16684758)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 1801 Image Length: 1801
  Bits/Sample: 16
  Sample Format: signed integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 2
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x1157b9a (18185114)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 1201 Image Length: 1201
  Bits/Sample: 16
  Sample Format: signed integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 3
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x1224a88 (19024520)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 901 Image Length: 901
  Bits/Sample: 16
  Sample Format: signed integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 4
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x12570e8 (19230952)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 451 Image Length: 451
  Bits/Sample: 16
  Sample Format: signed integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 9
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x127ee7c (19394172)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 401 Image Length: 401
  Bits/Sample: 16
  Sample Format: signed integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 10
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x128c038 (19447864)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 226 Image Length: 226
  Bits/Sample: 16
  Sample Format: signed integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 18
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x128d100 (19452160)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 57 Image Length: 57
  Bits/Sample: 16
  Sample Format: signed integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 57
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x128dbde (19454942)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 45 Image Length: 45
  Bits/Sample: 16
  Sample Format: signed integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 45
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)

$ tiffinfo ASTGTMV003_N11E028_num_strip.tif
TIFF Directory at offset 0x2364b2 (2319538)
  Image Width: 3601 Image Length: 3601
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 2
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x2cd8a0 (2939040)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 1801 Image Length: 1801
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 4
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x30ee2c (3206700)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 1201 Image Length: 1201
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 6
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x33239c (3351452)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 901 Image Length: 901
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 9
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x33b6e2 (3389154)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 451 Image Length: 451
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 18
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x342c28 (3419176)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 401 Image Length: 401
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 20
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x34570c (3430156)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 226 Image Length: 226
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 36
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x345c00 (3431424)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 57 Image Length: 57
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 57
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)
TIFF Directory at offset 0x345f82 (3432322)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 45 Image Length: 45
  Bits/Sample: 8
  Sample Format: unsigned integer
  Compression Scheme: LZW
  Photometric Interpretation: min-is-black
  Orientation: row 0 top, col 0 lhs
  Samples/Pixel: 1
  Rows/Strip: 45
  Planar Configuration: single image plane
  Predictor: none 1 (0x1)

The error does not appear if the compression is removed using tiffcp:

$ tiffcp -c none -s ASTGTMV003_N11E028_dem_strip.tif ~/Downloads/ASTGTMV003_N11E028_dem_strip_nocomp.tif

Thus, the compression is likely the issue here.

How can I help resolve this?

Regards,

Add test suite coverage

We should use the test cases pack provided by libtiff, the newest version (3.8) can be obtained here: http://simplesystems.org/libtiff/images.html (~5.5 MB unpacked).

Decoder regression: mandrill.tiff from upstream

I've tried to upgrade upstream image crate to tiff 0.7.0, and it uncovered a regression bug: upstream image mandrill.tiff used to decode with 0.6.0, but fails with 0.7.0 with:

Cannot create decoder: FormatError(Format("Neither strips nor tiles were found or both were used in the same file"))

I verified that this problem is reproducible with just image-tiff too, so it's not an issue with integration in the image crate itself. It's also an uncompressed TIFF, so doesn't seem related to my deflate PR (#132) either.

Beyond that, I'm not that familiar with changes between releases, so leaving it up to others to debug this further.

Encoding images using tiles

It seems like all of the encoding code works for strip-based tiffs and there is currently no means of writing tile-based images.

Decoding floating point tiffs

I work with a lot of floating point TIFFs and would like to read them using this library. I'm currently trying to implement it myself in this library, but I'm new to Rust and not too familiar with image decoding, so I'm not making much progress so far.

Maybe this issue can serve as a central place to discuss what needs to be done to support it?

Floating point predictor support

Forked from #70 .

The horizontal predictor for floating points does not use float differences, as those would be lossy, but apparently uses an integer difference. Recheck this with the standard.

There is a dedicated floating point predictor which first re-orders some bytes: http://chriscox.org/TIFFTN3d1.pdf
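
As a rough illustration of the byte re-ordering idea, here is a sketch for one row of single-channel f32 samples. It reflects my reading of the linked technical note (bytes grouped into planes from most to least significant, then byte-wise horizontal differencing); treat that layout as an assumption to be checked against the note. The function names are made up.

```rust
/// Applies the floating point predictor to one row of single-channel f32.
pub fn encode_fp_row(row: &[f32]) -> Vec<u8> {
    let n = row.len();
    let mut bytes = vec![0u8; 4 * n];
    // Step 1: split samples into byte planes, most significant byte first.
    for (i, value) in row.iter().enumerate() {
        let be = value.to_be_bytes();
        for plane in 0..4 {
            bytes[plane * n + i] = be[plane];
        }
    }
    // Step 2: byte-wise differencing, back to front so that each
    // subtraction still sees the original predecessor.
    for i in (1..bytes.len()).rev() {
        bytes[i] = bytes[i].wrapping_sub(bytes[i - 1]);
    }
    bytes
}

/// Reverses `encode_fp_row`. Panics if the length is not a multiple of 4.
pub fn decode_fp_row(encoded: &[u8]) -> Vec<f32> {
    assert_eq!(encoded.len() % 4, 0, "row must hold whole f32 samples");
    let n = encoded.len() / 4;
    // Undo the differencing with a running byte-wise sum.
    let mut bytes = encoded.to_vec();
    for i in 1..bytes.len() {
        bytes[i] = bytes[i].wrapping_add(bytes[i - 1]);
    }
    // Gather one byte from each plane to rebuild every sample.
    (0..n)
        .map(|i| {
            f32::from_be_bytes([bytes[i], bytes[n + i], bytes[2 * n + i], bytes[3 * n + i]])
        })
        .collect()
}
```

Since all arithmetic is on bytes with wrapping, the transform is lossless, which is the point made above about not using float differences.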

Support strip size > 8kB

In ImageEncoder::new() the strip size of the image data in the tiff is limited to 8kB, with the remark that this is per the TIFF spec. But the TIFF spec of 1992(!!) only recommends 8kB as a mitigation for out-of-memory errors. As time has passed, 8kB has become ridiculously small.
I bumped into this while investigating performance issues in the Python Pillow library. We have an application retrieving rectangular parts from large images. I noticed that reading images generated via the image-tiff crate was ten times slower than reading images created by Pillow itself. It turned out that Pillow stored the image as a single strip, while image-tiff stored one line per strip.

I would like to see an option to store the image in a single strip, configure the max strip size or even get rid of the 8kB strip limit.

https://github.com/image-rs/image-tiff/blob/master/src/encoder/mod.rs#L625

Once we agree on a solution, I'll implement it myself.
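
For the "configure the max strip size" option, the core arithmetic could look roughly like this. `rows_per_strip` and its parameters are hypothetical and not taken from the encoder's code.

```rust
/// Picks how many rows go into each strip for a target strip size.
/// Always returns at least 1 and never more than the image height.
pub fn rows_per_strip(
    width: u32,
    height: u32,
    bytes_per_pixel: u32,
    target_strip_bytes: u64,
) -> u32 {
    let row_bytes = u64::from(width) * u64::from(bytes_per_pixel);
    if row_bytes == 0 {
        // Degenerate image: a single strip covers everything.
        return height.max(1);
    }
    let rows = (target_strip_bytes / row_bytes).clamp(1, u64::from(height.max(1)));
    // `rows` is bounded by `height`, so it fits in u32.
    rows as u32
}
```

With a 4000 pixel wide RGB8 image, an 8 KiB target gives one row per strip, while passing `u64::MAX` as the target yields a single strip for the whole image.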

Move to image-rs

https://github.com/image-rs

Run TIFF decoder through tiff test images

Images at http://www.remotesensing.org/libtiff/images.html

Release a new version

It would be nice of you to release a new version with num-derive 0.3 so that downstream users can gradually get rid of syn 0.15 in their crate graph.

why does tag XResolution return a List?

After updating from 0.5.0 to 0.6.0 I got a pattern matching error. It appears that the XResolution tag in the decoder now gives a List, while before it did not:

decoder.find_tag(Tag::XResolution)?
Returns in 0.6.0:
Some(List([Rational(250000000, 250000000)]))

In 0.5.0 it returns:
Some(Rational(250000000, 250000000))

Is this intentional?
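
In the meantime, calling code can accept both shapes. The sketch below uses a mock `Value` enum with only the two variants visible in the output above, not the crate's real type; `as_single_rational` is an invented helper.

```rust
/// Mock of the two value shapes shown in the output above.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Rational(u32, u32),
    List(Vec<Value>),
}

/// Accepts either a bare rational or a one-element list holding one.
pub fn as_single_rational(value: &Value) -> Option<(u32, u32)> {
    match value {
        Value::Rational(n, d) => Some((*n, *d)),
        Value::List(items) => match items.as_slice() {
            [Value::Rational(n, d)] => Some((*n, *d)),
            _ => None,
        },
    }
}
```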

Support signed integer sample types

Currently only unsigned integers and floating point samples are handled. Signed integers used to be silently treated as unsigned, but now trigger an error.

Tests fail with "index out of bounds"

Currently the tests fail with an index out of bounds error, this is probably a bug introduced in #26 that could only be discovered with the tests from #23.

running 1 test
test decoder::stream::test::test_packbits ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

     Running target/debug/deps/decode_images-564f01fb7ded3313

running 6 tests
test test_gray_u8 ... ok
test test_rgb_u8 ... ok
test test_string_tags ... ok
test test_gray_u16 ... ok
test test_decode_data ... ok
test test_rgb_u16 ... ok

test result: ok. 6 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

     Running target/debug/deps/encode_images-b9c0207f836596cd

running 5 tests
test test_gray_u8_roundtrip ... ok
test test_rgb_u8_roundtrip ... ok
test test_gray_u16_roundtrip ... ok
test encode_decode ... ok
test test_rgb_u16_roundtrip ... FAILED

failures:

---- test_rgb_u16_roundtrip stdout ----
thread 'test_rgb_u16_roundtrip' panicked at 'index out of bounds: the len is 19 but the index is 19', /rustc/9fda7c2237db910e41d6a712e9a2139b352e558b/src/libcore/slice/mod.rs:2463:10
note: Run with `RUST_BACKTRACE=1` for a backtrace.


failures:
    test_rgb_u16_roundtrip

test result: FAILED. 4 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out

error: test failed, to rerun pass '--test encode_images'

Tile-Based images can not be read

It looks like at the moment the tiff crate can't read a large amount of valid tiff files - specifically any geo data that I've found. I've attached a small sample file since it's unclear to me how to fix this at the moment without going deep into the tiff file format (or the image-tiff codebase).

ASTGTMV003_N58W128_num.zip

Encoding panics if the input is too short

The following code causes a panic in the encoder:

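use std::io::Cursor;
use tiff::encoder::{colortype, TiffEncoder};

// Far fewer bytes than the 50 * 50 * 3 = 7500 that a 50x50 RGB8 image needs.
let input_data = vec![1u8, 2, 3];
let mut output_stream = Cursor::new(Vec::new());
if let Ok(mut tiff) = TiffEncoder::new(&mut output_stream) {
    // This call panics instead of returning an error.
    let _ = tiff.write_image::<colortype::RGB8>(50, 50, &input_data);
}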

Error message:
thread 'main' panicked at 'index 7500 out of range for slice of length 173', src/libcore/slice/mod.rs:2349:5
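
One way to turn the panic above into an error is to validate the buffer length before indexing into it. The following is only a sketch: `SizeError` and `check_buffer_len` are hypothetical names invented for illustration and are not part of the tiff crate.

```rust
/// Hypothetical error type for this sketch; not part of the tiff crate.
#[derive(Debug, PartialEq)]
enum SizeError {
    /// width * height * samples_per_pixel does not fit in usize.
    Overflow,
    /// The buffer length differs from what the dimensions require.
    Mismatch { expected: usize, actual: usize },
}

/// Checks that `data` holds exactly width * height * samples_per_pixel samples.
fn check_buffer_len<T>(
    width: u32,
    height: u32,
    samples_per_pixel: usize,
    data: &[T],
) -> Result<(), SizeError> {
    // Checked arithmetic so that huge dimensions cannot wrap around.
    let expected = (width as usize)
        .checked_mul(height as usize)
        .and_then(|n| n.checked_mul(samples_per_pixel))
        .ok_or(SizeError::Overflow)?;
    if data.len() != expected {
        return Err(SizeError::Mismatch { expected, actual: data.len() });
    }
    Ok(())
}
```

With such a check, calling `check_buffer_len(50, 50, 3, &input_data)` on the three-byte input would report that 7500 samples were expected instead of panicking.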

Compression method Deflate is unsupported

I'm trying to use the image crate to read TIFF images produced (compressed) by ImageMagick with -compress zip. My understanding is that the image crate relies on this one to do the actual reading, so I'm raising the issue here.

This is what Exiftool reports on those images:

> exiftool -compression pleiades4.tif
Compression                     : Adobe Deflate

README suggests that decoding Deflate TIFFs should be supported:

However, when trying to read them using image::open, I'm getting:

The decoder for Tiff does not support the format features Compression method Deflate is unsupported

Am I holding it wrong? Or is there something specific to the "Adobe Deflate" that ImageMagick uses that makes it different from the Deflate supported by this crate?

Memory exhaustion and crash on malformed input

Decoding any of the attached files triggers a crash with the following error message: "memory allocation of 136902082592 bytes failed", followed by "Aborted".

tiff-oom-crashes.zip

The exact reproduction code can be found in #28. Found via AFL.rs, tested on image-tiff version 0.2.2

Most decoding libraries face this issue at some point. This is usually solved by limiting the amount of allocated memory to some sane default, and letting people override it if they're really dealing with enormous amounts of data. In Rust we can easily allow the API user to override these limits via the builder pattern.

See https://libpng.sourceforge.io/decompression_bombs.html for more info on how a similar issue was solved in libpng. See also the Limits struct from flif crate.
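
As a rough illustration of the builder-style limit described above, the sketch below checks the required buffer size against a configurable cap before anything is allocated. The `Limits` type, its field, its methods, and the 256 MiB default are all assumptions made for this example; they are not the API of image-tiff or of the flif crate.

```rust
use std::convert::TryFrom;

/// Hypothetical decoding limits; names and defaults are assumptions.
#[derive(Debug, Clone, PartialEq)]
struct Limits {
    max_decode_bytes: u64,
}

impl Default for Limits {
    fn default() -> Self {
        // Arbitrary default chosen for the sketch: 256 MiB.
        Limits { max_decode_bytes: 256 * 1024 * 1024 }
    }
}

impl Limits {
    /// Builder-style override of the default cap.
    fn max_decode_bytes(mut self, bytes: u64) -> Self {
        self.max_decode_bytes = bytes;
        self
    }

    /// Returns the buffer size in bytes if the image fits within the cap.
    fn check(&self, width: u32, height: u32, bytes_per_pixel: u32) -> Result<usize, String> {
        let needed = u64::from(width)
            .checked_mul(u64::from(height))
            .and_then(|n| n.checked_mul(u64::from(bytes_per_pixel)))
            .ok_or_else(|| "image size overflows u64".to_string())?;
        if needed > self.max_decode_bytes {
            return Err(format!(
                "image needs {} bytes, limit is {}",
                needed, self.max_decode_bytes
            ));
        }
        usize::try_from(needed).map_err(|_| "image size overflows usize".to_string())
    }
}
```

A caller handling very large images could opt in explicitly, for example with `Limits::default().max_decode_bytes(8 * 1024 * 1024 * 1024)`, while everyone else keeps the conservative default.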

Decoding 16-bit gray images

Somehow the attached image does not decode correctly: only half of each row's width is read.
2_1.zip

Release with jpeg-dependency 0.2

Can we get a new release with this, also for image? image itself is already using jpeg-decoder 0.2 and this is showing as a duplicate dependency for us.

Originally posted by @MarijnS95 in #155 (comment)

JPEG Compression Tag

I noticed that CompressionMethod::JPEG has the value 6. According to Wikipedia, this is actually the "obsolete 'old-style' JPEG" that "... should never be written". For consistency with Deflate/OldDeflate, should we consider renaming this to OldJPEG?

I realize this is a breaking change, but since it seems like there is some work being done around this area it may be a good opportunity to standardize the names.
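
For reference, the naming question can be seen by laying out the Compression tag values from the TIFF specification side by side. The enum below is a sketch written for this purpose, not the crate's actual `CompressionMethod` definition; the variant names are illustrative, and only a subset of the registered codes is listed.

```rust
/// Sketch of some TIFF Compression tag values; not the crate's actual enum.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Compression {
    Uncompressed,
    Lzw,
    /// Obsolete "old-style" JPEG from TIFF 6.0.
    OldJpeg,
    /// "New-style" JPEG.
    Jpeg,
    /// Adobe-style Deflate.
    Deflate,
    PackBits,
    /// Older Deflate code.
    OldDeflate,
}

impl Compression {
    /// Maps a raw tag value to a known variant, or None if it is not listed here.
    fn from_u16(code: u16) -> Option<Self> {
        match code {
            1 => Some(Compression::Uncompressed),
            5 => Some(Compression::Lzw),
            6 => Some(Compression::OldJpeg),
            7 => Some(Compression::Jpeg),
            8 => Some(Compression::Deflate),
            32773 => Some(Compression::PackBits),
            32946 => Some(Compression::OldDeflate),
            _ => None,
        }
    }
}
```

Under this layout, value 6 pairs with `OldJpeg` in the same way that 32946 pairs with `OldDeflate`.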

Dependency versions are overly restrictive

Using a * forces downstream users to be unable to use newer versions of crates. It is usually suggested to avoid a version operator at all. Currently, this crate has:

https://github.com/PistonDevelopers/image-tiff/blob/4e3b36598d97fe0096f4696495f3f2af8f4f0148/Cargo.toml#L18-L21

Instead, I would expect this to be written as:

byteorder = "1.2"
lzw = "0.10"
num-derive = "0.2"
num-traits = "0.2"

This would allow for byteorder 1.2.x and 1.3.x, a semver-compatible version. It would not allow num-derive 0.3 (which is not semver-compatible with 0.2).

See the Cargo doc for details on the operators.
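
To make the compatibility rule above concrete, here is a small sketch of how a bare requirement such as "1.2" or "0.2" is matched. `caret_matches` is a hypothetical helper written for this example; it ignores patch-level requirements and pre-release versions, and it is not how Cargo itself is implemented.

```rust
/// Returns true if `version` (major, minor, patch) satisfies the bare
/// requirement `req` (major, minor), e.g. "1.2" is passed as (1, 2).
fn caret_matches(req: (u64, u64), version: (u64, u64, u64)) -> bool {
    let (req_major, req_minor) = req;
    let (major, minor, _patch) = version;
    if req_major == 0 {
        // For 0.x releases the minor version is the breaking-change boundary.
        major == 0 && minor == req_minor
    } else {
        // For 1.x and later, any newer minor within the same major is accepted.
        major == req_major && minor >= req_minor
    }
}
```

For instance, `caret_matches((1, 2), (1, 3, 0))` is true, matching the byteorder case described above, while `caret_matches((0, 2), (0, 3, 0))` is false, matching the num-derive case.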

The playground uses the newest version of crates. The image crate is popular enough to be included on the playground and it depends on tiff. This version restriction makes it so that tiff cannot be installed at the same time as byteorder 1.3. Until this issue is resolved, I will be preventing image (and thus tiff) from being available in the playground.

Thanks for listening!

Add deflate support

Presumably using e.g. flate2 or similar should enable deflate support.
The CompressionMethod should come out as 32946.

Cloud compatibility

I'm opening this issue to discuss the best way to support cloud-hosted files.

What does that mean? Simply that we should be able to decode the header step by step. Because an IFD can be anywhere in the file, we should be able to decode IFDs one at a time and supply the data to decode each time.

Example:

1. I fetch the first 1024 bytes of a file.
2. I get the first IFD and decode it (here we could keep the current init method, which decodes the header and the first IFD).
3. If there is another IFD and it lies outside those 1024 bytes, I should be able to call next_image(new_bytes) and get an error if the next IFD can't be decoded. Better still, the error would report the size needed: 2 + 12 * the entry count at the beginning of the IFD.

If there is a way to dynamically add data to the cursor and fake the position, we should already be able to get something working, but I didn't find any way to do that. I'll wait for your advice on this use case before making a PR 😉
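
The size calculation mentioned in the steps above can be sketched as follows for a classic (non-BigTIFF) file. `ByteOrder` and `ifd_len` are hypothetical names for this example, not part of image-tiff. The sketch also adds the 4 bytes of the next-IFD offset that follow the entries, which goes slightly beyond the formula given in the issue.

```rust
/// Byte order declared by the first two bytes of a TIFF file.
#[derive(Debug, Clone, Copy)]
enum ByteOrder {
    Little,
    Big,
}

/// Returns the total length in bytes of the classic-TIFF IFD starting at
/// `ifd_offset`, or None if `buf` does not yet contain the entry count.
fn ifd_len(buf: &[u8], ifd_offset: usize, order: ByteOrder) -> Option<usize> {
    let end = ifd_offset.checked_add(2)?;
    let count_bytes = buf.get(ifd_offset..end)?;
    let raw = [count_bytes[0], count_bytes[1]];
    let count = match order {
        ByteOrder::Little => u16::from_le_bytes(raw),
        ByteOrder::Big => u16::from_be_bytes(raw),
    };
    // 2-byte entry count + 12 bytes per entry + 4-byte offset of the next IFD.
    Some(2 + usize::from(count) * 12 + 4)
}
```

A caller that gets `None` knows it must fetch at least two more bytes at `ifd_offset`; a caller that gets `Some(n)` knows exactly which byte range to request next.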

Panic decoding malformed tiff

The unwrap in image-tiff/src/decoder/image.rs, line 348 (in 802949f):

let data = decoder.decode().unwrap();

can fail and crash whatever program is decoding the TIFF. Please add error handling to remove the unwrap.

I can provide a test file, if you need to test it. I would prefer to email or message it privately.

Regards,
Micah
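
The requested change amounts to propagating the failure to the caller instead of unwrapping it. The sketch below shows the general pattern with stand-in types; `DecodeError`, `decode_inner`, and `decode_image` are hypothetical and do not mirror the actual code in decoder/image.rs.

```rust
/// Hypothetical error type standing in for the crate's own error.
#[derive(Debug, PartialEq)]
struct DecodeError(String);

/// Stand-in for a fallible inner decoder call.
fn decode_inner(data: &[u8]) -> Result<Vec<u8>, DecodeError> {
    if data.is_empty() {
        Err(DecodeError("empty input".to_string()))
    } else {
        Ok(data.to_vec())
    }
}

/// Propagates the inner failure with `?` rather than calling `.unwrap()`.
fn decode_image(data: &[u8]) -> Result<usize, DecodeError> {
    let decoded = decode_inner(data)?;
    Ok(decoded.len())
}
```

With this shape, malformed input surfaces as an `Err` that the calling program can handle, rather than as a panic.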

Reading (private) IFD tags

Am I correct that there is currently no way to read (private) IFD Tags like "GpsIFD"? However, the necessary parsing method is already implemented and even the offset of the tag is available (tiff::decoder::ifd::Value::Ifd(offset)).

By extending read_ifd, I would offer to add the missing function pub fn get_tag_ifd(&mut self, tag: Tag) -> TiffResult<tiff::decoder::ifd::Directory>.

Are there any opinions or criticism regarding that plan?

BigTiff support?

Hi,

I was considering contributing BigTiff support (decoding, for now) to the library. Since I'm nowhere near a Rust expert, I'm still debating the implementation with myself, since the BigTiff spec (https://www.awaresystems.be/imaging/tiff/bigtiff.html) differs from the Tiff spec in several key ways.

For example, the version number in the header (the third byte of a little-endian file) is always 43, as opposed to Tiff's 42. This might limit the reusability of the current Decoder, forcing me to rewrite large portions of it and essentially create a BigTiffRecorder. On the other hand, users shouldn't have to care whether the Tiff is a BigTiff or a normal one; they just want it decoded.

How would you suggest I tackle this?
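
One possible starting point is to detect the variant from the first four bytes and branch from there, so that users never have to choose a decoder themselves. The sketch below is illustrative only: `TiffVariant` and `detect_variant` are hypothetical names, and the rest of the BigTiff header (offset size and the 8-byte first-IFD offset) is not handled.

```rust
/// Which flavour of TIFF a header announces.
#[derive(Debug, PartialEq)]
enum TiffVariant {
    Classic,
    Big,
}

/// Inspects the byte-order mark and version number at the start of a file.
/// Returns None if the header is too short or not recognisable.
fn detect_variant(header: &[u8]) -> Option<TiffVariant> {
    if header.len() < 4 {
        return None;
    }
    let raw = [header[2], header[3]];
    // "II" means little-endian, "MM" means big-endian.
    let version = match [header[0], header[1]] {
        [b'I', b'I'] => u16::from_le_bytes(raw),
        [b'M', b'M'] => u16::from_be_bytes(raw),
        _ => return None,
    };
    match version {
        42 => Some(TiffVariant::Classic),
        43 => Some(TiffVariant::Big),
        _ => None,
    }
}
```

A shared decoder could call this once and then pick 4-byte or 8-byte offsets accordingly, which keeps the distinction internal to the library.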

	// The following conversions interpret the image as in libtiff.
	// In particular, MIN is white and MAX is black and not Zero as the name would imply.

	self.r(bo).read_exact(&mut buf)?;
	let v = String::from_utf8(buf)?;
	let v = v.trim_matches(char::from(0));


crater1_lzw.tif:

crater2_lzw.tif

geotiff2.tif
