google / wuffs

Wrangling Untrusted File Formats Safely

License: Other

wuffs's Issues

PNG decoder should be able to read out eXIf

https://ftp-osl.osuosl.org/pub/libpng/documents/pngext-1.5.0.html#C.eXIf

I've found all of 25 such images amongst thousands, but they exist.

PNG decoder will not read trailing text chunks

The state diagram doesn't seem to allow for this situation, and key-value-pairs.png.make-artificial.txt doesn't test it.

I've previously scanned all the non-thumbnail PNGs that I keep around, and 845 out of 10748 have trailing text chunks (none of them have any other kind of trailing chunk). The thumbnail specification I'm implementing doesn't say anything about metadata placement, and although Qt and gdk-pixbuf always write them at the beginning of a file, I should support both options.

Also, the WEBP specification, which is on the roadmap, says EXIF and XMP metadata should both be placed at the end of a file.

What API should be used to retrieve it?
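To make "trailing text chunks" concrete, here is a minimal sketch in plain C that walks a PNG chunk stream and counts the tEXt, zTXt and iTXt chunks that come after the last IDAT. It does not use or imitate the Wuffs API: count_trailing_text_chunks is a hypothetical helper, and it skips CRC verification for brevity.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Reads a big-endian 32-bit value from p.
static uint32_t read_u32be(const uint8_t* p) {
  return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
         ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

// Hypothetical helper (not a Wuffs function): counts the tEXt, zTXt and iTXt
// chunks that appear after the last IDAT chunk. Returns -1 if the input does
// not start with the PNG signature, a chunk overruns the input, or IEND is
// missing. Chunk CRCs are not verified.
int count_trailing_text_chunks(const uint8_t* data, size_t len) {
  static const uint8_t sig[8] = {0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'};
  if (len < 8 || memcmp(data, sig, 8) != 0) {
    return -1;
  }
  size_t pos = 8;
  int seen_idat = 0;
  int trailing = 0;
  // Each chunk is: 4-byte length, 4-byte type, payload, 4-byte CRC.
  while (len - pos >= 12) {
    uint32_t n = read_u32be(data + pos);
    const uint8_t* type = data + pos + 4;
    if (n > len - pos - 12) {
      return -1;  // The chunk payload overruns the input.
    }
    if (memcmp(type, "IDAT", 4) == 0) {
      seen_idat = 1;
      trailing = 0;  // Only chunks after the last IDAT count as trailing.
    } else if (memcmp(type, "IEND", 4) == 0) {
      return trailing;
    } else if (seen_idat &&
               (memcmp(type, "tEXt", 4) == 0 || memcmp(type, "zTXt", 4) == 0 ||
                memcmp(type, "iTXt", 4) == 0)) {
      trailing++;
    }
    pos += 12 + (size_t)n;
  }
  return -1;  // Ran out of input before IEND.
}
```

A scan like the one described above could call this once per file and treat any positive result as "has trailing text chunks".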

Installation is deprecated and difficult

I cannot even figure out how to have wuffs test work.

Getting Started tells you to run go get -v github.com/google/wuffs/cmd/.... This gives the following deprecation notice:

go get: installing executables with 'go get' in module mode is deprecated.
        Use 'go install pkg@version' instead.
        For more information, see https://golang.org/doc/go-get-install-deprecation
        or run 'go help get' or 'go help install'.

This still installs binaries into $GOPATH/bin though, so now I run wuffs test and I get

 ╭─mag@magnus in ~/.local/share/go took 6ms
 ╰─λ wuffs test
could not find Wuffs root directory

I do some research, run strace wuffs test, and find out it's looking for some files in ~/.local/share/go/src/github.com/google/wuffs, so I manually create ~/.local/share/go/src/github.com/google and git clone https://github.com/google/wuffs into there, hoping this will work. Now we get

 ╭─mag@magnus in ~/.local/share/go/src/github.com/google took 5s
 ╰─λ wuffs test
exec: "clang-format-5.0": executable file not found in $PATH
wuffs-c: failed

I'm on Arch, so I can't install clang-format v5 from 2017, so I pretend that this isn't a problem and create a link:

 ╭─mag@magnus in ~/.local/share/go/src/github.com/google took 1s
 ╰─λ doas ln -s /bin/clang-format /usr/local/bin/clang-format-5.0

Unsurprisingly, this doesn't work.

 ╭─mag@magnus in ~/.local/share/go/src/github.com/google took 4s
 ╰─λ wuffs test
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/c/wuffs-base.c
parse: expected "(", got "implements" at /home/mag/.local/share/go/src/github.com/google/wuffs/std/adler32/common_adler32.wuffs:16

Then this somehow happens which I cannot reproduce:

 ╭─mag@magnus in repo: wuffs on  main via  v1.17.5 took 209ms
[🔴] × doas rm /usr/local/bin/clang-format-5.0 

 ╭─mag@magnus in repo: wuffs on  main via  v1.17.5 took 19ms
 ╰─λ go get -v github.com/google/wuffs/cmd/...
github.com/google/wuffs/lib/readerat
github.com/google/wuffs/lib/dumbindent
github.com/google/wuffs/lib/compression
github.com/google/wuffs/lib/cgozlib
github.com/google/wuffs/cmd/commonflags
github.com/google/wuffs/lang/token
github.com/google/wuffs/lib/flatecut
github.com/google/wuffs/lib/rac
github.com/google/wuffs/lib/interval
github.com/google/wuffs/lib/cgolz4
github.com/google/wuffs/lib/cgozstd
github.com/google/wuffs/cmd/dumbindent
github.com/google/wuffs/lib/zlibcut
github.com/google/wuffs/lang/render
github.com/google/wuffs/lang/wuffsroot
github.com/google/wuffs/lib/internal/racdict
github.com/google/wuffs/lang/ast
github.com/google/wuffs/lang/parse
github.com/google/wuffs/lib/raczlib
github.com/google/wuffs/lang/builtin
github.com/google/wuffs/cmd/wuffsfmt
github.com/google/wuffs/lang/check
github.com/google/wuffs/lib/raczstd
github.com/google/wuffs/lib/raclz4
github.com/google/wuffs/cmd/ractool
github.com/google/wuffs/lang/generate
github.com/google/wuffs/cmd/wuffs
github.com/google/wuffs/internal/cgen
github.com/google/wuffs/cmd/wuffs-c

 ╭─mag@magnus in repo: wuffs on  main via  v1.17.5 took 1s
 ╰─λ wuffs test
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/c/wuffs-base.c
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/c/wuffs-std-adler32.c
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/wuffs/std/adler32.wuffs
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/c/wuffs-std-bmp.c
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/wuffs/std/bmp.wuffs
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/c/wuffs-std-cbor.c
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/wuffs/std/cbor.wuffs
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/c/wuffs-std-crc32.c
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/wuffs/std/crc32.wuffs
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/c/wuffs-std-deflate.c
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/wuffs/std/deflate.wuffs
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/c/wuffs-std-lzw.c
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/wuffs/std/lzw.wuffs
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/c/wuffs-std-gif.c
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/wuffs/std/gif.wuffs
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/c/wuffs-std-gzip.c
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/wuffs/std/gzip.wuffs
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/c/wuffs-std-json.c
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/wuffs/std/json.wuffs
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/c/wuffs-std-nie.c
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/wuffs/std/nie.wuffs
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/c/wuffs-std-zlib.c
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/wuffs/std/zlib.wuffs
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/c/wuffs-std-png.c
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/wuffs/std/png.wuffs
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/c/wuffs-std-wbmp.c
gen wrote:      /home/mag/.local/share/go/src/github.com/google/wuffs/gen/wuffs/std/wbmp.wuffs
gen unchanged:  /home/mag/.local/share/go/src/github.com/google/wuffs/release/c/wuffs-unsupported-snapshot.c
exec: "clang-9": executable file not found in $PATH
exec: "clang-9": executable file not found in $PATH
exec: "clang-9": executable file not found in $PATH
exec: "clang-9": executable file not found in $PATH
exec: "clang-9": executable file not found in $PATH
exec: "clang-9": executable file not found in $PATH
exec: "clang-9": executable file not found in $PATH
exec: "clang-9": executable file not found in $PATH
exec: "clang-9": executable file not found in $PATH
exec: "clang-9": executable file not found in $PATH
exec: "clang-9": executable file not found in $PATH
exec: "clang-9": executable file not found in $PATH
exec: "clang-9": executable file not found in $PATH
wuffs test: some tests failed

So apparently it depends on both clang-format-5.0 and clang-9 existing in my PATH. And anyway whatever motivation I had to write Wuffs code is now overshadowed by my frustration.

This language seems really cool and I want to learn it. Maybe I'm using a target system other than the only supported one, or I don't understand enough about Go programming and there's one magic go install something@magic_version_thing command, analogous to the old go get, that I should just automatically know, or maybe I'm just dumb. Either way, I'm unable to get wuffs working at the moment, and I think the documentation should be updated with a newer, correct way to get this working.

Facts involving constants don't propagate to bounds checks

Apologies if this bug report is lacking any information or if I missed anything obvious. But I've been playing with this for over twenty minutes, and I cannot figure it out.

I have a piece of code:

pri const maxBlockSize u32 = 128 << 10 // 128KiB
[...]
var blockSize u32 = [...]
[...]
var i u32
while i < blockSize, inv blockSize <= maxBlockSize {
        assert i <= maxBlockSize via "a <= b: a <= c; c <= b"(c:blockSize)
        i += 1
}

And this gets an error:

check: assignment "i += 1" bounds [1..4294967296] is not within bounds [0..4294967295] at decode.wuffs:87. Facts:
        blockSize <= maxBlockSize
        i < blockSize
        i <= maxBlockSize

It seems to me like from those facts, I should be able to prove that i <= 131072, since 131072 == maxBlockSize and i <= maxBlockSize. However, I don't seem to be able to. I have tried multiple asserts, including assert maxBlockSize == 131072 and assert i <= 131072 via "a <= b: a <= c; c == b"(c:maxBlockSize).

Here's why I think this is a bug in Wuffs - if I replace all uses of maxBlockSize with 131072, the bounds error disappears.

Debug prints in wuffs code?

I'm trying to play with wuffs, and something I miss from coding in Go or other languages is the ability to add debug prints. Something akin to Go's println would be great, for example.

I realise that I can do this via the generated C code, but that's not as straightforward. And, once wuffs adds support for other languages, debugging will be different depending on what language you're generating.

Any ideas or suggestions?

Also, as a drop-by question - how is one supposed to use the wuffs tool for little wuffs packages that are not part of wuffs/std? I haven't figured out how to get wuffs gen to work with that, so I'm just using wuffs-c. I presume that's the best I can do at the moment.

Decode to CHW format already

For now, the output of wuffs_aux::DecodeImage (for BGR PNGs) is an HWC array.
Is there some way to make it CHW, with the channels separated in memory?
This format is becoming more and more popular because of neural networks, and the cost of HWC -> CHW conversion is very high.
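For reference, the conversion being asked about can be sketched as below. This is a plain C illustration, not a Wuffs API: hwc_to_chw_u8 is a hypothetical name, and it assumes 8-bit channels, tightly packed rows (no stride padding) and a destination buffer of the same size as the source.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: converts interleaved pixels (HWC, e.g. BGRBGR...) into
// planar layout (CHW, e.g. BBB...GGG...RRR...).
// Assumes src and dst each hold height * width * channels bytes, rows are
// tightly packed, and the two buffers do not overlap.
void hwc_to_chw_u8(const uint8_t* src,
                   uint8_t* dst,
                   size_t height,
                   size_t width,
                   size_t channels) {
  size_t plane = height * width;  // Number of pixels per channel plane.
  for (size_t i = 0; i < plane; i++) {
    for (size_t c = 0; c < channels; c++) {
      dst[c * plane + i] = src[i * channels + c];
    }
  }
}
```

The loop touches every sample once, so the cost is a full extra pass over the image, which is the overhead a decoder-side CHW output would avoid.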

example/{imageviewer,sdl-imageviewer} don't work/work wrong with DefaultDepth 30

System: Arch Linux, X11 with DefaultDepth 30
Tested on various PNG images.

sdl-imageviewer always crashes, because SDL_GetWindowSurface() returns NULL.

imageviewer garbles the images it manages to display in the following way:

On slightly larger images, it crashes on xcb_wait_for_event() returning NULL.

Explain what the Wuffs root directory is

The benchmark page seems to say that Wuffs benchmarks can be run with "wuffs bench". However, this simply reports that it cannot find the Wuffs root directory. This happens even when running it from the repo, where there is a file called wuffs-root-directory.txt:

xobs@nas ~/C/wuffs> wuffs bench
could not find Wuffs root directory
xobs@nas ~/C/wuffs> ls -l
total 68
-rw-------.  1 xobs xobs   647 Sep  2 14:31 AUTHORS
-rwx------.  1 xobs xobs  3463 Sep  2 14:31 build-all.sh*
-rwx------.  1 xobs xobs  2076 Sep  2 14:31 build-example.sh*
-rwx------.  1 xobs xobs  1123 Sep  2 14:31 build-fuzz.sh*
drwx------.  8 xobs xobs   102 Sep  2 14:31 cmd/
-rw-------.  1 xobs xobs   709 Sep  2 14:31 CONTRIBUTING.md
-rw-------.  1 xobs xobs  1716 Sep  2 14:31 CONTRIBUTORS
drwx------.  5 xobs xobs  4096 Sep  2 14:31 doc/
drwx------. 12 xobs xobs   199 Sep  2 14:31 example/
drwx------.  3 xobs xobs    15 Sep  2 14:31 fuzz/
-rw-------.  1 xobs xobs   235 Sep  2 14:31 go.mod
-rw-------.  1 xobs xobs   675 Sep  2 14:31 go.sum
drwx------.  2 xobs xobs    90 Sep  2 14:31 hello-wuffs-c/
drwx------.  4 xobs xobs    50 Sep  2 14:31 internal/
drwx------. 10 xobs xobs   135 Sep  2 14:31 lang/
drwx------. 17 xobs xobs   252 Sep  2 14:31 lib/
-rw-------.  1 xobs xobs 10174 Sep  2 14:31 LICENSE
-rw-------.  1 xobs xobs  9557 Sep  2 14:31 README.md
drwx------.  3 xobs xobs    15 Sep  2 14:31 release/
drwx------.  7 xobs xobs  4096 Sep  2 14:36 script/
drwx------. 13 xobs xobs   142 Sep  2 14:31 std/
drwx------.  4 xobs xobs    27 Sep  2 14:31 test/
-rw-------.  1 xobs xobs   151 Sep  2 14:31 wuffs-root-directory.txt
xobs@nas ~/C/wuffs> wuffs bench
could not find Wuffs root directory
xobs@nas ~/C/wuffs> more wuffs-root-directory.txt
This placeholder file indicates the root of the Wuffs repository.

For example, filenames like "test/data/pi.txt" are relative to this root
directory.
xobs@nas ~/C/wuffs> wuffs bench
could not find Wuffs root directory
xobs@nas ~/C/wuffs>

PNG decoder seems to get pixel format wrong

Running Wuffs on the PNG test suite (PngSuite-2013jan13),
I'm hitting some odd reporting of pixel formats:

basn0g16.png (16 bit (64k level) grayscale):

bits_per_pixel is incorrectly reported as 8. (should be 16)
likewise with channel_0_bits.

basn2c16.png (3x16 bits rgb color):

bpp incorrectly reported as 64, should be 48.
channel_3_bits incorrectly reported as 16, should be 0.

Or am I misunderstanding wuffs, and it's doing some automatic conversion in some weird way?

thanks!
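The expected values quoted above follow from the IHDR color type and bit depth. A minimal sketch of that arithmetic is below; png_native_bits_per_pixel is a hypothetical helper, and it describes the format stored in the file, which is not necessarily the format a decoder chooses to output.

```c
#include <stdint.h>

// Hypothetical helper: returns the bits per pixel as stored in a PNG file,
// given the IHDR color type and bit depth, or 0 for an invalid combination.
uint32_t png_native_bits_per_pixel(uint8_t color_type, uint8_t bit_depth) {
  uint32_t channels;
  switch (color_type) {
    case 0: channels = 1; break;  // Grayscale.
    case 2: channels = 3; break;  // Truecolor (RGB).
    case 3: channels = 1; break;  // Indexed (palette).
    case 4: channels = 2; break;  // Grayscale with alpha.
    case 6: channels = 4; break;  // Truecolor with alpha (RGBA).
    default: return 0;
  }
  int ok;
  switch (bit_depth) {
    case 1:
    case 2:
    case 4:
      ok = (color_type == 0) || (color_type == 3);
      break;
    case 8:
      ok = 1;
      break;
    case 16:
      ok = (color_type != 3);
      break;
    default:
      ok = 0;
      break;
  }
  return ok ? (channels * (uint32_t)bit_depth) : 0;
}
```

For basn0g16.png (color type 0, depth 16) this gives 16, and for basn2c16.png (color type 2, depth 16) it gives 48. A result of 64 would correspond to a four-channel, 16-bit format.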

Decode LZ4

There should be a std/lz4 package.

dumbindent: cannot handle /* slash-star comments */ or multi-line strings

Check for "not enough pixel data" and return an Error or Warning

It is possible for a frame's image data to not contain enough LZW blocks to fill the frame rect. Wuffs should report an error in this case. There is already a TODO.
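The check being requested amounts to comparing the number of pixels produced by the LZW data against the area of the frame rect. A minimal sketch, with a hypothetical function name and with the pixel count assumed to be tracked by the caller:

```c
#include <stdint.h>

// Hypothetical helper: returns 1 if fewer pixels were decoded than the frame
// rect needs, 0 otherwise. The multiplication is done in 64 bits so that it
// cannot overflow for 32-bit dimensions.
int frame_has_too_few_pixels(uint32_t frame_width,
                             uint32_t frame_height,
                             uint64_t pixels_decoded) {
  uint64_t needed = (uint64_t)frame_width * (uint64_t)frame_height;
  return pixels_decoded < needed;
}
```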

Add Bazel support to hello-wuffs-c example code

I'd like to contribute example code to hello-wuffs-c example so that it's easier for people using Bazel to get started using wuffs-the-language.

I spent a long time hacking at it and I think I have something close to working, but at the linker step I'm getting these errors, and I don't know enough C to understand what they mean:

... external/local_config_cc/cc_wrapper.sh @bazel-out/darwin-fastbuild/bin/src/main/wuffs/hello-wuffs/hello-c-2.params)
Apple clang version 11.0.0 (clang-1100.0.33.16)
Target: x86_64-apple-darwin18.7.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
 "/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld" -demangle -lto_library /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/libLTO.dylib -dynamic -arch x86_64 -headerpad_max_install_names -macosx_version_min 10.15.0 -syslibroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk -o bazel-out/darwin-fastbuild/bin/src/main/wuffs/hello-wuffs/hello-c -lc++ -S bazel-out/darwin-fastbuild/bin/src/main/wuffs/hello-wuffs/_objs/hello-c/main.o bazel-out/darwin-fastbuild/bin/src/main/wuffs/hello-wuffs/libparse.a -framework Foundation -lobjc -lSystem /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/11.0.0/lib/darwin/libclang_rt.osx.a
Undefined symbols for architecture x86_64:
  "_sizeof__wuffs_demo__parser", referenced from:
      _parse in main.o
  "_wuffs_demo__parser__initialize", referenced from:
      _parse in main.o
  "_wuffs_demo__parser__parse", referenced from:
      _parse in main.o
  "_wuffs_demo__parser__value", referenced from:
      _parse in main.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Image decode API for color spaces and gamma correction.

wuffs_base should have some way to represent or process a PNG image's gAMA, iCCP or similar chunks.

Errors when decoding gzip with many short reads and occasional harvesting of the output buffer

Hi, I've been playing around with the library and have come across a strange problem (which could be entirely due to me holding it wrong). It came up in the context of using the gzip decoder during some buffered I/O. I've tried to replicate the conditions in the C program below (please excuse the wacky constants and sloppy coding while I was trying to get to a minimal example).

The rough conditions for this error are (a) there are short reads, and (b) the destination buffer is sometimes emptied and sometimes not. In the example below, I decode 5 bytes at a time, and on every 3rd loop empty the destination buffer to the console. The decoding runs off the rails and gets the wrong answer (and a bad checksum error at the end). If I empty the destination buffer on every loop instead (or simply reset dst.meta = {0} on every loop), the decode runs correctly. Alternatively, if I leave the destination buffer alone, letting it fill up, the decode runs correctly.

Perhaps I am misunderstanding something about the contract to do with coroutine suspension: is the caller allowed to modify the dst buffer after a short read? My assumption was that while a coroutine was not running, the caller is free to compact buffers as they please. My other assumption was that after a coroutine suspends, it has given away the right to access any of the bytes in the dst buffer again.

Cheers,
Joel

#include <stdint.h>
#include <stdio.h>
#define WUFFS_IMPLEMENTATION
#include "wuffs-v0.3.c"

// First 50 Fibonacci numbers, gzipped using Python's gzip module.
uint8_t inbuf[183] = {
  0x1f, 0x8b, 0x08, 0x00, 0x07, 0x22, 0x16, 0x61, 0x02, 0xff, 0x15, 0x8f, 0xc1, 0x11, 0x00, 0x31, 0x08, 0x02, 0xff, 0x54, 0x23, 0x6a, 0x44, 0xfb, 0x6f,
  0xec, 0xb8, 0xc9, 0xcb, 0x59, 0x09, 0x6b, 0x80, 0x7e, 0x89, 0xc2, 0xc3, 0x82, 0x85, 0x24, 0xaa, 0xf1, 0x3c, 0x1d, 0xd8, 0x8d, 0xac, 0x42, 0x49, 0x18,
  0x06, 0x6e, 0x05, 0xbe, 0x13, 0xf2, 0x6d, 0xa3, 0xb9, 0xc4, 0x68, 0x1e, 0x18, 0xd7, 0x03, 0x4a, 0xf4, 0x57, 0x3b, 0x4f, 0xe8, 0xa9, 0x59, 0xe8, 0x45,
  0x9a, 0x26, 0xeb, 0x0a, 0xbc, 0x71, 0x02, 0x45, 0xad, 0xd7, 0x1e, 0x3b, 0xf3, 0xb0, 0x95, 0xd1, 0xe1, 0xde, 0x9e, 0x9c, 0x73, 0xb9, 0xb6, 0xe2, 0x50,
  0x2f, 0xfb, 0x69, 0xf1, 0x14, 0xb9, 0x2e, 0xbd, 0x4c, 0xf5, 0x5f, 0xd4, 0x57, 0x61, 0x88, 0x6c, 0x9a, 0x53, 0xa8, 0x8b, 0x5d, 0x3a, 0x3a, 0xe5, 0xc8,
  0xad, 0x35, 0xc2, 0xca, 0xc6, 0xde, 0x9e, 0xf7, 0x36, 0xd8, 0x96, 0x1a, 0x9d, 0x0b, 0x6f, 0xd0, 0x66, 0xd7, 0x5d, 0x82, 0x4c, 0x62, 0xe5, 0xf3, 0xe8,
  0xfa, 0x0b, 0x8b, 0x59, 0x64, 0x6b, 0x8a, 0xf4, 0x84, 0x3c, 0xd9, 0xfc, 0x85, 0x0a, 0xbd, 0xa1, 0x67, 0x3f, 0x0d, 0x24, 0xad, 0xda, 0xd2, 0x87, 0x0f,
  0x12, 0x69, 0x4a, 0xa8, 0x3b, 0x01, 0x00, 0x00,
};
wuffs_base__io_buffer src = {
  .data = {
    .ptr = inbuf,
    .len = sizeof(inbuf),
  },
  .meta = {0}
};

// "Infinite" output buffer (we won't use much of it).
uint8_t outbuf[10 * 1024] = {0};
wuffs_base__io_buffer dst = {
  .data = {
    .ptr = outbuf,
    .len = sizeof(outbuf),
  },
  .meta = {0}
};


int main(void) {
  uint8_t workbuf[1];
  wuffs_base__status status;

  // Init decoder
  wuffs_gzip__decoder dec;
  status = wuffs_gzip__decoder__initialize(&dec, sizeof dec, WUFFS_VERSION, 0);
  if (!wuffs_base__status__is_ok(&status)) {
    printf("Could not init decoder: %s\n", status.repr);
    return 1;
  }

  // Decode 5 bytes at a time. On every third loop, print the output and empty the dst buffer.
  for (int i = 0; i < sizeof(inbuf); i += 5) {
    if (i >= sizeof(inbuf)) {
      i = sizeof(inbuf);
      src.meta.closed = 1;
    }

    src.meta.wi = i;
    status = wuffs_gzip__decoder__transform_io(&dec, &dst, &src, (wuffs_base__slice_u8){.ptr = workbuf, .len = sizeof(workbuf)});
    if (status.repr != wuffs_base__suspension__short_read) {
      printf("Expecting a short read, was %s\n", status.repr);
      return 1;
    }

    // Change to i % 5 == 0 to have things run correctly.
    if (i % 15 == 0) {
      fwrite(dst.data.ptr + dst.meta.ri, sizeof(uint8_t), dst.meta.wi - dst.meta.ri, stdout);
      dst.meta.ri = dst.meta.wi;
      wuffs_base__io_buffer__compact(&dst);
    }
  }

  // Check whether decoding was successful.
  src.meta.wi = sizeof(inbuf);
  status = wuffs_gzip__decoder__transform_io(&dec, &dst, &src, (wuffs_base__slice_u8){.ptr = workbuf, .len = sizeof(workbuf)});
  if (status.repr != NULL) {
    printf("Decode unsuccessful: %s\n", status.repr);
    return 1;
  }

  return 0;
}

Decode PNG text chunks (tEXt, zTXt, iTXt)

Spun out of #13 (comment)

I need PNG metadata readout... complete support of text chunks (for thumbnails).

@pjanx can you clarify what "complete support" means? According to https://www.w3.org/TR/2003/REC-PNG-20031110/#11tEXt such chunks are actually key-value pairs. Do you need the keys too or only the values? iTXt chunks also have language codes (e.g. ISO 639, ISO 646) and the key can also be translated (e.g. from English to Japanese). Do you need that too?

It may be helpful if you can attach some example PNG files (with text chunks) and say what you need to crack out of them.

Odd self-assignment to `r.err`

https://github.com/google/wuffs/blob/master/lib/rac/reader.go#L441

This was present in the original commit: 7753d14

Add a "hello world outside of std" example

If anyone wants to play with Wuffs today, it's fairly safe to assume they'd want the code to live somewhere outside of the std directory. Forking and symlinks are a possibility, but they're not a very comfortable way to do things.

I propose that we add a simple example of what a working Wuffs program would be. For example, a dead simple program that you can compile to a working binary:

A capitalize.wuffs file, that takes a stream of bytes, and converts all lowercase a letters to A, before writing to an output stream
A main.c file, that uses the func/type from Wuffs above and puts it in between stdin and stdout
A Makefile or script that, using wuffs-c and $CC, generates the C code and compiles a working executable

In total, this should all be under 50 lines of code. Yet it would be a great starting point for those wanting to try something of their own with Wuffs.

Split from #4 - there I mentioned how it took me some digging through the Wuffs codebase to figure out how to make it work outside of the std directory.

Ranges don't always automatically work

$ cat foo.wuffs
packageid "xxxx"

pri const arr[4] u8 = $(1, 2, 3, 4)

pub func foo(a u8)() {
        var b u8 = in.a >> 6
        var elem u8 = arr[b]
}
$ wuffs-c gen -package_name xxxx foo.wuffs
cannot prove "b < 4": failed at foo.wuffs:7. Facts:
        b == (in.a >> 6)

However, all of the cases below work:

var b u8 = in.a >> 6
if b < 4 { // always true, redundant
        var elem u8 = arr[b]
}

var b u8 = in.a >> 6
var elem u8 = arr[b & 0x03]

Funnily enough though, this doesn't work - I presume because the fact gets lost between statements:

var b u8 = (in.a >> 6) & 0x03
var elem u8 = arr[b]

Might be related to #5.

Decode QOI

There should be a std/qoi package.

Go modules in conflict with wuffsRoot

Recently, Go modules were introduced, which sounds like the way to go (pardon the pun) with Go.

However, this conflicts with Wuffs's design, which assumes some source artifacts are available through $GOPATH/src/github.com/wuffs.

For example, wuffs gen is hardcoded to consume files from the Wuffs source root.

A more reasonable design would be not to depend on the wuffs source dir in the modules era, but to package required resources in the executable, and accept path for the sources to generate.

Even if wuffsRoot is still deemed necessary, I think we should explicitly copy the required resources to $GOPATH/src in build-all.sh, in order to have a clean copy of cloned wuffs simply work.

Support for SSE/AVX when targeting x86?

I'm using Wuffs-the-library in an application that is restricted to x86 (32-bit) and MSVC. I noticed that the definition of WUFFS_BASE__CPU_ARCH__X86_64 is guarded by a check for _M_X64 for MSVC (and __x86_64__ if not MSVC) in this place:

wuffs/internal/cgen/base/fundamental-public.h

Line 89 in 786fc74

#if defined(_M_X64)

Is there an inherent reason for that? Or would it be possible to change those checks to e.g.

#if defined(_M_X64) || defined(_M_IX86)

(and similarly __i386__ if not MSVC) to allow these extensions on x86?

At least for my application this change works without any problems and greatly improves performance for decoding PNGs which is what I use Wuffs for.

Decode Zstandard (zstd)

There should be a std/zstd package.

Directory name "aux" is invalid on Windows.

Hey folks - commit cf6c578 introduced a directory named "aux", which is invalid on Windows. (see Stack Overflow)

This results in failures during checkout on Windows:

E:\chromium.wuffs>git checkout cf6c5789736b19c1ed94c5e231e4db999e21c544
fatal: cannot create directory at 'internal/cgen/aux': Invalid argument
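A check like the following could catch such names before they are committed. It is only a sketch: is_windows_reserved_name is a hypothetical helper, it looks at a single path component, and it covers the classic device names (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9), including when they are followed by an extension. Newer Windows versions may be more lenient in some of these cases.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: returns 1 if a single path component (no slashes) is a
// classic reserved device name on Windows, compared case-insensitively and
// ignoring anything after the first '.', so "aux" and "Aux.h" both match.
int is_windows_reserved_name(const char* name) {
  char base[8];
  size_t n = 0;
  while (name[n] != '\0' && name[n] != '.') {
    if (n >= 4) {
      return 0;  // Reserved names are at most four characters long.
    }
    base[n] = (char)toupper((unsigned char)name[n]);
    n++;
  }
  base[n] = '\0';

  static const char* const fixed[] = {"CON", "PRN", "AUX", "NUL"};
  for (size_t i = 0; i < sizeof(fixed) / sizeof(fixed[0]); i++) {
    if (strcmp(base, fixed[i]) == 0) {
      return 1;
    }
  }
  if (n == 4 &&
      (memcmp(base, "COM", 3) == 0 || memcmp(base, "LPT", 3) == 0) &&
      base[3] >= '1' && base[3] <= '9') {
    return 1;
  }
  return 0;
}
```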

Want to add encoding as well?

Safe decoding is important, but decoding alone is limiting. I admit that Puffs is elegantly designed, and I want to use it to decode a proprietary binary format, but I occasionally need to write that format too. Would you consider supporting encoding eventually, to make the cost of learning a different language more worthwhile?

Take https://github.com/google/puffs/blob/master/std/gif/decode_lzw.puffs for example: could the code generator support encoding functions, so that GIF files can be written with LZW compression?

Decode TGA

Hi,
Could you support the Truevision TGA format? It's still quite popular with game developers, and supports 24bpp/32bpp images. It's quite simple to parse for the most part, although it's complex enough that security concerns are still relevant.

https://en.wikipedia.org/wiki/Truevision_TGA

Test files:
https://github.com/image-rs/image/tree/master/tests/images/tga/testsuite

Thanks.

wuffs_base__magic_number_guess_fourcc can't identify VP8X

        if (y == 0x56503820) {         // 'VP8 'be
          return 0x57503820;           // 'WP8 'be
        } else if (y == 0x5650384C) {  // 'VP8L'be
          return 0x5750384C;           // 'WP8L'be
        }

An "Extended file format" WEBP starts with a "VP8X" chunk, VP8/VP8L may come after that.

https://github.com/webmproject/libwebp/blob/master/doc/webp-container-spec.txt

I'm not entirely sure what the current WEBP magic detection is for--if it said just "WEBP", it would be workable.
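As an illustration of what detection that covers all three cases might look like, here is a sketch in plain C. It is not the Wuffs implementation: classify_webp and the enum names are hypothetical, and it only inspects the first 16 bytes (the RIFF header, the "WEBP" form type and the first chunk's FourCC).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical result type for the sketch below.
enum webp_kind {
  WEBP_KIND_NOT_WEBP = 0,
  WEBP_KIND_LOSSY,          // First chunk is "VP8 ".
  WEBP_KIND_LOSSLESS,       // First chunk is "VP8L".
  WEBP_KIND_EXTENDED,       // First chunk is "VP8X".
  WEBP_KIND_UNKNOWN_CHUNK   // RIFF/WEBP container, unrecognized first chunk.
};

// Hypothetical helper: classifies a buffer by its RIFF container header.
// Layout: "RIFF", 4-byte little-endian size, "WEBP", then the first chunk's
// 4-byte FourCC at offset 12.
enum webp_kind classify_webp(const uint8_t* data, size_t len) {
  if (len < 16 || memcmp(data, "RIFF", 4) != 0 ||
      memcmp(data + 8, "WEBP", 4) != 0) {
    return WEBP_KIND_NOT_WEBP;
  }
  const uint8_t* first = data + 12;
  if (memcmp(first, "VP8 ", 4) == 0) {
    return WEBP_KIND_LOSSY;
  }
  if (memcmp(first, "VP8L", 4) == 0) {
    return WEBP_KIND_LOSSLESS;
  }
  if (memcmp(first, "VP8X", 4) == 0) {
    return WEBP_KIND_EXTENDED;
  }
  return WEBP_KIND_UNKNOWN_CHUNK;
}
```

Keying the "is this WebP at all" decision on the "WEBP" form type alone, as suggested above, would correspond to treating every result other than WEBP_KIND_NOT_WEBP as a match.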

A way to overflow or underflow on purpose

Sometimes it is really wanted. All I could find in the repo is:

doc/wuffs-the-language.md:74:TODO: ignore-overflow ops, equivalent to Swift's `&+`.

I presume that means it is planned?
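For comparison, this is the behaviour being asked for, written in C, where arithmetic on unsigned types is defined to wrap around. The function names are hypothetical and only illustrate the semantics that Swift spells as `&+` and `&-`; they say nothing about how Wuffs would expose it.

```c
#include <stdint.h>

// Wrapping 8-bit add: the operands are promoted to int, so the sum itself
// cannot overflow, and the cast reduces it modulo 256.
uint8_t wrapping_add_u8(uint8_t a, uint8_t b) {
  return (uint8_t)(a + b);
}

// Wrapping 32-bit add: the result is reduced modulo 2^32.
uint32_t wrapping_add_u32(uint32_t a, uint32_t b) {
  return (uint32_t)(a + b);
}

// Wrapping 32-bit subtract: 0 - 1 yields 0xFFFFFFFF.
uint32_t wrapping_sub_u32(uint32_t a, uint32_t b) {
  return (uint32_t)(a - b);
}
```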

all: convert into a Go module

This would be helpful to easily track which version of Wuffs that a Wuffs program/library depends on to build correctly.

facts involving constants still need extra verbosity to work

Take this piece of code:

packageid "xxxx"

pri const four u32 = 4

pub func foo(a u32)() {
        if in.a >= four {
                return
        }
        assert in.a < 4 via "a < b: a < c; c == b"(c:four)
}

As it is, it generates with wuffs-c gen -package_name xxxx foo.wuffs just fine.

However, none of the alternative assert lines below work - when I would expect every single one of them to work.

Reasoning that 4 == four instead of four == 4: assert in.a < 4 via "a < b: a < c; b == c"(c:four). Note, however, that a lone assert 4 == four works just as well as an assert four == 4, so it seems to be a bug in these assert reason expressions.

check: cannot prove "in.a < 4": no such reason "a < b: a < c; b == c" at foo.wuffs:9. Facts:
        in.a < four

Asserting assert in.a < 4. In other words, if this works:

if in.a >= 4 {
        return
}
assert in.a < 4

I would expect this to work too, yet it doesn't (at least not without the extra reasoning in the original code):

if in.a >= four {
        return
}
assert in.a < 4

check: cannot prove "in.a < 4" at foo.wuffs:9. Facts:
        in.a < four

I'm also confused by how four == 4 doesn't appear in the error above. It only appears if I'm explicit with an assertion or invariant beforehand:

assert four == 4
assert in.a < 4

check: cannot prove "in.a < 4" at foo.wuffs:10. Facts:
        in.a < four
        four == 4

Cannot declare constants using arithmetic expressions

For example, this is very useful when declaring constant sizes in megabytes and kilobytes:

$ cat foo.wuffs
packageid "xxxx"

pri const maxSize u32 = 1 << 10 // 1KiB
$ wuffs-c gen -package_name xxxx foo.wuffs
invalid const value "1 << 10"

Would be neat if the compiler performed the calculation, similar to what the Go compiler does. For now, I'm just writing out the full numbers directly.

reasoning behind restricting vars at the top of functions

This was done in cbe19f5 - I'm curious if there was a particular reason behind it. I presume it's to simplify generating C89 code?

I'm not directly opposed to the restriction, but it does make code a bit unreadable. Perhaps it's because I've been writing a lot of Go in recent years. For example, see the diff in mvdan/zstd@7f8b358.

doc/std/image-decoders.md needs an update

ack_metadata_chunk? has been removed.

Decode APNG

std/png should support animated PNG.

creating code for other languages than C

How much effort would it take to extend the transpiler to create code for other languages, e.g. Pascal?

ractool doesn't fully support lz4 or zstd

They're partially supported, but:

They don't support -cchunksize, only -dchunksize.
They don't support -resources.

Supporting -cchunksize will require implementing a 'cut' package, similar to the existing github.com/google/wuffs/lib/zlibcut.

Supporting -resources will probably require the upstream LZ4 and Zstd C libraries to provide stable API for dictionaries + streaming. Issues filed as LZ4 #791 and Zstd #1776.

wuffs-c requires clang-format-5.0

Is there a reason why version 5.0 is explicitly required? I only have clang-format installed (version 6.0), and creating a symlink in my $PATH keeps wuffs-c running normally.

I presume that the tool is used to format the C output from the generator. Why not make it configurable though? It could, for example, default to clang-format, where one could change it to clang-format-5.0 or any other desired version.

Broken link in README.md

README.md from commit bdd35ee:
The link under "What Does Wuffs Code Look Like?" should apparently be
https://github.com/google/wuffs/blob/master/std/lzw/decode_lzw.wuffs,
not
https://github.com/google/wuffs/blob/master/std/gif/decode_lzw.wuffs.

Proof checker cannot collapse constant arithmetic

Fact:

(outIdx + 2) < 400

What I expect:

outIdx < 400 is provable

Observed:

cannot prove "outIdx < 400": failed at (...). Facts:
[. . .]
(outIdx + 2) < 400

Are methods calling other methods supported?

Whatever I try, I always hit errors like cannot convert Wuffs call "this.advance?(src:in.src)" to C. This leads me to the TODO in the Wuffs source code:

// TODO: fix this.
//
// This might involve calling g.writeExpr with replaceNothing??
return fmt.Errorf("cannot convert Wuffs call %q to C", n.Str(g.tm))

The code in question was the following:

pri struct backBitReader?(
        rem u32,
        cur u8,
        curbits u8[..8],
)

pri func backBitReader.advance?(src reader1)() {
        if this.rem == 0 {
                return error "TODO"
        }
        this.rem -= 1
        this.cur = in.src.read_u8?()
        this.curbits = 8
}

pri func backBitReader.skipPadding?(src reader1)() {
        this.advance?(src:in.src)
}

Am I missing something, or is this simply not supported yet?

"bad huffman code under subscribed" error on valid PNG

Hi,
I'm unable to load a valid PNG (see attached file) using wuffs, but this PNG loads with libpng (zlib), stb_image.h, lodepng, and validates OK with pngcheck. After some investigation it appears that if the Deflate distance table only has a single used code (with a bit length of 1), wuffs refuses to load the image.

This is incorrect behavior, as far as I can tell. From the Deflate spec 3.2.7. Compression with dynamic Huffman codes (BTYPE=10):

https://datatracker.ietf.org/doc/rfc1951/

If only one distance code is used, it is encoded using one bit, not zero bits; 
in this case there is a single code length of one, with one unused code.

The problem is in function wuffs_deflate__decoder__init_huff():

if (v_remaining != 0) {
  if ((a_which == 1) &&
      (v_counts[1] == 1) &&
      (self->private_data.f_code_lengths[a_n_codes0] == 1) &&
      ((((uint32_t)(v_counts[0])) + a_n_codes0 + 1) == a_n_codes1)) {
    self->private_impl.f_n_huffs_bits[1] = 1;
    self->private_data.f_huffs[1][0] = (WUFFS_DEFLATE__DCODE_MAGIC_NUMBERS[0] | 1);
    self->private_data.f_huffs[1][1] = (WUFFS_DEFLATE__DCODE_MAGIC_NUMBERS[31] | 1);
    return wuffs_base__make_status(NULL);
  }
  return wuffs_base__make_status(wuffs_deflate__error__bad_huffman_code_under_subscribed);
}
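
To make the RFC 1951 special case concrete, here is a minimal standalone sketch of how a set of code lengths could be classified. It is not the Wuffs implementation: the function name, the result codes and the is_distance_table parameter are all made up for illustration, and it only does the Kraft-style bookkeeping that zlib-like decoders typically do.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical result codes for this sketch; they are not Wuffs identifiers.
enum {
  CODE_COMPLETE = 0,          // the code lengths exactly fill the code space
  CODE_SINGLE_DISTANCE = 1,   // RFC 1951 special case: one distance code, one bit
  CODE_UNDER_SUBSCRIBED = 2,  // unused code space that is not the special case
  CODE_OVER_SUBSCRIBED = 3,   // more codes than the code space allows
  CODE_INVALID = 4,           // a code length above 15
};

// Classifies n code lengths, where each length is 0..15 and 0 means "unused".
int classify_code_lengths(const uint8_t* lengths, size_t n,
                          int is_distance_table) {
  uint32_t counts[16] = {0};
  for (size_t i = 0; i < n; i++) {
    if (lengths[i] > 15) {
      return CODE_INVALID;
    }
    counts[lengths[i]]++;
  }

  // remaining is the number of codes of the current length that are still
  // free. Each extra bit doubles it; each assigned code uses one up.
  int64_t remaining = 1;
  for (int len = 1; len <= 15; len++) {
    remaining = (remaining * 2) - (int64_t)counts[len];
    if (remaining < 0) {
      return CODE_OVER_SUBSCRIBED;
    }
  }
  if (remaining == 0) {
    return CODE_COMPLETE;
  }
  // Exactly one used code, of length 1, and every other entry unused.
  if (is_distance_table && (counts[1] == 1) &&
      (((size_t)counts[0]) + 1 == n)) {
    return CODE_SINGLE_DISTANCE;
  }
  return CODE_UNDER_SUBSCRIBED;
}
```

With this sketch, a distance table whose only non-zero length is a single 1 is reported as CODE_SINGLE_DISTANCE rather than as under-subscribed, which is the behavior the RFC text quoted above asks for.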

C language impl of the RAC file format

We have a Go implementation of the RAC file format. We should have a C one too (which other programming languages can bind).

Decode JPEG

There should be a std/jpeg package.

Best way to read 3-byte little endian number?

This is just another question that has popped into my mind. Not necessarily a bug report.

It's quite common to have to read numbers from any number of bits or bytes, not just the power-of-two sizes like u8, u16, and u32.

For example, when implementing zstd, in a couple of places I need to read a three-byte little endian number. At the moment, I am doing something like:

// spaghetti code to read a three-byte little endian number
var block_lower base.u32[..0xFF] = in.src.read_u8?() as base.u32
var block_upper base.u32[..0xFFFF] = in.src.read_u16le?() as base.u32
var block_header base.u32[..0xFFFFFF] = (block_upper << 8) | block_lower

Is there a better way? Or rather, should there be a better way?

I realise that this isn't strictly necessary, and that languages like Go with their encoding/binary packages don't support such a thing either out of the box. But this language being precisely for encoders and decoders, I wonder if such an "arbitrary byte length" or even "arbitrary bit length" functionality would be possible.

Ideally, I'd instead write something like:

var block_header base.u32[..0xFFFFFF] = in.src.read_u24le?()
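
For reference, this is roughly what such a read would compute, sketched in plain C. The function names are illustrative only and say nothing about what the Wuffs base library does or should provide. The second function mirrors the two-step u8 plus u16le approach from the workaround above, to show that both give the same value.

```c
#include <stdint.h>

// Reads a 24-bit little-endian unsigned integer from p[0], p[1] and p[2].
// The caller must guarantee that at least 3 bytes are readable.
uint32_t read_u24le(const uint8_t* p) {
  return ((uint32_t)p[0]) | (((uint32_t)p[1]) << 8) |
         (((uint32_t)p[2]) << 16);
}

// Same value, built as in the workaround: one u8, then one u16le.
uint32_t read_u24le_two_step(const uint8_t* p) {
  uint32_t lower = p[0];
  uint32_t upper = ((uint32_t)p[1]) | (((uint32_t)p[2]) << 8);
  return (upper << 8) | lower;
}
```

The result always fits in 24 bits, which matches the [..0xFFFFFF] refinement on block_header.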

Do you use Wuffs? Tell us!

This is not an issue so much as a lightweight way of gathering information on who is using Wuffs. This is mostly to satisfy our curiosity, but might also help us decide how to evolve the project.

So, if you use Wuffs for something, please chime in here and tell us more!

Image decode API for Region of Interest.

When you ultimately only need the top half of a 640x480 image, there should be a way to stop the decode early (for a ~2x speed-up).

Using wuffs-c (not wuffs) is awkward

The wuffs command line tool is used when working on Wuffs-the-Library. The wuffs-c command line tool is (implicitly) part of that, but it can also be used on its own for other code written in Wuffs-the-Language, including code that doesn't live in this repository.

Doing that works... sort of. But it's not a great experience. The hello-wuffs-c example has to declare a placeholder wuffs-base.c file (as it can't rely on the generated gen/c/wuffs-base.c file), and the C driver program has to #define WUFFS_CONFIG__MODULE__BASE.

Neither of these workarounds should be necessary.

internal error: temporary variable count out of sync

Stumbled upon this internal error when changing some code. Small reproducer:

$ cat decode.wuffs
packageid "zstd"

pub struct decoder?(
)

pub func decoder.decode?(dst base.io_writer, src base.io_reader)() {
        return in.src.skip32?(n:3)
}
$ wuffs-c gen -package_name zstd decode.wuffs
internal error: temporary variable count out of sync

I soon figured out that I should do in.src.skip32?(n:3); return instead, but I guess that an internal error should never happen in any case.

The test file gifplayer-muybridge.gif is not standard conformant and rust-gif update

I'm just curious how this particular gif was created. It seems that in frame 62 (and possibly more after it) the LZW compressed data is not terminated by an end code. However, the specification says:

An End of Information code is defined that explicitly indicates the end of
the image data stream. LZW processing terminates when this code is encountered.
It must be the last code output by the encoder for an image.

Where image refers to one block terminated by a block terminator.
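
To spell out what "terminated by an end code" means, here is a small C sketch. In GIF's LZW variant the clear code is 2^(minimum code size) and the End of Information code is the clear code plus one. The helper names are hypothetical, and the check assumes the codes have already been unpacked from the variable-width bit stream, which is the hard part and is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

// GIF LZW: the clear code is 2^min_code_size.
uint32_t lzw_clear_code(uint32_t min_code_size) {
  return ((uint32_t)1) << min_code_size;
}

// The End of Information code immediately follows the clear code.
uint32_t lzw_end_code(uint32_t min_code_size) {
  return lzw_clear_code(min_code_size) + 1;
}

// Returns 1 if the last of n already-unpacked codes is End of Information,
// and 0 otherwise (including when there are no codes at all).
int lzw_ends_with_end_code(const uint16_t* codes, size_t n,
                           uint32_t min_code_size) {
  if (n == 0) {
    return 0;
  }
  return codes[n - 1] == lzw_end_code(min_code_size);
}
```

A strict decoder would reject an image for which this check fails; a lenient one would accept whatever pixels were produced before the data ran out.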

This was noticed when upgrading the Rust gif library to a more principled decoder. (The old decoder was incredibly lenient in the structure of data permitted, even outputting additional data after the end code). By the way, this yields new performance numbers which compare quite favorably. As measured:

Benchmarkrust_gif_decode_1k_bw                 20000           3125 ns/op   327.585 MB/s
Benchmarkrust_gif_decode_1k_color              10000           9139 ns/op   110.285 MB/s
Benchmarkrust_gif_decode_10k_bgra               1000         121340 ns/op   332.287 MB/s
Benchmarkrust_gif_decode_10k_indexed            1000          97152 ns/op   103.754 MB/s
Benchmarkrust_gif_decode_20k                     500         152167 ns/op   126.176 MB/s
Benchmarkrust_gif_decode_100k_artificial         150         516036 ns/op   267.236 MB/s
Benchmarkrust_gif_decode_100k_realistic          100        1136677 ns/op   121.322 MB/s
Benchmarkrust_gif_decode_1000k                    10        8158620 ns/op   122.659 MB/s
Benchmarkrust_gif_decode_anim_screencap           10        8946771 ns/op   519.986 MB/s
Benchmarkrust_gif_decode_1k_bw                 20000           3140 ns/op   326.044 MB/s
Benchmarkrust_gif_decode_1k_color              10000           9129 ns/op   110.410 MB/s
Benchmarkrust_gif_decode_10k_bgra               1000         122251 ns/op   329.811 MB/s
Benchmarkrust_gif_decode_10k_indexed            1000          97687 ns/op   103.186 MB/s
Benchmarkrust_gif_decode_20k                     500         157508 ns/op   121.897 MB/s
Benchmarkrust_gif_decode_100k_artificial         150         530540 ns/op   259.931 MB/s
Benchmarkrust_gif_decode_100k_realistic          100        1146479 ns/op   120.284 MB/s
Benchmarkrust_gif_decode_1000k                    10        8132214 ns/op   123.058 MB/s
Benchmarkrust_gif_decode_anim_screencap           10        8920712 ns/op   521.505 MB/s
Benchmarkrust_gif_decode_1k_bw                 20000           3147 ns/op   325.321 MB/s
Benchmarkrust_gif_decode_1k_color              10000           9086 ns/op   110.935 MB/s
Benchmarkrust_gif_decode_10k_bgra               1000         121492 ns/op   331.873 MB/s
Benchmarkrust_gif_decode_10k_indexed            1000         100702 ns/op   100.096 MB/s
Benchmarkrust_gif_decode_20k                     500         156993 ns/op   122.297 MB/s
Benchmarkrust_gif_decode_100k_artificial         150         530643 ns/op   259.880 MB/s
Benchmarkrust_gif_decode_100k_realistic          100        1170923 ns/op   117.773 MB/s
Benchmarkrust_gif_decode_1000k                    10        8373270 ns/op   119.515 MB/s
Benchmarkrust_gif_decode_anim_screencap           10        8980037 ns/op   518.059 MB/s
Benchmarkrust_gif_decode_1k_bw                 20000           3138 ns/op   326.319 MB/s
Benchmarkrust_gif_decode_1k_color              10000           9135 ns/op   110.334 MB/s
Benchmarkrust_gif_decode_10k_bgra               1000         121609 ns/op   331.551 MB/s
Benchmarkrust_gif_decode_10k_indexed            1000          96996 ns/op   103.921 MB/s
Benchmarkrust_gif_decode_20k                     500         150769 ns/op   127.346 MB/s
Benchmarkrust_gif_decode_100k_artificial         150         511933 ns/op   269.378 MB/s
Benchmarkrust_gif_decode_100k_realistic          100        1145734 ns/op   120.363 MB/s
Benchmarkrust_gif_decode_1000k                    10        8124121 ns/op   123.180 MB/s
Benchmarkrust_gif_decode_anim_screencap           10        9107758 ns/op   510.795 MB/s
Benchmarkrust_gif_decode_1k_bw                 20000           3216 ns/op   318.332 MB/s
Benchmarkrust_gif_decode_1k_color              10000           9245 ns/op   109.020 MB/s
Benchmarkrust_gif_decode_10k_bgra               1000         122231 ns/op   329.865 MB/s
Benchmarkrust_gif_decode_10k_indexed            1000          98080 ns/op   102.773 MB/s
Benchmarkrust_gif_decode_20k                     500         152815 ns/op   125.641 MB/s
Benchmarkrust_gif_decode_100k_artificial         150         522883 ns/op   263.737 MB/s
Benchmarkrust_gif_decode_100k_realistic          100        1156334 ns/op   119.259 MB/s
Benchmarkrust_gif_decode_1000k                    10        8178582 ns/op   122.360 MB/s
Benchmarkrust_gif_decode_anim_screencap           10        9022169 ns/op   515.640 MB/s

I'll provide another update once a version that ignores the missing end code is published, as the benchmark currently fails.

Add support for reading X11 cursors

It is an uncompressed format using alpha-premultiplied ARGB32, and it supports animations:
https://www.x.org/releases/current/doc/man/man3/Xcursor.3.xhtml

Reading in only one nominal size would already be useful, e.g., the first contiguous stream of equivalent dimensions in the file.

I might try to implement it myself. I suspect the greatest issue will be handling seeking, since the format uses byte offsets. They are always little-endian, because that is what libXcursor does.
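
As a starting point, here is a rough C sketch of parsing just the file header. The layout is an assumption based on my reading of the Xcursor man page linked above (magic "Xcur", then header size, version and number of table-of-contents entries, each a little-endian 32-bit value) and should be checked against that page. The type and function names are made up for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

// Assumed file header layout: four little-endian 32-bit fields, namely the
// magic "Xcur", the header size in bytes, the file version and the number of
// table-of-contents entries.
typedef struct {
  uint32_t header_size;
  uint32_t version;
  uint32_t ntoc;
} xcur_header;

static uint32_t xcur_read_u32le(const uint8_t* p) {
  return ((uint32_t)p[0]) | (((uint32_t)p[1]) << 8) |
         (((uint32_t)p[2]) << 16) | (((uint32_t)p[3]) << 24);
}

// Returns 0 on success, or -1 if the buffer is too short or the magic is
// wrong. On failure, *out is left untouched.
int xcur_parse_header(const uint8_t* p, size_t n, xcur_header* out) {
  if (n < 16) {
    return -1;
  }
  if ((p[0] != 'X') || (p[1] != 'c') || (p[2] != 'u') || (p[3] != 'r')) {
    return -1;
  }
  out->header_size = xcur_read_u32le(p + 4);
  out->version = xcur_read_u32le(p + 8);
  out->ntoc = xcur_read_u32le(p + 12);
  return 0;
}
```

The table of contents that follows holds the absolute byte offsets of the images, so this is where the seeking concern mentioned above comes in: a streaming decoder would have to either buffer the file or ask the caller to seek.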
