cros-codecs's Introduction

Cros-codecs

crates.io docs.rs

A lightweight, simple, low-dependency, and hopefully safe crate for hardware-accelerated video decoding and encoding on Linux.

It is developed for use in ChromeOS (particularly crosvm), but has no dependency on ChromeOS and should be usable anywhere.

Current features

  • Simple decoder API,
  • VAAPI decoder support (using cros-libva) for H.264, H.265, VP8, VP9 and AV1,
  • VAAPI encoder support for H.264, VP9 and AV1,
  • Stateful V4L2 encoder support.

Planned features

  • Stateful V4L2 decoder support,
  • Stateless V4L2 decoder support,
  • Support for more encoder codecs,
  • C API to be used in non-Rust projects.

Non-goals

  • Support for systems other than Linux.

Example programs

The ccdec example program can decode an encoded stream and write the decoded frames to a file. As such it can be used for testing purposes.

$ cargo build --examples
$ ./target/debug/examples/ccdec --help
Usage: ccdec <input> [--output <output>] --input-format <input-format> [--output-format <output-format>] [--synchronous] [--compute-md5 <compute-md5>]

Simple player using cros-codecs

Positional Arguments:
  input             input file

Options:
  --output          output file to write the decoded frames to
  --input-format    input format to decode from.
  --output-format   pixel format to decode into. Default: i420
  --synchronous     whether to decode frames synchronously
  --compute-md5     whether to display the MD5 of the decoded stream, and at
                    which granularity (stream or frame)
  --help            display usage information

Testing

Fluster can be used for testing, using the ccdec example program described above. This branch contains support for cros-codecs testing. Just make sure the ccdec binary is in your PATH, and run Fluster using one of the ccdec decoders, e.g.

python fluster.py run -d ccdec-H.264 -ts JVT-AVC_V1

Credits

The majority of the code in the initial commit has been written by Daniel Almeida as a VAAPI backend for crosvm, before being split into this crate.

cros-codecs's People

Contributors

bgrzesik, cazou, dwlsalmeida, gnurou


cros-codecs's Issues

Remove the BackendData generic argument of VaapiBackend

The BackendData generic argument of VaapiBackend has been introduced by the H.265 code, and is used exclusively by it.

It would be preferable (and more readable) to keep VaapiBackend codec-agnostic and remove that argument. The H.265 BackendData has two members:

  • last_slice, set and used in submit_last_slice and replace_last_slice,
  • va_references, set in handle_picture and used in decode_slice.

It should be possible to replace them by generic associated types of the H265 stateless backend, and have the H.265 decoder manage and pass them to the needed functions.

[Fluster] vp80-00-comprehensive-006 failure

$ python fluster.py run -d cros-codecs-VP8 -tv vp80-00-comprehensive-006
****************************************************************************************************
Running test suite VP8-TEST-VECTORS with decoder cros-codecs-VP8
Test vectors vp80-00-comprehensive-006
Using 12 parallel job(s)
****************************************************************************************************

[TEST SUITE      ] (DECODER        ) TEST VECTOR               ... RESULT
----------------------------------------------------------------------
[VP8-TEST-VECTORS] (cros-codecs-VP8) vp80-00-comprehensive-006 ... Fail


=======================================================================
FAIL: vp80-00-comprehensive-006 (cros-codecs-VP8.VP8-TEST-VECTORS)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/mnt/stateful_partition/fluster/fluster/test.py", line 106, in _test
    self.test_vector.name,
AssertionError: '2d5fa3ec2f88404ae7b305c1074036f4' != '0c345d46643d9da1e1369e8ceb98e122'
- 2d5fa3ec2f88404ae7b305c1074036f4
+ 0c345d46643d9da1e1369e8ceb98e122
 : vp80-00-comprehensive-006

Ran 0/1 tests successfully               in 0.205 secs

The resulting output is visually close to the expected result, but as if the image was offset a little. It's also not a full-pixel shift: the values of most pixels are slightly different, and that from the first frame, as if the image was zoomed in just a little bit.

The bitstream has the unusual resolution of 175x143 (vs the usual 176x144); this is probably related.

Get reference to decoded frame from `StatelessVideoDecoder`

Hi, I'm new to this library, so I apologise in advance if I'm using it wrong. Please feel free to point out any usage errors if that is the case.

I am trying to use a StatelessVideoDecoder to decode an H264 video stream. I've set this up and am repeatedly calling decode on the StatelessVideoDecoder and checking for events using next_event. On the FrameReady event, I call dyn_picture on the given DecodedHandle and then dyn_mappable_handle on the DynHandle.

Now, I can get data from the frame, but only using the read function. As far as I can tell, I can't downcast the DynHandle to an Image (although I'm not sure if that is the only implementation of the DynHandle trait), meaning I cannot get a direct [u8] reference to the buffer with the decoded frame.

As my application is performance-sensitive, I am trying to reduce the number of copies and would like to avoid copying the buffer on my end, so the read function won't cut it. Am I using the library wrong, or is something like this not possible at the moment? I would be interested in contributing a PR if this is the case.

We shouldn't implement codec-specific methods for VaapiBackend

We currently implement quite a few methods for VaapiBackend<(), M> in stateless/h264/vaapi.rs. This pollutes the namespace, as a structure used for all backends now contains methods that only work for a given codec, without any extra generic parameters to hide them away or otherwise preclude them from being used elsewhere.

We can possibly get rid of a lot of generics through Cow<'_, [u8]>

I am experimenting with a new design for the AV1 code, one that uses Cow<'_, [u8]> in place of T: AsRef<[u8]>. This means that we can remove the T: AsRef<[u8]> noise from a lot of functions and traits, while still making it possible to build Slices, Frames, etc. out of both borrowed and owned data without copying.

What's more, Cow<'_, [u8]> dereferences to &[u8] so nothing fundamentally changes in the code as far as I can see. Not only that, but we de facto do not even support anything other than &[u8] in the decoders, as introducing T in our backend traits would make them not object safe. Using Cow<'_, [u8]> circumvents this problem, as there's no generics involved and lifetimes do not break object safety.

If the current AV1 design proves successful, we should maybe backport these changes to the other codecs as a way to clean up the codebase by a considerable amount.
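The core property the issue relies on can be shown in a short sketch. The Slice wrapper below is a hypothetical stand-in for the crate's types: one concrete, non-generic type holds either borrowed or owned bytes, and Cow's deref means consumers still just see &[u8].

```rust
use std::borrow::Cow;

// Hypothetical slice wrapper: one concrete type covers both borrowed and
// owned bitstream data, so traits taking it can stay object-safe.
struct Slice<'a> {
    data: Cow<'a, [u8]>,
}

impl<'a> Slice<'a> {
    fn new(data: Cow<'a, [u8]>) -> Self {
        Slice { data }
    }

    // Cow<'_, [u8]> dereferences to &[u8], so consumers are unchanged.
    fn bytes(&self) -> &[u8] {
        &self.data
    }
}

fn main() {
    let borrowed = Slice::new(Cow::Borrowed(&[0u8, 1, 2][..]));
    let owned = Slice::new(Cow::Owned(vec![3u8, 4, 5]));
    assert_eq!(borrowed.bytes(), &[0, 1, 2]);
    assert_eq!(owned.bytes(), &[3, 4, 5]);
}
```

Because only a lifetime parameter is involved (not a type parameter), a trait taking Slice<'_> remains object-safe, which is the crux of the argument above.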

Can't iterate over ReadyFramesQueue<T> without consuming it

The actual code does not seem to match the docs:

/// Allows us to manipulate the frames list like an iterator without consuming it and resetting its
/// display order counter.
impl<'a, T> Iterator for &'a mut ReadyFramesQueue<T> {
    type Item = T;

    /// Returns the next frame (if any) waiting to be dequeued.
    fn next(&mut self) -> Option<T> {
        self.queue.pop_front()
    }
}

iter_mut() calls pop_front(), so it is actually not possible to iterate without removing items.
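A non-consuming iteration matching what the doc comment promises could look like the sketch below, assuming (as in the snippet above) a VecDeque-backed queue; the struct here is a simplified stand-in for ReadyFramesQueue.

```rust
use std::collections::VecDeque;

// Minimal stand-in for ReadyFramesQueue, assuming a VecDeque-backed queue.
struct ReadyFramesQueue<T> {
    queue: VecDeque<T>,
}

impl<T> ReadyFramesQueue<T> {
    // Yields references instead of popping, so iterating leaves the queue
    // (and its display order counter) intact.
    fn iter(&self) -> impl Iterator<Item = &T> + '_ {
        self.queue.iter()
    }
}

fn main() {
    let q = ReadyFramesQueue {
        queue: VecDeque::from(vec![1, 2, 3]),
    };
    let seen: Vec<i32> = q.iter().copied().collect();
    assert_eq!(seen, vec![1, 2, 3]);
    assert_eq!(q.queue.len(), 3); // the queue was not consumed
}
```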

`decoder::stateless::h264::vaapi::tests::test_25fps_block` regression on Intel

This is a regression introduced by 39e3d00:

thread 'decoder::stateless::h264::vaapi::tests::test_25fps_block' panicked at 'called `Result::unwrap()` on an `Err` value: decoder error: decoder error: while syncing picture

The sync() call on submit_picture when in blocking mode causes an internal decoder error on the VAAPI side. This doesn't happen in non-blocking mode, and also doesn't happen prior to commit 39e3d00. The AMD driver also seems to be unaffected.

Guess the first thing to do would be to set LIBVA_TRACE and check whether the sequence of calls to libva before and after this CL is different - it should not, but obviously something has changed.

`StatelessVideoDecoder` implementations for `StatelessDecoder` can be factorized?

We have one implementation per codec, with many methods that are strictly identical. We should be able to factorize this by implementing the methods that differ as regular methods of StatelessDecoder, and having a single StatelessVideoDecoder impl block that calls these.

next_event would probably require another trait method on the decoder device.
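The shape of that factorization might look like the following sketch, where the per-codec differences live behind a helper trait and one blanket impl provides the public trait for every codec. All names here are illustrative, not the crate's actual traits.

```rust
// Hypothetical helper trait holding the methods that differ per codec.
trait StatelessDecoderOps {
    fn codec_name(&self) -> &'static str;
}

// The public trait all clients see.
trait StatelessVideoDecoder {
    fn describe(&self) -> String;
}

// A single impl block serves every codec implementing the ops trait,
// replacing one hand-written impl per codec.
impl<D: StatelessDecoderOps> StatelessVideoDecoder for D {
    fn describe(&self) -> String {
        format!("stateless {} decoder", self.codec_name())
    }
}

struct H264Decoder;

impl StatelessDecoderOps for H264Decoder {
    fn codec_name(&self) -> &'static str {
        "H.264"
    }
}

fn main() {
    assert_eq!(H264Decoder.describe(), "stateless H.264 decoder");
}
```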

VAAPI: format map should only be used when trying to map frames for the CPU

As far as decoding is concerned, the only relevant format is the RT format. The VA_FOURCC_* are only used to present a view of the buffer in the layout given by the fourcc.

However, the current code requires a fourcc to be specified (or at least chosen by default) in the open method, even if we have no intent to read the decoded result using the CPU. This should probably be changed to something that does not involve fourccs until we actually want to map a decoded buffer.

The interface would probably be such that we can query which DecodedFormats are supported by the decoder with the current settings; then we can request a mapping of a given frame to one of the supported formats.

[Fluster] vp80-03-segmentation-1401 failure

$ python fluster.py run -d cros-codecs-VP8 -tv vp80-03-segmentation-1401
****************************************************************************************************
Running test suite VP8-TEST-VECTORS with decoder cros-codecs-VP8
Test vectors vp80-03-segmentation-1401
Using 12 parallel job(s)
****************************************************************************************************

[TEST SUITE      ] (DECODER        ) TEST VECTOR               ... RESULT
----------------------------------------------------------------------
[VP8-TEST-VECTORS] (cros-codecs-VP8) vp80-03-segmentation-1401 ... Fail


=======================================================================
FAIL: vp80-03-segmentation-1401 (cros-codecs-VP8.VP8-TEST-VECTORS)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/mnt/stateful_partition/fluster/fluster/test.py", line 106, in _test
    self.test_vector.name,
AssertionError: 'f7acb74e99528568714129e2994ceca5' != '6e0ed3dd845a44b1239059942e5e53b0'
- f7acb74e99528568714129e2994ceca5
+ 6e0ed3dd845a44b1239059942e5e53b0
 : vp80-03-segmentation-1401

Ran 0/1 tests successfully               in 0.106 secs

Only the first frame appears to be incorrect (and heavily damaged), other frames look ok.

Using LIBVA_TRACE, I have compared what we send to libva between cros-codecs-VP8 and FFmpeg-VP8-VAAPI (which passes the test). I found this difference on the first frame:

cros-codecs: bool_coder_ctx: range = f8, value = 58, count = 7
ffmpeg: bool_coder_ctx: range = f8, value = 59, count = 7

So I tried to make the value match what we see with FFmpeg using the following patch:

diff --git a/src/decoders/vp8/backends/vaapi.rs b/src/decoders/vp8/backends/vaapi.rs
index f3173ef9..ccdc9097 100644
--- a/src/decoders/vp8/backends/vaapi.rs
+++ b/src/decoders/vp8/backends/vaapi.rs
@@ -148,9 +148,12 @@ impl VaapiBackend<Header> {
             u32::from(frame_hdr.loop_filter_level() == 0),
         );

+        let mut bd_value = frame_hdr.bd_value();
+        if bd_value == 0x58 { bd_value = 0x59 };
+
         let bool_coder_ctx = libva::BoolCoderContextVPX::new(
             u8::try_from(frame_hdr.bd_range())?,
-            u8::try_from(frame_hdr.bd_value())?,
+            u8::try_from(bd_value)?,
             u8::try_from(frame_hdr.bd_count())?,
         );

And lo and behold, the test passes with this workaround!

Could this be a bug in the boolean decoder somehow?

Limit use of anyhow

Anyhow should only be used when the returned errors are not well defined. For most cases, thiserror is a better and lighter fit. This issue is to track the removal of anyhow from the codec and decoder code.

Feature: V4L2 stateless decoder support

This issue is to track the development of the V4L2 stateless decoder support. Since we have all the codec parsers already in place, it should just consist of extending and using v4l2r to fill in the appropriate V4L2 control structures and control the device.

JVT-FR-EXT test suite has lots of failures

JVT-FR-EXT is failing quite badly compared to JVT-AVC_V1.

ccdec has 12/69 tests passing.

GStreamer-H.264-VAAPI-Gst1.0 has 43/69 tests passing.

Attached is the first frame from one of the first failing tests (brcm_freh3); it seems like key frames are not decoded correctly.

brcm_freh3_ccdec

Failure to decode H.264 and VP9 in YUV422 and YUV444 formats

The vp91-2-04-yuv422.webm and vp91-2-04-yuv444.webm are failing with vaEndPicture returning a VA_STATUS_ERROR_INVALID_PARAMETER error.

This is a bit strange, as vp93-2-20-10bit-yuv444.webm and vp93-2-20-12bit-yuv444.webm, which are also 4:4:4 tests but with higher bit depths, pass successfully; this suggests that our handling of 4:4:4 is at least correct.

On the other hand vp93-2-20-10bit-yuv422.webm and vp93-2-20-12bit-yuv422.webm are also failing.

It is unclear whether this is a problem with cros-codecs or VAAPI. On these tests the FFmpeg-VP9-VAAPI decoder seems to always fall back to software decoding.

decoder/stateless/h265: output buffers computation does not look correct

We currently need to add 16 to the max_dpb_size reported by the Sps in order for all Fluster tests to pass. RPS_E_qualcomm_5.bit is the only one that requires this.

In order to avoid wasting memory we don't submit the workaround, but here it is:

diff --git a/src/decoder/stateless/h265/vaapi.rs b/src/decoder/stateless/h265/vaapi.rs
index 43e5a1914..fb337d11c 100644
--- a/src/decoder/stateless/h265/vaapi.rs
+++ b/src/decoder/stateless/h265/vaapi.rs
@@ -124,7 +124,9 @@ impl VaStreamInfo for &Sps {
     }

     fn min_num_surfaces(&self) -> usize {
-        self.max_dpb_size() + 4
+        // TODO: + 16 is needed to make Fluster's RPS_E_qualcomm_5.bit test pass. Other tests are
+        // happy with + 4. We should investigate why this is the case.
+        self.max_dpb_size() + 16
     }

     fn coded_size(&self) -> (u32, u32) {

decoder/stateless: backends should check the number of available output buffers, if needed.

Revealed by PR #83.

The stateless decoders check the number of available output buffers before submitting a picture to the backend. This is needed for VAAPI, but not for V4L2 stateless where we on the contrary need an input buffer to proceed.

This means that backends should validate that they have the resources required to perform their operation. This could probably be done as part of new_picture (which should be extended to all codecs), which could reserve the resources required to process a given picture. Codecs like VP9 or AV1 that can process several frames per unit of input would call new_picture as many times as necessary, return if any call fails, and then process the pictures sequentially.

[Fluster] H.264 extended profile not listed in VAAPI (BA3_SVA_C failure)

Fluster's JVT-AVC_V1::BA3_SVA_C test vector requires the extended profile, which we don't support yet:

cargo run --example ccdec -- BA3_SVA_C.264
    Finished dev [unoptimized + debuginfo] target(s) in 0.01s
     Running `target/debug/examples/ccdec BA3_SVA_C.264`
thread 'main' panicked at 'backend error: Invalid profile_idc 88', /usr/local/google/home/acourbot/Work/cros-codecs/src/utils.rs:234:27
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

This profile is not listed in the VAProfile enum: https://github.com/intel/libva/blob/master/va/va.h#L502

Should we substitute another profile for it? The table on this post seems to suggest some features are exclusive to the extended profile, so there is no obvious candidate...

Add support for more H.264 pixel formats

Support for 4:2:2, 4:4:4 and 10 and 12 bit HDR formats has been recently added and validated with VP9, which already had the code to detect HDR formats.

H.264 is lacking that code, but once this is added we should be able to pass more of the JVT-FR-EXT tests.

h264: unwrapping of None value

This unwrap in src/decoder/stateless/h264.rs seems to fail once in a while when playing around with ARCVM:

    /// Submits the picture to the accelerator.
    fn submit_picture(&mut self) -> Result<(PictureData, T), DecodeError> {
        let picture = self.cur_pic.take().unwrap();

We should rename all occurrences of the term "Surface" in non-vaapi code

This will be very confusing when/if other backends (like v4l2 backends) come along.

For consistency, we should rename this at the decoder level.

My suggestion is to use the term resource, meaning some backend-specific buffer or data structure consumed during decode, like VASurface, V4L2Request, V4L2Buffer, VkMemory, etc.

h264: we should cope with a new picture during decode_access_unit

Somewhat related to #33

The current code will check for enough surfaces during decode_access_unit.

But again, aside from our own FrameIterator (which is used for ccdec), we are not in control of the order of NALUs passed in by users of the library. This can create a problem if we receive the following sequence:

<sps> <pps> <slice A> <slice B> <slice C> <slice D> ... <slice N>

Where A..N are different pictures.

I have just realized that we do not plan for the above scenario, so the first order of business is checking first_mb_in_slice==0 in order to detect that a slice refers to a new picture. Or better yet, we can reuse the logic from Chromium.

While not ideal, we can cope with identifying a new picture in decode_access_unit, but we must:

a) finish the current picture,
b) make sure that we have a surface to decode it to

Note that we already do a) if we notice a new field picture, but not a new frame. This also means moving the check of the current number of surfaces to begin_picture() so that we can have b)
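The boundary detection described above could be sketched as follows, assuming (per this issue) that first_mb_in_slice == 0 marks the start of a new picture; the Slice struct and function name are hypothetical.

```rust
// Hypothetical slice header: only the field relevant to picture boundaries.
struct Slice {
    first_mb_in_slice: u32,
}

// Group a run of slice NALUs into pictures, starting a new picture whenever
// first_mb_in_slice == 0 (the simplified criterion from this issue).
fn split_into_pictures(slices: &[Slice]) -> Vec<Vec<&Slice>> {
    let mut pictures: Vec<Vec<&Slice>> = Vec::new();
    for slice in slices {
        if slice.first_mb_in_slice == 0 || pictures.is_empty() {
            // Finish the current picture and start another.
            pictures.push(Vec::new());
        }
        pictures.last_mut().unwrap().push(slice);
    }
    pictures
}

fn main() {
    // <slice A0> <slice A1> <slice B0>: two pictures.
    let slices = [
        Slice { first_mb_in_slice: 0 },
        Slice { first_mb_in_slice: 40 },
        Slice { first_mb_in_slice: 0 },
    ];
    let pictures = split_into_pictures(&slices);
    assert_eq!(pictures.len(), 2);
    assert_eq!(pictures[0].len(), 2);
}
```

In the real decoder, "finish the current picture" would also involve checking that a surface is available before starting the next one, as points a) and b) above require.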

Releasing a new version with encoders

First, thanks a lot for your work on this crate!

I'm able to successfully use the stateless VA-API H.264 encoder (and decoder). However as it's not currently included in the version released on crates.io, I have to add cros-codecs as a git dependency, which prevents me from publishing my own crate on crates.io (as crates.io requires that all dependencies be published on crates.io as well).

Would you consider publishing a new release on crates.io that would include the (stateless) encoders?

Feature: stateful decoder interface

The only decoder interface that we support at the moment is the stateless one. For ease of use, clients should rather use a stateful API, close to the one proposed by WebCodecs.

This means:

  • Defining a StatefulDecoder trait,
  • Having an implementation of this trait that can wrap a stateless decoder and expose the stateful interface on top of it,
  • When we have basic V4L2 support, writing another implementation of this trait to support stateful V4L2 devices.

We should not update the parser state when peeking a SPS

I found out this corner case when developing h.265. Apparently, this is not picked up by any of the tests in the h264 suites we are using, but nevertheless it is still a possibility in the wild.

This line is wrong, because it updates the parser state with a new SPS before processing any pending slices. This can break the decoding process if we send the decoder such a sequence of NALUs:

<old slice> <old slice> <new SPS> <new slice>

The current code will peek the SPS and update the parser state before processing <old slice>s. In the likely event that the SPS simply overwrites a previous SPS held in the parser (by using the same sps_id, e.g.: 0), then the old slices may wrongly refer to a new SPS.

It can be a difficult bug to track down as well.

Note that we are not in control of the sequence of NALUs we are given, as our frame iterators are only used by ccdec, while the client code is in charge of doing that process when using cros-codecs as a library. For our virtualization use-case, it is expected that the guest userspace will have a sane implementation before submitting the data to the virtio video driver.

In order to fix this, we should introduce a new parser function, peek_sps(), which either doesn't take self or takes an immutable reference. This new function will parse a SPS without saving it in the parser. The current function, parse_sps() can be redefined in terms of peek_sps() + an internal save_sps() parser function.

This design fixed a crash in one of the h265 tests.
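The proposed split could be sketched as below, with a heavily simplified Sps and parser; peek_sps and save_sps follow the names suggested above, and the "parsing" is a placeholder.

```rust
use std::collections::HashMap;

#[derive(Clone)]
struct Sps {
    sps_id: u32,
    width: u32,
}

#[derive(Default)]
struct Parser {
    sps_map: HashMap<u32, Sps>,
}

impl Parser {
    // Parses an SPS without touching parser state (placeholder parsing:
    // byte 0 is the sps_id, byte 1 the width in macroblocks).
    fn peek_sps(data: &[u8]) -> Sps {
        Sps {
            sps_id: u32::from(data[0]),
            width: u32::from(data[1]) * 16,
        }
    }

    // Internal: commit an SPS to the parser state.
    fn save_sps(&mut self, sps: Sps) {
        self.sps_map.insert(sps.sps_id, sps);
    }

    // parse_sps redefined as peek_sps() + save_sps().
    fn parse_sps(&mut self, data: &[u8]) -> Sps {
        let sps = Self::peek_sps(data);
        self.save_sps(sps.clone());
        sps
    }
}

fn main() {
    let mut parser = Parser::default();
    // Peeking does not update state: pending slices still see the old SPS.
    let _peeked = Parser::peek_sps(&[0, 22]);
    assert!(parser.sps_map.is_empty());
    // Only an explicit parse_sps() commits the new SPS.
    let sps = parser.parse_sps(&[0, 22]);
    assert_eq!(parser.sps_map[&0].width, sps.width);
}
```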

`codec` module should never panic under any condition

Codecs should be hardened to the point where they cannot panic under any circumstance. This means no unwrap, no array indexes that could be out-of-bounds, etc. Any problem with the codec should return a specific error.

The problem is with detecting these panic conditions. There are a few features that could help:

  • The missing_panics_doc, unwrap_used and expect_used clippy lints are great to warn about common panic points (the first one also warning upon panic! and assert!).
  • The no_panic crate looks also helpful, but is limited to actual programs and requires some level of optimization to be really useful.

h264: a slice may refer to a previous SPS but we do not renegotiate

This is not tested by any of our h264 test suites, but a slice may very well refer to an earlier SPS than the latest one parsed. If we identify this, we should renegotiate if needed using the SPS referred to by the slice, instead of peeking for a new one.

We should also:

a) refrain from processing any more input in decode_access_unit() if this is detected
b) return the actual bitstream offset, so that the client can reissue the call with the data only from that point on, i.e.: decoder.decode(&bitstream[current_offset..]).

For b), we can query the current offset from the cursor and return it in DecodingState::AwaitingFormat, together with the actual SPS referred to by the slice.

[Fluster] CAWP5_TOSHIBA_E.264 fails to parse

cargo run --example ccdec -- CAWP5_TOSHIBA_E.264
    Finished dev [unoptimized + debuginfo] target(s) in 0.01s
     Running `target/debug/examples/ccdec CAWP5_TOSHIBA_E.264`
thread 'main' panicked at 'decoder error: No ShortTerm reference found with pic_num -2', /usr/local/google/home/acourbot/Work/cros-codecs/src/utils.rs:234:27
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

FFmpeg-VAAPI is able to pass this test successfully, so this is likely an issue with our H.264 parser.

H.264/5: confirm `visible_rectangle`'s semantics

While this method returns a rectangle with min and max members, its max member seems to only take the width and height into account.

This ambiguity propagates to the visible_rect method of the VAAPI backend, which should probably use the same Rect type as these methods?

The computation of the visible width and height is performed in the visible_rectangle and in the parse_sps methods, using apparently slightly different methods. Need to verify with the spec which one is correct - also the pre-computed values in parse_sps are never used afterwards.

We should maybe expose DecoderState to upper layers

Imagine some broken media where no SPS or no vpx frames can be parsed at all.

The current design will exit normally, even though no output was produced and nothing useful was done. The decoder will forever be in the DecodingState::AwaitingStreamInfo state.

If by any means we detect EOS (be it at the virtio level, or in cros-codecs itself when executing tests or ccdec) and the decoder is still in that state, we should error out so as to make the user aware that the media is malformed and/or corrupt.

How exactly that is to be done is something to be discussed. My initial idea is to add something along the lines of:

fn decoding_state(&self) -> DecodeState<Box<dyn Any>>

To the VideoDecoder trait.

Users can then query the state at EOS to manually assert that the decoder is indeed in DecodingState::Decoding and proceed accordingly.

If for any reason we do not want to copy and/or expose T to clients, then another option is:

fn decoding_state(&self) -> DecodeState<()>

in which the decoders would convert AwaitingFormat(T) to AwaitingFormat(()).

@Gnurou any other ideas?
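The second option (erasing T) could look like the sketch below; DecodingState and the decoder struct are simplified stand-ins for the crate's types.

```rust
// Simplified stand-in for the decoder's state enum.
#[derive(Debug, PartialEq)]
enum DecodingState<T> {
    AwaitingStreamInfo,
    AwaitingFormat(T),
    Decoding,
}

struct Decoder {
    // The codec-specific payload (e.g. a parsed SPS), kept internal.
    state: DecodingState<Vec<u8>>,
}

impl Decoder {
    // Convert AwaitingFormat(T) to AwaitingFormat(()) so T is never exposed.
    fn decoding_state(&self) -> DecodingState<()> {
        match &self.state {
            DecodingState::AwaitingStreamInfo => DecodingState::AwaitingStreamInfo,
            DecodingState::AwaitingFormat(_) => DecodingState::AwaitingFormat(()),
            DecodingState::Decoding => DecodingState::Decoding,
        }
    }
}

fn main() {
    // A decoder fed broken media never leaves AwaitingStreamInfo; a client
    // querying the state at EOS can then report the stream as malformed.
    let decoder = Decoder {
        state: DecodingState::AwaitingStreamInfo,
    };
    assert_eq!(decoder.decoding_state(), DecodingState::AwaitingStreamInfo);
}
```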

decode() should process a single unit or be able to partially process input

The current implementation of decode expects that the input buffer will contain one or more frames worth of data. This introduces constraints to clients, who cannot e.g. send a single slice of a multi-slice H.264 frame, and to the decoder which should ideally check that there are enough resources for processing the whole input, but currently cannot guarantee that if there are multiple frames.

We could change the signature of decode to something like this:

fn decode(&mut self, timestamp: u64, mut bitstream: &[u8]) -> Result<usize, DecodeError> {

In this new implementation, decode only processes a single unit of data. For H.26x this would be a single NAL unit, for VP8/VP9 a single frame.

The return value is the number of bytes processed in bitstream. If it is less than bitstream.len(), then the caller should call decode again on &bitstream[return_value..] until the whole bitstream is processed.

This has the advantage of removing a loop in every implementation of decode, and will allow decode to detect when it doesn't have enough free output buffers to perform the operation and bail out in that case.

On the other hand, this means that H.264 (and probably H.265) will have to keep the current picture in its state again, and also that it won't be able to detect its end until the first buffer of the next picture is queued or flush is called.
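The calling convention proposed above could be exercised as follows. The unit framing here (a one-byte length prefix) is purely illustrative; real units would be NAL units or VP8/VP9 frames.

```rust
// Illustrative decode(): consumes exactly one unit (here, a length-prefixed
// chunk) and returns how many bytes it used.
fn decode(_timestamp: u64, bitstream: &[u8]) -> Result<usize, String> {
    let unit_len = usize::from(
        *bitstream.first().ok_or_else(|| "empty input".to_string())?,
    );
    Ok(1 + unit_len) // bytes consumed: prefix + payload
}

fn main() {
    // Two "units": a 2-byte payload followed by a 1-byte payload.
    let bitstream = [2u8, 0xaa, 0xbb, 1u8, 0xcc];
    let mut offset = 0;
    let mut units = 0;
    // The caller loops until the whole bitstream is processed.
    while offset < bitstream.len() {
        offset += decode(0, &bitstream[offset..]).unwrap();
        units += 1;
    }
    assert_eq!(offset, bitstream.len());
    assert_eq!(units, 2);
}
```

This moves the loop out of every decode implementation and into the caller, which is exactly what allows decode to bail out cleanly when output buffers run short.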

Flushing is fishy

When flushing, we currently clear all the reference frames, e.g. for VP9:

    fn flush(&mut self) {
        // Note: all the submitted frames are already in the ready queue.
        self.reference_frames = Default::default();
        self.decoding_state = DecodingState::AwaitingStreamInfo;
    }

However, the parser still contains some state related to the reference frames, and other preserved state should also probably be invalid since we are in AwaitingStreamInfo again. Going to this state also means that we will emit a DRC event even if the resolution has not actually changed.

We should probably do either more or less, but the current amount of state clearing does not seem very consistent.

decoder/stateless: `parser` should be moved outside of the decoder state struct

We often perform mutable calls to the state while holding a non-mutable reference to the parser (e.g. to pass SPS information). This is currently not allowed because the parser is part of the state.

Every decoder has both a parser and a state though, so if we move the parser out of the state structure and make it another type parameter of StatelessCodec, we should be able to perform these calls and save a few redundant calls to fetch the same data.

h264: SPS update race

The end of the current picture cannot be detected before the next one starts, but before this we may have parsed a new SPS through peek_sps. If this happens, we will finish the current picture with the new SPS, which can cause inconsistencies, like the DPB not being bumped because the new max number of reference frames has increased.

In order to fix that, the current picture should keep a reference to its PPS from its creation time, and each PPS should have a reference to their SPS. Putting SPS and PPS into Rc sounds like a good way to achieve this.
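The ownership chain proposed above can be sketched with Rc, using heavily simplified Sps/Pps/Picture types; the point is that a picture resolves parameters through its own references rather than through the parser's latest state.

```rust
use std::rc::Rc;

// Simplified parameter sets: each PPS holds an Rc to its SPS, and each
// picture holds an Rc to its PPS, fixing them at creation time.
struct Sps {
    max_num_ref_frames: u32,
}

struct Pps {
    sps: Rc<Sps>,
}

struct Picture {
    pps: Rc<Pps>,
}

fn main() {
    let old_sps = Rc::new(Sps { max_num_ref_frames: 4 });
    let pps = Rc::new(Pps { sps: Rc::clone(&old_sps) });
    let picture = Picture { pps: Rc::clone(&pps) };

    // A new SPS is parsed (e.g. via peek_sps) before the picture finishes...
    let _new_sps = Rc::new(Sps { max_num_ref_frames: 8 });

    // ...but the picture still resolves to the SPS it was created under, so
    // DPB bumping uses the correct max number of reference frames.
    assert_eq!(picture.pps.sps.max_num_ref_frames, 4);
}
```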

We must block on all pending work during flush()

This appears to be a bug in our implementation. If we flush, we must

a) submit any leftover work
b) block on said work before returning

Notice that after we return from flush, a lot of our state is (or will be) invalid, so we must complete the work before returning.

For h.264, for example, the current code will not check cur_pic_opt, nor will it block on it. This means that if (for whatever reason) cur_pic_opt is Some, that picture will be lost.

h264: we should check first_mb_in_slice when decoding

From the specification:

first_mb_in_slice specifies the address of the first macroblock in the slice

If first_mb_in_slice == 0 this means that we have identified that the slice belongs to a new picture.

This should be one of the conditions checked here

This is not a problem when testing with fluster because we check for this when using H264FrameIterator, but clients are free to submit data as they see fit.

We should run Fluster through some CI

Apparently, one of my minor changes broke all of the h.264 tests. It took a while before it was fixed by 922d678.

We should have some automation in place somehow to run Fluster on every commit.

@Gnurou I do not know much about CI systems in general, but I can do the work if you give some initial directions.

Decoder interface needs an `InvalidInput` error

Right now we don't have a distinctive error to signal that the decoder could not find any meaningful data in the submitted input. While this is an error, most clients will want to keep submitting input until something catches on, so we should have a dedicated error type to allow proper matching.
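A dedicated variant that clients can match on could look like this sketch (using std only; the DecodeError name and its other variant are hypothetical):

```rust
use std::fmt;

// Hypothetical decoder error type with a distinct InvalidInput variant.
#[derive(Debug)]
enum DecodeError {
    // No meaningful data was found in the submitted input.
    InvalidInput,
    // A real backend failure.
    Backend(String),
}

impl fmt::Display for DecodeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DecodeError::InvalidInput => write!(f, "no meaningful input found"),
            DecodeError::Backend(e) => write!(f, "backend error: {e}"),
        }
    }
}

impl std::error::Error for DecodeError {}

// A client keeps submitting input on InvalidInput, but stops on real errors.
fn keep_feeding(result: Result<(), DecodeError>) -> bool {
    !matches!(result, Err(DecodeError::Backend(_)))
}

fn main() {
    assert!(keep_feeding(Err(DecodeError::InvalidInput)));
    assert!(!keep_feeding(Err(DecodeError::Backend("oops".into()))));
}
```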

vp90-2-22-svc_1280x720_3.ivf decodes almost correctly

This is an SVC stream, so there are a few resolution changes at a few points. Looking at the YUV, the output is mostly right, but sometimes both resolutions can be seen, with the lower resolution overlaying the larger one at the top corner.

SVC is quite important in video-conferencing and there's only two test vectors to test this.

Note: This also does not pass on GStreamer.

Largest layer is 720p

ccdec.tar.gz

encoder: Enable zero copy

We can expand the CodedBitstreamBuffer to accept another generic from the backend, a segment, which would require AsRef<[u8]>. In the case of VAAPI it would be mapped to libva::MappedCodedSegment.

vaapi: vaDeriveImage should be used for image creation?

It is desirable to use vaDeriveImage in order to access decoded frames using the CPU, but I have seen the following strange behavior when enabling it:

  • On Intel, all VAAPI decoding tests are passing,
  • H.264 tests also pass on AMD,
  • VP9 tests are failing on AMD, sometimes with a SIGSEGV as we try to access the mapped surface.

This may be a bug in Mesa, but we would need to investigate this further. Note that once we start importing buffers this problem will be mitigated since we can map the external buffers ourselves; still, it is nice to have this fixed as the self-allocated path is slower than it should be.

[Fluster] MR1_BT_A, MR2_TANDBERG_E, MR6_BT_B, MR7_BT_B, MR8_BT_B, MR9_BT_B failing on Intel only

Strange case of tests failing to pass only on Intel platforms:

  • MR1_BT_A, MR2_TANDBERG_E, MR8_BT_B show macroblock corruption after a few frames.
  • MR6_BT_B, MR7_BT_B, MR9_BT_B don't have visible corruption, but rather very small alterations that require a hex editor to see.

Note that on MR2_TANDBERG_E, MR6_BT_B, MR7_BT_B and MR8_BT_B the FFmpeg-H.264-VAAPI decoder seems to fall back to software decoding for some reason. But other files seem to be successfully decoded by FFmpeg on Intel (and all pass with ccdec on AMD).

Feature: device discovery

Right now using cros-codecs requires to know which decoding hardware is available and how to access it. More generic software however would benefit from the ability to discover the accelerated decoding capabilities of the system.

The idea would be to get a list of devices, which in turn can be queried about their capabilities (supported codecs, limits, etc), and finally instantiated into a decoder or encoder instance.

This may require the use of trait objects to abstract the backend used, or maybe we can devise a backend-aware interface that would be a bit more work for the client...

Possible typo in AV1 synthesizer

The AV1 synthesizer's decoder_model_info method outputs the following:

        self.f(32, dm.buffer_removal_time_length_minus_1)?;
        self.f(32, dm.frame_presentation_time_length_minus_1)?;

However, from the AV1 parser, both buffer_removal_time_length_minus_1 and frame_presentation_time_length_minus_1 are read from 5 bits. Could this be a typo?

Remove unwraps and other runtime panics that result from invalid input

An invalid stream should not make the program panic. This issue is to track the removal of all sources of panic from the code:

  • unwraps as a result of invalid input,
  • Explicit calls to panic and expect,
  • Potentially out-of-bounds accesses in vectors or arrays (replace them with get).
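The last point can be illustrated with a small sketch: replacing indexing with get turns a potential panic on invalid input into a recoverable error. The function name and error type are illustrative.

```rust
// Reading a reference index from parsed data. With `get`, an out-of-range
// index from an invalid stream yields an error instead of a panic.
fn read_ref_idx(indices: &[u8], i: usize) -> Result<u8, String> {
    indices
        .get(i)
        .copied()
        .ok_or_else(|| format!("reference index {i} out of bounds"))
}

fn main() {
    let indices = [3u8, 7];
    assert_eq!(read_ref_idx(&indices, 1), Ok(7));
    // An invalid stream now produces a specific error, not a panic.
    assert!(read_ref_idx(&indices, 5).is_err());
}
```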
