When using spec.check_trailing_bits = false; with the

check_trailing_bits still errors at certain lengths for base32 (data-encoding issue, closed)

Comments (3)

ia0 commented on July 20, 2024

Thank you @robyoder for the report.

This is actually expected behavior. The check_trailing_bits option configures how the trailing bits of the last byte are handled (for inputs of valid length), not trailing bytes. There is currently no convenience option to accept inputs of invalid length. It might be useful to add one, so please open an issue if you would benefit from such a feature.
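To make the distinction concrete, here is a minimal sketch for unpadded RFC 4648 base32 (5 bits per symbol), written without the library. The names `value`, `length_ok`, and `trailing_bits_zero` are hypothetical and are not part of data-encoding. The sketch only illustrates that the length check depends on the input length alone, while the trailing-bits check has to read the last symbol.

```rust
/// Value of a symbol in the RFC 4648 base32 alphabet, or `None` if it is not in the alphabet.
/// (Hypothetical helper for this sketch, not a data-encoding function.)
fn value(sym: u8) -> Option<u8> {
    match sym {
        b'A'..=b'Z' => Some(sym - b'A'),
        b'2'..=b'7' => Some(sym - b'2' + 26),
        _ => None,
    }
}

/// Length check for unpadded base32: `len` symbols carry `5 * len` bits.
/// The length is valid when fewer than 5 bits are left over after the last
/// full byte, i.e. when the last symbol contributes to at least one byte.
fn length_ok(len: usize) -> bool {
    (5 * len) % 8 < 5
}

/// Trailing-bits check: are the unused low bits of the last symbol all zero?
/// Assumes `length_ok(input.len())` already holds.
/// Returns `None` if the last symbol is not in the alphabet.
fn trailing_bits_zero(input: &[u8]) -> Option<bool> {
    let unused = (5 * input.len()) % 8;
    let last = match input.last() {
        Some(&sym) => value(sym)?,
        None => return Some(true), // empty input has no trailing bits
    };
    let mask = (1u8 << unused) - 1;
    Some((last & mask) == 0)
}
```

For example, `MY` has a valid length and zero trailing bits, `MZ` has a valid length but non-zero trailing bits (the case that check_trailing_bits controls), and any 3-symbol input fails the length check regardless of its content.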

However, inputs of invalid length can still be decoded by using the position of the Length error returned by decode (see documentation). This has very little overhead and can be wrapped in a convenience truncate_decode function:

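use data_encoding::{DecodeError, DecodeKind, Encoding};

// Decodes `input`, ignoring any trailing characters that make its length invalid.
fn truncate_decode(encoding: &Encoding, input: &[u8]) -> Result<Vec<u8>, DecodeError> {
    match encoding.decode(input) {
        // On a length error, retry on the prefix that ends at the error position.
        Err(DecodeError {
            position,
            kind: DecodeKind::Length,
        }) => encoding.decode(&input[..position]),
        // Success, or any other kind of error, is returned unchanged.
        output => output,
    }
}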
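As a rough illustration of where the truncation lands, the sketch below computes the largest valid prefix length for unpadded base32 without using the library. It assumes that the position reported with a Length error is the greatest valid input length. That is an assumption about the library's documented behavior, not something stated in this thread, and `largest_valid_len` is a hypothetical name.

```rust
/// Largest length `n <= len` that is a valid unpadded base32 input length.
/// A length is valid when `(5 * n) % 8 < 5`, i.e. the last symbol still
/// contributes at least one bit to a full output byte.
/// (Hypothetical helper, not part of data-encoding.)
fn largest_valid_len(len: usize) -> usize {
    let mut n = len;
    // Zero is always valid, so this loop terminates before underflow.
    while (5 * n) % 8 >= 5 {
        n -= 1;
    }
    n
}
```

Under that assumption, a 3-symbol input would be truncated to 2 symbols and a 6-symbol input to 5, so at most one trailing symbol is dropped.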

robyoder commented on July 20, 2024

Sure, we can work around it and we will, but probably without running the decode function twice.

My question is this: what is the benefit of having check_trailing_bits for only the cases it currently covers? If some algorithm is generating base32 (or some other base) with trailing bits that are not zeros, isn't it plausible that it would also be generating base32 with trailing characters?

One such algorithm would be a naive secret generator that selects characters at random from the base32 alphabet. The final character in the string may represent 1-4 trailing bits, or a full 5 extra bits. I'm curious what kinds of algorithms check_trailing_bits is designed to accommodate if not one like this.

ia0 commented on July 20, 2024

The decode function does not run twice. It is called twice, but on an input of invalid length the first call returns immediately. As I said, truncate_decode has very low overhead (a few nanoseconds) because the length check is done before everything else, without reading the content of the input at all. This is actually one of the reasons why the length check is separate from the trailing-bits check: one doesn't look at the data while the other does.

The customization provided by the library is meant to configure the implementation, not to adapt to use-cases. The number of possible use-cases is unbounded, while the number of decisions taken in the implementation is finite. So the right design is the one the library currently follows: provide fine-grained control over the implementation. It is the job of users to define and use the specification adapted to their use-case.

The current configuration options were added to make it possible to replicate the behavior of other implementations (like the GNU base64 and base32 programs) for compatibility reasons. I wasn't aware that some implementations accept invalid lengths and that some users rely on that. I'll add this as a configuration option (see #29).

The algorithm you describe doesn't need this configuration to be implemented efficiently: encode_len, which returns the number of base32 characters needed to encode a given number of bytes, never returns an invalid length. Something like the following would work fine:

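use data_encoding::Encoding;
// Builds a string of exactly `encode_len(len)` symbols, then decodes it to `len` bytes.
fn generate(base: &Encoding, len: usize) -> Vec<u8> {
    // Placeholder content: fill with random alphabet symbols in real use.
    let input = vec![b'F'; base.encode_len(len)];
    // `unwrap` assumes `base` accepts non-zero trailing bits (check_trailing_bits = false).
    base.decode(&input).unwrap()
}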
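The claim about lengths can be checked with plain arithmetic. The sketch below assumes unpadded base32, where encoding `len` bytes takes the ceiling of `8 * len / 5` symbols. `encode_len_nopad` and `is_valid_len` are hypothetical stand-ins for this illustration, not the library's implementation of encode_len.

```rust
/// Number of unpadded base32 symbols needed for `len` bytes: ceil(8 * len / 5).
/// (Hypothetical stand-in for illustration only.)
fn encode_len_nopad(len: usize) -> usize {
    (8 * len + 4) / 5
}

/// Valid unpadded base32 input length: fewer than 5 bits are left over
/// after the last full byte.
fn is_valid_len(n: usize) -> bool {
    (5 * n) % 8 < 5
}

/// Number of bytes that `n` unpadded base32 symbols decode to.
fn decode_len_nopad(n: usize) -> usize {
    5 * n / 8
}
```

For 1 to 5 bytes this gives 2, 4, 5, 7, and 8 symbols, which are all valid lengths, so a generator that produces `encode_len_nopad(len)` random symbols never produces a length of 1, 3, or 6 modulo 8.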
