Comments (3)
Thank you @robyoder for the report.
This is actually expected behavior. The check_trailing_bits option configures how the trailing bits of the last byte are handled (for inputs of valid length); it does not affect trailing bytes. There is currently no convenience configuration to accept inputs of invalid length. It might be useful to add one, so please open an issue if you would benefit from such a feature.
However, there is a way to decode inputs of invalid length by using the position field of the Length error returned by decode (see the documentation). This has very small overhead and can be wrapped in a convenience truncate_decode function:
fn truncate_decode(encoding: &Encoding, input: &[u8]) -> Result<Vec<u8>, DecodeError> {
    match encoding.decode(input) {
        Err(DecodeError {
            position,
            kind: DecodeKind::Length,
        }) => encoding.decode(&input[..position]),
        output => output,
    }
}
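For intuition about what position reports here, the following self-contained sketch (a hypothetical helper, not part of data-encoding) computes the longest valid unpadded base32 prefix length, assuming canonical lengths are exactly those with length mod 8 in {0, 2, 4, 5, 7}:

```rust
// Hypothetical stand-in: longest prefix length of an unpadded base32
// input that is itself a valid encoding. Within each 8-character group,
// only 0, 2, 4, 5, or 7 characters decode to a whole number of bytes,
// so at most one trailing character ever needs to be dropped.
fn longest_valid_prefix(len: usize) -> usize {
    match len % 8 {
        1 | 3 | 6 => len - 1,
        _ => len,
    }
}

fn main() {
    assert_eq!(longest_valid_prefix(3), 2); // one stray character dropped
    assert_eq!(longest_valid_prefix(6), 5);
    assert_eq!(longest_valid_prefix(8), 8); // already valid
    println!("ok");
}
```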
from data-encoding.
Sure, we can work around it and we will, but probably without running the decode function twice.
My question is this: what is the benefit of having check_trailing_bits for only the cases it currently covers? If some algorithm is generating base32 (or some other base) with trailing bits that are not zeros, isn't it plausible that it would also generate base32 with trailing characters?
One such algorithm would be a naive secret generator that selects characters at random from the base32 alphabet. The final character in the string may represent 1-4 trailing bits, or a full 5 extra bits. I'm curious what kinds of algorithms check_trailing_bits is designed to accommodate if not one like this.
You're not running the decode function twice; you're calling it twice on invalid input, and one of those calls returns immediately. As I said, truncate_decode has very low overhead (a few nanoseconds) because the length check is done before everything else: the content of the input is not read at all. This is actually one of the reasons why checking the length is a separate check from checking trailing bits; one doesn't look at the data, while the other does.
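As a sketch of why that length check is so cheap: for an unpadded base32 encoding it reduces to a single modulo on the input length, never touching the bytes themselves. This is an illustrative model, not data-encoding's actual code:

```rust
// Illustrative model of a length-only validity check for unpadded
// base32: each character carries 5 bits, so within an 8-character group
// only 0, 2, 4, 5, or 7 characters decode to a whole number of bytes.
// The input contents are never inspected.
fn length_is_valid(len: usize) -> bool {
    !matches!(len % 8, 1 | 3 | 6)
}

fn main() {
    assert!(length_is_valid(0));
    assert!(length_is_valid(2));
    assert!(!length_is_valid(3)); // 15 bits: 1 byte plus 7 stray bits
    assert!(length_is_valid(16));
    println!("ok");
}
```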
The customization provided by the library is meant to configure the implementation, not to adapt to use cases. This is because the number of possible use cases is unbounded, while the number of decisions taken in the implementation is finite. So the right design, and the one currently taken by the library, is to provide fine-grained control over the implementation. It is the user's job to define and use the specification adapted to their use case.
The current configuration options were added to replicate the behavior of other implementations (like the GNU base64 and base32 programs) for compatibility reasons. I wasn't aware that some implementations accept invalid lengths and that some users rely on that. I'll add this as a configuration option (see #29).
The algorithm you describe doesn't need this configuration to be implemented efficiently, because encode_len, which returns the number of base32 characters needed to encode a given number of bytes, never returns an invalid length. Something like the following would work fine:
fn generate(base: &Encoding, len: usize) -> Vec<u8> {
    let input = vec![b'F'; base.encode_len(len)]; // Randomize content instead of using b'F'.
    base.decode(&input).unwrap()
}
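For reference, here is why encode_len never yields an invalid length: for an unpadded base32 encoding it is simply the ceiling of 8n/5 symbols for n bytes, so every output length is canonical by construction. This is a sketch under that assumption, not data-encoding's internals:

```rust
// Sketch: number of symbols needed to encode n bytes in unpadded
// base32. Each symbol carries 5 bits, so we need ceil(8 * n / 5).
fn base32_encode_len(n: usize) -> usize {
    (8 * n + 4) / 5
}

fn main() {
    assert_eq!(base32_encode_len(1), 2); // 8 bits  -> 2 symbols
    assert_eq!(base32_encode_len(4), 7); // 32 bits -> 7 symbols
    assert_eq!(base32_encode_len(5), 8); // 40 bits -> 8 symbols
    // Every result mod 8 lands in the canonical set {0, 2, 4, 5, 7}.
    for n in 0..100 {
        assert!(matches!(base32_encode_len(n) % 8, 0 | 2 | 4 | 5 | 7));
    }
    println!("ok");
}
```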