Comments (2)
I assumed I had solved this issue a few months ago:
libprotobuf-mutator/src/text_format.cc
Line 31 in dd89da9
Is it possible that just 100 breaks your binary?
from libprotobuf-mutator.
Hi @vitalybuka,
Apologies for bumping a 5-year-old thread; however, I believe I've encountered an issue like this one. When using LPM integrated with libFuzzer, it is possible for LPM to generate a message nested more deeply than the text or binary parsers allow. This is problematic in the following situation:
- A `.proto` defines a data structure capable of recursively nested messages. For example, this stripped-down `.proto` for a subset of JSON:

      syntax = "proto2";

      package data_structure;

      message Input {
        required Element element = 1;
      }

      message Value {
        oneof value {
          Array array = 1;
          /* other value types supported by JSON would live here */
        }
      }

      message Element {
        required Value value = 1;
      }

      message Array {
        repeated Element elements = 1;
      }

- A fuzzer is created using the above structure with LPM.
- After enough iterations, LPM begins to generate deeply nested messages (beyond 100+ levels deep).
- Suppose one of these deeply nested messages signals new coverage to the fuzzing engine.
- The fuzzer serializes then saves out the deeply nested message to the corpus.
- The user restarts the fuzzer.
- Corpus inputs are executed; however, the following error appears since one of those messages is beyond the parser's limit:
Error parsing text-format data_structure.Input: 131:207: Message is too deep, the parser exceeded the configured recursion limit of 100.
- Coverage is no longer the same after reading all inputs from the corpus since the above input failed to parse and therefore did not execute as per:
libprotobuf-mutator/src/libfuzzer/libfuzzer_macro.h
Lines 72 to 77 in 1f95f80
In this situation, one may increase the recursion limit via `SetRecursionLimit`; however, even deeper messages can still be generated and saved to disk, especially via crossover. Additionally, the fuzzer and possibly the function under test may end up spending more time deserializing, processing, and mutating deeply nested messages, time which could be better spent mutating fields at shallower levels.
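To find which saved corpus entries would trip the parser, one could estimate nesting depth directly from the text-format payload. The following is a minimal sketch, not part of LPM or protobuf: `EstimateTextFormatDepth` is a hypothetical helper, and it assumes that counting unquoted `{` and `<` openers is a close enough proxy for the parser's own recursion counter.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper (not an LPM or protobuf API): returns the deepest level
// of unquoted '{' / '<' nesting found in a text-format payload.
// Assumption: brace nesting approximates the parser's recursion depth.
std::size_t EstimateTextFormatDepth(const std::string& text) {
  std::size_t depth = 0;
  std::size_t deepest = 0;
  char quote = '\0';  // active string delimiter, or '\0' when outside a string
  for (std::size_t i = 0; i < text.size(); ++i) {
    const char c = text[i];
    if (quote != '\0') {
      if (c == '\\') {
        ++i;  // skip the escaped character inside a string literal
      } else if (c == quote) {
        quote = '\0';  // string literal ends
      }
    } else if (c == '"' || c == '\'') {
      quote = c;  // string literal begins
    } else if (c == '#') {
      // Comment: ignore everything up to the end of the line.
      while (i < text.size() && text[i] != '\n') ++i;
    } else if (c == '{' || c == '<') {
      deepest = std::max(deepest, ++depth);
    } else if ((c == '}' || c == '>') && depth > 0) {
      --depth;
    }
  }
  return deepest;
}
```

Entries whose estimate exceeds the configured limit could then be minimized or set aside before the fuzzer is restarted, rather than silently failing to parse.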
With that said, is there any way to enforce a depth limit at the mutation level that I may have overlooked? In other words, during LPM mutation, if it detects that an add/clone/copy will put the mutated message over some user-provided maximum depth, can that be prevented in favor of shallower field mutation? I believe this would:
- Prevent any chance of creating messages that are too deep to deserialize, thus preserving coverage between fuzzer restarts
- Prevent the need to periodically increase the parser's recursion limit
- Possibly improve overall exec/s by limiting recursion depth during deserialization, fuzz target processing, and mutation
If there isn't already a way to enforce this, I'm curious what you'd think would be required to implement such a feature. I would be interested in providing a PR to add such functionality.
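For illustration, the kind of check described above could look roughly like the following. This is a sketch under stated assumptions: `Node`, `Depth`, `Clone`, and `TryAddChild` are hypothetical stand-ins built on plain C++ structs, not LPM's mutator API or generated protobuf classes.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the recursive Element/Value/Array messages.
struct Node {
  std::vector<std::unique_ptr<Node>> children;  // models Array.elements
};

// Depth of a subtree; a node without children has depth 1.
std::size_t Depth(const Node& node) {
  std::size_t deepest = 0;
  for (const auto& child : node.children) {
    deepest = std::max(deepest, Depth(*child));
  }
  return deepest + 1;
}

// Deep copy of a subtree.
std::unique_ptr<Node> Clone(const Node& node) {
  auto copy = std::make_unique<Node>();
  for (const auto& child : node.children) {
    copy->children.push_back(Clone(*child));
  }
  return copy;
}

// Appends a copy of `donor` under `parent` only if the resulting tree stays
// within `max_depth`. `parent_depth` is the depth at which `parent` sits,
// with the root at depth 1. Returns false when the add is refused, so the
// caller can fall back to a shallower mutation.
bool TryAddChild(Node& parent, std::size_t parent_depth, const Node& donor,
                 std::size_t max_depth) {
  if (parent_depth + Depth(donor) > max_depth) {
    return false;
  }
  parent.children.push_back(Clone(donor));
  return true;
}
```

With `max_depth` set at or below the parser's recursion limit, every message produced through such a guard would remain parseable after a restart. How a real implementation would track `parent_depth` during LPM's field sampling is left open here.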
I've attached a simple example which demonstrates the scenario above.
issue-143-supplement.tar.gz
Thank you.