Hi, It seems like one of our fuzzers is failing because of a stack-o

I assumed I solved the issue few month ago <div class="Box Box--condensed my-2

Hi <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="

Recursive depth on google.protobuf.Structs about libprotobuf-mutator HOT 2 OPEN

google commented on July 28, 2024 2

Recursive depth on google.protobuf.Structs

from libprotobuf-mutator.

Comments (2)

vitalybuka commented on July 28, 2024

I assumed I solved the issue few month ago

libprotobuf-mutator/src/text_format.cc

Line 31 in dd89da9

parser.SetRecursionLimit(100);

Is possible that just 100 breaks your binary?

from libprotobuf-mutator.

dgg5503 commented on July 28, 2024

Hi @vitalybuka,

Apologies for bumping a 5-year-old thread, however, I believe I've encountered an issue like this one. When using LPM integrated with libFuzzer, it is possible for LPM to generate a deeply nested message beyond the limits of the text or binary parsers. This is problematic in the following situation:

A .proto defines a data structure capable of recursively nested messages. For example, this stripped down .proto for a subset of JSON:

syntax = "proto2";
package data_structure;

message Input {
  required Element element = 1;
}

message Value {
  oneof value {
    Array array = 1;
    /* other value types supported by JSON would live here */
  }
}

message Element {
  required Value value = 1;
}

message Array {
  repeated Element elements = 1;
}

A fuzzer is created using the above structure with LPM.
After enough iterations, LPM begins to generate deeply nested messages (beyond 100+ levels deep).
Suppose one of these deeply nested messages signals new coverage to the fuzzing engine.
The fuzzer serializes then saves out the deeply nested message to the corpus.
The user restarts the fuzzer.

Corpus inputs are executed; however, the following error appears since one of those messages are beyond the parser's limit:

Error parsing text-format data_structure.Input: 131:207: Message is too deep, the parser exceeded the configured recursion limit of 100.

Coverage is no longer the same after reading all inputs from the corpus since the above input failed to parse and therefore did not execute as per:

libprotobuf-mutator/src/libfuzzer/libfuzzer_macro.h

Lines 72 to 77 in 1f95f80

 #define DEFINE_TEST_ONE_PROTO_INPUT_IMPL(use_binary, Proto) \ 

 extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) { \ 

 using protobuf_mutator::libfuzzer::LoadProtoInput; \ 

 Proto input; \ 

 if (LoadProtoInput(use_binary, data, size, &input)) \ 

 TestOneProtoInput(input); \

In this situation, one may increase the recursion limit via SetRecursionLimit, however, even deeper messages are still capable of being generated and saved to disk, especially via crossover. Additionally, the fuzzer and possibly the function under test may end up spending more time deserializing, processing, & mutating deeply nested messages which could be better spent mutating fields at shallower levels.

With that said, is there any way to enforce a depth limit at the mutation level that I may have overlooked? In other words, during LPM mutation, if it detects that an add/clone/copy will put the mutated message over some user-provided maximum depth, can that be prevented in favor of shallower field mutation? I believe this would:

Prevent any chance of creating messages that are too deep to deserialize thus preserving coverage between fuzzer restarts
Prevent the need to periodically increase the parser's recursion limit
Possibly improve overall exec/s by limiting recursion depth during deserialization, fuzz target processing, and mutation

If there isn't already a way to enforce this, I'm curious what you'd think would be required to implement such a feature. I would be interested in providing a PR to add such functionality.

I've attached a simple example which demonstrates the scenario above.
issue-143-supplement.tar.gz

Thank you.

from libprotobuf-mutator.

Recursive depth on google.protobuf.Structs about libprotobuf-mutator HOT 2 OPEN

Comments (2)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent

	#define DEFINE_TEST_ONE_PROTO_INPUT_IMPL(use_binary, Proto) \
	extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) { \
	using protobuf_mutator::libfuzzer::LoadProtoInput; \
	Proto input; \
	if (LoadProtoInput(use_binary, data, size, &input)) \
	TestOneProtoInput(input); \