sca-fuzzer's Introduction

Revizor

Revizor is a security-oriented fuzzer for detecting information leaks in CPUs, such as Spectre and Meltdown. It tests CPUs against Leakage Contracts and searches for unexpected leaks.

For more details, see our Paper (open access here), and the follow-up paper.

Installation

Warning: Keep in mind that Revizor runs randomly generated code in kernel space. As you can imagine, things could go wrong. Make sure you're not running Revizor on an important machine.

1. Check Requirements

  • Architecture: Revizor supports Intel and AMD x86-64 CPUs. Experimental support for ARM CPUs is available in the arm-port branch, but it is at a very early stage; use it at your own risk.

  • No virtualization: You will need a bare-metal OS installation. Testing from inside a VM is not (yet) supported.

  • OS: The target machine has to be running Linux v4.15 or later.
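
The architecture and OS requirements above can be checked up front. Below is a minimal Python sketch, assuming the kernel release string has the usual major.minor form returned by platform.release(). The helper names kernel_at_least and check_requirements are illustrative and not part of Revizor, and the sketch does not try to detect virtualization.

```python
import platform
import re


def kernel_at_least(release: str, minimum: tuple = (4, 15)) -> bool:
    """Return True if a release string such as '5.15.0-91-generic' is >= minimum."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum


def check_requirements() -> list:
    """Collect human-readable problems; an empty list means the checks passed."""
    problems = []
    if platform.system() != "Linux":
        problems.append("the target machine has to be running Linux")
    elif not kernel_at_least(platform.release()):
        problems.append("Linux v4.15 or later is required")
    if platform.machine() not in ("x86_64", "AMD64"):
        problems.append("x86-64 is the supported architecture (ARM is experimental)")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("WARNING:", problem)
```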

2. Install Revizor Python Package

If you use pip, you can install Revizor with:

pip install revizor-fuzzer

Alternatively, install Revizor from sources:

# run from the project root directory
make install

If the installation fails with 'revizor-fuzzer' requires a different Python:, you'll have to install Python 3.9 and run Revizor from a virtual environment:

sudo apt install python3.9 python3.9-venv
/usr/bin/python3.9 -m pip install virtualenv
/usr/bin/python3.9 -m virtualenv ~/venv-revizor
source ~/venv-revizor/bin/activate
pip install revizor-fuzzer

3. Install Revizor Executor (kernel module)

Then build and install the kernel module:

# building a kernel module requires kernel headers
sudo apt-get install linux-headers-$(uname -r)

# get the source code
git clone https://github.com/microsoft/sca-fuzzer.git

# build the executor
cd sca-fuzzer/src/x86/executor
make uninstall  # the command will give an error message, but it's ok!
make clean
make
make install

4. Download ISA spec

rvzr download_spec -a x86-64 --extensions BASE SSE SSE2 CLFLUSHOPT CLFSH --outfile base.json

5. (Optional) System Configuration

For more stable results, disable hyperthreading (there's usually a BIOS option for it). If you do not disable hyperthreading, you will see a warning every time you invoke Revizor; you can ignore it.

Optionally (and it really is optional), you can boot the kernel on a single core by adding maxcpus=1 to the boot parameters (how to add a boot parameter).
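
To see whether hyperthreading (SMT) is currently active before a run, one option is to read the kernel's SMT status file. The following is a minimal sketch, assuming a kernel that exposes /sys/devices/system/cpu/smt/active; where the file is missing, the function returns None. The helper smt_active is hypothetical and not a Revizor API.

```python
from pathlib import Path
from typing import Optional

# Assumption: this sysfs file is present on the target kernel.
SMT_STATUS_FILE = "/sys/devices/system/cpu/smt/active"


def smt_active(status_file: str = SMT_STATUS_FILE) -> Optional[bool]:
    """Return True/False for SMT on/off, or None if the status is unavailable."""
    try:
        content = Path(status_file).read_text().strip()
    except OSError:
        return None
    if content not in ("0", "1"):
        return None
    return content == "1"


if __name__ == "__main__":
    state = smt_active()
    if state is None:
        print("SMT status unknown")
    else:
        print("SMT is", "enabled" if state else "disabled")
```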

Command Line Interface

The fuzzer is controlled via a single command line interface rvzr (or revizor.py if you're running directly from the source directory).

It accepts the following arguments:

  • -s, --instruction-set PATH - path to the ISA description file
  • -c, --config PATH - path to the fuzzing configuration file
  • -n, --num-test-cases N - number of test cases to be tested
  • -i, --num-inputs N - number of input classes per test case. The number of actual inputs = input classes * inputs_per_class, which is a configuration option
  • -t, --testcase PATH - use an existing test case instead of generating random test cases
  • --timeout TIMEOUT - run fuzzing with a time limit [seconds]
  • -w - working directory where the detected violations will be stored

For example, this command

rvzr fuzz -s base.json -n 100 -i 10  -c config.yaml -w ./violations

will run the fuzzer for 100 iterations (i.e., 100 test cases), with 10 input classes per test case. The fuzzer will use the ISA spec stored in the base.json file, and will read the configuration from config.yaml. If the fuzzer finds a violation, it will be stored in the ./violations directory.
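
When launching the fuzzer from a script, the command can be assembled programmatically. The sketch below uses only the options listed above; build_fuzz_command and actual_inputs are illustrative helpers rather than part of Revizor, and the inputs_per_class value in the example is made up.

```python
import shlex


def build_fuzz_command(spec, config, num_test_cases, num_input_classes,
                       working_dir, timeout=None):
    """Assemble an `rvzr fuzz` argument list from the options described above."""
    cmd = [
        "rvzr", "fuzz",
        "-s", spec,
        "-c", config,
        "-n", str(num_test_cases),
        "-i", str(num_input_classes),
        "-w", working_dir,
    ]
    if timeout is not None:
        cmd += ["--timeout", str(timeout)]  # time limit in seconds
    return cmd


def actual_inputs(num_input_classes, inputs_per_class):
    """Inputs per test case = input classes * inputs_per_class (a config option)."""
    return num_input_classes * inputs_per_class


if __name__ == "__main__":
    cmd = build_fuzz_command("base.json", "config.yaml", 100, 10, "./violations")
    print(shlex.join(cmd))
    # To launch the fuzzer, pass `cmd` to subprocess.run(cmd, check=True).
    print(actual_inputs(10, 2))  # inputs_per_class=2 is an illustrative value
```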

See docs for more details.

How To Fuzz With Revizor

The fuzzing process is controlled by a configuration file in the YAML format, passed via the --config option. At the very minimum, this file should contain the following fields:

  • contract_observation_clause and contract_execution_clause describe the contract that the CPU-under-test is tested against. See this page for a list of available contracts. If you don't know what a contract is, Sec. 3 of this paper will give you a high-level introduction to contracts, and this paper will provide a deep dive into contracts.
  • instruction_categories is a list of instruction types that will be tested. Effectively, Revizor uses this list to filter out instructions from base.json (the file you downloaded via rvzr download_spec).

For a full list of configuration options, see docs.
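
A quick sanity check for the minimum fields named above could look like the sketch below. It operates on an already-parsed configuration (a plain dict); reading the YAML file itself would need a YAML parser such as PyYAML, which is deliberately left out. The helper missing_required_fields is hypothetical, and Revizor may well apply its own defaults for fields that are absent.

```python
REQUIRED_FIELDS = (
    "contract_observation_clause",
    "contract_execution_clause",
    "instruction_categories",
)


def missing_required_fields(config: dict) -> list:
    """Return the required top-level fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not config.get(name)]


if __name__ == "__main__":
    # This dict mirrors what a YAML parser would return for a config file
    # that lacks the instruction_categories field.
    config = {
        "contract_observation_clause": "loads+stores+pc",
        "contract_execution_clause": ["no_speculation"],
    }
    for name in missing_required_fields(config):
        print("missing field:", name)
```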

Baseline Experiment

After a fresh installation, it is normally a good idea to do a quick test run to check that everything works ok.

For example, we can create a configuration file config.yaml with only simple arithmetic instructions. As this instruction set does not include any instructions that would trigger speculation on Intel or AMD CPUs (at least that we know of), the expected contract would be CT-SEQ:

# config.yaml
instruction_categories:
  - BASE-BINARY  # arithmetic instructions
max_bb_per_function: 1  # no branches!
min_bb_per_function: 1

contract_observation_clause: loads+stores+pc  # aka CT
contract_execution_clause:
  - no_speculation  # aka SEQ

Start the fuzzer:

rvzr fuzz -s base.json -i 50 -n 100 -c config.yaml  -w .

This command should terminate with no violations.

Detection of a Simple Contract Violation

Next, we could intentionally make a mistake in a contract to check that Revizor can detect it. To this end, we can modify the config file from the previous example to include instructions that trigger speculation (e.g., conditional branches) but keep the contract the same:

# config.yaml
instruction_categories:
  - BASE-BINARY  # arithmetic instructions
  - BASE-COND_BR
max_bb_per_function: 5  # up to 5 branches per test case
min_bb_per_function: 1

contract_observation_clause: loads+stores+pc  # aka CT
contract_execution_clause:
  - no_speculation  # aka SEQ

Start the fuzzer:

rvzr fuzz -s base.json -i 50 -n 1000 -c config.yaml -w .

As your CPU-under-test almost definitely implements branch prediction, Revizor should detect a violation within a few minutes, with a message similar to this:

================================ Violations detected ==========================
  Contract trace (hash):

    0111010000011100111000001010010011110101110011110100000111010110
  Hardware traces:
   Inputs [907599882]:
    .....^......^......^...........................................^
   Inputs [2282448906]:
    ...................^.....^...................................^.^

You can find the violating test case as well as the violation report in the directory named ./violation-*/. It will contain an assembly file program.asm that surfaced a violation, a sequence of inputs input-*.bin to this program, and some details about the violation in report.txt.
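
The report above shows two inputs that share one contract trace but produce different hardware traces. The sketch below illustrates that comparison on made-up data: it groups measurements by contract trace and flags groups whose hardware traces disagree. It treats traces as opaque strings, and the names find_violations and differing_positions are illustrative; this is not Revizor's actual analyser.

```python
from collections import defaultdict


def find_violations(measurements):
    """Group inputs by contract trace and report groups whose hardware traces differ.

    `measurements` is an iterable of (input_id, contract_trace, hardware_trace).
    """
    by_contract = defaultdict(list)
    for input_id, contract_trace, hardware_trace in measurements:
        by_contract[contract_trace].append((input_id, hardware_trace))

    violations = []
    for contract_trace, group in by_contract.items():
        distinct_hw = {hw for _, hw in group}
        if len(distinct_hw) > 1:
            violations.append((contract_trace, group))
    return violations


def differing_positions(trace_a, trace_b):
    """Indices at which two equally long hardware traces disagree."""
    return [i for i, (a, b) in enumerate(zip(trace_a, trace_b)) if a != b]


if __name__ == "__main__":
    # Shortened, made-up traces in the same notation as the report above.
    measurements = [
        (1, "0110", "..^....^"),
        (2, "0110", "..^..^.^"),
        (3, "1001", "^......."),
    ]
    for contract_trace, group in find_violations(measurements):
        print("contract trace:", contract_trace)
        for input_id, hw in group:
            print("  input", input_id, hw)
```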

Full-Scale Fuzzing Campaign

To start a full-scale test, write your own configuration file (see description here and an example config here), and launch the fuzzer.

Below is an example launch command that starts a 24-hour fuzzing session with 100 input classes per test case, using the big-fuzz.yaml configuration:

rvzr fuzz -s base.json -c src/tests/big-fuzz.yaml -i 100 -n 100000000 --timeout 86400 -w `pwd` --nonstop
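
The --timeout value is given in seconds, so a 24-hour session corresponds to 86400. A small sketch for deriving the value from a duration (timeout_seconds is an illustrative helper, not part of Revizor):

```python
from datetime import timedelta


def timeout_seconds(duration: timedelta) -> int:
    """Convert a campaign duration into the integer seconds expected by --timeout."""
    return int(duration.total_seconds())


if __name__ == "__main__":
    print(timeout_seconds(timedelta(hours=24)))
```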

When you find a violation, you will have to do some manual investigation to understand the source of it; this guide is an example of how to do such an investigation.

Need Help with Revizor?

If you find a bug in Revizor, don't hesitate to open an issue.

If something is confusing or you need help in using Revizor, we have a discussion page.

Documentation

For more details, see the website.

Citing Revizor

To cite this project, you can use the following references:

  1. The original paper that introduced the concepts of Model-based Relation Testing, and which describes the main ideas behind Revizor.

    Oleksii Oleksenko, Christof Fetzer, Boris Köpf, Mark Silberstein. "Revizor: Testing Black-box CPUs against Speculation Contracts" in Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2022.

  2. The paper that introduced the idea of Leakage Contracts, as well as its theoretical foundations.

    Marco Guarnieri, Boris Köpf, Jan Reineke, and Pepe Vila. "Hardware-software contracts for secure speculation" in Proceedings of the 2021 IEEE Symposium on Security and Privacy (SP), 2021.

  3. A more accessible summary of the two papers above, in a journal format.

    Oleksii Oleksenko, Christof Fetzer, Boris Köpf, Mark Silberstein. "Revizor: Testing Black-box CPUs against Speculation Contracts". In IEEE Micro, 2023.

  4. The paper that introduced taint-based input generation, speculation filtering, and observation filtering:

    Oleksii Oleksenko, Marco Guarnieri, Boris Köpf, and Mark Silberstein. "Hide and Seek with Spectres: Efficient discovery of speculative information leaks with random testing" in Proceedings of the 2023 IEEE Symposium on Security and Privacy (SP), 2023.

Contributing

See CONTRIBUTING.md.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.

sca-fuzzer's People

Contributors

aidan5806, bkoepf, cwshugg, janahofmann, mguarnieri, microsoftopensource, oleksiioleksenko, van-ema

sca-fuzzer's Issues

Some Python 3.9 functions are used

There are several instances of Python 3.9 functions being used even though Python 3.7 is the minimum listed version. Supporting Python 3.7 would lower the barrier to entry. One example of a 3.9-only function in use is str.removeprefix:

name = name.removeprefix("{load} ")
name = name.removeprefix("{store} ")
name = name.removeprefix("{disp32} ")

msg = msg.removeprefix(asm_file + ":")

func_name = label.removeprefix(".function_")

operands_raw = line.removeprefix(name).split(",")

At present, each of these could be replaced with a call to an older set of functions, for example:

if line.startswith(name):
    operands_raw = line[len(name):].split(",")
else:
    operands_raw = line.split(",")
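
The same pattern can be wrapped in a small helper so that every call site stays a one-liner. This is a sketch of one possible backport; the name remove_prefix is illustrative and does not exist in the Revizor code base.

```python
def remove_prefix(text: str, prefix: str) -> str:
    """Backport of str.removeprefix (Python 3.9+) for older interpreters."""
    if prefix and text.startswith(prefix):
        return text[len(prefix):]
    return text


if __name__ == "__main__":
    label = ".function_main"
    print(remove_prefix(label, ".function_"))
```

With such a helper, a call like label.removeprefix(".function_") becomes remove_prefix(label, ".function_").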

postprocessor: automatic detection of speculation sources

One common step in analyzing a contract violation is to determine which of the instructions in a test case trigger speculation (let's call such instructions "speculation sources"). So far, this analysis has been done manually, as for example described in the Fuzzing Guide (Step 3 in Analyzing The Violation).

This issue is a proposal to automate the task, as follows: Revizor steps through the instructions in the test case and removes them one at a time. In each iteration, Revizor monitors the speculation filter (i.e., the number of issued and retired uops). If the test case with an instruction removed still triggers speculation, then this instruction is not a speculation source. Otherwise, if speculation disappears, Revizor adds a comment to the corresponding assembly line to mark this instruction as a speculation source, and (optionally) prints the instruction in the terminal.
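
The proposed loop could be sketched as follows. The callback triggers_speculation is a hypothetical stand-in for the speculation filter (in the real tool it would run the test case and compare issued and retired uops); the function names and the toy oracle in the example are assumptions made purely for illustration.

```python
from typing import Callable, List, Sequence


def find_speculation_sources(
    instructions: Sequence[str],
    triggers_speculation: Callable[[List[str]], bool],
) -> List[int]:
    """Return indices of instructions whose removal makes speculation disappear."""
    if not triggers_speculation(list(instructions)):
        return []  # the full test case does not speculate; nothing to attribute
    sources = []
    for index in range(len(instructions)):
        reduced = list(instructions[:index]) + list(instructions[index + 1:])
        if not triggers_speculation(reduced):
            sources.append(index)
    return sources


def annotate(instructions: Sequence[str], sources: List[int]) -> List[str]:
    """Append a comment to every line identified as a speculation source."""
    marked = set(sources)
    return [
        line + "  # speculation source" if i in marked else line
        for i, line in enumerate(instructions)
    ]


if __name__ == "__main__":
    program = ["ADD RAX, RBX", "JNZ .bb1", "MOV RCX, qword ptr [R14 + RAX]"]

    def toy_oracle(prog: List[str]) -> bool:
        # Toy assumption: only conditional jumps cause speculation.
        return any(line.startswith("J") for line in prog)

    for line in annotate(program, find_speculation_sources(program, toy_oracle)):
        print(line)
```

Note that removing one instruction at a time cannot attribute speculation that has several independent sources, since the remaining ones keep triggering the filter.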

rvzr minimize generates a fenced.asm file, which may cause confusion

When running rvzr minimize with or without --add-fences, a file named fenced.asm is generated. This file contains an LFENCE after every instruction.

This is confusing: when using --add-fences, you expect to get an assembly file with the maximum number of fences that still fails, so you may think that fenced.asm is that file rather than the file you named with the -o option of rvzr.

Model traces are printed only for nesting=1

When logging.dbg_model is set and model_max_nesting != 1, the model traces are still printed as if max_nesting = 1.

Cause: In service.py:trc_fuzzer_dump_traces the nesting level is hard-coded to 1. I also found the same bug in service.py:fuzzer_report_violations.

CC: @janahofmann

Hardcoded addresses

In templates.c in the executor folder, the following hardcoded addresses exist:
#define TEMPLATE_ENTER 0x0fff379000000000
#define TEMPLATE_INSERT_TC 0x0fff2f9000000000
#define TEMPLATE_RETURN 0x0fff279000000000
#define TEMPLATE_JUMP_EXCEPTION 0x0fff479000000000
The same can be seen in the same file in the ARM port:
#define TEMPLATE_ENTER 0x00001111
#define TEMPLATE_INSERT_TC 0x00002222
#define TEMPLATE_RETURN 0x00003333
It is not documented where these addresses came from; as a result, we can't find the equivalent addresses for a potential port.

Confusion about the paper "Hide and Seek with Spectres"

Hi, Sorry to contact you in this way, but I failed to find your email address.

I am reading your paper "Hide and Seek with Spectres: Efficient discovery of speculative information leaks with random testing". On the right side of page 7, I am confused about the value of i'. In order to obtain Contract(p, i) = Contract(p, i′), I think i' should be {rax=20, rbx=70} rather than {rax=10, rbx=70}.

When i={rax=20,rbx=5} and i'={rax=20, rbx=70}, it can successfully produce Contract(p, i) = Contract(p, i′) and Measure(p, i, µ) != Measure(p, i′, µ).

I'm not sure if my understanding is correct. Can you help clarify my doubt? Thanks a lot.

x86/model: Incorrect handling of string operations in the NullInj contract

String operations can access multiple memory addresses, with some of the accesses triggering a fault and some not. Yet the nullinj-fault contract injects zeros into all of the accesses if at least one of them faults. This leads to contract violations, which could be considered false positives.

Minimal test case:

.intel_syntax noprefix
MFENCE # instrumentation
.test_case_enter:

AND RDI, 0b1111111111111 # instrumentation
ADD RDI, R14 # instrumentation
AND RSI, 0b1111111111111 # instrumentation
ADD RSI, R14 # instrumentation
AND RCX, 0xff # instrumentation
ADD RCX, 1 # instrumentation
REPNE MOVSW

AND RAX, 0b1111111111111 # instrumentation
MOV rax, qword ptr [R14 + RAX]

.test_case_exit:
MFENCE # instrumentation

Config:

contract_observation_clause: loads+stores+pc
contract_execution_clause:
    - nullinj-fault

input_gen_entropy_bits: 24
memory_access_zeroed_bits: 0
inputs_per_class: 2

permitted_faults:
    - PF-present

logging_modes:
  - info
  - stat
  - dbg_violation

build error - At least one file selection option must be defined in the tool.hatch.build.targets.wheel table

When running make install, python3 -m build fails with the following error message:

Unable to determine which files to ship inside the wheel using the following heuristics: https://hatch.pypa.io/latest/plugins/builder/wheel/#default-file-selection

At least one file selection option must be defined in the tool.hatch.build.targets.wheel table, see: https://hatch.pypa.io/latest/config/build/

As an example, if you intend to ship a directory named foo that resides within a src directory located at the root of your project, you can define the following:

[tool.hatch.build.targets.wheel]
packages = ["src/foo"]

This can be fixed by adding the following to the pyproject.toml file:

    ...
    [tool.hatch.build.targets.wheel]
    packages = ["revizor"]

Arm Generator creates too many unconditional branches

Currently the Arm testcase generator seems to produce a high number of unconditional branches. This causes very little variance in the flow of generated binaries. There should be a control or change to maximize/increase the number of conditional branches.

Error minimizing test cases: Fences

I encountered what I believe to be unexpected behaviour when running the minimizing function with --add-fences enabled:

(1) I got a random generated x86 program from Revizor using a relatively basic configuration
(2) I tried executing the minimize function on the program (minasms/3f.asm) via the following line:
./cli.py minimize -s x86/isa_spec/base.json -c example-config.yaml -i minasms/3.asm -o minasms/3f.asm -n 100 --add-fences
- This failed, giving the following error message: "generator.AsmParserException: Could not parse line 10
Reason: Terminator not at the end of BB"
(3) I then minimized this program (into minasms/min3.asm, without adding fences) and tried adding fences to that minimized version
with the following line:
./cli.py minimize -s x86/isa_spec/base.json -c example-config.yaml -i minasms/min3.asm -o minasms/min3f.asm -n 100 --add-fences
- This failed too, giving the same error message as the one in (2).

These errors were encountered on the 24th of February, with hyperthreading disabled and the latest version of Revizor as of that date. The files were later re-fuzzed to rule out false positives, and both still presented a violation.
I've attached a .zip with the two .asm files and the .yaml file that triggered the error. The original test case file was generated with the same configuration as the one used when trying to add fences.

cant-fence.zip

Arm Generator creates out of spec loads and stores.

We have seen a couple of out-of-spec instructions generated by the Arm ISA generator. This occurs due to a missing constraint whereby post-indexed or pre-indexed with writeback loads and stores cannot share the same address and src/dest register. For example, the generator will create the instruction str W0,[X0],#-59, which targets R0 for the source of both address and data. This is prohibited according to the Arm developer portal, which states:

Rn must not be the same as Rd, if the instruction:

  • is pre-indexed with writeback (the ! suffix)
  • is post-indexed
  • uses the T suffix.
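
A generator-side check for the first two cases in this list could look like the following sketch. It is an assumption-laden illustration: the regular expression only understands single-register ldr/str forms with a numbered X base register (no register pairs, no SP base, no T-suffix handling), and violates_writeback_constraint is a hypothetical name rather than existing generator code.

```python
import re

# Matches e.g. "str W0,[X0],#-59" (post-indexed) or "ldr X2,[X2,#8]!" (pre-indexed).
_MEM_INSTR = re.compile(
    r"^\s*(?:ldr|str)\w*\s+[wx](?P<rt>\d+)\s*,\s*"
    r"\[\s*x(?P<rn>\d+)(?P<inner>[^\]]*)\]"
    r"(?P<tail>.*)$",
    re.IGNORECASE,
)


def violates_writeback_constraint(instruction: str) -> bool:
    """True if a writeback load/store uses the same register for data and base."""
    match = _MEM_INSTR.match(instruction)
    if match is None:
        return False  # not a form this sketch understands
    tail = match.group("tail").strip()
    pre_indexed = tail.startswith("!")   # "[Xn, #imm]!"
    post_indexed = tail.startswith(",")  # "[Xn], #imm"
    if not (pre_indexed or post_indexed):
        return False  # plain offset addressing, no writeback
    return match.group("rt") == match.group("rn")


if __name__ == "__main__":
    for text in ("str W0,[X0],#-59", "str W1,[X0],#-59", "ldr X2,[X2,#8]"):
        print(text, "->", violates_writeback_constraint(text))
```

A generator could call such a check after choosing operands and re-roll the data register when it returns True.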

x86/generator: Could not find an instruction to patch flags

I tried fuzzing using this configuration file:

supported_categories:
  - BASE-COND_BR
  - BASE-BITBYTE
  - BASE-CMOV
  - BASE-FLAGOP
  - BASE-LOGICAL

contract_observation_clause: ct
contract_execution_clause:
   - seq

min_bb_per_function: 2
max_bb_per_function: 2
test_case_size: 24
avg_mem_accesses: 12
program_generator_seed: 0
input_gen_entropy_bits: 16
memory_access_zeroed_bits: 0

and this command:

./cli.py fuzz -s x86/isa_spec/base.json -i 35 -n 500 -c tests/

After a few hundred rounds, I encounter this exception. It happens in a different round every time.

generator.GeneratorException: Could not find an instruction to patch flags ['OF'] , when OF modification in the supported instructions

The instructions I enable implicitly set the OF flag, but still this exception is raised.

Linux v6 compatibility issues

When I try to install the executor kernel module, in spite of installing cpuid, I keep running into:
sca-fuzzer/src/x86/executor/main.c:15:10: fatal error: cpuid.h: No such file or directory
   15 | #include <cpuid.h>
      |          ^~~~~~~~~

Kernel and headers version: 6.0.9; Ubuntu 22.04. In case it's pertinent, I am using Python 3.10.
