
pspin's Introduction


PsPIN: A RISC-V in-network accelerator for flexible high-performance low-power packet processing

Figure: PsPIN architecture overview

PsPIN [1] is an implementation of the sPIN programming model [2] based on PULP [3]. This repository includes the RTL code implementing PsPIN, the runtime software, and a set of examples to get started. We provide a toolchain that lets you define, build, and test new handlers through cycle-accurate simulations.

Simulation Workflow summary: The RTL code is verilated into C++ modules that are compiled together with the functional models into two libraries: libpspin.so and libpspin_debug.so. To write your own handler, you need to define the handler code and a simulation driver. The handler code must be compiled with RISC-V GCC. The simulation driver interfaces with libpspin.so for (1) initializing the simulation; (2) defining the content of the L2 handler memory; (3) defining the handlers to offload; (4) defining and injecting the packets to process; and (5) handling events generated by the execution of the handlers (e.g., packets being sent or writes/reads to/from host memory). By linking against libpspin_debug.so, you make the simulation produce a waves.vcd file that can be explored with any value-change-dump viewer (e.g., GTKWave http://gtkwave.sourceforge.net/).
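The five driver steps above can be pictured with a small, language-neutral model. This is only a sketch in Python: a real driver is compiled code linked against libpspin.so, and `MockSimulation`, `echo_handler`, and every method name below are hypothetical stand-ins, not the libpspin API (see the docs for the actual interface).

```python
class MockSimulation:
    """Hypothetical stand-in for the simulator; not the libpspin API."""

    def __init__(self):
        self.l2_handler_mem = b""
        self.handlers = {}
        self.packets = []

    def set_l2_handler_memory(self, data):
        self.l2_handler_mem = bytes(data)

    def offload_handler(self, name, handler):
        self.handlers[name] = handler

    def inject_packet(self, payload):
        self.packets.append(bytes(payload))

    def run(self, on_event):
        # Run every offloaded handler on every packet, in injection order,
        # and forward each event the handler emits to the driver callback.
        for packet in self.packets:
            for handler in self.handlers.values():
                for event in handler(packet, self.l2_handler_mem):
                    on_event(event)


def echo_handler(packet, l2_mem):
    # Toy handler (assumption): sends every packet back out unchanged.
    return [("packet_sent", packet)]


def run_driver():
    sim = MockSimulation()                     # (1) initialize the simulation
    sim.set_l2_handler_memory(b"\x00" * 16)    # (2) define the L2 handler memory
    sim.offload_handler("echo", echo_handler)  # (3) define the handlers to offload
    for payload in (b"ping", b"pong"):         # (4) define and inject packets
        sim.inject_packet(payload)
    events = []
    sim.run(events.append)                     # (5) handle generated events
    return events
```

Calling `run_driver()` returns the list of events collected in step (5); in the real flow these would be events such as packets being sent or host memory accesses.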

Repo organization: The repository has the following structure:

  • hw/: Hardware components and simulation logic.
    • hw/deps/: (RTL) Dependencies from the PULP platform (https://github.com/pulp-platform). Some of them have been adapted to fit in the PsPIN design. License: SolderPad 0.51.
    • hw/src/: (RTL) PsPIN components. License: SolderPad 0.51.
    • hw/verilator_model/: (functional) Components implementing the NIC model shown by the above figure. License: Apache 2.0.
  • sw/: Software components. License: Apache 2.0.
    • sw/pulp-sdk/: Dependencies from the PULP SDK adapted to fit the PsPIN design.
    • sw/rules/: Makefile rules used to ease simulation setup and runs.
    • sw/runtime/: HPUs runtime code and support functions for the handlers.
    • sw/script/: Utilities for extracting data from the simulation output.
  • examples/: Examples of sPIN handlers. License: Apache 2.0.
    • examples/*/driver/: Simulation driver.
    • examples/*/handlers/: Handler code.

Getting started

Please check the docs: https://spcl.github.io/pspin/.

Citation

Please include this citation if you use this work as part of your project:

@inproceedings{pspin,
	title={A RISC-V in-network accelerator for flexible high-performance low-power packet processing},
	author={Di Girolamo, Salvatore and Kurth, Andreas and Calotoiu, Alexandru and Benz, Thomas and Schneider, Timo and Beranek, Jakub and Benini, Luca and Hoefler, Torsten},
	booktitle={2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA)},
	year={2021}
}

References

[1] Di Girolamo, Salvatore, Andreas Kurth, Alexandru Calotoiu, Thomas Benz, Timo Schneider, Jakub Beranek, Luca Benini, and Torsten Hoefler. "A RISC-V in-network accelerator for flexible high-performance low-power packet processing." In 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA). IEEE, 2021.

[2] Hoefler, Torsten, Salvatore Di Girolamo, Konstantin Taranov, Ryan E. Grant, and Ron Brightwell. "sPIN: High-performance streaming Processing in the Network." In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-16. 2017.

[3] Rossi, Davide, Francesco Conti, Andrea Marongiu, Antonio Pullini, Igor Loi, Michael Gautschi, Giuseppe Tagliavini, Alessandro Capotondi, Philippe Flatresse, and Luca Benini. "PULP: A parallel ultra low power platform for next generation IoT applications." In 2015 IEEE Hot Chips 27 Symposium (HCS), pp. 1-39. IEEE, 2015.


pspin's Issues

Command responses should carry status code

If a handler issues a buggy command (e.g., to an undefined command interface) then the command request gets lost and no response is sent back.

We should:

  • (1) add an error bit to the command response;
  • (2) have a "sink" command interface that replies with responses flagged as errors;
  • (3) propagate success/fail status to the handler whenever it tests/waits for a command.

What happens if a handler issues a buggy command and then exits without checking? We should probably send an event to the host in this case. This would probably make (3) redundant, because handlers should not spend much time on tasks such as checking status codes: if something goes wrong, it could be enough to signal this to the host via an event.
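The proposal can be sketched as a small behavioral model. This is not the PsPIN implementation: `CommandResponse`, `CommandUnit`, and all method names are hypothetical, and the model only illustrates the error bit (1), the sink interface (2), the status returned on wait (3), and the host event raised when a handler exits without checking a failed command.

```python
from dataclasses import dataclass


@dataclass
class CommandResponse:
    cmd_id: int
    error: bool  # (1) proposed error bit in the command response


class CommandUnit:
    """Hypothetical model of command dispatch; not the PsPIN RTL."""

    def __init__(self, interfaces):
        self.interfaces = interfaces   # interface name -> callable
        self.unchecked_errors = set()  # failed commands nobody waited on
        self.host_events = []          # events signalled to the host

    def issue(self, cmd_id, interface, payload):
        target = self.interfaces.get(interface)
        if target is None:
            # (2) sink interface: reply with an error instead of dropping
            # the request silently.
            self.unchecked_errors.add(cmd_id)
            return CommandResponse(cmd_id, error=True)
        target(payload)
        return CommandResponse(cmd_id, error=False)

    def wait(self, response):
        # (3) propagate success/failure to the handler.
        self.unchecked_errors.discard(response.cmd_id)
        return not response.error

    def handler_exit(self, handler_name):
        # Failed commands that were never checked become host events.
        for cmd_id in sorted(self.unchecked_errors):
            self.host_events.append((handler_name, cmd_id))
        self.unchecked_errors.clear()
```

In this model, a handler that issues a command to an undefined interface and exits without calling `wait` leaves one entry in `host_events`, which is the fallback discussed above.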

How to generate KVS YCSB workload?

Hi,
I'm reading your article (A RISC-V in-network accelerator for flexible high-performance low-power packet processing) and I noticed that there were some experiments conducted regarding KVS in the article.

It was mentioned in the article that:

We generate a YCSB [17] workload of 1,000 requests (50/50 read/write ratio, θ=1.1).

The experiments were conducted using simulated network interface devices, as mentioned in the article.
How can I generate such a KVS YCSB packet trace? Is there any reference code available as well?

Thanks.
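As a starting point, a request stream with the quoted parameters (1,000 requests, 50/50 read/write, θ=1.1) can be approximated with a few lines of Python. This is a sketch and not the generator used for the paper: the key-space size `num_keys`, the seed, and the `(op, key)` tuple format are assumptions, and turning requests into packets depends on the packet layout the KVS handlers expect, which is not described here. Keys are drawn by inverse-CDF sampling from a Zipf distribution over a finite key set, with probability proportional to 1/rank^θ.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def zipf_cdf(num_keys, theta):
    # Cumulative distribution of P(rank) proportional to 1 / rank**theta.
    weights = [1.0 / (rank ** theta) for rank in range(1, num_keys + 1)]
    total = sum(weights)
    return [c / total for c in accumulate(weights)]


def generate_workload(num_requests=1000, num_keys=1024, theta=1.1,
                      read_ratio=0.5, seed=0):
    # num_keys and seed are assumptions; the quoted sentence fixes only
    # the request count, the read/write ratio, and theta.
    rng = random.Random(seed)
    cdf = zipf_cdf(num_keys, theta)
    requests = []
    for _ in range(num_requests):
        # Inverse-CDF sampling; clamp guards against rounding at the tail.
        key = min(bisect_left(cdf, rng.random()), num_keys - 1)
        op = "READ" if rng.random() < read_ratio else "WRITE"
        requests.append((op, key))
    return requests
```

Each returned tuple holds the operation and a zero-based key index, with index 0 being the most popular key. A fixed seed makes the trace reproducible across runs.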

Driver supplying packets smaller than `max_pkt_size` triggers assertion

The assertion looks like the following:

sim_slp_l1: src/NICInbound.hpp:475: void PsPIN::NICInbound<AXIPortType>::feedback_progress() [with AXIPortType = PsPIN::AXIPort<unsigned int, long unsigned int>]: Assertion `*ni_ctrl.feedback_her_size_i == pktentry.size' failed.

I'm currently on commit cf70804.

This happens if the fill_packet function does not return max_pkt_size (even if it is off by just one, i.e., max_pkt_size-1). I'm pretty sure I'm not overflowing the packet buffer.

How is core_region's clock controlled and gated?

I have recently been reading your paper "A RISC-V in-network accelerator for flexible high-performance low-power packet processing", along with the source code. I found some mismatches between the paper and the source code, which I find quite confusing.

I'm reading the source code at tag v0.6.1, and I have made no changes to the source files. According to git diff against the master branch, there are no significant changes to the hardware design in hw/, so I think it's okay to consider v0.6.1 as up to date.

There are connections in hw/deps/pulp_cluster/rtl/pulp_cluster.sv that I believe play the role of clock-gating the core_region.

// line 1031
cluster_peripherals #(
...
) cluster_peripherals_i (
...
  .core_busy_i(core_busy),
  .core_clk_en_o(clk_core_en),
...
);

// line 1155
core_region #(
...
) core_region_i (
...
  .clock_en_i(clk_core_en[i]),
...
  .core_busy_o(core_busy[i]),
...
);

It looks like this cluster_peripherals_i instance is controlling/clock-gating the RISC-V cores. However, the paper mentions that

If the HPU driver has no task/handler to execute, it stops the HPUs by clock-gating it.

But I didn't find any connection between the HPU driver and cluster_peripherals_i in the source code, and I couldn't find much description of this instance in the paper either. So here are my questions:

  1. In the current implementation, which module controls/clock-gates the core, and how does that module control it?
  2. What role does cluster_peripherals_i play in the design? I noticed that it manages "events" from the timer, DMA, etc., but how do these events and their sources work as part of the design?
