rshim-user-space's Introduction

              BlueField Rshim Host Driver

The rshim driver provides a way to access the rshim resources on the BlueField target from an external host machine. The current version implements device files for boot image push and virtual console access. It also creates a virtual network interface to connect to the BlueField target and provides a way to access the internal rshim registers.

*) Build

Linux:

Make sure the autoconf, automake and pkg-config tools are available. Run bootstrap.sh the first time to generate the configure script. Then run ./configure, followed by 'make' and 'make install', to build and install it.

FreeBSD:

Requires FreeBSD 12.0+ with the packages autoconf, automake, gmake, libepoll-shim, libpciaccess, libpci and pkgconf.

Follow the same steps as above to build it. Use 'gmake install' to install it.

*) Usage

rshim -h
syntax: rshim [--help|-h] [--backend|-b usb|pcie|pcie_lf] [--device|-d device-name] [--foreground|-f] [--debug-level|-l <0~4>]

*) Device Files

Each rshim target will create a directory /dev/rshim<N>/ with the following device files. <N> is the device id, which could be 0, 1, etc.

  • /dev/rshim<N>/boot

Boot device file used to push a boot stream to the ARM side, for example:

cat install.bfb > /dev/rshim<N>/boot
  • /dev/rshim<N>/console

Console device, which can be used by console apps to connect to the ARM side, such as

screen /dev/rshim<N>/console
  • /dev/rshim<N>/rshim

Device file used to access the rshim registers. The read/write offset is encoded as "((rshim_channel << 16) | register_offset)".

  • /dev/rshim<N>/misc

Key/Value pairs used to read/write misc information. For example,

Display the content:

cat /dev/rshim<N>/misc
  DISPLAY_LEVEL   0 (0:basic, 1:advanced, 2:log)
  BOOT_MODE       1 (0:rshim, 1:emmc, 2:emmc-boot-swap)
  BOOT_TIMEOUT    100 (seconds)
  DROP_MODE       0 (0:normal, 1:drop)
  SW_RESET        0 (1: reset)
  DEV_NAME        usb-3.3
  DEV_INFO        BlueField-3(Rev 1)
  OPN_STR         9009D3B400ENEA
  UP_TIME         179752(s)
  SECURE_NIC_MODE 1 (0:no, 1:yes)

Display more information:

echo "DISPLAY_LEVEL 1" > /dev/rshim<N>/misc
cat /dev/rshim<N>/misc
  ...
  PEER_MAC  00:1a:ca:ff:ff:01   # Target-side MAC address
  PXE_ID    0x01020304          # PXE DHCP-client-identifier

The 'PEER_MAC' attribute can be used to display and set the target-side MAC
address of the rshim network interface. It works when the target side is in
UEFI BootManager or in Linux where the tmfifo has been loaded. The new MAC
address takes effect on the next boot.

Initiate a SW reset:

echo "SW_RESET 1" > /dev/rshim<N>/misc

When 'SECURE_NIC_MODE' is shown as 1, the NIC firmware is in Secure NIC mode and most rshim functionality is disabled. This mode applies to the PCIe rshim backend only. The PCIe LF and USB rshim backends are not affected.
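
The offset encoding given above for /dev/rshim<N>/rshim, "((rshim_channel << 16) | register_offset)", can be illustrated with a small Python sketch. The function names are hypothetical and not part of rshim, and the 16-bit limit on the register offset is an assumption inferred from the shift in the formula. The channel and offset values in the example are arbitrary.

```python
from typing import Tuple


def encode_offset(rshim_channel: int, register_offset: int) -> int:
    """Combine a channel and a register offset into one read/write offset."""
    # Assumption: the register offset occupies the low 16 bits, so that it
    # cannot overlap the channel bits placed above them.
    if not 0 <= register_offset <= 0xFFFF:
        raise ValueError("register_offset must fit in 16 bits")
    if rshim_channel < 0:
        raise ValueError("rshim_channel must be non-negative")
    return (rshim_channel << 16) | register_offset


def decode_offset(offset: int) -> Tuple[int, int]:
    """Split an encoded offset back into (rshim_channel, register_offset)."""
    return offset >> 16, offset & 0xFFFF


# Arbitrary example values: channel 1, register offset 0x318.
print(hex(encode_offset(1, 0x318)))  # 0x10318
```

The resulting value would be used as the file position when reading or writing the device file.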

*) Multiple Boards Support

Multiple boards can connect to the same host machine. Each of them has its own device directory /dev/rshim<N>/. The network subnet needs to be set properly, just as for any other standard NIC.

*) How to change the MAC address of the ARM side interface

Update the 'PEER_MAC' attribute in the misc file as shown below. Display the value to confirm it is set. Reboot the device for the change to take effect.

echo "PEER_MAC 00:1a:ca:ff:ff:10" > /dev/rshim<N>/misc
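
Confirming the value can also be scripted. The following is a minimal Python sketch that parses the key/value text of the misc file and checks 'PEER_MAC'. The function name parse_misc is hypothetical, and the sample text is only modeled on the output shown earlier; in real use the text would be read from /dev/rshim<N>/misc with 'DISPLAY_LEVEL' set to 1.

```python
import re
from typing import Dict


def parse_misc(text: str) -> Dict[str, str]:
    """Map each upper-case key to the first token that follows it.

    Only the first whitespace-delimited token is kept, which is enough for
    values such as PEER_MAC but truncates values that contain spaces.
    """
    values = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and re.fullmatch(r"[A-Z_]+", parts[0]):
            values[parts[0]] = parts[1]
    return values


# Sample text modeled on the misc output above (an assumption, not real output).
sample = """\
  DISPLAY_LEVEL   1 (0:basic, 1:advanced, 2:log)
  BOOT_MODE       1 (0:rshim, 1:emmc, 2:emmc-boot-swap)
  PEER_MAC  00:1a:ca:ff:ff:10   # Target-side MAC address
"""

misc = parse_misc(sample)
if misc.get("PEER_MAC") == "00:1a:ca:ff:ff:10":
    print("PEER_MAC is set as expected")
```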

rshim-user-space's People

Contributors

alaahl, asmaamellanox, dwoods2, hselasky, lsun100, mbgg, pgeng-nv, shravankumarr, tzafrir-mellanox, vladsokolovsky


rshim-user-space's Issues

Lsun, rshim and Fedora

Sorry for opening a ticket for this.

The Fedora Infrastructure has been trying to contact @lsun100 since September 5th without any success.
We raised the issue in the Fedora devel list (cf: https://lists.fedoraproject.org/archives/list/[email protected]/message/3NZNSFKSS25EVEA6RY42LVOPZ74NGUQU/ ), also without success.

@lsun100 if you see this, feel free to close it and reach out by email (either to me or to the address (admin@...) that is sending you the daily notifications).

Thanks!

rshim needs to respect SIGTERM and SIGINT

I needed to make a modification to /etc/rshim.conf. After doing so, it seemed prudent to restart rshim.service. That took a very long time because rshim seems to ignore SIGTERM. Or perhaps it just ignores SIGTERM in the bad state that caused me to want to modify the configuration. journalctl tells me:

Aug 12 16:39:30 xxx systemd[1]: Stopping rshim driver for BlueField SoC...
Aug 12 16:41:00 xxx systemd[1]: rshim.service: State 'stop-sigterm' timed out. Killing.
Aug 12 16:41:00 xxx systemd[1]: rshim.service: Killing process 31084 (rshim) with signal SIGKILL.
Aug 12 16:41:00 xxx systemd[1]: rshim.service: Failed with result 'timeout'.
Aug 12 16:41:00 xxx systemd[1]: Stopped rshim driver for BlueField SoC.
Aug 12 16:41:00 xxx systemd[1]: Starting rshim driver for BlueField SoC...
Aug 12 16:41:00 xxx systemd[1]: Started rshim driver for BlueField SoC.

I observed similar problems with SIGINT when trying to use ^C to kill an rshim instance that was running in the foreground.

Note that this happened when I was debugging problems on a machine where the running kernel is missing cuse.ko (see #26), in case that matters. It was observed with the 2.0.5 build and with bits I built from commit 9b1da84.

Please avoid using a dash in the version

Please consider using a simpler versioning scheme. The current version format ("2.0.6-19") that contains a dash is not friendly to downstream packaging. Distribution packages (e.g. Fedora, Debian, ...) use a dash as the separator between the upstream version and the distro packaging release fields.
Maybe use semantic versioning (MAJOR.MINOR.PATCH)? See https://semver.org/

Rshim IOCTL stats

Dear team,

I'm working on BlueField-2 DPU devices and have received scripts that can access low-level statistics. I don't know how much of the detail is necessary, but in brief the script opens a file descriptor to rshim.

self.rshim_fd = os.open(filename, os.O_RDWR)

with filename set to /dev/rshim0, and then the script tries to run

adapted_addr = (channel << 16) | addr
buf = array.array("B", 16 * [0])
struct.pack_into('<LQ', buf, 0, adapted_addr, 0)
sys.stdout.flush()
fcntl.ioctl(self.rshim_fd, 1, buf)
val = buf[4:12]
rv = struct.unpack("<Q", bytearray(val))[0]

which results in
IOError: [Errno 25] Inappropriate ioctl for device

I've looped through all the other possible values instead of 1 and the error is the same.
The version:

rshim -v
rshim 2.0.20

If you need any additional info about what the script does we can go deeper into it. I would be grateful for any help.

Best regards

rshim boot interface copy hang with larger buffers

When using dd to copy a bootstream to the rshim, I noticed that copies with larger buffer sizes (e.g. 1MB) never complete. The last buffer never finishes copying.

Example:
dd if=installer.bf of=/dev/rshim0/boot bs=1M conv=sync status=progress

When terminating dd with SIGINT, it shows the following:
435159040 bytes (435 MB, 415 MiB) copied, 80 s, 5.4 MB/s
^C
441+1 records in
441+0 records out
462422016 bytes (462 MB, 441 MiB) copied, 170.944 s, 2.7 MB/s

My bfb file was 442 MB large, so the last buffer is missing/never finished copying.
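
As a quick check on the figures in the dd output above, the following Python sketch confirms that the byte count reported at interruption equals exactly 441 full blocks of 1 MiB, which is consistent with only the final block being outstanding. The variable names are illustrative; the numbers are taken from the output.

```python
MIB = 1024 * 1024               # bs=1M in the dd command

full_records_out = 441          # from "441+0 records out"
bytes_reported = 462422016      # from the final dd status line

# All bytes written so far are accounted for by full 1 MiB blocks.
print(full_records_out * MIB == bytes_reported)  # True
```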

rshim should exit when cuse is missing

rshim has no chance of doing anything useful when the cuse kernel module cannot be loaded. It should log an appropriate message and exit non-zero immediately. This would make debugging this failure mode much easier.

This is with rshim built from commit 9b1da84.

# ./rshim -b pcie -d 0000:d8:00.2 -l 4 -f
modprobe: FATAL: Module cuse not found in directory /lib/modules/4.18.0-80.el8.27782638.x86_64

At this point it just keeps running.
