casys-kaist / linefs
LineFS: Efficient SmartNIC Offload of a Distributed File System with Pipeline Parallelism
Hi Kim, I'm trying to follow your work and ran into a problem when performing step 9.3.
I am trying to set up a cluster with 3 nodes,
and I got a segmentation fault.
After debugging with gdb, the error seems to be caused by rdma_bind_addr, at line 71 of file /usr/include/aarch64-linux-gnu/bits/string_fortified.h.
I found that cm_id is NULL and addr is 0.
Could you please help me with this problem? Thanks a lot!
I want to read this paper....
It seems that some repos listed in .gitmodules cannot be found, such as casys-kaist-internal/hyperloop.code.git and casys-kaist-internal/Pipeline-ioat-dma-kernel-module.git.
Where can we find the submodule repos?
I'm starting LineFS with 3 nodes. Their IP addresses are configured as below:
// 1st machine - Bluefield NIC
{ .ip = "10.10.3.101", .role = HOT_REPLICA, .type = KERNFS_NIC_PEER},
// 1st machine - Host
{ .ip = "10.10.3.1", .role = HOT_REPLICA, .type = KERNFS_PEER},
// 2nd machine - Bluefield NIC
{ .ip = "10.10.3.102", .role = HOT_REPLICA, .type = KERNFS_NIC_PEER},
// 2nd machine - Host
{ .ip = "10.10.3.2", .role = HOT_REPLICA, .type = KERNFS_PEER},
// 3rd machine - Bluefield NIC
{ .ip = "10.10.3.103", .role = HOT_REPLICA, .type = KERNFS_NIC_PEER},
// 3rd machine - Host
{ .ip = "10.10.3.3", .role = HOT_REPLICA, .type = KERNFS_PEER},
The SmartNIC of node2 has successfully connected to its host. The terminal of node2-nic looks like this:
[New Thread 0xfffd0b1e6850 (LWP 253800)]
[New Thread 0xfffd0a9e5850 (LWP 253801)]
Connecting to KernFS instance 5 [ip: 10.10.3.3]
[New Thread 0xfffd095fe850 (LWP 253802)]
Wait for connections established. 0/2
Wait for connections established. 0/2
Wait for connections established. 0/2
Wait for connections established. 0/2
Wait for connections established. 0/2
But node1-host gets the wrong IP for node1-nic. The terminal of node1-host looks like this:
ip address on interface 'enp129s0f0' is 10.10.3.2
It is Ready. Write 1 to file: /opt/LineFS/LineFS_x86/signals/kernfs/node1.jinwei-121792.bfkvs-pg0.clemson.cloudlab.us
Server: cmd_hdr is s
Server: cmd_hdr is s
Reading root inode with inum: 1
[New Thread 0x7ff6721ff700 (LWP 12727)]
Connecting to KernFS instance 4 [ip: 10.10.3.103]
12767 connection.c:1357 rc_die(): unknown event
Thread 34 "kernfs" received signal SIGTRAP, Trace/breakpoint trap
Could you help me solve this problem? Thanks a lot!
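One way to cross-check the logs against the configuration is to look up each printed IP in the peer table and see which entry it lands on. The following is only a minimal sketch, not LineFS code: `find_peer_index` and `machine_of` are hypothetical helpers, the enum definitions are stand-ins for the real ones, and it assumes that instance numbers in the log are 0-based positions in the table (which matches "instance 5 [ip: 10.10.3.3]" and "instance 4 [ip: 10.10.3.103]" above) and that entries are listed as NIC/host pairs per machine.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in definitions (assumptions); the real LineFS headers define these. */
enum peer_role { HOT_REPLICA };
enum peer_type { KERNFS_NIC_PEER, KERNFS_PEER };

struct peer_entry {
    const char *ip;
    enum peer_role role;
    enum peer_type type;
};

/* Same six entries as in the configuration quoted above. */
static const struct peer_entry peers[] = {
    { .ip = "10.10.3.101", .role = HOT_REPLICA, .type = KERNFS_NIC_PEER }, /* 1st machine - NIC  */
    { .ip = "10.10.3.1",   .role = HOT_REPLICA, .type = KERNFS_PEER },     /* 1st machine - Host */
    { .ip = "10.10.3.102", .role = HOT_REPLICA, .type = KERNFS_NIC_PEER }, /* 2nd machine - NIC  */
    { .ip = "10.10.3.2",   .role = HOT_REPLICA, .type = KERNFS_PEER },     /* 2nd machine - Host */
    { .ip = "10.10.3.103", .role = HOT_REPLICA, .type = KERNFS_NIC_PEER }, /* 3rd machine - NIC  */
    { .ip = "10.10.3.3",   .role = HOT_REPLICA, .type = KERNFS_PEER },     /* 3rd machine - Host */
};

#define N_PEERS (sizeof(peers) / sizeof(peers[0]))

/* Hypothetical helper: 0-based table index of the entry with this IP, or -1. */
int find_peer_index(const char *ip)
{
    for (size_t i = 0; i < N_PEERS; i++) {
        if (strcmp(peers[i].ip, ip) == 0)
            return (int)i;
    }
    return -1;
}

/* Hypothetical helper: 1-based machine number, assuming NIC/host pairs. */
int machine_of(int index)
{
    return index < 0 ? -1 : index / 2 + 1;
}
```

Under these assumptions, the address that node1-host reports for enp129s0f0 (10.10.3.2) resolves to index 3, which is the 2nd machine's host entry rather than the 1st machine's (10.10.3.1, index 1). That may be worth checking before looking at the RDMA connection error itself.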
Thank you for not letting this rot away as a student project we'll only ever find papers about; instead you shared it.
And it's feasible to try out, albeit at a smaller scale, once one has some pmem-enabled systems. Wonderful!
(I wish I were able to port this over to the 48-core Cavium stuff, since they also have partitioning. From the paper, you stopped measuring at 8 clients due to line-rate exhaustion. As a sysadmin I have to say that is where it starts to get interesting, and your parallel "busy" handling looked very good. It would have been interesting to see how it holds up under very high overload, e.g. 1024 clients, as that's what real enterprise storage is expected to handle gracefully. IOW: an architecture that is load-independent, so it does not topple, fall over, implode, or fold into itself when queues become excessive. Your work seems to have had potential for that, even if it was of course very specific in its use of pmem. Today, with NVMe direct (GPU direct?) reads, it would probably even be able to integrate slower storage in the same manner.
Oh well. Maybe some lucky day an engineer notices this code and it becomes a thing ;-) Till then, again, ty for sharing.)