Implementing a simple Distributed File System (DFS) using Python as our programming language.
A Distributed File System (DFS) is a file system whose data is stored on a server but is accessed and processed as if it were stored on the local client machine. A DFS makes it convenient to share information and files among users on a network.
Client side
- Initialize:
  - Initializes the client storage on a new system. This removes any existing files in the DFS root directory and returns the available size.
- File create:
  - Allows creation of a new empty file.
- File read:
  - Allows reading any file from the DFS (downloads a file from the DFS to the client side).
- File write:
  - Allows putting any file to the DFS (uploads a file from the client side to the DFS).
- File delete:
  - Allows deleting any file from the DFS.
- File info:
  - Provides information about the file (any useful information: size, node id, etc.).
- File copy:
  - Allows creating a copy of a file.
- File move:
  - Allows moving a file to the specified path.
- Open directory:
  - Allows changing the current directory.
- Read directory:
  - Returns the list of files stored in the directory.
- Make directory:
  - Allows creation of a new directory.
- Delete directory:
  - Allows deletion of a directory. If the directory contains files, the system asks the user for confirmation before deleting it.
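Two of the operations above, Initialize and Delete directory, can be sketched with the standard library alone. This is a minimal sketch and not the project's actual client code: the function names `init_root` and `delete_directory` and the injectable `confirm` callback are hypothetical, and the "available size" is assumed to mean the free bytes on the disk holding the root.

```python
import shutil
import tempfile
from pathlib import Path


def init_root(root: Path) -> int:
    """Remove everything under `root` and return the free bytes on its disk."""
    root.mkdir(parents=True, exist_ok=True)
    for entry in root.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)   # real sub-directory: remove recursively
        else:
            entry.unlink()         # regular file or symlink
    return shutil.disk_usage(root).free


def delete_directory(path: Path, confirm=input) -> bool:
    """Delete `path`; if it is not empty, ask for confirmation first.

    `confirm` is a hypothetical hook (defaults to input) so the prompt
    can be replaced in tests or by a different UI.
    """
    if any(path.iterdir()):
        answer = confirm(f"{path} is not empty. Delete anyway? [y/N] ")
        if answer.strip().lower() != "y":
            return False           # user declined, nothing removed
    shutil.rmtree(path)
    return True
```

Passing `confirm` as a parameter keeps the confirmation logic separate from how the question is actually shown to the user.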
Storage Server
- Replication:
  - Files are replicated on multiple storage servers.
- Directory management:
  - Accesses files using DATA_DIR + file_name, where DATA_DIR is \tmp\storage and n is the storage server number.
- Handles client requests
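As a rough illustration of the two storage server ideas above (a per-server data directory and replication), the following sketch writes the same data under several storage directories. It rests on assumptions: the README does not show how the server number n enters the path, so the `storage{n}` naming used here is hypothetical, as are the function names, and a temporary directory stands in for the real base path.

```python
import tempfile
from pathlib import Path
from typing import List


def storage_dir(base: Path, n: int) -> Path:
    # Assumption: one data directory per storage server, numbered by n.
    return base / f"storage{n}"


def replicate(base: Path, file_name: str, data: bytes, servers: List[int]) -> List[Path]:
    """Write `data` under the data directory of every listed server."""
    written = []
    for n in servers:
        data_dir = storage_dir(base, n)
        data_dir.mkdir(parents=True, exist_ok=True)
        target = data_dir / file_name      # DATA_DIR + file_name
        target.write_bytes(data)
        written.append(target)
    return written


if __name__ == "__main__":
    base = Path(tempfile.mkdtemp())
    for path in replicate(base, "chunk_0", b"hello", [1, 2, 3]):
        print(path)
```

In the deployed system each replica would live on a different machine; writing to sibling directories here only shows the bookkeeping.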
Naming Server
- File striping:
  - Slices a file into several chunks or blocks; our BLOCK_SIZE is 128.
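File striping as described above can be sketched in a few lines. The value of `BLOCK_SIZE` comes from this README, but its unit is not stated; bytes are assumed here. The helper names `split_into_blocks` and `join_blocks` are hypothetical.

```python
BLOCK_SIZE = 128  # value from the README; the unit (bytes) is an assumption


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Slice `data` into consecutive blocks of at most `block_size` bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def join_blocks(blocks) -> bytes:
    """Reassemble the original data from its blocks, in order."""
    return b"".join(blocks)


if __name__ == "__main__":
    payload = bytes(300)
    blocks = split_into_blocks(payload)
    print([len(b) for b in blocks])
```

Only the last block may be shorter than `BLOCK_SIZE`, and an empty file produces no blocks at all.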
- About 5 EC2 instances running, and a lot of money to support them 😭
- Instance IPs:
  - naming_server: 3.23.185.197
  - storage_server1: 3.23.149.225
  - storage_server2: 3.23.228.220
  - storage_server3: 52.15.190.62
- DockerHub account
- DockerHub images:
  - ozziekins/client
  - ozziekins/naming
  - ozziekins/storage
Step 1: Launch the Amazon EC2 instances
Step 2: SSH into each instance using the command shown in its Connect tab
Step 3: Pull our Docker images onto the instances using the command in Step 4
Step 4: docker pull <image_name>
Step 5: On the instance hosting the client, use any of the commands listed below
$: init
$: create <file_name>
$: read <file_name>
$: write <src_file> <dest_file>
$: delete <file_name>
$: info <file_name>
$: copy <file_name>
$: root
$: move <src> <dest>
$: cd <directory>
$: ls <directory>
$: mkdir <directory>
$: dltdir <directory>
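One way a client could validate the commands listed above before contacting any server is a small table of command names and argument counts. This is a hedged sketch, not the project's parser: `COMMANDS` and `parse_command` are hypothetical names, and the argument counts are read directly from the list above, so the real client may be more lenient (for example, `ls` without an argument).

```python
import shlex

# Command name -> number of arguments, as read from the command list above.
COMMANDS = {
    "init": 0, "create": 1, "read": 1, "write": 2, "delete": 1,
    "info": 1, "copy": 1, "root": 0, "move": 2, "cd": 1,
    "ls": 1, "mkdir": 1, "dltdir": 1,
}


def parse_command(line: str):
    """Split a command line and check its name and argument count."""
    parts = shlex.split(line)          # honours quotes around names with spaces
    if not parts:
        raise ValueError("empty command")
    name, args = parts[0], parts[1:]
    if name not in COMMANDS:
        raise ValueError(f"unknown command: {name}")
    if len(args) != COMMANDS[name]:
        raise ValueError(
            f"{name} expects {COMMANDS[name]} argument(s), got {len(args)}"
        )
    return name, args


if __name__ == "__main__":
    print(parse_command("write local.txt remote.txt"))
```

Rejecting malformed input locally avoids a round trip to the naming server for commands that could never succeed.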
Overall architecture
Naming server and file system
Storage servers and replication
Our system communicates iteratively: the client first contacts the naming server to get the required metadata, such as the file chunk id and the storage server on which that chunk can be found.
The client then takes this information to the relevant storage server and carries out the requested command there.
The parts of our system communicate with each other using RPyC (Remote Python Call), so each component has its own specified IP address and port.
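The two-step lookup described above can be sketched without any networking. In this illustration plain in-process objects stand in for the remote components, whereas the real system makes these calls over RPyC to a specific IP address and port. The class names, method names, and the `"s1"`/`"s2"` addresses are all hypothetical.

```python
class NamingServer:
    """Holds metadata only: which chunks make up a file and where each lives."""

    def __init__(self):
        self.files = {}  # file name -> list of (chunk_id, storage address)

    def register(self, file_name, placements):
        self.files[file_name] = list(placements)

    def lookup(self, file_name):
        return self.files[file_name]


class StorageServer:
    """Holds chunk data, keyed by chunk id."""

    def __init__(self):
        self.chunks = {}

    def put(self, chunk_id, data):
        self.chunks[chunk_id] = data

    def get(self, chunk_id):
        return self.chunks[chunk_id]


def read_file(naming, storages, file_name):
    # Step 1: ask the naming server where each chunk of the file lives.
    placements = naming.lookup(file_name)
    # Step 2: fetch every chunk directly from the storage server named
    # in the metadata; the naming server never handles file data.
    return b"".join(storages[addr].get(chunk_id) for chunk_id, addr in placements)


if __name__ == "__main__":
    storages = {"s1": StorageServer(), "s2": StorageServer()}
    storages["s1"].put("c0", b"hello ")
    storages["s2"].put("c1", b"world")
    naming = NamingServer()
    naming.register("greeting.txt", [("c0", "s1"), ("c1", "s2")])
    print(read_file(naming, storages, "greeting.txt"))
```

Keeping file data off the naming server means it only answers small metadata queries, while the bulk transfer happens between the client and the storage servers.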
Team name:
DFST
Members:
Ozioma Okonicha ([email protected])👩🏾‍💻🇳🇬
Anastassiya Ryabkova ([email protected])👩🏼‍💻🇰🇿
Daniel Atonge ([email protected])🧑🏿‍💻🇨🇲
Contributions proven by Trello
Contributions proven by GitHub