A high-performance, cloud-native distributed file system engineered with C++20, gRPC, and Docker, orchestrated by Kubernetes.
Designed to demonstrate advanced concepts in distributed systems, including data replication, strong consistency, high availability, and self-healing architectures.
Modern distributed systems require scalable and fault-tolerant storage layers. VertexFS was built as a learning and demonstration project to explore how real-world distributed file systems handle:
- Data replication and durability
- Failure detection and recovery
- Consistency guarantees
- Cloud-native deployment patterns
VertexFS follows a Master-Worker architecture, decoupling metadata management from raw data storage.
- Metadata Master (C++): Manages namespace, file-to-block mapping, and cluster health
- Storage Nodes (C++): Store data blocks and handle replication
- Client: Command-line tool for interacting with the system
- Communication Layer (gRPC): Efficient, low-latency RPC communication
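To make the Master's file-to-block mapping concrete, here is a minimal sketch of an in-memory metadata table. The names `MetadataTable`, `AddBlock`, `BlocksOf`, and `ReplicasOf` are hypothetical and chosen for illustration; they are not taken from the VertexFS source, and a real Master would also need persistence and locking.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical types for illustration; not VertexFS's actual classes.
using BlockId = std::uint64_t;
using NodeId = std::string;

class MetadataTable {
public:
    // Appends a new block to a file and records which nodes store it.
    BlockId AddBlock(const std::string& path, std::vector<NodeId> replicas) {
        assert(!replicas.empty());  // a block needs at least one holder
        const BlockId id = next_id_++;
        files_[path].push_back(id);
        blocks_[id] = std::move(replicas);
        return id;
    }

    // Returns the ordered block IDs of a file, or an empty list if unknown.
    std::vector<BlockId> BlocksOf(const std::string& path) const {
        auto it = files_.find(path);
        return it == files_.end() ? std::vector<BlockId>{} : it->second;
    }

    // Returns the nodes holding a block, or an empty list if unknown.
    std::vector<NodeId> ReplicasOf(BlockId id) const {
        auto it = blocks_.find(id);
        return it == blocks_.end() ? std::vector<NodeId>{} : it->second;
    }

private:
    BlockId next_id_ = 1;
    std::unordered_map<std::string, std::vector<BlockId>> files_;   // file -> blocks
    std::unordered_map<BlockId, std::vector<NodeId>> blocks_;       // block -> nodes
};
```

Keeping the two maps separate mirrors the split described above: clients ask the Master where a file's blocks live, then move the actual bytes to and from the Storage Nodes.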
graph TD
Client -->|Upload/Download| Master
Master -->|Metadata Ops| Client
Master -->|Replication Commands| Worker1
Master -->|Replication Commands| Worker2
Master -->|Replication Commands| Worker3
Worker1 --> Worker2
Worker2 --> Worker3
Worker3 --> Worker1
Worker1 -->|Heartbeat| Master
Worker2 -->|Heartbeat| Master
Worker3 -->|Heartbeat| Master
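The heartbeat edges in the diagram can be illustrated with a small timeout-based failure detector. This is a sketch under stated assumptions, not the VertexFS implementation: the class name `HeartbeatMonitor`, its methods, and the idea of a single fixed timeout are all hypothetical, and the gRPC transport that would deliver the heartbeats is left out.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical failure detector: a node is suspected dead when no
// heartbeat has arrived within `timeout`.
class HeartbeatMonitor {
public:
    using Clock = std::chrono::steady_clock;

    explicit HeartbeatMonitor(Clock::duration timeout) : timeout_(timeout) {
        assert(timeout > Clock::duration::zero());
    }

    // Called whenever a heartbeat from `node` is received.
    void RecordHeartbeat(const std::string& node, Clock::time_point now) {
        last_seen_[node] = now;
    }

    // Returns every known node whose last heartbeat is older than the timeout.
    std::vector<std::string> SuspectedDead(Clock::time_point now) const {
        std::vector<std::string> dead;
        for (const auto& [node, seen] : last_seen_) {
            if (now - seen > timeout_) dead.push_back(node);
        }
        return dead;
    }

private:
    Clock::duration timeout_;
    std::unordered_map<std::string, Clock::time_point> last_seen_;
};
```

Passing `now` explicitly, instead of calling `Clock::now()` inside the class, keeps the detector deterministic and easy to unit-test. In a running Master the caller would pass `HeartbeatMonitor::Clock::now()`.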
- Strong Consistency: A write commits only after a majority quorum (⌊N/2⌋ + 1 of N replicas) acknowledges it
- Fault Tolerance: Automatic failure detection via gRPC heartbeats
- Self-Healing: Kubernetes restarts failed pods; Master re-replicates missing blocks
- Replication: N-way replication across storage nodes
- Container-Native: Optimized multi-stage Docker builds
- Quorum-based write strategy
- Reads served from up-to-date replicas
- Ensures no stale data is returned after committed writes
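The quorum rule can be written down in a few lines. The sketch below assumes a simple majority quorum over N replicas, as the N/2 + 1 formula suggests; the function names `QuorumSize` and `IsCommitted` are hypothetical and not part of the VertexFS API.

```cpp
#include <cassert>
#include <cstddef>

// Smallest majority of n replicas: floor(n / 2) + 1 (integer division).
constexpr std::size_t QuorumSize(std::size_t n) { return n / 2 + 1; }

// A write is considered committed once a majority of replicas has acknowledged it.
bool IsCommitted(std::size_t acks, std::size_t total_replicas) {
    assert(total_replicas > 0);
    assert(acks <= total_replicas);
    return acks >= QuorumSize(total_replicas);
}

// With 3 replicas a write needs 2 acknowledgements; with 5 it needs 3.
static_assert(QuorumSize(3) == 2, "majority of 3 is 2");
static_assert(QuorumSize(5) == 3, "majority of 5 is 3");
```

Because any two majorities of the same replica set share at least one member, a read that consults a majority is guaranteed to reach at least one replica that saw the latest committed write.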
VertexFS is designed to tolerate:
- Node crashes (fail-stop failures)
- Pod restarts in Kubernetes
- Partial network failures (best-effort handling)
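After a node crash, the Master has to decide which blocks are under-replicated and where new copies should go. The following is a hedged sketch of that planning step; `RepairTask` and `PlanRepairs` are hypothetical names, and the placement policy (first eligible live node, in sorted order) is an assumption made for brevity, not the strategy VertexFS necessarily uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one re-replication job.
struct RepairTask {
    std::uint64_t block_id;
    std::string source;                // live node that still holds the block
    std::vector<std::string> targets;  // live nodes that should receive a copy
};

// Finds blocks with fewer live replicas than `replication_factor` and picks
// live nodes that do not yet hold the block as copy targets.
std::vector<RepairTask> PlanRepairs(
    const std::map<std::uint64_t, std::set<std::string>>& block_replicas,
    const std::set<std::string>& live_nodes,
    std::size_t replication_factor) {
    assert(replication_factor > 0);
    std::vector<RepairTask> tasks;
    for (const auto& [block_id, holders] : block_replicas) {
        std::vector<std::string> live_holders;
        for (const auto& node : holders) {
            if (live_nodes.count(node) > 0) live_holders.push_back(node);
        }
        // Skip fully replicated blocks; blocks with no surviving copy cannot be repaired.
        if (live_holders.size() >= replication_factor || live_holders.empty()) continue;

        RepairTask task{block_id, live_holders.front(), {}};
        std::size_t missing = replication_factor - live_holders.size();
        for (const auto& node : live_nodes) {
            if (missing == 0) break;
            if (holders.count(node) == 0) {
                task.targets.push_back(node);
                --missing;
            }
        }
        if (!task.targets.empty()) tasks.push_back(std::move(task));
    }
    return tasks;
}
```

For example, with a replication factor of 3, a block stored on nodes a, b, and c, and node a down while b, c, and d are alive, the plan contains one task that copies the block from b to d.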
- Language: C++20 (Abseil, Google Test)
- RPC Framework: gRPC / Protocol Buffers
- Containerization: Docker
- Orchestration: Kubernetes (StatefulSets)
- Build System: CMake
The VertexFS client provides a simple command-line interface:
$ client --upload myfile.txt
$ client --download myfile.txt
Features:
- File upload/download
- Metadata interaction
git clone https://github.com/benginsternas/VertexFS.git
cd VertexFS
docker-compose up --build
Starts:
- 1 Metadata Master
- 3 Storage Nodes
mkdir build && cd build
cmake ..
make -j$(nproc)
./storage_node --port=50051
kubectl apply -f k8s/master-deploy.yaml
kubectl apply -f k8s/worker-statefulset.yaml
# Upload a file
client --upload myfile.txt
# Simulate failure
kubectl delete pod dfs-worker-0
# Verify availability
client --download myfile.txt
# Observe recovery
kubectl logs <master-pod>
- Logs accessible via kubectl logs
- Health monitoring via heartbeat system
- Future: Prometheus & Grafana integration
- Single Metadata Master (single point of failure)
- No consensus protocol (e.g., Raft) implemented yet
- No erasure coding (replication only)
Planned improvements:
- Multi-master support
- Dynamic load balancing
- Improved failure handling
- Metrics and monitoring stack
Distributed under the MIT License.