OpenCSD: eBPF Computational Storage Device (CSD) for Zoned Namespace (ZNS) SSDs in QEMU

Dantali0n/OpenCSD

Publications

OpenCSD

OpenCSD is an improved version of ZCSD that achieves snapshot-consistent log-structured filesystem (LFS) integration, through FluffleFS, on Zoned Namespaces (ZNS) Computational Storage Devices (CSD). Below is a diagram of the overall architecture as presented to the end user. The actual implementation differs, however, because it relies on emulation through technologies such as QEMU, uBPF and SPDK.

FluffleFS

FluffleFS is the filesystem built using the OpenCSD framework. It is designed as an LFS, with the flash-optimized F2FS filesystem as inspiration. FluffleFS is unique in that it runs in user space, thanks to the FUSE library, while still offering simulated CSD offloading with concurrent regular user access to the same file.

Getting Started

Index

Directory Structure

  • cmake - Small cmake snippets to enable various features
  • dependencies - Project dependencies
  • docs - Doxygen generated source code documentation
  • fosdem2023 - FOSDEM 2023 emulator devroom presentation
  • ictopen2022 - ICTOPEN 2022 presentation
  • measurements - Raw experiment data used during thesis
  • opencsd - Project source files
  • playground - Small toy examples or other experiments
  • python - Python scripts to aid in visualization or measurements
  • scripts - Shell scripts primarily used by CMake to install project dependencies
  • tests - Unit tests and possibly integration tests
  • thesis - Thesis written on OpenCSD using LaTeX
  • thesis-presentation - Thesis presentation written on OpenCSD using LaTeX
  • zcsd - Documentation on the previous prototype.
    • compsys 2021 - CompSys 2021 presentation written in LaTeX
    • documentation - Individual Systems Project report written in LaTeX
    • presentation - Individual Systems Project midterm presentation written in LaTeX
  • .vscode - Launch targets and settings to debug programs running inside QEMU over SSH

Modules

Module           Task
arguments        Parse commandline arguments to relevant components
bpf_helpers      Headers to define functions available from within BPF
bpf_programs     BPF programs ready to run on a CSD using bpf_helpers
fuse_lfs         Log Structured Filesystem in FUSE
nvme_csd         Emulated additional NVMe commands to enable BPF CSDs
nvme_zns         Interface to handle zoned I/O using abstracted backends
nvme_zns_memory  Non-persistent memory backed emulated ZNS SSD backend
nvme_zns_spdk    Persistent SPDK backed ZNS SSD backend
output           Neatly control messages to stdout and stderr with levels
spdk_init        Provides SPDK initialization and handles for nvme_zns & nvme_csd

Dependencies

This project has a large number of dependencies, as shown below. Note, however, that these dependencies are already available in the QEMU base image.

Warning: Meson must be below version 0.60 due to a bug in DPDK.

  • General
    • Linux 6.0 or higher
    • compiler with c++17 support
    • clang 10 or higher
    • cmake 3.18 or higher
    • python 3.x
    • mesonbuild < 0.60 (pip3 install meson==0.59)
    • pyelftools (pip3 install pyelftools)
    • libslirp
    • ninja
    • cunit
  • Documentation
    • doxygen
    • LaTeX
  • Code Coverage
    • ctest
    • lcov
    • gcov
    • gcovr
  • Continuous Integration
    • valgrind
  • Python scripts
    • virtualenv
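
The version bounds in the General list above can be checked before starting a build. The following is a minimal sketch and not part of OpenCSD: the helper names parse_version and meets_requirements are hypothetical, and the bounds are copied from the list (Meson below 0.60, CMake 3.18 or higher, clang 10 or higher).

```python
import re

# Bounds copied from the dependency list; the dictionary layout is an
# assumption made for this sketch, not something OpenCSD ships.
MINIMUM = {"cmake": (3, 18), "clang": (10,)}
BELOW = {"meson": (0, 60)}


def parse_version(text):
    """Extract the first dotted version number from a tool's version output."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_requirements(tool, version_text):
    """Return True if the reported version satisfies the bounds above."""
    version = parse_version(version_text)
    if tool in MINIMUM and version < MINIMUM[tool]:
        return False
    if tool in BELOW and version >= BELOW[tool]:
        return False
    return True
```

The version text would typically come from running a tool with its --version flag, for example the output of meson --version.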

The following dependencies are automatically compiled and installed into the build directory.

Dependency        Version
backward          1.6
boost             1.74.0
bpftool           5.14
bpf_load          5.10
dpdk              spdk-21.11
generic-ebpf      c9cee73
fuse-lfs          526454b
libbpf            0.5
libfuse           3.10.5
libbpf-bootstrap  67a29e5
linux             5.14
spdk              22.09
isa-l             spdk-v2.30.0
rocksdb           6.25.3
qemu              7.2.0
uBPF              9eb26b4
xenium            f1d28d0

Setup

Several setups are available, of which two are officially supported. We recommend the QEMU setup, which provides a non-volatile filesystem through QEMU's ZNS emulation.

  1. Through QEMU; non-volatile filesystem
  2. Direct on host; volatile memory backed filesystem

QEMU Setup

The QEMU setup provides an emulated environment with an emulated Zoned Namespaces NVMe device, giving a non-volatile CSD filesystem experience. In addition, using QEMU ensures that software libraries and frameworks are at supported versions.

The QEMU setup will try to download a 4.5 GB qcow2 image; the download fails if it does not complete within 30 minutes. Alternatively, the file can be downloaded as a torrent through this link. The file should be saved as ./build/opencsd/arch-qemucsd.qcow2.

Alternatively, the QEMU image can be downloaded after executing make qemu-build by running cd opencsd; ./download-image.sh.
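
Because the image download can time out or be fetched by hand, a quick sanity check before booting can save a confusing failure. The following is a minimal sketch and not part of OpenCSD: the function name looks_like_qcow2 is hypothetical, and it only checks that the file exists and begins with the four magic bytes of the qcow2 format. It cannot detect an image that was truncated after the header.

```python
import os

QCOW2_MAGIC = b"QFI\xfb"  # first four bytes of a qcow2 image


def looks_like_qcow2(path):
    """Return True if the file exists and starts with the qcow2 magic bytes."""
    if not os.path.isfile(path):
        return False
    with open(path, "rb") as image:
        return image.read(4) == QCOW2_MAGIC
```

For the setup described above, the path to check would be ./build/opencsd/arch-qemucsd.qcow2.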

# git clone https://gitlab.dantalion.nl/vu/opencsd.git
cd opencsd
git submodule update --init
mkdir build
cd build
cmake ..
# This will also create a 32gb zns image
make qemu-build
cmake .. # this prevents re-compiling dependencies on every next make command
cd opencsd
source activate
# By default qemu will use 4 CPU cores and 8GB of memory + kvm
./qemu-start-256-kvm.sh
# Wait for QEMU VM to fully boot... (might take some time)
# Type password (arch)
ssh arch@localhost -p 7777
cd opencsd
git pull origin master
git -c submodule."dependencies/qemu".update=none submodule update --init
mkdir build
cd build
cmake -DENABLE_DOCUMENTATION=off -DIS_DEPLOYED=on ..
make fuse-entry-spdk -j $(nproc)
cmake .. # this prevents re-compiling dependencies on every next make command

Host Setup

Note: if the install location of the native kernel sources cannot be detected, a fixed version from ./dependencies/linux will be used. This can cause failures in vmlinux.h with bpftool when accessing /sys/kernel/btf/vmlinux.

# git clone https://gitlab.dantalion.nl/vu/opencsd.git
cd opencsd
git submodule update --init
mkdir build
cd build
cmake ..
make fuse-entry -j $(nproc)
cmake .. # this prevents re-compiling dependencies on every next make command

Environment

The build folder contains an opencsd/activate script. This script can be sourced from any shell with source opencsd/activate. It configures environment variables such as LD_LIBRARY_PATH and exposes an essential sudo alias: ld-sudo.

The environment variables ensure any linked libraries can be found for targets compiled by CMake. Additionally, ld-sudo provides a mechanism to start targets with sudo privileges while retaining these environment variables. The environment can be deactivated at any time by executing deactivate.
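
A plain sudo usually resets most of the environment, which is why LD_LIBRARY_PATH would otherwise be lost. How ld-sudo is actually defined is not shown here; the sketch below only illustrates one common way to get the same effect, by passing the variables explicitly through env. The function name sudo_command and its parameters are hypothetical.

```python
import os


def sudo_command(command, keep=("LD_LIBRARY_PATH",), environ=None):
    """Build an argv that runs command under sudo with selected variables kept."""
    environ = os.environ if environ is None else environ
    # Re-declare each kept variable after sudo has reset the environment.
    assignments = [f"{name}={environ[name]}" for name in keep if name in environ]
    return ["sudo", "env", *assignments, *command]
```

The resulting list can be handed to subprocess.run; variables that are not set in the current environment are skipped.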

Usage Examples

Each usage example assumes the steps of the previous example have already been executed.

  1. Start the filesystem in a memory backed mode (volatile) and mount it on test.

Mounts and starts the filesystem in a volatile mode under the test directory. Any output will be printed to stdout / stderr.

# working directory: opencsd (root)
cd build
make fuse-entry
cmake ..
cd opencsd
mkdir -p test
source activate
ld-sudo ./fuse-entry -- -d -o max_read=2147483647 test &

  2. Run the passthrough kernel on the filesystem mounted under test using the python script.

On the mounted filesystem, copy the pre-compiled passthrough read kernel. Next, place data in a test file and execute an example Python script that orchestrates running the read kernel on the example file.

# working directory: opencsd/build/opencsd
cp ../bin/bpf_flfs_read.o test/
echo "hello world" > test/test
ld-sudo python3 ../../python/csd-read-passthrough.py

  3. Manually register and start the passthrough kernel step by step with python.

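# ld-sudo python3
import os
import xattr
import pdb

# Number of bytes requested per read call
read_stride = 524288

# Drop into the debugger so the remaining statements can be stepped through
pdb.set_trace()

fd = os.open("test/test", os.O_RDWR)
filesize = os.stat("test/test").st_size

# The kernel is identified by the inode number of its object file
kern_ino = os.stat("test/bpf_flfs_read.o").st_ino

# Register the kernel on the target file
xattr.setxattr(
  "test/test", "user.process.csd_read_stream", bytes(f"{kern_ino}", "utf-8")
)

# Round up so a trailing partial stride is still read
steps = (filesize + read_stride - 1) // read_stride

# Issue the reads, one stride at a time
for i in range(0, steps):
  os.pread(fd, read_stride, i * read_stride)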
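
The loop in the script above issues one read per read_stride bytes, so the number of reads needed is the ceiling of filesize / read_stride. The following is a minimal sketch of that arithmetic on its own: the generator name stride_reads is hypothetical, and clamping the final length to the remaining bytes is a choice made here, whereas the script always requests a full stride.

```python
def stride_reads(filesize, stride):
    """Yield (offset, length) pairs that cover filesize bytes in stride-sized reads."""
    if stride <= 0:
        raise ValueError("stride must be positive")
    steps = (filesize + stride - 1) // stride  # ceiling division
    for i in range(steps):
        offset = i * stride
        # The last read may be shorter than a full stride.
        yield offset, min(stride, filesize - offset)
```

For the 12-byte file created by echo "hello world" > test/test, this yields a single read at offset 0.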

Roadmap

These are grouped by component and ordered by importance.

  1. Support & Infrastructure
    1. Use clang-tidy to apply code formatting
    2. Integrate CI/CD job to check clang-tidy formatting
  2. Testing & Verification
    1. Complete runtime testbench to ensure filesystem behavior
  3. FUSE
    1. Upgrade FUSE to 3.13.0
    2. Remove requirement for redundant -o max_read=... argument
    3. Increase performance
      1. Convert all datastructures to support concurrent access
      2. Less restrictive locking on FUSE requests
      3. Test disabling FUSE_CAP_AUTO_INVAL_DATA with a high attr_timeout, combined with using direct_io in open when offloading.
      4. Actually implement modification time so FUSE can do its work
  4. eBPF / uBPF
    1. Automate endian conversion for end users
    2. System to stall kernel execution to normalize for specific processors
    3. Fully implement stream and event kernels for both read / write operations
      1. read event
      2. write stream
      3. write event
      4. Optimize efficiency of write event
        1. Introduce two stage kernels (filesystem + user-program)
  5. CSx FS runtime (Filesystem agnostic kernels)
    1. Create dummy runtime service component
    2. Create ICD loader
    3. Create first official draft of filesystem helper API
      1. Implement fixed point operations for decimal math
  6. Create first official draft of CSx ABI
  7. SPDK / xNVME
    1. Create additional ZNS backend using xNVME
    2. Allocate larger SPDK buffers so multiple I/O requests can be queued

CMake Configuration

This section documents all configuration parameters that the CMake project exposes and how they influence the project. For more information about the CMake project, see the report generated from the documentation folder. All parameters are listed below along with their default value and a brief description.

Parameter             Default  Use case
ENABLE_TESTS          ON       Enables unit tests and adds tests target
ENABLE_CODECOV        OFF      Produce code coverage report with unit tests
ENABLE_DOCUMENTATION  ON       Produce code documentation using doxygen & LaTeX
ENABLE_PLAYGROUND     OFF      Enables playground targets
ENABLE_LEAK_TESTS     OFF      Add compile parameter for address sanitizer
IS_DEPLOYED           OFF      Indicate that CMake project is deployed in QEMU

For several parameters a more in-depth explanation is required, primarily IS_DEPLOYED. The CMake project is used both to compile and configure QEMU and to compile the binaries that run inside QEMU. As a result, the CMake project needs to be able to identify whether it is being executed outside of QEMU. This is what IS_DEPLOYED facilitates. In particular, IS_DEPLOYED prevents the compilation of QEMU from source.
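
The difference between the two configure steps can be sketched as a small helper. This is an illustration only: the function name cmake_configure_args is hypothetical, and the flags are the ones used in the QEMU Setup commands (plain cmake .. on the host, and -DENABLE_DOCUMENTATION=off -DIS_DEPLOYED=on inside the guest).

```python
def cmake_configure_args(deployed, source_dir=".."):
    """Return the cmake invocation for the host or for inside the QEMU guest."""
    args = ["cmake"]
    if deployed:
        # Inside the guest: skip documentation and do not build QEMU again.
        args += ["-DENABLE_DOCUMENTATION=off", "-DIS_DEPLOYED=on"]
    args.append(source_dir)
    return args
```

The returned list can be passed to subprocess.run from within the build directory.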

Licensing

This project is available under the MIT license. Several limitations apply, including:

  • Source files with an alternative author or license statement other than Dantali0n and MIT respectively.
  • Images subject to copyright or usage terms, such as the VU and UvA logos.
  • CERN beamer template files by Jerome Belleman.
  • Configuration files that can't be subject to licensing such as doxygen.cnf or .vscode/launch.json

References