postgres hook test passed
This commit is contained in:

README.md (366)

@@ -1,101 +1,126 @@
# ZVFS

ZVFS is a user-space file system prototype built on SPDK Blobstore. The goal is to redirect common POSIX file I/O onto a high-performance user-space storage path without modifying application code.

The core idea is to reuse Linux's file-management machinery (namespaces/directories/metadata) while moving the file data plane into ZVFS.

The aim is to let upper-layer applications keep their blocking file interfaces with minimal changes, while approaching SPDK's performance ceiling in low-queue-depth (QD≈1) scenarios.

- Hook mechanism: `LD_PRELOAD`
- Mount prefix: `/zvfs`
- Architecture: multi-process clients + standalone daemon + SPDK
- Semantics: synchronous blocking (request-response)

---

## 1. Project Positioning

The point of this project is not just "getting I/O to run", but tying the following engineering problems together:

1. Transparent takeover inside multi-threaded/multi-process applications (RocksDB / PostgreSQL).
2. Preserving POSIX semantics (open/close/dup/fork/append/sync, etc.).
3. Centralizing SPDK resources in the daemon, avoiding per-process re-initialization.
4. Making the protocol, concurrency, and error handling complete under synchronous blocking semantics.

---

## 2. Architecture Design



```text
App (PostgreSQL / RocksDB / db_bench / pgbench)
  -> LD_PRELOAD libzvfs.so
  -> Hook Client (POSIX interception + local state)
  -> Unix Domain Socket IPC (sync/blocking)
  -> zvfs_daemon
     -> protocol deserialization + dispatch
     -> metadata thread + io threads
     -> SPDK Blobstore / bdev
```
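The interception step in the diagram relies on `LD_PRELOAD` symbol overriding. A minimal sketch of the pattern, assuming an illustrative helper name (`zvfs_is_managed`) rather than the project's actual symbols:

```c
/* Minimal LD_PRELOAD interception sketch. Helper names are illustrative;
 * the real hook layer lives in src/hook. Build as a shared object:
 *   gcc -shared -fPIC -o libhook.so hook.c -ldl
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 when a path falls under the /zvfs mount prefix. */
int zvfs_is_managed(const char *path)
{
    return path && (strncmp(path, "/zvfs/", 6) == 0 ||
                    strcmp(path, "/zvfs") == 0);
}

typedef ssize_t (*real_write_t)(int, const void *, size_t);

/* Overrides libc write(); a real hook would consult its fd table here and
 * forward zvfs fds over IPC, everything else passes straight through. */
ssize_t write(int fd, const void *buf, size_t count)
{
    static real_write_t real_write;
    if (!real_write)
        real_write = (real_write_t)dlsym(RTLD_NEXT, "write");
    return real_write(fd, buf, count);
}
```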
### 2.1 Layering

**The control plane reuses Linux; the data plane goes through ZVFS.**

Current implementation:

- Control plane (handled by Linux)
  - Directory/namespace management.
  - File-node lifecycle and permission semantics (create/open/close/stat/rename/unlink, etc.).
  - These operations also issue the real syscalls under `/zvfs`; ZVFS does not reimplement directory-tree management.
- Data plane (handled by ZVFS)
  - File contents are carried by blobs.
  - The real data path of `read/write` bypasses the Linux file data plane and goes through ZVFS IPC + SPDK.
- Key binding scheme
  - `create`: create the real Linux file + create a blob in ZVFS + write the `blob_id` into the file's xattr.
  - `open`: really `open` the Linux file + read the xattr to get the `blob_id` + open the blob in ZVFS.
  - `write`: after the blob write succeeds, use `ftruncate` to keep the Linux-side `st_size` in sync.
- Engineering payoff
  - Directly cuts roughly 50% of the implementation work.
  - Better compatibility; databases can keep their existing file organization.
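The create-time binding above can be sketched as follows. The xattr key matches the README; the blob-creation step itself (which would go through the daemon) is stubbed out, and helper names are illustrative:

```c
/* Sketch of the create-path binding: create the real Linux file, then record
 * the blob id in its xattr so a later open() can recover it. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/xattr.h>
#include <unistd.h>

#define ZVFS_XATTR_KEY "user.zvfs.blob_id"

/* Encode a blob id as the xattr value; returns 0 on success. */
int zvfs_blob_id_to_str(uint64_t blob_id, char *buf, size_t len)
{
    int n = snprintf(buf, len, "%llu", (unsigned long long)blob_id);
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* create: real open(O_CREAT) on the Linux file + blob_id recorded in xattr.
 * (The blob would have been created in the daemon before this point.) */
int zvfs_bind_blob(const char *path, uint64_t blob_id)
{
    int fd = open(path, O_CREAT | O_RDWR, 0644);
    if (fd < 0)
        return -1;
    char val[32];
    zvfs_blob_id_to_str(blob_id, val, sizeof(val));
    int rc = fsetxattr(fd, ZVFS_XATTR_KEY, val, strlen(val), 0);
    close(fd);
    return rc;
}
```

On `open`, the inverse path reads the xattr back with `fgetxattr` and opens the corresponding blob.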
### 2.2 Responsibilities per Layer

- Client (`src/hook` + `src/spdk_engine/io_engine.c`)
  - Decides whether a path is under `/zvfs`.
  - Intercepts POSIX APIs and issues synchronous IPC.
  - Keeps minimal local state: `inode_table` (`blob_id -> inode`), `path_cache` (`path -> inode`), `fd_table` (`fd -> open_file`).
- Daemon (`src/daemon`)
  - Owns the SPDK environment and threads exclusively.
  - Executes all blob create/open/read/write/resize/sync/delete.
  - Manages handles/ref_counts centrally.
- Protocol layer (`src/proto/ipc_proto.*`)
  - Common header + per-op body.
  - Request header: `opcode + payload_len`
  - Response header: `opcode + status + payload_len`
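A sketch of that framing follows; the field widths here are assumptions, and the real layout lives in `src/proto/ipc_proto.*`:

```c
/* Illustrative wire headers for the request/response framing above. */
#include <stdint.h>
#include <string.h>

struct zvfs_req_hdr {            /* Request: opcode + payload_len */
    uint32_t opcode;
    uint32_t payload_len;
};

struct zvfs_resp_hdr {           /* Response: opcode + status + payload_len */
    uint32_t opcode;
    int32_t  status;             /* POSIX-style result / -errno */
    uint32_t payload_len;
};

/* A frame is complete once the header plus payload_len bytes are buffered. */
int zvfs_frame_complete(const uint8_t *buf, size_t have)
{
    if (have < sizeof(struct zvfs_req_hdr))
        return 0;
    struct zvfs_req_hdr h;
    memcpy(&h, buf, sizeof(h));
    return have >= sizeof(h) + h.payload_len;
}
```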
### 2.3 Why Synchronous Blocking IPC

- Lowest compatibility cost on the application side; easiest way to match POSIX semantics.
- More direct debugging (one request maps to one response).
- Get correctness and semantic completeness right first; consider going asynchronous later.

Notes on the current I/O path:

- `blob_read/blob_write` always go through DMA buffers aligned to `io_unit_size`.
- Unaligned writes trigger read-modify-write (RMW): read the aligned block first, patch it, write it back.
- `readv/writev` are coalesced in the hook layer to reduce I/O submissions.
- `fsync/fdatasync` on a zvfs fd call `blob_sync_md`; `sync_file_range` on zvfs paths returns success directly.
---

## 3. Feature Coverage (Current)

### 3.1 Core Interfaces Taken Over

- Control-plane cooperation: `open/openat/creat/rename/unlink/...` (real syscall + ZVFS metadata cooperation)
- Data-plane takeover: `read/write/pread/pwrite/readv/writev/pwritev`
- Metadata: `fstat/lseek/ftruncate/fallocate`
- Sync: `fsync/fdatasync/sync_file_range`
- FD semantics: `dup/dup2/dup3/fork/close_range`

### 3.2 Semantics Notes

- `write` uses `AUTO_GROW` by default.
- Without `AUTO_GROW`, out-of-range writes return `ENOSPC`.
- `O_APPEND` semantics are guaranteed by the inode's `logical_size`.
- After a successful `write`, the Linux-side file size is updated (`ftruncate`) so the `stat` view stays consistent.
- `mmap` on a zvfs fd currently returns `ENOTSUP` (non-zvfs fds pass through).
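The offset/grow decision behind those write semantics can be sketched as a pure helper. `AUTO_GROW` and `logical_size` are the README's concepts; this function and its name are illustrative:

```c
/* Resolve the effective write offset and new logical size for a blob write.
 * Returns 0, or -ENOSPC when growth is needed but AUTO_GROW is off. */
#include <errno.h>
#include <stdint.h>

int zvfs_resolve_write(uint64_t logical_size, uint64_t capacity,
                       uint64_t off, uint64_t len, int o_append, int auto_grow,
                       uint64_t *eff_off, uint64_t *new_size)
{
    uint64_t start = o_append ? logical_size : off;  /* O_APPEND writes at EOF */
    uint64_t end = start + len;
    if (end > capacity && !auto_grow)
        return -ENOSPC;                              /* out-of-range write */
    *eff_off = start;
    *new_size = end > logical_size ? end : logical_size;
    return 0;
}
```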

### 3.3 Mapping

- File data lives in SPDK blobs.
- The file-to-blob mapping goes through the xattr `user.zvfs.blob_id`.

---

## 4. Build and Run

### 4.1 Build

> The commands below assume the repository root is `/home/lian/try/zvfs`.

```bash
cd /home/lian/try/zvfs
git submodule update --init --recursive

cd spdk
./scripts/pkgdep.sh
./configure --with-shared
make -j"$(nproc)"
```

Then build ZVFS and the tests:

```bash
cd /home/lian/try/zvfs
make -j"$(nproc)"
make test -j"$(nproc)"
```
@@ -104,115 +129,158 @@

Artifacts:

- `src/libzvfs.so`
- `src/daemon/zvfs_daemon`
- `tests/bin/*`

### 4.2 Start the Daemon

```bash
cd /home/lian/try/zvfs
./src/daemon/zvfs_daemon
```

Optional environment variables:

- `SPDK_BDEV_NAME`: select the backend bdev (default `Malloc0`).
- `SPDK_JSON_CONFIG`: override the default SPDK JSON config path.
- `ZVFS_SOCKET_PATH` / `ZVFS_IPC_SOCKET_PATH`

### 4.3 Quick Verification

```bash
mkdir -p /zvfs
cd /home/lian/try/zvfs
LD_PRELOAD=./src/libzvfs.so ZVFS_TEST_ROOT=/zvfs ./tests/bin/hook_api_test
./tests/bin/ipc_zvfs_test
```

Coverage includes:

- `open/openat/rename/unlink`
- `read/write/pread/pwrite/readv/writev/pwritev`
- `fstat/lseek/ftruncate`
- `fcntl/ioctl(FIONREAD)`
- `fsync/fdatasync`

---

## 5. Performance Testing

### 5.1 Test Goals

- Target scenario: blocking I/O performance at low queue depth.
- Baselines: `spdk_nvme_perf` and the kernel path (`O_DIRECT`).

### 5.2 Tools and Scripts

- RocksDB: `scripts/run_db_bench_zvfs.sh`
- PostgreSQL: `codex/run_pgbench_no_mmap.sh`

Recommendations:

- Disable the mmap path when testing PostgreSQL (switch shared memory to sysv to keep mmap out of the way).

### 5.3 Historical Results

> The conclusions below come from an older version and are kept to illustrate the design direction. For external reporting, re-measure on the current commit with fixed hardware.

- At QD=1, ZVFS reaches roughly 90%~95% of `spdk_nvme_perf`.
- Against `O_DIRECT` on the same machine, sequential write throughput improves by roughly 2.2x~2.3x.
- Unaligned writes lose significant throughput to RMW overhead.

---

## 6. Key Engineering Problems and Pitfalls (the Important Part)

This section is the most valuable part of the project: it records the key problems hit on the way from "it runs" to "usable under database workloads".

### 6.1 SPDK Metadata Callback Threading Model

Symptom: dispatching metadata operations to arbitrary threads easily stalls, or the callback never comes back.

Root cause:

- Blobstore handles concurrency control internally by running all metadata operations on one thread: callbacks are bound to the thread that created/loaded the blobstore. In a multi-threaded model, whichever thread you send to is not necessarily where the callback can be polled.
- `resize/delete/unload` internally run an `spdk_for_each_channel()` barrier, whose semantics are to execute a callback on every thread that owns an io_channel.

The resulting division of labor:

```text
metadata thread            worker threads
  spdk_bs_load()             blob_io_read
  resize / delete            blob_io_write
  sync_md
```

Fix strategy:

- Separate the metadata thread and the io threads explicitly.
- Make sure every thread holding a channel keeps polling.
- Release channels strictly on thread exit, so a barrier can never wait forever.
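The fix above boils down to funneling every metadata operation onto the one owner thread. Modeled here with a plain pthread message loop rather than real SPDK calls; in ZVFS the equivalent mechanism would be `spdk_thread_send_msg()` to the metadata thread:

```c
/* Owner-thread funnel: any thread submits a metadata op, but it only ever
 * executes (and completes) on the single owner thread. */
#include <pthread.h>

typedef void (*md_fn)(void *);

static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t q_cv = PTHREAD_COND_INITIALIZER;
static md_fn q_fn; static void *q_arg; static int q_done = 1, q_stop;

/* Owner thread: the only place metadata callbacks ever run. */
void *md_owner_loop(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&q_lock);
    while (!q_stop) {
        if (!q_done) { q_fn(q_arg); q_done = 1; pthread_cond_broadcast(&q_cv); }
        else pthread_cond_wait(&q_cv, &q_lock);
    }
    pthread_mutex_unlock(&q_lock);
    return 0;
}

/* Any thread: submit a metadata op and block until the owner ran it. */
void md_submit(md_fn fn, void *arg)
{
    pthread_mutex_lock(&q_lock);
    while (!q_done) pthread_cond_wait(&q_cv, &q_lock);
    q_fn = fn; q_arg = arg; q_done = 0;
    pthread_cond_broadcast(&q_cv);
    while (!q_done) pthread_cond_wait(&q_cv, &q_lock);
    pthread_mutex_unlock(&q_lock);
}

void md_stop(pthread_t t)
{
    pthread_mutex_lock(&q_lock);
    q_stop = 1; pthread_cond_broadcast(&q_cv);
    pthread_mutex_unlock(&q_lock);
    pthread_join(t, 0);
}

/* Demo op used for illustration/testing. */
void md_demo_incr(void *p) { ++*(int *)p; }
```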

Two more behaviors worth remembering:

- Unlike metadata operations, `spdk_blob_io_read/spdk_blob_io_write` completions are delivered via the io_channel passed in: the callback returns to the thread that allocated that channel.
- Operations with timeouts: the callback can still fire after the timeout path has run, so freeing the request context on timeout is a use-after-free risk.

### 6.2 Daemon Stalls (Requests Received, Processing Stuck)

Symptom: request logs stop halfway and the benchmark process blocks.

Root cause:

- The UDS stream reads had no complete framing logic.
- A fixed small buffer made response serialization fail (`serialize resp failed`).

Fix:

- Switch to a per-connection receive buffer and loop reads until `EAGAIN`.
- Consume data only in complete frames; keep partial frames for the next round.
- Serialize responses into a dynamically sized buffer and use `send_all`.
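The fixed receive loop can be sketched as follows. The 4-byte length prefix here is a stand-in for the real header, and the buffer size is an assumption:

```c
/* Per-connection receive buffer: drain the socket until EAGAIN, consume only
 * complete [len][payload] frames, keep the residue for the next round. */
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

struct conn_buf {
    uint8_t data[1 << 16];
    size_t  used;
};

/* Consume complete frames; returns the number handled and memmoves any
 * trailing partial frame to the front of the buffer. */
size_t conn_consume(struct conn_buf *c,
                    void (*handle)(const uint8_t *, uint32_t))
{
    size_t off = 0, frames = 0;
    while (c->used - off >= 4) {
        uint32_t len;
        memcpy(&len, c->data + off, 4);
        if (c->used - off - 4 < len)
            break;                     /* partial frame: keep for next read */
        if (handle)
            handle(c->data + off + 4, len);
        off += 4 + len;
        frames++;
    }
    memmove(c->data, c->data + off, c->used - off);
    c->used -= off;
    return frames;
}

/* Drain a nonblocking socket into the buffer until EAGAIN. */
int conn_fill(struct conn_buf *c, int fd)
{
    for (;;) {
        ssize_t n = read(fd, c->data + c->used, sizeof(c->data) - c->used);
        if (n > 0) { c->used += (size_t)n; continue; }
        if (n == 0) return 0;                          /* peer closed */
        return (errno == EAGAIN || errno == EWOULDBLOCK) ? 1 : -1;
    }
}
```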

### 6.3 PostgreSQL Tablespaces Missing the Hook

Symptom: after creating a tablespace, file operations go through `pg_tblspc/...` paths and the daemon logs no requests.

Root cause:

- PostgreSQL reaches tablespaces through symlinks.
- Matching only on the literal `/zvfs` string prefix misses them.

Fix:

- Run paths through `realpath()` before the prefix check.
- For `O_CREAT` on a file that does not exist yet, check `realpath(parent) + basename`.
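A sketch of that symlink-safe classification, with illustrative helper names:

```c
/* Resolve with realpath() first; for O_CREAT on a not-yet-existing file,
 * resolve the parent directory and re-append the basename. */
#define _GNU_SOURCE
#include <libgen.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Prefix test on an already-resolved path. */
int zvfs_prefix_match(const char *resolved)
{
    return resolved && (strncmp(resolved, "/zvfs/", 6) == 0 ||
                        strcmp(resolved, "/zvfs") == 0);
}

int zvfs_path_is_managed(const char *path)
{
    char resolved[PATH_MAX];
    if (realpath(path, resolved))
        return zvfs_prefix_match(resolved);

    /* O_CREAT case: the file is absent, so resolve the parent instead.
     * dirname()/basename() may modify their argument, hence the copies. */
    char copy1[PATH_MAX], copy2[PATH_MAX], joined[PATH_MAX + 256];
    snprintf(copy1, sizeof(copy1), "%s", path);
    snprintf(copy2, sizeof(copy2), "%s", path);
    if (!realpath(dirname(copy1), resolved))
        return 0;
    snprintf(joined, sizeof(joined), "%s/%s", resolved, basename(copy2));
    return zvfs_prefix_match(joined);
}
```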

### 6.4 PostgreSQL `Permission denied` (Cross-User Daemon Connections)

Symptom: `CREATE DATABASE ... TABLESPACE ...` fails with a permission error.

Root cause:

- The daemon is started by root, and the UDS file mode is subject to umask.
- The postgres user cannot `connect(/tmp/zvfs.sock)`.

Fix:

- After `bind`, the daemon explicitly `chmod`s the socket to 0666.
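The fix is small; a sketch of a daemon-side listener that applies it:

```c
/* Create a listening UDS and loosen its file mode explicitly after bind(),
 * so other users (e.g. postgres) can connect regardless of the daemon's
 * umask. */
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/un.h>
#include <unistd.h>

int zvfs_listen_uds(const char *path)
{
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
    unlink(path);                      /* remove a stale socket file */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        chmod(path, 0666) < 0 ||       /* umask-independent permissions */
        listen(fd, 128) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```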

### 6.5 PostgreSQL `Message too long`

Symptom: some SQL (especially the `CREATE DATABASE` path) fails with `Message too long`.

Root cause:

- It is not the daemon failing to parse: the client overruns `ZVFS_IPC_BUF_SIZE` while serializing the request.
- The hook currently coalesces `writev` into one large write request, which easily hits the cap.

Current mitigation:

- Raise `ZVFS_IPC_BUF_SIZE` to 16MB (`src/common/config.h`).

Future direction:

- Do transparent chunked sends in the client's `blob_write_ex` (keeping synchronous blocking semantics).
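That transparent-chunking idea can be sketched as a client-side loop; the send function is abstracted as a callback, and the names are illustrative:

```c
/* Split one large logical write into requests that each fit under the IPC
 * payload cap, preserving blocking, in-order semantics. */
#include <stdint.h>
#include <sys/types.h>

typedef ssize_t (*ipc_write_fn)(uint64_t off, const void *buf, size_t len);

/* Issue one logical write as N chunked IPC writes; returns bytes written. */
ssize_t zvfs_write_chunked(ipc_write_fn send_one, uint64_t off,
                           const void *buf, size_t len, size_t max_chunk)
{
    const uint8_t *p = buf;
    size_t done = 0;
    while (done < len) {
        size_t n = len - done;
        if (n > max_chunk)
            n = max_chunk;
        ssize_t rc = send_one(off + done, p + done, n);
        if (rc < 0)
            return done ? (ssize_t)done : rc;   /* partial-write semantics */
        done += (size_t)rc;
        if ((size_t)rc < n)
            break;                              /* short write from daemon */
    }
    return (ssize_t)done;
}

/* Demo sink counting calls (for illustration/testing). */
size_t zvfs_demo_calls;
ssize_t zvfs_demo_sink(uint64_t off, const void *buf, size_t len)
{ (void)off; (void)buf; zvfs_demo_calls++; return (ssize_t)len; }
```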

### 6.6 dup/dup2/fork Semantic Consistency

Problem: when several fds point at the same open file description, how do we keep the handle reference count consistent?

Approach:

- Add `ADD_REF` / `ADD_REF_BATCH` to the protocol.
- Explicitly add references in the hook for `dup/dup2/dup3/fork`.
- Guard `close_range` bounds (avoiding an infinite loop in the `UINT_MAX` case).
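A sketch of that bookkeeping, with the IPC modeled as direct counter updates and illustrative names:

```c
/* dup-family hooks bump the remote handle's reference, close() decrements,
 * and the handle is released at zero. */
#include <stdint.h>

struct zvfs_handle { uint32_t refs; int released; };

void handle_add_ref(struct zvfs_handle *h)   /* dup/dup2/dup3/fork */
{ h->refs++; }

void handle_close(struct zvfs_handle *h)     /* close() */
{
    if (h->refs > 0 && --h->refs == 0)
        h->released = 1;                     /* daemon frees the blob handle */
}

/* close_range guard: clamp the end fd to the table size so a UINT_MAX upper
 * bound cannot drive an effectively endless loop. Returns fds to examine. */
unsigned close_range_clamp(unsigned first, unsigned last, unsigned table_size)
{
    if (last >= table_size)
        last = table_size ? table_size - 1 : 0;
    return last >= first ? last - first + 1 : 0;
}
```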

---

## 7. Current Limitations and Next Steps

### 7.1 Current Limitations

- A single request is still bounded by `ZVFS_IPC_BUF_SIZE`.
- `mmap` is not yet supported for zvfs fds.
- `ADD_REF_BATCH` currently favors functionality and does not guarantee atomicity.

### 7.2 Next Steps

1. Implement transparent client-side chunking for `WRITE` to remove the single-message cap entirely.
2. Keep hardening the PostgreSQL scenario (tablespaces + pgbench + crash/restart).
3. Do a more systematic performance re-run (fixed hardware, fixed parameters, full report).
postgresql.conf (new file, 751 lines)

@@ -0,0 +1,751 @@
# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
# name = value
#
# (The "=" is optional.) Whitespace may be used. Comments are introduced with
# "#" anywhere on a line. The complete list of parameter names and allowed
# values can be found in the PostgreSQL documentation.
#
# The commented-out settings shown in this file represent the default values.
# Re-commenting a setting is NOT sufficient to revert it to the default value;
# you need to reload the server.
#
# This file is read on server startup and when the server receives a SIGHUP
# signal. If you edit the file on a running system, you have to SIGHUP the
# server for the changes to take effect, run "pg_ctl reload", or execute
# "SELECT pg_reload_conf()". Some parameters, which are marked below,
# require a server shutdown and restart to take effect.
#
# Any parameter can also be given as a command-line option to the server, e.g.,
# "postgres -c log_connections=on". Some parameters can be changed at run time
# with the "SET" SQL command.
#
# Memory units: B = bytes Time units: us = microseconds
# kB = kilobytes ms = milliseconds
# MB = megabytes s = seconds
# GB = gigabytes min = minutes
# TB = terabytes h = hours
# d = days


#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------

# The default values of these variables are driven from the -D command-line
# option or PGDATA environment variable, represented here as ConfigDir.

#data_directory = 'ConfigDir' # use data in another directory
# (change requires restart)
#hba_file = 'ConfigDir/pg_hba.conf' # host-based authentication file
# (change requires restart)
#ident_file = 'ConfigDir/pg_ident.conf' # ident configuration file
# (change requires restart)

# If external_pid_file is not explicitly set, no extra PID file is written.
#external_pid_file = '' # write an extra PID file
# (change requires restart)


#------------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#------------------------------------------------------------------------------

# - Connection Settings -

#listen_addresses = 'localhost' # what IP address(es) to listen on;
# comma-separated list of addresses;
# defaults to 'localhost'; use '*' for all
# (change requires restart)
#port = 5432 # (change requires restart)
max_connections = 100 # (change requires restart)
#superuser_reserved_connections = 3 # (change requires restart)
#unix_socket_directories = '/var/run/postgresql' # comma-separated list of directories
# (change requires restart)
#unix_socket_group = '' # (change requires restart)
#unix_socket_permissions = 0777 # begin with 0 to use octal notation
# (change requires restart)
#bonjour = off # advertise server via Bonjour
# (change requires restart)
#bonjour_name = '' # defaults to the computer name
# (change requires restart)

# - TCP settings -
# see "man 7 tcp" for details

#tcp_keepalives_idle = 0 # TCP_KEEPIDLE, in seconds;
# 0 selects the system default
#tcp_keepalives_interval = 0 # TCP_KEEPINTVL, in seconds;
# 0 selects the system default
#tcp_keepalives_count = 0 # TCP_KEEPCNT;
# 0 selects the system default
#tcp_user_timeout = 0 # TCP_USER_TIMEOUT, in milliseconds;
# 0 selects the system default

# - Authentication -

#authentication_timeout = 1min # 1s-600s
#password_encryption = md5 # md5 or scram-sha-256
#db_user_namespace = off

# GSSAPI using Kerberos
#krb_server_keyfile = 'FILE:${sysconfdir}/krb5.keytab'
#krb_caseins_users = off

# - SSL -

#ssl = off
#ssl_ca_file = ''
#ssl_cert_file = 'server.crt'
#ssl_crl_file = ''
#ssl_key_file = 'server.key'
#ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' # allowed SSL ciphers
#ssl_prefer_server_ciphers = on
#ssl_ecdh_curve = 'prime256v1'
#ssl_min_protocol_version = 'TLSv1'
#ssl_max_protocol_version = ''
#ssl_dh_params_file = ''
#ssl_passphrase_command = ''
#ssl_passphrase_command_supports_reload = off

#------------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#------------------------------------------------------------------------------

# - Memory -

shared_buffers = 128MB # min 128kB
# (change requires restart)
#huge_pages = try # on, off, or try
# (change requires restart)
#temp_buffers = 8MB # min 800kB
#max_prepared_transactions = 0 # zero disables the feature
# (change requires restart)
# Caution: it is not advisable to set max_prepared_transactions nonzero unless
# you actively intend to use prepared transactions.
#work_mem = 4MB # min 64kB
#maintenance_work_mem = 64MB # min 1MB
#autovacuum_work_mem = -1 # min 1MB, or -1 to use maintenance_work_mem
#max_stack_depth = 2MB # min 100kB
shared_memory_type = sysv # the default is the first option
# supported by the operating system:
# mmap
# sysv
# windows
# (change requires restart)
dynamic_shared_memory_type = sysv # the default is the first option
# supported by the operating system:
# posix
# sysv
# windows
# mmap
# (change requires restart)

# - Disk -

#temp_file_limit = -1 # limits per-process temp file space
# in kB, or -1 for no limit

# - Kernel Resources -

#max_files_per_process = 1000 # min 25
# (change requires restart)

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0 # 0-100 milliseconds (0 disables)
#vacuum_cost_page_hit = 1 # 0-10000 credits
#vacuum_cost_page_miss = 10 # 0-10000 credits
#vacuum_cost_page_dirty = 20 # 0-10000 credits
#vacuum_cost_limit = 200 # 1-10000 credits

# - Background Writer -

#bgwriter_delay = 200ms # 10-10000ms between rounds
#bgwriter_lru_maxpages = 100 # max buffers written/round, 0 disables
#bgwriter_lru_multiplier = 2.0 # 0-10.0 multiplier on buffers scanned/round
#bgwriter_flush_after = 512kB # measured in pages, 0 disables

# - Asynchronous Behavior -

#effective_io_concurrency = 1 # 1-1000; 0 disables prefetching
#max_worker_processes = 8 # (change requires restart)
#max_parallel_maintenance_workers = 2 # limited by max_parallel_workers
#max_parallel_workers_per_gather = 2 # limited by max_parallel_workers
#parallel_leader_participation = on
#max_parallel_workers = 8 # number of max_worker_processes that
# can be used in parallel operations
#old_snapshot_threshold = -1 # 1min-60d; -1 disables; 0 is immediate
# (change requires restart)
#backend_flush_after = 0 # measured in pages, 0 disables

#------------------------------------------------------------------------------
# WRITE-AHEAD LOG
#------------------------------------------------------------------------------

# - Settings -

#wal_level = replica # minimal, replica, or logical
# (change requires restart)
#fsync = on # flush data to disk for crash safety
# (turning this off can cause
# unrecoverable data corruption)
#synchronous_commit = on # synchronization level;
# off, local, remote_write, remote_apply, or on
#wal_sync_method = fsync # the default is the first option
# supported by the operating system:
# open_datasync
# fdatasync (default on Linux and FreeBSD)
# fsync
# fsync_writethrough
# open_sync
#full_page_writes = on # recover from partial page writes
#wal_compression = off # enable compression of full-page writes
#wal_log_hints = off # also do full page writes of non-critical updates
# (change requires restart)
#wal_init_zero = on # zero-fill new WAL files
#wal_recycle = on # recycle WAL files
#wal_buffers = -1 # min 32kB, -1 sets based on shared_buffers
# (change requires restart)
#wal_writer_delay = 200ms # 1-10000 milliseconds
#wal_writer_flush_after = 1MB # measured in pages, 0 disables

#commit_delay = 0 # range 0-100000, in microseconds
#commit_siblings = 5 # range 1-1000

# - Checkpoints -

#checkpoint_timeout = 5min # range 30s-1d
max_wal_size = 1GB
min_wal_size = 80MB
#checkpoint_completion_target = 0.5 # checkpoint target duration, 0.0 - 1.0
#checkpoint_flush_after = 256kB # measured in pages, 0 disables
#checkpoint_warning = 30s # 0 disables

# - Archiving -

#archive_mode = off # enables archiving; off, on, or always
# (change requires restart)
#archive_command = '' # command to use to archive a logfile segment
# placeholders: %p = path of file to archive
# %f = file name only
# e.g. 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
#archive_timeout = 0 # force a logfile segment switch after this
# number of seconds; 0 disables

# - Archive Recovery -

# These are only used in recovery mode.

#restore_command = '' # command to use to restore an archived logfile segment
# placeholders: %p = path of file to restore
# %f = file name only
# e.g. 'cp /mnt/server/archivedir/%f %p'
# (change requires restart)
#archive_cleanup_command = '' # command to execute at every restartpoint
#recovery_end_command = '' # command to execute at completion of recovery

# - Recovery Target -

# Set these only when performing a targeted recovery.

#recovery_target = '' # 'immediate' to end recovery as soon as a
# consistent state is reached
# (change requires restart)
#recovery_target_name = '' # the named restore point to which recovery will proceed
# (change requires restart)
#recovery_target_time = '' # the time stamp up to which recovery will proceed
# (change requires restart)
#recovery_target_xid = '' # the transaction ID up to which recovery will proceed
# (change requires restart)
#recovery_target_lsn = '' # the WAL LSN up to which recovery will proceed
# (change requires restart)
#recovery_target_inclusive = on # Specifies whether to stop:
# just after the specified recovery target (on)
# just before the recovery target (off)
# (change requires restart)
#recovery_target_timeline = 'latest' # 'current', 'latest', or timeline ID
# (change requires restart)
#recovery_target_action = 'pause' # 'pause', 'promote', 'shutdown'
# (change requires restart)


#------------------------------------------------------------------------------
# REPLICATION
#------------------------------------------------------------------------------

# - Sending Servers -

# Set these on the master and on any standby that will send replication data.

#max_wal_senders = 10 # max number of walsender processes
# (change requires restart)
#wal_keep_segments = 0 # in logfile segments; 0 disables
#wal_sender_timeout = 60s # in milliseconds; 0 disables

#max_replication_slots = 10 # max number of replication slots
# (change requires restart)
#track_commit_timestamp = off # collect timestamp of transaction commit
# (change requires restart)

# - Master Server -

# These settings are ignored on a standby server.

#synchronous_standby_names = '' # standby servers that provide sync rep
# method to choose sync standbys, number of sync standbys,
# and comma-separated list of application_name
# from standby(s); '*' = all
#vacuum_defer_cleanup_age = 0 # number of xacts by which cleanup is delayed

# - Standby Servers -

# These settings are ignored on a master server.

#primary_conninfo = '' # connection string to sending server
# (change requires restart)
#primary_slot_name = '' # replication slot on sending server
# (change requires restart)
#promote_trigger_file = '' # file name whose presence ends recovery
#hot_standby = on # "off" disallows queries during recovery
# (change requires restart)
#max_standby_archive_delay = 30s # max delay before canceling queries
# when reading WAL from archive;
# -1 allows indefinite delay
#max_standby_streaming_delay = 30s # max delay before canceling queries
# when reading streaming WAL;
# -1 allows indefinite delay
#wal_receiver_status_interval = 10s # send replies at least this often
# 0 disables
#hot_standby_feedback = off # send info from standby to prevent
# query conflicts
#wal_receiver_timeout = 60s # time that receiver waits for
# communication from master
# in milliseconds; 0 disables
#wal_retrieve_retry_interval = 5s # time to wait before retrying to
# retrieve WAL after a failed attempt
#recovery_min_apply_delay = 0 # minimum delay for applying changes during recovery

# - Subscribers -

# These settings are ignored on a publisher.

#max_logical_replication_workers = 4 # taken from max_worker_processes
# (change requires restart)
#max_sync_workers_per_subscription = 2 # taken from max_logical_replication_workers

#------------------------------------------------------------------------------
# QUERY TUNING
#------------------------------------------------------------------------------

# - Planner Method Configuration -

#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_indexonlyscan = on
#enable_material = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_parallel_append = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on
#enable_partitionwise_join = off
#enable_partitionwise_aggregate = off
#enable_parallel_hash = on
#enable_partition_pruning = on

# - Planner Cost Constants -

#seq_page_cost = 1.0 # measured on an arbitrary scale
#random_page_cost = 4.0 # same scale as above
#cpu_tuple_cost = 0.01 # same scale as above
#cpu_index_tuple_cost = 0.005 # same scale as above
#cpu_operator_cost = 0.0025 # same scale as above
#parallel_tuple_cost = 0.1 # same scale as above
#parallel_setup_cost = 1000.0 # same scale as above

#jit_above_cost = 100000 # perform JIT compilation if available
# and query more expensive than this;
# -1 disables
#jit_inline_above_cost = 500000 # inline small functions if query is
# more expensive than this; -1 disables
#jit_optimize_above_cost = 500000 # use expensive JIT optimizations if
# query is more expensive than this;
# -1 disables

#min_parallel_table_scan_size = 8MB
#min_parallel_index_scan_size = 512kB
#effective_cache_size = 4GB

# - Genetic Query Optimizer -

#geqo = on
#geqo_threshold = 12
#geqo_effort = 5 # range 1-10
#geqo_pool_size = 0 # selects default based on effort
#geqo_generations = 0 # selects default based on effort
#geqo_selection_bias = 2.0 # range 1.5-2.0
#geqo_seed = 0.0 # range 0.0-1.0

# - Other Planner Options -

#default_statistics_target = 100 # range 1-10000
#constraint_exclusion = partition # on, off, or partition
#cursor_tuple_fraction = 0.1 # range 0.0-1.0
#from_collapse_limit = 8
#join_collapse_limit = 8 # 1 disables collapsing of explicit
# JOIN clauses
#force_parallel_mode = off
#jit = on # allow JIT compilation
#plan_cache_mode = auto # auto, force_generic_plan or
# force_custom_plan


#------------------------------------------------------------------------------
# REPORTING AND LOGGING
#------------------------------------------------------------------------------

# - Where to Log -

#log_destination = 'stderr' # Valid values are combinations of
# stderr, csvlog, syslog, and eventlog,
# depending on platform. csvlog
# requires logging_collector to be on.

# This is used when logging to stderr:
#logging_collector = off # Enable capturing of stderr and csvlog
# into log files. Required to be on for
# csvlogs.
# (change requires restart)

# These are only used if logging_collector is on:
#log_directory = 'log' # directory where log files are written,
# can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
# can include strftime() escapes
#log_file_mode = 0600 # creation mode for log files,
# begin with 0 to use octal notation
#log_truncate_on_rotation = off # If on, an existing log file with the
# same name as the new log file will be
# truncated rather than appended to.
# But such truncation only occurs on
# time-driven rotation, not on restarts
# or size-driven rotation. Default is
# off, meaning append to existing files
# in all cases.
#log_rotation_age = 1d # Automatic rotation of logfiles will
# happen after that time. 0 disables.
#log_rotation_size = 10MB # Automatic rotation of logfiles will
# happen after that much log output.
# 0 disables.

# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'
#syslog_sequence_numbers = on
#syslog_split_messages = on

# This is only relevant when logging to eventlog (win32):
# (change requires restart)
#event_source = 'PostgreSQL'

# - When to Log -

#log_min_messages = warning # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# info
# notice
# warning
# error
# log
# fatal
# panic

#log_min_error_statement = error # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# info
# notice
# warning
# error
# log
# fatal
# panic (effectively off)

#log_min_duration_statement = -1 # -1 is disabled, 0 logs all statements
# and their durations, > 0 logs only
# statements running at least this number
# of milliseconds

#log_transaction_sample_rate = 0.0 # Fraction of transactions whose statements
# are logged regardless of their duration. 1.0 logs all
# statements from all transactions, 0.0 never logs.

# - What to Log -

#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = on
#log_checkpoints = off
#log_connections = off
#log_disconnections = off
#log_duration = off
#log_error_verbosity = default # terse, default, or verbose messages
#log_hostname = off
#log_line_prefix = '%m [%p] ' # special values:
# %a = application name
# %u = user name
# %d = database name
# %r = remote host and port
# %h = remote host
# %p = process ID
# %t = timestamp without milliseconds
# %m = timestamp with milliseconds
# %n = timestamp with milliseconds (as a Unix epoch)
# %i = command tag
# %e = SQL state
# %c = session ID
# %l = session line number
# %s = session start timestamp
# %v = virtual transaction ID
# %x = transaction ID (0 if none)
# %q = stop here in non-session
# processes
# %% = '%'
# e.g. '<%u%%%d> '
#log_lock_waits = off # log lock waits >= deadlock_timeout
#log_statement = 'none' # none, ddl, mod, all
|
||||
#log_replication_commands = off
|
||||
#log_temp_files = -1 # log temporary files equal or larger
|
||||
# than the specified size in kilobytes;
|
||||
# -1 disables, 0 logs all temp files
|
||||
log_timezone = 'Etc/UTC'
|
||||
|
||||
#------------------------------------------------------------------------------
|
||||
# PROCESS TITLE
|
||||
#------------------------------------------------------------------------------
|
||||
|
||||
#cluster_name = '' # added to process titles if nonempty
|
||||
# (change requires restart)
|
||||
#update_process_title = on
|
||||
|
||||
|
||||
#------------------------------------------------------------------------------
|
||||
# STATISTICS
|
||||
#------------------------------------------------------------------------------
|
||||
|
||||
# - Query and Index Statistics Collector -
|
||||
|
||||
#track_activities = on
|
||||
#track_counts = on
|
||||
#track_io_timing = off
|
||||
#track_functions = none # none, pl, all
|
||||
#track_activity_query_size = 1024 # (change requires restart)
|
||||
#stats_temp_directory = 'pg_stat_tmp'
|
||||
|
||||
|
||||
# - Monitoring -
|
||||
|
||||
#log_parser_stats = off
|
||||
#log_planner_stats = off
|
||||
#log_executor_stats = off
|
||||
#log_statement_stats = off
|
||||
|
||||
|
||||
#------------------------------------------------------------------------------
|
||||
# AUTOVACUUM
|
||||
#------------------------------------------------------------------------------
|
||||
|
||||
#autovacuum = on # Enable autovacuum subprocess? 'on'
|
||||
# requires track_counts to also be on.
|
||||
#log_autovacuum_min_duration = -1 # -1 disables, 0 logs all actions and
|
||||
# their durations, > 0 logs only
|
||||
# actions running at least this number
|
||||
# of milliseconds.
|
||||
#autovacuum_max_workers = 3 # max number of autovacuum subprocesses
|
||||
# (change requires restart)
|
||||
#autovacuum_naptime = 1min # time between autovacuum runs
|
||||
#autovacuum_vacuum_threshold = 50 # min number of row updates before
|
||||
# vacuum
|
||||
#autovacuum_analyze_threshold = 50 # min number of row updates before
|
||||
# analyze
|
||||
#autovacuum_vacuum_scale_factor = 0.2 # fraction of table size before vacuum
|
||||
#autovacuum_analyze_scale_factor = 0.1 # fraction of table size before analyze
|
||||
#autovacuum_freeze_max_age = 200000000 # maximum XID age before forced vacuum
|
||||
# (change requires restart)
|
||||
#autovacuum_multixact_freeze_max_age = 400000000 # maximum multixact age
|
||||
# before forced vacuum
|
||||
# (change requires restart)
|
||||
#autovacuum_vacuum_cost_delay = 2ms # default vacuum cost delay for
|
||||
# autovacuum, in milliseconds;
|
||||
# -1 means use vacuum_cost_delay
|
||||
#autovacuum_vacuum_cost_limit = -1 # default vacuum cost limit for
|
||||
# autovacuum, -1 means use
|
||||
# vacuum_cost_limit
|
||||
|
||||
|
||||
#------------------------------------------------------------------------------
|
||||
# CLIENT CONNECTION DEFAULTS
|
||||
#------------------------------------------------------------------------------
|
||||
|
||||
# - Statement Behavior -
|
||||
|
||||
#client_min_messages = notice # values in order of decreasing detail:
|
||||
# debug5
|
||||
# debug4
|
||||
# debug3
|
||||
# debug2
|
||||
# debug1
|
||||
# log
|
||||
# notice
|
||||
# warning
|
||||
# error
|
||||
#search_path = '"$user", public' # schema names
|
||||
#row_security = on
|
||||
#default_tablespace = '' # a tablespace name, '' uses the default
|
||||
#temp_tablespaces = '' # a list of tablespace names, '' uses
|
||||
# only default tablespace
|
||||
#default_table_access_method = 'heap'
|
||||
#check_function_bodies = on
|
||||
#default_transaction_isolation = 'read committed'
|
||||
#default_transaction_read_only = off
|
||||
#default_transaction_deferrable = off
|
||||
#session_replication_role = 'origin'
|
||||
#statement_timeout = 0 # in milliseconds, 0 is disabled
|
||||
#lock_timeout = 0 # in milliseconds, 0 is disabled
|
||||
#idle_in_transaction_session_timeout = 0 # in milliseconds, 0 is disabled
|
||||
#vacuum_freeze_min_age = 50000000
|
||||
#vacuum_freeze_table_age = 150000000
|
||||
#vacuum_multixact_freeze_min_age = 5000000
|
||||
#vacuum_multixact_freeze_table_age = 150000000
|
||||
#vacuum_cleanup_index_scale_factor = 0.1 # fraction of total number of tuples
|
||||
# before index cleanup, 0 always performs
|
||||
# index cleanup
|
||||
#bytea_output = 'hex' # hex, escape
|
||||
#xmlbinary = 'base64'
|
||||
#xmloption = 'content'
|
||||
#gin_fuzzy_search_limit = 0
|
||||
#gin_pending_list_limit = 4MB
|
||||
|
||||
# - Locale and Formatting -
|
||||
|
||||
datestyle = 'iso, mdy'
|
||||
#intervalstyle = 'postgres'
|
||||
timezone = 'Etc/UTC'
|
||||
#timezone_abbreviations = 'Default' # Select the set of available time zone
|
||||
# abbreviations. Currently, there are
|
||||
# Default
|
||||
# Australia (historical usage)
|
||||
# India
|
||||
# You can create your own file in
|
||||
# share/timezonesets/.
|
||||
#extra_float_digits = 1 # min -15, max 3; any value >0 actually
|
||||
# selects precise output mode
|
||||
#client_encoding = sql_ascii # actually, defaults to database
|
||||
# encoding
|
||||
|
||||
# These settings are initialized by initdb, but they can be changed.
|
||||
lc_messages = 'en_US.UTF-8' # locale for system error message
|
||||
# strings
|
||||
lc_monetary = 'en_US.UTF-8' # locale for monetary formatting
|
||||
lc_numeric = 'en_US.UTF-8' # locale for number formatting
|
||||
lc_time = 'en_US.UTF-8' # locale for time formatting
|
||||
|
||||
# default configuration for text search
|
||||
default_text_search_config = 'pg_catalog.english'
|
||||
|
||||
# - Shared Library Preloading -
|
||||
|
||||
#shared_preload_libraries = '' # (change requires restart)
|
||||
#local_preload_libraries = ''
|
||||
#session_preload_libraries = ''
|
||||
#jit_provider = 'llvmjit' # JIT library to use
|
||||
|
||||
# - Other Defaults -
|
||||
|
||||
#dynamic_library_path = '$libdir'
|
||||
|
||||
|
||||
#------------------------------------------------------------------------------
|
||||
# LOCK MANAGEMENT
|
||||
#------------------------------------------------------------------------------
|
||||
|
||||
#deadlock_timeout = 1s
|
||||
#max_locks_per_transaction = 64 # min 10
|
||||
# (change requires restart)
|
||||
#max_pred_locks_per_transaction = 64 # min 10
|
||||
# (change requires restart)
|
||||
#max_pred_locks_per_relation = -2 # negative values mean
|
||||
# (max_pred_locks_per_transaction
|
||||
# / -max_pred_locks_per_relation) - 1
|
||||
#max_pred_locks_per_page = 2 # min 0
|
||||
|
||||
|
||||
#------------------------------------------------------------------------------
|
||||
# VERSION AND PLATFORM COMPATIBILITY
|
||||
#------------------------------------------------------------------------------
|
||||
|
||||
# - Previous PostgreSQL Versions -
|
||||
|
||||
#array_nulls = on
|
||||
#backslash_quote = safe_encoding # on, off, or safe_encoding
|
||||
#escape_string_warning = on
|
||||
#lo_compat_privileges = off
|
||||
#operator_precedence_warning = off
|
||||
#quote_all_identifiers = off
|
||||
#standard_conforming_strings = on
|
||||
#synchronize_seqscans = on
|
||||
|
||||
# - Other Platforms and Clients -
|
||||
|
||||
#transform_null_equals = off
|
||||
|
||||
|
||||
#------------------------------------------------------------------------------
|
||||
# ERROR HANDLING
|
||||
#------------------------------------------------------------------------------
|
||||
|
||||
#exit_on_error = off # terminate session on any error?
|
||||
#restart_after_crash = on # reinitialize after backend crash?
|
||||
#data_sync_retry = off # retry or panic on failure to fsync
|
||||
# data?
|
||||
# (change requires restart)
|
||||
|
||||
|
||||
#------------------------------------------------------------------------------
|
||||
# CONFIG FILE INCLUDES
|
||||
#------------------------------------------------------------------------------
|
||||
|
||||
# These options allow settings to be loaded from files other than the
|
||||
# default postgresql.conf. Note that these are directives, not variable
|
||||
# assignments, so they can usefully be given more than once.
|
||||
|
||||
#include_dir = '...' # include files ending in '.conf' from
|
||||
# a directory, e.g., 'conf.d'
|
||||
#include_if_exists = '...' # include file only if it exists
|
||||
#include = '...' # include file
|
||||
|
||||
|
||||
#------------------------------------------------------------------------------
|
||||
# CUSTOMIZED OPTIONS
|
||||
#------------------------------------------------------------------------------
|
||||
|
||||
# Add settings for extensions here
|
||||
77  scripts/do_pgbecnh.md  Normal file
@@ -0,0 +1,77 @@
```shell
# 1. Install PostgreSQL and pgbench

sudo apt-get update
sudo apt-get install -y postgresql postgresql-contrib

# 2. Locate postgresql.conf (on Ubuntu it is usually here)

ls /etc/postgresql/*/main/postgresql.conf

# 3. Disable mmap shared memory (edit postgresql.conf)

shared_memory_type = sysv
dynamic_shared_memory_type = sysv

# 4. Restart PostgreSQL

sudo systemctl stop postgresql
rm -rf /home/lian/pg/pgdata
rm -rf /zvfs/pg_ts_bench

sudo chown -R postgres:postgres /home/lian/pg
sudo -u postgres mkdir -p /home/lian/pg/pgdata
sudo chown -R postgres:postgres /home/lian/pg/pgdata

sudo -u postgres env LD_PRELOAD=/home/lian/try/zvfs/src/libzvfs.so \
    /usr/lib/postgresql/12/bin/initdb -D /home/lian/pg/pgdata

cp ./postgresql.conf /home/lian/pg/pgdata/

sudo -u postgres env LD_PRELOAD=/home/lian/try/zvfs/src/libzvfs.so \
    /usr/lib/postgresql/12/bin/pg_ctl -D /home/lian/pg/pgdata -l /tmp/pg.log start

sudo -u postgres env LD_PRELOAD=/home/lian/try/zvfs/src/libzvfs.so \
    /usr/lib/postgresql/12/bin/psql

sudo -u postgres env LD_PRELOAD=/home/lian/try/zvfs/src/libzvfs.so \
    /usr/lib/postgresql/12/bin/pg_ctl -D /home/lian/pg/pgdata -l /tmp/pg.log restart

# Create the test environment
sudo -u postgres mkdir -p /zvfs/pg_ts_bench
sudo chown -R postgres:postgres /zvfs/pg_ts_bench
sudo chmod 700 /zvfs/pg_ts_bench

CREATE TABLESPACE zvfs_ts LOCATION '/zvfs/pg_ts_bench';
DROP DATABASE IF EXISTS benchdb;
CREATE DATABASE benchdb TABLESPACE zvfs_ts;

DROP TABLE IF EXISTS hook_probe;
CREATE TABLE hook_probe(id int) TABLESPACE zvfs_ts;
INSERT INTO hook_probe VALUES (1);
INSERT INTO hook_probe VALUES (2);
INSERT INTO hook_probe VALUES (3);
INSERT INTO hook_probe VALUES (4);
SELECT * FROM hook_probe;
DELETE FROM hook_probe WHERE id = 1;
UPDATE hook_probe SET id = 11 WHERE id = 2;
SELECT * FROM hook_probe;


# 5. Verify the configuration took effect
pid=$(pgrep -u postgres -xo postgres)
echo "pid=$pid"
sudo grep libzvfs /proc/$pid/maps

sudo -u postgres psql -p 5432 -c "show data_directory;"
sudo -u postgres psql -c "SHOW shared_memory_type;"
sudo -u postgres psql -c "SHOW dynamic_shared_memory_type;"

# 6. Create the test database (if not created yet)

sudo -u postgres createdb benchdb

# 7. Run your bench script

bash /home/lian/try/zvfs/scripts/run_pgbench_no_mmap.sh
```
@@ -21,7 +21,7 @@ BENCHMARKS="fillrandom,readrandom"

# number of keys
# NUM=1000000
NUM=50000
NUM=500

# number of threads
THREADS=2
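For context around this parameter change, a hedged sketch of the db_bench command these variables presumably feed (`--benchmarks`, `--num`, and `--threads` are standard RocksDB db_bench flags; the exact invocation used by the script is an assumption, so only the command string is built here):

```shell
# Hypothetical: how the variables above would typically be consumed.
BENCHMARKS="fillrandom,readrandom"
NUM=500
THREADS=2
cmd="./db_bench --benchmarks=${BENCHMARKS} --num=${NUM} --threads=${THREADS}"
echo "${cmd}"
# → ./db_bench --benchmarks=fillrandom,readrandom --num=500 --threads=2
```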
91  scripts/run_pgbench_no_mmap.sh  Executable file
@@ -0,0 +1,91 @@
#!/usr/bin/env bash
set -euo pipefail

# pgbench-only runner (does not install PostgreSQL, run initdb, start/stop the
# service, or change any configuration).
#
# Prerequisites:
#   1) PostgreSQL is already running.
#   2) The test database already exists (default: benchdb).
#   3) PostgreSQL has been configured externally to disable mmap shared memory:
#        shared_memory_type = sysv
#        dynamic_shared_memory_type = sysv
#
# About Malloc0:
#   - The current backend is an in-memory virtual device with limited capacity.
#   - The defaults are deliberately small to avoid loading too much data at once.
#
# About LD_PRELOAD:
#   - USE_LD_PRELOAD_INIT=1: enable LD_PRELOAD during initialization (pgbench -i)
#   - USE_LD_PRELOAD_RUN=1 : enable LD_PRELOAD during the benchmark run
#   - Set either to 0 to disable LD_PRELOAD for that phase
#
# Usage:
#   bash codex/run_pgbench_no_mmap.sh
#
# Optional environment variables:
#   PG_HOST=127.0.0.1
#     PostgreSQL server address.
#   PG_PORT=5432
#     PostgreSQL server port (default changed to 5432).
#   PG_DB=benchdb
#     Benchmark database name.
#   PG_SCALE=2
#     pgbench initialization scale factor (-s); larger means more initial data.
#   PG_TIME=20
#     Benchmark duration in seconds (pgbench -T).
#   PG_CLIENTS=2
#     Number of concurrent clients (pgbench -c).
#   PG_JOBS=2
#     Number of worker threads (pgbench -j).
#   PG_SUPERUSER=postgres
#     System user that runs pgbench (usually postgres).
#   LD_PRELOAD_PATH=/home/lian/try/zvfs/src/libzvfs.so
#     Path to the LD_PRELOAD target library (the zvfs hook .so).
#   PG_BIN_DIR=/usr/lib/postgresql/16/bin
#     Directory containing pgbench; if unset, it is looked up from PATH.
#   USE_LD_PRELOAD_INIT=1
#     Enable LD_PRELOAD during initialization (pgbench -i): 1=on, 0=off.
#   USE_LD_PRELOAD_RUN=1
#     Enable LD_PRELOAD during the benchmark run: 1=on, 0=off.

PG_HOST="${PG_HOST:-127.0.0.1}"
PG_PORT="${PG_PORT:-5432}"
PG_DB="${PG_DB:-benchdb}"
PG_SCALE="${PG_SCALE:-2}"
PG_TIME="${PG_TIME:-20}"
PG_CLIENTS="${PG_CLIENTS:-2}"
PG_JOBS="${PG_JOBS:-2}"
PG_SUPERUSER="${PG_SUPERUSER:-postgres}"
LD_PRELOAD_PATH="${LD_PRELOAD_PATH:-/home/lian/try/zvfs/src/libzvfs.so}"
PG_BIN_DIR="${PG_BIN_DIR:-$(dirname "$(command -v pgbench 2>/dev/null || true)")}"
USE_LD_PRELOAD_INIT="${USE_LD_PRELOAD_INIT:-1}"
USE_LD_PRELOAD_RUN="${USE_LD_PRELOAD_RUN:-1}"

if [[ -z "${PG_BIN_DIR}" || ! -x "${PG_BIN_DIR}/pgbench" ]]; then
    echo "pgbench not found; set PG_BIN_DIR or put pgbench on PATH." >&2
    exit 1
fi

run_pgbench_cmd() {
    local use_preload="$1"
    shift
    if [[ "${use_preload}" == "1" ]]; then
        sudo -u "${PG_SUPERUSER}" env LD_PRELOAD="${LD_PRELOAD_PATH}" "$@"
    else
        sudo -u "${PG_SUPERUSER}" "$@"
    fi
}

echo "Current parameters:"
echo "  host=${PG_HOST} port=${PG_PORT} db=${PG_DB}"
echo "  scale=${PG_SCALE} clients=${PG_CLIENTS} jobs=${PG_JOBS} time=${PG_TIME}s"
echo "  preload_init=${USE_LD_PRELOAD_INIT} preload_run=${USE_LD_PRELOAD_RUN}"

echo "[1/2] Initialize data (pgbench -i)"
run_pgbench_cmd "${USE_LD_PRELOAD_INIT}" \
    "${PG_BIN_DIR}/pgbench" -h "${PG_HOST}" -p "${PG_PORT}" -i -s "${PG_SCALE}" "${PG_DB}"

echo "[2/2] Run benchmark (pgbench -T)"
run_pgbench_cmd "${USE_LD_PRELOAD_RUN}" \
    "${PG_BIN_DIR}/pgbench" -h "${PG_HOST}" -p "${PG_PORT}" \
    -c "${PG_CLIENTS}" -j "${PG_JOBS}" -T "${PG_TIME}" -P 5 "${PG_DB}"
4  scripts/search_libzvfs.sh  Executable file
@@ -0,0 +1,4 @@
pgrep -u postgres -x postgres | while read p; do
    echo "PID=$p"
    sudo grep -m1 libzvfs /proc/$p/maps || echo "  (no libzvfs)"
done
30  src/Makefile
@@ -6,7 +6,6 @@
|
||||
SPDK_ROOT_DIR := $(abspath $(CURDIR)/../spdk)
|
||||
include $(SPDK_ROOT_DIR)/mk/spdk.common.mk
|
||||
include $(SPDK_ROOT_DIR)/mk/spdk.modules.mk
|
||||
include $(SPDK_ROOT_DIR)/mk/spdk.app_vars.mk
|
||||
|
||||
LIBZVFS := libzvfs.so
|
||||
|
||||
@@ -18,6 +17,7 @@ C_SRCS := \
|
||||
fs/zvfs_path_entry.c \
|
||||
fs/zvfs_open_file.c \
|
||||
fs/zvfs_sys_init.c \
|
||||
proto/ipc_proto.c \
|
||||
hook/zvfs_hook_init.c \
|
||||
hook/zvfs_hook_fd.c \
|
||||
hook/zvfs_hook_rw.c \
|
||||
@@ -28,24 +28,40 @@ C_SRCS := \
|
||||
hook/zvfs_hook_dir.c \
|
||||
hook/zvfs_hook_mmap.c \
|
||||
|
||||
# 指定头文件搜索路径
|
||||
CFLAGS += -I$(abspath $(CURDIR)) -fPIC
|
||||
|
||||
# SPDK 库依赖
|
||||
SPDK_LIB_LIST = $(ALL_MODULES_LIST) event event_bdev
|
||||
|
||||
LIBS += $(SPDK_LIB_LINKER_ARGS)
|
||||
CFLAGS += -I$(abspath $(CURDIR))
|
||||
LDFLAGS += -shared -rdynamic -Wl,-z,nodelete -Wl,--disable-new-dtags \
|
||||
# 链接选项
|
||||
LDFLAGS += -shared -Wl,-soname,$(LIBZVFS) -Wl,-z,nodelete \
|
||||
-Wl,--disable-new-dtags \
|
||||
-Wl,-rpath,$(SPDK_ROOT_DIR)/build/lib \
|
||||
-Wl,-rpath,$(SPDK_ROOT_DIR)/dpdk/build/lib
|
||||
|
||||
# 系统库
|
||||
SYS_LIBS += -ldl
|
||||
|
||||
# 获取 SPDK 库的链接参数
|
||||
SPDK_LIBS = $(call spdk_lib_list_to_linker_args,$(SPDK_LIB_LIST))
|
||||
|
||||
DEPS = $(OBJS:.o=.d)
|
||||
|
||||
all: $(LIBZVFS)
|
||||
@:
|
||||
$(MAKE) -C daemon
|
||||
|
||||
$(LIBZVFS): $(OBJS) $(SPDK_LIB_FILES) $(ENV_LIBS)
|
||||
$(LINK_C)
|
||||
# 构建目标文件
|
||||
$(OBJDIR)/%.o: %.c
|
||||
$(CC) $(CFLAGS) -c $< -o $@
|
||||
|
||||
# 构建共享库
|
||||
$(LIBZVFS): $(OBJS)
|
||||
$(CC) $(LDFLAGS) -o $@ $^ $(SPDK_LIBS) $(SYS_LIBS)
|
||||
|
||||
clean:
|
||||
$(CLEAN_C) $(LIBZVFS)
|
||||
rm -f $(DEPS) $(OBJS) $(LIBZVFS)
|
||||
$(MAKE) -C daemon clean
|
||||
|
||||
include $(SPDK_ROOT_DIR)/mk/spdk.deps.mk
|
||||
|
||||
@@ -1,33 +1,20 @@
#ifndef __ZVFS_CONFIG_H__
#define __ZVFS_CONFIG_H__

/**
 * ZVFS
 */

#define ZVFS_XATTR_BLOB_ID "user.zvfs.blob_id"

/**
 * SPDK
 */

// dev
#define SPDK_JSON_PATH "/home/lian/try/zvfs/src/zvfsmalloc.json"
// #define ZVFS_BDEV "Nvme0n1"
#ifndef ZVFS_BDEV
#define ZVFS_BDEV "Malloc0"
#endif

// super blob
#define ZVFS_SB_MAGIC UINT64_C(0x5A5646535F534200) /* "ZVFS_SB\0" */
#define ZVFS_SB_VERSION UINT32_C(1)

// dma
#define ZVFS_DMA_BUF_SIZE (1024 * 1024)

// waiter
#define WAITER_MAX_TIME 10000000
#define ZVFS_DMA_BUF_SIZE (1024 * 1024)
#define ZVFS_WAIT_TIME 5000ULL


#define ZVFS_IPC_DEFAULT_SOCKET_PATH "/tmp/zvfs.sock"
// #define ZVFS_IPC_BUF_SIZE 4096
#define ZVFS_IPC_BUF_SIZE (16 * 1024 * 1024)

#endif // __ZVFS_CONFIG_H__
@@ -50,44 +50,3 @@ int zvfs_calc_ceil_units(uint64_t bytes,
}
return 0;
}

int buf_init(zvfs_buf_t *b, size_t initial)
{
    b->data = malloc(initial);
    if (!b->data) return -1;
    b->cap = initial;
    b->len = 0;
    return 0;
}

void buf_free(zvfs_buf_t *b)
{
    free(b->data);
    b->data = NULL;
    b->len = b->cap = 0;
}

/*
 * Ensure the buffer has room for `need` more bytes; if not, grow the
 * capacity by doubling via realloc.
 */
int buf_reserve(zvfs_buf_t *b, size_t need)
{
    if (b->len + need <= b->cap) return 0;

    size_t new_cap = b->cap * 2;
    while (new_cap < b->len + need) new_cap *= 2;

    uint8_t *p = realloc(b->data, new_cap);
    if (!p) return -1;
    b->data = p;
    b->cap = new_cap;
    return 0;
}

int buf_append(zvfs_buf_t *b, const void *src, size_t n)
{
    if (buf_reserve(b, n) != 0) return -1;
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}

@@ -15,15 +15,4 @@ int zvfs_calc_ceil_units(uint64_t bytes,
uint64_t unit_size,
uint64_t *units_out);

typedef struct {
    uint8_t *data;
    size_t cap;
    size_t len;
} zvfs_buf_t;

int buf_init(zvfs_buf_t *b, size_t initial);
void buf_free(zvfs_buf_t *b);
int buf_reserve(zvfs_buf_t *b, size_t need);
int buf_append(zvfs_buf_t *b, const void *src, size_t n);

#endif // __ZVFS_COMMON_UTILS_H__
20  src/daemon/Makefile  Normal file
@@ -0,0 +1,20 @@
# SPDX-License-Identifier: BSD-3-Clause
# Copyright (C) 2017 Intel Corporation
# All rights reserved.
#

SPDK_ROOT_DIR := $(abspath $(CURDIR)/../../spdk)
PROTO_DIR := $(abspath $(CURDIR)/../proto)
COMMON_DIR := $(abspath $(CURDIR)/../common)
include $(SPDK_ROOT_DIR)/mk/spdk.common.mk
include $(SPDK_ROOT_DIR)/mk/spdk.modules.mk

APP = zvfs_daemon

CFLAGS += -I$(abspath $(CURDIR)/..)

C_SRCS := main.c ipc_cq.c ipc_reactor.c spdk_engine.c spdk_engine_wrapper.c $(PROTO_DIR)/ipc_proto.c $(COMMON_DIR)/utils.c

SPDK_LIB_LIST = $(ALL_MODULES_LIST) event event_bdev

include $(SPDK_ROOT_DIR)/mk/spdk.app.mk
61  src/daemon/ipc_cq.c  Normal file
@@ -0,0 +1,61 @@
#include "ipc_cq.h"
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

struct cq *g_cq;

struct cq *CQ_Create(void) {
    struct cq *q = (struct cq*)malloc(sizeof(*q));
    if (!q) return NULL;
    q->head = q->tail = NULL;
    pthread_mutex_init(&q->lock, NULL);
    return q;
}

void CQ_Destroy(struct cq *q) {
    while (q->head) {
        struct cq_item *tmp = q->head;
        q->head = tmp->next;
        free(tmp->resp->data); // free the resp payload, if any
        free(tmp->resp);
        free(tmp);
    }
    pthread_mutex_destroy(&q->lock);
    free(q);
}

/* Push a response */
void CQ_Push(struct cq *q, struct zvfs_resp *resp) {
    struct cq_item *item = (struct cq_item *)malloc(sizeof(*item));
    if (!item) return; /* drop the response on allocation failure */
    item->resp = resp;
    item->next = NULL;

    pthread_mutex_lock(&q->lock);
    if (q->tail) {
        q->tail->next = item;
        q->tail = item;
    } else {
        q->head = q->tail = item;
    }
    pthread_mutex_unlock(&q->lock);
}

/* Pop a response */
struct zvfs_resp *CQ_Pop(struct cq *q) {
    pthread_mutex_lock(&q->lock);
    struct cq_item *item = q->head;
    if (!item) {
        pthread_mutex_unlock(&q->lock);
        return NULL;
    }
    q->head = item->next;
    if (!q->head) q->tail = NULL;
    pthread_mutex_unlock(&q->lock);

    struct zvfs_resp *resp = item->resp;
    free(item);
    return resp;
}
26  src/daemon/ipc_cq.h  Normal file
@@ -0,0 +1,26 @@
#ifndef __ZVFS_IPC_CQ_H__
#define __ZVFS_IPC_CQ_H__

#include "proto/ipc_proto.h"
#include <pthread.h>


struct cq_item {
    struct zvfs_resp *resp;
    struct cq_item *next;
};

struct cq {
    struct cq_item *head;
    struct cq_item *tail;
    pthread_mutex_t lock;
};

struct cq *CQ_Create(void);
void CQ_Destroy(struct cq *q);
void CQ_Push(struct cq *q, struct zvfs_resp *resp);
struct zvfs_resp *CQ_Pop(struct cq *q);

extern struct cq *g_cq;

#endif
309
src/daemon/ipc_reactor.c
Normal file
309
src/daemon/ipc_reactor.c
Normal file
@@ -0,0 +1,309 @@
|
||||
#include "ipc_reactor.h"
|
||||
#include "ipc_cq.h"
|
||||
#include "common/config.h"
|
||||
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
#include <unistd.h>
|
||||
#include <errno.h>
|
||||
#include <fcntl.h>
|
||||
#include <stdio.h>
|
||||
#include <sys/socket.h>
|
||||
#include <sys/un.h>
|
||||
#include <sys/epoll.h>
|
||||
#include <sys/stat.h>
|
||||
#include <stdint.h>
|
||||
|
||||
static int send_all(int fd, const uint8_t *buf, size_t len) {
|
||||
size_t off = 0;
|
||||
|
||||
while (off < len) {
|
||||
ssize_t sent = send(fd, buf + off, len - off, 0);
|
||||
if (sent > 0) {
|
||||
off += (size_t)sent;
|
||||
continue;
|
||||
}
|
||||
if (sent < 0 && errno == EINTR) {
|
||||
continue;
|
||||
}
|
||||
if (sent < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
|
||||
/* 当前实现优先功能,等待对端可写后重试。 */
|
||||
usleep(100);
|
||||
continue;
|
||||
}
|
||||
return -1;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
/** ====================================================== */
|
||||
/* CQ OP */
|
||||
/** ====================================================== */
|
||||
static void cq_consume_send(struct cq *q) {
|
||||
struct zvfs_resp *resp;
|
||||
while ((resp = CQ_Pop(q)) != NULL) {
|
||||
struct zvfs_conn *conn = resp->conn;
|
||||
size_t cap = ZVFS_IPC_BUF_SIZE;
|
||||
uint8_t *buf = NULL;
|
||||
|
||||
// printf("[resp][%s]\n",cast_opcode2string(resp->opcode));
|
||||
|
||||
buf = malloc(cap);
|
||||
if (!buf) {
|
||||
fprintf(stderr, "serialize resp failed: alloc %zu bytes\n", cap);
|
||||
free(resp->data);
|
||||
free(resp);
|
||||
continue;
|
||||
}
|
||||
|
||||
size_t n = zvfs_serialize_resp(resp, buf, cap);
|
||||
if (n == 0 && resp->status == 0 && resp->opcode == ZVFS_OP_READ) {
|
||||
if (resp->length <= SIZE_MAX - 64) {
|
||||
size_t need = (size_t)resp->length + 64;
|
||||
uint8_t *bigger = realloc(buf, need);
|
||||
if (bigger) {
|
||||
buf = bigger;
|
||||
cap = need;
|
||||
n = zvfs_serialize_resp(resp, buf, cap);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
if (n == 0) {
|
||||
fprintf(stderr, "serialize resp failed: op=%u status=%d len=%lu cap=%zu\n",
|
||||
resp->opcode, resp->status, resp->length, cap);
|
||||
free(buf);
|
||||
free(resp->data);
|
||||
free(resp);
|
||||
continue;
|
||||
}
|
||||
|
||||
if (send_all(conn->fd, buf, n) != 0) {
|
||||
perror("send");
|
||||
free(buf);
|
||||
free(resp->data);
|
||||
free(resp);
|
||||
continue;
|
||||
}
|
||||
free(buf);
|
||||
|
||||
// 清理
|
||||
if(resp->data) free(resp->data);
|
||||
free(resp);
|
||||
}
|
||||
}
|
||||
|
||||
static int set_nonblock(int fd){
|
||||
int flags = fcntl(fd, F_GETFL, 0);
|
||||
if (flags < 0)
|
||||
return -1;
|
||||
|
||||
return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
|
||||
}
|
||||
|
||||
static void epoll_add(struct zvfs_reactor *r, int fd, void *ptr, uint32_t events)
|
||||
{
|
||||
struct epoll_event ev;
|
||||
|
||||
memset(&ev, 0, sizeof(ev));
|
||||
ev.events = events;
|
||||
ev.data.ptr = ptr;
|
||||
|
||||
epoll_ctl(r->epfd, EPOLL_CTL_ADD, fd, &ev);
|
||||
}
|
||||
|
||||
static void epoll_mod(struct zvfs_reactor *r, int fd, void *ptr, uint32_t events){
|
||||
struct epoll_event ev;
|
||||
|
||||
memset(&ev, 0, sizeof(ev));
|
||||
ev.events = events;
|
||||
ev.data.ptr = ptr;
|
||||
|
||||
epoll_ctl(r->epfd, EPOLL_CTL_MOD, fd, &ev);
|
||||
}
|
||||
|
||||
static void conn_destroy(struct zvfs_conn *c){
|
||||
close(c->fd);
|
||||
free(c);
|
||||
}
|
||||
|
||||
int zvfs_conn_get_fd(struct zvfs_conn *conn){
|
||||
return conn->fd;
|
||||
}
|
||||
|
||||
void zvfs_conn_set_ctx(struct zvfs_conn *conn, void *ctx){
|
||||
conn->user_ctx = ctx;
|
||||
}
|
||||
|
||||
void *zvfs_conn_get_ctx(struct zvfs_conn *conn){
|
||||
return conn->user_ctx;
|
||||
}
|
||||
|
||||
void zvfs_conn_enable_write(struct zvfs_conn *conn){
|
||||
if (conn->want_write)
|
||||
return;
|
||||
|
||||
conn->want_write = 1;
|
||||
|
||||
struct zvfs_reactor *r = conn->reactor;
|
||||
|
||||
epoll_mod(r, conn->fd, conn,
|
||||
EPOLLIN | EPOLLOUT | EPOLLET);
|
||||
}
|
||||
|
||||
void zvfs_conn_disable_write(struct zvfs_conn *conn){
|
||||
if (!conn->want_write)
|
||||
return;
|
||||
|
||||
conn->want_write = 0;
|
||||
|
||||
struct zvfs_reactor *r = conn->reactor;
|
||||
|
||||
epoll_mod(r, conn->fd, conn,
|
||||
EPOLLIN | EPOLLET);
|
||||
}
|
||||
|
||||
void zvfs_conn_close(struct zvfs_conn *conn){
|
||||
struct zvfs_reactor *r = conn->reactor;
|
||||
|
||||
if (r->opts.on_close)
|
||||
r->opts.on_close(conn, r->opts.cb_ctx);
|
||||
|
||||
epoll_ctl(r->epfd, EPOLL_CTL_DEL, conn->fd, NULL);
|
||||
|
||||
conn_destroy(conn);
|
||||
}
|
||||
|
||||
/**
|
||||
* AF_UNIX -> Unix Domain Socket
|
||||
* SOCK_STREAM -> 类似 TCP
|
||||
* path -> 通过某个文件进行通信
|
||||
*/
|
||||
static int create_listen_socket(const char *path, int backlog){
|
||||
int fd = socket(AF_UNIX, SOCK_STREAM, 0);
|
||||
if (fd < 0)
|
||||
return -1;
|
||||
|
||||
struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);

    unlink(path);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return -1;

    /*
     * When the daemon is started by root, umask may tighten the socket
     * permissions and other users (e.g. postgres) would then be refused
     * with EACCES on connect(). Relax the mode explicitly.
     */
    if (chmod(path, 0666) < 0)
        return -1;

    if (listen(fd, backlog) < 0)
        return -1;

    set_nonblock(fd);

    return fd;
}

struct zvfs_reactor *zvfs_reactor_create(const struct zvfs_reactor_opts *opts)
{
    struct zvfs_reactor *r = calloc(1, sizeof(*r));
    if (!r)
        return NULL;

    r->opts = *opts;

    r->epfd = epoll_create1(0);
    r->listen_fd = create_listen_socket(opts->socket_path, opts->backlog);
    if (r->epfd < 0 || r->listen_fd < 0) {
        free(r);
        return NULL;
    }

    epoll_add(r, r->listen_fd, NULL, EPOLLIN);

    return r;
}

static void handle_accept(struct zvfs_reactor *r)
{
    for (;;) {
        int fd = accept(r->listen_fd, NULL, NULL);
        if (fd < 0) {
            /* EAGAIN/EWOULDBLOCK: backlog drained; any other error: give up too */
            return;
        }

        set_nonblock(fd);

        struct zvfs_conn *conn = calloc(1, sizeof(*conn));
        if (!conn) {
            close(fd);
            return;
        }
        conn->fd = fd;
        conn->reactor = r;

        epoll_add(r, fd, conn, EPOLLIN | EPOLLET);

        if (r->opts.on_accept)
            r->opts.on_accept(conn, r->opts.cb_ctx);
    }
}

int
zvfs_reactor_run(struct zvfs_reactor *r)
{
    struct epoll_event events[64];

    r->running = 1;

    while (r->running) {
        /* timeout 0: busy-poll so the loop can also drain the completion queue */
        int n = epoll_wait(r->epfd, events, 64, 0);

        for (int i = 0; i < n; i++) {
            if (events[i].data.ptr == NULL) {
                handle_accept(r);
                continue;
            }

            struct zvfs_conn *conn = events[i].data.ptr;

            if (events[i].events & (EPOLLHUP | EPOLLERR)) {
                zvfs_conn_close(conn);
                continue;
            }

            if ((events[i].events & EPOLLIN) && r->opts.on_read)
                r->opts.on_read(conn, r->opts.cb_ctx);

            if ((events[i].events & EPOLLOUT) && r->opts.on_write)
                r->opts.on_write(conn, r->opts.cb_ctx);
        }
        cq_consume_send(g_cq);
    }
    return 0;
}

void zvfs_reactor_stop(struct zvfs_reactor *r)
{
    r->running = 0;
}

void zvfs_reactor_destroy(struct zvfs_reactor *r)
{
    close(r->listen_fd);
    close(r->epfd);
    free(r);
}
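The event loop above is the standard epoll pattern: register fds, wait, then dispatch on `data.ptr` (NULL meaning the listen socket). A minimal self-contained sketch of that registration/wait pattern, using a pipe instead of the Unix socket; `wait_readable` and `demo` are illustrative names, not part of zvfs:

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Wait for readability on one fd; returns 1 if it became readable in time. */
int wait_readable(int epfd, int fd, int timeout_ms)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0)
        return -1;

    struct epoll_event out;
    int n = epoll_wait(epfd, &out, 1, timeout_ms);

    epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
    return (n == 1 && (out.events & EPOLLIN)) ? 1 : 0;
}

int demo(void)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;
    int epfd = epoll_create1(0);

    write(p[1], "x", 1);                  /* make the read end readable */
    int r = wait_readable(epfd, p[0], 100);

    close(p[0]); close(p[1]); close(epfd);
    return r;
}
```

The daemon differs from this sketch in two ways: it registers connections edge-triggered (`EPOLLET`), so handlers must read until `EAGAIN`, and it polls with timeout 0 so the same thread can pump the SPDK completion queue between waits.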
118  src/daemon/ipc_reactor.h  Normal file
@@ -0,0 +1,118 @@
#ifndef __ZVFS_IPC_REACTOR_H__
#define __ZVFS_IPC_REACTOR_H__

#include <stdint.h>
#include <stddef.h>

#ifdef __cplusplus
extern "C" {
#endif

struct zvfs_reactor_opts;
struct zvfs_conn;
struct zvfs_reactor;

/* callbacks */

typedef void (*zvfs_on_accept_fn)(struct zvfs_conn *conn, void *ctx);
typedef void (*zvfs_on_read_fn)(struct zvfs_conn *conn, void *ctx);
typedef void (*zvfs_on_write_fn)(struct zvfs_conn *conn, void *ctx);
typedef void (*zvfs_on_close_fn)(struct zvfs_conn *conn, void *ctx);

/* configuration */

struct zvfs_reactor_opts {
    const char *socket_path;
    int backlog;
    int max_events;

    zvfs_on_accept_fn on_accept;
    zvfs_on_read_fn on_read;
    zvfs_on_write_fn on_write;
    zvfs_on_close_fn on_close;

    void *cb_ctx;
};

struct zvfs_conn {
    int fd;
    int want_write;
    void *user_ctx;
    struct zvfs_reactor *reactor;
};

struct zvfs_reactor {
    int epfd;
    int listen_fd;
    int running;
    struct zvfs_reactor_opts opts;
};

/* reactor lifecycle */

struct zvfs_reactor *zvfs_reactor_create(const struct zvfs_reactor_opts *opts);
int zvfs_reactor_run(struct zvfs_reactor *reactor);
void zvfs_reactor_stop(struct zvfs_reactor *reactor);
void zvfs_reactor_destroy(struct zvfs_reactor *reactor);

/* connection helpers */

int zvfs_conn_get_fd(struct zvfs_conn *conn);
void zvfs_conn_close(struct zvfs_conn *conn);
void zvfs_conn_enable_write(struct zvfs_conn *conn);
void zvfs_conn_disable_write(struct zvfs_conn *conn);
void zvfs_conn_set_ctx(struct zvfs_conn *conn, void *ctx);
void *zvfs_conn_get_ctx(struct zvfs_conn *conn);

#ifdef __cplusplus
}
#endif

#endif
259  src/daemon/main.c  Normal file
@@ -0,0 +1,259 @@
#include "common/config.h"
#include "proto/ipc_proto.h"
#include "ipc_reactor.h"
#include "ipc_cq.h"
#include "spdk_engine_wrapper.h"

#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <sys/types.h>
#include <errno.h>
#include <stdlib.h>

// #define IPC_REACTOR_ECHO
#define IPC_REACTOR_ZVFS

extern struct zvfs_spdk_io_engine g_engine;

#ifdef IPC_REACTOR_ECHO
static void on_accept(struct zvfs_conn *conn, void *ctx)
{
    printf("client connected fd=%d\n", zvfs_conn_get_fd(conn));
}

static void on_read(struct zvfs_conn *c, void *ctx)
{
    int fd = zvfs_conn_get_fd(c);
    char buf[4096];

    ssize_t n = read(fd, buf, sizeof(buf));
    if (n == 0) {
        zvfs_conn_close(c);
        return;
    }
    if (n < 0) {
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return;
        perror("read");
        zvfs_conn_close(c);
        return;
    }

    printf("recv %zd bytes: %.*s\n", n, (int)n, buf);

    ssize_t w = write(fd, buf, n);
    if (w < 0) {
        perror("write");
        zvfs_conn_close(c);
        return;
    }
}

static void on_write(struct zvfs_conn *conn, void *ctx)
{
    /* the echo server does not need a write queue */
}

static void on_close(struct zvfs_conn *conn, void *ctx)
{
    printf("connection closed fd=%d\n", zvfs_conn_get_fd(conn));
}

int main(void)
{
    struct zvfs_reactor_opts opts = {
        .socket_path = "/tmp/zvfs.sock",
        .backlog = 128,
        .max_events = 64,
        .on_accept = on_accept,
        .on_read = on_read,
        .on_write = on_write,
        .on_close = on_close,
        .cb_ctx = NULL
    };

    struct zvfs_reactor *r = zvfs_reactor_create(&opts);
    if (!r)
        return 1;

    printf("echo server started: %s\n", opts.socket_path);

    zvfs_reactor_run(r);

    return 0;
}

#else
static void on_accept(struct zvfs_conn *conn, void *ctx)
{
    struct {
        uint8_t *buf;
        size_t len;
        size_t cap;
    } *rctx = calloc(1, sizeof(*rctx));

    if (!rctx) {
        fprintf(stderr, "[accept] alloc conn ctx failed\n");
        zvfs_conn_close(conn);
        return;
    }

    rctx->cap = ZVFS_IPC_BUF_SIZE;
    rctx->buf = calloc(1, rctx->cap);
    if (!rctx->buf) {
        fprintf(stderr, "[accept] alloc conn rx buffer failed\n");
        free(rctx);
        zvfs_conn_close(conn);
        return;
    }
    zvfs_conn_set_ctx(conn, rctx);

    printf("client connected fd=%d\n", zvfs_conn_get_fd(conn));
}

static void on_read(struct zvfs_conn *c, void *ctx)
{
    int fd = zvfs_conn_get_fd(c);
    struct {
        uint8_t *buf;
        size_t len;
        size_t cap;
    } *rctx = zvfs_conn_get_ctx(c);

    if (!rctx || !rctx->buf || rctx->cap == 0) {
        fprintf(stderr, "[read] invalid conn ctx fd=%d\n", fd);
        zvfs_conn_close(c);
        return;
    }

    for (;;) {
        if (rctx->len >= rctx->cap) {
            fprintf(stderr, "[read] rx buffer overflow fd=%d len=%zu cap=%zu\n",
                    fd, rctx->len, rctx->cap);
            zvfs_conn_close(c);
            return;
        }

        ssize_t n = read(fd, rctx->buf + rctx->len, rctx->cap - rctx->len);
        if (n == 0) {
            fprintf(stderr, "[read] fd=%d closed\n", fd);
            zvfs_conn_close(c);
            return;
        }
        if (n < 0) {
            if (errno != EAGAIN && errno != EWOULDBLOCK) {
                perror("[read]");
                zvfs_conn_close(c);
                return;
            }
            break;
        }

        rctx->len += (size_t)n;
    }

    size_t offset = 0;
    while (offset < rctx->len) {
        struct zvfs_req *req = calloc(1, sizeof(*req));
        if (!req) {
            fprintf(stderr, "calloc failed\n");
            break;
        }

        size_t consumed = zvfs_deserialize_req(rctx->buf + offset, rctx->len - offset, req);
        if (consumed == 0) {
            free(req);
            break; /* incomplete request: wait for more data */
        }

        printf("[req][%s]\n", cast_opcode2string(req->opcode));
        req->conn = c;
        offset += consumed;

        if (dispatch_to_worker(req) < 0) {
            fprintf(stderr, "[dispatcher] [fd:%d] dispatch error\n", c->fd);
        }
    }

    if (offset > 0) {
        size_t remain = rctx->len - offset;
        if (remain > 0)
            memmove(rctx->buf, rctx->buf + offset, remain);
        rctx->len = remain;
    }

    if (rctx->len == rctx->cap) {
        fprintf(stderr, "[read] request too large or malformed fd=%d cap=%zu\n",
                fd, rctx->cap);
        zvfs_conn_close(c);
    }
}
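The `on_read` handler above is stream framing: parse as many complete requests as possible from the front of the buffer, then `memmove` the unconsumed tail to the start so the next read appends after it. The same pattern in isolation, with a toy 1-byte length-prefix frame instead of the real zvfs wire format (the `rxbuf`/`drain` names and frame layout are illustrative only):

```c
#include <stdint.h>
#include <string.h>

struct rxbuf {
    uint8_t buf[64];
    size_t len;
};

/* One frame is [1-byte payload length][payload].
 * Returns bytes consumed, or 0 if the frame is still incomplete. */
static size_t consume_frame(const uint8_t *p, size_t avail)
{
    if (avail < 1)
        return 0;
    size_t need = 1 + (size_t)p[0];
    return (avail >= need) ? need : 0;
}

/* Parse whole frames, keep the partial tail. Returns frames parsed. */
int drain(struct rxbuf *rx)
{
    int frames = 0;
    size_t off = 0;
    size_t c;

    while ((c = consume_frame(rx->buf + off, rx->len - off)) > 0) {
        off += c;
        frames++;
    }
    if (off > 0) {
        /* shift the unconsumed bytes back to the front of the buffer */
        memmove(rx->buf, rx->buf + off, rx->len - off);
        rx->len -= off;
    }
    return frames;
}
```

For example, a buffer holding one whole frame `{2,'a','b'}` plus the partial prefix `{3,'x'}` yields one parsed frame and leaves the 2 partial bytes at the front, exactly as the daemon keeps a half-received `zvfs_req` across reads.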
static void on_close(struct zvfs_conn *conn, void *ctx)
{
    struct {
        uint8_t *buf;
        size_t len;
        size_t cap;
    } *rctx = zvfs_conn_get_ctx(conn);

    if (rctx) {
        free(rctx->buf);
        free(rctx);
        zvfs_conn_set_ctx(conn, NULL);
    }

    printf("connection closed fd=%d\n", zvfs_conn_get_fd(conn));
}

int main(void)
{
    const char *bdev_name = getenv("SPDK_BDEV_NAME") ? getenv("SPDK_BDEV_NAME") : ZVFS_BDEV;
    const char *json_file = getenv("SPDK_JSON_CONFIG") ? getenv("SPDK_JSON_CONFIG") : SPDK_JSON_PATH;

    g_cq = CQ_Create();

    if (zvfs_engine_init(bdev_name, json_file, 4) != 0) {
        fprintf(stderr, "zvfs_engine_init(%s) failed\n", bdev_name);
        return 1;
    }

    struct zvfs_reactor_opts opts = {
        .socket_path = ZVFS_IPC_DEFAULT_SOCKET_PATH,
        .backlog = 128,
        .max_events = 64,
        .on_accept = on_accept,
        .on_read = on_read,
        .on_write = NULL,
        .on_close = on_close,
        .cb_ctx = &g_engine
    };

    struct zvfs_reactor *r = zvfs_reactor_create(&opts);
    if (!r)
        return 1;
    zvfs_reactor_run(r);

    if (g_cq) CQ_Destroy(g_cq);
    return 0;
}
#endif
1047  src/daemon/spdk_engine.c  Normal file
File diff suppressed because it is too large
68  src/daemon/spdk_engine.h  Normal file
@@ -0,0 +1,68 @@
#ifndef __ZVFS_SPDK_ENGINE_H__
#define __ZVFS_SPDK_ENGINE_H__

#include "common/uthash.h"
#include "proto/ipc_proto.h"
#include <stdint.h>
#include <stdbool.h>
#include <sys/types.h>
#include <stdatomic.h>
#include <pthread.h>
#include <spdk/blob.h>

// blob_handle: low-level blob state; the file-level size is tracked by the upper layer
typedef struct zvfs_blob_handle {
    spdk_blob_id blob_id;
    struct spdk_blob *blob;
    void *dma_buf;
    uint64_t dma_buf_size;
    atomic_uint ref_count;
} zvfs_blob_handle_t;

struct zvfs_io_thread {
    struct spdk_thread *thread;
    struct spdk_io_channel *channel; // each io thread owns its own channel
    pthread_t tid;
    bool ready;
};

typedef uint64_t zvfs_handle_id_t;

struct zvfs_blob_cache_entry {
    zvfs_handle_id_t handle_id; // key (note: this is not the blob_id)
    struct zvfs_blob_handle *handle;
    UT_hash_handle hh;
};

typedef struct zvfs_spdk_io_engine {
    struct spdk_bs_dev *bs_dev;
    struct spdk_blob_store *bs;

    /* thread pool: thread_pool[0] is the metadata thread, the rest are io threads */
    struct zvfs_io_thread *thread_pool;
    int thread_count;    // total threads (= number of CPU cores)
    int io_thread_count; // number of io threads

    struct zvfs_blob_cache_entry *handle_cache; // handle_id -> handle map
    pthread_mutex_t cache_mu;

    uint64_t io_unit_size;
    uint64_t cluster_size;
} zvfs_spdk_io_engine_t;

int engine_cache_insert(struct zvfs_blob_handle *handle, zvfs_handle_id_t *out_id);
struct zvfs_blob_handle *engine_cache_lookup(zvfs_handle_id_t handle_id);
void engine_cache_remove(zvfs_handle_id_t handle_id);

int io_engine_init(const char *bdev_name, const char *json_file, int thread_num);
int blob_create(struct zvfs_req *req);
int blob_open(struct zvfs_req *req);
int blob_write(struct zvfs_req *req);
int blob_read(struct zvfs_req *req);
int blob_resize(struct zvfs_req *req);
int blob_sync_md(struct zvfs_req *req);
int blob_close(struct zvfs_req *req);
int blob_delete(struct zvfs_req *req);

#endif // __ZVFS_SPDK_ENGINE_H__
210  src/daemon/spdk_engine_wrapper.c  Normal file
@@ -0,0 +1,210 @@
#include "spdk_engine_wrapper.h"
#include "spdk_engine.h"
#include "ipc_cq.h"
#include <spdk/log.h>

extern struct zvfs_spdk_io_engine g_engine;

/** cq ops */

static void free_req(struct zvfs_req *req) {
    if (req->data) free(req->data);
    if (req->add_ref_items) free(req->add_ref_items);
    free(req);
}

static void push_resp(struct zvfs_req *req, int status) {
    struct zvfs_resp *resp = calloc(1, sizeof(*resp));
    if (!resp) {
        SPDK_ERRLOG("push_resp: calloc failed, op_code=%u\n", req->opcode);
        free_req(req);
        return;
    }
    resp->opcode = req->opcode;
    resp->conn = req->conn;
    resp->status = status;
    free_req(req);
    CQ_Push(g_cq, resp);
}

static void push_err_resp(struct zvfs_req *req, int status) { push_resp(req, status); }
static void push_ok_resp(struct zvfs_req *req) { push_resp(req, 0); }

/** handle cache ops */
int engine_cache_insert(struct zvfs_blob_handle *handle, zvfs_handle_id_t *out_id) {
    struct zvfs_blob_cache_entry *entry = calloc(1, sizeof(*entry));
    if (!entry) return -ENOMEM;
    entry->handle_id = (zvfs_handle_id_t)(uintptr_t)handle;
    entry->handle = handle;
    pthread_mutex_lock(&g_engine.cache_mu);
    HASH_ADD(hh, g_engine.handle_cache, handle_id, sizeof(zvfs_handle_id_t), entry);
    pthread_mutex_unlock(&g_engine.cache_mu);
    *out_id = entry->handle_id;
    return 0;
}

struct zvfs_blob_handle *engine_cache_lookup(zvfs_handle_id_t handle_id) {
    struct zvfs_blob_cache_entry *entry = NULL;
    pthread_mutex_lock(&g_engine.cache_mu);
    HASH_FIND(hh, g_engine.handle_cache, &handle_id, sizeof(zvfs_handle_id_t), entry);
    pthread_mutex_unlock(&g_engine.cache_mu);
    return entry ? entry->handle : NULL;
}

void engine_cache_remove(zvfs_handle_id_t handle_id) {
    struct zvfs_blob_cache_entry *entry = NULL;
    pthread_mutex_lock(&g_engine.cache_mu);
    HASH_FIND(hh, g_engine.handle_cache, &handle_id, sizeof(zvfs_handle_id_t), entry);
    if (entry) { HASH_DEL(g_engine.handle_cache, entry); free(entry); }
    pthread_mutex_unlock(&g_engine.cache_mu);
}

static int fill_handle(struct zvfs_req *req, const char *op) {
    struct zvfs_blob_handle *handle = engine_cache_lookup(req->handle_id);
    if (!handle) {
        SPDK_ERRLOG("%s: invalid handle_id=%lu\n", op, req->handle_id);
        push_err_resp(req, -EBADF);
        return -EBADF;
    }
    req->handle = handle;
    return 0;
}

// zvfs wrappers

int zvfs_engine_init(const char *bdev_name, const char *json_file, int thread_num) {
    return io_engine_init(bdev_name, json_file, thread_num);
}

/* create / open: the handle is registered inside the engine callback, so the
 * wrapper just passes through */
static int zvfs_create(struct zvfs_req *req) {
    return blob_create(req);
}

static int zvfs_open(struct zvfs_req *req) {
    return blob_open(req);
}

/* delete: only needs the blob_id, no handle */
static int zvfs_delete(struct zvfs_req *req) {
    return blob_delete(req);
}

/* the operations below must resolve the handle first */
static int zvfs_write(struct zvfs_req *req) {
    if (fill_handle(req, "zvfs_write") != 0) return -EBADF;
    return blob_write(req);
}

static int zvfs_read(struct zvfs_req *req) {
    if (fill_handle(req, "zvfs_read") != 0) return -EBADF;
    return blob_read(req);
}

static int zvfs_resize(struct zvfs_req *req) {
    if (fill_handle(req, "zvfs_resize") != 0) return -EBADF;
    return blob_resize(req);
}

static int zvfs_sync_md(struct zvfs_req *req) {
    if (fill_handle(req, "zvfs_sync_md") != 0) return -EBADF;
    return blob_sync_md(req);
}

/* close: after fill_handle, the engine callback also removes the cache entry */
static int zvfs_close(struct zvfs_req *req) {
    if (fill_handle(req, "zvfs_close") != 0) return -EBADF;
    return blob_close(req);
}

static int zvfs_add_ref(struct zvfs_req *req) {
    if (req->ref_delta == 0) {
        push_err_resp(req, -EINVAL);
        return -EINVAL;
    }
    if (fill_handle(req, "zvfs_add_ref") != 0) return -EBADF;
    atomic_fetch_add(&req->handle->ref_count, req->ref_delta);
    push_ok_resp(req);
    return 0;
}

static int zvfs_add_ref_batch(struct zvfs_req *req) {
    int rc = 0;
    uint32_t i = 0;

    if (req->add_ref_count == 0 || !req->add_ref_items) {
        push_err_resp(req, -EINVAL);
        return -EINVAL;
    }

    /* TODO: functionality-first, non-atomic batch add-ref implementation. */
    for (i = 0; i < req->add_ref_count; i++) {
        struct zvfs_add_ref_item *item = &req->add_ref_items[i];
        struct zvfs_blob_handle *handle = NULL;

        if (item->ref_delta == 0) {
            rc = -EINVAL;
            continue;
        }

        handle = engine_cache_lookup(item->handle_id);
        if (!handle) {
            rc = -EBADF;
            continue;
        }

        atomic_fetch_add(&handle->ref_count, item->ref_delta);
    }

    if (rc != 0) {
        push_err_resp(req, rc);
        return rc;
    }

    push_ok_resp(req);
    return 0;
}
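The add-ref paths above lean on C11 `atomic_fetch_add`/`atomic_fetch_sub`, which return the value *before* the update, so "drop the last reference" is detectable without extra locking. A self-contained sketch of that refcount idiom (the `obj` type and function names here are illustrative, not the zvfs handle API):

```c
#include <stdatomic.h>

struct obj {
    atomic_uint ref_count;
};

/* Take delta extra references; returns the count before the add. */
unsigned obj_ref(struct obj *o, unsigned delta)
{
    return atomic_fetch_add(&o->ref_count, delta);
}

/* Drop one reference; returns nonzero iff this was the last one,
 * i.e. the pre-decrement value was exactly 1. */
int obj_unref(struct obj *o)
{
    return atomic_fetch_sub(&o->ref_count, 1) == 1;
}
```

Only the caller that sees `obj_unref` return nonzero may free the object; every other thread observed a pre-decrement value greater than 1 and must leave it alone.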
int dispatch_to_worker(struct zvfs_req *req) {
    switch (req->opcode) {
    case ZVFS_OP_CREATE:
        return zvfs_create(req);
    case ZVFS_OP_OPEN:
        return zvfs_open(req);
    case ZVFS_OP_READ:
        return zvfs_read(req);
    case ZVFS_OP_WRITE:
        return zvfs_write(req);
    case ZVFS_OP_RESIZE:
        return zvfs_resize(req);
    case ZVFS_OP_SYNC_MD:
        return zvfs_sync_md(req);
    case ZVFS_OP_CLOSE:
        return zvfs_close(req);
    case ZVFS_OP_DELETE:
        return zvfs_delete(req);
    case ZVFS_OP_ADD_REF:
        return zvfs_add_ref(req);
    case ZVFS_OP_ADD_REF_BATCH:
        return zvfs_add_ref_batch(req);
    default:
        break;
    }

    return -1;
}
13  src/daemon/spdk_engine_wrapper.h  Normal file
@@ -0,0 +1,13 @@
#ifndef __ZVFS_ENGINE_H__
#define __ZVFS_ENGINE_H__

#include "proto/ipc_proto.h"

int zvfs_engine_init(const char *bdev_name, const char *json_file, int thread_num);

int dispatch_to_worker(struct zvfs_req *req);

#endif
BIN  src/daemon/zvfs_daemon  Executable file
Binary file not shown.
@@ -1,7 +1,8 @@
|
||||
#ifndef _GNU_SOURCE
|
||||
#define _GNU_SOURCE
|
||||
#endif
|
||||
#include "config.h"
|
||||
|
||||
#include "common/config.h"
|
||||
#include "common/utils.h"
|
||||
#include "fs/zvfs.h"
|
||||
#include "fs/zvfs_inode.h"
|
||||
@@ -10,6 +11,7 @@
|
||||
|
||||
#include <sys/xattr.h>
|
||||
#include <sys/types.h>
|
||||
#include <errno.h>
|
||||
struct zvfs_fs g_fs = {0};
|
||||
|
||||
/* ------------------------------------------------------------------ */
|
||||
|
||||
@@ -67,10 +67,11 @@ void inode_remove(uint64_t blob_id) {
|
||||
/* size / timestamp helpers (调用方持有 inode->mu) */
|
||||
/* ------------------------------------------------------------------ */
|
||||
|
||||
void inode_update_size(struct zvfs_inode *inode, int real_fd, uint64_t new_size) {
|
||||
int inode_update_size(struct zvfs_inode *inode, int real_fd, uint64_t new_size) {
|
||||
inode->logical_size = new_size;
|
||||
if (real_fd >= 0)
|
||||
ftruncate(real_fd, (off_t)new_size); /* 同步 st_size,忽略错误 */
|
||||
return ftruncate(real_fd, (off_t)new_size); /* 同步 st_size,忽略错误 */
|
||||
return 0;
|
||||
}
|
||||
|
||||
void inode_touch_atime(struct zvfs_inode *inode) {
|
||||
|
||||
@@ -49,7 +49,7 @@ void inode_remove(uint64_t blob_id);

 // update logical_size and call ftruncate to keep st_size in sync
 // caller must hold inode->mu
-void inode_update_size(struct zvfs_inode *inode, int real_fd, uint64_t new_size);
+int inode_update_size(struct zvfs_inode *inode, int real_fd, uint64_t new_size);

 // update timestamps (caller must hold inode->mu)
 void inode_touch_atime(struct zvfs_inode *inode);
@@ -15,19 +15,18 @@
 struct zvfs_open_file *openfile_alloc(int fd,
                                       struct zvfs_inode *inode,
                                       int flags,
-                                      struct zvfs_blob_handle *handle)
+                                      uint64_t handle_id)
 {
     struct zvfs_open_file *of = calloc(1, sizeof(*of));
     if (!of)
         return NULL;

     of->fd = fd;
     of->inode = inode;
-    of->handle = handle;
+    of->handle_id = handle_id;
     of->flags = flags;
     of->fd_flags = 0;
     of->offset = 0;
-    atomic_init(&of->ref_count, 1);

     return of;
 }
@@ -3,33 +3,26 @@

 #include "common/uthash.h"
-#include "spdk_engine/io_engine.h"
 #include <stdatomic.h>
 #include <stdint.h>

+#ifndef SPDK_BLOB_ID_DEFINED
+typedef uint64_t spdk_blob_id;
+#define SPDK_BLOB_ID_DEFINED
+#endif

 struct zvfs_open_file {
     int fd;                          // key, 1:1 with the real fd
     struct zvfs_inode *inode;
-    struct zvfs_blob_handle *handle;
+    uint64_t handle_id;

     int flags;
     int fd_flags;

     uint64_t offset;                 // current position in non-APPEND mode
     atomic_int ref_count;            // for dup / close

     UT_hash_handle hh;
 };

-// allocate an openfile; not inserted into the global table, ref_count starts at 1
+// allocate an openfile; not inserted into the global table
 struct zvfs_open_file *openfile_alloc(int fd, struct zvfs_inode *inode,
-                                      int flags, struct zvfs_blob_handle *handle);
+                                      int flags, uint64_t handle_id);

-// free the memory (caller must ensure ref_count == 0; does not blob_close)
+// free the memory
 void openfile_free(struct zvfs_open_file *of);

 // insert into the global table (caller must hold fd_mu)
@@ -2,7 +2,8 @@
 #ifndef _GNU_SOURCE
 #define _GNU_SOURCE
 #endif
-#include "config.h"
+#include "common/config.h"
 #include "zvfs_sys_init.h"
 #include "fs/zvfs.h" // zvfs_fs_init
 #include "spdk_engine/io_engine.h"
@@ -17,17 +18,6 @@ static int _init_ok = 0;
 static void
 do_init(void)
 {
-    const char *bdev = getenv("ZVFS_BDEV");
-    if (!bdev) {
-        bdev = ZVFS_BDEV;
-        fprintf(stderr, "[zvfs] ZVFS_BDEV not set, set as (%s)\n", ZVFS_BDEV);
-    }
-
-    if (io_engine_init(bdev) != 0) {
-        fprintf(stderr, "[zvfs] FATAL: io_engine_init(%s) failed\n", bdev);
-        abort();
-    }
-
-    _init_ok = 1;
 }
@@ -68,9 +68,19 @@ zvfs_fcntl_impl(int fd, int cmd, va_list ap)
     /* ---- dup family --------------------------------------------- */
     case F_DUPFD:
     case F_DUPFD_CLOEXEC: {
-        (void)va_arg(ap, int);
-        errno = ENOTSUP;
-        return -1;
+        int minfd = va_arg(ap, int);
+        int newfd = real_fcntl(fd, cmd, minfd);
+        if (newfd < 0)
+            return -1;
+
+        int new_fd_flags = (cmd == F_DUPFD_CLOEXEC) ? FD_CLOEXEC : 0;
+        if (zvfs_dup_attach_newfd(fd, newfd, new_fd_flags) < 0) {
+            int saved = errno;
+            real_close(newfd);
+            errno = saved;
+            return -1;
+        }
+        return newfd;
     }

     /* ---- file locks (not implemented; pretend there are no locks) - */
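The F_DUPFD change above relies on a POSIX guarantee: `fcntl(fd, F_DUPFD, minfd)` returns the *lowest* free descriptor greater than or equal to `minfd`, which is why the hook can forward the call and simply attach its bookkeeping to whatever fd comes back. A standalone illustration (the `dup_at_least`/`demo` names are ours, not part of zvfs):

```c
#include <fcntl.h>
#include <unistd.h>

/* Duplicate fd onto the lowest free descriptor >= minfd. */
int dup_at_least(int fd, int minfd)
{
    return fcntl(fd, F_DUPFD, minfd);
}

int demo(void)
{
    int fd = open("/dev/null", O_RDONLY);
    if (fd < 0)
        return -1;

    int newfd = dup_at_least(fd, 100);
    int ok = (newfd >= 100);      /* guaranteed when the call succeeds */

    close(fd);
    if (newfd >= 0)
        close(newfd);
    return ok;
}
```

`F_DUPFD_CLOEXEC` behaves identically except that the new descriptor gets `FD_CLOEXEC` set, which is exactly the distinction the hook mirrors into `new_fd_flags`.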
@@ -19,6 +19,91 @@
 #include <pthread.h>
 #include <stdio.h>

+/* ------------------------------------------------------------------ */
+/* internal: path classification helper                                */
+/* ------------------------------------------------------------------ */
+
+/**
+ * openat() may only land under /zvfs after following a symlink, which a
+ * plain prefix check on the original path cannot catch. So:
+ *
+ * 1. Check whether the path itself is under /zvfs.
+ * 2. Check whether its realpath() is under /zvfs.
+ * 3. With O_CREAT and a target that does not exist yet, realpath() yields
+ *    nothing. Resolve the parent directory first, re-append the final
+ *    component, and check whether the result falls under /zvfs.
+ */
+static int
+zvfs_classify_path(const char *abspath, int may_create,
+                   char *normalized_out, size_t out_size)
+{
+    char resolved[PATH_MAX];
+    char tmp[PATH_MAX];
+    char parent[PATH_MAX];
+    char candidate[PATH_MAX];
+    const char *name;
+    char *slash;
+    int n;
+
+    if (!abspath || !normalized_out || out_size == 0) {
+        return 0;
+    }
+
+    strncpy(normalized_out, abspath, out_size);
+    normalized_out[out_size - 1] = '\0';
+
+    if (zvfs_is_zvfs_path(abspath)) {
+        return 1;
+    }
+
+    if (realpath(abspath, resolved) != NULL) {
+        if (zvfs_is_zvfs_path(resolved)) {
+            strncpy(normalized_out, resolved, out_size);
+            normalized_out[out_size - 1] = '\0';
+            return 1;
+        }
+        return 0;
+    }
+
+    if (!may_create) {
+        return 0;
+    }
+
+    strncpy(tmp, abspath, sizeof(tmp));
+    tmp[sizeof(tmp) - 1] = '\0';
+    slash = strrchr(tmp, '/');
+    if (!slash) {
+        return 0;
+    }
+
+    name = slash + 1;
+    if (*name == '\0') {
+        return 0;
+    }
+
+    if (slash == tmp) {
+        strcpy(parent, "/");
+    } else {
+        *slash = '\0';
+        strncpy(parent, tmp, sizeof(parent));
+        parent[sizeof(parent) - 1] = '\0';
+    }
+
+    if (realpath(parent, resolved) == NULL) {
+        return 0;
+    }
+
+    n = snprintf(candidate, sizeof(candidate), "%s/%s", resolved, name);
+    if (n <= 0 || (size_t)n >= sizeof(candidate)) {
+        return 0;
+    }
+
+    if (!zvfs_is_zvfs_path(candidate)) {
+        return 0;
+    }
+
+    strncpy(normalized_out, candidate, out_size);
+    normalized_out[out_size - 1] = '\0';
+    return 1;
+}
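The O_CREAT branch above splits the path into parent directory plus final component before resolving the parent. That split in isolation, as a pure string operation (`split_parent` is an illustrative helper, not part of the zvfs sources):

```c
#include <stdio.h>
#include <string.h>

/* Split an absolute path into its parent directory and final component.
 * Returns 0 on success, -1 if there is no usable final component
 * (no slash, or the path ends in '/'). */
int split_parent(const char *abspath, char *parent, size_t psz,
                 char *name, size_t nsz)
{
    const char *slash = strrchr(abspath, '/');
    if (!slash || slash[1] == '\0')
        return -1;

    if (slash == abspath) {
        /* the parent of "/foo" is the root itself */
        snprintf(parent, psz, "/");
    } else {
        size_t plen = (size_t)(slash - abspath);
        if (plen >= psz)
            return -1;
        memcpy(parent, abspath, plen);
        parent[plen] = '\0';
    }
    snprintf(name, nsz, "%s", slash + 1);
    return 0;
}
```

For `/zvfs/data/wal.log` this yields parent `/zvfs/data` and name `wal.log`; the real hook then runs `realpath()` on the parent (which must already exist) and re-appends the name before the `/zvfs` prefix check.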
+
+/* ------------------------------------------------------------------ */
+/* internal: core open logic (path already resolved to absolute)      */
+/* ------------------------------------------------------------------ */
@@ -36,16 +121,15 @@
|
||||
static int
|
||||
zvfs_open_impl(int real_fd, const char *abspath, int flags, mode_t mode)
|
||||
{
|
||||
struct zvfs_inode *inode = NULL;
|
||||
struct zvfs_blob_handle *handle = NULL;
|
||||
uint64_t blob_id = 0;
|
||||
struct zvfs_inode *inode = NULL;
|
||||
uint64_t blob_id = 0;
|
||||
uint64_t handle_id = 0;
|
||||
|
||||
if (flags & O_CREAT) {
|
||||
/* ---- 创建路径 -------------------------------------------- */
|
||||
|
||||
/* 1. 创建 blob */
|
||||
handle = blob_create(0);
|
||||
if (!handle) {
|
||||
if (blob_create(0, &blob_id, &handle_id) != 0) {
|
||||
int saved = errno;
|
||||
if (saved == 0) saved = EIO;
|
||||
fprintf(stderr,
|
||||
@@ -54,7 +138,6 @@ zvfs_open_impl(int real_fd, const char *abspath, int flags, mode_t mode)
|
||||
errno = saved;
|
||||
goto fail;
|
||||
}
|
||||
blob_id = handle->id;
|
||||
|
||||
/* 2. 把 blob_id 写入真实文件的 xattr */
|
||||
if (zvfs_xattr_write_blob_id(real_fd, blob_id) < 0) goto fail;
|
||||
@@ -88,8 +171,10 @@ zvfs_open_impl(int real_fd, const char *abspath, int flags, mode_t mode)
|
||||
if (inode) {
|
||||
/* path_cache 命中:直接用缓存的 inode,重新 blob_open */
|
||||
blob_id = inode->blob_id;
|
||||
handle = blob_open(blob_id);
|
||||
if (!handle) { if (errno == 0) errno = EIO; goto fail; }
|
||||
if (blob_open(blob_id, &handle_id) != 0) {
|
||||
if (errno == 0) errno = EIO;
|
||||
goto fail;
|
||||
}
|
||||
/* 共享 inode,增加引用 */
|
||||
atomic_fetch_add(&inode->ref_count, 1);
|
||||
|
||||
@@ -106,6 +191,10 @@ zvfs_open_impl(int real_fd, const char *abspath, int flags, mode_t mode)
|
||||
pthread_mutex_unlock(&g_fs.inode_mu);
|
||||
|
||||
if (inode) {
|
||||
if (blob_open(blob_id, &handle_id) != 0) {
|
||||
if (errno == 0) errno = EIO;
|
||||
goto fail;
|
||||
}
|
||||
atomic_fetch_add(&inode->ref_count, 1);
|
||||
} else {
|
||||
/* 全新 inode:需从真实文件 stat 获取 mode/size */
|
||||
@@ -123,15 +212,16 @@ zvfs_open_impl(int real_fd, const char *abspath, int flags, mode_t mode)
|
||||
pthread_mutex_lock(&g_fs.path_mu);
|
||||
path_cache_insert(abspath, inode);
|
||||
pthread_mutex_unlock(&g_fs.path_mu);
|
||||
if (blob_open(blob_id, &handle_id) != 0) {
|
||||
if (errno == 0) errno = EIO;
|
||||
goto fail;
|
||||
}
|
||||
}
|
||||
|
||||
handle = blob_open(blob_id);
|
||||
if (!handle) { if (errno == 0) errno = EIO; goto fail; }
|
||||
}
|
||||
}
|
||||
|
||||
/* ---- 分配 openfile,插入 fd_table ---------------------------- */
|
||||
struct zvfs_open_file *of = openfile_alloc(real_fd, inode, flags, handle);
|
||||
struct zvfs_open_file *of = openfile_alloc(real_fd, inode, flags, handle_id);
|
||||
if (!of) { errno = ENOMEM; goto fail_handle; }
|
||||
|
||||
pthread_mutex_lock(&g_fs.fd_mu);
|
||||
@@ -141,7 +231,9 @@ zvfs_open_impl(int real_fd, const char *abspath, int flags, mode_t mode)
|
||||
return real_fd;
|
||||
|
||||
fail_handle:
|
||||
blob_close(handle);
|
||||
if (handle_id != 0) {
|
||||
blob_close(handle_id);
|
||||
}
|
||||
fail:
|
||||
/* inode 若刚分配(ref_count==1)需要回滚 */
|
||||
if (inode && atomic_load(&inode->ref_count) == 1) {
|
||||
@@ -165,6 +257,10 @@ open(const char *path, int flags, ...)
|
||||
{
|
||||
ZVFS_HOOK_ENTER();
|
||||
|
||||
char abspath[PATH_MAX];
|
||||
char normpath[PATH_MAX];
|
||||
int is_zvfs_path = 0;
|
||||
|
||||
mode_t mode = 0;
|
||||
if (flags & O_CREAT) {
|
||||
va_list ap;
|
||||
@@ -173,8 +269,13 @@ open(const char *path, int flags, ...)
|
||||
va_end(ap);
|
||||
}
|
||||
|
||||
if (zvfs_resolve_atpath(AT_FDCWD, path, abspath, sizeof(abspath)) == 0) {
|
||||
is_zvfs_path = zvfs_classify_path(abspath, (flags & O_CREAT) != 0,
|
||||
normpath, sizeof(normpath));
|
||||
}
|
||||
|
||||
int ret;
|
||||
if (ZVFS_IN_HOOK() || !zvfs_is_zvfs_path(path)) {
|
||||
if (ZVFS_IN_HOOK() || !is_zvfs_path) {
|
||||
ret = real_open(path, flags, mode);
|
||||
ZVFS_HOOK_LEAVE();
|
||||
return ret;
|
||||
@@ -186,7 +287,7 @@ open(const char *path, int flags, ...)
|
||||
int real_fd = real_open(path, flags, mode);
|
||||
if (real_fd < 0) { ZVFS_HOOK_LEAVE(); return -1; }
|
||||
|
||||
ret = zvfs_open_impl(real_fd, path, flags, mode);
|
||||
ret = zvfs_open_impl(real_fd, normpath, flags, mode);
|
||||
if (ret < 0) {
|
||||
int saved = errno;
|
||||
real_close(real_fd);
|
||||
@@ -217,6 +318,9 @@ openat(int dirfd, const char *path, int flags, ...)
 {
 	ZVFS_HOOK_ENTER();

+	char normpath[PATH_MAX];
+	int is_zvfs_path = 0;

 	mode_t mode = 0;
 	if (flags & O_CREAT) {
 		va_list ap; va_start(ap, flags);
@@ -230,9 +334,11 @@ openat(int dirfd, const char *path, int flags, ...)
 		ZVFS_HOOK_LEAVE();
 		return -1;
 	}
+	is_zvfs_path = zvfs_classify_path(abspath, (flags & O_CREAT) != 0,
+					  normpath, sizeof(normpath));

 	int ret;
-	if (ZVFS_IN_HOOK() || !zvfs_is_zvfs_path(abspath)) {
+	if (ZVFS_IN_HOOK() || !is_zvfs_path) {
 		ret = real_openat(dirfd, path, flags, mode);
 		ZVFS_HOOK_LEAVE();
 		return ret;
@@ -243,7 +349,7 @@ openat(int dirfd, const char *path, int flags, ...)
 	int real_fd = real_openat(dirfd, path, flags, mode);
 	if (real_fd < 0) { ZVFS_HOOK_LEAVE(); return -1; }

-	ret = zvfs_open_impl(real_fd, abspath, flags, mode);
+	ret = zvfs_open_impl(real_fd, normpath, flags, mode);
 	if (ret < 0) {
 		int saved = errno;
 		real_close(real_fd);
@@ -321,43 +427,23 @@ int __libc_open(const char *path, int flags, ...)
 /* ------------------------------------------------------------------ */

 /*
- * zvfs_close_impl - close logic for a zvfs fd.
- *
- * The caller already holds fd_mu; the function drops fd_mu internally
- * before going on to handle the inode.
+ * zvfs_release_openfile - release the zvfs resources behind one openfile.
+ * Only zvfs bookkeeping happens here; real_close(fd) is not called.
  */
 static int
-zvfs_close_impl(int fd)
+zvfs_release_openfile(struct zvfs_open_file *of, int do_sync_md)
 {
-	/* Holding fd_mu, take the openfile out of the table. */
-	pthread_mutex_lock(&g_fs.fd_mu);
-	struct zvfs_open_file *of = openfile_lookup(fd);
-	if (!of) {
-		pthread_mutex_unlock(&g_fs.fd_mu);
-		errno = EBADF;
-		return -1;
-	}
-	int new_ref = atomic_fetch_sub(&of->ref_count, 1) - 1;
-	if (new_ref == 0)
-		openfile_remove(fd);
-	pthread_mutex_unlock(&g_fs.fd_mu);
-
-	if (new_ref > 0) {
-		/*
-		 * Other dup'ed fds still reference this openfile:
-		 * close only the real fd, leave blob and inode alone.
-		 */
-		return real_close(fd);
-	}
-
-	/* ---- openfile refcount hit zero: flush metadata, close blob --- */
-	struct zvfs_inode *inode = of->inode;
-	struct zvfs_blob_handle *handle = of->handle;
-	int sync_failed = 0;
+	int saved_errno = 0;
+	struct zvfs_inode *inode = of->inode;
+	uint64_t handle_id = of->handle_id;
 	openfile_free(of);

-	if (blob_sync_md(handle) < 0)
-		sync_failed = 1;
-	blob_close(handle);
+	if (do_sync_md && handle_id != 0 && blob_sync_md(handle_id) < 0) {
+		saved_errno = (errno != 0) ? errno : EIO;
+	}
+	if (handle_id != 0 && blob_close(handle_id) < 0 && saved_errno == 0) {
+		saved_errno = (errno != 0) ? errno : EIO;
+	}

 	/* ---- inode ref_count-- --------------------------------------- */
 	int inode_ref = atomic_fetch_sub(&inode->ref_count, 1) - 1;
@@ -372,8 +458,8 @@ zvfs_close_impl(int fd)
 	do_delete = inode->deleted;
 	pthread_mutex_unlock(&inode->mu);

-	if (do_delete)
-		blob_delete(inode->blob_id);
+	if (do_delete && blob_delete(inode->blob_id) < 0 && saved_errno == 0)
+		saved_errno = (errno != 0) ? errno : EIO;

 	pthread_mutex_lock(&g_fs.inode_mu);
 	inode_remove(inode->blob_id);
@@ -403,13 +489,52 @@ zvfs_close_impl(int fd)
 		inode_free(inode);
 	}

+	if (saved_errno != 0) {
+		errno = saved_errno;
+		return -1;
+	}
 	return 0;
 }

+/*
+ * zvfs_detach_fd_mapping - unlink only the fd -> openfile mapping and
+ * release the zvfs resources. Does not call real_close(fd); used by
+ * dup2/dup3 to clean up the old value of newfd.
+ */
+static int
+zvfs_detach_fd_mapping(int fd, int do_sync_md)
+{
+	pthread_mutex_lock(&g_fs.fd_mu);
+	struct zvfs_open_file *of = openfile_lookup(fd);
+	if (!of) {
+		pthread_mutex_unlock(&g_fs.fd_mu);
+		errno = EBADF;
+		return -1;
+	}
+	openfile_remove(fd);
+	pthread_mutex_unlock(&g_fs.fd_mu);
+
+	return zvfs_release_openfile(of, do_sync_md);
+}
+
+/*
+ * zvfs_close_impl - the zvfs path of close(fd):
+ * bookkeeping first, then real_close(fd).
+ */
+static int
+zvfs_close_impl(int fd)
+{
+	int bk_rc = zvfs_detach_fd_mapping(fd, 1);
+	int bk_errno = (bk_rc < 0) ? errno : 0;

 	int rc = real_close(fd);
 	if (rc < 0)
 		return -1;
-	if (sync_failed) {
-		errno = EIO;
+
+	if (bk_rc < 0) {
+		errno = bk_errno;
 		return -1;
 	}

 	return 0;
 }
@@ -436,6 +561,180 @@ close(int fd)
 int __close(int fd) { return close(fd); }
 int __libc_close(int fd) { return close(fd); }

+/* ------------------------------------------------------------------ */
+/* dup helper                                                          */
+/* ------------------------------------------------------------------ */
+
+int
+zvfs_dup_attach_newfd(int oldfd, int newfd, int new_fd_flags)
+{
+	struct zvfs_open_file *old_of, *new_of;
+	int fd_flags;
+	int rc;
+	int saved;
+
+	if (oldfd < 0 || newfd < 0) {
+		errno = EBADF;
+		return -1;
+	}
+
+	pthread_mutex_lock(&g_fs.fd_mu);
+	old_of = openfile_lookup(oldfd);
+	if (!old_of) {
+		pthread_mutex_unlock(&g_fs.fd_mu);
+		errno = EBADF;
+		return -1;
+	}
+	if (openfile_lookup(newfd) != NULL) {
+		pthread_mutex_unlock(&g_fs.fd_mu);
+		errno = EEXIST;
+		return -1;
+	}
+
+	rc = blob_add_ref(old_of->handle_id, 1);
+	if (rc != 0) {
+		pthread_mutex_unlock(&g_fs.fd_mu);
+		return -1;
+	}
+
+	new_of = openfile_alloc(newfd, old_of->inode, old_of->flags, old_of->handle_id);
+	if (!new_of) {
+		saved = (errno != 0) ? errno : ENOMEM;
+		(void)blob_close(old_of->handle_id);
+		pthread_mutex_unlock(&g_fs.fd_mu);
+		errno = saved;
+		return -1;
+	}
+
+	new_of->offset = old_of->offset;
+	fd_flags = (new_fd_flags >= 0) ? new_fd_flags : old_of->fd_flags;
+	new_of->fd_flags = fd_flags;
+
+	atomic_fetch_add(&old_of->inode->ref_count, 1);
+	openfile_insert(new_of);
+	pthread_mutex_unlock(&g_fs.fd_mu);
+	return 0;
+}
+
+static int
+zvfs_add_ref_batch_or_fallback(const uint64_t *handle_ids,
+			       const uint32_t *ref_deltas,
+			       uint32_t count)
+{
+	uint32_t i;
+
+	if (count == 0)
+		return 0;
+
+	if (blob_add_ref_batch(handle_ids, ref_deltas, count) == 0)
+		return 0;
+
+	for (i = 0; i < count; i++) {
+		if (blob_add_ref(handle_ids[i], ref_deltas[i]) != 0)
+			return -1;
+	}
+	return 0;
+}
+
+static void
+zvfs_rollback_added_refs(const uint64_t *handle_ids, uint32_t count)
+{
+	uint32_t i;
+	for (i = 0; i < count; i++) {
+		if (handle_ids[i] != 0)
+			(void)blob_close(handle_ids[i]);
+	}
+}
+
+static int
+zvfs_snapshot_fd_handles(uint64_t **handle_ids_out,
+			 uint32_t **ref_deltas_out,
+			 uint32_t *count_out)
+{
+	struct zvfs_open_file *of, *tmp;
+	uint32_t i = 0;
+	uint32_t count;
+	uint64_t *handle_ids = NULL;
+	uint32_t *ref_deltas = NULL;
+
+	*handle_ids_out = NULL;
+	*ref_deltas_out = NULL;
+	*count_out = 0;
+
+	pthread_mutex_lock(&g_fs.fd_mu);
+	count = (uint32_t)HASH_COUNT(g_fs.fd_table);
+	if (count == 0) {
+		pthread_mutex_unlock(&g_fs.fd_mu);
+		return 0;
+	}
+
+	handle_ids = calloc(count, sizeof(*handle_ids));
+	ref_deltas = calloc(count, sizeof(*ref_deltas));
+	if (!handle_ids || !ref_deltas) {
+		pthread_mutex_unlock(&g_fs.fd_mu);
+		free(handle_ids);
+		free(ref_deltas);
+		errno = ENOMEM;
+		return -1;
+	}
+
+	HASH_ITER(hh, g_fs.fd_table, of, tmp) {
+		if (i >= count)
+			break;
+		handle_ids[i] = of->handle_id;
+		ref_deltas[i] = 1;
+		i++;
+	}
+	pthread_mutex_unlock(&g_fs.fd_mu);
+
+	*handle_ids_out = handle_ids;
+	*ref_deltas_out = ref_deltas;
+	*count_out = i;
+	return 0;
+}
+
+static int
+zvfs_snapshot_fds_in_range(unsigned int first, unsigned int last,
+			   int **fds_out, uint32_t *count_out)
+{
+	struct zvfs_open_file *of, *tmp;
+	uint32_t cap;
+	uint32_t n = 0;
+	int *fds = NULL;
+
+	*fds_out = NULL;
+	*count_out = 0;
+
+	pthread_mutex_lock(&g_fs.fd_mu);
+	cap = (uint32_t)HASH_COUNT(g_fs.fd_table);
+	if (cap == 0) {
+		pthread_mutex_unlock(&g_fs.fd_mu);
+		return 0;
+	}
+
+	fds = calloc(cap, sizeof(*fds));
+	if (!fds) {
+		pthread_mutex_unlock(&g_fs.fd_mu);
+		errno = ENOMEM;
+		return -1;
+	}
+
+	HASH_ITER(hh, g_fs.fd_table, of, tmp) {
+		if (of->fd < 0) {
+			continue;
+		}
+		if ((unsigned int)of->fd < first || (unsigned int)of->fd > last) {
+			continue;
+		}
+		fds[n++] = of->fd;
+	}
+	pthread_mutex_unlock(&g_fs.fd_mu);
+
+	*fds_out = fds;
+	*count_out = n;
+	return 0;
+}
 /* ------------------------------------------------------------------ */
 /* close_range                                                         */
 /* ------------------------------------------------------------------ */
@@ -452,32 +751,53 @@ close_range(unsigned int first, unsigned int last, int flags)
 		return ret;
 	}

 	if (first > last) {
 		errno = EINVAL;
 		ZVFS_HOOK_LEAVE();
 		return -1;
 	}

 	/*
-	 * Walk every fd in the range: zvfs fds go through zvfs_close_impl
-	 * individually, the rest is handed to real_close_range (if the
-	 * kernel supports it). On kernels without close_range (< 5.9),
-	 * close them one by one.
+	 * Snapshot only the fds actually present in the zvfs fd_table,
+	 * instead of scanning the whole [first, last] range (with
+	 * last = UINT_MAX that is extremely slow, and the old loop could
+	 * wrap around).
 	 */
 	int any_err = 0;
 	int inited = 0;
-	for (unsigned int fd = first; fd <= last; fd++) {
-		if (zvfs_is_zvfs_fd((int)fd)) {
-			if (!inited) {
-				zvfs_ensure_init();
-				inited = 1;
-			}
-			if (zvfs_close_impl((int)fd) < 0) any_err = 1;
-		}
-	}
+	int *zvfs_fds = NULL;
+	uint32_t zvfs_fd_count = 0;
+	if (zvfs_snapshot_fds_in_range(first, last, &zvfs_fds, &zvfs_fd_count) < 0) {
+		ZVFS_HOOK_LEAVE();
+		return -1;
+	}
+
+	for (uint32_t i = 0; i < zvfs_fd_count; i++) {
+		if (!inited) {
+			zvfs_ensure_init();
+			inited = 1;
+		}
+		if (zvfs_close_impl(zvfs_fds[i]) < 0) {
+			any_err = 1;
+		}
+	}
+	free(zvfs_fds);

 	/* Let the kernel handle the remaining non-zvfs fds (flags such as CLOEXEC take effect here too) */
 	if (real_close_range) {
 		if (real_close_range(first, last, flags) < 0 && !any_err)
 			any_err = 1;
 	} else {
-		/* Fallback: close non-zvfs fds one by one */
-		for (unsigned int fd = first; fd <= last; fd++) {
+		/* Fallback: close non-zvfs fds one by one (capped at open-max) */
+		unsigned int upper = last;
+		long open_max = sysconf(_SC_OPEN_MAX);
+		if (open_max > 0 && upper >= (unsigned int)open_max) {
+			upper = (unsigned int)open_max - 1;
+		}
+
+		for (unsigned int fd = first; fd <= upper; fd++) {
 			if (!zvfs_is_zvfs_fd((int)fd))
 				real_close((int)fd);
+			if (fd == upper)
+				break;
 		}
 	}
@@ -501,14 +821,24 @@ dup(int oldfd)
 		return ret;
 	}

-	/*
-	 * dup on a zvfs fd is not supported in this version.
-	 * Return ENOTSUP explicitly rather than exposing wrong offset
-	 * semantics.
-	 */
 	zvfs_ensure_init();
-	errno = ENOTSUP;

+	int newfd = real_dup(oldfd);
+	if (newfd < 0) {
+		ZVFS_HOOK_LEAVE();
+		return -1;
+	}
+
+	if (zvfs_dup_attach_newfd(oldfd, newfd, 0) < 0) {
+		int saved = errno;
+		(void)real_close(newfd);
+		errno = saved;
+		ZVFS_HOOK_LEAVE();
+		return -1;
+	}

 	ZVFS_HOOK_LEAVE();
-	return -1;
+	return newfd;
 }
 /* ------------------------------------------------------------------ */
@@ -534,9 +864,32 @@ dup2(int oldfd, int newfd)
 	}

 	zvfs_ensure_init();
-	errno = ENOTSUP;
+	int newfd_was_zvfs = zvfs_is_zvfs_fd(newfd);
+
+	int ret = real_dup2(oldfd, newfd);
+	if (ret < 0) {
+		ZVFS_HOOK_LEAVE();
+		return -1;
+	}
+
+	if (newfd_was_zvfs && zvfs_detach_fd_mapping(newfd, 1) < 0) {
+		int saved = errno;
+		(void)real_close(newfd);
+		errno = saved;
+		ZVFS_HOOK_LEAVE();
+		return -1;
+	}
+
+	if (zvfs_dup_attach_newfd(oldfd, newfd, 0) < 0) {
+		int saved = errno;
+		(void)real_close(newfd);
+		errno = saved;
+		ZVFS_HOOK_LEAVE();
+		return -1;
+	}

 	ZVFS_HOOK_LEAVE();
-	return -1;
+	return ret;
 }
 /* ------------------------------------------------------------------ */
@@ -561,8 +914,92 @@ dup3(int oldfd, int newfd, int flags)
 		return -1;
 	}

+	if ((flags & ~O_CLOEXEC) != 0) {
+		errno = EINVAL;
+		ZVFS_HOOK_LEAVE();
+		return -1;
+	}

 	zvfs_ensure_init();
-	errno = ENOTSUP;
+	int newfd_was_zvfs = zvfs_is_zvfs_fd(newfd);

+	int ret = real_dup3(oldfd, newfd, flags);
+	if (ret < 0) {
+		ZVFS_HOOK_LEAVE();
+		return -1;
+	}
+
+	if (newfd_was_zvfs && zvfs_detach_fd_mapping(newfd, 1) < 0) {
+		int saved = errno;
+		(void)real_close(newfd);
+		errno = saved;
+		ZVFS_HOOK_LEAVE();
+		return -1;
+	}
+
+	int fd_flags = (flags & O_CLOEXEC) ? FD_CLOEXEC : 0;
+	if (zvfs_dup_attach_newfd(oldfd, newfd, fd_flags) < 0) {
+		int saved = errno;
+		(void)real_close(newfd);
+		errno = saved;
+		ZVFS_HOOK_LEAVE();
+		return -1;
+	}

 	ZVFS_HOOK_LEAVE();
-	return -1;
+	return ret;
 }
+/* ------------------------------------------------------------------ */
+/* fork                                                                */
+/* ------------------------------------------------------------------ */
+
+pid_t
+fork(void)
+{
+	ZVFS_HOOK_ENTER();
+
+	if (ZVFS_IN_HOOK()) {
+		pid_t ret = real_fork();
+		ZVFS_HOOK_LEAVE();
+		return ret;
+	}
+
+	uint64_t *handle_ids = NULL;
+	uint32_t *ref_deltas = NULL;
+	uint32_t count = 0;
+
+	if (zvfs_snapshot_fd_handles(&handle_ids, &ref_deltas, &count) < 0) {
+		ZVFS_HOOK_LEAVE();
+		return -1;
+	}
+
+	if (count > 0) {
+		zvfs_ensure_init();
+		if (zvfs_add_ref_batch_or_fallback(handle_ids, ref_deltas, count) < 0) {
+			int saved = errno;
+			free(handle_ids);
+			free(ref_deltas);
+			errno = saved;
+			ZVFS_HOOK_LEAVE();
+			return -1;
+		}
+	}
+
+	pid_t ret = real_fork();
+	if (ret < 0) {
+		int saved = errno;
+		if (count > 0)
+			zvfs_rollback_added_refs(handle_ids, count);
+		free(handle_ids);
+		free(ref_deltas);
+		errno = saved;
+		ZVFS_HOOK_LEAVE();
+		return -1;
+	}
+
+	free(handle_ids);
+	free(ref_deltas);
+	ZVFS_HOOK_LEAVE();
+	return ret;
+}
@@ -12,16 +12,17 @@
  *   non-zvfs path → pass through
  *
  * close:
- *   zvfs fd → openfile ref_count--
- *     on zero: blob_close; if inode->deleted, blob_delete + inode_free
- *     inode ref_count-- (on zero: path_cache_remove + inode_free)
+ *   zvfs fd → blob_sync_md + blob_close
+ *     inode ref_count-- (on zero: blob_delete if inode->deleted, then inode_free)
  *     real_close
  *   non-zvfs fd → pass through
  *
  * dup / dup2 / dup3:
- *   zvfs fd → insert the new fd into fd_table, openfile.ref_count++
- *     (sharing one openfile); real_dup* runs too (the kernel must know the fd)
+ *   zvfs fd → real_dup* + daemon ADD_REF + local openfile/inode reference upkeep
 *   non-zvfs fd → pass through
 *
+ * fork:
+ *   ADD_REF_BATCH is issued for the zvfs handles the child inherits
+ *   (falling back to per-handle ADD_REF on failure)
 */

 /* open family */
@@ -40,6 +41,10 @@ int close_range(unsigned int first, unsigned int last, int flags);
 int dup(int oldfd);
 int dup2(int oldfd, int newfd);
 int dup3(int oldfd, int newfd, int flags);
+pid_t fork(void);

+/* internal helper reused by fcntl(F_DUPFD*) */
+int zvfs_dup_attach_newfd(int oldfd, int newfd, int new_fd_flags);

 /* glibc internal aliases (they share the open/close implementation bodies; just forward) */
 int __open(const char *path, int flags, ...);
@@ -114,6 +114,10 @@ extern void *(*real_mmap64)(void *addr, size_t length, int prot, int flags,
 extern int (*real_munmap)(void *addr, size_t length);
 extern int (*real_msync)(void *addr, size_t length, int flags);

+/* process */
+extern pid_t (*real_fork)(void);
+extern pid_t (*real_vfork)(void);

 /* glibc internal aliases */
 extern int (*real___open)(const char *path, int flags, ...);
@@ -7,6 +7,7 @@
 #include "fs/zvfs.h"
 #include "fs/zvfs_open_file.h"
 #include "fs/zvfs_inode.h"
+#include "proto/ipc_proto.h"
 #include "spdk_engine/io_engine.h"

 #include <errno.h>
@@ -50,7 +51,7 @@ zvfs_pread_impl(struct zvfs_open_file *of,
 	if (count == 0)
 		return 0;

-	if (blob_read(of->handle, offset, buf, count) < 0) {
+	if (blob_read(of->handle_id, offset, buf, count) < 0) {
 		errno = EIO;
 		return -1;
 	}
@@ -74,33 +75,15 @@ zvfs_pwrite_impl(struct zvfs_open_file *of,

 	uint64_t end = offset + count;

-	/*
-	 * If the write range goes past the blob's current physical size,
-	 * resize first. blob_resize is an SPDK-side operation (it may
-	 * allocate new clusters).
-	 */
 	pthread_mutex_lock(&of->inode->mu);
 	uint64_t old_size = of->inode->logical_size;
 	pthread_mutex_unlock(&of->inode->mu);

-	if (end > old_size) {
-		if (blob_resize(of->handle, end) < 0) {
-			errno = EIO;
-			return -1;
-		}
-	}
-
-	if (blob_write(of->handle, offset, buf, count) < 0) {
-		errno = EIO;
+	if (blob_write_ex(of->handle_id, offset, buf, count, ZVFS_WRITE_F_AUTO_GROW) < 0) {
 		return -1;
 	}

 	/* Update logical_size (under the lock; inode_update_size handles the ftruncate) */
-	if (end > old_size) {
-		pthread_mutex_lock(&of->inode->mu);
-		if (end > of->inode->logical_size) /* double-check */
-			inode_update_size(of->inode, of->fd, end);
-		pthread_mutex_unlock(&of->inode->mu);
-	}
+	pthread_mutex_lock(&of->inode->mu);
+	if (end > of->inode->logical_size) /* double-check */
+		inode_update_size(of->inode, of->fd, end);
+	pthread_mutex_unlock(&of->inode->mu);

 	return (ssize_t)count;
 }
@@ -151,7 +134,7 @@ zvfs_iov_pread(struct zvfs_open_file *of,
 	char *tmp = malloc(total_len);
 	if (!tmp) { errno = ENOMEM; return -1; }

-	if (blob_read(of->handle, offset, tmp, total_len) < 0) {
+	if (blob_read(of->handle_id, offset, tmp, total_len) < 0) {
 		free(tmp);
 		errno = EIO;
 		return -1;
@@ -477,36 +460,16 @@ write(int fd, const void *buf, size_t count)
 	uint64_t write_off;

 	if (of->flags & O_APPEND) {
-		/*
-		 * O_APPEND: each write lands at the current logical_size
-		 * (atomically). Hold inode->mu so the read-then-write pair
-		 * is atomic and two fds writing with O_APPEND concurrently
-		 * cannot overwrite each other's data.
-		 */
-		/* --- O_APPEND inline write -------------------------------- */
+		/* O_APPEND: each write lands at the current logical_size. */
 		pthread_mutex_lock(&of->inode->mu);
 		write_off = of->inode->logical_size; /* re-read to avoid TOCTOU */
-		uint64_t end = write_off + count;
 		pthread_mutex_unlock(&of->inode->mu);

-		if (blob_resize(of->handle, end) < 0) {
-			errno = EIO;
-			ZVFS_HOOK_LEAVE();
-			return -1;
-		}
-		if (blob_write(of->handle, write_off, buf, count) < 0) {
-			errno = EIO;
-			ZVFS_HOOK_LEAVE();
-			return -1;
-		}
-
-		pthread_mutex_lock(&of->inode->mu);
-		if (end > of->inode->logical_size)
-			inode_update_size(of->inode, of->fd, end);
-		pthread_mutex_unlock(&of->inode->mu);
-
+		ssize_t r = zvfs_pwrite_impl(of, buf, count, write_off);
+		if (r > 0)
+			of->offset = write_off + (uint64_t)r;
 		ZVFS_HOOK_LEAVE();
-		return (ssize_t)count;
+		return r;

 	} else {
 		write_off = of->offset;
@@ -572,28 +535,14 @@ writev(int fd, const struct iovec *iov, int iovcnt)

 	ssize_t r;
 	if (of->flags & O_APPEND) {
-		/*
-		 * O_APPEND + writev: needs the same atomic sequence as write.
-		 * Compute the total byte count first, then finish via
-		 * iov_pwrite, holding inode->mu for the whole sequence.
-		 */
 		size_t total_len = 0;
 		for (int i = 0; i < iovcnt; i++) total_len += iov[i].iov_len;

+		/* O_APPEND + writev: start writing at the current logical_size. */
 		pthread_mutex_lock(&of->inode->mu);
 		uint64_t write_off = of->inode->logical_size;
-		uint64_t end = write_off + total_len;
 		pthread_mutex_unlock(&of->inode->mu);

-		if (blob_resize(of->handle, end) < 0) { errno = EIO; ZVFS_HOOK_LEAVE(); return -1; }
 		r = zvfs_iov_pwrite(of, iov, iovcnt, write_off);
-
-		if (r > 0) {
-			pthread_mutex_lock(&of->inode->mu);
-			uint64_t new_end = write_off + (uint64_t)r;
-			if (new_end > of->inode->logical_size)
-				inode_update_size(of->inode, of->fd, new_end);
-			pthread_mutex_unlock(&of->inode->mu);
-		}
+		if (r > 0)
+			of->offset = write_off + (uint64_t)r;
 	} else {
 		r = zvfs_iov_pwrite(of, iov, iovcnt, of->offset);
 		if (r > 0) of->offset += (uint64_t)r;
@@ -69,21 +69,21 @@ off_t lseek64(int fd, off_t offset, int whence)


 /*
- * zvfs_truncate_by_inode - truncate through an openfile that has a handle.
- * Pick any openfile that has this inode open and use its handle.
+ * zvfs_truncate_inode_with_handle - truncate through an openfile that has
+ * a handle_id. Pick any openfile that has this inode open and use its handle_id.
  */
 static int
 zvfs_truncate_inode_with_handle(struct zvfs_inode *inode,
 				int real_fd, uint64_t new_size)
 {
-	/* Find an openfile in fd_table pointing at this inode and take its handle */
-	struct zvfs_blob_handle *handle = NULL;
+	/* Find an openfile in fd_table pointing at this inode and take its handle_id */
+	uint64_t handle_id = 0;
 	pthread_mutex_lock(&g_fs.fd_mu);
 	struct zvfs_open_file *of, *tmp;
 	HASH_ITER(hh, g_fs.fd_table, of, tmp) {
 		(void)tmp;
 		if (of->inode == inode) {
-			handle = of->handle;
+			handle_id = of->handle_id;
 			break;
 		}
 	}
@@ -93,20 +93,23 @@ zvfs_truncate_inode_with_handle(struct zvfs_inode *inode,
 	uint64_t old_size = inode->logical_size;
 	pthread_mutex_unlock(&inode->mu);

-	if (new_size != old_size && handle) {
-		if (blob_resize(handle, new_size) < 0) {
+	if (new_size != old_size && handle_id != 0) {
+		if (blob_resize(handle_id, new_size) < 0) {
 			errno = EIO;
 			return -1;
 		}
-	} else if (new_size != old_size && !handle) {
+	} else if (new_size != old_size && handle_id == 0) {
 		/*
 		 * The file is not open anywhere, so a temporary blob_open is
 		 * needed: truncate(path, ...) was called but the file has no fd.
 		 */
-		handle = blob_open(inode->blob_id);
-		if (!handle) { errno = EIO; return -1; }
-		int rc = blob_resize(handle, new_size);
-		blob_close(handle);
+		uint64_t temp_handle_id = 0;
+		if (blob_open(inode->blob_id, &temp_handle_id) < 0) {
+			errno = EIO;
+			return -1;
+		}
+		int rc = blob_resize(temp_handle_id, new_size);
+		blob_close(temp_handle_id);
 		if (rc < 0) { errno = EIO; return -1; }
 	}
@@ -39,7 +39,7 @@ fsync(int fd)
 	 * zvfs has no write buffer; data already reached SPDK storage at
 	 * blob_write time. Call blob_sync_md so blob metadata (size, etc.) persists.
 	 */
-	int r = blob_sync_md(of->handle);
+	int r = blob_sync_md(of->handle_id);
 	if (r < 0) errno = EIO;

 	ZVFS_HOOK_LEAVE();
@@ -75,7 +75,7 @@ fdatasync(int fd)
 	 * For zvfs: data is unbuffered, so syncing the size metadata with
 	 * blob_sync_md is enough. Identical to fsync for now; if data and
 	 * metadata sync ever need to differ, split the logic here.
 	 */
-	int r = blob_sync_md(of->handle);
+	int r = blob_sync_md(of->handle_id);
 	if (r < 0) errno = EIO;

 	ZVFS_HOOK_LEAVE();
 1056  src/proto/ipc_proto.c   (new file; diff suppressed because it is too large)
  265  src/proto/ipc_proto.h   (new file)
@@ -0,0 +1,265 @@
#ifndef __IPC_PROTO_H__
#define __IPC_PROTO_H__

#include <stddef.h>
#include <stdint.h>

#ifdef __cplusplus
extern "C" {
#endif

struct zvfs_conn;
struct zvfs_blob_handle;

enum zvfs_opcode {
	ZVFS_OP_CREATE = 1,
	ZVFS_OP_OPEN,
	ZVFS_OP_READ,
	ZVFS_OP_WRITE,
	ZVFS_OP_RESIZE,
	ZVFS_OP_SYNC_MD,
	ZVFS_OP_CLOSE,
	ZVFS_OP_DELETE,
	ZVFS_OP_ADD_REF,
	ZVFS_OP_ADD_REF_BATCH
};

static inline const char *
cast_opcode2string(uint32_t op)
{
	switch (op) {
	case ZVFS_OP_CREATE:        return "CREATE";
	case ZVFS_OP_OPEN:          return "OPEN";
	case ZVFS_OP_READ:          return "READ";
	case ZVFS_OP_WRITE:         return "WRITE";
	case ZVFS_OP_RESIZE:        return "RESIZE";
	case ZVFS_OP_SYNC_MD:       return "SYNC";
	case ZVFS_OP_CLOSE:         return "CLOSE";
	case ZVFS_OP_DELETE:        return "DELETE";
	case ZVFS_OP_ADD_REF:       return "ADD_REF";
	case ZVFS_OP_ADD_REF_BATCH: return "ADD_REF_BATCH";
	default:                    return "ERROR";
	}
}

#define ZVFS_WRITE_F_AUTO_GROW (1u << 0)

/* Minimal fixed header (synchronous blocking model, no request_id) */
struct zvfs_req_header {
	uint32_t opcode;
	uint32_t payload_len;
};

struct zvfs_resp_header {
	uint32_t opcode;
	int32_t status;
	uint32_t payload_len;
};

/* -------------------- per-op request body -------------------- */

struct zvfs_req_create_body {
	uint64_t size_hint;
};

struct zvfs_req_open_body {
	uint64_t blob_id;
};

struct zvfs_req_read_body {
	uint64_t handle_id;
	uint64_t offset;
	uint64_t length;
};

struct zvfs_req_write_body {
	uint64_t handle_id;
	uint64_t offset;
	uint64_t length;
	uint32_t flags;
	const void *data;
};

struct zvfs_req_resize_body {
	uint64_t handle_id;
	uint64_t new_size;
};

struct zvfs_req_sync_md_body {
	uint64_t handle_id;
};

struct zvfs_req_close_body {
	uint64_t handle_id;
};

struct zvfs_req_delete_body {
	uint64_t blob_id;
};

struct zvfs_add_ref_item {
	uint64_t handle_id;
	uint32_t ref_delta;
};

struct zvfs_req_add_ref_body {
	uint64_t handle_id;
	uint32_t ref_delta;
};

struct zvfs_req_add_ref_batch_body {
	uint32_t item_count;
	const struct zvfs_add_ref_item *items;
};

/* -------------------- per-op response body -------------------- */

struct zvfs_resp_create_body {
	uint64_t blob_id;
	uint64_t handle_id;
};

struct zvfs_resp_open_body {
	uint64_t handle_id;
	uint64_t size;
};

struct zvfs_resp_read_body {
	uint64_t length;
	void *data;
};

struct zvfs_resp_write_body {
	uint64_t bytes_written;
};

/* resize/sync_md/close/delete carry an empty body on success */
size_t zvfs_serialize_resp_resize(uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_resp_resize(const uint8_t *buf, size_t buf_len);
size_t zvfs_serialize_resp_sync_md(uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_resp_sync_md(const uint8_t *buf, size_t buf_len);
size_t zvfs_serialize_resp_close(uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_resp_close(const uint8_t *buf, size_t buf_len);
size_t zvfs_serialize_resp_delete(uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_resp_delete(const uint8_t *buf, size_t buf_len);

/* -------------------- legacy-compatible req/resp -------------------- */

struct zvfs_req {
	uint32_t opcode;

	uint64_t size_hint;
	uint64_t blob_id;
	uint64_t handle_id;

	uint64_t offset;
	uint64_t length;
	uint32_t write_flags;
	void *data;

	uint32_t ref_delta;
	uint32_t add_ref_count;
	struct zvfs_add_ref_item *add_ref_items;

	struct zvfs_conn *conn;
	struct zvfs_blob_handle *handle;
};

struct zvfs_resp {
	uint32_t opcode;
	int32_t status;

	uint64_t blob_id;
	uint64_t handle_id;
	uint64_t size;

	uint64_t length;
	void *data;

	uint64_t bytes_written;

	struct zvfs_conn *conn;
};

/* -------------------- header (de)serialization -------------------- */

size_t zvfs_serialize_req_header(const struct zvfs_req_header *header, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_req_header(const uint8_t *buf, size_t buf_len, struct zvfs_req_header *header);

size_t zvfs_serialize_resp_header(const struct zvfs_resp_header *header, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_resp_header(const uint8_t *buf, size_t buf_len, struct zvfs_resp_header *header);

/* -------------------- request body (de)serialization -------------------- */

size_t zvfs_serialize_req_create(const struct zvfs_req_create_body *body, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_req_create(const uint8_t *buf, size_t buf_len, struct zvfs_req_create_body *body);

size_t zvfs_serialize_req_open(const struct zvfs_req_open_body *body, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_req_open(const uint8_t *buf, size_t buf_len, struct zvfs_req_open_body *body);

size_t zvfs_serialize_req_read(const struct zvfs_req_read_body *body, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_req_read(const uint8_t *buf, size_t buf_len, struct zvfs_req_read_body *body);

size_t zvfs_serialize_req_write(const struct zvfs_req_write_body *body, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_req_write(const uint8_t *buf, size_t buf_len, struct zvfs_req_write_body *body);

size_t zvfs_serialize_req_resize(const struct zvfs_req_resize_body *body, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_req_resize(const uint8_t *buf, size_t buf_len, struct zvfs_req_resize_body *body);

size_t zvfs_serialize_req_sync_md(const struct zvfs_req_sync_md_body *body, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_req_sync_md(const uint8_t *buf, size_t buf_len, struct zvfs_req_sync_md_body *body);

size_t zvfs_serialize_req_close(const struct zvfs_req_close_body *body, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_req_close(const uint8_t *buf, size_t buf_len, struct zvfs_req_close_body *body);

size_t zvfs_serialize_req_delete(const struct zvfs_req_delete_body *body, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_req_delete(const uint8_t *buf, size_t buf_len, struct zvfs_req_delete_body *body);

size_t zvfs_serialize_req_add_ref(const struct zvfs_req_add_ref_body *body, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_req_add_ref(const uint8_t *buf, size_t buf_len, struct zvfs_req_add_ref_body *body);

size_t zvfs_serialize_req_add_ref_batch(const struct zvfs_req_add_ref_batch_body *body, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_req_add_ref_batch(const uint8_t *buf, size_t buf_len, struct zvfs_req_add_ref_batch_body *body);

/* -------------------- response body (de)serialization -------------------- */

size_t zvfs_serialize_resp_create(const struct zvfs_resp_create_body *body, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_resp_create(const uint8_t *buf, size_t buf_len, struct zvfs_resp_create_body *body);

size_t zvfs_serialize_resp_open(const struct zvfs_resp_open_body *body, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_resp_open(const uint8_t *buf, size_t buf_len, struct zvfs_resp_open_body *body);

size_t zvfs_serialize_resp_read(const struct zvfs_resp_read_body *body, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_resp_read(const uint8_t *buf, size_t buf_len, struct zvfs_resp_read_body *body);

size_t zvfs_serialize_resp_write(const struct zvfs_resp_write_body *body, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_resp_write(const uint8_t *buf, size_t buf_len, struct zvfs_resp_write_body *body);

/* -------------------- compatibility wrappers -------------------- */

size_t zvfs_serialize_req(struct zvfs_req *req, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_req(uint8_t *buf, size_t buf_len, struct zvfs_req *req);

size_t zvfs_serialize_resp(struct zvfs_resp *resp, uint8_t *buf, size_t buf_len);
size_t zvfs_deserialize_resp(uint8_t *buf, size_t buf_len, struct zvfs_resp *resp);

#ifdef __cplusplus
}
#endif

#endif
File diff suppressed because it is too large
@@ -2,42 +2,20 @@
#define __ZVFS_IO_ENGINE_H__

#include <stdint.h>
#include <sys/types.h>
#include <spdk/blob.h>
#include <stddef.h>

// zvfs_blob_handle: low-level blob info; the file-level size is not stored here (maintained by the upper layer)
typedef struct zvfs_blob_handle {
    spdk_blob_id id;
    struct spdk_blob *blob;
    uint64_t size;
    void *dma_buf;
    uint64_t dma_buf_size;
} zvfs_blob_handle_t;

int io_engine_init(void);

typedef struct zvfs_spdk_io_engine {
    struct spdk_bs_dev *bs_dev;
    struct spdk_blob_store *bs;
    struct spdk_thread *md_thread;
    uint64_t io_unit_size;
    uint64_t cluster_size;
    int reactor_count;
} zvfs_spdk_io_engine_t;

typedef struct zvfs_tls_ctx {
    struct spdk_thread *thread;
    struct spdk_io_channel *channel;
} zvfs_tls_ctx_t;

int io_engine_init(const char *bdev_name);

struct zvfs_blob_handle *blob_create(uint64_t size_hint); // create and open; returns a handle
struct zvfs_blob_handle *blob_open(uint64_t blob_id);     // open an existing blob; returns a handle
int blob_write(struct zvfs_blob_handle *handle, uint64_t offset, const void *buf, size_t len);
int blob_read(struct zvfs_blob_handle *handle, uint64_t offset, void *buf, size_t len);
int blob_resize(struct zvfs_blob_handle *handle, uint64_t new_size);
int blob_sync_md(struct zvfs_blob_handle *handle);
int blob_close(struct zvfs_blob_handle *handle); // close the blob* held by this handle
int blob_delete(uint64_t blob_id);               // delete the whole blob (no handle required)
int blob_create(uint64_t size_hint, uint64_t *blob_id_out, uint64_t *handle_id_out);
int blob_open(uint64_t blob_id, uint64_t *handle_id_out);
int blob_write_ex(uint64_t handle_id, uint64_t offset, const void *buf, size_t len, uint32_t write_flags);
int blob_write(uint64_t handle_id, uint64_t offset, const void *buf, size_t len);
int blob_read(uint64_t handle_id, uint64_t offset, void *buf, size_t len);
int blob_resize(uint64_t handle_id, uint64_t new_size);
int blob_sync_md(uint64_t handle_id);
int blob_close(uint64_t handle_id);
int blob_delete(uint64_t blob_id);
int blob_add_ref(uint64_t handle_id, uint32_t ref_delta);
int blob_add_ref_batch(const uint64_t *handle_ids, const uint32_t *ref_deltas, uint32_t count);

#endif // __ZVFS_IO_ENGINE_H__
@@ -7,7 +7,7 @@
     "method": "bdev_malloc_create",
     "params": {
       "name": "Malloc0",
-      "num_blocks": 262140,
+      "num_blocks": 1048576,
       "block_size": 512
     }
   }
@@ -1,4 +1,4 @@
-SUBDIRS := ioengine_test hook
+SUBDIRS := hook_test daemon_test

.PHONY: all clean $(SUBDIRS)
12  tests/daemon_test/Makefile  Normal file
@@ -0,0 +1,12 @@
BIN_DIR := $(abspath $(CURDIR)/../bin)
PROTO_DIR := $(abspath $(CURDIR)/../../src/proto)

CFLAGS := -I$(abspath $(CURDIR)/../../src)

all:
	gcc -g -o $(BIN_DIR)/ipc_echo_test ipc_echo_test.c
	gcc -g $(CFLAGS) -o $(BIN_DIR)/ipc_zvfs_test ipc_zvfs_test.c $(PROTO_DIR)/ipc_proto.c

clean:
	rm -rf $(BIN_DIR)/ipc_echo_test $(BIN_DIR)/ipc_zvfs_test
33  tests/daemon_test/ipc_echo_test.c  Normal file
@@ -0,0 +1,33 @@
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

int main()
{
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    struct sockaddr_un addr;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, "/tmp/zvfs.sock");

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("connect");
        return 1;
    }

    const char *msg = "hello reactor\n";

    write(fd, msg, strlen(msg));

    char buf[4096];

    int n = read(fd, buf, sizeof(buf));
    if (n < 0) {
        perror("read");
        return 1;
    }

    printf("echo: %.*s\n", n, buf);

    close(fd);

    return 0;
}
265  tests/daemon_test/ipc_zvfs_test.c  Normal file
@@ -0,0 +1,265 @@
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>
#include "proto/ipc_proto.h"

#define SOCKET_PATH "/tmp/zvfs.sock"
#define BUF_SIZE 4096

int connect_to_server() {
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket");
        return -1;
    }

    struct sockaddr_un addr;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, SOCKET_PATH, sizeof(addr.sun_path) - 1);

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("connect");
        close(fd);
        return -1;
    }

    return fd;
}

// -------------------- operation helpers --------------------

void do_create(int fd) {
    struct zvfs_req req;
    memset(&req, 0, sizeof(req));
    req.opcode = ZVFS_OP_CREATE;
    req.size_hint = 1024; // 1 KB

    uint8_t buf[BUF_SIZE];
    size_t n = zvfs_serialize_req(&req, buf, sizeof(buf));
    if (n == 0) { fprintf(stderr, "serialize failed\n"); return; }

    if (write(fd, buf, n) != (ssize_t)n) { perror("write"); return; }

    uint8_t resp_buf[BUF_SIZE];
    ssize_t r = read(fd, resp_buf, sizeof(resp_buf));
    if (r <= 0) { perror("read"); return; }

    struct zvfs_resp resp;
    memset(&resp, 0, sizeof(resp));
    size_t consumed = zvfs_deserialize_resp(resp_buf, r, &resp);
    if (consumed == 0) { fprintf(stderr, "deserialize failed\n"); return; }

    printf("Received CREATE response: status=%d, blob_id=%lu, handle_id=%lu\n",
           resp.status, resp.blob_id, resp.handle_id);

    if (resp.data) free(resp.data);
}

void do_open(int fd, uint64_t blob_id) {
    struct zvfs_req req;
    memset(&req, 0, sizeof(req));
    req.opcode = ZVFS_OP_OPEN;
    req.blob_id = blob_id;

    uint8_t buf[BUF_SIZE];
    size_t n = zvfs_serialize_req(&req, buf, sizeof(buf));
    if (n == 0) { fprintf(stderr, "serialize failed\n"); return; }

    if (write(fd, buf, n) != (ssize_t)n) { perror("write"); return; }

    uint8_t resp_buf[BUF_SIZE];
    ssize_t r = read(fd, resp_buf, sizeof(resp_buf));
    if (r <= 0) { perror("read"); return; }

    struct zvfs_resp resp;
    memset(&resp, 0, sizeof(resp));
    size_t consumed = zvfs_deserialize_resp(resp_buf, r, &resp);
    if (consumed == 0) { fprintf(stderr, "deserialize failed\n"); return; }

    printf("Received OPEN response: status=%d, handle_id=%lu, size=%lu\n",
           resp.status, resp.handle_id, resp.size);

    if (resp.data) free(resp.data);
}

void do_read(int fd, uint64_t handle_id, uint64_t offset, uint64_t length) {
    struct zvfs_req req;
    memset(&req, 0, sizeof(req));
    req.opcode = ZVFS_OP_READ;
    req.handle_id = handle_id;
    req.offset = offset;
    req.length = length;

    uint8_t buf[BUF_SIZE];
    size_t n = zvfs_serialize_req(&req, buf, sizeof(buf));
    if (n == 0) { fprintf(stderr, "serialize failed\n"); return; }

    if (write(fd, buf, n) != (ssize_t)n) { perror("write"); return; }

    uint8_t resp_buf[BUF_SIZE];
    ssize_t r = read(fd, resp_buf, sizeof(resp_buf));
    if (r <= 0) { perror("read"); return; }

    struct zvfs_resp resp;
    memset(&resp, 0, sizeof(resp));
    size_t consumed = zvfs_deserialize_resp(resp_buf, r, &resp);
    if (consumed == 0) { fprintf(stderr, "deserialize failed\n"); return; }

    printf("Received READ response: status=%d, length=%lu\n",
           resp.status, resp.length);

    if (resp.data) {
        printf("Data: ");
        for (size_t i = 0; i < resp.length; i++)
            printf("%02x ", ((uint8_t *)resp.data)[i]);
        printf("\n");
        free(resp.data);
    }
}

void do_write(int fd, uint64_t handle_id, uint64_t offset,
              const char *data, size_t len, uint32_t write_flags) {
    struct zvfs_req req;
    memset(&req, 0, sizeof(req));
    req.opcode = ZVFS_OP_WRITE;
    req.handle_id = handle_id;
    req.offset = offset;
    req.length = len;
    req.write_flags = write_flags;
    req.data = (void *)data;

    uint8_t buf[BUF_SIZE];
    size_t n = zvfs_serialize_req(&req, buf, sizeof(buf));
    if (n == 0) { fprintf(stderr, "serialize failed\n"); return; }

    if (write(fd, buf, n) != (ssize_t)n) { perror("write"); return; }

    uint8_t resp_buf[BUF_SIZE];
    ssize_t r = read(fd, resp_buf, sizeof(resp_buf));
    if (r <= 0) { perror("read"); return; }

    struct zvfs_resp resp;
    memset(&resp, 0, sizeof(resp));
    size_t consumed = zvfs_deserialize_resp(resp_buf, r, &resp);
    if (consumed == 0) { fprintf(stderr, "deserialize failed\n"); return; }

    printf("Received WRITE response: status=%d, bytes_written=%lu\n",
           resp.status, resp.bytes_written);

    if (resp.data) free(resp.data);
}

void do_close(int fd, uint64_t handle_id) {
    struct zvfs_req req;
    memset(&req, 0, sizeof(req));
    req.opcode = ZVFS_OP_CLOSE;
    req.handle_id = handle_id;

    uint8_t buf[BUF_SIZE];
    size_t n = zvfs_serialize_req(&req, buf, sizeof(buf));
    if (n == 0) { fprintf(stderr, "serialize failed\n"); return; }
    if (write(fd, buf, n) != (ssize_t)n) { perror("write"); return; }

    uint8_t resp_buf[BUF_SIZE];
    ssize_t r = read(fd, resp_buf, sizeof(resp_buf));
    if (r <= 0) { perror("read"); return; }

    struct zvfs_resp resp;
    memset(&resp, 0, sizeof(resp));
    size_t consumed = zvfs_deserialize_resp(resp_buf, r, &resp);
    if (consumed == 0) { fprintf(stderr, "deserialize failed\n"); return; }

    printf("Received CLOSE response: status=%d\n", resp.status);
    if (resp.data) free(resp.data);
}

void do_delete(int fd, uint64_t blob_id) {
    struct zvfs_req req;
    memset(&req, 0, sizeof(req));
    req.opcode = ZVFS_OP_DELETE;
    req.blob_id = blob_id;

    uint8_t buf[BUF_SIZE];
    size_t n = zvfs_serialize_req(&req, buf, sizeof(buf));
    if (n == 0) { fprintf(stderr, "serialize failed\n"); return; }
    if (write(fd, buf, n) != (ssize_t)n) { perror("write"); return; }

    uint8_t resp_buf[BUF_SIZE];
    ssize_t r = read(fd, resp_buf, sizeof(resp_buf));
    if (r <= 0) { perror("read"); return; }

    struct zvfs_resp resp;
    memset(&resp, 0, sizeof(resp));
    size_t consumed = zvfs_deserialize_resp(resp_buf, r, &resp);
    if (consumed == 0) { fprintf(stderr, "deserialize failed\n"); return; }

    printf("Received DELETE response: status=%d\n", resp.status);
    if (resp.data) free(resp.data);
}

void do_resize(int fd, uint64_t handle_id, uint64_t new_size) {
    struct zvfs_req req;
    memset(&req, 0, sizeof(req));
    req.opcode = ZVFS_OP_RESIZE;
    req.handle_id = handle_id;
    req.size_hint = new_size;

    uint8_t buf[BUF_SIZE];
    size_t n = zvfs_serialize_req(&req, buf, sizeof(buf));
    if (n == 0) { fprintf(stderr, "serialize failed\n"); return; }
    if (write(fd, buf, n) != (ssize_t)n) { perror("write"); return; }

    uint8_t resp_buf[BUF_SIZE];
    ssize_t r = read(fd, resp_buf, sizeof(resp_buf));
    if (r <= 0) { perror("read"); return; }

    struct zvfs_resp resp;
    memset(&resp, 0, sizeof(resp));
    size_t consumed = zvfs_deserialize_resp(resp_buf, r, &resp);
    if (consumed == 0) { fprintf(stderr, "deserialize failed\n"); return; }

    printf("Received RESIZE response: status=%d\n", resp.status);
    if (resp.data) free(resp.data);
}

// -------------------- main --------------------

int main() {
    int fd = connect_to_server();
    if (fd < 0) return 1;

    printf("Connected to server at %s\n", SOCKET_PATH);
    printf("Commands:\n create\n open <blob>\n read <handle> <offset> <len>\n write <handle> <offset> <data>\n writeg <handle> <offset> <data>\n close <handle>\n delete <blob>\n resize <handle> <size>\n quit\n");

    char line[256];
    while (1) {
        printf("> ");
        if (!fgets(line, sizeof(line), stdin)) break;

        char cmd[32];
        uint64_t a, b, c;
        char data[128];

        if (sscanf(line, "%31s", cmd) != 1) continue;

        if (strcmp(cmd, "quit") == 0) break;
        else if (strcmp(cmd, "create") == 0) do_create(fd);
        else if (strcmp(cmd, "open") == 0 && sscanf(line, "%*s %lu", &a) == 1) do_open(fd, a);
        else if (strcmp(cmd, "read") == 0 && sscanf(line, "%*s %lu %lu %lu", &a, &b, &c) == 3) do_read(fd, a, b, c);
        else if (strcmp(cmd, "write") == 0 && sscanf(line, "%*s %lu %lu %127s", &a, &b, data) == 3)
            do_write(fd, a, b, data, strlen(data), 0);
        else if (strcmp(cmd, "writeg") == 0 && sscanf(line, "%*s %lu %lu %127s", &a, &b, data) == 3)
            do_write(fd, a, b, data, strlen(data), ZVFS_WRITE_F_AUTO_GROW);
        else if (strcmp(cmd, "close") == 0 && sscanf(line, "%*s %lu", &a) == 1) do_close(fd, a);
        else if (strcmp(cmd, "delete") == 0 && sscanf(line, "%*s %lu", &a) == 1) do_delete(fd, a);
        else if (strcmp(cmd, "resize") == 0 && sscanf(line, "%*s %lu %lu", &a, &b) == 2) do_resize(fd, a, b);
        else printf("Unknown or invalid command\n");
    }

    close(fd);
    return 0;
}
@@ -1,43 +0,0 @@
# SPDX-License-Identifier: BSD-3-Clause

SPDK_ROOT_DIR := $(abspath $(CURDIR)/../../spdk)
include $(SPDK_ROOT_DIR)/mk/spdk.common.mk
include $(SPDK_ROOT_DIR)/mk/spdk.modules.mk
include $(SPDK_ROOT_DIR)/mk/spdk.app_vars.mk

# output directory
BIN_DIR := $(abspath $(CURDIR)/../bin)

TEST_BINS := \
	ioengine_single_blob_test \
	ioengine_multi_blob_test \
	ioengine_same_blob_mt_test

COMMON_SRCS := \
	test_common.c \
	../../src/spdk_engine/io_engine.c \
	../../src/common/utils.c

SPDK_LIB_LIST = $(ALL_MODULES_LIST) event event_bdev
LIBS += $(SPDK_LIB_LINKER_ARGS)

CFLAGS += -I$(abspath $(CURDIR)/../../src) -I$(CURDIR)

.PHONY: all clean
all: $(BIN_DIR) $(addprefix $(BIN_DIR)/,$(TEST_BINS))

# create the bin directory
$(BIN_DIR):
	mkdir -p $(BIN_DIR)

$(BIN_DIR)/ioengine_single_blob_test: ioengine_single_blob_test.c $(COMMON_SRCS) $(SPDK_LIB_FILES) $(ENV_LIBS)
	$(CC) $(CFLAGS) -o $@ $< $(COMMON_SRCS) $(LDFLAGS) $(LIBS) $(ENV_LDFLAGS) $(SYS_LIBS)

$(BIN_DIR)/ioengine_multi_blob_test: ioengine_multi_blob_test.c $(COMMON_SRCS) $(SPDK_LIB_FILES) $(ENV_LIBS)
	$(CC) $(CFLAGS) -o $@ $< $(COMMON_SRCS) $(LDFLAGS) $(LIBS) $(ENV_LDFLAGS) $(SYS_LIBS)

$(BIN_DIR)/ioengine_same_blob_mt_test: ioengine_same_blob_mt_test.c $(COMMON_SRCS) $(SPDK_LIB_FILES) $(ENV_LIBS)
	$(CC) $(CFLAGS) -o $@ $< $(COMMON_SRCS) $(LDFLAGS) $(LIBS) $(ENV_LDFLAGS) $(SYS_LIBS)

clean:
	rm -f $(addprefix $(BIN_DIR)/,$(TEST_BINS))
@@ -1,106 +0,0 @@
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "spdk_engine/io_engine.h"
#include "test_common.h"

#define MULTI_BLOB_COUNT 3

int main(void) {
    int rc = 0;
    const char *bdev_name = getenv("SPDK_BDEV_NAME");
    struct zvfs_blob_handle *handles[MULTI_BLOB_COUNT] = {0};
    uint64_t ids[MULTI_BLOB_COUNT] = {0};
    uint64_t cluster = 0;
    void *wbuf = NULL;
    void *rbuf = NULL;
    int i = 0;

    if (!bdev_name) {
        bdev_name = "Malloc0";
    }
    if (io_engine_init(bdev_name) != 0) {
        fprintf(stderr, "TEST2: io_engine_init failed (bdev=%s)\n", bdev_name);
        return 1;
    }

    printf("[TEST2] single thread / multi blob\n");

    handles[0] = blob_create(0);
    if (!handles[0]) {
        fprintf(stderr, "TEST2: create first blob failed\n");
        return 1;
    }
    ids[0] = handles[0]->id;
    cluster = handles[0]->size;
    if (cluster == 0) {
        fprintf(stderr, "TEST2: invalid cluster size\n");
        rc = 1;
        goto out;
    }
    if (blob_resize(handles[0], cluster * 2) != 0) {
        fprintf(stderr, "TEST2: resize first blob failed\n");
        rc = 1;
        goto out;
    }

    for (i = 1; i < MULTI_BLOB_COUNT; i++) {
        handles[i] = blob_create(cluster * 2);
        if (!handles[i]) {
            fprintf(stderr, "TEST2: create blob %d failed\n", i);
            rc = 1;
            goto out;
        }
        ids[i] = handles[i]->id;
    }

    if (alloc_aligned_buf(&wbuf, cluster) != 0 || alloc_aligned_buf(&rbuf, cluster) != 0) {
        fprintf(stderr, "TEST2: alloc aligned buffer failed\n");
        rc = 1;
        goto out;
    }

    for (i = 0; i < MULTI_BLOB_COUNT; i++) {
        fill_pattern((uint8_t *)wbuf, cluster, (uint8_t)(0x20 + i));
        memset(rbuf, 0, cluster);

        if (blob_write(handles[i], 0, wbuf, cluster) != 0) {
            fprintf(stderr, "TEST2: blob_write[%d] failed\n", i);
            rc = 1;
            goto out;
        }
        if (blob_read(handles[i], 0, rbuf, cluster) != 0) {
            fprintf(stderr, "TEST2: blob_read[%d] failed\n", i);
            rc = 1;
            goto out;
        }
        if (memcmp(wbuf, rbuf, cluster) != 0) {
            fprintf(stderr, "TEST2: blob[%d] readback mismatch\n", i);
            rc = 1;
            goto out;
        }
    }

out:
    for (i = 0; i < MULTI_BLOB_COUNT; i++) {
        if (handles[i]) {
            (void)blob_close(handles[i]);
        }
    }
    for (i = 0; i < MULTI_BLOB_COUNT; i++) {
        if (ids[i] != 0) {
            (void)blob_delete(ids[i]);
        }
    }
    free(wbuf);
    free(rbuf);

    if (rc == 0) {
        printf("[TEST2] PASS\n");
        return 0;
    }
    printf("[TEST2] FAIL\n");
    return 1;
}
@@ -1,147 +0,0 @@
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "spdk_engine/io_engine.h"
#include "test_common.h"

#define THREAD_COUNT 4

struct mt_case_arg {
    struct zvfs_blob_handle *handle;
    uint64_t cluster_size;
    uint64_t offset;
    uint8_t seed;
    pthread_barrier_t *barrier;
    int rc;
};

static void *mt_case_worker(void *arg) {
    struct mt_case_arg *ctx = (struct mt_case_arg *)arg;
    void *wbuf = NULL;
    void *rbuf = NULL;

    if (alloc_aligned_buf(&wbuf, ctx->cluster_size) != 0 ||
        alloc_aligned_buf(&rbuf, ctx->cluster_size) != 0) {
        free(wbuf);
        free(rbuf);
        ctx->rc = 1;
        return NULL;
    }

    fill_pattern((uint8_t *)wbuf, ctx->cluster_size, ctx->seed);
    (void)pthread_barrier_wait(ctx->barrier);

    if (blob_write(ctx->handle, ctx->offset, wbuf, ctx->cluster_size) != 0) {
        ctx->rc = 1;
        goto out;
    }
    if (blob_read(ctx->handle, ctx->offset, rbuf, ctx->cluster_size) != 0) {
        ctx->rc = 1;
        goto out;
    }
    if (memcmp(wbuf, rbuf, ctx->cluster_size) != 0) {
        ctx->rc = 1;
        goto out;
    }

    ctx->rc = 0;

out:
    free(wbuf);
    free(rbuf);
    return NULL;
}

int main(void) {
    int rc = 0;
    const char *bdev_name = getenv("SPDK_BDEV_NAME");
    int i = 0;
    struct zvfs_blob_handle *h = NULL;
    uint64_t blob_id = 0;
    uint64_t cluster = 0;
    pthread_t tids[THREAD_COUNT];
    struct mt_case_arg args[THREAD_COUNT];
    pthread_barrier_t barrier;
    int barrier_inited = 0;

    if (!bdev_name) {
        bdev_name = "Malloc0";
    }
    if (io_engine_init(bdev_name) != 0) {
        fprintf(stderr, "TEST3: io_engine_init failed (bdev=%s)\n", bdev_name);
        return 1;
    }

    printf("[TEST3] multi thread / same blob\n");

    h = blob_create(0);
    if (!h) {
        fprintf(stderr, "TEST3: blob_create failed\n");
        return 1;
    }
    blob_id = h->id;
    cluster = h->size;
    if (cluster == 0) {
        fprintf(stderr, "TEST3: invalid cluster size\n");
        rc = 1;
        goto out;
    }
    if (blob_resize(h, cluster * THREAD_COUNT) != 0) {
        fprintf(stderr, "TEST3: blob_resize failed\n");
        rc = 1;
        goto out;
    }

    if (pthread_barrier_init(&barrier, NULL, THREAD_COUNT) != 0) {
        fprintf(stderr, "TEST3: barrier init failed\n");
        rc = 1;
        goto out;
    }
    barrier_inited = 1;

    for (i = 0; i < THREAD_COUNT; i++) {
        args[i].handle = h;
        args[i].cluster_size = cluster;
        args[i].offset = cluster * (uint64_t)i;
        args[i].seed = (uint8_t)(0x40 + i);
        args[i].barrier = &barrier;
        args[i].rc = 1;
        if (pthread_create(&tids[i], NULL, mt_case_worker, &args[i]) != 0) {
            fprintf(stderr, "TEST3: pthread_create[%d] failed\n", i);
            rc = 1;
            while (--i >= 0) {
                pthread_join(tids[i], NULL);
            }
            goto out;
        }
    }

    for (i = 0; i < THREAD_COUNT; i++) {
        pthread_join(tids[i], NULL);
        if (args[i].rc != 0) {
            fprintf(stderr, "TEST3: worker[%d] failed\n", i);
            rc = 1;
        }
    }

out:
    if (barrier_inited) {
        (void)pthread_barrier_destroy(&barrier);
    }
    if (h) {
        (void)blob_close(h);
    }
    if (blob_id != 0) {
        (void)blob_delete(blob_id);
    }

    if (rc == 0) {
        printf("[TEST3] PASS\n");
        return 0;
    }
    printf("[TEST3] FAIL\n");
    return 1;
}
@@ -1,136 +0,0 @@
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "spdk_engine/io_engine.h"
#include "test_common.h"

int main(void) {
    int rc = 0;
    const char *bdev_name = getenv("SPDK_BDEV_NAME");
    struct zvfs_blob_handle *h = NULL;
    struct zvfs_blob_handle *reopen = NULL;
    uint64_t blob_id = 0;
    uint64_t cluster = 0;
    void *wbuf = NULL;
    void *rbuf = NULL;

    if (!bdev_name) {
        bdev_name = "Malloc0";
    }
    if (io_engine_init(bdev_name) != 0) {
        fprintf(stderr, "TEST1: io_engine_init failed (bdev=%s)\n", bdev_name);
        return 1;
    }

    printf("[TEST1] single thread / single blob\n");

    h = blob_create(0);
    if (!h) {
        fprintf(stderr, "TEST1: blob_create failed\n");
        return 1;
    }
    blob_id = h->id;
    cluster = h->size;
    if (cluster == 0) {
        fprintf(stderr, "TEST1: invalid cluster size\n");
        rc = 1;
        goto out;
    }

    rc = blob_resize(h, cluster * 2);
    if (rc != 0) {
        fprintf(stderr, "TEST1: blob_resize failed: %d\n", rc);
        rc = 1;
        goto out;
    }

    rc = alloc_aligned_buf(&wbuf, cluster);
    if (rc != 0) {
        fprintf(stderr, "TEST1: alloc write buf failed: %d\n", rc);
        rc = 1;
        goto out;
    }
    rc = alloc_aligned_buf(&rbuf, cluster);
    if (rc != 0) {
        fprintf(stderr, "TEST1: alloc read buf failed: %d\n", rc);
        rc = 1;
        goto out;
    }
    fill_pattern((uint8_t *)wbuf, cluster, 0x11);

    rc = blob_write(h, 0, wbuf, cluster);
    if (rc != 0) {
        fprintf(stderr, "TEST1: blob_write failed: %d\n", rc);
        rc = 1;
        goto out;
    }

    rc = blob_read(h, 0, rbuf, cluster);
    if (rc != 0) {
        fprintf(stderr, "TEST1: blob_read failed: %d\n", rc);
        rc = 1;
        goto out;
    }
    if (memcmp(wbuf, rbuf, cluster) != 0) {
        fprintf(stderr, "TEST1: readback mismatch\n");
        rc = 1;
        goto out;
    }

    rc = blob_sync_md(h);
    if (rc != 0) {
        fprintf(stderr, "TEST1: blob_sync_md failed: %d\n", rc);
        rc = 1;
        goto out;
    }

    rc = blob_close(h);
    if (rc != 0) {
        fprintf(stderr, "TEST1: blob_close failed: %d\n", rc);
        rc = 1;
        goto out;
    }
    h = NULL;

    reopen = blob_open(blob_id);
    if (!reopen) {
        fprintf(stderr, "TEST1: blob_open(reopen) failed\n");
        rc = 1;
        goto out;
    }

    memset(rbuf, 0, cluster);
    rc = blob_read(reopen, 0, rbuf, cluster);
    if (rc != 0) {
        fprintf(stderr, "TEST1: reopen blob_read failed: %d\n", rc);
        rc = 1;
        goto out;
    }
    if (memcmp(wbuf, rbuf, cluster) != 0) {
        fprintf(stderr, "TEST1: reopen readback mismatch\n");
        rc = 1;
        goto out;
    }

out:
    if (reopen) {
        (void)blob_close(reopen);
    }
    if (h) {
        (void)blob_close(h);
    }
    if (blob_id != 0) {
        (void)blob_delete(blob_id);
    }
    free(wbuf);
    free(rbuf);

    if (rc == 0) {
        printf("[TEST1] PASS\n");
        return 0;
    }
    printf("[TEST1] FAIL\n");
    return 1;
}
@@ -1,20 +0,0 @@
#include "test_common.h"

#include <stdlib.h>
#include <string.h>

int alloc_aligned_buf(void **buf, size_t len) {
    int rc = posix_memalign(buf, 4096, len);
    if (rc != 0) {
        return -rc;
    }
    memset(*buf, 0, len);
    return 0;
}

void fill_pattern(uint8_t *buf, size_t len, uint8_t seed) {
    size_t i = 0;
    for (i = 0; i < len; i++) {
        buf[i] = (uint8_t)(seed + (uint8_t)i);
    }
}
@@ -1,10 +0,0 @@
#ifndef __IOENGINE_TEST_COMMON_H__
#define __IOENGINE_TEST_COMMON_H__

#include <stddef.h>
#include <stdint.h>

int alloc_aligned_buf(void **buf, size_t len);
void fill_pattern(uint8_t *buf, size_t len, uint8_t seed);

#endif // __IOENGINE_TEST_COMMON_H__
16  zvfs架构图.excalidraw.svg  Normal file
File diff suppressed because one or more lines are too long
After Width: | Height: | Size: 95 KiB