
Commit abfd42e

Move iceberg_rust_ffi to the RustyIceberg.jl repo
1 parent 728669d commit abfd42e

14 files changed, +5601 -200 lines changed


.github/workflows/CI.yml

Lines changed: 10 additions & 0 deletions
@@ -32,6 +32,16 @@ jobs:
             arch: x64
     steps:
       - uses: actions/checkout@v3.5.0
+      - name: Setup Rust
+        uses: actions-rust-lang/setup-rust-toolchain@v1
+        with:
+          toolchain: stable
+      - name: Build Rust FFI library
+        run: |
+          cd iceberg_rust_ffi
+          cargo build --release
+      - name: Set ICEBERG_RUST_LIB environment variable
+        run: echo "ICEBERG_RUST_LIB=${{ github.workspace }}/iceberg_rust_ffi/target/release" >> $GITHUB_ENV
       - name: Initialize containers
         uses: gacts/run-and-post-run@v1
         with:
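
Note that the workflow sets `ICEBERG_RUST_LIB` to a directory (`.../target/release`), not to a library file. How the Julia package resolves the file inside that directory is not shown in this commit, so the following is only a minimal sketch under the assumption that a consumer joins the directory with the platform-specific shared-library name. The helper `library_path` is hypothetical; the crate name `iceberg_rust_ffi` is taken from the `LIB_NAME` value in the Makefile of this commit.

```rust
use std::env;
use std::env::consts::{DLL_PREFIX, DLL_SUFFIX};
use std::path::{Path, PathBuf};

/// Hypothetical helper: joins a library directory with the platform-specific
/// shared-library file name (lib*.so, lib*.dylib, or *.dll).
fn library_path(lib_dir: &Path, crate_name: &str) -> PathBuf {
    lib_dir.join(format!("{}{}{}", DLL_PREFIX, crate_name, DLL_SUFFIX))
}

fn main() {
    // ICEBERG_RUST_LIB is expected to hold a directory, as in the CI step.
    match env::var("ICEBERG_RUST_LIB") {
        Ok(dir) => println!("{}", library_path(Path::new(&dir), "iceberg_rust_ffi").display()),
        Err(_) => eprintln!("ICEBERG_RUST_LIB is not set"),
    }
}
```

Deriving the file name from `DLL_PREFIX` and `DLL_SUFFIX` avoids hard-coding `.dylib`, which matters because the CI job may run on a different operating system than a developer machine.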

.gitignore

Lines changed: 5 additions & 1 deletion
@@ -1 +1,5 @@
-**/.env
+**/.env
+iceberg_rust_ffi/target
+iceberg_rust_ffi/integration_test
+**/*.dylib
+**/.claude

AGENTS.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+CLAUDE.md

CLAUDE.md

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. Most of the commands apply inside the `iceberg_rust_ffi` folder; the exception is the `RustyIceberg.jl` section below.
+
+## Common Development Commands
+
+### Building
+- `cargo build` - Build the Rust FFI library (debug)
+- `cargo build --release --no-default-features` - Build the Rust FFI library (production)
+- `make build-lib` - Build the Rust library and generate C header using cbindgen
+
+### Testing
+- `./run_integration_test.sh` - Recommended way to run the full integration test locally (builds everything and runs the test with colored output; requires containers)
+- `make all` - Build everything and run integration test (requires containers)
+- `make run-containers` - Start Docker containers for S3 testing
+- `make test` - Run the integration test (requires build and containers)
+- `make stop-containers` - Stop Docker containers
+- `cargo test` - Run Rust unit tests
+
+### Code Quality
+- `cargo fmt` - Format Rust code
+- `cargo clippy` - Run Rust linter
+- `cargo check` - Quick check for Rust compilation errors
+
+### Cleanup
+- `make clean` - Clean build artifacts (but keep target directory)
+- `make clean-all` - Clean everything including target directory
+
+## Architecture Overview
+
+This project provides a **Foreign Function Interface (FFI)** for Apache Iceberg, allowing C programs (and other languages through C bindings) to access Iceberg tables stored in object storage systems like S3. The majority of the infrastructure relies on the `object_store_ffi` crate. If you don't have access to that crate's code locally, access it at this [URL](https://github.com/RelationalAI/object_store_ffi).
+
+### Key Components
+
+#### Rust Library (`src/lib.rs`)
+- **Core FFI Implementation**: Exposes Iceberg functionality through C-compatible functions
+- **Async Runtime Integration**: Uses Tokio for async operations, with `object_store_ffi` for callback handling. Async operations rely on the `export_runtime_op!` macro. The macro takes a sync block, which acts as a builder function where all deserialization and conversion is done, and passes the result to an async block. Every parameter must implement the `Send` trait in order to be passed to the async block
+- **Julia Integration**: Conditional compilation features for Julia interop (`julia` feature flag)
+- **Memory Management**: Safe FFI patterns with proper cleanup functions
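
The split between a synchronous builder step and a `Send`-bounded asynchronous step, described in the Async Runtime Integration bullet, can be illustrated without the macro itself. The sketch below is not `export_runtime_op!` and does not use Tokio: it substitutes `std::thread::spawn`, which places a comparable `Send + 'static` bound on the work it runs. `ScanParams`, `build_params`, `run_op`, and the path string are all hypothetical.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;
use std::thread;

/// Owned values only, so the struct is `Send` and can move to another thread.
struct ScanParams {
    table_path: String,
    limit: usize,
}

/// Synchronous builder step: validate and convert raw C inputs into owned Rust data.
unsafe fn build_params(path: *const c_char, limit: usize) -> Result<ScanParams, String> {
    if path.is_null() {
        return Err(String::from("path is null"));
    }
    let table_path = unsafe { CStr::from_ptr(path) }
        .to_str()
        .map_err(|e| e.to_string())?
        .to_owned();
    Ok(ScanParams { table_path, limit })
}

/// Stand-in for the async block: `thread::spawn` demands `Send + 'static`,
/// so borrowed C pointers must already have been converted by the builder.
fn run_op(params: ScanParams) -> thread::JoinHandle<String> {
    thread::spawn(move || format!("scan {} (limit {})", params.table_path, params.limit))
}

fn main() {
    // Placeholder path, used only for illustration.
    let path = CString::new("s3://bucket/table").expect("no interior NUL");
    let params = unsafe { build_params(path.as_ptr(), 10) }.expect("valid input");
    println!("{}", run_op(params).join().expect("worker panicked"));
}
```

Doing all pointer handling in the builder means the spawned work never touches caller-owned memory, which is what makes the `Send` requirement easy to satisfy.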
+
+#### C header (`include/iceberg_rust_ffi.h`)
+- **Manual Generation**: C header is not generated right now. Whenever you make a change in the Rust library, examine whether the header should be updated.
+- **C99 Compatible**: Ensures compatibility with standard C compilers
+- **Response Structures**: Async operations return response structures with context for cancellation
+
+#### Integration Test (`tests/integration_test.c`)
+- **Dynamic Loading**: Uses `dlopen`/`dlsym` to load the Rust library at runtime
+- **Async API Testing**: Tests the new async API with response structures and callbacks
+- **S3 Integration**: Connects to S3 (or MinIO) to test real object storage operations
+
+### FFI Design Patterns
+
+#### Async Operations with Callbacks
+The FFI uses an async callback pattern where:
+1. C calls an async function with a response structure
+2. Rust spawns the operation and returns immediately
+3. When complete, Rust invokes a callback to signal completion
+4. C polls or waits for completion, then checks the response structure
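
The four steps above can be sketched in a few lines. This is an illustration only, not the crate's API: it uses a plain OS thread instead of Tokio, and `ExampleResponse`, `ExampleCallback`, `example_add_async`, and `notify` are hypothetical names. The caller side, which would normally be C, is played by `run_demo`.

```rust
use std::os::raw::{c_int, c_void};
use std::sync::mpsc;
use std::thread;

/// Hypothetical caller-owned response structure, filled in by the worker.
#[repr(C)]
pub struct ExampleResponse {
    pub status: c_int, // 0 on success
    pub value: i64,
}

pub type ExampleCallback = extern "C" fn(handle: *mut c_void);

/// Raw pointers are not `Send`; the wrapper records the caller's promise
/// that both pointers stay valid until the callback has run.
struct SendPtr<T>(*mut T);
unsafe impl<T> Send for SendPtr<T> {}
impl<T> SendPtr<T> {
    fn get(self) -> *mut T {
        self.0
    }
}

/// Steps 1-3: accept the response pointer, spawn the work, return immediately.
pub unsafe extern "C" fn example_add_async(
    a: i64,
    b: i64,
    response: *mut ExampleResponse,
    callback: ExampleCallback,
    handle: *mut c_void,
) -> c_int {
    if response.is_null() {
        return -1;
    }
    let (response, handle) = (SendPtr(response), SendPtr(handle));
    thread::spawn(move || {
        let (response, handle) = (response.get(), handle.get());
        unsafe {
            (*response).value = a + b;
            (*response).status = 0;
        }
        callback(handle); // signal completion to the caller
    });
    0
}

/// Caller-side callback: reclaims the boxed sender passed as `handle` and signals.
extern "C" fn notify(handle: *mut c_void) {
    let tx = unsafe { Box::from_raw(handle as *mut mpsc::Sender<()>) };
    let _ = tx.send(());
}

/// Step 4, playing the role of the C caller: wait, then read the response.
fn run_demo() -> i64 {
    let (tx, rx) = mpsc::channel::<()>();
    let handle = Box::into_raw(Box::new(tx)) as *mut c_void;
    let mut response = ExampleResponse { status: -1, value: 0 };
    let rc = unsafe { example_add_async(2, 3, &mut response, notify, handle) };
    assert_eq!(rc, 0);
    rx.recv().expect("worker exited without signalling");
    assert_eq!(response.status, 0);
    response.value
}

fn main() {
    println!("{}", run_demo());
}
```

The caller must not read the response structure before the callback fires; in this sketch the channel receive provides that ordering.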
+
+#### Memory Management
+- **Owned Pointers**: Rust allocates, C receives opaque pointers
+- **Cleanup Functions**: Every allocated resource has a corresponding `_free` function
+- **Error Handling**: Errors are returned via response structures with allocated error strings
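
The ownership rules above follow a common Rust FFI idiom, sketched here with hypothetical names (`ExampleTable`, `example_table_new`, `example_table_free`, `example_table_name`, `example_cstring_free`); none of them are functions from this crate. A real export would additionally carry a `no_mangle` attribute, omitted here to keep the sketch edition-independent.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical resource; stands in for whatever the real library hands out.
pub struct ExampleTable {
    name: String,
}

/// Rust allocates and gives the caller an opaque pointer it must later free.
pub extern "C" fn example_table_new() -> *mut ExampleTable {
    Box::into_raw(Box::new(ExampleTable { name: String::from("demo") }))
}

/// Matching cleanup function: takes ownership back and drops it. Null is ignored.
pub unsafe extern "C" fn example_table_free(table: *mut ExampleTable) {
    if !table.is_null() {
        drop(unsafe { Box::from_raw(table) });
    }
}

/// Returns a heap-allocated C string (or null on failure);
/// the caller releases it with `example_cstring_free`.
pub unsafe extern "C" fn example_table_name(table: *const ExampleTable) -> *mut c_char {
    if table.is_null() {
        return std::ptr::null_mut();
    }
    let table = unsafe { &*table };
    match CString::new(table.name.as_str()) {
        Ok(s) => s.into_raw(),
        Err(_) => std::ptr::null_mut(),
    }
}

/// Frees a string previously returned by this library. Null is ignored.
pub unsafe extern "C" fn example_cstring_free(s: *mut c_char) {
    if !s.is_null() {
        drop(unsafe { CString::from_raw(s) });
    }
}

fn main() {
    let table = example_table_new();
    unsafe {
        let name = example_table_name(table);
        println!("{}", CStr::from_ptr(name).to_string_lossy());
        example_cstring_free(name);
        example_table_free(table);
    }
}
```

Strings allocated by Rust must be returned to Rust for deallocation rather than passed to C's `free`, because the two sides may use different allocators.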
+
+#### Context and Cancellation
+- Operations return a context pointer that can be used for cancellation
+- `iceberg_cancel_context` and `iceberg_destroy_context` functions manage operation lifecycle
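
One way a cancel/destroy pair like this can work is with a shared flag that the running operation checks between units of work. The sketch below is an assumption about the general technique, not the implementation behind `iceberg_cancel_context` or `iceberg_destroy_context`; all `example_*` names and `process_batches` are hypothetical.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical context; opaque to the caller, which only holds a pointer.
pub struct ExampleContext {
    cancelled: AtomicBool,
}

pub extern "C" fn example_context_new() -> *mut ExampleContext {
    Box::into_raw(Box::new(ExampleContext { cancelled: AtomicBool::new(false) }))
}

/// Requests cancellation; returns 0 on success, -1 for a null context.
pub unsafe extern "C" fn example_cancel_context(ctx: *const ExampleContext) -> i32 {
    if ctx.is_null() {
        return -1;
    }
    let ctx = unsafe { &*ctx };
    ctx.cancelled.store(true, Ordering::SeqCst);
    0
}

/// Releases the context; it must not be used afterwards. Null is ignored.
pub unsafe extern "C" fn example_destroy_context(ctx: *mut ExampleContext) {
    if !ctx.is_null() {
        drop(unsafe { Box::from_raw(ctx) });
    }
}

/// Worker-side loop: checks the flag before each unit of work
/// and returns how many units were completed.
fn process_batches(ctx: &ExampleContext, batches: usize) -> usize {
    let mut done = 0;
    for _ in 0..batches {
        if ctx.cancelled.load(Ordering::SeqCst) {
            break;
        }
        done += 1;
    }
    done
}

fn main() {
    let ctx = example_context_new();
    unsafe {
        println!("before cancel: {}", process_batches(&*ctx, 3));
        example_cancel_context(ctx);
        println!("after cancel: {}", process_batches(&*ctx, 3));
        example_destroy_context(ctx);
    }
}
```

Cancelling and destroying are separate calls so that a caller can request cancellation while the operation still holds the context, and free it only once the operation has finished.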
+
+### S3 Configuration
+
+The integration test expects AWS S3 credentials through environment variables:
+- `AWS_ACCESS_KEY_ID`
+- `AWS_SECRET_ACCESS_KEY`
+- `AWS_REGION` or `AWS_DEFAULT_REGION`
+- `AWS_ENDPOINT_URL` (for MinIO or custom S3-compatible storage)
+
+Use the `.env` file or export variables directly. The test is designed to fail with permission errors when S3 paths are inaccessible, which confirms the API is working correctly.
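
Reading these variables can be sketched as follows. The struct `S3Settings` and the function `load_settings` are hypothetical, and two details are assumptions not stated above: that `AWS_REGION` takes precedence over `AWS_DEFAULT_REGION` when both are set, and that a region is required while the endpoint URL is optional.

```rust
use std::env;

#[derive(Debug, PartialEq)]
struct S3Settings {
    access_key_id: String,
    secret_access_key: String,
    region: String,
    endpoint_url: Option<String>,
}

/// Builds the settings from any key lookup, so the logic can be exercised
/// without touching the process environment.
fn load_settings<F>(get: F) -> Result<S3Settings, String>
where
    F: Fn(&str) -> Option<String>,
{
    let require = |key: &str| get(key).ok_or_else(|| format!("missing {}", key));
    Ok(S3Settings {
        access_key_id: require("AWS_ACCESS_KEY_ID")?,
        secret_access_key: require("AWS_SECRET_ACCESS_KEY")?,
        // Assumption: AWS_REGION wins when both region variables are set.
        region: get("AWS_REGION")
            .or_else(|| get("AWS_DEFAULT_REGION"))
            .ok_or_else(|| String::from("missing AWS_REGION or AWS_DEFAULT_REGION"))?,
        // Only needed for MinIO or other S3-compatible storage.
        endpoint_url: get("AWS_ENDPOINT_URL"),
    })
}

fn main() {
    match load_settings(|key| env::var(key).ok()) {
        // The secret is deliberately not printed.
        Ok(s) => println!("region={} endpoint={:?}", s.region, s.endpoint_url),
        Err(e) => eprintln!("configuration error: {}", e),
    }
}
```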
+
+### Build System
+
+#### Cargo Features
+- Default features: `["julia"]`
+- `julia` feature: Enables Julia thread adoption and GC integration
+- Integration tests use `--no-default-features` to avoid Julia dependencies
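
Feature-gated code of this kind is usually selected with `cfg` attributes. The sketch below shows the mechanism only; `runtime_mode` is a hypothetical function, and the actual Julia-specific code in the crate is not shown in this commit excerpt.

```rust
/// Compiled only when the crate is built with the `julia` feature enabled.
#[cfg(feature = "julia")]
fn runtime_mode() -> &'static str {
    "julia"
}

/// Fallback compiled when the feature is off, e.g. with `--no-default-features`.
#[cfg(not(feature = "julia"))]
fn runtime_mode() -> &'static str {
    "standalone"
}

fn main() {
    println!("built in {} mode", runtime_mode());
}
```

Because `julia` is a default feature, a plain `cargo build` selects the first definition, while the integration-test build with `--no-default-features` selects the second.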
+
+### RustyIceberg.jl
+
+Whenever you make API changes in `iceberg_rust_ffi`, make the corresponding changes in its parent folder. The parent folder is home to the Julia package, which provides Julia bindings on top of the FFI.
+Once changes are made there, test them by:
+1. Running `cargo build` (with default features) for `iceberg_rust_ffi`.
+2. Invoking `ICEBERG_RUST_LIB=<path_to_iceberg_rust_ffi_folder>/target/debug julia --project=. examples/basic_usage.jl` in the RustyIceberg.jl directory.
+
+## Development Notes
+
+### Working with FFI
+- Always check for null pointers in C code before dereferencing
+- Use the provided `_free` functions to avoid memory leaks
+- Error messages are allocated strings that must be freed with `iceberg_destroy_cstring`
+
+### Testing Changes
+Run the integration test after making changes to verify the FFI still works:
+```bash
+./run_integration_test.sh
+```
+
+### Object Store Integration
+This crate depends on `object_store_ffi` for async runtime management and callback handling. The integration provides:
+- Cross-platform async runtime setup
+- Callback infrastructure for async operations
+- Context management for cancellation support

Makefile

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
+.PHONY: all run-containers stop-containers build build-debug build-release test repl clean clean-all help
+
+# Rust library configuration
+RUST_FFI_DIR = iceberg_rust_ffi
+BUILD_TYPE ?= debug
+TARGET_DIR = $(RUST_FFI_DIR)/target/$(BUILD_TYPE)
+LIB_NAME = libiceberg_rust_ffi.dylib
+RUST_LIB_PATH = $(TARGET_DIR)/$(LIB_NAME)
+
+# Julia thread configuration (only set if JULIA_NUM_THREADS is defined)
+ifdef JULIA_NUM_THREADS
+JULIA_THREADS_ENV = JULIA_NUM_THREADS=$(JULIA_NUM_THREADS)
+else
+JULIA_THREADS_ENV =
+endif
+
+# Default target
+all: build
+
+# Start docker containers
+run-containers:
+	cd docker && docker-compose up -d && sleep 10
+
+# Stop docker containers
+stop-containers:
+	cd docker && docker-compose down
+
+# Build the Rust FFI library (debug by default, use BUILD_TYPE=release for release build)
+build:
+ifeq ($(BUILD_TYPE),debug)
+	cd $(RUST_FFI_DIR) && cargo build
+else
+	cd $(RUST_FFI_DIR) && cargo build --release
+endif
+
+# Build debug version
+build-debug:
+	$(MAKE) BUILD_TYPE=debug build
+
+# Build release version
+build-release:
+	$(MAKE) BUILD_TYPE=release build
+
+# Run tests (requires .env file)
+test: build
+	@if [ ! -f .env ]; then \
+		echo "Error: .env file not found. Please create a .env file with required environment variables."; \
+		exit 1; \
+	fi
+	@set -a && . ./.env && set +a && \
+	export ICEBERG_RUST_LIB=$(TARGET_DIR) && \
+	$(JULIA_THREADS_ENV) julia --project=. -e 'using Pkg; Pkg.test()'
+
+# Start Julia REPL with environment configured (requires .env file)
+repl: build
+	@if [ ! -f .env ]; then \
+		echo "Error: .env file not found. Please create a .env file with required environment variables."; \
+		exit 1; \
+	fi
+	@set -a && . ./.env && set +a && \
+	export ICEBERG_RUST_LIB=$(TARGET_DIR) && \
+	$(JULIA_THREADS_ENV) julia --project=.
+
+# Clean build artifacts
+clean:
+	$(MAKE) -C $(RUST_FFI_DIR) clean
+
+# Clean everything
+clean-all:
+	$(MAKE) -C $(RUST_FFI_DIR) clean-all
+
+# Show help
+help:
+	@echo "Available targets:"
+	@echo "  all             - Build the Rust FFI library (default)"
+	@echo "  build           - Build the Rust FFI library (debug by default; use BUILD_TYPE=release for release build)"
+	@echo "  build-debug     - Build the Rust FFI library in debug mode"
+	@echo "  build-release   - Build the Rust FFI library in release mode"
+	@echo "  test            - Run Julia tests (requires .env file and runs build first)"
+	@echo "  repl            - Start Julia REPL with environment configured (requires .env file)"
+	@echo "  run-containers  - Start docker containers"
+	@echo "  stop-containers - Stop docker containers"
+	@echo "  clean           - Clean build artifacts"
+	@echo "  clean-all       - Clean everything including target directory"
+	@echo "  help            - Show this help message"
+	@echo ""
+	@echo "Examples:"
+	@echo "  make test                        - Build in debug mode and run tests"
+	@echo "  make BUILD_TYPE=release test     - Build in release mode and run tests"
+	@echo "  make BUILD_TYPE=release repl     - Build in release mode and start REPL"
+	@echo "  JULIA_NUM_THREADS=8 make test    - Run tests with 8 Julia threads"
+	@echo "  JULIA_NUM_THREADS=4 make repl    - Start REPL with 4 Julia threads"
