FLock FL Alliance Client

FL Alliance is a decentralized federated learning protocol where multiple participants collaboratively train a shared model — without ever exposing their private data. Participants stake tokens, train on local datasets, and are rewarded or slashed based on the quality of their contributions, all enforced by on-chain smart contracts.

This repository is the production client for FL Alliance. It handles the full lifecycle — staking, model download, local training, parameter upload, voting, aggregation, and reward claiming — so you can participate with a single command.

Key Features

  • Four operating modes — on-chain testnet, local chain dev, fully offline, and chainless pure FL
  • Two runtime backends — Docker container or local Python process; the repository defaults to runtime.mode=docker, while OCM addon deployments typically override to local (direct client, no FLocKit sidecar)
  • Seal encryption — optional end-to-end encryption of model parameters via Mysten Labs' Seal
  • LAN-ready — run multi-client simulations on a single machine or across a local network
  • Cross-platform — Linux, macOS (Apple Silicon / Intel), and Windows 11 + WSL2; see Runtime Modes for the support matrix
  • Concurrent backend runs — the FastAPI backend supports multiple simultaneous client runs with per-run port and environment isolation
  • Structured logging — rotating file logs with full source tracing for production debugging; subprocess (FLockit) errors and warnings are forwarded to the parent log in real time so model-side failures surface without a manual tail -f
  • Operator-friendly failure modes — SIGHUP / SIGTERM / SIGINT are caught and logged before exit (no more silent deaths from SSH disconnects), and every long wait emits an INFO heartbeat naming the polled URL and the subprocess log path
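The rotating-file logging described above can be sketched with the standard library. This is an illustrative sketch, not the client's actual configuration; the logger name and default path are placeholders.

```python
import logging
import os
from logging.handlers import RotatingFileHandler

def make_logger(path: str = "output/client.log") -> logging.Logger:
    """Rotating file logger with source tracing (module:line) in each record."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    logger = logging.getLogger("fl_client")
    logger.setLevel(logging.INFO)
    handler = RotatingFileHandler(path, maxBytes=10 * 1024 * 1024, backupCount=5)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s %(name)s %(module)s:%(lineno)d %(message)s"
    ))
    logger.addHandler(handler)
    return logger
```

`backupCount=5` with 10 MB files caps disk usage at roughly 60 MB while keeping enough history for post-mortem debugging.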

Prerequisites

| Requirement | When needed |
| --- | --- |
| Python >= 3.11 | Always |
| uv (recommended) or pip | Always |
| Docker | Only for docker runtime mode (default) |
| $FLOCK tokens (get whitelisted (TBD)) | Online mode only |
| Base Sepolia ETH (Alchemy Faucet) | Online mode only |

Local simulation and pure FL modes require no tokens, no ETH, and no internet (after initial dependency install).


Quick Start

The default quick start is the online testnet (on-chain) flow.

1. Clone and install

git clone https://github.com/FLock-io/FL-Alliance-Client.git
cd FL-Alliance-Client

# Using uv (recommended)
uv sync

# Or using pip
pip install -r requirements.txt

uv sync is the recommended path for the repository-managed environment. requirements.txt is kept as a compatibility install path and may pull a heavier dependency set for model/runtime workflows.

2. Configure

cp .env.onchain.example .env
# Edit .env and set:
#   PRIVATE_KEY=<your wallet private key>
#   BLOCKCHAIN_RPC=<Base Sepolia RPC URL>   # WEB3_RPC_URL is also supported
#   TOKEN_ADDRESS=<FlockToken address>
#   TASK_ADDRESS=<FlockTask address>
# Optional but recommended on testnet/mainnet:
#   EXPECTED_CHAIN_ID=84532                  # Base Sepolia; refuses to start on mismatch
#   BLOCKCHAIN_TX_RECEIPT_TIMEOUT=180        # Seconds for tx receipt (default 120)
#   FLOCK_CONTRACTS_FILE=/path/to/contracts.json   # Highest-priority contracts source
# Optional:
#   HF_TOKEN=<token for gated models>
# Optional — bump for LLM tasks on a cold cache (HF download + venv install):
#   PROCESS_STARTUP_TIMEOUT=7200             # Seconds before model startup is declared failed (default 1800)
#   PROCESS_RESPONSE_TIMEOUT=7200            # Seconds per train / evaluate / aggregate SDK call (default 3600)
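Before launching, the required keys can be sanity-checked with a few lines of Python. This is an illustrative sketch, not part of the client; the key names match the template above, including the WEB3_RPC_URL alias.

```python
def check_env(path: str = ".env") -> list[str]:
    """Return the required keys that are missing or empty in a dotenv file."""
    required = {"PRIVATE_KEY", "BLOCKCHAIN_RPC", "TOKEN_ADDRESS", "TASK_ADDRESS"}
    found: dict[str, str] = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                found[key.strip()] = value.strip()
    missing = sorted(k for k in required if not found.get(k))
    # WEB3_RPC_URL is an accepted alias for BLOCKCHAIN_RPC
    if "BLOCKCHAIN_RPC" in missing and found.get("WEB3_RPC_URL"):
        missing.remove("BLOCKCHAIN_RPC")
    return missing
```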

3. Run a client

python main.py -c config/conf.yaml \
  --task-address <TASK_ADDRESS> \
  --dataset <DATASET_PATH> \
  --hf-token <HF_TOKEN> \
  --gpu

Use a custom mounted env file (for example in Kubernetes):

python main.py -c config/conf.yaml --env-file /data/.env

Example:

python main.py -c config/conf.yaml \
  --task-address 0x47B0397C6ae306002788D093b29bcD2EDAd19924 \
  --dataset data/asr_sarawakmalay_whisper_format_client_ids.json \
  --hf-token $HF_TOKEN \
  --gpu

Long-running training: wrap the command in tmux / nohup / systemd so the client survives SSH disconnects. The client now installs SIGHUP / SIGTERM handlers that log the signal name before exit, but SIGKILL (OOM-killer) still terminates the process silently — a session manager is the only reliable defence.
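The signal-logging behaviour described above amounts to roughly the following sketch (illustrative; the client's actual handler may differ):

```python
import logging
import signal
import sys

def install_signal_logging(logger: logging.Logger) -> None:
    """Log the signal name before exiting, so deaths are visible in the log.

    SIGKILL (e.g. from the OOM-killer) cannot be caught, so it is not listed.
    """
    def handler(signum, frame):
        logger.warning("received %s, shutting down", signal.Signals(signum).name)
        sys.exit(128 + signum)

    for sig in (signal.SIGHUP, signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, handler)
```

Exiting with `128 + signum` follows the shell convention for signal-caused deaths, so supervisors such as systemd report the cause correctly.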

4. Scale to multiple clients (optional)

# Use a different PRIVATE_KEY and runtime.port per process
python main.py -c config/conf.yaml \
  --task-address <TASK_ADDRESS> \
  --dataset <DATASET_PATH> \
  --hf-token <HF_TOKEN> \
  --gpu \
  --override runtime.port=<UNIQUE_PORT>
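When scripting many clients on one machine, a unique port can be reserved programmatically instead of hard-coded. A hypothetical helper (there is a small race window between picking the port and the client binding it, which is usually acceptable for simulations):

```python
import socket

def free_port() -> int:
    """Ask the OS for an unused TCP port (bind to port 0, read back the number)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]
```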

That's it. You are now running on Base Sepolia with the incentive-enabled FL Alliance flow.

Container image publishing

Recommended: publish both latest and a git-SHA tag, then deploy the SHA tag from flock-addon.

First, capture the commit SHA to use as the immutable tag:

export IMAGE_SHA=$(git rev-parse --short=12 HEAD)

Build locally:

make image-build IMAGE_OWNER=ray-ruisun IMAGE_TAG=latest IMAGE_IMMUTABLE_TAG="$IMAGE_SHA"

Inspect the local image:

make image-inspect IMAGE_OWNER=ray-ruisun IMAGE_TAG=latest IMAGE_IMMUTABLE_TAG="$IMAGE_SHA"

Push manually:

make image-login GHCR_USER="$GHCR_USER" GHCR_PAT="$GHCR_PAT"
make image-push IMAGE_OWNER=ray-ruisun IMAGE_TAG=latest IMAGE_IMMUTABLE_TAG="$IMAGE_SHA"

One command publish flow:

make image-publish \
  IMAGE_OWNER=ray-ruisun \
  IMAGE_TAG=latest \
  IMAGE_IMMUTABLE_TAG="$IMAGE_SHA" \
  GHCR_USER="$GHCR_USER" \
  GHCR_PAT="$GHCR_PAT"

If Docker on your machine requires sudo, use:

make image-publish \
  DOCKER='sudo docker' \
  IMAGE_OWNER=ray-ruisun \
  IMAGE_TAG=latest \
  IMAGE_IMMUTABLE_TAG="$IMAGE_SHA" \
  GHCR_USER="$GHCR_USER" \
  GHCR_PAT="$GHCR_PAT"

Print the exact published tags:

make image-print IMAGE_OWNER=ray-ruisun IMAGE_TAG=latest IMAGE_IMMUTABLE_TAG="$IMAGE_SHA"

Recommended handoff to flock-addon:

export IMAGE_TAG=$(git rev-parse --short=12 HEAD)

Automatic publishing:

  • GitHub Actions now publishes to ghcr.io/<repository-owner-lowercase>/fl-alliance-client
  • pushes on main and version tags such as v0.1.0 will publish automatically
  • workflow_dispatch can also publish on demand

Dataset format: DATASET accepts a single file or a directory. main.py first stages every input into a temporary directory by copying (shutil.copytree / shutil.copy2); the runtime backend then exposes that staging directory to the model — Docker via a read-only bind mount at /app/data, and runtime.mode=local via a symlink (falling back to an NTFS junction or a full copy on Windows). See Configuration and Runtime Modes for details.
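The staging step described above is roughly the following (a sketch of the copy logic, not the client's exact code):

```python
import shutil
import tempfile
from pathlib import Path

def stage_dataset(dataset: str) -> Path:
    """Copy a dataset file or directory into a fresh staging directory."""
    staging = Path(tempfile.mkdtemp(prefix="fl_dataset_"))
    src = Path(dataset)
    if src.is_dir():
        shutil.copytree(src, staging / src.name)
    else:
        shutil.copy2(src, staging / src.name)
    return staging
```

Copying (rather than mounting the original path directly) means the model never sees, and can never mutate, the operator's source data.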

Prefer local simulation first? Use offline mode:

cp .env.local.example .env
make chain MODEL_DEFINITION_HASH=$(sha256sum model.tar.gz | cut -d' ' -f1)
make sim1 DATASET=data/train.jsonl

On macOS, replace sha256sum with: shasum -a 256 model.tar.gz | cut -d' ' -f1
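A portable alternative to sha256sum / shasum is a short Python one-off (standard library only):

```python
import hashlib

def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hex SHA-256 of a file, read in 1 MB chunks so large archives fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()
```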

For all scenarios (testnet, dev mode, offline mode, pure FL, and LAN deployment), see the Run Playbook.


Operating Modes

| Mode | Chain | Storage | Internet | Config | Command |
| --- | --- | --- | --- | --- | --- |
| Online (testnet) | Base Sepolia | S3 | Required | config/conf.yaml | python main.py -c config/conf.yaml ... |
| Dev (local chain + object storage) | Local Anvil | S3 Signer / direct S3-compatible + HuggingFace | Required | config/simulation-online.yaml | make dev1 |
| Offline (fully local) | Local Anvil | Local filesystem | Not needed | config/simulation.yaml | make sim1 |
| Pure FL (chainless) | None | Local filesystem | Not needed | config/pure-fl.yaml | make pure-fl1 |

All modes use the same client code — only the configuration differs. Each mode has a dedicated YAML config template and corresponding Makefile targets (dev/sim: up to 20 clients, pure-fl: 3 clients by default).

Choosing a mode:

  • Just exploring? Start with Offline mode (make sim1) — zero external dependencies.
  • Developing with real storage? Use Dev mode (make dev1) — local chain + S3 Signer or direct S3-compatible storage.
  • Running on testnet? Use Online mode (python main.py -c config/conf.yaml ...) — requires $FLOCK tokens and Base Sepolia ETH.
  • No blockchain needed? Use Pure FL mode (make pure-fl1) — coordination via shared files only.

For step-by-step instructions for each mode, see the Run Playbook.


Project Structure

.
├── client/                  # Core FL client runtime and managers
│   ├── contracts/           # Smart contract wrappers and ABIs
│   ├── managers/            # Container, storage, sync, metrics, coordination managers
│   ├── encryption/          # Seal encryption integration
│   └── logging_utils.py     # Centralized logging configuration
├── contracts/               # Solidity contracts and deployment scripts
├── config/                  # Configuration templates (one per mode)
│   ├── conf.yaml            # Online mode (Base Sepolia)
│   ├── simulation-online.yaml # Dev mode: local chain + online storage
│   ├── simulation.yaml      # Offline mode: local chain + local storage
│   └── pure-fl.yaml         # Pure FL mode (chainless)
├── docs/                    # Detailed documentation
├── .env.onchain.example     # .env template for online mode
├── .env.local.example       # .env template for local chain modes
├── main.py                  # Client entry point
├── docker-compose.yml       # Local chain + deployer services
├── Makefile                 # Developer shortcuts
└── output/                  # Runtime logs and task outputs (git-ignored)

Documentation

| Document | Description |
| --- | --- |
| Configuration | Config files, env vars, YAML settings, CLI overrides |
| Run Playbook | Step-by-step commands for every scenario |
| Runtime Modes | Docker / local execution backends |
| Local Chain Simulation | Offline and LAN deployment, shared storage setup (NFS/SMB/sshfs) |
| Pure FL Mode | Chainless federated learning without incentive mechanism |
| Encryption & Storage | Seal encryption, S3/Nami/local storage backends |
| FL Alliance Protocol | Protocol deep-dive and smart contract lifecycle |
| Backend API | FastAPI service for runs, metrics, events, artifacts, and task admin |

Makefile Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| DATASET | (required) | Path to dataset file or directory |
| GPU | true | Enable GPU acceleration (true/false) |
| CHAIN_HOST | localhost | Anvil chain host IP (for remote LAN clients) |
| TOKEN_ADDRESS | (auto) | FlockToken contract address (auto-detected from $FLOCK_CONTRACTS_FILE, /data/contracts.json, or data/contracts.json — first match wins) |
| TASK_ADDRESS | (auto) | FlockTask contract address (auto-detected from the same set as TOKEN_ADDRESS) |
| MODEL_DEFINITION_HASH | (required for make chain) | SHA-256 hash of model archive |
| ROUNDS | 10 | Number of training rounds |
| MIN_PARTICIPANTS | 3 | Minimum participants per round |

Development

This project uses uv for Python package management:

uv sync                        # install dependencies
uv run python main.py          # run in project environment
uv add <package>               # add a dependency

Before submitting changes:

make test
uv run python -m compileall main.py client

If pytest fails during startup because an external plugin is auto-loaded by your environment, run:

PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 uv run pytest -q

Troubleshooting

Local mode — dependency or module errors

When using runtime.mode: local, the client creates virtual environments in tmp_envs/ for each task. If you see ModuleNotFoundError or similar after updating baseline packages, remove the cache and retry:

rm -rf tmp_envs/

By default, local runtime environments are preserved to speed up restarts. Set FL_KEEP_MODEL_ENV=false to force cleanup on each stop.

Process exited silently after "Waiting for model to start..."

Almost always one of:

  1. SSH session dropped — the parent received SIGHUP and was killed before any handler could run on legacy builds. Recent builds log the signal name before exit; either way, run inside tmux / nohup / systemd so the client outlives the shell.
  2. OOM-killer: SIGKILL cannot be caught. Confirm with sudo dmesg -T | grep -iE 'killed process|out of memory'. Lower the batch size, lower model precision, or move to a larger box.
  3. Genuine startup timeout — bump PROCESS_STARTUP_TIMEOUT (default 1800 seconds) and PROCESS_RESPONSE_TIMEOUT (default 3600). Both are env-var overridable; LLM cold-starts (HF download + venv install + GPU load) frequently need 2 hours.

In every case the model subprocess uses start_new_session=True, so it survives parent death — inspect output/task_outputs/process_*.log to see exactly how far it got.
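The detachment described above can be sketched as follows (illustrative; the command and log path are placeholders, not the client's actual invocation):

```python
import subprocess

def launch_detached(cmd: list[str], log_path: str) -> subprocess.Popen:
    """Start a subprocess in its own session so it survives the parent's death.

    stdout/stderr go to a log file that can be inspected after the fact.
    """
    log = open(log_path, "ab")
    return subprocess.Popen(
        cmd,
        stdout=log,
        stderr=subprocess.STDOUT,
        start_new_session=True,  # new session: the parent's SIGHUP is not inherited
    )
```

`start_new_session=True` calls setsid() in the child, detaching it from the controlling terminal, which is why the model-side log keeps growing even after the parent dies.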


License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.


References

FL Alliance is based on academic research by the FLock team. See the paper: Defending Against Poisoning Attacks in Federated Learning With Blockchain.
