114 commits
0c07585
the modifications done for the static part
o0n1x Mar 25, 2025
c6969d0
removed miniforge
o0n1x Mar 25, 2025
ab32a68
Delete Miniforge3-Linux-x86_64.sh
o0n1x Mar 25, 2025
d2bfc90
integrated the updated/refactored read_network section
o0n1x Apr 7, 2025
0f1cc93
duplicated flwr example with network
o0n1x Apr 7, 2025
93e0687
fixed typo
o0n1x Apr 7, 2025
a27b625
added the process of removing old networktemp files before starting g…
o0n1x Apr 7, 2025
f8a148f
added a function to do configmap given a folder
o0n1x Apr 9, 2025
a8a564b
moved where the networktemp folder is checked to a correct place
o0n1x Apr 9, 2025
cf58bc3
progress on updating configmap saving into a folder for each client…
o0n1x Apr 9, 2025
22ea382
completed part where configmap files are generated
o0n1x Apr 9, 2025
f925d56
changed so that the new configmap creator is used
o0n1x Apr 9, 2025
223d42a
fixed import missing
o0n1x Apr 9, 2025
651b0df
quick fix
o0n1x Apr 10, 2025
1b405c2
removed residual code
o0n1x Apr 10, 2025
fc82acd
quick fix for a variable
o0n1x Apr 10, 2025
242e1df
edited how old folder is deleted
o0n1x Apr 10, 2025
9d44723
fixed a bug in configmap files saving
o0n1x Apr 10, 2025
d3746b2
Merge branch 'main' of https://github.com/o0n1x/colext_edit
o0n1x Apr 10, 2025
97233d1
fixed a bug
o0n1x Apr 13, 2025
c8aa0ed
1-fixed issue with converting dict being directly saved as the static…
o0n1x Apr 13, 2025
daf4972
cleaned and refactored merge_static_network_rules()
o0n1x Apr 13, 2025
e602ee6
quick changes
o0n1x Apr 13, 2025
6683867
Merge branch 'main' of https://github.com/o0n1x/colext_edit into main
o0n1x Apr 13, 2025
557f324
added compatibility with -1 in colext config
o0n1x Apr 16, 2025
af91754
Merge branch 'main' of https://github.com/o0n1x/colext_edit into main
o0n1x Apr 16, 2025
9d9e2d3
edited config
o0n1x Apr 16, 2025
6b9936b
Merge branch 'main' of https://github.com/o0n1x/colext_edit into main
o0n1x Apr 16, 2025
267d95e
temp debug
o0n1x Apr 16, 2025
29ea2d3
bug fix
o0n1x Apr 16, 2025
d8cdd47
silly logical error
o0n1x Apr 16, 2025
e0303c3
test update
o0n1x Apr 16, 2025
e8261c0
test update
o0n1x Apr 16, 2025
4d7a43b
very sill mistake lol
o0n1x Apr 16, 2025
4449292
remove test code
o0n1x Apr 16, 2025
a08a702
refactoring name
o0n1x Apr 16, 2025
6050732
added debugging
o0n1x Apr 16, 2025
c921785
debugging
o0n1x Apr 16, 2025
6f25e9b
fixes in dynamic configfile not reaching
o0n1x Apr 16, 2025
e83fe18
removed del will be implemented later
o0n1x Apr 16, 2025
b2cf38c
support for multiple dynamic networks
o0n1x Apr 17, 2025
41462b9
verification for del now only needs direction and doesnt check for ru…
o0n1x Apr 17, 2025
23c9628
json files now saved as .json files
o0n1x Apr 17, 2025
237205e
added virication for network tags to be string
o0n1x Apr 17, 2025
5664213
added ability to add multiple commadns in the same time/epoch stamp
o0n1x Apr 17, 2025
5e8068b
added del in test config
o0n1x Apr 17, 2025
ffc4925
debugging support for del command
o0n1x Apr 17, 2025
9a95835
major change: changed how simple dynamic rules is defined into lists …
o0n1x Apr 17, 2025
9a36dd9
changed colextconfig to support major change
o0n1x Apr 17, 2025
ca4d15d
added a network manager with all the classes needed for the client/se…
o0n1x Apr 24, 2025
28ee80b
fully implemented network_manager. NEEDS TESTING STILL
o0n1x Apr 27, 2025
c85fcff
server decorator modified to include publishing
o0n1x Apr 28, 2025
94709db
changed it to only publish time = 0, 1 only
o0n1x Apr 28, 2025
75ee2a1
removed unused functions
o0n1x Apr 28, 2025
7bd0a04
added support for time iter for client to loop on it self
o0n1x Apr 29, 2025
5ab7fe2
testing
o0n1x Apr 29, 2025
6cd8726
added broker yaml files
o0n1x Apr 29, 2025
fb74282
changed namespace
o0n1x Apr 29, 2025
3b12a82
orginized and added service for ui managment
o0n1x Apr 29, 2025
7c3aad7
added the host of rabbitmq broker
o0n1x Apr 29, 2025
ef869da
made publish msgs auto convert to str before publishing
o0n1x Apr 29, 2025
e9ff2b8
fixed a path error
o0n1x Apr 29, 2025
1938d96
error fixes
o0n1x Apr 29, 2025
133d6cb
fixed type extracting error
o0n1x Apr 29, 2025
5a46fe6
added validation for iter name (epoch/time)
o0n1x Apr 29, 2025
791bd6b
temp
o0n1x Apr 29, 2025
9767388
Merge branch 'main' of https://github.com/o0n1x/colext_edit into main
o0n1x Apr 29, 2025
6816bdb
debugging logging
o0n1x Apr 29, 2025
4f2bb45
Merge branch 'main' of https://github.com/o0n1x/colext_edit into main
o0n1x Apr 29, 2025
553895a
removing try for debugging
o0n1x Apr 29, 2025
b6e4f86
Merge branch 'main' of https://github.com/o0n1x/colext_edit into main
o0n1x Apr 29, 2025
35ca020
more logging
o0n1x Apr 29, 2025
f1cef59
silly error fixed
o0n1x Apr 29, 2025
a18d12a
fixed bug where dicts is modified while looping through it
o0n1x Apr 29, 2025
a00b44b
very temp logging, must be edited later
o0n1x Apr 29, 2025
f3a0197
even more logging!
o0n1x Apr 29, 2025
f25f77d
possible bugfix
o0n1x Apr 29, 2025
fe26ff5
Another bug fix :)
o0n1x Apr 29, 2025
3f18724
debugging
o0n1x Apr 29, 2025
784c52c
bug fix?
o0n1x Apr 29, 2025
18b7369
logging
o0n1x Apr 29, 2025
3b6fc5e
added predefined queues
o0n1x Apr 30, 2025
5daade2
merge
o0n1x Apr 30, 2025
5926e02
Merge branch 'main' of https://github.com/o0n1x/colext_edit into main
o0n1x Apr 30, 2025
3df3bab
fixed minor bug
o0n1x Apr 30, 2025
6706cad
logging bananza!
o0n1x Apr 30, 2025
0b9bf2a
moar logs!
o0n1x Apr 30, 2025
b2a93da
possible fix?
o0n1x Apr 30, 2025
6a8e813
possible fix
o0n1x Apr 30, 2025
a916f63
refactoring varaiables
o0n1x Apr 30, 2025
f17866d
syntax error
o0n1x Apr 30, 2025
d9b6368
more logging
o0n1x Apr 30, 2025
58625e8
trying out stuff
o0n1x Apr 30, 2025
82fe80b
logging
o0n1x Apr 30, 2025
10c52a7
CLOSURE PROBLEM fixed!
o0n1x Apr 30, 2025
4345d4a
runtime error fixed
o0n1x Apr 30, 2025
b623a9f
fixing the closure problem
o0n1x Apr 30, 2025
546a973
finally fix lambda closures?
o0n1x Apr 30, 2025
0e833ae
syntax error
o0n1x Apr 30, 2025
460e922
logical error fixed maybe
o0n1x Apr 30, 2025
d03c751
logical error fixes
o0n1x Apr 30, 2025
c7c9d4b
logical error fix
o0n1x Apr 30, 2025
160a8c3
runtime error fixed
o0n1x Apr 30, 2025
8c70043
logical bug fixed
o0n1x Apr 30, 2025
4d51555
iter loop fixed similar to time loop which is working
o0n1x Apr 30, 2025
70a91ca
fix state persistance
o0n1x Apr 30, 2025
722d72a
runtime error fixed
o0n1x Apr 30, 2025
0d9ef83
debugged and removing extra logging :D
o0n1x Apr 30, 2025
2605dad
partially complete documentation
o0n1x May 14, 2025
7f16afa
Merge remote-tracking branch 'copied/main' into integrate-changes
o0n1x May 14, 2025
ae0d7e9
Merge pull request #1 from o0n1x/integrate-changes
o0n1x May 14, 2025
9edf2c4
update on documentation - almost complete
o0n1x May 15, 2025
afc0b86
finalizing documenetation
o0n1x May 15, 2025
a19e06e
Merge pull request #2 from o0n1x/integrate-changes
o0n1x May 22, 2025
5 changes: 4 additions & 1 deletion .gitignore
@@ -8,4 +8,7 @@ logs
 data
 build
 *.pdf
-torch_wheels
+torch_wheels
+Miniforge3-Linux-x86_64.sh
+/examples/**/*.txt
+/examples/**/*.json
16 changes: 8 additions & 8 deletions examples/flwr_tutorial_1_6/colext_config.yaml
@@ -1,4 +1,4 @@
-project: colext_example # project name should not have spaces
+project: network_exp # project name should not have spaces
 job_name: "SOTA FL experiment"

 # deployer: local_py
@@ -13,15 +13,15 @@ code:
     args: "--num_clients=${COLEXT_N_CLIENTS} --num_rounds=3"

 devices:
-  - { device_type: JetsonAGXOrin, count: 2 }
-  - { device_type: JetsonOrinNano, count: 2 }
-  - { device_type: JetsonXavierNX, count: 2 }
+  - { dev_type: JetsonAGXOrin, count: 1 }
+  - { dev_type: JetsonOrinNano, count: 1 }
+  - { dev_type: JetsonXavierNX, count: 2 }
   # - { device_type: JetsonNano, count: 6 }
   # - { device_type: LattePandaDelta3, count: 2 }
   # - { device_type: OrangePi5B, count: 8 }

 # Monitoring defaults
-# monitoring:
-#   live_metrics: True # True/False
-#   push_interval: 10 # in seconds
-#   scraping_interval: 0.3 # in seconds
+monitoring:
+  live_metrics: True # True/False
+  push_interval: 10 # in seconds
+  scraping_interval: 0.3 # in seconds
68 changes: 47 additions & 21 deletions examples/flwr_tutorial_1_8/colext_config.yaml
@@ -1,7 +1,7 @@
-project: colext_example # project name should not have spaces
+project: network_exp # project name should not have spaces
 job_name: "SOTA FL experiment"

-# deployer: local_py
+deployer: sbc
 # python_version: "3.10"

 code:
@@ -16,31 +16,57 @@ code:
     command: >-
       python3 ./server.py
       --num_clients=${COLEXT_N_CLIENTS}
-      --num_rounds=3
+      --num_rounds=30

-clients:
-  # - dev_type: LattePandaDelta3
-  #   count: 4
-  #   add_args: "--max_step_count=50"
+network:
+  - tag: default
+    upstream:
+      bandwidth: 100Mbps
+      latency: 1ms
+    downstream:
+      bandwidth: 100Mbps
+      latency: 1ms
+  - tag: slow
+    upstream:
+      bandwidth: 1Mbps
+      latency: 1ms
+    downstream:
+      bandwidth: 1Mbps
+      latency: 1ms
+  - tag: veryslow
+    upstream:
+      bandwidth: 100Kbps
+      latency: 1ms
+    downstream:
+      bandwidth: 100Kbps
+      latency: 1ms
+

-  - dev_type: JetsonOrinNano
-    count: 4
-    add_args: "--max_step_count=200"
+clients:
+  # - dev_type: LattePandaDelta3
+  #   count: 4
+  #add_args: "--max_step_count=50"
+  #network: veryslow

-  - dev_type: OrangePi5B
-    add_args: "--max_step_count=100"
+  - dev_type: JetsonOrinNano
+    add_args: "--max_step_count=200"
+    network: slow

-  - dev_type: OrangePi5B
-    count: 2
-    add_args: "--max_step_count=50"
+  - dev_type: OrangePi5B
+    add_args: "--max_step_count=100"
+    network: default

-  # - { dev_type: JetsonAGXOrin, count: 1 }
+  - dev_type: OrangePi5B
+    count: 1
+    add_args: "--max_step_count=50"
+    network: default
+  # - { dev_type: JetsonAGXOrin, count: 1 }
   # - { dev_type: JetsonOrinNano, count: 2 }
   # - { dev_type: JetsonXavierNX, count: 2 }
   # - { dev_type: JetsonNano, count: 6 }

-# Monitoring defaults
-# monitoring:
-#   live_metrics: True # True/False
-#   push_interval: 10 # in seconds
-#   scraping_interval: 0.3 # in seconds
+# Monitoring defaults
+monitoring:
+  live_metrics: True # True/False
+  push_interval: 10 # in seconds
+  scraping_interval: 0.3 # in seconds
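Review note: the new `network` section defines tagged profiles, and each `clients` entry refers to one by tag. The sketch below shows one way such a lookup could be validated before deployment. It is not the PR's implementation: `resolve_network_profiles` is a hypothetical helper, the config is written as a plain dict instead of being parsed from YAML, and treating a missing `count` as 1 is an assumption.

```python
# Hypothetical sketch: map each client entry to the network profile named by its tag.
# The dict mirrors the `network` and `clients` sections of the config in this diff.
CONFIG = {
    "network": [
        {"tag": "default",
         "upstream": {"bandwidth": "100Mbps", "latency": "1ms"},
         "downstream": {"bandwidth": "100Mbps", "latency": "1ms"}},
        {"tag": "slow",
         "upstream": {"bandwidth": "1Mbps", "latency": "1ms"},
         "downstream": {"bandwidth": "1Mbps", "latency": "1ms"}},
        {"tag": "veryslow",
         "upstream": {"bandwidth": "100Kbps", "latency": "1ms"},
         "downstream": {"bandwidth": "100Kbps", "latency": "1ms"}},
    ],
    "clients": [
        {"dev_type": "JetsonOrinNano", "network": "slow"},
        {"dev_type": "OrangePi5B", "network": "default"},
        {"dev_type": "OrangePi5B", "count": 1, "network": "default"},
    ],
}


def resolve_network_profiles(config):
    """Return one (dev_type, profile) pair per client instance."""
    profiles = {}
    for entry in config.get("network", []):
        tag = entry["tag"]
        if not isinstance(tag, str):
            raise TypeError(f"network tag must be a string, got {tag!r}")
        if tag in profiles:
            raise ValueError(f"duplicate network tag: {tag}")
        # Keep only the direction sections that are present.
        profiles[tag] = {d: entry[d] for d in ("upstream", "downstream") if d in entry}

    resolved = []
    for client in config.get("clients", []):
        tag = client.get("network")
        if tag is not None and tag not in profiles:
            raise KeyError(f"client {client['dev_type']} uses unknown network tag {tag!r}")
        profile = profiles.get(tag)  # None when the client has no network tag
        # Assumption for this sketch: a missing `count` means a single client.
        for _ in range(client.get("count", 1)):
            resolved.append((client["dev_type"], profile))
    return resolved


if __name__ == "__main__":
    for dev_type, profile in resolve_network_profiles(CONFIG):
        print(dev_type, profile)
```

Failing early on an unknown or non-string tag keeps a typo such as `network: slwo` from silently deploying a client without any shaping rules.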
@@ -0,0 +1,2 @@
tcset eth0 --direction outgoing ['rate 1Mbps', 'delay 1ms'] --change
tcset eth0 --direction incoming ['rate 1Mbps', 'delay 1ms'] --change
@@ -0,0 +1,2 @@
tcset eth0 --direction outgoing ['rate 100Mbps', 'delay 1ms'] --change
tcset eth0 --direction incoming ['rate 100Mbps', 'delay 1ms'] --change
@@ -0,0 +1,2 @@
tcset eth0 --direction outgoing ['rate 100Mbps', 'delay 1ms'] --change
tcset eth0 --direction incoming ['rate 100Mbps', 'delay 1ms'] --change
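Review note: the generated rule files above embed the text of a Python list (`['rate 1Mbps', 'delay 1ms']`) in each command, which would not be valid `tcset` options if the lines are meant to be executed as written. The sketch below shows one way to render a profile into option flags instead. `build_tcset_commands` is a hypothetical helper, and the key-to-flag mapping (`bandwidth` to `--rate`, `latency` to `--delay`, `upstream` to `outgoing`, `downstream` to `incoming`) is an assumption inferred from the config and the generated files, not taken from the PR's code.

```python
import shlex

# Assumed mapping from config keys to tcset option names.
OPTION_FOR_KEY = {"bandwidth": "--rate", "latency": "--delay"}
# Assumed mapping from config sections to tcset directions.
DIRECTION_FOR_SECTION = {"upstream": "outgoing", "downstream": "incoming"}


def build_tcset_commands(profile, interface="eth0"):
    """Render one tcset command line per direction present in the profile."""
    commands = []
    for section, direction in DIRECTION_FOR_SECTION.items():
        rules = profile.get(section)
        if not rules:
            continue
        argv = ["tcset", interface, "--direction", direction]
        for key, value in rules.items():
            # An unknown key raises KeyError rather than producing a malformed command.
            argv += [OPTION_FOR_KEY[key], str(value)]
        argv.append("--change")
        # shlex.join quotes any argument that needs it, so the line is shell-safe.
        commands.append(shlex.join(argv))
    return commands


if __name__ == "__main__":
    slow = {
        "upstream": {"bandwidth": "1Mbps", "latency": "1ms"},
        "downstream": {"bandwidth": "1Mbps", "latency": "1ms"},
    }
    for line in build_tcset_commands(slow):
        print(line)
```

Building the command as an argument list and joining it at the end avoids string-formatting a list or dict directly into the command line.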
183 changes: 183 additions & 0 deletions examples/flwr_tutorial_1_8_network/client.py
@@ -0,0 +1,183 @@
# Copied from: https://github.com/adap/flower/blob/dcffb484fb7d1e712f65d414fb31aa021f0a760e/examples/quickstart-pytorch/client.py
import argparse
import warnings
from collections import OrderedDict

from flwr.client import NumPyClient, ClientApp
from flwr_datasets import FederatedDataset
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision.transforms import Compose, Normalize, ToTensor
from tqdm import tqdm

from colext import MonitorFlwrClient

# #############################################################################
# 1. Regular PyTorch pipeline: nn.Module, train, test, and DataLoader
# #############################################################################

warnings.filterwarnings("ignore", category=UserWarning)
DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


class Net(nn.Module):
    """Model (simple CNN adapted from 'PyTorch: A 60 Minute Blitz')"""

    def __init__(self) -> None:
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)


def train(net, trainloader, epochs):
    """Train the model on the training set."""
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
    for _ in range(epochs):
        step = 0

        for batch in tqdm(trainloader, "Training"):
            images = batch["img"]
            labels = batch["label"]
            optimizer.zero_grad()
            criterion(net(images.to(DEVICE)), labels.to(DEVICE)).backward()
            optimizer.step()

            if step >= max_step_count:
                break
            else:
                step += 1


def test(net, testloader):
    """Validate the model on the test set."""
    criterion = torch.nn.CrossEntropyLoss()
    correct, loss = 0, 0.0
    with torch.no_grad():
        step = 0

        for batch in tqdm(testloader, "Testing"):
            images = batch["img"].to(DEVICE)
            labels = batch["label"].to(DEVICE)
            outputs = net(images)
            loss += criterion(outputs, labels).item()
            correct += (torch.max(outputs.data, 1)[1] == labels).sum().item()

            if step >= max_step_count:
                break
            else:
                step += 1
    accuracy = correct / len(testloader.dataset)
    return loss, accuracy


def load_data(partition_id):
    """Load partition CIFAR10 data."""
    fds = FederatedDataset(dataset="cifar10", partitioners={"train": 3})
    partition = fds.load_partition(partition_id)
    # Divide data on each node: 80% train, 20% test
    partition_train_test = partition.train_test_split(test_size=0.2)
    pytorch_transforms = Compose(
        [ToTensor(), Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]
    )

    def apply_transforms(batch):
        """Apply transforms to the partition from FederatedDataset."""
        batch["img"] = [pytorch_transforms(img) for img in batch["img"]]
        return batch

    partition_train_test = partition_train_test.with_transform(apply_transforms)
    trainloader = DataLoader(partition_train_test["train"], batch_size=32, shuffle=True)
    testloader = DataLoader(partition_train_test["test"], batch_size=32)
    return trainloader, testloader


# #############################################################################
# 2. Federation of the pipeline with Flower
# #############################################################################

# Get partition id
parser = argparse.ArgumentParser(description="Flower")
parser.add_argument(
    "--partition-id",
    choices=[0, 1, 2],
    default=0,
    type=int,
    help="Partition of the dataset divided into 3 iid partitions created artificially.",
)
partition_id = parser.parse_known_args()[0].partition_id

# Load model and data (simple CNN, CIFAR-10)
net = Net().to(DEVICE)
trainloader, testloader = load_data(partition_id=partition_id)

# Define Flower client
# The decoration does nothing if outside the CoLExT environment
@MonitorFlwrClient
class FlowerClient(NumPyClient):
    def get_parameters(self, config):
        return [val.cpu().numpy() for _, val in net.state_dict().items()]

    def set_parameters(self, parameters):
        params_dict = zip(net.state_dict().keys(), parameters)
        state_dict = OrderedDict({k: torch.tensor(v) for k, v in params_dict})
        net.load_state_dict(state_dict, strict=True)

    def fit(self, parameters, config):
        self.set_parameters(parameters)
        train(net, trainloader, epochs=1)
        return self.get_parameters(config={}), len(trainloader.dataset), {}

    def evaluate(self, parameters, config):
        self.set_parameters(parameters)
        loss, accuracy = test(net, testloader)
        return loss, len(testloader.dataset), {"accuracy": accuracy}


def client_fn(cid: str):
    """Create and return an instance of Flower `Client`."""
    return FlowerClient().to_client()


# Flower ClientApp
app = ClientApp(
    client_fn=client_fn,
)

def get_args():
    parser = argparse.ArgumentParser(
        prog='FL Client',
        description='Starts the FL client')

    parser.add_argument('--flserver_address', type=str, default="127.0.0.1:8080", help="FL server address ip:port")
    parser.add_argument('--max_step_count', default=3000, type=int, help="Configure number of steps for train and test")
    args = parser.parse_args()
    return args

# Legacy mode
if __name__ == "__main__":
    from flwr.client import start_client

    args = get_args()

    flserver_address = args.flserver_address
    max_step_count = args.max_step_count

    start_client(
        server_address=flserver_address,
        client=FlowerClient().to_client(),
    )
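
Review note: `train` and `test` read `max_step_count` as a module-level name that is only assigned inside the `if __name__ == "__main__":` block, so it would be undefined if the module is loaded through `app` rather than run as a script. The `break`/`else` counter also lets one more batch through than the configured limit. The sketch below illustrates both points without PyTorch. `capped_batches` and `count_batches_original` are hypothetical names introduced here, and whether the extra batch is intended is not stated in the PR.

```python
from itertools import islice


def capped_batches(loader, max_steps):
    """Yield at most `max_steps` batches from any iterable loader."""
    if max_steps < 0:
        raise ValueError("max_steps must be non-negative")
    return islice(loader, max_steps)


def count_batches_original(loader, max_step_count):
    """Replicate the break/else counter from train() and count processed batches."""
    processed = 0
    step = 0
    for _ in loader:
        processed += 1  # stands in for the forward/backward pass
        if step >= max_step_count:
            break
        else:
            step += 1
    return processed


if __name__ == "__main__":
    fake_loader = range(10)  # stand-in for a DataLoader with 10 batches
    print("original counter:", count_batches_original(fake_loader, 3))
    print("islice cap:", len(list(capped_batches(fake_loader, 3))))
```

Passing the limit as a parameter, for example `train(net, trainloader, epochs, max_steps)`, would remove the dependency on a global that exists only in legacy mode. When evaluation stops early, dividing `correct` by the number of samples actually seen rather than by `len(testloader.dataset)` would also keep the reported accuracy from being understated.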