Feat/netmaker#7

Merged
Pablomonte merged 38 commits into master from feat/netmaker
Dec 15, 2025

@Pablomonte
Owner

We are adopting Netmaker

Pablomonte and others added 30 commits November 3, 2025 15:54
Implements simulated ISP (AS 65001) for eBGP testing with 3 deployment modes:

Mode 1 (Mesh Only): Default - 21 containers, no ISP (backward compatible)
Mode 2 (Integrated): Mesh + ISP via profile - 22 containers on same host
Mode 3 (Decoupled): ISP standalone - separate hosts for hybrid testing

Key features:
- Docker Compose profiles for opt-in ISP deployment
- bird1 as border router with conditional eBGP peer
- Route filtering: announces customer prefixes, blocks TINC mesh
- External network (isp-net) for decoupling support
- Standalone docker-compose.isp.yml for independent ISP deployment
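The route-filtering rule above (announce customer prefixes, block the TINC mesh) can be sketched in Python as a plain prefix check. This is an illustration of the policy only, not the BIRD filter from `configs/bird/filters.conf`. The mesh prefix `44.30.127.0/24` is taken from a later commit in this PR, while the customer prefix `192.0.2.0/24` and the name `export_to_isp` are hypothetical.

```python
import ipaddress

# Mesh prefix as listed in a later commit of this PR.
TINC_MESH = ipaddress.ip_network("44.30.127.0/24")
# Hypothetical customer block (documentation range), not from the repository.
CUSTOMER_PREFIXES = [ipaddress.ip_network("192.0.2.0/24")]


def export_to_isp(prefix: str) -> bool:
    """Return True if the prefix may be announced to the ISP peer."""
    net = ipaddress.ip_network(prefix)
    # Block the mesh prefix and anything more specific inside it.
    if net.subnet_of(TINC_MESH):
        return False
    # Announce only prefixes that fall inside a known customer block.
    return any(net.subnet_of(customer) for customer in CUSTOMER_PREFIXES)


for candidate in ["192.0.2.0/24", "44.30.127.0/24", "10.0.0.0/8"]:
    print(candidate, "->", "announce" if export_to_isp(candidate) else "reject")
```

The default-deny ordering (reject the mesh first, then accept only listed customers) mirrors how an export filter is usually kept conservative toward an upstream peer.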

Files added:
- configs/isp-bird/bird.conf: ISP BIRD configuration (AS 65001)
- docker-compose.isp.yml: Standalone ISP deployment
- docs/ISP_TESTING.md: Comprehensive testing guide for all 3 modes
- tests/integration/test_isp_integrated.sh: Integration test suite

Files modified:
- docker-compose.yml: Add isp-bird service with profile, isp-net network
- configs/bird/protocols.conf.j2: Add conditional ISP peer for node1
- configs/bird/filters.conf: Add ISP import/export filters
- Makefile: Add deploy-local-isp, deploy-isp-only, clean-all targets

Testing:
- Backward compatible: make deploy-local (21 containers, no ISP)
- Integrated: make deploy-local-isp (22 containers)
- Decoupled: make deploy-isp-only (separate host)

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
The entrypoint.sh was not passing isp_enabled and isp_neighbor variables
to the Jinja2 template, causing the ISP peer to never be configured.

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
BIRD requires filter definitions to appear before they are used in protocols.
Reversed include order to fix 'CF_SYM_UNDEFINED' syntax error.

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
tinc1 was getting auto-assigned 172.30.0.2 which conflicted with isp-bird.
Now explicitly set to 172.30.0.1 (border router IP).

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
Use consistent mapping style for all networks instead of mixing list and map.

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
Bug #5: Move 'next hop self' inside ipv4 channel block
- BIRD 2.x requires channel-specific options inside the channel block
- Moved from protocol level to ipv4 {} block in isp-bird/bird.conf
- Resolves: "syntax error, unexpected NEXT" on line 82

Bug #6: Resolve Docker gateway IP conflict (172.30.0.1)
- Docker auto-assigns 172.30.0.1 as bridge network gateway
- Changed tinc1 from 172.30.0.1 to 172.30.0.3 in docker-compose.yml
- Updated ISP BGP neighbor to 172.30.0.3 in isp-bird/bird.conf
- Updated protocols.conf.j2 local address to 172.30.0.3
- Resolves: "Address already in use" error on tinc1 startup

All 22 containers now start successfully in ISP integrated mode.
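Both address fixes in this PR (the `172.30.0.2` clash with isp-bird and the `172.30.0.1` clash with the bridge gateway) are the kind of mistake a small pre-deploy check could catch. The sketch below is one way to do that, assuming the gateway is the first usable host address of the subnet, as the commit describes for `172.30.0.0/24`. The function name `find_conflicts` is hypothetical and not part of the repository.

```python
import ipaddress


def find_conflicts(subnet: str, static_ips: dict) -> list:
    """Report static container IPs that clash with the gateway or each other."""
    net = ipaddress.ip_network(subnet)
    # Assumption: the bridge gateway is the first usable host address.
    gateway = next(iter(net.hosts()))
    problems = []
    seen = {}
    for name, ip in static_ips.items():
        addr = ipaddress.ip_address(ip)
        if addr not in net:
            problems.append(f"{name}: {ip} is outside {subnet}")
        elif addr == gateway:
            problems.append(f"{name}: {ip} collides with the bridge gateway")
        elif addr in seen:
            problems.append(f"{name}: {ip} already used by {seen[addr]}")
        else:
            seen[addr] = name
    return problems


# Layout before the Bug #6 fix, then after it.
print(find_conflicts("172.30.0.0/24", {"tinc1": "172.30.0.1", "isp-bird": "172.30.0.2"}))
print(find_conflicts("172.30.0.0/24", {"tinc1": "172.30.0.3", "isp-bird": "172.30.0.2"}))
```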

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Enhancements:
- Fix container count grep pattern to include 'daemon' containers
- Make ping test optional when ping command not available in BIRD image
- Add warning message instead of failure when ping missing
- An Established BGP session already implies TCP reachability between the peers

All 8 tests now pass reliably in ISP integrated mode.
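The "warn instead of fail" behaviour for a missing `ping` binary can be sketched as follows. The real suite is a shell script (`tests/integration/test_isp_integrated.sh`) and presumably checks inside the BIRD container; this Python version checks the local `PATH` instead and is only meant to show the control flow. The function name `ping_check` and its injectable `which`/`run` parameters are assumptions made for testability.

```python
import shutil
import subprocess


def ping_check(target: str, which=shutil.which, run=subprocess.run) -> str:
    """Return 'pass', 'fail', or 'skipped' for a one-packet ping to target."""
    if which("ping") is None:
        # Warn instead of failing when the environment ships without ping.
        print(f"WARNING: ping not available, skipping reachability test for {target}")
        return "skipped"
    result = run(["ping", "-c", "1", target], capture_output=True)
    return "pass" if result.returncode == 0 else "fail"
```

A caller would count `"skipped"` as a non-failure, so the suite still passes on images without `ping` while real reachability failures are still reported.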

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Architecture changes:
- Replace full mesh iBGP (5 routers) with single border router (bird1)
- Implement dual ISP uplinks with BGP multi-homing
- Remove unnecessary services: bird2-5, daemon1-5, etcd2-5, prometheus
- Reduce deployment from 22 to 8 containers

Multi-homing implementation:
- Primary uplink: 172.30.0.3 → 172.30.0.2 (local-pref 200)
- Secondary uplink: 172.31.0.3 → 172.31.0.2 (local-pref 150)
- Both uplinks terminate on same ISP (AS 65001)
- Automatic failover via BGP local-preference
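The failover behaviour above can be modelled in a few lines of Python: among the sessions that are up, the route with the highest local-preference wins. This is a simplified sketch that considers local-pref only and ignores the remaining BGP best-path steps (AS path length, MED, and so on). The `Uplink` class and `best_uplink` function are hypothetical names; the addresses and preference values are the ones listed in the commit.

```python
from dataclasses import dataclass


@dataclass
class Uplink:
    name: str
    local_addr: str
    peer_addr: str
    local_pref: int
    established: bool = True


def best_uplink(uplinks):
    """Pick the established uplink with the highest local-pref, or None."""
    candidates = [u for u in uplinks if u.established]
    return max(candidates, key=lambda u: u.local_pref, default=None)


primary = Uplink("primary", "172.30.0.3", "172.30.0.2", local_pref=200)
secondary = Uplink("secondary", "172.31.0.3", "172.31.0.2", local_pref=150)

print(best_uplink([primary, secondary]).name)  # primary is preferred while up

primary.established = False  # simulate loss of the primary session
print(best_uplink([primary, secondary]).name)  # traffic falls back to secondary
```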

Network topology:
- TINC mesh: 5 nodes (44.30.127.0/24) - VPN only
- ISP primary: 172.30.0.0/24
- ISP secondary: 172.31.0.0/24
- Single etcd node for TINC peer discovery

Test updates:
- Updated integration tests for 8-container architecture
- Verify dual BGP sessions (2/2 Established)
- Validate local-pref preference (200 > 150)
- Confirm route filtering (TINC mesh blocked from ISP)

All tests passing (8/8).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Add ISP_LOCAL_IP variable to bird1 environment
- Pass isp_local_ip to BIRD template when defined
- Update protocols.conf.j2: isp_primary uses macvlan IP when ISP_LOCAL_IP set
- Fallback to isp-net IPs (172.30.0.3/172.31.0.3) for integrated ISP mode
- Network subnets: mesh-net 172.22.0.0/16, cluster-net 172.23.0.0/16
- Macvlan network driver for direct L2 access to physical LAN
- Configurable via .env: LAN_INTERFACE, LAN_SUBNET, TINC1_LAN_IP
- tinc1 gets additional lan-macvlan network interface
- Deploy with: make deploy-with-external-isp
- deploy-with-external-isp: Deploy mesh with external ISP connectivity
- verify-isp: Check ISP BGP session status and received routes
- Production-ready guide for macvlan ISP setup over wired Ethernet
- Prerequisites: wired interface, IP configuration, ISP node setup
- Deployment steps using make deploy-with-external-isp
- ISP node BIRD configuration with example
- Troubleshooting macvlan connectivity and BGP sessions
- Performance metrics and monitoring commands
- Production recommendations (MD5 auth, route filters, BFD)
- Detailed comparison of failed approaches:
  * Bridge + NAT: BGP breaks due to source IP changes
  * Macvlan over WiFi: Driver and AP MAC filtering issues
  * Host network + veth bridge: Complex, defeats containerization
- Working solution: Macvlan over wired Ethernet with direct L2 access
- Quick start commands using make targets
- Reference to comprehensive integration guide
- Consistent with repository conventions (make commands, not raw docker-compose)
- Update README.md: 8 containers, single border router, ISP multi-homing
- Update Arquitectura.md: Add Section 7 documenting multi-homing decision
- Remove references to 22-container full mesh iBGP setup
- Add current Sprint Status and deployment commands
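The `ISP_LOCAL_IP` fallback mentioned in the notes above (use the macvlan address when the variable is set, otherwise the isp-net address) amounts to a one-line default. The sketch below models only the primary session, since the notes do not say how the secondary behaves when `ISP_LOCAL_IP` is set. The function name is hypothetical, and the actual logic lives in `entrypoint.sh` and `protocols.conf.j2`, which are not shown here.

```python
import os


def isp_primary_local_ip(env=os.environ) -> str:
    """Return the local address for the isp_primary BGP session."""
    # An unset or empty ISP_LOCAL_IP falls back to the isp-net address
    # used in integrated ISP mode.
    return env.get("ISP_LOCAL_IP") or "172.30.0.3"


print(isp_primary_local_ip({}))                               # integrated mode
print(isp_primary_local_ip({"ISP_LOCAL_IP": "10.42.0.100"}))  # macvlan mode
```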

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- ISP node uses host networking to access physical LAN directly
- BGP session: 10.42.0.228 (ISP) ↔ 10.42.0.100 (mesh border router)
- Single customer session (validated 2025-11-10 with 100% test pass)
- Removed dual-link configuration (moved to experimental)
- This is the production-ready configuration for external ISP integration
- Saved dual-homing configuration (primary + secondary ISP links)
- Files: bird-dual-link.conf.experimental, docker-compose.isp-dual-link.yml.experimental
- Uses isp-net networks (172.30.0.0/24 primary, 172.31.0.0/24 secondary)
- Not production-validated, kept for future multi-homing exploration
docs: Add hardware test documentation for physical device setup
docs: Update hardware test documentation to use Docker
- Complete test report for RPi (Mock-ISP) → Laptop1 (border) → Laptop2 (mesh)
- BGP session established (AS65001 ↔ AS65000)
- TINC VPN mesh operational between laptops
- Mock-ISP successfully pings mesh node (44.30.127.2) via BGP routing
- 0% packet loss on all paths, ~1.7ms latency through tunnel
- Documented troubleshooting and lessons learned
- Added raw command outputs from Laptop2 and RPi for reference

Key achievements:
- BGP route propagation working (44.30.127.0/24 announced to ISP)
- TINC tunnel encryption with TCP+UDP port 655
- IP forwarding through border router
- Return routes configured for bidirectional connectivity
- Move hardware test compose files to deploy/hardware-test/:
  - docker-compose.isp.yml (RPi Mock-ISP)
  - docker-compose.border-router.yml (Laptop n1, renamed from hardware-n1)
  - docker-compose.mesh-node.yml (Laptop n2, renamed from node2)

- Delete obsolete override files:
  - docker-compose.wifi-test.yml (test used Ethernet, not WiFi)
  - docker-compose.hardware-test.yml (replaced by standalone border-router.yml)
  - docker-compose.external-isp.yml (not used in actual test)

- Keep docker-compose.yml at root for local simulation (5 nodes)

- Update all documentation references to new paths:
  - first-test-rpi/*.md
  - docs/EXTERNAL-ISP-INTEGRATION.md
  - docs/ISP_TESTING.md

- Add deploy/hardware-test/README.md with usage instructions
Santiagocetran and others added 8 commits December 1, 2025 14:24
Hardware test results and architecture improvements
- Update README with 3-device architecture (RPi ISP, Laptop Border, Laptop Mesh)
- Create self-contained deploy folders for each device
- Add Netmaker documentation and route distribution rationale
- Add SETUP.md in each deploy folder with IP configuration steps
- Move Dockerfile/entrypoint to each deploy folder (no shared dependencies)
- Clean up old TINC-based configs, tests, and Makefile
- Document security TODOs for production use
- Add Caddy reverse proxy for HTTPS (netclient v0.24.x requires TLS)
- Fix BIRD direct protocol to use "netmaker" interface name
- Fix docker-compose: remove sysctls with network_mode host
- Fix entrypoint.sh: create /run/bird directory
- Update SERVER_API_CONN_STRING and SERVER_HTTP_HOST without scheme
- Update all SETUP.md with correct deployment steps
- Update README with complete deployment guide and known issues
- Update docs/NETMAKER.md with API reference and troubleshooting
- Add .gitignore entries for TLS certificates
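For the `SERVER_API_CONN_STRING` and `SERVER_HTTP_HOST` change above, the values are expected without a URL scheme. A small helper like the one below could normalise a value before it is written to `.env`. This is a sketch under assumptions: the function name `strip_scheme` and the hostname `api.netmaker.example.com` are illustrative and do not come from the repository.

```python
from urllib.parse import urlsplit


def strip_scheme(value: str) -> str:
    """Return host[:port] from a URL-like value, leaving bare hosts unchanged."""
    if "://" not in value:
        return value
    return urlsplit(value).netloc


# Hypothetical hostname, used only for illustration.
print(strip_scheme("https://api.netmaker.example.com:443"))
print(strip_scheme("api.netmaker.example.com:443"))
```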

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Enables netclient to trust self-signed certificates from Netmaker server:
- Add Dockerfile to build custom image with CA certificate support
- Change docker-compose to use build instead of upstream image
- Remove sysctls (incompatible with host network mode)
- Add .gitignore for server-specific CA certificates

Resolves x509 certificate validation errors during node registration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Allow .crt files to be tracked in repository for documentation:
- Remove *.crt from root .gitignore
- Include netmaker-ca.crt as example certificate
- Update laptop-mesh .gitignore to only ignore .env

The certificate serves as an example for the custom Dockerfile approach
that builds netclient with trusted CA certificates.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@Pablomonte Pablomonte merged commit a88de3e into master Dec 15, 2025
2 of 5 checks passed

3 participants