
vtap-pls-test

⚠️ Caveat: this entire lab and write-up were produced autonomously by GitHub Copilot CLI. The Bicep, shell scripts, README, findings, and diagrams were all generated by an AI agent driving the Azure CLI against a real subscription. A human reviewed the final output, but no line was hand-written. Treat every claim, command, and conclusion as needing independent verification before relying on it in your own environment. If you reproduce the lab and find an error, please open an issue or PR.

Test lab: can Azure Virtual Network TAP (vTAP, public preview) be combined with Private Link Service + Private Endpoint to bridge packet-mirror traffic from one isolated VNet to another?

What is Azure Virtual Network TAP?

Azure Virtual Network TAP (vTAP) is a platform-native, agentless traffic-mirroring feature (public preview as of April 2026). It copies packets from source VM NICs, VXLAN-encapsulates them on UDP/4789, and delivers them out-of-band to a collector. It is the cloud-native equivalent of a SPAN port in a traditional data centre.
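Because the mirror is standard VXLAN on the standard port, any ordinary capture tool can decode it. A minimal sketch of what that looks like on a receiving VM (the interface name `eth0` is an assumption):

```bash
# Capture the mirrored stream; tcpdump decodes VXLAN on UDP/4789 natively
# and prints the inner (original) frames beneath each outer packet.
sudo tcpdump -ni eth0 udp port 4789 -vv
```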

Microsoft resources:

  • Virtual Network TAP overview — https://learn.microsoft.com/azure/virtual-network/virtual-network-tap-overview

The limitation this repo explores

Per the official docs, the vTAP source and destination must have IP connectivity — typically via VNet peering (or equivalent same-VNet placement). See Virtual Network TAP overview → Restrictions.

This repo documents a series of experiments attempting to bridge vTAP from a source VM in one VNet to a collector in a completely isolated VNet (no peering, no VPN, no ExpressRoute) by routing mirrored traffic through Azure Private Link Service + Private Endpoint. Spoiler: the naive approach is blocked by Azure Resource Manager, but a proxy-VM workaround does succeed.

Why this scenario matters

This is a pretty common shape of requirement in the real world: a managed service provider wants to offer a packet-capture / NDR / IDS service to its customers, host the collector fleet in its own isolated provider VNet, and let each customer's tap source live in their VNet — with no peering, VPN, or ExpressRoute between the two. Private Link is the obvious building block for "service in one VNet, consumer in another, no routed connectivity" — so the natural question is whether vTAP's destination can be a Private Endpoint. Today, as this lab shows, it can't: the platform explicitly forbids PE/PLS NICs as vTAP destinations. Maybe a future iteration of vTAP will support PE destinations natively; it doesn't today.

Caveat on the workaround

To be clear: putting an extra Linux VM in every customer VNet purely to DNAT VXLAN onto a Private Endpoint is not a realistic production pattern for a managed service. It pushes a stateful per-customer hop, an OS to patch, and a forwarding-plane SPOF into the consumer's environment — exactly the things a Private-Link-based service is meant to avoid. I'm sharing the lab anyway because going through the exercise was the clearest way for me to internalise why the limitation exists and where exactly ARM draws the line, and the evidence may save someone else the same round trip.

TL;DR for the impatient

  • What works: a lightweight Linux proxy VM in the source VNet, with iptables DNAT UDP/4789 → PE private IP (Scenario B2). VXLAN mirror lands on a collector in a non-peered VNet, inner HTTP intact.
  • What doesn't: making the PE NIC itself the vTAP destination (R1), or stuffing the PE NIC into an ILB backend pool (B1). Both are rejected by ARM at create time.
  • 💡 Why: Azure blocks Private Endpoint / Private Link Service NICs as vTAP destinations (CannotSetPrivateLinkServiceOrPrivateEndpointNetworkInterfaceAsDestinationOfVirtualNetworkTap) and rejects any modification of them (CannotModifyNicAttachedToPrivateEndpoint). Put a vanilla VM NIC in between and ARM is happy.

Results

| Approach | Supported by ARM? | Evidence | Verdict |
|---|---|---|---|
| R1 — vTAP destination = PE NIC | ❌ | `tap-create.err` | `CannotSetPrivateLinkServiceOrPrivateEndpointNetworkInterfaceAsDestinationOfVirtualNetworkTap` |
| B1 — PE NIC as ILB backend pool member | ❌ | `round2-b1.err` | `CannotModifyNicAttachedToPrivateEndpoint` — PE NIC is immutable |
| B2 — proxy VM in source VNet, iptables DNAT UDP/4789 → PE private IP | ✅ | `round2-evidence-proxy.txt`, `round2-evidence-col.txt` | Works end-to-end. VXLAN with decoded inner HTTP arrives on the collector in a non-peered VNet via PE → PLS → ILB. |
| A — PLS Direct Connect | ⊘ n/a | docs + feature flag | Not in uksouth; docs forbid PE as destination IP; doesn't fix the vTAP-side blocker anyway |
| C1 — Azure Firewall NIC as vTAP destination | ❌ | (reasoning) | AzFW data plane isn't a `Microsoft.Network/networkInterfaces` resource |

Topology — scenarios tested

Scenario R1 (❌ FAILED) — vTAP straight at a PE NIC

Direct approach: vTAP on vm-src's NIC with the destination set to the Private Endpoint NIC in the source VNet that fronts a PLS in VNet B. ARM rejects the vTAP create call at the source-side vTAP destination step — no packets ever flow.
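For reference, a sketch of the rejected call, assuming the preview-era `az network vnet tap` command group (exact flags may differ by CLI version; the tap name `tap-pe` is hypothetical):

```bash
# Resolve the PE's platform-managed NIC, then its ipconfig resource ID.
PE_NIC_ID=$(az network private-endpoint show -g rg-vtap-pls-test -n pe-col \
  --query 'networkInterfaces[0].id' -o tsv)
PE_IPCONFIG_ID=$(az network nic show --ids "$PE_NIC_ID" \
  --query 'ipConfigurations[0].id' -o tsv)

# ARM rejects this create with CannotSetPrivateLinkServiceOrPrivateEndpoint-
# NetworkInterfaceAsDestinationOfVirtualNetworkTap.
az network vnet tap create -g rg-vtap-pls-test -n tap-pe \
  --destination "$PE_IPCONFIG_ID"
```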

```mermaid
flowchart LR
  classDef vnetA fill:#DEEBF7,stroke:#2E75B6,color:#1F3864;
  classDef vnetB fill:#E2EFDA,stroke:#548235,color:#375623;
  classDef err fill:#F8CBAD,stroke:#C00000,color:#000,font-weight:bold;
  classDef note fill:#FFF2CC,stroke:#BF8F00,color:#000;

  subgraph A["VNet A — vnet-src 10.30.0.0/16"]
    SRC["vm-src<br/>nginx :80<br/>10.30.1.4"]
    PE["pe-col<br/>Private Endpoint NIC<br/>10.30.3.4"]
  end
  subgraph B["VNet B — vnet-dst 10.40.0.0/16<br/>(no peering to VNet A)"]
    PLS["pls-col<br/>Private Link Service"]
    ILB["ilb-col<br/>Std ILB frontend<br/>10.40.2.4<br/>UDP/4789 rule"]
    COL["vm-col<br/>tcpdump<br/>10.40.3.4"]
  end

  ERR["✗ ARM rejects vTAP create<br/>CannotSetPrivateLinkServiceOr<br/>PrivateEndpointNetworkInterface<br/>AsDestinationOfVirtualNetworkTap"]:::err
  NOTE[/"ARM blocks PE/PLS NICs<br/>as vTAP destinations"/]:::note

  SRC -. "vTAP mirror<br/>VXLAN UDP/4789" .-> PE
  SRC -. fail .-> ERR
  PE --> PLS --> ILB --> COL

  linkStyle 0 stroke:#C00000,stroke-width:2px,stroke-dasharray:6 4;
  linkStyle 1 stroke:#C00000,stroke-width:2px,stroke-dasharray:2 2;
  linkStyle 2 stroke:#7F7F7F,stroke-width:1px,stroke-dasharray:4 4;
  linkStyle 3 stroke:#7F7F7F,stroke-width:1px,stroke-dasharray:4 4;
  linkStyle 4 stroke:#7F7F7F,stroke-width:1px,stroke-dasharray:4 4;

  class SRC,PE vnetA;
  class PLS,ILB,COL vnetB;
```

Verdict: ❌ vTAP resource never provisions. PE NIC is off-limits as a TAP destination — the platform owns the NIC.


Scenario B1 (❌ FAILED) — Put the PE NIC in an ILB backend

Alternate idea: keep vTAP legal by aiming it at an ILB frontend in VNet A, and stuff the PE NIC into that ILB's backend pool so the mirror lands on the PE via the LB. ARM blocks the backend-pool edit because PE NICs are immutable.
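Sketched, the rejected edit looks like this (the backend-pool name `bepool-pe` is hypothetical; the PE NIC and ipconfig names are the platform-generated ones):

```bash
# ARM rejects this with CannotModifyNicAttachedToPrivateEndpoint —
# a PE NIC's ipconfig cannot be enrolled in any load balancer backend pool.
az network nic ip-config address-pool add -g rg-vtap-pls-test \
  --nic-name <pe-nic-name> --ip-config-name <pe-ipconfig-name> \
  --lb-name ilb-src --address-pool bepool-pe
```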

```mermaid
flowchart LR
  classDef vnetA fill:#DEEBF7,stroke:#2E75B6,color:#1F3864;
  classDef vnetB fill:#E2EFDA,stroke:#548235,color:#375623;
  classDef err fill:#F8CBAD,stroke:#C00000,color:#000,font-weight:bold;

  subgraph A["VNet A — vnet-src 10.30.0.0/16"]
    SRC["vm-src<br/>nginx :80<br/>10.30.1.4"]
    ILBA["ilb-src<br/>Std ILB frontend<br/>(vTAP dest candidate)"]
    PE["pe-col<br/>Private Endpoint NIC<br/>10.30.3.4<br/>(IMMUTABLE)"]
  end
  subgraph B["VNet B — vnet-dst 10.40.0.0/16"]
    PLS["pls-col<br/>Private Link Service"]
    ILB["ilb-col<br/>frontend 10.40.2.4"]
    COL["vm-col<br/>tcpdump<br/>10.40.3.4"]
  end

  ERR["✗ az network nic ip-config<br/>address-pool add rejected<br/>CannotModifyNicAttachedToPrivateEndpoint"]:::err

  SRC -. "vTAP mirror<br/>VXLAN UDP/4789" .-> ILBA
  ILBA -. "LB rule → backend" .-> PE
  PE -. fail .-> ERR
  PE --> PLS --> ILB --> COL

  linkStyle 0 stroke:#2E75B6,stroke-width:2px;
  linkStyle 1 stroke:#C00000,stroke-width:2px,stroke-dasharray:6 4;
  linkStyle 2 stroke:#C00000,stroke-width:2px,stroke-dasharray:2 2;
  linkStyle 3 stroke:#7F7F7F,stroke-width:1px,stroke-dasharray:4 4;
  linkStyle 4 stroke:#7F7F7F,stroke-width:1px,stroke-dasharray:4 4;
  linkStyle 5 stroke:#7F7F7F,stroke-width:1px,stroke-dasharray:4 4;

  class SRC,ILBA,PE vnetA;
  class PLS,ILB,COL vnetB;
```

Verdict: ❌ PE NIC cannot be enrolled into any ILB backend pool. Kills the "pre-load-balance onto the PE" idea.


Scenario B2 (✅ WORKS) — proxy VM with iptables DNAT (the breakthrough)

vTAP destination is an ordinary Ubuntu VM NIC (vm-proxy). The proxy has net.ipv4.ip_forward=1, rp_filter=0, NIC-level enableIpForwarding=true, and iptables DNAT UDP/4789 → PE private IP 10.30.3.4. From there the packet is a plain PE → PLS → ILB flow that the platform is perfectly happy with.
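A minimal sketch of that proxy configuration (the lab's actual version is in round2-proxy-cloudinit.yaml; the interface name `eth0` and the MASQUERADE rule are assumptions here):

```bash
# Kernel side: forward packets, and don't drop the asymmetric mirror flow.
sudo sysctl -w net.ipv4.ip_forward=1
sudo sysctl -w net.ipv4.conf.all.rp_filter=0
sudo sysctl -w net.ipv4.conf.eth0.rp_filter=0

# Rewrite the destination of mirrored VXLAN to the PE private IP, then
# source-NAT the forwarded copy so it egresses with the proxy's own IP.
sudo iptables -t nat -A PREROUTING  -p udp --dport 4789 -j DNAT --to-destination 10.30.3.4
sudo iptables -t nat -A POSTROUTING -p udp --dport 4789 -j MASQUERADE

# Fabric side (run from a management shell): Azure must also allow the NIC
# to forward traffic it doesn't own.
az network nic update -g rg-vtap-pls-test -n <vm-proxy-nic-name> --ip-forwarding true
```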

```mermaid
flowchart LR
  classDef vnetA fill:#DEEBF7,stroke:#2E75B6,color:#1F3864;
  classDef vnetB fill:#E2EFDA,stroke:#548235,color:#375623;
  classDef ok fill:#C6EFCE,stroke:#2E7D32,color:#1B5E20,font-weight:bold;
  classDef note fill:#FFF2CC,stroke:#BF8F00,color:#000;

  subgraph A["VNet A — vnet-src 10.30.0.0/16"]
    SRC["vm-src<br/>nginx :80<br/>10.30.1.4"]
    PROXY["vm-proxy<br/>10.30.4.4<br/>ip_forward=1, rp_filter=0<br/>NIC enableIpForwarding=true<br/>iptables DNAT UDP/4789 → 10.30.3.4"]:::ok
    PE["pe-col<br/>Private Endpoint<br/>10.30.3.4"]
  end
  subgraph B["VNet B — vnet-dst 10.40.0.0/16<br/>(no peering)"]
    PLS["pls-col<br/>Private Link Service"]
    ILB["ilb-col<br/>frontend 10.40.2.4<br/>UDP/4789 LB rule"]
    COL["vm-col<br/>tcpdump<br/>10.40.3.4<br/>decodes VXLAN → inner HTTP GET/200"]
  end

  NOTE[/"Mirror packets flow:<br/>vm-src NIC → proxy (DNAT) → PE → PLS tunnel → ILB → collector<br/>VXLAN UDP/4789 preserved on every hop"/]:::note

  SRC -- "vTAP mirror<br/>VXLAN UDP/4789" --> PROXY
  PROXY -- "DNAT<br/>VXLAN UDP/4789" --> PE
  PE -- "PLS tunnel<br/>VXLAN UDP/4789" --> PLS
  PLS -- "ILB backend<br/>VXLAN UDP/4789" --> ILB
  ILB -- "VXLAN UDP/4789" --> COL

  linkStyle 0 stroke:#2E7D32,stroke-width:3px;
  linkStyle 1 stroke:#2E7D32,stroke-width:3px;
  linkStyle 2 stroke:#2E7D32,stroke-width:3px;
  linkStyle 3 stroke:#2E7D32,stroke-width:3px;
  linkStyle 4 stroke:#2E7D32,stroke-width:3px;

  class SRC,PE vnetA;
  class PLS,ILB,COL vnetB;
```

Verdict: ✅ End-to-end VXLAN mirror arrives on vm-col in an unpeered VNet with inner HTTP GET / and HTTP/1.1 200 OK intact. Outer source IP on the collector is the PLS NIC 10.40.2.4; inner payload preserves the original L2/L3.


Scenario A (⊘ N/A) — PLS Direct Connect

Direct-connect changes only the provider side: it lets a PLS target an arbitrary user IP instead of an ILB, bypassing the provider-side load balancer. The consumer side is still a Private Endpoint, which is the side that blocks vTAP in R1. So direct-connect doesn't address the actual blocker, and it's not in uksouth anyway.

```mermaid
flowchart LR
  classDef vnetA fill:#DEEBF7,stroke:#2E75B6,color:#1F3864;
  classDef vnetB fill:#E2EFDA,stroke:#548235,color:#375623;
  classDef err fill:#F8CBAD,stroke:#C00000,color:#000,font-weight:bold;
  classDef note fill:#FFF2CC,stroke:#BF8F00,color:#000;

  subgraph A["VNet A — consumer side (unchanged)"]
    SRC["vm-src"]
    PE["pe-col<br/>Private Endpoint NIC<br/>(STILL the blocker)"]
  end
  subgraph B["VNet B — provider side"]
    PLSDC["pls-col (Direct Connect)<br/>targets arbitrary user IP<br/>instead of ILB"]
    TGT["vm-col<br/>user-owned target IP"]
  end

  ERR["✗ vTAP still can't use PE NIC as destination<br/>(R1 blocker unchanged)"]:::err
  NOTE[/"Direct Connect only removes the<br/>provider-side ILB. Consumer side is<br/>still a PE → same ARM rejection as R1."/]:::note

  SRC -. "vTAP mirror" .-> PE
  PE -. fail .-> ERR
  PE --> PLSDC --> TGT

  linkStyle 0 stroke:#C00000,stroke-width:2px,stroke-dasharray:6 4;
  linkStyle 1 stroke:#C00000,stroke-width:2px,stroke-dasharray:2 2;
  linkStyle 2 stroke:#7F7F7F,stroke-width:1px,stroke-dasharray:4 4;
  linkStyle 3 stroke:#7F7F7F,stroke-width:1px,stroke-dasharray:4 4;

  class SRC,PE vnetA;
  class PLSDC,TGT vnetB;
```

Verdict: ⊘ Not applicable to the blocker; B2 supersedes it.


Deploy

```bash
./deploy.sh     # idempotent; creates RG, VNets, VMs, ILB, PLS, PE, vTAP + attach
./test.sh       # tcpdump on collector + curl traffic generator
```

Resource group: rg-vtap-pls-test in uksouth. All resources tagged project=vtap-pls-test.
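To enumerate everything the lab created (relies only on the tag above):

```bash
az resource list --tag project=vtap-pls-test -o table
```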

Round 2 (the B2 workaround) adds a proxy VM + iptables DNAT — see round2-proxy-cloudinit.yaml and round2-test.sh. Evidence in round2-evidence-proxy.txt and round2-evidence-col.txt.

Findings

Full write-up in findings.md — Round 1 (direct attempt, failed) and Round 2 (proxy DNAT workaround, works).

Cleanup

```bash
az group delete -n rg-vtap-pls-test --yes --no-wait
```

To pause spend without deleting:

```bash
az vm deallocate -g rg-vtap-pls-test -n vtappls-vm-src
az vm deallocate -g rg-vtap-pls-test -n vtappls-vm-col
az vm deallocate -g rg-vtap-pls-test -n vtappls-vm-proxy
```

Contributions

Issues and PRs welcome — especially additional vTAP corner-case scenarios, or confirmation of behaviour in other regions / subscription types.

Disclaimer

This is an independent community experiment; it is not official Microsoft guidance. vTAP is in public preview and its behaviour may change. Use at your own risk.

License

MIT — see LICENSE.
