peering_filters: checking "is ip in subnet" takes 26% execution time #33

@mrngm

After another optimization (see #31), I looked into other parts of the code that could be optimized. When running peering_filters without all (so no calls to bgpq3), with locally cached copies of 2 PeeringDB JSON files and the ColoClue peers YAML, pprofile reports the following:

Command line: ./peering_filters
Total duration: 42.9381s
File: /home/mrngm/.local/lib/python3.9/site-packages/ipaddr.py
File duration: 11.2164s (26.12%)

The responsible call in peering_filters:

for asn in peerings:
    # [..]
    for session in sessions:
        session_ip = ipaddr.IPAddress(session)
        for ixp in ixp_map:
            for subnet in ixp_map[ixp]:
                if session_ip in subnet: # this call
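
To look at this pattern in isolation, here is a minimal sketch of the same linear scan. It assumes the standard library's `ipaddress` module as a stand-in for `ipaddr`, and the IXP names and prefixes are made-up documentation ranges, not the ColoClue configuration. The `find_ixps` helper is hypothetical and does not exist in peering_filters.

```python
import ipaddress

# Hypothetical IXP map using documentation prefixes (not real IXP ranges).
ixp_map = {
    "ixp-a": [ipaddress.ip_network("192.0.2.0/24"),
              ipaddress.ip_network("2001:db8:a::/64")],
    "ixp-b": [ipaddress.ip_network("198.51.100.0/24"),
              ipaddress.ip_network("2001:db8:b::/64")],
}


def find_ixps(session):
    """Return the names of all IXPs whose subnets contain the session address."""
    session_ip = ipaddress.ip_address(session)
    matches = []
    for ixp, subnets in ixp_map.items():
        for subnet in subnets:
            # One containment test per (session, subnet) pair, so the total
            # work grows with sessions x subnets.
            if session_ip in subnet:
                matches.append(ixp)
    return matches


print(find_ixps("198.51.100.7"))
print(find_ixps("2001:db8:a::1"))
```

Every session is tested against every subnet of every IXP, which is why the containment check shows up so prominently in the profile.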

For another project, I came across PyTricia, which can efficiently determine whether an IP address (IPv4 or IPv6) falls within a given subnet. I've patched this into peering_filters as follows:

diff --git a/peering_filters b/peering_filters
index ee6c03a..7666d83 100755
--- a/peering_filters
+++ b/peering_filters
@@ -23,6 +23,8 @@ import sys
 import time
 import yaml
 
+import pytricia
+
 
 def download(url):
     try:
@@ -94,8 +96,11 @@ with open("./cc-peers.yml") as f:
 ixp_map = {}
 router_map = {}
 for ixp in generic['ixp_map']:
-    ixp_map[ixp] = [ipaddr.IPNetwork(generic['ixp_map'][ixp]['ipv4_range']),
-                    ipaddr.IPNetwork(generic['ixp_map'][ixp]['ipv6_range'])]
+    #ixp_map[ixp] = [ipaddr.IPNetwork(generic['ixp_map'][ixp]['ipv4_range']),
+    #                ipaddr.IPNetwork(generic['ixp_map'][ixp]['ipv6_range'])]
+    ixp_map[ixp] = pytricia.PyTricia()
+    ixp_map[ixp].insert(generic['ixp_map'][ixp]['ipv4_range'], ixp)
+    ixp_map[ixp].insert(generic['ixp_map'][ixp]['ipv6_range'], ixp)
     router_map[ixp] = []
     for router in generic['ixp_map'][ixp]['present_on']:
         router_map[ixp].append(router)
@@ -287,11 +292,10 @@ for asn in peerings:
     else:
         continue
 
-    for session in sessions:
-        session_ip = ipaddr.IPAddress(session)
+    for session_ip in sessions:
         for ixp in ixp_map:
-            for subnet in ixp_map[ixp]:
-                if session_ip in subnet:
+            for im_circumventing_fixing_this_large_indentation_block in [1]:
+                if session_ip in ixp_map[ixp]: # pytricia lookup
                     print("found peer %s in IXP %s" % (session_ip, ixp))
                     print("must deploy on %s" % " ".join(router_map[ixp]))
                     description = peerings[asn]['description']

After profiling again:

Command line: ./peering_filters
Total duration: 27.7758s
File: ./peering_filters
File duration: 6.03189s (21.72%)
[..]
File: /home/mrngm/.local/lib/python3.9/site-packages/ipaddr.py
File duration: 0.525969s (1.89%)

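Total duration dropped from 42.9s to 27.8s, and the share of time spent in ipaddr.py went from 26.12% to 1.89%. Most of the remaining time is now spent parsing YAML.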

Is this something you would consider an interesting optimization? If there is any way I can verify that this patch does not break configuration, please let me know!
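
One way I could imagine checking this myself is to run the old linear scan and the new lookup side by side over the same session addresses and report any disagreement. The sketch below only shows the comparison harness: it uses the standard library's `ipaddress` module as the reference, the ranges are made-up documentation prefixes, and `ipv4_only_candidate` is a deliberately broken stand-in where the PyTricia-backed lookup would be plugged in. All names here are hypothetical.

```python
import ipaddress

# Hypothetical IXP ranges (documentation prefixes, not the real configuration).
ixp_ranges = {
    "ixp-a": ["192.0.2.0/24", "2001:db8:a::/64"],
    "ixp-b": ["198.51.100.0/24", "2001:db8:b::/64"],
}


def linear_lookup(session):
    """Reference result: linear scan over all ranges, like the current loop."""
    ip = ipaddress.ip_address(session)
    return sorted(
        ixp for ixp, ranges in ixp_ranges.items()
        if any(ip in ipaddress.ip_network(r) for r in ranges)
    )


def compare_lookups(sessions, candidate):
    """Return (session, expected, got) for each session where candidate disagrees."""
    mismatches = []
    for session in sessions:
        expected = linear_lookup(session)
        got = sorted(candidate(session))
        if got != expected:
            mismatches.append((session, expected, got))
    return mismatches


def ipv4_only_candidate(session):
    """Deliberately broken stand-in for the new lookup: ignores IPv6."""
    if ipaddress.ip_address(session).version != 4:
        return []
    return linear_lookup(session)


sessions = ["192.0.2.10", "2001:db8:a::1", "203.0.113.5"]
print(compare_lookups(sessions, linear_lookup))        # reference against itself
print(compare_lookups(sessions, ipv4_only_candidate))  # flags the IPv6 session
```

An empty list of mismatches over all configured sessions (both IPv4 and IPv6) would suggest the two lookups agree; comparing the generated configuration before and after the patch would be a stronger check.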
