New Tools for Navigating the Highways of Internet Traffic
By Vaneshi Ramdhony, Illinois Institute of Technology, Project guided by Dr. Nik Sultana
We live in a world where the internet connects everything from global economies to smart homes. Just as people travel from one place to another by different modes of transport and along different streets, data travels from one point to another in different containers and over different routes. Engineers use network analysis to learn more about all this digital traffic. Network analysis tells experimenters what type of traffic is on the network, how much of it there is, and much more, which engineers can use to work out what the network needs. For example: does the network need expansion, security, speed, or all three? Networks are complex, and making sense of the massive amounts of traffic is no small task. This is where GraphBLAS, a powerful tool for network analysis, comes into play.
Recently, Professor Nik Sultana, Hyunsuk Bang, and I set out to deploy GraphBLAS on the FABRIC testbed, an international, programmable network testbed for advanced networking research. Think of FABRIC as a massive playground where researchers can test the wildest ideas in cybersecurity, data analysis, network management, and more, all without breaking the internet.
What is FABRIC?
FABRIC (FABRIC is Adaptive ProgrammaBle Research Infrastructure for Computer Science and Science Applications) is a highly versatile network testbed that spans 33 sites across the globe, including North America, Europe, and Asia. Each site offers a range of hardware, from servers to advanced network equipment. Researchers can log in, connect to these resources, and run experiments that span the world.
It’s built for the bold and curious, offering a fully programmable infrastructure where researchers can push the boundaries of what’s possible with network design. We can create virtual spaces to test our ideas and experiment with things to make computers and the internet faster, more secure, and more efficient. In our case, we used it to deploy GraphBLAS to analyze traffic.
Introducing GraphBLAS
GraphBLAS is a powerful tool that expresses graph computations in the language of linear algebra. It’s designed to handle complex relationships between different entities (in our case, network traffic) using matrices.
With GraphBLAS, you can select which set of values you want to keep from a big data file and obtain those outputs in a simple text document. In this project, the values selected are the source and destination IP addresses. The output helps experimenters see the communication patterns, which can help to detect anomalies such as potential security threats.
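To make the matrix idea concrete, here is a minimal sketch in plain Python rather than GraphBLAS itself. It assumes the traffic has already been reduced to (source, destination) address pairs. The function name build_traffic_matrix and the sample addresses are made up for illustration. Each distinct IP address gets a row/column number, and each cell counts the packets sent from one address to another.

```python
from collections import Counter


def build_traffic_matrix(pairs):
    """Return (index, matrix) for an iterable of (src_ip, dst_ip) pairs.

    index maps each IP address to a row/column number. matrix is a sparse
    mapping from (row, col) to the number of packets seen for that pair.
    """
    index = {}
    matrix = Counter()
    for src, dst in pairs:
        # Assign the next free number the first time an address appears.
        row = index.setdefault(src, len(index))
        col = index.setdefault(dst, len(index))
        matrix[(row, col)] += 1
    return index, matrix


# Hypothetical sample traffic (addresses taken from documentation ranges).
sample_pairs = [
    ("192.0.2.1", "198.51.100.7"),
    ("192.0.2.1", "198.51.100.7"),
    ("198.51.100.7", "192.0.2.1"),
    ("203.0.113.5", "192.0.2.1"),
]
index, matrix = build_traffic_matrix(sample_pairs)
```

A nonzero cell means that two addresses communicated, and an unusually large or unexpected cell is the kind of pattern an experimenter might look at more closely.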
The Fun Part: Running GraphBLAS on FABRIC
Now, let’s talk about the exciting part: setting this whole thing up on FABRIC. Our goal was twofold:
- Use GraphBLAS, an advanced analysis method, to watch network activity on FABRIC.
- Make the deployment easily shareable with other FABRIC users.
The experiment was carried out using Jupyter notebooks, an interactive coding platform that allows researchers to write and share experiments. This approach makes it easy for others to reproduce the setup, ensuring that the work can be expanded upon by the broader research community.
The first step was to set up a virtual machine (VM) on FABRIC, install GraphBLAS, and provide it with traffic to analyze. Virtualization lets several virtual machines run on the same physical machine, each working independently like a separate computer. We used pre-recorded network traffic files called pcap files, which contain captures of data flowing through a network. These files were converted into matrices using a tool called pcap2grb, allowing GraphBLAS to print the communication patterns in the traffic.
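For readers curious about what sits inside a pcap file, the sketch below pulls the source and destination IP addresses out of one using only the Python standard library. It is an illustration of the general idea, not how pcap2grb is implemented. It assumes the classic pcap format with microsecond timestamps, an Ethernet link layer, IPv4 packets, and no VLAN tags; the function name ip_pairs_from_pcap is hypothetical.

```python
import socket
import struct

PCAP_GLOBAL_HEADER_LEN = 24   # magic, version, timezone, sigfigs, snaplen, linktype
PCAP_RECORD_HEADER_LEN = 16   # ts_sec, ts_usec, captured length, original length
ETHERNET_HEADER_LEN = 14      # destination MAC, source MAC, EtherType
MIN_IPV4_HEADER_LEN = 20
ETHERTYPE_IPV4 = 0x0800
LINKTYPE_ETHERNET = 1


def ip_pairs_from_pcap(data):
    """Yield (src_ip, dst_ip) for each IPv4-over-Ethernet packet in pcap bytes."""
    if len(data) < PCAP_GLOBAL_HEADER_LEN:
        raise ValueError("too short to be a pcap file")
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"          # file written in little-endian byte order
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"          # file written in big-endian byte order
    else:
        raise ValueError("not a classic microsecond pcap file")
    (linktype,) = struct.unpack_from(endian + "I", data, 20)
    if linktype != LINKTYPE_ETHERNET:
        raise ValueError("this sketch only handles Ethernet captures")

    offset = PCAP_GLOBAL_HEADER_LEN
    while offset + PCAP_RECORD_HEADER_LEN <= len(data):
        _ts_sec, _ts_usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, offset)
        offset += PCAP_RECORD_HEADER_LEN
        frame = data[offset:offset + incl_len]
        offset += incl_len
        if len(frame) < ETHERNET_HEADER_LEN + MIN_IPV4_HEADER_LEN:
            continue          # truncated or too small to hold an IPv4 header
        (ethertype,) = struct.unpack_from("!H", frame, 12)
        if ethertype != ETHERTYPE_IPV4:
            continue          # skip ARP, IPv6, and anything else
        ip_start = ETHERNET_HEADER_LEN
        # In an IPv4 header, bytes 12-15 are the source and 16-19 the destination.
        src = socket.inet_ntoa(frame[ip_start + 12:ip_start + 16])
        dst = socket.inet_ntoa(frame[ip_start + 16:ip_start + 20])
        yield src, dst
```

Reading a capture from disk in binary mode and passing its bytes to this function yields one address pair per IPv4 packet, which is the raw material for a traffic matrix. The sketch loads the whole file into memory, so a streaming reader would suit very large captures better.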
Testing the Setup
Once the setup was complete, it was time to put GraphBLAS to the test. We ran several sets of pcap files through the system, ranging from small files with 637 packets to larger ones containing 791,615 packets. An IP packet is like a small package of data that gets sent over the internet from one device to another.
We validated the results using another tool, Wireshark, whose output was filtered to keep only the pairs of communicating IP addresses. We then compared the communication pairs represented by the GraphBLAS matrix data with the ones from Wireshark, and they matched.
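A check of this kind can be sketched in a few lines of Python. The details here are assumptions, since the exact filtering steps are not described above. The sketch supposes the address pairs were exported as tab-separated text, for example with Wireshark's command-line companion via `tshark -r capture.pcap -T fields -e ip.src -e ip.dst` (capture.pcap is a placeholder name), and that the matrix is available as a mapping from (row, column) to packet count together with an address-to-number index. All function names are hypothetical.

```python
def pairs_from_field_export(text):
    """Parse tab-separated 'src<TAB>dst' lines into a set of (src, dst) pairs."""
    pairs = set()
    for line in text.splitlines():
        fields = line.strip().split("\t")
        # Non-IP packets produce empty fields; skip those lines.
        if len(fields) == 2 and all(fields):
            pairs.add((fields[0], fields[1]))
    return pairs


def pairs_from_matrix(index, matrix):
    """Recover (src, dst) pairs from a sparse matrix keyed by (row, col)."""
    ip_by_number = {number: ip for ip, number in index.items()}
    return {
        (ip_by_number[row], ip_by_number[col])
        for (row, col), count in matrix.items()
        if count
    }


def compare_pairs(expected, actual):
    """Report pairs present on one side but not the other."""
    return {"missing": expected - actual, "unexpected": actual - expected}
```

If both sets in the result are empty, the two tools agree on which addresses communicated. Comparing sets rather than ordered lists keeps the check independent of the order in which each tool reports packets.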
Conclusion
Deploying GraphBLAS on FABRIC allows us to analyze traffic patterns at scale, detect security threats, and share our findings with other researchers. We’re also looking at extending the setup to handle live traffic from ongoing experiments on FABRIC, which would allow for real-time network monitoring and threat detection.
We’re excited to see where this project leads, and we hope other researchers will join us in exploring the potential of GraphBLAS, FABRIC, or both. Big things are coming, and you can be part of the action!
Check out the code that the team wrote and released: https://gitlab.com/d-r-r/release/gbf
Check out the project uploaded as a FABRIC Artifact so that other experimenters can use and build on it: https://artifacts.fabric-testbed.net/b5ea111e-733b-4b35-839c-05fb0a85f1fc
Acknowledgements:
I thank my collaborators, Hyunsuk Bang and Dr. Nik Sultana, and my FABRIC mentor, Komal Thareja. I also thank Nishanth Shyamkumar for help with FABRIC, and Michael Jones and Jeremy Kepner for help with GraphBLAS.