Blogs

11 Sep 2017

This week I finished off the implementation of the dynamic topology module. I fixed up issues with the OSPF dynamic topology module and with the processing of topology information in the network module. I also performed more tests to make sure that network changes are received by the disaggregator and interpreted correctly. I have also fixed some bugs relating to the unit test runner and modified the disaggregator to use a logging module, in preparation for performance testing. The log level for the output can now be specified as an argument when running the disaggregator.
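A minimal sketch of how a log level can be taken as a command-line argument, using Python's standard `argparse` and `logging` modules. The flag name `--log-level` and the logger name are assumptions for illustration, not the disaggregator's actual option names.

```python
import argparse
import logging


def parse_args(argv=None):
    # "--log-level" is a hypothetical flag name chosen for this sketch.
    parser = argparse.ArgumentParser(description="disaggregator (sketch)")
    parser.add_argument(
        "--log-level",
        default="INFO",
        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
        help="minimum severity of messages to output",
    )
    return parser.parse_args(argv)


def configure_logging(level_name):
    # Map the textual level onto the logging module's numeric constant.
    logger = logging.getLogger("disaggregator")
    logger.setLevel(getattr(logging, level_name))
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
    return logger


# Example: behave as if the program was started with "--log-level DEBUG".
args = parse_args(["--log-level", "DEBUG"])
log = configure_logging(args.log_level)
log.debug("only shown when the level is DEBUG")
```

Keeping the level on a named logger rather than the root logger means third-party libraries keep their own verbosity during performance runs.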

At the end of the week, I started to work on performance testing the code. I am currently modifying the testing tool to allow benchmarking of the disaggregator. There are several other BGP implementations that can be benchmarked with the tool. We can use these to compare our results.

04 Sep 2017

Most of this week I spent on getting dynamic topology information from the network. I mostly focused on implementing support for OSPF. I extended the simulation tools to allow starting Quagga ospfd router instances and extended the connection tools to work with them as well. I then created a test OSPF network and spent some time trying to connect to it to receive link state updates. After some experimentation, I managed to integrate a tool that will establish an OSPF connection to a router and receive link state updates, which it stores and processes. Network information is built from the link state database. This info is sent to the disaggregator, which will process it and create a topology object from it. When a network change occurs, either due to something expiring, a link going down or something new being advertised, the internal topology of the disaggregator will be modified accordingly.
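The idea of keeping a topology object in step with link state changes can be sketched roughly as below. The class and method names are hypothetical and the real topology object is certainly richer; this only shows the three kinds of change mentioned above (new advertisement, changed links, expiry).

```python
class Topology:
    """Adjacency map built from per-router link state entries (sketch)."""

    def __init__(self):
        # router id -> {neighbour id: cost}
        self.links = {}

    def update_router(self, router_id, neighbours):
        """Replace the links advertised by one router (new or changed entry)."""
        self.links[router_id] = dict(neighbours)

    def remove_router(self, router_id):
        """Drop a router whose advertisement has expired or been withdrawn."""
        self.links.pop(router_id, None)

    def has_link(self, a, b):
        # Treat a link as usable only if both ends currently advertise it.
        return b in self.links.get(a, {}) and a in self.links.get(b, {})


topo = Topology()
topo.update_router("10.0.0.1", {"10.0.0.2": 10})
topo.update_router("10.0.0.2", {"10.0.0.1": 10})
link_up = topo.has_link("10.0.0.1", "10.0.0.2")

# The link goes down: router 10.0.0.2 re-advertises without that neighbour.
topo.update_router("10.0.0.2", {})
link_after_change = topo.has_link("10.0.0.1", "10.0.0.2")
```

Requiring both ends to advertise a link is a design choice in this sketch, so that a stale entry from one router does not keep a dead link alive.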

I have also spent some time modifying the configuration file in preparation for support of more protocols to get dynamic topology information of our network. The config file now accepts multiple protocols as well as multiple static files to load topology information from. The dynamic protocol probe tools are automatically started by the network module based on the config file protocol type and configuration attributes.
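As an illustration only, a configuration with several topology sources and a dispatch on protocol type might look like the following. The key names, file names and the `ospf-probe` command are all invented for this sketch and do not reflect the real config format or tool names.

```python
# Hypothetical configuration layout; every key and value here is an assumption.
CONFIG = {
    "topology": {
        "static_files": ["topo-core.json", "topo-edge.json"],
        "protocols": [
            {"type": "ospf", "router": "192.0.2.1", "area": "0.0.0.0"},
        ],
    }
}


def ospf_probe_command(cfg):
    # "ospf-probe" is a placeholder name for the dynamic protocol probe tool.
    return ["ospf-probe", "--router", cfg["router"], "--area", cfg["area"]]


# Registry from protocol type to the function that builds its probe command.
PROBE_BUILDERS = {"ospf": ospf_probe_command}


def probe_commands(config):
    """Return one command line per configured dynamic protocol."""
    commands = []
    for proto in config["topology"]["protocols"]:
        builder = PROBE_BUILDERS.get(proto["type"])
        if builder is None:
            raise ValueError("unsupported topology protocol: %s" % proto["type"])
        commands.append(builder(proto))
    return commands


commands = probe_commands(CONFIG)
```

Adding another protocol then only needs a new entry in the registry, which is roughly the flexibility the config change is aiming for.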

At the end of the week, I spent some time working on implementing better filtering of prefixes to peers based on the negotiated MP-BGP AFI/SAFI attributes.
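A small sketch of the filtering idea: only send a peer the prefixes whose family was negotiated with that peer. It assumes unicast only and represents a family as a plain `(afi, safi)` tuple, which is a simplification made for this example.

```python
import ipaddress

IPV4_UNICAST = ("ipv4", "unicast")
IPV6_UNICAST = ("ipv6", "unicast")


def family_of(prefix):
    """Return the (afi, safi) pair for a textual prefix, assuming unicast."""
    network = ipaddress.ip_network(prefix)
    return IPV4_UNICAST if network.version == 4 else IPV6_UNICAST


def filter_for_peer(prefixes, negotiated):
    """Keep only the prefixes whose family the peer negotiated."""
    return [p for p in prefixes if family_of(p) in negotiated]


routes = ["10.0.0.0/8", "2001:db8::/32", "192.0.2.0/24"]
ipv4_only_peer = filter_for_peer(routes, {IPV4_UNICAST})
dual_stack_peer = filter_for_peer(routes, {IPV4_UNICAST, IPV6_UNICAST})
```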

04 Sep 2017

Spent most of the week working with the TCP throughput test to investigate what I can actually do with the new information, and integrating it into the test. Retransmit counters, RTT etc are easy to extract and explain. I also get information about time spent blocked due to the receive window on the remote end, or the send buffer on the near end of the connection, but I'm not convinced about how accurate these are (or I'm not understanding what they mean correctly) - drastically limiting my send buffer size will only sometimes report any time spent limited by the send buffer, and querying when I know for certain that there is no outstanding data doesn't always report as being application limited. It's a starting point at least, so I'll keep looking at the data and see what can be done about it.

Kept working on making the BGP disaggregated router more resilient. Implemented a few new messages to communicate the state of peers between different processes within a single instance so that they can be compared with other instances. Got some simple logic working that will disable any instance that is known to have an incomplete view of the peers.
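The "disable instances with an incomplete view" logic could be sketched like this. It assumes the complete view is simply the union of peers that any instance reports as up, which may not match the rule actually used.

```python
def incomplete_instances(peer_views):
    """Return the instances that are missing at least one known peer.

    peer_views maps an instance name to the set of peers it reports as up.
    Assumption: the complete view is the union of all reported peers.
    """
    all_peers = set().union(*peer_views.values())
    return {name for name, peers in peer_views.items() if peers != all_peers}


views = {
    "instance-a": {"peer1", "peer2", "peer3"},
    "instance-b": {"peer1", "peer3"},  # has lost its session to peer2
    "instance-c": {"peer1", "peer2", "peer3"},
}
to_disable = incomplete_instances(views)
```

One consequence of the union rule is that if every instance is missing a different peer, all of them are flagged, so a real implementation would need a fallback for that case.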

04 Sep 2017

Wrote up a sample program to test out some heartbeat ideas, and some fairly simple ideas look like they should work ok, while sharing minimum state between instances of the routing engine. Started to implement that inside the actual BGP disaggregated router code to see how it will behave with real data. Started setting up my test environment to allow multiple instances to be running and connected to the same BIRD process so that they all get the same routes.
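One of the simpler heartbeat ideas, sketched under the assumption that each instance only records when it last heard from the others and treats silence longer than a timeout as failure. The class name and the injectable clock are choices made for this example, not the sample program itself.

```python
import time


class HeartbeatMonitor:
    """Track which instances have been heard from recently (sketch)."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock  # injectable so the logic can be tested
        self.last_seen = {}

    def beat(self, instance):
        """Record a heartbeat received from an instance."""
        self.last_seen[instance] = self.clock()

    def alive(self):
        """Return the instances whose last heartbeat is within the timeout."""
        now = self.clock()
        return {name for name, seen in self.last_seen.items()
                if now - seen <= self.timeout}


# Demonstration with a fake clock so no real waiting is needed.
fake_now = [0.0]
monitor = HeartbeatMonitor(timeout=3.0, clock=lambda: fake_now[0])
monitor.beat("instance-a")
monitor.beat("instance-b")
fake_now[0] = 2.0
monitor.beat("instance-a")
fake_now[0] = 4.0
still_alive = monitor.alive()
```

The only state shared between instances here is the heartbeat message itself, which fits the goal of sharing as little as possible.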

Had another look at the new TCP info available to processes now that I'm running a 4.10 kernel. My userspace hasn't been updated, but all the new information is available to me if I use the updated version of the struct. Looks like I should be able to get timing information about what is causing send to block (which end is at fault), as well as retransmit counts, RTT, etc that can be used to try to determine why the throughput test reported a particular result.
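For reference, a rough Python sketch of reading the larger struct directly, without waiting for updated userspace headers. The field order is my understanding of `struct tcp_info` as of the 4.10 additions and should be checked against the kernel's `include/uapi/linux/tcp.h`; the two bitfield bytes are left unpacked as `wscales` and `flags`.

```python
import socket
import struct

# Assumed field order of struct tcp_info up to the Linux 4.10 additions.
_U8_FIELDS = ["state", "ca_state", "retransmits", "probes", "backoff",
              "options", "wscales", "flags"]
_U32_FIELDS_A = ["rto", "ato", "snd_mss", "rcv_mss", "unacked", "sacked",
                 "lost", "retrans", "fackets", "last_data_sent",
                 "last_ack_sent", "last_data_recv", "last_ack_recv", "pmtu",
                 "rcv_ssthresh", "rtt", "rttvar", "snd_ssthresh", "snd_cwnd",
                 "advmss", "reordering", "rcv_rtt", "rcv_space",
                 "total_retrans"]
_U64_FIELDS_A = ["pacing_rate", "max_pacing_rate", "bytes_acked",
                 "bytes_received"]
_U32_FIELDS_B = ["segs_out", "segs_in", "notsent_bytes", "min_rtt",
                 "data_segs_in", "data_segs_out"]
_U64_FIELDS_B = ["delivery_rate", "busy_time", "rwnd_limited",
                 "sndbuf_limited"]

FIELDS = (_U8_FIELDS + _U32_FIELDS_A + _U64_FIELDS_A
          + _U32_FIELDS_B + _U64_FIELDS_B)
FORMAT = "=8B24I4Q6I4Q"
SIZE = struct.calcsize(FORMAT)


def parse_tcp_info(raw):
    """Unpack raw tcp_info bytes into a dict keyed by field name."""
    if len(raw) < SIZE:
        # An older kernel returns a shorter struct without the new fields.
        raise ValueError("tcp_info is shorter than the assumed 4.10 layout")
    return dict(zip(FIELDS, struct.unpack(FORMAT, raw[:SIZE])))


def read_tcp_info(sock):
    """Query TCP_INFO on a connected TCP socket (Linux only)."""
    raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, SIZE)
    return parse_tcp_info(raw)
```

Given a connected socket, `read_tcp_info(sock)["sndbuf_limited"]` and `["rwnd_limited"]` would give the time attributed to each end, alongside `rtt` and `total_retrans`.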

04 Sep 2017

Implemented basic route refresh functionality in the BGP disaggregated router, and wrestled with exabgp to find out how to pass through the messages I required to do so. Also spent some time chasing down what looked like bugs in the topology module, but was actually a broken data file that didn't correctly describe the layout of the network.

Had another attempt at getting my chromium/youtube test working. It works fine when I build it within the chromium source tree alongside their example headless applications, but otherwise fails. It appears to be linking against a lot of object files deep inside chromium, as well as the headless static library (which I thought should contain everything needed to build a headless application?), as well as all the normal shared libraries. Back into the too-hard basket until they sort their stuff out or I have some more time to push through this.

Spent a lot of time reading about different approaches to HA/resilience, what sort of information nodes often pass around and how they go about sharing state (or avoiding sharing state).

04 Sep 2017

Finished up writing and testing the new address family selection options in AMP and making sure that all of the tests work properly when they are set. Changing the way the config files worked to allow globally setting options (but able to be overridden at test level) meant there were a few more edge cases than anticipated.
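The global-versus-test override can be sketched as a simple layered lookup. AMP itself is not written in Python and the option names below are made up; this only illustrates the precedence and one of the edge cases (a test ending up with no usable family).

```python
# Hypothetical option names; by default both families are enabled.
DEFAULTS = {"ipv4": True, "ipv6": True}


def effective_families(global_opts, test_opts):
    """Resolve which address families a test may use.

    Precedence (lowest to highest): defaults, global options, test options.
    """
    merged = dict(DEFAULTS)
    merged.update(global_opts)
    merged.update(test_opts)
    families = [name for name in ("ipv4", "ipv6") if merged[name]]
    if not families:
        raise ValueError("all address families are disabled for this test")
    return families


# IPv6 is disabled globally, but one test turns it back on for itself.
global_opts = {"ipv6": False}
default_test = effective_families(global_opts, {})
override_test = effective_families(global_opts, {"ipv6": True, "ipv4": False})
```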

Started thinking and writing about how we might go about making the BGP disaggregated router more resilient, and what situations may arise that it will need to deal with.

04 Sep 2017

Spent some more time trying to fix the chromium/youtube test to get around segfaulting when javascript is present on a page, without any success yet.

Started work on improving manual IPv4/IPv6 address selection in AMP tests. Previously you could specify addresses to use for each address family, but generally couldn't control which family would be used (that was up to DNS). The plan is now to allow enabling/disabling individual address families globally or per test, and not require that a particular address be specified (but still allowing it if desired).

04 Sep 2017

Ranked every libprotoident protocol by observed bytes in three trace sets (from 2012, 2015 and 2017) as part of an attempt to demonstrate the relative transience of application protocols -- i.e. how protocols can appear, die off, surge or plummet in popularity. Some of the ranks didn't quite make sense, so I had to go back and validate a few of the rules. There were a couple of errors, which meant that I had to re-run the ranking analysis once I had fixed them.
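The ranking step amounts to something like the sketch below. The protocol names and byte counts are placeholders invented for the example, not figures from the traces.

```python
def rank_by_bytes(byte_counts):
    """Map each protocol to its rank (1 = most bytes) within one trace set."""
    # Ties are broken alphabetically so the ranking is deterministic.
    ordered = sorted(byte_counts, key=lambda proto: (-byte_counts[proto], proto))
    return {proto: rank for rank, proto in enumerate(ordered, start=1)}


def rank_history(counts_by_set):
    """Map each protocol to its rank in every trace set (None if absent)."""
    ranks = {name: rank_by_bytes(counts) for name, counts in counts_by_set.items()}
    protocols = sorted({p for counts in counts_by_set.values() for p in counts})
    return {p: [ranks[name].get(p) for name in counts_by_set] for p in protocols}


# Placeholder data only: three trace sets with made-up protocols and counts.
counts_by_set = {
    "2012": {"proto_a": 900, "proto_b": 500},
    "2015": {"proto_a": 400, "proto_b": 700, "proto_c": 100},
    "2017": {"proto_b": 300, "proto_c": 800},
}
history = rank_history(counts_by_set)
```

A protocol whose list starts with `None` appeared after the first trace set, and one that ends with `None` has died off, which is the kind of transience the ranking is meant to show.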

Helped Brad with some stress testing of the Endace probe. This had the unfortunate side effect of making Vision unusable for a few days afterwards, but hopefully the process was helpful to Endace in the long run.

Continued working on a method of automating myself.

28 Aug 2017

This week I spent some time comparing the Python 3 and Python 2.7 versions of the disaggregator to see which version offers better performance and is more memory efficient. The Python 3 version of the disaggregator uses less memory and also seems to be a bit faster. This is now the version used in the master branch of the repo.

Most of the week I worked on adding IPv6 support to the disaggregator. I extended the prefix class to support IPv6 prefixes and then extended the BGP prefix parsing to retrieve the advertised and withdrawn IPv6 prefixes. I have also added parsing of the ExaBGP negotiated messages to retrieve the negotiated MP-BGP NLRI families for the disaggregator peers. The code was modified to use these negotiated families when parsing the update prefixes. This will make it easier to deal with other MP-BGP NLRI families that will be supported later on.
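A prefix class that handles both families can lean on Python's standard `ipaddress` module, roughly as sketched here. This is not the disaggregator's actual prefix class; the names are hypothetical and `subnet_of` needs Python 3.7 or later.

```python
import ipaddress


class Prefix:
    """An IPv4 or IPv6 prefix with a family label (sketch)."""

    def __init__(self, text):
        # ip_network picks IPv4Network or IPv6Network based on the text.
        self.network = ipaddress.ip_network(text)

    @property
    def family(self):
        return "ipv4" if self.network.version == 4 else "ipv6"

    def covers(self, other):
        """True if other is equal to, or more specific than, this prefix."""
        if self.network.version != other.network.version:
            return False  # prefixes of different families never overlap
        return other.network.subnet_of(self.network)

    def __eq__(self, other):
        return isinstance(other, Prefix) and self.network == other.network

    def __hash__(self):
        return hash(self.network)

    def __repr__(self):
        return "Prefix(%r)" % str(self.network)


aggregate = Prefix("2001:db8::/32")
specific = Prefix("2001:db8:1::/48")
is_covered = aggregate.covers(specific)
```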

More unit-tests and Mininet unit-test cases were also added to check the new modifications are working. At the end of the week, I also started looking at ways to get the topology information of the network dynamically.

28 Aug 2017

Back at work on Wednesday after a couple of weeks away. Spent most of my week catching up on emails and preparing for a STRATUS workshop coming up on Monday.

Did manage to spend a little bit of time looking at unknown traffic on the Waikato capture point again. Added a couple of new protocols: Smite and Fliggy. I've also found what appears to be another IP sharing "tool" similar to IPSharkk -- this one looks like it is installed as malware, so the network user is probably unaware that their machine is being used to proxy other people's traffic. Will try to dig a little more into this next week.