I've mostly been writing lately, but I have been fixing some bugs in my network tests as well. And in the process I have started getting some results back.
So it turns out that in a network of 1000 nodes, connecting each pair of nodes with two paths creates a lot of paths. And when you are writing in Python, so you aren't worrying about destroying your data structures, and then try to do 1000 iterations with 3 different algorithms, you use a heck of a lot of memory.
So my week has been spent adding proper memory management: turning lists into generators, trying to reuse objects, and splitting my tests up so they can be run as separate processes, all to try to reduce that footprint.
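The generator change can be sketched minimally like this; `node_pairs`, `build_path` and `all_paths` are hypothetical names I'm using for illustration, not the actual test code:

```python
import itertools

def node_pairs(nodes):
    # Lazily yield each unordered pair instead of materialising the
    # full list of ~n*(n-1)/2 pairs up front.
    return itertools.combinations(nodes, 2)

def build_path(src, dst):
    # Placeholder for the real path-construction algorithm.
    return [src, dst]

def all_paths(nodes):
    # A generator expression: each path is built on demand and can be
    # garbage-collected as soon as the consumer is done with it, so
    # only one path is in memory at a time while iterating.
    return (build_path(a, b) for a, b in node_pairs(nodes))

# Streams through all ~500k pairs of a 1000-node network without
# ever holding more than one path in memory.
longest = max(len(p) for p in all_paths(range(1000)))
```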
In the process I discovered a few bugs, like occasional infinite paths, as well as some others that seem to show up only on huge networks, which makes them a nuisance to debug since it takes several minutes to run the algorithms on a network that size.
Anyway, while doing this I have started to get some results back, and they are kinda cool. The three approaches I am testing are: 1. building two minimally overlapping paths between each pair of points; 2. building two minimally overlapping paths in such a way that it is guaranteed you will never have more than two flows to each destination; 3. building one path, but at each hop having an alternative path you can use if the next link is broken.
And it turns out that by far the most efficient method is 3, even though it can require 4 flows per switch per destination. The thing is that you only need those flows on transit hops, and because you don't build the full secondary path you end up with far fewer transit hops. But even cooler is that method 1 is actually more efficient than method 2: because it builds shorter paths, even though each node can have a whole bunch of paths to a destination (and in the worst case it does end up costing more), on average the shorter paths mean fewer total flows.
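One way approach 1 could be implemented is an edge-penalty trick: find a shortest path, then re-run the search with the first path's edges made expensive, so the second path overlaps it only when there is no alternative. This is my own sketch over an undirected adjacency dict, not necessarily the algorithm in my tests:

```python
import heapq

def shortest_path(adj, src, dst, penalised=frozenset()):
    # Dijkstra over {node: set(neighbours)}. Edges in `penalised`
    # cost 100 instead of 1, so they are avoided unless the graph
    # offers no other way through.
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v in adj[u]:
            cost = 100 if frozenset((u, v)) in penalised else 1
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                prev[v] = u
                heapq.heappush(heap, (d + cost, v))
    # Walk the predecessor chain back from the destination.
    path, node = [], dst
    while node != src:
        path.append(node)
        node = prev[node]
    path.append(src)
    return path[::-1]

def two_paths(adj, src, dst):
    first = shortest_path(adj, src, dst)
    used = {frozenset(e) for e in zip(first, first[1:])}
    second = shortest_path(adj, src, dst, penalised=used)
    return first, second

# A small diamond topology with two disjoint routes from 0 to 3.
adj = {0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2}}
a, b = two_paths(adj, 0, 3)
```

On this diamond the two returned paths share no edges at all; on denser graphs the penalty only minimises, rather than forbids, overlap.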
The results from the first successful Doubletree run using the MDA data to approximate regular Traceroute make fairly good sense, and show that the steps taken to reduce unnecessary processing have been successful. The second half of the full analysis has now been included, now that initial testing is complete, and a full analysis run is underway. I also plan to analyse some many-sources scenarios using the data in the reverse direction.
The Megatree MDA data analyser has been updated to analyse some initial many-sources scenarios, and a new run is underway. This meant modifying the factor structure of the program to include new combinations; so far Doubletree and Megatree use the same factor structure. Next I plan to include some many-to-few scenarios that activate varying sized sets of sources one after the other, supplying globally distributed information amongst the sources on that basis.
The ICMP black hole simulator run has completed successfully and results have been generated using warts analysis. This method finds fewer load balancers in total, but it is still useful, as the shortfall can be countered by carrying out more runs, which is what is intended on PlanetLab.
A graph has been produced of probe packet counts versus timing modes, using data from the Internet simulator that runs on CAIDA data. The packet counts for few sources across the timing modes (1stage, 2stage, 10stage and staggered) are very similar, showing that sharing global information has only a small effect. For many sources, 1stage is higher than the others, which are identical. Strangely, the 1stage level is higher than the Traceroute level, so I am rerunning this data point after regenerating the leaves topology data and then the trace list data. The stage counts are the number of stages carried out before all subsequent probing is carried out simultaneously. It would appear that some of the window size options used in the other simulators might also be appropriate here. It is also interesting to note that the processing times recorded for these simulations are of the order of a few minutes, excluding generating the memory map file, which in some cases is reused.
Added a basic test for the entire parallel libtrace system, based upon the existing test-format code. This tests most of the basic functions, including pausing and restarting a trace, but is limited to the file-based formats.
I set up a BSD VM at home and spent some time tidying up the code to ensure it still compiled. Merged in the latest upstream fixes, and spent a while finding a build bug where visibility for some 'exported' functions was hidden.
Fixed the delay in name resolution which was causing one amplet to
time out tests when first starting. It was caused by the default
behaviour being to act as a recursive resolver when no resolvers
were specified. It now properly uses the nameservers listed in
/etc/resolv.conf if no overriding configuration is given.
Implemented the change in the receive code to use timestamps direct from
the test sockets. All tests that use the AMP library functions to
receive packets will be able to pass in a pointer to a timeval and have
it filled with the receive time of the packet. I haven't merged this yet
as I plan to spend some more time testing it under load and comparing it
to the previous approach.
Updated the schedule/nametable to allow selecting specific address
families of test targets even if the name/address pair was manually
specified in the nametable rather than resolved using DNS. This should
all behave consistently in the schedule file now regardless of type.
Spent some time investigating a bug in the code that renames the test
processes to more useful names than the parent's. rsyslog starts printing
incorrect process names when logging, which can lead to crashes. Renaming
works fine when run in the foreground with logging directly to the
terminal, and the correct process names are shown.
Short week last week, after being sick on Monday and recovering from a spot of minor surgery on Thursday and Friday.
Finished adding support for the amp-throughput test to amp-web, so we can now browse and view graphs for amp-throughput data. Once again, some fiddling with the modal code was required to ensure the modal dialog for the new collection supported all the required selection options.
Lines for basic time series graphs (i.e. non-smokeping style graphs) will now highlight and show tooltips if moused over on the detail graph, just like the smoke graphs do. This was an annoying inconsistency that had been ignored because all of the existing amp collections before now used the smoke line style. Also fixed the hit detection code for the highlighting so that it would work if a vertical line segment was moused over -- previously we were only matching on the horizontal segments.
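The amp-web code is JavaScript, but the fix for vertical segments amounts to doing proper point-to-segment distance rather than matching only on horizontal extent; a generic sketch of that geometry (my illustration, not the actual hit detection code):

```python
def hit_test(px, py, x1, y1, x2, y2, radius=4):
    # True if the mouse point (px, py) is within `radius` pixels of
    # the segment (x1, y1)-(x2, y2). Nothing here divides by the x
    # extent, so vertical segments are handled the same as any other.
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        # Degenerate segment: a single point.
        cx, cy = x1, y1
    else:
        # Project the point onto the segment, clamped to [0, 1].
        t = ((px - x1) * dx + (py - y1) * dy) / (dx * dx + dy * dy)
        t = max(0.0, min(1.0, t))
        cx, cy = x1 + t * dx, y1 + t * dy
    return (px - cx) ** 2 + (py - cy) ** 2 <= radius ** 2
```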
I wrote my interim report a couple of weeks ago (since the last time I wrote a blog entry) and this week I have been beginning to get back into the swing of things. I spent some time looking into 6LoWPAN/RPL gateway router solutions (to bridge the LoWPAN with Ethernet) and found that a common solution (6LBR) seems to be to run the router on a more powerful device such as a Raspberry Pi or BeagleBoard, using the radio of a USB-connected mote via a serial tunnel. An alternative to this would be to connect a radio directly to the board, but radios available are limited and generally not mass produced. To this end I went back to trying to get a serial tunnel working so that I could communicate between the host PC and a (remote) mote. I read into the code for the tunnel a little further and managed to work out that packets are being received and recognised between the host PC and tunnel mote, but not forwarded further to the remote mote. This is confusing since I had expected the problem to lie with the tunnel software (which I had such trouble initially compiling) and based on what I've read, it sounds like the motes should automatically establish a LoWPAN between themselves and communication should be no problem.
Following the plan I laid out in my interim report, I was aiming this week to nail down the hardware I would need for the coming weeks. Since Richard N has procured another STM32w RFCKit for me already and Richard S has a Raspberry Pi he doesn't need any more, I should be good to go.
A single simulation rerun was completed on the Doubletree Internet simulator, and the automated graphs were updated to include this data point. The graphing was also modified to draw bars rather than lines and points, and monochrome dashed boxes were set up in the png file output. Looking at the data I noted that the numbers of control packets reported need to be checked. Once this is done I plan to run the simulator with reduced data to create another factor variable.
Automated graphing was also set up for the Megatree Internet simulator, and the monochrome png file settings were applied to this program as well. Once again a number of categorical variables were placed on the x axis: timing, meaning the number of consecutive traces that contributed to the local and global data sets; the number of sources included in the simulations; and similarly the number of destinations.
The black hole detector drivers were changed over to ICMP probing, and a run was set going on Yoyo. This was a step towards running on PlanetLab. The original data that made me think I was finding black holes in load balancers was reviewed and a typical case was analysed. This was a case where Paris Traceroute ran one hop shorter than the two MDA Traceroute runs. The MDA traces showed that one arm of a terminal load balancer connected to the destination, whereas the other had no further links found. A trace collected using Paris Traceroute goes down only one arm of a given load balancer, because the flow ID is kept constant. It is interesting that in this case the Paris trace stops short in the other arm without reaching the destination, to which there should be a valid path. Note that the convergence point of the load balancer did not appear to be reached.
Deployed a new version of the amplet client to most of the monitors.
Found some new issues with name resolution taking too long and timing
out tests on one particular machine. Fixed the SIGPIPE caused by this,
but have yet to diagnose the root cause. No other machine exhibits this.
Kept looking into the problem with packet timing when the machine is
under heavy load, and after looking more closely at the iputils ping
source managed to find a solution. Using recvmsg() grants access to
ancillary data which, if configured correctly, can include timestamps. My
initial failed testing of this didn't properly set the message
structures; doing it properly gives packet timestamps that are much
more stable under load.
Updated the test schedule fetching to be more flexible and easier to
deal with. Client SSL certs are no longer required to identify the
particular host (but can still be used if desired).
I investigated the implementation of BPF filters with regard to thread safety. It turns out that what I originally suspected is true: the compilation is not thread safe, but running the filter is. The possible problem depended on where the scratch memory for the filter was stored; it appears that when running the filter this is allocated on the stack, both within the pcap code and in the kernel implementations. I suspect that the LLVM JIT does the same, but since it is disabled by default I'm not too concerned for now. BPF filter compilation, on the other hand, uses lots of globals in libpcap and is not thread safe; this case I have already locked against previously.
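The locking pattern is simple to sketch generically: serialise compilation behind one lock, while the compiled filters run concurrently with no lock at all. Here `compile_filter` and `run_filter` are toy stand-ins for the libpcap calls, with a dict standing in for libpcap's global parser state:

```python
import threading

# Compilation touches shared global state (as pcap_compile() does in
# libpcap), so every caller must hold this one lock while compiling.
_compile_lock = threading.Lock()
_global_scratch = {}  # stands in for libpcap's global parser state

def compile_filter(expression):
    # Hypothetical stand-in for pcap_compile(): not thread safe on
    # its own, so the whole body runs under the lock.
    with _compile_lock:
        _global_scratch["expr"] = expression
        terms = _global_scratch["expr"].split(" and ")
        return tuple(terms)  # immutable "program": safe to share

def run_filter(program, packet):
    # Hypothetical stand-in for executing the BPF program: it uses
    # only its arguments and the stack, so no lock is needed and many
    # threads can run filters at once.
    return all(term in packet for term in program)
```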
Hooked up configuration options for the tick packets, allowing them to be turned on and off with a specified interval at 1 msec resolution.
Reworked how aggregation binsizes are calculated for the graphs. There is now a fixed set of aggregation levels that can be chosen, based on the time period being shown on the graph. This means that we should hit cached data a lot more often rather than choosing a new binsize every few zoom levels. Increased the minimum binsize to 300 seconds for all non-amp graphs and 60 seconds for amp graphs. This will help avoid problems where the binsize was smaller than the measurement frequency, resulting in empty bins that we had to recognise were not gaps in the data.
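The selection logic might look something like this (a sketch with made-up level values, not the actual graphing code): pick the smallest allowed binsize that respects the per-graph minimum and keeps the number of bins on screen reasonable.

```python
# Fixed aggregation levels in seconds; only these binsizes are ever
# used, so repeated zooms map onto the same cached aggregates instead
# of generating a fresh binsize every few zoom levels.
LEVELS = [60, 300, 900, 3600, 14400, 86400]

def choose_binsize(duration_secs, max_bins=300, minimum=60):
    # Smallest level that is >= the minimum for this graph type and
    # produces no more than max_bins bins across the time period.
    for level in LEVELS:
        if level >= minimum and duration_secs / level <= max_bins:
            return level
    return LEVELS[-1]
```

Keeping the minimum at or above the measurement frequency (300 s for non-amp graphs here) is what prevents the empty-bin problem described above.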
Added new matrices for DNS data, one showing relative latency and the other showing absolute latency. These act much like the existing latency matrices, except we have to be a lot smarter about which streams we use for colouring the matrix cell. If there are any non-recursive tests, we will use the streams for those tests as these are presumably cases where we are querying an authoritative server. Otherwise, we assume we are testing a public DNS server and use the results from querying for 'google.com', as this is a name that is most likely to be cached. This will require us to always schedule a 'google.com' test for any non-authoritative servers that we test, but that's probably not a bad idea anyway.
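The stream-selection rule reduces to a short preference function; this is a sketch over hypothetical stream dicts, not the actual ampy schema:

```python
def colouring_streams(streams):
    # Prefer non-recursive tests: these presumably query an
    # authoritative server, so their results reflect that server.
    authoritative = [s for s in streams if not s["recursive"]]
    if authoritative:
        return authoritative
    # Otherwise assume a public recursive resolver and use the query
    # most likely to be answered from cache.
    return [s for s in streams if s["query"] == "google.com"]
```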
Wrote a script to more easily update the amp-meta databases to add new targets and update mesh memberships. Used this script to completely replace the meshes on prophet to better reflect the test schedules that we are running on hosts that report to prophet.
Merged the new ampy/amp-web into the develop branch, so hopefully Brad and I will be able to push out these changes to the main website soon.
Started working on adding support for the throughput test to ampy. Hopefully all the changes I have made over the past few weeks will make this a lot easier.