My conference slides were updated to include a many sources to many destinations diagram of the situation that we wish to investigate. A short blurb on stage counts was also added to explain the stage information in the graphs, in particular that no control information is sent on the final stage, as it would serve no purpose.
The event based simulator has been updated to use source windows, which give a more realistic model of the many sources to few destinations scenario. Source windows involve dividing the traces into equal sized blocks and only forwarding control information to the next block to be run. The traces within each block, or window, run simultaneously. The changes did not compile at first, until a few type handling bugs were ironed out. The first run is underway, so I will soon see whether the results are sensible.
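A minimal sketch of the source window idea follows. It is not the simulator's actual code: `run_trace` is a hypothetical callback that runs one trace given the control information it may use and returns what that trace discovered. The sketch also assumes that only the findings of the immediately preceding window are forwarded, which the description above does not pin down.

```python
def split_into_windows(traces, window_size):
    """Divide the traces into consecutive blocks of window_size.

    The final block is shorter when the trace count is not a multiple
    of window_size.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    return [traces[i:i + window_size]
            for i in range(0, len(traces), window_size)]


def run_windows(traces, window_size, run_trace):
    """Run each window in turn, forwarding control info to the next window.

    run_trace(trace, control) is a hypothetical callback returning the set
    of items that the trace discovered.
    """
    control = frozenset()  # nothing is known before the first window
    history = []
    for window in split_into_windows(traces, window_size):
        discovered = set()
        for trace in window:
            # Every trace in a window sees the same control information,
            # which models the traces running simultaneously.
            discovered |= run_trace(trace, control)
        history.append(discovered)
        # Assumption: only this window's findings reach the next window.
        control = frozenset(discovered)
    return history
```

Traces inside a window never see each other's findings, so a larger window means less sharing and, presumably, more redundant probing.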
Spent some more time checking up on the traceroute test, after merging
all the stopset/ASN changes. Found and fixed a case where ICMP error
codes weren't being properly recorded. Also found and fixed what appears
to be the main cause of the test running too long - some targets will
decrement the TTL before responding with a port unreachable message,
which throws the path length estimate off by one and can cause the same
TTL to be probed multiple times.
Added the ability to signal tests that their time is running out, giving
them an opportunity to report any partial results they have collected
and to gracefully exit before they get killed. This is configurable per
test type, depending on whether or not it is possible to get useful
information without the test entirely completing.
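The general pattern can be sketched in Python as below. This is an illustration only, not the amplet client code: the class, its `report_partial` flag and the choice of SIGTERM as the warning signal are all assumptions made for the example.

```python
import signal


class PartialResultTest:
    """Hypothetical test that can stop early when warned about time."""

    def __init__(self, targets, report_partial=True):
        self.targets = list(targets)
        # Per test type setting: are incomplete results worth reporting?
        self.report_partial = report_partial
        self.results = []
        self.time_up = False

    def on_warning(self, signum=None, frame=None):
        # Signal handler: just set a flag and let the main loop react.
        self.time_up = True

    def run(self, probe):
        for target in self.targets:
            if self.time_up:
                break  # stop probing and exit gracefully
            self.results.append(probe(target))
        complete = len(self.results) == len(self.targets)
        if complete or self.report_partial:
            return self.results
        return None  # partial results are not useful for this test type


def install_warning_handler(test, signum=signal.SIGTERM):
    """Arrange for the warning signal (assumed here) to set the flag."""
    signal.signal(signum, test.on_warning)
```

Keeping the handler down to setting a flag means the test decides at a safe point in its own loop whether to report or discard what it has collected.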
Updated the schedule interface to display a bit more information about test
timings, and tidied up some documentation about the new format. Fixed
the raw interface to properly check if-modified-since headers from
amplets requesting new configs, so only new configs are sent.
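As a rough illustration of the if-modified-since check (not the actual raw interface code), the decision can be sketched as below. The function name and its arguments are hypothetical, and `config_mtime` is assumed to be a timezone-aware datetime.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def config_response(if_modified_since, config_mtime, config_body):
    """Return (status, body): 304 if the client's copy is current, else 200."""
    # HTTP dates only have one second resolution, so drop anything finer.
    config_mtime = config_mtime.replace(microsecond=0)
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: just send the config
        if since is not None:
            if since.tzinfo is None:
                # Treat a date without a usable zone as UTC.
                since = since.replace(tzinfo=timezone.utc)
            if config_mtime <= since:
                return 304, b""  # nothing new, send no config
    return 200, config_body
```

A missing or malformed header falls through to a full response, so a confused client still ends up with a valid config.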
Modified the amp-web matrix to add a dropdown selector for the type of latency to show on the latency matrix (TCP, ICMP or DNS). Removed the tabs for absolute and relative DNS latency, as this is now incorporated into the generic latency tabs.
My heuristics for identifying multimodal series were not quite as effective as I had hoped, so I spent the remainder of my week investigating methods used by real statisticians to find modes in a sample set. The approach I have taken involves estimating the probability density from the observed measurements using a kernel function. This results in a smoothed line graph where the peaks represent likely modes.
By examining the differences between consecutive values on the line graph, I find the local maxima and minima in the density function. The maxima are, of course, the modes themselves while the minima are required for the following step. I then use Fisher and Marron's method to eliminate or merge "minor" modes in my set of maxima. This seems to work reasonably well in the limited test cases I have provided so far, although much of the math is too complicated for me to implement entirely within netevmon. Instead, it looks like we will be calling out to R to generate the density function, but it seems likely that R will be able to do this much faster than any naive implementation I write anyway.
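The density estimation and extrema steps can be sketched in pure Python as follows. This is only an illustration: it assumes a Gaussian kernel with a hand-picked bandwidth, evaluates the density on a fixed grid, and leaves out the Fisher and Marron step entirely. The function names are made up for the example and are not part of netevmon or R.

```python
import math


def gaussian_kde(samples, bandwidth, grid):
    """Estimate the probability density at each grid point."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                   for s in samples)
        for x in grid
    ]


def find_extrema(grid, density):
    """Find local maxima and minima from consecutive differences."""
    maxima, minima = [], []
    prev_sign = 0
    for i in range(1, len(density)):
        diff = density[i] - density[i - 1]
        sign = (diff > 0) - (diff < 0)
        if sign == 0:
            continue  # flat step: wait for the next real change
        if prev_sign > 0 and sign < 0:
            maxima.append(grid[i - 1])  # rising then falling: a peak
        elif prev_sign < 0 and sign > 0:
            minima.append(grid[i - 1])  # falling then rising: a trough
        prev_sign = sign
    return maxima, minima


# Example: latencies clustered around two values.
samples = [9, 10, 10, 11, 29, 30, 30, 31]
grid = [i * 0.5 for i in range(81)]  # 0.0 to 40.0 in steps of 0.5
maxima, minima = find_extrema(grid, gaussian_kde(samples, 1.5, grid))
```

The choice of bandwidth matters a great deal here: too small and every measurement becomes its own peak, too large and genuine modes blur together.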
Finished moving all the standalone traceroute ASN fetching from DNS to
the TCP bulk interface. Decided to reuse the trie data structure to make
an actual unique set of addresses to query (rather than the previous
simple system that just looked at nearby ones), minimising the data
needing to be sent/received. Fixed a few bugs in the buffer management
that meant new ASN data was possibly clobbering the last unprocessed
portion from a previous read. Merged all these changes and they should
now be running on a test amplet deployment.
Fixed up some bugs in the new schedule parsing code that didn't work
properly when the test type was not specified. Most other settings were
optional and had sensible default values, but it wasn't expected that
the most important option would be missing from (usually generated)
files. Schedule items without a test type are now properly ignored. Also
merged all these changes which are now running on a test deployment.
Added parameters for the throughput and HTTP tests to the scheduling web
interface. Slightly modified the throughput test options to make it much
easier to schedule the sorts of tests that it is commonly used for. Also
updated the HTTP test to follow 3XX redirects and to record that they
happened (with timings, sizes etc for both the redirect and the followup
One graph in the slides for the internal PhD conference has fewer categories than the others, so I updated the Megatree simulator and started a rerun to gather the extra data. The data required are the packet counts, including control packets, for the case where global information is used without local information.
Further black hole data was downloaded from Planetlab and analysed. The analysis was upgraded to fix a minor problem in dealing with asymmetric load balancers, so the counts of these will now be more accurate. The issue is that some nodes may occur at more than one TTL or hop count. If these multiple hop counts are determined, then the simplified algorithm I use to decide whether a Paris Traceroute stop point node is in a load balancer will function better.
Keywords were added to the PAM paper, and then the paper was resubmitted.
Finished and submitted my PAM paper, after incorporating some feedback from Richard.
Fixed a minor libwandio bug where it was not giving any indication that a gzipped file was truncated early and content was missing.
Managed to get a new version of the amplet code from Brendon installed on my test amplet. Set up a full schedule of tests and found a few bugs that I reported back to the developer. By the end of the week, we were getting closer to having a full set of tests working properly -- just one or two outstanding bugs in the traceroute test.
Got netevmon running again on the test NNTSC. Noticed that we are getting a lot of false positives for the changepoint and mode detectors for test targets that are hosted on Akamai. This is because the series is fluctuating between two latency values and the detectors get confused as to which of the values is "normal" -- whenever it switches between them, we get an erroneous event. Added a new time series type to combat this: multimodal, where the series has 2 or 3 clear modes that it is always switching between. Multimodal series will not run the changepoint or mode detectors, but I hope to add a special multimode detector that alerts if a new and different mode appears (or an old mode disappears).
Spent last week on leave, getting my balance down :)
Results from the last non event based simulator run have been analysed and the graphs added to the PhD conference slide set. The graphs show a simulation of many sources to few destinations using the Traceroute multipath detection algorithm (load balancer) data. They include the effects of both the global inter-monitor data and the local intra-monitor data.
The churn paper has been further refined for PAM and submitted. This involved adjusting headings, adding keywords, moving table captions, setting letter paper and adding some address data.
Another run of the black hole detector has been finished and the results downloaded.
Turned a lot of the scheduling web interface code into templates that
can be reused between creating and updating tests. They were similar
enough that most of it can be reused, with only a few minor changes
specific to each view.
Fixed up some small bugs in the ASN query code to make sure that all
addresses in the path are fetched (paths shorter than the initial TTL
weren't querying for the ASN of the final hop). The cache will now be
cleared regularly during operation and will also tidy up properly after
itself on program end. Started work on replacing the ASN fetching using
DNS with the TCP bulk whois for the standalone traceroute tests too.
Spent some time applying patches and building old bash from source to
update the old amplets against the new bash vulnerability. These
machines are really due for a software refresh!
This week the focus has been moved to writing my honours report.
I've started from my mid-term report and I am reusing some parts of the introductory chapters which have changed little. As with a lot of honours students, the word count tracking my progress is up at http://wand.net.nz/520/.