Started the week by summarising the Smokeping data that Shane and I collected last year. This included grouping the streams based on average means (i.e. < 5, < 30, < 100, > 100) and summing up the number of false positives and significant/insignificant/unclassified events, both for the whole stream and on a per-detector basis. Using these numbers, I was able to derive accurate probability values for each detector. This also made it easy to see exactly where we needed more data, e.g. we only had 5 Mode events across all the streams with an average mean of < 5.
Then I modified my eventing Python script to use different probabilities based on the detector that fired and the average mean of the stream at that time. These probability values will still need to be updated later on, since the sample size is too small for some of the detectors. However, growing the sample is difficult: some detectors (especially Mode) only fire occasionally, when the mode of the time series has changed considerably.
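The lookup described above could be sketched roughly as follows. The bucket boundaries match the grouping used in the summary, but the detector names beyond Mode, the probability figures, and the function names are all invented for illustration, not the real values from the Smokeping analysis.

```python
# Hypothetical per-(detector, mean-bucket) probability lookup.
# Buckets follow the summary's grouping: < 5, < 30, < 100, >= 100.
BUCKETS = [5, 30, 100]

# P(significant event | detector fired), per (detector, bucket index).
# All numbers here are made up for the sketch.
PROBABILITIES = {
    ("Plateau", 0): 0.65, ("Plateau", 1): 0.72,
    ("Plateau", 2): 0.70, ("Plateau", 3): 0.80,
    ("Mode", 0): 0.50,  # only 5 samples in this bucket, so unreliable
    ("Mode", 1): 0.60, ("Mode", 2): 0.75, ("Mode", 3): 0.85,
}

def mean_bucket(avg_mean):
    """Return the index of the bucket that avg_mean falls into."""
    for i, bound in enumerate(BUCKETS):
        if avg_mean < bound:
            return i
    return len(BUCKETS)

def detector_probability(detector, avg_mean):
    """Look up the significance probability for a detector firing on a
    stream with the given average mean."""
    return PROBABILITIES[(detector, mean_bucket(avg_mean))]
```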
Spent some time looking over Bayes' theorem, which I plan to use as a point of comparison between different fusion methods.
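As a toy illustration of the Bayesian fusion approach being considered, each firing detector could update the posterior probability of a significant event via a likelihood ratio, assuming conditional independence between detectors. The function name and all numbers are invented for the sketch.

```python
def bayes_fuse(prior, likelihood_ratios):
    """Combine a prior P(event) with per-detector likelihood ratios
    P(fired | event) / P(fired | no event) using Bayes' theorem in
    odds form: posterior odds = prior odds * product of ratios."""
    odds = prior / (1.0 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1.0 + odds)

# With a 10% prior and two detectors firing (ratios 4.0 and 2.0),
# the posterior rises to 8/17, i.e. roughly 47%.
posterior = bayes_fuse(0.1, [4.0, 2.0])
```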
This week I wanted to dedicate some time to cleaning up the fragmented amp-web interface to improve the consistency of CSS and markup across the site, and to remove unnecessary JS libraries, in the process determining which libraries best suit our purposes in cases where features overlap.
As a first step I included Bootstrap's CSS globally and rewrote the rest of the global stylesheet around it, restructuring hacky CSS that relied on (inline) markup. I would prefer to include only the Normalize reset and Bootstrap's class-based CSS rather than the base CSS that styles other elements, which I might investigate at some point, but for now everything works fairly well. Including Bootstrap globally broke the Matrix, whose CSS definitions overlapped with Bootstrap's (so I fixed this temporarily by renaming the affected classes).
So after breaking the Matrix, I spent a lot of time cleaning it up (albeit mostly because I wasn't aware it existed in the first place). The Matrix used jQuery UI to display tooltip-style popups, which I replaced with a similar feature (popovers) in Bootstrap. This took a bit of time and more rewriting than I'd expected, as the JS for instantiating each is very different (particularly as the popovers are intended to appear on click rather than on mouseover), but it worked eventually and I managed to streamline some of the Matrix code in the process. I also replaced the Matrix's jQuery UI tabs with custom ones, which allowed me to remove jQuery UI and its CSS, the JS library cssSandpaper (which had been used for backwards compatibility that wasn't really relevant), and its dependent libraries cssQuery, EventHelpers, sylvester and textShadow.
I added a CSS hack to fix graphical glitches that were sometimes produced when rendering rotated text (the Matrix headers). It seemed only to occur on Voodoo, but as well as preventing the flickering, the fix also appears to have improved legibility on all platforms.
Finally, I spent some time integrating the traceroute map with the latest changes and updated it to use real data. It was interesting to see what a difference this made to the summary view, whose highly aggregated data is no longer useful for representing unique paths or for showing where paths change. I will have to look at how best to address this next.
Got back into the swing of things by spending the week fixing a multitude of UI problems and general bugs in Cuz, with the aim of getting closer to something we feel comfortable demonstrating at NZNOG.
The main improvements are:
* Finally added a "graph browser" page which lets the user choose a collection to explore.
* Event groups are shown on graphs rather than individual events. This greatly reduces clutter when big events occur.
* Fixed various inconsistencies between the line colour shown on the legend and the line colour actually being drawn on the graph.
* Stopped creating tabs that go to empty graphs.
* Fixed a bug where the rainbow summary graph would only show the first couple of hops rather than the entire path.
* Added basic tooltips to the legend which show more detail about the group being moused over, e.g. what exactly is represented by each line colour.
* Better handling of database exceptions in Cuz so that Brendon's buggy AMP test results don't crash NNTSC :)
So over the break I was trying to fix ovs, but after finally talking this week to the developer who wrote the ovs MPLS branches, I am now giving up on that.
So instead I have the polling working with vlan tags and unique flows for each pair of nodes. It is currently just printing out values for packets sent and received, but it is counting them correctly and not losing any packets.
So then I started reading a few papers on passive monitoring techniques, focusing on how they were tested. They've actually been fairly interesting; a couple use very similar techniques to mine.
This week I included the PlateauLevelDetector into my Hidden Markov Model Detector.
The HMMDetector now subclasses the PlateauLevelDetector, passing it "magnitude" values (the quotient of the base log-probability and the current log-probability) that are sufficiently smoothed out. This means the existing plateau detection code can be used to determine whether the probabilities generated by the HMM are due to an event state.
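The subclassing arrangement might look something like the sketch below. The real PlateauLevelDetector interface isn't shown in this post, so the stand-in here (its `process_value` method, the running-mean threshold logic, and the method names on HMMDetector) is entirely invented to illustrate the shape of the design.

```python
class PlateauLevelDetector:
    """Stand-in plateau detector: flags values well above a running mean.
    The real detector's logic and interface are assumed, not known."""
    def __init__(self, threshold=2.0):
        self.threshold = threshold
        self.history = []

    def process_value(self, value):
        if len(self.history) >= 10:
            mean = sum(self.history) / len(self.history)
            eventful = value > mean * self.threshold
        else:
            eventful = False  # not enough history yet
        self.history.append(value)
        return eventful

class HMMDetector(PlateauLevelDetector):
    """Feeds the plateau code a "magnitude" derived from the HMM's
    log-probabilities, as described above."""
    def __init__(self, base_logprob, **kwargs):
        super().__init__(**kwargs)
        self.base_logprob = base_logprob

    def process_logprob(self, current_logprob):
        # Quotient of the base log-probability and the current one,
        # which is smooth enough for the plateau detection to work on.
        magnitude = self.base_logprob / current_logprob
        return self.process_value(magnitude)
```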
I have tested this detector with its current parameters and updated the testing data provided by Shane. These tests show that the detector works well for some streams, but not so well for others. Future work could look at optimising the detector parameters on a per-stream basis.
Further analysis of the CAIDA run data was carried out, ignoring load balancers with a collection TTL less than five when determining the population proportions of paths with different load balancer types and different probe types.
The paper was updated with the new results and some other aspects were updated as well.
The six-monthly report was filled in with goals and achievements. This may need to be further updated.
This week I managed to multi-thread my implementation of the Genetic Algorithm.
This reduced the runtime of the algorithm by approx. 30%.
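The kind of parallelisation involved could be sketched as below, using a process pool to score the population concurrently. The fitness function, the population representation, and the worker count are all invented stand-ins; the real GA tunes detector parameters rather than minimising a toy function.

```python
from multiprocessing import Pool

def fitness(individual):
    # Stand-in fitness: the real GA would score a candidate set of
    # detector parameters against the test data instead.
    return sum(x * x for x in individual)

def evaluate_population(population, workers=4):
    """Score every individual in parallel. Fitness evaluation usually
    dominates a GA's runtime, so this is the step worth parallelising."""
    with Pool(workers) as pool:
        return pool.map(fitness, population)
```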
After optimising the GA I have begun to investigate building the detector proper, using the PlateauLevelDetector to smooth out the otherwise very noisy probabilities.
The warts analysis program has been upgraded to provide further information for the paper. This has included counting ICMP paths with no route changes and accumulating non-matching load balancer successor sets for ICMP.
Counts of load balancers at given hop counts have also been calculated for three packet types and three load balancer types. Following this, a program has been written to count the percentage of paths with one of the three load balancer types, where load balancers with a hop count less than five are not counted. This is to avoid repeatedly counting the same load balancers.
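The hop-count filtering described above could be sketched like this. The data format (paths as lists of `(hop_count, lb_type)` tuples), the cutoff constant, and the function name are invented for illustration; the real program works on warts data.

```python
MIN_HOP = 5  # skip near-source hops to avoid double-counting balancers

def path_percentages(paths):
    """paths: list of paths, each a list of (hop_count, lb_type) tuples
    where lb_type is None for hops with no load balancer. Returns the
    percentage of paths containing each load balancer type at or beyond
    MIN_HOP, counting each type at most once per path."""
    counts = {}
    for path in paths:
        types = {lb for hop, lb in path if lb and hop >= MIN_HOP}
        for lb in types:
            counts[lb] = counts.get(lb, 0) + 1
    return {lb: 100.0 * c / len(paths) for lb, c in counts.items()}
```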
Some of the graphs in the paper have been redone to include ICMP data.
The points raised about the paper at the last meeting have been addressed, though the load balancer path percentages are being run today.
Spent a fair amount of time reading papers on the limits of, and alternatives to, Dempster-Shafer for combining evidence. The main limitation of D-S is that it can produce counter-intuitive results in cases of strong conflict between the combined beliefs. However, there are no elements of conflict in the belief functions for the detectors, which makes D-S the preferred option (until I find a better alternative). I also came across a number of other rules (Bayesian, fuzzy logic, TBM, etc.) that I plan to read about during the break.
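For reference, Dempster's rule of combination over a two-hypothesis frame ({event, no-event}) can be sketched as below. The mass values in the usage note are illustrative, not the detectors' real belief functions; the normalisation by 1 − K is the step that behaves counter-intuitively when the conflict K is large.

```python
def dempster_combine(m1, m2):
    """Combine two basic belief assignments (dicts mapping frozensets
    of hypotheses to mass) using Dempster's rule: multiply masses of
    intersecting focal elements, then normalise out the conflict K."""
    combined = {}
    conflict = 0.0
    for a, ma in m1.items():
        for b, mb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + ma * mb
            else:
                conflict += ma * mb  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: masses cannot be combined")
    return {s: m / (1.0 - conflict) for s, m in combined.items()}
```

For example, two non-conflicting detectors assigning 0.6 and 0.7 mass to {event} (with the rest on the whole frame) combine to 0.88 on {event}, with no conflict to normalise away.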
Also spent a considerable amount of time looking at the events for the AMP-ICMP graphs for the Quarterpounder to Google.com stream. There are a huge number of events, which makes grouping and rating them take forever. I need to do more of the amp-icmp stream analysis before calculating the belief values for each detector, and that's something that I plan on doing during the break too.
As discussed with Richard and Brendon, I aimed this week to produce a more fully connected version of the traceroute graph, for the purpose of visualising the networks represented by interconnected routes more accurately. I knew that in order to do this I needed to abandon my existing tree structure and instead represent the data as a graph, so I spent some time reading about drawing algorithms for directed graphs, which led me to read further into hierarchical graphs, which appeared to be exactly what I was looking for. Hierarchical graphs are typically drawn according to Sugiyama's framework, which breaks the process down into five stages, each requiring the selection of an algorithm that meets the requirements of the application.
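As a flavour of one Sugiyama stage, layer assignment on an acyclic graph can be done with the simple longest-path method: each node's layer is the length of the longest path reaching it from any source. This toy sketch (graph as a dict of successor lists) is illustrative only and is not how Dagre represents its input.

```python
def assign_layers(graph):
    """Longest-path layering for a DAG given as {node: [successors]}.
    Sources get layer 0; every edge then points downward by at least
    one layer, which is the property later Sugiyama stages rely on."""
    layers = {}

    def layer(node):
        if node not in layers:
            preds = [u for u, succs in graph.items() if node in succs]
            layers[node] = 0 if not preds else 1 + max(layer(p) for p in preds)
        return layers[node]

    for node in graph:
        layer(node)
    return layers
```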
Of course, multiple stages of the layout process are NP-hard or NP-complete, so drawing these graphs is slow. Dagre would hang the browser for potentially several seconds before anything could be drawn to the canvas.