Added configuration modals for dns, smokeping, munin and some lpi data,
so that multiple data series (of the same type) can easily be viewed at
once and compared. Refactored the initial modal implementation used by
icmp and traceroute data to be much cleaner and to make integrating the
new data types easier.
Updated the legend labels describing the currently displayed data to use
the more detailed information Shane has made available. Included in this
is a line number that is now used to fix the colouring order, making
sure that the line colour matches what the label describes.
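Using a fixed line number to pin down the colouring order could look something like the following. This is a hypothetical Python sketch: the palette values and field names are invented for illustration and are not the actual amp-web code.

```python
# Hypothetical sketch: assign each data series a colour from its fixed
# line number, so the plotted colour always matches the legend label.
PALETTE = ["#ed2e2e", "#2e67ed", "#2eed5a", "#edb02e"]

def colour_for(line_number):
    """Map a series' line number to a palette colour deterministically."""
    return PALETTE[line_number % len(PALETTE)]

# Example series with the line numbers included in the legend metadata.
series = [
    {"label": "amplet to example.com (ipv4)", "lineno": 0},
    {"label": "amplet to example.com (ipv6)", "lineno": 1},
]
colours = [colour_for(s["lineno"]) for s in series]
```

Because the mapping depends only on the line number, redrawing the graph with series in a different order cannot shuffle the colours.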
Spent some time reworking small details in the newest AMP Debian
packages so that they install and run properly when deployed by puppet
on the new
Haven't been super productive lately. I am still digging away at openvswitch, as well as reading up on what I will do if I actually discover packet loss (that is, packet loss not caused by problems in experimental branches of openvswitch).
Warts analysis was updated to find matching load balancer successor sets when counting non-matching successor sets for each unique load balancing interface. This confirmed the analysis of the numbers of successors to the same load balancing interface.
Warts analysis was designed for data from the updated scamper running on PlanetLab. This update makes use of more destination addresses for per-destination analysis. It would be useful to collect some per-destination data with a larger set of addresses.
Changes were made to the section of the paper on turnover. Data with no route changes have been analysed together and incorporated into the text. Data on the numbers of successor sets for each unique load balancing interface were incorporated.
The discussion and conclusions sections of the paper have been written.
Continuing work on parallelising libtrace: added sorting methods for use with ordered results, such as packets that will be written to file in the reduce step (as returned by traceanon, for example). This has resulted in close to double the speed in the encryption case for traceanon on my dual-core testing machine.
Added an inter-thread message handling system, allowing a mapper to easily trigger the reduce step to run, signal when a mapper thread ends, and so on.
Still ironing out some kinks with possible deadlocks and other issues in the code.
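The ordered-result handling described above can be sketched in Python, even though the real implementation lives in libtrace's C code. Everything here (the tuple layout, function name, and example data) is invented for illustration.

```python
import heapq

def ordered_reduce(per_thread_results):
    """Merge per-mapper result streams so the reducer sees packets in
    timestamp order, as needed when writing them out to a file (the
    traceanon case). Each input iterable must already be sorted by
    timestamp, which holds when each mapper processes packets in order."""
    # heapq.merge lazily combines the already-sorted streams.
    return heapq.merge(*per_thread_results, key=lambda pkt: pkt[0])

# Two mapper threads each produce (timestamp, payload) tuples in order.
thread_a = [(1.0, "a1"), (3.0, "a2")]
thread_b = [(2.0, "b1"), (4.0, "b2")]
merged = list(ordered_reduce([thread_a, thread_b]))
```

The merge is O(n log k) for k threads and never buffers more than one pending packet per thread, which matters when the mappers run far ahead of the reducer.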
I met with Dr. Joshi from the Stats Dept and confirmed that the method I was using was indeed correct. He also mentioned looking into Bayes' theorem as an alternative, and I spent some time reading up on it. There is an element of "undecidedness" with the event significance, which is why Dempster-Shafer is more appropriate than Bayes'.
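To illustrate why the "undecidedness" matters: Dempster-Shafer theory lets belief mass sit on the whole frame of discernment rather than forcing it onto individual hypotheses, which a Bayesian prior cannot express directly. A minimal sketch of Dempster's rule of combination follows; the hypotheses and mass values are made up for illustration and have nothing to do with the actual detector.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule of combination for two mass functions whose focal
    elements are frozensets. Mass assigned to the full frame represents
    'undecidedness'; conflicting mass is discarded and renormalised."""
    combined = {}
    conflict = 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + x * y
        else:
            conflict += x * y  # wholly contradictory evidence
    norm = 1.0 - conflict
    return {k: v / norm for k, v in combined.items()}

EVENT, NORMAL = frozenset({"event"}), frozenset({"normal"})
THETA = EVENT | NORMAL            # the full frame: undecided mass lives here
m1 = {EVENT: 0.6, THETA: 0.4}     # detector 1 is 40% undecided
m2 = {EVENT: 0.5, NORMAL: 0.2, THETA: 0.3}
combined = combine(m1, m2)
```

The undecided mass on THETA shrinks as evidence accumulates, instead of having to be split arbitrarily between "event" and "normal" up front.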
Also updated Netevmon to periodically send mean updates to the eventing script. These mean values will be used to decide which probability values to use in different cases (e.g. when the measurements are noisy, constant, etc.). Finally, I looked at and rated the events for a couple of streams, and updated some of the "busier" streams with last week's events.
Spent most of the week adding view support to all of the existing collections within ampy. Much of the work was modifying the code to be more generic rather than the AMP-specific original implementation Brendon wrote as a proof of concept.
Added a new API to amp-web called eventview that will generate a suitable view for a given event, e.g. an AMP ICMP event will produce a view showing a single line for the address family where the event was detected.
Updated the legend generation code for views to work for all collections as well. Added a short label for each line so it will be possible to display a pop-up which will distinguish between the different colours for the same line group.
I've been creating the traceroute map this week (the first half spent initially coding and the second half spent fixing it to work with a much larger real data set). Instead of trying to port the existing PHP or RaphaelJS (SVG library) code, I decided it would be easier to roll my own for Flotr2. I've had a fair amount of success so far, and my graph now looks like this image:
You can also hover over a path to highlight the entire path and show information about it, and hover over an individual hop to highlight all hops to the same host.
Colours represent unique hosts, and there are certain conditions governing path divergence and convergence so that it remains clear to the human eye which path is which. Internally, the data structure used is a tree in which each node is a path that has zero to many branches. One constraint, therefore, is that a path will only ever join back up with its root node (the same path it diverged from), and it will currently only ever diverge and converge a maximum of once each. Each alternative path is also drawn on a new line for clarity. I think these constraints strike an effective balance between accurate network representation and visual complexity.
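The divergence/convergence constraints described above might be modelled along these lines. This is a hypothetical Python sketch only; the actual implementation is JavaScript for Flotr2, and the class and method names here are invented.

```python
class Path:
    """One drawn path in the traceroute map. Branches may diverge from
    this path and, per the constraints above, may only converge back to
    the path they diverged from (their root), at most once each."""

    def __init__(self, hops, root=None):
        self.hops = hops          # sequence of hop addresses
        self.root = root          # the path this one diverged from
        self.branches = []        # zero-to-many alternative paths
        self.converges_back = False

    def diverge(self, hops):
        """Start an alternative path, drawn on its own line."""
        branch = Path(hops, root=self)
        self.branches.append(branch)
        return branch

    def converge(self, branch):
        """Rejoin a branch to this path. A branch may only rejoin its
        own root, and only once, to keep the picture readable."""
        assert branch.root is self and not branch.converges_back
        branch.converges_back = True

trunk = Path(["a", "b", "c", "d"])
alt = trunk.diverge(["b2", "c2"])  # alternative route after hop "a"
trunk.converge(alt)                # rejoins the trunk before hop "d"
```

Encoding the constraints as assertions in the data structure, rather than in the drawing code, keeps impossible layouts from ever reaching the renderer.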
Decided that we needed to simplify the database schema for storing
traceroute data, so spent some time working on that. The new schema
works better with the existing aggregation functions and is faster to
query. Moved all the existing data to the new schema.
Merged in the rainbow traceroute graphs that Brad created and got them
using data from the new traceroute schema. Moved the default view of
combined traceroutes to use smokeping rather than a basic line graph,
to better show what is happening with multiple addresses.
General tidyup of code that had got a bit crufty, removed some sections
that were duplicated or no longer required. Started work on moving the
DNS test to use views.
I have modified the Hidden Markov Model class I have written to use log-transformed probabilities. This allows smaller probabilities to be expressed without risking underflow issues.
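A log-space forward pass illustrates the underflow fix: products of probabilities become sums of logs, and the only delicate step is summing probabilities, handled by log-sum-exp. This is a generic sketch, not the actual class; the parameter layout (lists indexed by state) is assumed.

```python
import math

def logsumexp(vals):
    """Numerically stable log(sum(exp(v) for v in vals))."""
    m = max(vals)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in vals))

def forward_log(obs, log_pi, log_A, log_B):
    """HMM forward algorithm in log space. On long observation
    sequences, plain products of probabilities underflow to zero;
    sums of log probabilities do not."""
    n = len(log_pi)
    # log alpha for the first observation
    alpha = [log_pi[s] + log_B[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            logsumexp([alpha[p] + log_A[p][s] for p in range(n)])
            + log_B[s][o]
            for s in range(n)
        ]
    return logsumexp(alpha)  # log P(obs | model)
```

Returning the log likelihood directly also suits the genetic algorithm stage, since fitness comparisons only need relative ordering, which logs preserve.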
I am currently still working on the genetic algorithm that will be needed to determine the initial parameters of the Hidden Markov Model for use with anomaly_ts.
Updated scamper on Planetlab to use more destination addresses when carrying out per destination MDA (Multipath Detection Algorithm) analysis. The modified version has been run on one node to test the changes.
The code for detecting traces with no change in non-load-balancer nodes has been applied on a per-trace basis to the code for detecting turnover. This has been run to produce more results, and these have been added to the paper.
I have been investigating the validity of the results I have for numbers of load balancing diamonds attached to the same load balancing node. Dumps of unique LB interfaces and successor sets have been examined, and a count of unique successor sets has been added. In calculating this, two sets with one address in common were taken as a match.
More work on the paper has been done. This has included Richard's corrections, counts of collapsed load balancers, and incorporation of data based on detection of route changes.