I tracked down the OVS bug. I have got it doing what I wanted it to, but it is currently failing the OVS test suite tests to do with BFD and LACP for whatever reason. These take quite a long time to run, but I will double-check that the same failures aren't also present in the branch of OVS I am using without my changes, and then have a go at running it with RouteFlow. Hopefully I can get all this sorted and then start the new year with my RouteFlow path polling all set up and ready to do some tests on.
Updated the event tooltips to better describe the group that the event belongs to, as it was previously difficult to tell which line the event corresponded to when multiple lines were drawn on the graph.
Brad's rainbow graph is now used whenever an AMP traceroute event is clicked on in the dashboard. Fixed a couple of bugs with the rainbow graph: the main one being that it was rendering the heavily aggregated summary data in the detail graph instead of the detailed data.
Replaced the old hop count event detection for traceroute data with a detector that reports when a hop in the path has changed.
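As a rough illustration of the idea of reporting a changed hop (rather than a changed hop count), here is a minimal sketch. The function name `changed_hops` and the representation of a path as a list of address strings are assumptions made for this example, not the detector's actual code.

```python
from itertools import zip_longest


def changed_hops(previous, current):
    """Compare two traceroute paths hop by hop.

    Both paths are lists of hop addresses (hypothetical representation).
    Returns a list of (hop_index, old_address, new_address) tuples for
    every position where the two paths differ. A path that got longer or
    shorter shows up as a change to or from None.
    """
    return [
        (index, old, new)
        for index, (old, new) in enumerate(zip_longest(previous, current))
        if old != new
    ]


# Example paths: the second hop differs between the two measurements.
before = ["10.0.0.1", "10.0.1.1", "192.0.2.7"]
after = ["10.0.0.1", "10.0.2.1", "192.0.2.7"]
changes = changed_hops(before, after)
```

Unlike a hop count comparison, this also catches a path that changes while staying the same length.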
Fixed a tricky little bug in NNTSC where large aggregate data queries were being broken up into time periods that did not align with the requested binsize, so a bin would straddle two queries. This would produce two results for the same bin and was causing the summary graph to stop several hours short of the right hand edge.
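The alignment problem can be sketched as follows. This is only an illustration of the principle, not the NNTSC code: the function name `split_query`, the `max_bins_per_query` parameter and the assumption that bins start on multiples of the binsize are all made up for the example.

```python
def split_query(start, end, binsize, max_bins_per_query):
    """Split the time range [start, end) into smaller sub-queries.

    Assumes bins start on multiples of binsize. Every internal boundary
    between two sub-queries is placed on a bin edge, so no bin is ever
    shared between two sub-queries.
    """
    if binsize <= 0 or max_bins_per_query <= 0:
        raise ValueError("binsize and max_bins_per_query must be positive")

    chunk = binsize * max_bins_per_query
    queries = []
    cur = start
    while cur < end:
        # Round the current position down to a bin edge before adding the
        # chunk length, so the next boundary is a multiple of binsize.
        aligned = (cur // binsize) * binsize
        nxt = min(aligned + chunk, end)
        queries.append((cur, nxt))
        cur = nxt
    return queries


# A query starting mid-bin: only the outer edges may be unaligned.
pieces = split_query(5, 100, 10, 3)
```

Splitting naively at `start + chunk` would put the boundaries at 35, 65 and 95 here, each in the middle of a bin, which is the situation that produced two results for the same bin.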
Started working on making the tabs allowing access to "similar" graphs operational again. Have got this working for LPI, which is the most complicated case, so it shouldn't be too hard to get tabs going for everything else again before the end of the year.
Added configuration modals for DNS, Smokeping, Munin and some LPI data,
so that multiple data series (of the same type) can easily be viewed at
once and compared. Refactored the initial modal implementation used by
ICMP and traceroute data to be much cleaner and easier to integrate with
the new data types.
Updated the legend labels describing the currently displayed data to use
the more detailed information Shane has made available. Included in this
is a line number that is now used to fix the colouring order, making
sure that the line colour matches what the label describes.
Spent some time reworking small details on the newest AMP Debian
packages to install and run properly when installed by puppet on the new
Haven't been super productive lately. I am still digging away at openvswitch, as well as reading things relating to what I am going to do if I actually discover packet loss (that is, packet loss not caused by problems in experimental branches of openvswitch).
Warts analysis was updated to find matching load balancer successor sets when counting non matching successor sets for each unique load balancing interface. This gave confirmation for the analysis of numbers of successors to the same load balancing interface.
Warts analysis for data from the updated scamper running on Planetlab was designed. This update makes use of more destination addresses for per destination analysis. It would be useful to collect some per destination data with a larger set of addresses.
Changes were made to the section of the paper on turnover. Data with no route changes have been analysed together and incorporated into the text. Data on the numbers of successor sets for each unique load balancing interface were incorporated.
The discussion and conclusions sections of the paper have been written.
Continuing work on parallelising Libtrace. Added sorting methods for use with ordered results, such as the packets returned by traceanon that will be written to file in the reduce step. This has resulted in close to double the speed in the encryption case for traceanon on my dual-core testing machine.
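The ordering requirement in the reduce step can be illustrated with a k-way merge. This is a Python sketch of the general technique only, not the Libtrace implementation (which is in C). It assumes each mapper thread emits `(sequence_number, packet)` pairs already in ascending sequence order; the name `ordered_reduce` is hypothetical.

```python
import heapq


def ordered_reduce(per_thread_results):
    """Merge per-thread result lists back into global packet order.

    per_thread_results is a list of lists, one per mapper thread. Each
    inner list holds (sequence_number, packet) tuples sorted by sequence
    number (an assumption of this sketch). heapq.merge performs a lazy
    k-way merge, so the inputs are never concatenated and re-sorted.
    """
    return list(heapq.merge(*per_thread_results, key=lambda item: item[0]))


# Two mapper threads that processed interleaved packets.
thread_a = [(0, "pkt0"), (3, "pkt3")]
thread_b = [(1, "pkt1"), (2, "pkt2")]
merged = ordered_reduce([thread_a, thread_b])
```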
Added an inter-thread message handling system, allowing a mapper to easily trigger the reduce step to run, to signal when a mapper thread ends, and so on.
Still ironing out some kinks with possible deadlocks and other issues in the code.
I met with Dr. Joshi from the Stats Dept and confirmed that the method I was using was indeed correct. He also mentioned looking into Bayes' theorem as an alternative, and I spent some time reading up on it. There is an element of "undecidedness" with the event significance, which is why Dempster-Shafer is more appropriate than Bayes'.
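To show how Dempster-Shafer represents "undecidedness", here is a small sketch of Dempster's rule of combination. The two-element frame (`sig`, `not`), the mass values and the function name `combine` are illustrative assumptions, not the values or code used with Netevmon.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass, and
    the masses sum to 1. Mass assigned to the whole frame expresses
    "undecided", which a single Bayesian probability cannot do.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # The two pieces of evidence contradict each other.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("evidence is in total conflict")
    # Renormalise so the remaining masses sum to 1.
    return {hyp: mass / (1.0 - conflict) for hyp, mass in combined.items()}


SIG = frozenset({"sig"})      # event is significant
NOT = frozenset({"not"})      # event is not significant
UNDECIDED = SIG | NOT         # cannot tell either way

# Made-up evidence from two hypothetical detectors.
detector_one = {SIG: 0.6, UNDECIDED: 0.4}
detector_two = {SIG: 0.5, NOT: 0.2, UNDECIDED: 0.3}
belief = combine(detector_one, detector_two)
```

With these inputs the conflicting mass is 0.12, so the combined masses are the raw products divided by 0.88.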
Also updated Netevmon to periodically send out mean updates to the eventing script. These mean values will be used in deciding which probability values to use in different cases (e.g. when the measurements are noisy, constant, etc.). I also looked at and rated the events for a couple of streams and updated some of the "busier" streams with last week's events.
Spent most of the week adding view support to all of the existing collections within ampy. Much of the work was modifying the code to be more generic rather than the AMP-specific original implementation Brendon wrote as a proof of concept.
Added a new api to amp-web called eventview that will generate a suitable view for a given event, e.g. an AMP ICMP event will produce a view showing a single line for the address family where the event was detected.
Updated the legend generation code for views to work for all collections as well. Added a short label for each line so it will be possible to display a pop-up which will distinguish between the different colours for the same line group.
I've been creating the traceroute map this week (the first half spent initially coding and the second half spent fixing it to work with a much larger real data set). Instead of trying to port the existing PHP or RaphaelJS (SVG library) code, I decided it would be easier to roll my own for Flotr2. I've had a fair amount of success so far, and my graph now looks like this image:
You can also hover over a path to highlight the entire path and show information about it, and hover over an individual hop to highlight all hops to the same host.
Colours represent unique hosts, and there are certain conditions governing path divergence and convergence so that it remains clear to the human eye which path is which. In terms of implementation, the data structure used is a tree in which each node is a path that has zero or more branches. One constraint, therefore, is that a path will only ever rejoin its parent node (the same path it diverged from), and it will currently only ever diverge and converge a maximum of once each. Each alternative path is also drawn on a new line for clarity. I think these constraints help to strike an effective balance between accurate network representation and visual complexity.
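The tree described above might be modelled along these lines. The real implementation is JavaScript for Flotr2, so this Python sketch is only an illustration of the structure; the class name `PathNode`, its field names and the helper `count_rows` are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PathNode:
    """One path in the map; branches are paths that diverge from it."""
    hops: List[str]
    # Index in the parent's hops where this path leaves the parent, and
    # where it rejoins it. A path diverges and converges at most once,
    # and only ever rejoins the path it diverged from.
    diverge_at: Optional[int] = None
    converge_at: Optional[int] = None
    branches: List["PathNode"] = field(default_factory=list)


def count_rows(node):
    """Each alternative path is drawn on its own line, so the number of
    rows needed equals the number of nodes in the tree."""
    return 1 + sum(count_rows(branch) for branch in node.branches)


# A root path with two alternatives, one of which has its own alternative.
root = PathNode(hops=["a", "b", "c", "d"])
alt_one = PathNode(hops=["x"], diverge_at=0, converge_at=2)
alt_two = PathNode(hops=["y", "z"], diverge_at=1, converge_at=3)
alt_two.branches.append(PathNode(hops=["w"], diverge_at=0, converge_at=1))
root.branches.extend([alt_one, alt_two])
rows = count_rows(root)
```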
Decided that we needed to simplify the database schema for storing
traceroute data, so spent some time working on that. The new schema
works better with the existing aggregation functions and is faster to
query. Moved all the existing data to the new schema.
Merged in the rainbow traceroute graphs that Brad created and got them
using data from the new traceroute data. Moved the default view of
combined traceroutes to use smokeping rather than a basic line graph to
better show what is happening with multiple addresses.
General tidyup of code that had got a bit crufty, removed some sections
that were duplicated or no longer required. Started work on moving the
DNS test to use views.