First week with our new summer students, so spent some time working with
them and getting them all set up.
Updated the views database to be slightly more complex, making it much
easier to add or remove line groups to/from a view. Also wrote the
supporting code that actually enables users to do this - the label
describing each line group can now be clicked on to remove it from the
graph. The matrix will now generate any views that it needs when the
page is loaded so this no longer needs to be done by hand.
Started to add a streams-to-view interface so that events can be plotted
easily. Events are based on single streams rather than groups of
streams, so need to be viewable individually.
Spent a bit of time tidying up the new AMP packages to be more
consistent with how files and directories are named. Logging, config,
etc should all use the same name rather than 2-3 different ones. Fixed a
couple of small bugs in tests/reporting that Shane found while adding
them to NNTSC.
Carried out a run of ICMP echo probes on PlanetLab using MDA at 99.99% confidence. These data will be combined with a previous run to estimate stopping values at 99% confidence via CDF analysis.
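The CDF-based estimate boils down to finding the smallest probe count that covers the desired fraction of observations. A minimal sketch of that idea follows; this is illustrative only, with invented sample counts, not the actual analysis code or data:

```python
def stopping_value(probe_counts, confidence=0.99):
    """Smallest probe count k such that at least `confidence` of the
    observations required k probes or fewer (an empirical CDF quantile)."""
    counts = sorted(probe_counts)
    n = len(counts)
    for i, k in enumerate(counts, start=1):
        if i / n >= confidence:
            return k
    return counts[-1]

# Invented per-destination probe counts, purely for illustration.
samples = [3, 4, 4, 5, 5, 5, 6, 6, 7, 12]
print(stopping_value(samples, 0.99))  # -> 12
```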
Collected UDP per-flow addresses and ICMP per-destination addresses and compared the two sets, even though they were collected from different vantage points. A good number of addresses were found in common, which helps explain the differing population proportions of the various types of UDP and ICMP load balancers.
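The comparison is essentially a set intersection over the two address collections. A minimal sketch, using documentation-range addresses as stand-ins for the real data:

```python
# Stand-in address sets; the real data comes from the probing runs.
udp_addrs = {"192.0.2.1", "192.0.2.7", "198.51.100.3"}
icmp_addrs = {"192.0.2.7", "198.51.100.3", "203.0.113.9"}

# Addresses seen by both probing methods.
common = udp_addrs & icmp_addrs
print(len(common), sorted(common))  # -> 2 ['192.0.2.7', '198.51.100.3']
```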
Continuing with the Internet simulator, a problem has emerged when creating control node paths using the large CAIDA data sets: path creation fails to make progress and eventually reports that the path cannot be created.
Continued work on a paper for publication, adding tables, graphs and accompanying text for the sections on stopping values, load balancer population proportions and methodology.
Finalised the TEntropy detector and committed the changes to Git. Spent the next few days reading up on belief theory and Dempster-Shafer belief functions. Started working on a Python script for a server that listens for new connections, understands the protocol used by AnomalyExporter and parses the event data received from the anomaly_ts client.
The plan for next week is to finish the eventing Python script (which will initially include event grouping by timestamp and stream number) and start gathering data from the events in order to calculate a confidence level for each detector. This is necessary for using belief theory to determine whether an event is genuine, since the various detectors might not produce the same results.
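Combining per-detector confidences in the Dempster-Shafer framework uses Dempster's rule of combination. The sketch below shows that rule over a two-hypothesis frame {event, normal}; the mass values are invented for illustration, not measured detector confidences:

```python
def combine(m1, m2):
    """Dempster's rule: combine two basic belief assignments whose
    hypotheses are frozensets drawn from the same frame."""
    combined = {}
    conflict = 0.0
    for h1, v1 in m1.items():
        for h2, v2 in m2.items():
            inter = h1 & h2
            if inter:
                combined[inter] = combined.get(inter, 0.0) + v1 * v2
            else:
                conflict += v1 * v2  # mass assigned to the empty set
    norm = 1.0 - conflict
    return {h: v / norm for h, v in combined.items()}

E, N = frozenset(["event"]), frozenset(["normal"])
theta = E | N  # "don't know": mass on the whole frame

# Hypothetical belief assignments from two detectors.
detector_a = {E: 0.7, N: 0.1, theta: 0.2}
detector_b = {E: 0.6, N: 0.2, theta: 0.2}
result = combine(detector_a, detector_b)
print(round(result[E], 3))  # -> 0.85
```

Two detectors that each lean towards "event" reinforce one another, pushing the combined belief above either individual mass.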
This week I learned some Python, reading and working through some of Brendon's exercises with the AMP REST API, then moved on to looking at graphs in Cuz. Since then I have been creating a "rainbow" traceroute graph type for Cuz with JS and Flotr2, which is now mostly complete aside from some minor tweaking and checking to ensure the graph still draws correctly when given errors or null data. The library's lack of documentation made coding slow, so I intend to document my own files thoroughly starting next week.
I changed rfserver over to a proper MPLS approach, but am having trouble with the ovs recirculation branch: my packets hit one table and then disappear before they reach the next.
Spent the week planning and researching my approach to making Libtrace more threaded on an SMP machine, to better use the available processing power. The main area of focus is a framework in Libtrace that allows parallel packet processing following a MapReduce model, while still providing flexibility if MapReduce doesn't fit the user's packet processing pipeline.
Per Shane's suggestion I worked through a variety of existing Libtrace tools (tracestats, traceanon and tracertstats) with the new parallel approach to ensure the framework provided will be a good fit for typical Libtrace uses.
Spent some time tweaking the new TEntropy-based detectors in netevmon to reduce the number of false positives and insignificant events that they were reporting. Mostly this involved tuning the various thresholds used by the Plateau detector that is run over the TEntropy values rather than the TEntropy methodology itself.
As I was doing this, I started putting together a gigantic spreadsheet of the events observed, their significance, which detectors were picking them up, and the delay between the event starting and the detector reporting it. This is useful for two main reasons:
* As I adjust and tweak the existing detectors I can easily compare the events I used to detect with what I am detecting now (and what I think I should be getting).
* We will need to calculate the probability that a given detector is right for the next major phase of Meena's project. This spreadsheet will form the basis for estimating these probabilities.
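Estimating the probability that a detector is right from the spreadsheet amounts to a precision calculation over the tallied events. A minimal sketch, with invented detector names and counts standing in for the spreadsheet data:

```python
# Hypothetical tallies per detector: how many reported events were
# significant versus insignificant. Values are illustrative only.
observations = {
    "Plateau":  {"significant": 18, "insignificant": 6},
    "TEntropy": {"significant": 12, "insignificant": 2},
}

def detector_precision(obs):
    """Fraction of each detector's reported events that were significant."""
    return {name: c["significant"] / (c["significant"] + c["insignificant"])
            for name, c in obs.items()}

for name, p in detector_precision(observations).items():
    print(f"{name}: {p:.2f}")
```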
Added support to NNTSC for collecting and storing AMP HTTP test results. Seems to work reasonably well (after fixing a bug or two in the test itself!) but it'll be interesting to see how query performance pans out once the table starts to get large, given our travails with the traceroute data.
This week I have read a number of papers describing Hidden Markov Models, and have investigated how to apply them to the existing detectors in libanomalyts.
I have started writing code to implement a Hidden Markov Model.
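To illustrate the core of an HMM-based detector, the sketch below runs the Viterbi algorithm over a two-state model (normal/anomalous) with binary observations. All states, symbols and probabilities here are invented for illustration and are not the model being implemented:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence for `obs`."""
    # path maps each state to (best path ending there, its probability).
    path = {s: ([s], start_p[s] * emit_p[s][obs[0]]) for s in states}
    for o in obs[1:]:
        new_path = {}
        for s in states:
            # Best predecessor path, weighted by the transition into s.
            prev, prob = max(
                ((p, pr * trans_p[p[-1]][s]) for p, pr in path.values()),
                key=lambda x: x[1])
            new_path[s] = (prev + [s], prob * emit_p[s][o])
        path = new_path
    return max(path.values(), key=lambda x: x[1])[0]

# Illustrative two-state model: measurements are usually "low" when
# normal and "high" when anomalous.
states = ("normal", "anomalous")
start = {"normal": 0.9, "anomalous": 0.1}
trans = {"normal": {"normal": 0.95, "anomalous": 0.05},
         "anomalous": {"normal": 0.3, "anomalous": 0.7}}
emit = {"normal": {"low": 0.9, "high": 0.1},
        "anomalous": {"low": 0.1, "high": 0.9}}

print(viterbi(["low", "low", "high", "high"], states, start, trans, emit))
# -> ['normal', 'normal', 'anomalous', 'anomalous']
```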
Spent the week testing the TEntropy detector on different streams and refining the Symboliser to reduce the number of false positives.
One problem I discovered was that the magnitude of events was not represented correctly by the Symboliser: a severe event had the same t-entropy result as a trivial one, since only one character was inserted at a time. The solution was to insert multiple characters based on the severity of the change.
Another problem was that small, insignificant changes were triggering events when ideally they should not have been creating entropy. To fix this, I added a condition that checks whether a measurement is significantly different from the previous mean before emitting a non-default character.
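Both fixes can be sketched together: repeat the change character in proportion to the severity of the deviation, and fall back to the default character when the measurement is not significantly different from the running mean. The thresholds, cap and symbol characters below are illustrative, not the actual Symboliser values:

```python
def symbolise(value, mean, stddev, threshold=2.0):
    """Map one measurement to a symbol string for t-entropy analysis."""
    if stddev == 0:
        return "a"                      # default: no variation seen yet
    deviation = abs(value - mean) / stddev
    if deviation < threshold:
        return "a"                      # insignificant change: default char
    severity = min(int(deviation), 5)   # cap the repetition count
    char = "b" if value > mean else "c" # direction of the change
    return char * severity              # more characters = more entropy

print(symbolise(10.5, 10.0, 1.0))  # -> 'a'    (within the threshold)
print(symbolise(14.0, 10.0, 1.0))  # -> 'bbbb' (4 stddevs above the mean)
```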
Lots of small fixes to things that use the new view interface. Fixed a
few more caching problems where the list of stale streams to fetch was
being ignored and instead all streams were being fetched. Updated the
tooltips on the matrix to use the new API and to split IPv4 and IPv6
results. Updated traceroute graphs to use the new API.
Replaced most of the dropdowns for the amp-icmp data with a modal
bootstrap window that allows selecting which streams (or groups of
streams) to display. Wrote most of the code to insert these views into
the database if they have not been seen before, and to fetch them out
when required. There are a couple of edge cases around determining
members of a view that look like they will require a slight redesign of
the database to accommodate.