Blogs

19

Nov

2013

Finalised the TEntropy detector and committed the changes to Git. Then spent the next few days reading up on belief theory and the Dempster-Shafer belief functions. Started working on a Python script for a server that listens for new connections, understands the protocol used by AnomalyExporter and parses the event data received from the anomaly_ts client.

Plan for the next week is to finish the eventing Python script (which will initially include event grouping by timestamp and stream number) and start gathering data from the events in order to calculate a confidence level for each detector. This is necessary for using belief theory to determine whether a reported event is really an event, since the various detectors might not produce the same results.
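As a rough illustration of how belief theory could merge the opinions of two detectors, here is a minimal sketch of Dempster's rule of combination. The frame of discernment ("event" vs "normal"), the names and the confidence values 0.7 and 0.6 are all made up for the example; they are not taken from the real detectors.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to its mass.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses is discarded
            # and the remainder is renormalised below.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


EVENT = frozenset({"event"})
NORMAL = frozenset({"normal"})
EITHER = EVENT | NORMAL  # "don't know": mass not committed to either side

# Hypothetical confidence levels for two detectors that both fired.
m_a = {EVENT: 0.7, EITHER: 0.3}
m_b = {EVENT: 0.6, EITHER: 0.4}
belief = combine(m_a, m_b)
```

With these made-up numbers the combined mass on "event" is higher than either detector's own confidence, which is the behaviour I am hoping to exploit when several detectors agree.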

18

Nov

2013

This week I learned some Python by reading and working through some of Brendon's exercises with the AMP REST API, and then moved on to looking at graphs in Cuz. I have since been working on creating a "rainbow" traceroute graph type for Cuz with JS and Flotr2, which is now mostly complete aside from some minor tweaking and checking that the graph still draws correctly when given errors, null data, etc. The library's lack of documentation made coding a slow process, so I intend to document my files fairly thoroughly starting next week.

18

Nov

2013

I changed rfserver over to a proper MPLS approach, but am having trouble with the ovs recirculation branch. My packets are hitting one table, then disappearing before they reach the next.

18

Nov

2013

Spent the week planning and researching my approach to make Libtrace more threaded on an SMP machine to better use the available processing power. Currently the main area of focus is to provide a framework in Libtrace that allows parallel packet processing following a MapReduce model, while still providing flexibility to the user if MapReduce doesn't fit their packet processing pipeline.

Per Shane's suggestion I worked through a variety of existing Libtrace tools (tracestats, traceanon and tracertstats) with the new parallel approach to ensure the framework provided will be a good fit for typical Libtrace uses.
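To show the shape of the MapReduce idea for a tracestats-style tool, here is a small Python sketch. It is not the Libtrace API (Libtrace is a C library and the framework is still being designed); the packet tuples and the function names are invented for illustration only.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_packets(packets):
    """Map step: each worker summarises only its own share of the packets."""
    stats = Counter()
    for proto, length in packets:
        stats[proto + "_packets"] += 1
        stats[proto + "_bytes"] += length
    return stats


def reduce_stats(left, right):
    """Reduce step: merge two partial summaries into one."""
    return left + right


def parallel_stats(packets, workers=4):
    # Deal the packets out round-robin so every worker gets a similar load.
    chunks = [packets[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_packets, chunks))
    return reduce(reduce_stats, partials, Counter())


# Hypothetical (protocol, length) pairs standing in for real packets.
example = [("tcp", 100), ("udp", 50), ("tcp", 200)]
totals = parallel_stats(example)
```

Counting packets and bytes fits this model well because the partial results can be merged in any order. Tools that depend on packet ordering are the cases where the user would need the extra flexibility mentioned above.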

18

Nov

2013

Spent some time tweaking the new TEntropy-based detectors in netevmon to reduce the number of false positives and insignificant events that they were reporting. Mostly this involved tuning the various thresholds used by the Plateau detector that is run over the TEntropy values rather than the TEntropy methodology itself.

As I was doing this, I started putting together a gigantic spreadsheet of the events observed, their significance, which detectors were picking them up, and the delay between the event starting and the detector reporting it. This is useful for two main reasons:
* As I adjust and tweak the existing detectors I can easily compare the events I used to detect with what I am detecting now (and what I think I should be getting).
* We will need to calculate the probability that a given detector is right for the next major phase of Meena's project. This spreadsheet will form the basis for estimating these probabilities.

Added support to NNTSC for collecting and storing AMP HTTP test results. Seems to work reasonably well (after fixing a bug or two in the test itself!) but it'll be interesting to see how query performance pans out once the table starts to get large, given our travails with the traceroute data.

15

Nov

2013

This week I have read a number of papers detailing Hidden Markov Models, and have investigated how to apply them to the existing detectors in libanomalyts.
I have started writing code to implement a Hidden Markov Model.
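For reference, the core of a Hidden Markov Model evaluation is the forward algorithm, sketched below in Python. This is a generic textbook version rather than the code going into libanomalyts, and the two states, the "low"/"high" observation symbols and all of the probabilities are placeholder assumptions.

```python
def forward(observations, states, start_p, trans_p, emit_p):
    """Return the probability of the observation sequence under the model."""
    # alpha[s] is the probability of the observations so far, ending in state s.
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emit_p[s][obs] * sum(alpha[prev] * trans_p[prev][s] for prev in states)
            for s in states
        }
    return sum(alpha.values())


# Placeholder model: a hidden "normal"/"anomalous" state that emits
# coarse "low"/"high" measurement symbols.
states = ("normal", "anomalous")
start_p = {"normal": 0.9, "anomalous": 0.1}
trans_p = {
    "normal": {"normal": 0.95, "anomalous": 0.05},
    "anomalous": {"normal": 0.3, "anomalous": 0.7},
}
emit_p = {
    "normal": {"low": 0.8, "high": 0.2},
    "anomalous": {"low": 0.3, "high": 0.7},
}

likelihood = forward(["low", "low", "high"], states, start_p, trans_p, emit_p)
```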

12

Nov

2013

Spent the week testing the TEntropy detector on different streams and refining the Symboliser to reduce the number of FPs.
One problem that I discovered was that the magnitude of an event was not represented correctly by the Symboliser: a severe event had the same t-entropy result as a trivial one, since only 1 character was inserted at a time. The solution was to insert multiple characters, with the number based on the severity of the change.
Another problem was that small, insignificant changes were triggering events when, ideally, they should not have been creating entropy. Hence, I added a condition that checks whether a measurement is significantly different from the previous mean before triggering a non-default character.
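The two fixes can be sketched roughly as follows. This is a simplified Python illustration and not the actual Symboliser code: the relative threshold of 0.2, the cap of 5 characters, the "a"/"b" alphabet and the running mean are all assumptions chosen to keep the example short.

```python
def symbolise(measurements, threshold=0.2, max_repeat=5):
    """Turn a series of measurements into a string of symbols.

    "a" is the default character. A measurement that differs significantly
    from the previous mean produces one or more "b" characters, with more
    characters for a more severe change.
    """
    symbols = []
    mean = measurements[0]
    count = 1
    for value in measurements[1:]:
        # Relative change against the mean of everything seen so far.
        change = abs(value - mean) / abs(mean) if mean else 0.0
        if change < threshold:
            symbols.append("a")
        else:
            repeat = max(1, min(max_repeat, int(change / threshold)))
            symbols.append("b" * repeat)
        # Update the running mean incrementally.
        count += 1
        mean += (value - mean) / count
    return "".join(symbols)


steady = symbolise([100, 101, 99, 100])
mild = symbolise([100, 130])
severe = symbolise([100, 300])
```

A steady series produces only the default character, so it adds no entropy, while the severe jump produces a longer run of characters than the mild one.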

12

Nov

2013

Lots of small fixes to things that use the new view interface. Fixed a
few more caching problems where the list of stale streams to fetch was
being ignored and instead all streams were being fetched. Updated the
tooltips on the matrix to use the new API and to split IPv4 and IPv6
results. Updated traceroute graphs to use the new API.

Replaced most of the dropdowns for the amp-icmp data with a modal
bootstrap window that allows selecting which streams (or groups of
streams) to display. Wrote most of the code to insert these views into
the database if they have not been seen before, and to fetch them out
when required. There are a couple of edge cases around determining
members of a view that look like they will require a slight redesign of
the database to accommodate.

12

Nov

2013

A couple of areas that require further checking have arisen as a result of my investigations. These are max widths of diamonds and counts of load balancers based on the set of next hops alone. The latter is based on the results of accumulating load balancer counts over 18 vantage points.

Plots representing the data collected on Caida using three modes of flow ID selection have been generated. This involved running analytical programs to generate the refined data sets.

I started another run on Planetlab to bolster the data that I have collected using MDA with a confidence level of 99.99%.

I gave my talk at the PhD conference.

I was able to do a bit of work on the load balancing paper, namely design and methodology.

11

Nov

2013

Hit a little hitch with implementing polling packet counts. The poller works fine---aside from breaking forwarding within the fabric.

The problem is that I'm only able to push one layer of label and I need to edit the label twice. I could push metadata instead, and then push the complete label in a single action, but that would scale quadratically in the number of switches. I also thought about using the vlan pcp field as a bonus tag, but that seemed a little bit messy. Plus that's only 3 bits, so it limits me to 8 switches.

So instead I'm gonna use a development branch of ovs with MPLS support. It's doing it the proper way, but for the time being it is pretty much completely unsupported by anything else.