Started working on code to split a single trace so that it can be processed in parallel across multiple threads/cores - later, formats like DPDK will be mapped directly onto many cores.
I've got a basic first-come, first-served version working that hands each packet to the next available thread, and I'm now working on a hashing version that will direct packets to a particular thread for processing. At the moment, trade-offs need to be made between speed and maintaining packet order on the receiving cores.
I've noticed so far that the overhead of locking is quite large, especially on the simple tracestats application I'm testing with: with only a couple of simple filters, the single-threaded version runs faster than the multi-threaded one. However, I'm confident a significant improvement will be seen on a more computationally intensive application such as traceanon.
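The hashing idea can be sketched in Python (illustrative only - the real implementation lives in C inside libtrace, and the packet layout here is an assumption): hashing each packet's flow key selects the worker, so all packets from one flow land on the same thread and per-flow order is preserved without a global lock on the consumers.

```python
import queue

NUM_WORKERS = 4
worker_queues = [queue.Queue() for _ in range(NUM_WORKERS)]

def dispatch(packet):
    # Hash the flow key (a stand-in for the real 5-tuple) so every packet
    # from the same flow lands on the same worker queue, preserving
    # per-flow ordering without locking around the consumers.
    flow_key = (packet["src"], packet["dst"], packet["proto"])
    worker = hash(flow_key) % NUM_WORKERS
    worker_queues[worker].put(packet)
    return worker
```

A first-come, first-served dispatcher would instead pick whichever queue is shortest, trading packet order for throughput.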
Finished up the code to turn a single stream id into a view, for use
with events where we want to see the anomalous data. Merged all the view
changes back into the main branch, which highlighted a few broken cases
with things I hadn't considered (netevmon). Worked with Shane to get
those all sorted and working fairly quickly again.
Put together a nice query that will aggregate traceroute data to the
most common path within a binning period. Added a function to fetch this
in NNTSC which works fine for periodic data, but ran into some
difficulties extending this to a single, most recent block of data. It
shouldn't be difficult to get this working - hopefully a fresh look at
it on Monday will get it sorted.
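The aggregation itself can be sketched in Python (a toy stand-in for the actual query and NNTSC code; the data layout is an assumption):

```python
from collections import Counter, defaultdict

def most_common_paths(measurements, binsize):
    # Pick the most frequently observed traceroute path within each
    # binning period. 'measurements' is assumed to be (timestamp, path)
    # pairs, where a path is a tuple of hops.
    bins = defaultdict(Counter)
    for ts, path in measurements:
        bins[ts - ts % binsize][path] += 1
    # For each bin, keep only the path seen most often
    return {start: paths.most_common(1)[0][0]
            for start, paths in bins.items()}
```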
Had a cold most of last week, so I got basically nothing done. I am still working on tracking down the problem of the disappearing packets in the recirculation branch.
Spent the first part of the week fixing various bugs and less than ideal behaviours in netevmon and nntsc. Some examples include:
* Prevented an event from being triggered when an amp-traceroute stream reactivates after a long idle time
* Fixed a crash in anomalyfeed caused by an incorrect field name
* Fixed a problem in NNTSC where the HTTP dataparser would fall over if a path contained a ' character
* Added a rounding threshold to the Mode detector so that it can be used with AMP ICMP streams, which measure in usec rather than msec; values can now be rounded to the nearest msec
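The rounding step amounts to something like the following minimal illustration (the function name is hypothetical, not the Mode detector's actual API):

```python
def round_to_msec(usec):
    # AMP ICMP streams report latency in microseconds; rounding to the
    # nearest millisecond puts near-identical measurements into the same
    # mode bucket rather than scattering them across many distinct values.
    return int(round(usec / 1000.0)) * 1000
```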
Brendon finally merged his view changes back into the development branches of our software. This caused a number of problems with netevmon, as this had been overlooked when testing the changes originally. Managed to patch up all the problems in a rather hurried session on Tuesday afternoon and got everything back up and running.
Restarted netevmon with the TEntropy detectors running. They seem to be performing very well so far and are a useful addition.
Started working on adding the ability to group streams into a single time series within anomalyfeed. The main reason for this is to be able to cope better with the variety of addresses that AMP ICMP typically tests to. It makes more sense to consider all of these streams as a single aggregated stream rather than trying to run the event detectors against each stream individually, especially considering many addresses are only tested to intermittently. Grouping them will ensure there should be a result at every measurement interval. So far I've got this working for AMP ICMP, AMP traceroute and AMP DNS and will need to reimplement the other collections using the new system.
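The grouping idea can be sketched as follows (a simplified illustration; the real anomalyfeed code and its data layout will differ):

```python
from collections import defaultdict

def aggregate_streams(streams, interval):
    # streams: {stream_id: [(timestamp, value), ...]} for many addresses
    # tested intermittently. Merging them means every interval with at
    # least one measurement from any address yields an aggregate value,
    # here a simple mean.
    bins = defaultdict(list)
    for series in streams.values():
        for ts, value in series:
            bins[ts - ts % interval].append(value)
    return sorted((start, sum(vals) / len(vals))
                  for start, vals in bins.items())
```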
Spent a fair chunk of time reading up on belief theory and Dempster-Shafer so that I could give Meena some pointers on what she will need to be able to apply them to our event data. Managed to come up with some rough ideas that seem to work, but not sure if the theory is being applied 100% correctly.
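For reference, a toy worked example of Dempster's rule of combination over the frame {event, normal}; the mass assignments below are made up purely for illustration:

```python
def combine(m1, m2):
    # Dempster's rule: multiply masses over intersecting hypotheses and
    # renormalise by 1 - K, where K is the mass assigned to conflict
    # (empty intersections). m1 and m2 map frozenset hypotheses to mass.
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b
    return {h: m / (1.0 - conflict) for h, m in combined.items()}
```

Combining two detectors that mostly agree concentrates mass on the shared hypothesis, which is the behaviour we want when deciding whether an event is real.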
I spent this week continuing work on the rainbow graph, implementing the following features:
* Mouse tracking (no conflict with mouse tracking for events)
* Option to measure latency instead of hop count
* Optional minimum height for points/bars to give improved readability at the expense of a small degree of accuracy
* Out of order data points will be set to the minimum height and stacked on top of previous points (e.g. in the case of a traceroute where the second hop has a latency coincidentally less than the first hop)
* Improved caching for more efficient traversal of data points in many cases, particularly mouse tracking
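The minimum-height and stacking rules above can be sketched like this (illustrative Python, whereas the actual graph code is JS/Flotr2):

```python
def bar_heights(hop_latencies, min_height=1.0):
    # Each hop's bar is its latency increment over the hops before it,
    # clamped to a minimum height so tiny increments stay visible. An
    # out-of-order hop (latency below an earlier hop) contributes only
    # the minimum height, stacked on top of the previous points.
    heights = []
    baseline = 0.0
    for latency in hop_latencies:
        heights.append(max(latency - baseline, min_height))
        baseline = max(baseline, latency)
    return heights
```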
I began looking at how events could be better marked on graphs, given that as lines they can currently be confused with data points depending on the graph type and the density of points. I have been pursuing the idea of drawing event markers above the plot area (along with the existing lines), and this seems to be an effective way to tell at a glance where events fall. I've implemented this using a plugin, which can draw directly to the canvas rather than only to the plot area that is normally accessible to a graph type. A convenient side effect is the ability to draw events either behind or in front of the data points, because plugins can intercept Flotr's beforedraw and afterdraw events. Events are currently drawn behind data, but the rainbow graph is one example where they may be more appropriately drawn in front. As part of this I also rewrote how events are processed to favour caching, and fixed a minor bug in one of Flotr's internal plugins.
This week I implemented a Hidden Markov Model class for use as a detector in netevmon. It currently generates a sequence of symbols from the time-series data and calculates the probability of observing that sequence under a previously specified model.
The parameters of the model have yet to be determined, and I am currently investigating using a genetic algorithm to find the optimal initial parameters.
A number of papers I have encountered have also applied a system such as this for determining the protocol of traffic even when encrypted or employing port hiding, which suggests that some of this work may also be useful for libprotoident.
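The scoring step is essentially the standard forward algorithm; a minimal sketch with made-up parameters (the detector's real states and symbols are still to be determined, as noted above):

```python
def forward_probability(obs, start, trans, emit):
    # P(obs | model) via the forward algorithm: alpha[s] holds the
    # probability of the observed prefix so far ending in state s.
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for symbol in obs[1:]:
        alpha = [sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][symbol]
                 for s in range(n)]
    return sum(alpha)
```

A low probability for an observed symbol sequence relative to the trained model is what would flag the time series as anomalous.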
First week with our new summer students, so spent some time working with
them and getting them all set up.
Updated the views database to be slightly more complex, making it much
easier to add or remove line groups to/from a view. Also wrote the
supporting code that actually enables users to do this - the label
describing each line group can now be clicked on to remove it from the
graph. The matrix will now generate any views that it needs when the
page is loaded so this no longer needs to be done by hand.
Started to add a streams-to-view interface so that events can be plotted
easily. Events are based on single streams rather than groups of
streams, so need to be viewable individually.
Spent a bit of time tidying up the new AMP packages to be more
consistent with how files and directories are named. Logging, config,
etc should all use the same name rather than 2-3 different ones. Fixed a
couple of small bugs in tests/reporting that Shane found while adding
them to NNTSC.
Carried out a run of ICMP echo probes on PlanetLab using MDA at 99.99% confidence. These data will be combined with a previous run to estimate stopping values at 99% confidence via CDF analysis.
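The CDF read-off can be illustrated with a hedged sketch (the names here are assumptions; the real analysis operates on the MDA stopping data):

```python
import math

def value_at_confidence(observations, confidence):
    # Empirical-CDF read-off: the smallest observed value that covers at
    # least the requested fraction of cases.
    ordered = sorted(observations)
    index = math.ceil(confidence * len(ordered)) - 1
    return ordered[index]
```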
Collected UDP per-flow addresses and ICMP per-destination addresses. These have been compared even though they were collected from different vantage points; a good number were found in common, which helps to explain the different population proportions of the various types of UDP and ICMP load balancers.
Continuing with the Internet simulator, a problem has appeared when creating control node paths using the large CAIDA data sets: the process fails to make progress and eventually reports that the path cannot be created.
Continued work on a paper for publication. Added tables and graphs, along with appropriate text, for the sections on stopping values, load balancer population proportions and methodology.
Finalised the TEntropy detector and committed the changes to Git. Then spent the next few days reading up on belief theory and the Dempster-Shafer belief functions. Started working on a Python script for a server that listens for new connections, understands the protocol used by AnomalyExporter and parses the event data received from the anomaly_ts client.
Plan for the next week is to finish the eventing Python script (which will initially include event grouping by timestamp and stream ID) and to start gathering data from the events in order to calculate a confidence level for each detector. This is necessary for using belief theory to decide whether an event is really an event, since the various detectors may not produce the same results.
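The planned grouping can be sketched as follows (the field names are assumptions, not the actual AnomalyExporter protocol):

```python
from collections import defaultdict

def group_events(events, window):
    # Group detector events that refer to the same stream and fall within
    # the same time window, so that belief combination can later be
    # applied to each group of detector reports.
    groups = defaultdict(list)
    for ev in events:
        key = (ev["stream"], ev["ts"] - ev["ts"] % window)
        groups[key].append(ev)
    return dict(groups)
```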
This week I learned some Python, reading and working through some of Brendon's exercises with the AMP REST API, and then moved on to looking at graphs in Cuz. I have since been working on a "rainbow" traceroute graph type for Cuz using JS and Flotr2, which is now mostly complete aside from some minor tweaking and checking to ensure the graph still draws correctly when given errors, null data, etc. The library's lack of documentation made coding a slow process, so I intend to document my files fairly thoroughly starting next week.