Started the week by reading up on the theory behind a forecasting technique called Adaptive Response Rate Single Exponential Smoothing (ARRSES). I then spent the next few days implementing and testing a detector that uses the smoothing technique to forecast the next byte count. Still need to work out how to tune the parameters so that the forecast is not simply a delayed copy of the actual measurements.
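For reference, a minimal sketch of the textbook form of Adaptive Response Rate Single Exponential Smoothing is given below. It is not the detector code itself: the function name, the default beta and the choice to seed the first forecast with the first measurement are all assumptions made for illustration.

```python
def arrses_forecasts(series, beta=0.2, initial_alpha=0.2):
    """One-step-ahead forecasts using adaptive response rate smoothing.

    Returns len(series) + 1 values: element i is the forecast made for
    series[i], and the last element is the forecast for the next,
    as yet unseen, measurement.
    """
    if not series:
        return []
    forecast = series[0]      # assumption: seed with the first measurement
    alpha = initial_alpha     # smoothing weight, adapted at every step
    smoothed_err = 0.0        # exponentially smoothed signed error
    smoothed_abs_err = 0.0    # exponentially smoothed absolute error
    forecasts = []
    for value in series:
        forecasts.append(forecast)
        error = value - forecast
        # The next forecast uses the alpha derived from earlier errors.
        forecast = alpha * value + (1 - alpha) * forecast
        smoothed_err = beta * error + (1 - beta) * smoothed_err
        smoothed_abs_err = beta * abs(error) + (1 - beta) * smoothed_abs_err
        if smoothed_abs_err > 0:
            # Ratio is always in [0, 1]: near 1 when errors share a sign.
            alpha = abs(smoothed_err / smoothed_abs_err)
    forecasts.append(forecast)
    return forecasts
```

With a small beta the weight alpha reacts slowly, so the forecast lags behind a level shift; a large beta pushes alpha towards 1 after a run of same-signed errors, which makes the forecast track the most recent measurement closely.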
Plan for next week is to do some reading on event detection in time series data, especially looking at techniques/methods that NetEvMon does not currently use.
Added traceroute data from AMP to the matrix display, which involved
feeding it through all stages of the pipeline - collection, parsing,
storage, querying. Expanded the matrix to be smarter about selecting the
collections to query and refactored some of the AMP data fetching code
to make it easier to add new AMP collections.
While testing with the data collected by the new AMP I found and fixed
a bug where the results of name resolution for some targets in the
schedule were not being used properly (only the first resolved address
was being tested). This has also shown that we need a sensible way to
deal with multiple targets having the same label in the matrix and
graph displays.
Started looking at being able to format AMP ICMP test data in a similar
manner to smokeping so that it can be used with the more interesting
smokeping style graphs. Looks like the database can do most of the heavy
lifting to generate percentiles, which we can then use to plot the
shaded smokey regions.
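As a rough illustration of the percentile bands, here is a sketch of the same calculation done in Python rather than in the database. The function names, the choice of fractions and the linear interpolation between samples are assumptions, not what the database query does.

```python
def percentile(sorted_values, fraction):
    """Linearly interpolated percentile of a sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    position = fraction * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def smoke_bands(rtts, fractions=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Percentiles of one time bin of RTTs, ordered from lowest to highest.

    Adjacent pairs of values bound the shaded regions; the middle value
    (the median, with the default fractions) would be the plotted line.
    """
    ordered = sorted(rtts)
    return [percentile(ordered, f) for f in fractions]
```

For example, `smoke_bands([30, 10, 50, 20, 40])` gives one value per fraction for that bin, and repeating this for every bin yields the series needed to draw the bands.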
Spent a bit of time updating the REANNZ weathermap to bring it in line
with recent network changes.
Added support for the AMP ICMP collection to ampy and amp-web, so we are now able to plot graphs of the test data Brendon has been collecting.
Spent a decent chunk of an afternoon working through the DPDK build system with Richard S., trying to make the DPDK libraries build as position-independent code so that we can link libtrace against them nicely.
Reworked a large amount of code in amp-web to move the collection-specific code out of the core source files and into separate little modules for each collection. This means that the core code should be much easier to follow and work on. Adding support for new collections should also be simpler and require less inside knowledge of how the whole system works.
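A sketch of the general shape of this kind of split is shown below, assuming a simple registry that maps collection names to handler objects. The class names, the `format_point` method and the row field names are hypothetical and are not taken from amp-web.

```python
# Registry of collection handlers, keyed by collection name.
COLLECTIONS = {}


def register(name):
    """Class decorator that adds an instance of the handler to the registry."""
    def decorator(cls):
        COLLECTIONS[name] = cls()
        return cls
    return decorator


@register("amp-icmp")
class IcmpCollection:
    def format_point(self, row):
        # Hypothetical fields: plot round trip time against time.
        return [row["timestamp"], row["rtt"]]


@register("amp-traceroute")
class TracerouteCollection:
    def format_point(self, row):
        # Hypothetical fields: plot path length against time.
        return [row["timestamp"], row["length"]]


def format_points(collection, rows):
    """Core code: look up the handler, with no collection-specific logic."""
    handler = COLLECTIONS[collection]
    return [handler.format_point(row) for row in rows]
```

Under this arrangement, supporting a new collection means adding one more registered class, and the core function does not change.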
Finished working on the autotools process for libtrace with DPDK so that 'make; make install' will build and install the shared library without any additional complexity. With Shane's assistance, came to the conclusion that Intel's DPDK build system needs to be patched before libtrace can be built against it as a shared library.
Committed my work (i.e. the DPDK format) to the libtrace SVN. Put some basic documentation on the libtrace wiki (to be expanded upon).
Managed to implement Swingtime successfully in order to extend Horizon and implement a booking page. Still need to tweak a few of the views and link it with the OpenStack database but the functionality is working.
Started writing topology creation scripts this week. So far they are able to create full mesh networks as well as ring networks for a specified number of switches. Since then I have been advised that mininet may be worth looking into. Next week I plan to try out mininet. Even if it is not quite what I'm looking for, it may give me ideas about methods for generating topologies.
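The two topology types can be described as lists of switch-to-switch links. The sketch below is an illustration only, not the scripts themselves; the function names and the numbering of switches from 1 are assumptions.

```python
from itertools import combinations


def full_mesh(num_switches):
    """Links for a full mesh: every pair of switches is directly connected."""
    return list(combinations(range(1, num_switches + 1), 2))


def ring(num_switches):
    """Links for a ring: each switch connects to the next, the last to the first."""
    if num_switches < 3:
        raise ValueError("a ring needs at least three switches")
    return [(i, i % num_switches + 1) for i in range(1, num_switches + 1)]
```

A full mesh of n switches has n(n-1)/2 links, while a ring has n, so `full_mesh(4)` returns six links and `ring(4)` returns four.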
Spent the week taking a look at the detectors written by Shane and Brendon and managed to cover them all and have a good understanding of how they work. Also had a brief look at the theory behind the Dempster-Shafer belief functions, but need to go through them in more depth at some point.
Plan for next week: start reading up on event detection in time-series data and get set up with my own copy of netevmon. Also look into other smoothing functions that could potentially be useful for new detectors.
Spent most of the week working on backend things that are required to
get the AMP data into the matrix view in the way that we want.
Installed memcached and changed the way memcache keys work when fetching
recent data to enable it to be cached properly, with different timeouts
for different duration data.
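One way this could look is sketched below, assuming the key is built from the collection, stream and duration only (no request timestamp, so repeated requests hit the same entry) and that longer durations tolerate longer timeouts. The key layout, function names and timeout bounds are all hypothetical.

```python
def cache_key(collection, stream_id, duration):
    """Key for cached recent data; stays the same between requests."""
    return "recent_%s_%d_%d" % (collection, stream_id, duration)


def cache_timeout(duration):
    """Seconds to keep an entry: a tenth of the duration, within fixed bounds.

    The bounds (30 seconds and 30 minutes) are illustrative values only.
    """
    return max(30, min(duration // 10, 1800))
```

The resulting pair would then be handed to whichever memcached client is in use when storing the fetched data.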
Updated all the data/metadata fetching to use the new NNTSC API rather
than pulling data from the old REST interface on erg. As part of this
added the ability to specify multiple aggregation functions across
(possibly duplicate) columns to efficiently fetch all the data required
(e.g. means, standard deviations, etc. of the latency values). Mean and
standard deviation are now used to colour the matrix cells rather than
absolute latency differences. Also slightly tidied up the way tooltips
and sparkline graphs are drawn to better present this data.
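A minimal sketch of colouring by mean and standard deviation follows, assuming a cell is classed by how many standard deviations the latest latency sits above the mean. The function name and the thresholds of 1, 2 and 3 standard deviations are assumptions rather than the values used in the matrix.

```python
def cell_class(latest, mean, stddev, thresholds=(1.0, 2.0, 3.0)):
    """Return 0 for a normal cell, rising to len(thresholds) for the worst.

    The class is the number of thresholds (in standard deviations above
    the mean) that the latest latency measurement has reached.
    """
    if stddev <= 0:
        # No variation recorded: any increase at all is treated as the worst.
        return 0 if latest <= mean else len(thresholds)
    deviations = (latest - mean) / stddev
    return sum(1 for t in thresholds if deviations >= t)
```

Compared with an absolute latency difference, this treats a 5 ms increase on a very stable path as more notable than the same increase on a path that normally varies by tens of milliseconds.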
Table partitioning is now up and running inside of NNTSC. Migrated all the existing data over to partitioned tables.
Enabled per-user tracking in the LPI collector and updated Cuz to deal with multiple users sensibly. Changed the LPI collector to not export counters that have a value of zero -- the client now detects which protocols were missing counters and inserts zeroes accordingly. Also changed NNTSC to only create an LPI stream once a non-zero value occurs in its time series, which avoids creating hundreds of entirely zero streams per user for protocols that the user never uses.
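The zero suppression and refill can be sketched as a pair of functions, one for each end. This assumes the client knows the full list of protocols; the function names and the dictionary representation are hypothetical and do not reflect the LPI export format.

```python
def strip_zero_counters(counters):
    """Collector side: only export protocols with a non-zero counter."""
    return {proto: value for proto, value in counters.items() if value != 0}


def fill_missing_counters(known_protocols, received):
    """Client side: any known protocol missing from the export counts as zero."""
    return {proto: received.get(proto, 0) for proto in known_protocols}
```

Stripping and then refilling gives back the original counters as long as every exported protocol is in the known list.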
Added ability to query NNTSC for a list of streams that had been added since a given stream was created. This is needed to allow ampy to keep up to date with streams that have been added since the connection to NNTSC was first made. This is not an ideal solution as it adds an extra database query to many ampy operations, but I'm hoping to come up with something better soon.
Revisited and thoroughly documented the ShewhartS-based event detection code in netevmon. In the process, I made a couple of tweaks that should reduce the number of 'unimportant' events that we have been getting.
Continued to work on the session booking extension to the dashboard. Having trouble using Swingtime in the extension, so I am currently trying another route.