Shane Alcock's Blog

16 Jul 2012

Continued playing around with manual event classification. Adding sub-categories does seem to have helped somewhat, although there is no immediately obvious distinction between, say, spike events and volume changes. However, in the process I did manage to find a cluster of false positives that can easily be identified and correctly classified.

Brendon, Brad and I spent some time looking into NDT, the measurement software used by REANNZ's New Zealand Broadband Test which went live last week. Running NZBT on red cables didn't give particularly accurate or reliable results so we ran a few tests in the lab to see if NDT gave appropriate results in perfect conditions. Eventually managed to get reliable results, but only after setting up NDT on a physical (rather than virtual) machine. However, I remain unconvinced as to NDT's suitability for this kind of project.

09 Jul 2012

Spent a lot of time staring at various statistics about major and minor events in my example data sets, looking for an obvious property that would allow me to separate genuine events from the false positives I am currently getting. Didn't have a lot of success, although I have managed to get rid of a few of the more obvious false positives.

Midway through the week, I had the idea of further sub-categorising events based on the "type" of event, e.g. spike events, level changes, increased noise. I'm hoping that there will be something specific about, for example, spike events that makes them easier to identify compared with a binary "is this an event?" approach.

Prepared some more 301 content -- over half-way now and the semester hasn't even started.

Watched the remaining lecturer candidate presentations on Friday.

02 Jul 2012

Continued experimenting with my new web-based manual classifications. Tried a few tweaks here and there; the main one being changing my sample mean change calculation to use the maximum of the absolute values of the two sample means being compared. Still in the process of evaluating the effect of this change.
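The post does not spell out the exact formula, so the following is only one plausible reading of the tweak: a relative change between two sample means, normalised by the larger of their absolute values. The function name and the zero-denominator handling are assumptions made for illustration.

```python
def relative_mean_change(old_mean, new_mean):
    """Relative change between two sample means (hypothetical helper).

    The difference is scaled by max(|old_mean|, |new_mean|), so the result
    stays bounded when one of the means is close to zero.
    """
    denom = max(abs(old_mean), abs(new_mean))
    if denom == 0:
        # Both means are exactly zero, so there is no change to report.
        return 0.0
    return abs(new_mean - old_mean) / denom


if __name__ == "__main__":
    print(relative_mean_change(1.0, 2.0))
    print(relative_mean_change(2.0, -4.0))
```

With this normalisation the result always falls between 0 and 2, which may make a single threshold easier to reason about than one applied to raw differences.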

Spent a fair bit of time working on lecture slides and the first assignment for 301. The slides seem to be coming together pretty well, although I'm a little concerned about the amount of content I'm trying to squeeze into each lecture.

Watched the research and teaching talks given by 5 of the candidates for the available lecturing positions.

Had Tuesday off to catch up on a number of personal errands.

25 Jun 2012

Developed a web-based system for performing manual classification of anomalous traffic measurements. This should greatly speed up the process of validating my anomaly detection code and will make it possible to start to crowd-source the validation in the long term.

Using my new system, I classified a new time series which had quite a few false positives. Based on this, I've concluded that my most recent threshold changes were a step in the right direction but using a fixed threshold (unsurprisingly) only works well in some instances. In response, I've started experimenting with trying to calculate a new threshold for sample mean movement that is based on the properties of the time series itself.

18 Jun 2012

Revisited my original manual classifications of various anomalies. Added yet another threshold that must be exceeded before an event is triggered -- the anomalous measurement must now cause a much more significant movement in the sample mean away from zero. This reduced the number of false positives in a couple of my graphs and only minor events were lost. Decided that many of the false negatives I had noted last week were arguably not events at all (and definitely not significant enough to warrant alerting) so re-classified them appropriately.

Started working on some C programming lectures for 301 next semester. Hopefully I can do a good enough job that we can re-use the content as an online C teaching resource rather than every course that requires C having to do a rushed job of teaching it to students who may not have come across it before.

Spent a fair bit of time helping various 513 students with using libtrace to analyse CDN traffic in a packet trace.

11 Jun 2012

Marked the latest batch of libtrace assignments for 513.

Continued manually checking for anomalies in my various test time series. Found a few scenarios where a certain type of pattern in the time series would cause the prediction algorithm to go a bit crazy for a while. After a bit of investigation, I found the problem was caused by the denoising algorithm which was creating the problematic pattern.

Throwing all my manual classifications into WEKA unfortunately did not reveal any obvious avenues for improvement, so it is back to the drawing board a bit on this one. None of the metrics I currently have can help resolve the false positives and false negatives, so I'll need to look for a new one.

05 Jun 2012

After a couple of days of mind-numbing mathematics, I managed to nail my implementation of the ARIMA modelling. Instead of taking nearly 40 minutes to run, the ARIMA modelling step now takes less than 4 minutes. This makes it much easier to experiment with a variety of time series as I don't have to sit around waiting for ages when I want to try something new.

Noticed my current setup was producing a few false positive events when testing against a time series of HTTP outbound flows. Spent a decent chunk of time tweaking thresholds and seeing if there was any simple solution. I've managed to make some improvements but still not getting ideal results. Started working on some manual classification of a few time series to see if we can apply some machine learning to a training set and find some suitable thresholds using actual science rather than me just fudging the values until the results look good.
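As a rough illustration of learning a threshold from manually classified data rather than tuning it by hand, here is a minimal sketch of a one-feature "decision stump". It is a stand-in for a real machine learning toolkit, not the method used in this project; the function name, the metric values and the labels are all made up for the example.

```python
def best_threshold(values, labels):
    """Pick the cut on a single metric that misclassifies the fewest samples.

    values: metric value for each manually classified measurement
    labels: 1 if the measurement was classified as a genuine event, else 0
    A measurement is predicted to be an event when its value exceeds the cut.
    Returns (cut, number_of_errors).
    """
    ordered = sorted(set(values))
    # Candidate cuts: below the minimum, between each pair of neighbouring
    # values, and above the maximum.
    cuts = [ordered[0] - 1.0]
    cuts += [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    cuts.append(ordered[-1] + 1.0)

    best = None
    for cut in cuts:
        errors = sum((v > cut) != bool(l) for v, l in zip(values, labels))
        if best is None or errors < best[1]:
            best = (cut, errors)
    return best


if __name__ == "__main__":
    metric = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]   # made-up metric values
    is_event = [0, 0, 0, 1, 1, 1]             # made-up manual labels
    print(best_threshold(metric, is_event))
```

If no cut gives a low error count for any available metric, that is a hint that a new metric is needed rather than a better threshold.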

28 May 2012

Finished documenting the current versions of my anomaly detection python scripts. Tested them against a few more noisy data sets, where they performed pretty well, so reasonably happy with the technique so far.

However, the ARIMA modelling component is quite sluggish so I started trying to figure out ways to improve the performance. At the moment, I simply call into R and tell it to reapply the original model to the last N measurements with little to no control over the math that is performed during that process, e.g. R could be calculating statistics that I don't need.

After carefully poring over lots of R code and papers on ARIMA modelling, I've been able to start developing my own Python version of the ARIMA model function. Managed to get correct values for the residuals when using a simple (1,1,0) model but more complicated models required a surprising amount of additional math. At this stage I'm not sure if my efforts will end up improving the performance of my event detection to any significant degree, but at least I understand the underlying math a lot better!
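For readers unfamiliar with the notation, a (1,1,0) model differences the series once and then applies a single autoregressive term. A minimal sketch of the residual calculation follows, assuming the AR coefficient `phi` has already been fitted elsewhere and that there is no drift term. This is a simple conditional calculation, so the first few values may not match what R's exact-likelihood fitting reports, and it is not the implementation described above.

```python
def arima_110_residuals(series, phi):
    """One-step-ahead residuals for an ARIMA(1,1,0) model without drift.

    With d[t] = y[t] - y[t-1], the model is d[t] = phi * d[t-1] + e[t],
    so each residual is e[t] = d[t] - phi * d[t-1].
    The first two observations produce no residual.
    """
    diffs = [b - a for a, b in zip(series, series[1:])]
    return [d - phi * prev for prev, d in zip(diffs, diffs[1:])]


if __name__ == "__main__":
    # Differences are [2, 1, 4], so the residuals are [1 - 0.5*2, 4 - 0.5*1].
    print(arima_110_residuals([1, 3, 4, 8], phi=0.5))
```

Models with moving average terms are more involved because each residual depends on earlier residuals, which is likely part of the "additional math" mentioned above.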

21 May 2012

Spent most of the week playing with Shewhart control charts, trying to find useful thresholds for determining when an anomaly has occurred. After a lot of fiddling and frustration, I think I've managed to settle on something that gives reasonable results for all of the test data that I have. Basically, I look for any forecast residual that causes a significant shift in the current sample standard deviation (calculated over the last 20 measurements). The residual must also result in a large shift in the current sample mean away from zero and must be more than eight sample standard deviations away from the complete sample mean.
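The three conditions described above can be sketched roughly as follows. The window of 20 and the factor of eight come from the text; the values of `sd_ratio` and `mean_shift`, the function name, and the choice to measure the eight standard deviations using the complete history are assumptions for illustration only.

```python
from statistics import mean, stdev


def is_anomaly(history, residual, window=20, sd_ratio=2.0,
               mean_shift=1.0, n_sigma=8.0):
    """Check a new forecast residual against three conditions.

    history: list of earlier forecast residuals, oldest first
    sd_ratio, mean_shift: hypothetical thresholds, not taken from the post
    """
    if len(history) < window:
        return False  # not enough data to compute the windowed statistics

    before = list(history[-window:])
    after = before[1:] + [residual]  # slide the window forward by one

    # 1. The residual causes a significant shift in the windowed std dev.
    sd_shifted = stdev(after) > sd_ratio * stdev(before)

    # 2. The windowed sample mean moves further away from zero.
    mean_moved = abs(mean(after)) - abs(mean(before)) > mean_shift

    # 3. The residual is far from the mean of the complete history.
    far_out = abs(residual - mean(history)) > n_sigma * stdev(history)

    return sd_shifted and mean_moved and far_out


if __name__ == "__main__":
    quiet = [0.1, -0.1] * 15              # made-up, well-behaved residuals
    print(is_anomaly(quiet, 50.0))        # large spike
    print(is_anomaly(quiet, 0.2))         # ordinary noise
```

Requiring all three conditions at once is what keeps isolated noisy measurements from triggering an event on their own.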

A lot of that probably doesn't make much sense to anyone who hasn't been working with these numbers for as long as I have, so I'm now working on documenting the whole process and explaining the reasoning behind it. I'm also planning on generating a lot more test data to verify that my approach works well in a variety of situations.

15 May 2012

Spent the week working through a number of possible techniques for finding a suitable threshold for reporting anomalies in my forecast residuals. This involved learning about Mahalanobis distances, chi-square distributions and Stahel-Donoho estimates. Currently, I'm playing with Shewhart control charts, which are mainly used for measuring industrial quality. Managed to get a good result with the Windows Messenger test data I have, although the results for the BitTorrent test data suggest that more tweaking may be required.
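For context, a basic Shewhart chart sets control limits a fixed number of standard deviations either side of a baseline mean and flags any measurement that falls outside them. The sketch below is a simplified version that uses the plain sample standard deviation of the baseline (textbook charts often estimate it differently); the function names and data are made up for illustration.

```python
from statistics import mean, stdev


def control_limits(baseline, k=3.0):
    """Return (lower, upper) control limits at k std devs from the mean."""
    centre = mean(baseline)
    spread = stdev(baseline)
    return centre - k * spread, centre + k * spread


def out_of_control(values, limits):
    """Return the indices of values that fall outside the control limits."""
    lower, upper = limits
    return [i for i, v in enumerate(values) if v < lower or v > upper]


if __name__ == "__main__":
    baseline = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10]  # made-up measurements
    limits = control_limits(baseline)
    print(limits)
    print(out_of_control([10, 14, 9, 5], limits))
```

Applied to forecast residuals, the baseline mean should sit near zero, so the limits mostly reflect how noisy the residuals normally are.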

Had to quickly revise my IMC paper to pass the format checker on the submission site. This took a little longer than anticipated as the format checker was configured with the wrong page size!

Helped out with Open Day on Friday, mainly by talking to people and impressing them with the wonders of networking.