I haven't done a blog in ages! Oops.
Since my last blog, I have been refining the DAG format so the new cards work well. I wrote a bash script to configure the memory and hashing on the cards, because that is otherwise a long manual process. Basically, you tell the script how many streams you want, how much RAM to give them, and which card to configure, and it will go away and do just that.
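The script itself is bash, but the core of it is just dividing the card's RAM evenly between the streams and shelling out to dagconfig. Here is a rough Python sketch of the same logic; note that the exact `mem=` syntax passed to dagconfig is an assumption on my part, so check it against the Endace documentation before relying on it:

```python
import subprocess


def stream_memory_split(total_mib, nstreams):
    """Divide the card's RAM evenly across receive streams (MiB each).

    Any remainder from the integer division is given to the first stream.
    """
    if nstreams <= 0 or total_mib < nstreams:
        raise ValueError("need at least 1 MiB per stream")
    base = total_mib // nstreams
    split = [base] * nstreams
    split[0] += total_mib - base * nstreams
    return split


def configure_card(device, total_mib, nstreams, dry_run=True):
    # NOTE: the dagconfig invocation below is an assumption for
    # illustration -- the real flags for per-stream memory may differ.
    split = stream_memory_split(total_mib, nstreams)
    cmd = ["dagconfig", "-d", device,
           "mem=" + ":".join(str(m) for m in split)]
    if dry_run:
        return cmd
    subprocess.run(cmd, check=True)
```

The dry_run flag makes the command list inspectable without actually touching a card, which is handy when testing the split logic on a machine without DAG hardware.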
I validated using Ostinato and the dagsnap utility, which showed a pretty good balance across all the streams.
The API between DAG 4 and DAG 5 changed slightly. I needed to change how the configuration works to use the new CSAPI, and I also had to drop DUCK support. I really have no idea how DUCK works, and all references to the word DUCK have disappeared from the Endace documentation. I left a helpful TODO message.
I also spent a bunch of time discussing how we're going to refactor the code base in the formats to better support using two different APIs - parallel and non-parallel libtrace. In the end we decided to use a linked list to hold per-stream data. The non-parallel interface gets initialised and points to the first item in the list (FORMAT_DATA_FIRST). The parallel initialisation will add more entries to the end of the list. Any code that is common between the parallel and non-parallel paths just gets wrapped in a way that passes the correct data structure to the method.
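The real code is C inside libtrace, but the arrangement is easy to sketch in Python. FORMAT_DATA_FIRST is the name from the text above; everything else here is made up for illustration:

```python
class StreamData:
    """Per-stream format data, held in a singly linked list."""

    def __init__(self, stream_id):
        self.stream_id = stream_id
        self.next = None


class FormatData:
    def __init__(self):
        # Non-parallel libtrace only ever touches the head of the
        # list (FORMAT_DATA_FIRST in the real code).
        self.first = StreamData(0)

    def format_data_first(self):
        return self.first

    def add_stream(self):
        """Parallel initialisation appends one entry per extra stream."""
        tail = self.first
        while tail.next is not None:
            tail = tail.next
        tail.next = StreamData(tail.stream_id + 1)
        return tail.next


def per_stream_op(stream, fn):
    """Common code is wrapped so it receives the right per-stream struct."""
    return fn(stream)
```

The nice property is that the non-parallel path never needs to know the list exists - it just always operates on the head.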
Implementing this took a fair amount of time as there were a lot of references to modify. It compiled, and tracestats still counts the correct number of packets. I should also check that it still performs well.
Monday was a day off, and I spent most of Tuesday working on my slides and graphs as per Shane's suggestions for my NZNOG talk.
The rest of the week was spent attending NZNOG. The WAND presentation went well, though it ran slightly over time. There were many interesting talks and I enjoyed my time. I talked to a few people interested in my research after the presentation.
Continued refactoring the pstart method and other related parts of the code.
I wrote slides for NZNOG for the practice presentation on Thursday. I got Dan to help run some last-minute results for the slides, but was still missing one graph.
On the Friday I fixed my presentation as per the suggestions given after the practice presentation. I also worked on a Python Ostinato script to run tests and generate results automatically, and left it running through some scenarios over the weekend.
Tidied up the way signing requests were dealt with, to help make sure
that they weren't cluttering things up - checking for the certificate
before sending a (possibly unnecessary) request, deleting them when no
longer required, and making sure memory is freed.
Spent Wednesday to Friday in Rotorua at the NZNOG conference.
First week back this week, so spent some time catching up on my notes
about where I left off last year. Began testing the AMP CA
initialisation, key/certificate generation and distribution from start
to finish to make sure the system worked together with the amplet
client. Also made lots of minor fixes, removing extraneous debug
messages, documentation updates, etc.
Some inconsistencies had crept into the directory structure that the
webscripts expected compared with the command line tool, so these all
now share a common configuration space. This also ties in with a new
initialisation command which will set up the directory structure as required.
I generate my certificates with slightly different options than the
default openssl tools do, which meant that the key portion was in a
different place in the certificate. Instead of blindly trying to load
portions of the certificate as a public key, I now properly parse them
all as ASN.1/DER strings and look for the object identifier tag that
describes an RSA key.
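For the curious, the rsaEncryption object identifier (1.2.840.113549.1.1.1) has a fixed DER encoding that you can compute and look for. This Python sketch shows the encoding plus a deliberately crude check - the real code walks the ASN.1 structure properly rather than substring-matching like this:

```python
def encode_oid(dotted):
    """DER-encode an object identifier (tag 0x06).

    The first two arcs share one byte (40*a + b); remaining arcs are
    base-128 with the continuation bit set on all but the last byte.
    """
    parts = [int(p) for p in dotted.split(".")]
    body = bytearray([40 * parts[0] + parts[1]])
    for arc in parts[2:]:
        chunk = bytearray([arc & 0x7F])
        arc >>= 7
        while arc:
            chunk.insert(0, 0x80 | (arc & 0x7F))
            arc >>= 7
        body.extend(chunk)
    return bytes([0x06, len(body)]) + bytes(body)


RSA_ENCRYPTION = encode_oid("1.2.840.113549.1.1.1")


def contains_rsa_key(der_bytes):
    """Crude check: does this DER blob mention the rsaEncryption OID?

    Good enough for a sanity test, but a proper parser should walk the
    TLV structure rather than substring-match.
    """
    return RSA_ENCRYPTION in der_bytes
```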
Spent Tuesday combining all of our slides into a single presentation, and fixing all the weird LibreOffice glitches that resulted (broken diagrams, funky background colours, etc.). Tweaked a few slides to be less wordy.
The talk itself on Friday went OK, if a little over time.
This week I have been extending the code for gathering flow information so that it provides a "Flow Fingerprint". This consists of the server IP address, the server port, the transport protocol, and the application protocol as identified by libprotoident.
I surmise that these attributes are enough to identify the majority of elephant flows; for example, TCP traffic to a port other than 80, directed toward a Dropbox server and using the HTTP protocol, is likely an elephant flow.
Combining this simple approach with the data gathered by the flow information scraper and a manually set threshold (C) should make it possible to identify common elephant flow configurations.
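A minimal Python sketch of the fingerprint-plus-threshold idea; the FlowFingerprint name and the byte counts in the example are illustrative, not from the real code:

```python
from collections import Counter, namedtuple

# The four attributes that make up a fingerprint, per the text above.
FlowFingerprint = namedtuple(
    "FlowFingerprint", ["server_ip", "server_port", "transport", "app_proto"])


def elephant_fingerprints(flows, c_threshold):
    """Return the set of fingerprints whose total observed bytes exceed C.

    `flows` is an iterable of (FlowFingerprint, byte_count) pairs.
    """
    totals = Counter()
    for fp, nbytes in flows:
        totals[fp] += nbytes
    return {fp for fp, total in totals.items() if total > c_threshold}
```

Because the fingerprint deliberately ignores the client side of the flow, repeated elephants to the same server/port/protocol combination aggregate naturally.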
Continued working on the new detector for netevmon. It is no longer a KS test, strictly speaking, but it performs a similar function. Experimented with using the Earth Mover's Distance as an alternative, but this tended to be badly affected by outliers in the distribution. Managed to come up with a couple of tweaks that improved the performance of the detector overall.
The first was to examine the distribution of the interquartile values only, i.e. discard the bottom 2 and top 2 values from the original distribution, to minimise the impact of outliers in general. Another change I made was to require the total sum of the values in each sample to differ by a non-trivial amount, which would prevent the detector from alerting when the distance between the two distributions is very small.
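Both tweaks are simple to express. A Python sketch, where distance_fn stands in for the KS-like statistic and the threshold values are placeholders rather than the tuned ones:

```python
def trimmed(sample):
    """Interquartile-style trim: drop the bottom 2 and top 2 values
    to limit the influence of outliers."""
    s = sorted(sample)
    return s[2:-2] if len(s) > 4 else s


def should_alert(sample_a, sample_b, distance_fn,
                 dist_threshold, sum_threshold):
    """Alert only if the trimmed distributions are far apart AND their
    totals differ by a non-trivial amount (the second tweak stops
    alerts when the two distributions are nearly identical)."""
    a, b = trimmed(sample_a), trimmed(sample_b)
    if abs(sum(a) - sum(b)) < sum_threshold:
        return False
    return distance_fn(a, b) > dist_threshold
```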
Ran the new detector against the ground truth dataset to determine how well it performs. Results are not too bad so far -- looks like it will reach similar levels of reliability to the BinSeg detector which is one of the better detectors we have.
I completed my presentation and delivered it to members of the machine learning group on Tuesday.
It was agreed that my investigation involves a number of complex problems, and that a simple approach would be required for the time being to create a training data set of elephant flows.
I was also advised to use the command line version of Weka when using large data sets, as it tends to be more efficient.
With a shift in focus, I will now be investigating simpler methods for identifying hot-spot IP address and port combinations. Essentially, each time a flow exceeds an elephant threshold, its IP address and port number will be logged. If a certain number of elephants are observed from a given combination, every flow originating from it will be treated as an elephant.
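A Python sketch of the hot-spot bookkeeping; the threshold values used in the example are made up, and the real implementation details may well differ:

```python
from collections import defaultdict


class HotspotTracker:
    """Log (IP, port) each time a flow exceeds the elephant threshold.

    Once a combination has produced enough elephants, every flow
    originating from it is treated as an elephant.
    """

    def __init__(self, elephant_bytes, promote_after):
        self.elephant_bytes = elephant_bytes  # bytes needed to be an elephant
        self.promote_after = promote_after    # elephants needed to promote
        self.counts = defaultdict(int)

    def observe(self, ip, port, flow_bytes):
        """Record a finished flow; count it if it was an elephant."""
        if flow_bytes > self.elephant_bytes:
            self.counts[(ip, port)] += 1

    def is_hotspot(self, ip, port):
        return self.counts[(ip, port)] >= self.promote_after
```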
Implemented and tested the final features of the event database: updating groups and events as new detections are added to an event. This meant updating the DS score, the magnitude, the timestamp of the most recent detection, the detection count, and the detectors that fired for each event. The group that the recently updated event belongs to is also updated: the end timestamp of the group and the severity (the highest probability across all events is used). Spent a lot of time testing, since I kept second-guessing the correct addition and updating of events. Shane also plans on deploying the eventing code so that it will be possible to see event groups etc. on the graphs, which will be useful for debugging too.
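Roughly, the update logic looks like the Python sketch below. The field names and the exact DS-score/magnitude update rules here are my assumptions for illustration, based only on the description above:

```python
def update_event(event, detection):
    """Fold a new detection into an existing event (plain dicts here;
    the real code works against the event database)."""
    event["ds_score"] = detection["ds_score"]  # assumed: latest score wins
    event["magnitude"] = max(event["magnitude"],
                             detection["magnitude"])  # assumed rule
    event["last_detection_ts"] = detection["ts"]
    event["detection_count"] += 1
    event["detectors"].add(detection["detector"])


def update_group(group, event):
    """Refresh the group the updated event belongs to."""
    group["end_ts"] = max(group["end_ts"], event["last_detection_ts"])
    # Severity is the highest probability across the group's events.
    group["severity"] = max(e["probability"] for e in group["events"])
```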
Shane needed some accuracy values for DS, with and without magnitude as an additional source of input, for use in his presentation. Started rerunning the sample dataset through eventing with magnitude enabled and disabled.