Found a paper that clusters event logs using word vector pairs. The approach compares each pair against every other pair in the supplied logs, which lets it cluster lines with similar parts. The paper has an associated toolkit that lets you specify the input files and the support required to form a pair; it then outputs the clusters that reach that support, and can also output the outlier clusters. This needs further investigation to see whether it is a viable solution.
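To get a feel for the idea before trying the toolkit, here is a minimal sketch of support-based clustering on word pairs. It is my own simplification and not the paper's algorithm or the toolkit's interface: the function name cluster_by_word_pairs, whitespace tokenisation, and grouping lines by their exact set of frequent pairs are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cluster_by_word_pairs(lines, support):
    """Group log lines by the frequent word pairs they share.

    Hypothetical simplification: a pair is "frequent" if it occurs in at
    least `support` lines, and lines with the same set of frequent pairs
    form one cluster. Clusters smaller than `support` become outliers.
    """
    pair_counts = Counter()
    line_pairs = []
    for line in lines:
        # Unordered pairs of distinct words in the line.
        words = sorted(set(line.split()))
        pairs = set(combinations(words, 2))
        line_pairs.append(pairs)
        pair_counts.update(pairs)

    groups = defaultdict(list)
    outliers = []
    for line, pairs in zip(lines, line_pairs):
        frequent = frozenset(p for p in pairs if pair_counts[p] >= support)
        if frequent:
            groups[frequent].append(line)
        else:
            outliers.append(line)

    clusters = []
    for members in groups.values():
        if len(members) >= support:
            clusters.append(members)
        else:
            outliers.extend(members)
    return clusters, outliers


if __name__ == "__main__":
    logs = [
        "sshd login failed for alice",
        "sshd login failed for bob",
        "kernel panic",
    ]
    clusters, outliers = cluster_by_word_pairs(logs, support=2)
    print(clusters)
    print(outliers)
```

Counting every word pair per line is quadratic in line length, so this would only be practical for short log lines; the real toolkit presumably does something smarter.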
Had a meeting with Antti Puurula about possible approaches, where we discussed outputting a ranking into lists of safe and unsafe events. We discussed how this could be evaluated with a Mean Average Precision measurement, and then a few algorithms that could be used for scoring events: clustering, if the feature space can be separated; supervised learning, if all the data is tagged; or, his recommendation, a supervised approach where the user manually updates the lists of safe/unsafe events and the classifier updates iteratively.
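For reference, Mean Average Precision over ranked lists can be computed as in the sketch below. It assumes each ranking is a list of booleans in ranked order, where True marks an event that really belongs in the list being evaluated (e.g. truly unsafe); the function names are my own.

```python
def average_precision(ranked_labels):
    """Average precision for one ranking.

    ranked_labels: booleans in ranked order, True = relevant item.
    Precision is sampled at the rank of every relevant item.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_labels, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of the average precision over several rankings."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


if __name__ == "__main__":
    # Two hypothetical rankings of events, e.g. from two log files.
    rankings = [
        [True, False, True],
        [False, True],
    ]
    print(mean_average_precision(rankings))
```

A ranking that places all relevant events at the top scores 1.0, so the measure rewards pushing unsafe events towards the head of the list.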
We then discussed how to integrate non-language features such as timestamps, by having another algorithm such as Naive Bayes handle the continuous features. This way we could identify events happening within a certain time period of one another and tie events together between files.
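As a first step towards tying events between files, a plain time-window join might look like the sketch below. This is only an illustration of the "within a certain time period" idea, not the Naive Bayes approach itself; the function name link_events, the (timestamp, message) tuple format, and the sample messages are assumptions.

```python
from datetime import datetime, timedelta


def link_events(events_a, events_b, window):
    """Pair up events from two files that occur within `window` of each other.

    events_a, events_b: lists of (timestamp, message), sorted by timestamp.
    Returns a list of (message_a, message_b) pairs.
    """
    links = []
    start = 0
    for ts_a, msg_a in events_a:
        # Skip events in B that are too old to match this or any later A event.
        while start < len(events_b) and events_b[start][0] < ts_a - window:
            start += 1
        j = start
        while j < len(events_b) and events_b[j][0] <= ts_a + window:
            links.append((msg_a, events_b[j][1]))
            j += 1
    return links


if __name__ == "__main__":
    t0 = datetime(2014, 1, 1, 12, 0, 0)
    auth_log = [
        (t0, "login failed"),
        (t0 + timedelta(seconds=60), "disk full"),
    ]
    net_log = [
        (t0 + timedelta(seconds=2), "connection opened"),
        (t0 + timedelta(seconds=30), "dhcp renew"),
        (t0 + timedelta(seconds=61), "write error"),
    ]
    for pair in link_events(auth_log, net_log, timedelta(seconds=5)):
        print(pair)
```

The time difference between linked events could then be fed to a classifier as a continuous feature.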
Short week as I was off sick on Monday and Tuesday.
Spent some time looking into using a headless web testing environment
as an alternative to the current HTTP test. This would give us
don't (due to them being generated programmatically or obfuscated). Not
all of the headless testing software appears to give full access to the
events that I'm interested in, while some are written such that they
will be awkward to integrate into an AMP test. Currently looking at
embedded Chromium as most likely to be useful.
Started refactoring some of the configuration parsing code in amplet to
remove some unnecessary globals and remove some cruft from the main loop
that didn't really need to be there.
Updated the website authentication to make it easy to toggle on and off,
as we don't want to protect the public site. Merged this and the rest of
the recent changes (raw data fetching etc) back into the develop branch.
Spent some time looking into what appear to be periodic MTU issues on
one of our test connections that are preventing the throughput test from
running. Confusing matters is that I'm not sure how well the route cache
deals with network namespaces - it sometimes appears as if it is all
shared between all connections, but sometimes it doesn't. It's possible
these symptoms would go away with a newer kernel version (route cache
was removed, better network namespace support).
Brad managed to track down a newer video card for quarterpounder, so now BSOD is up and running again.
Added Meena's lpicollector to our GitHub, so now I can finally deprecate the lpi_live tool that comes with libprotoident. Spent a bit of time updating some documentation and reworking the example client scripts so that everything is a bit easier to use. Also fixed a couple of memory bugs that I may have introduced last time I worked on the collector.
Continued working with the new event groups. Found a problem where I was incorrectly preferring shorter AS path segments over longer ones when determining whether I could remove a group for being redundant. Having fixed that, many event groups now cover several ASNs so I've redesigned the event list on the dashboard to be better at displaying multiple AS names.
Have a copy of ubiquiOS in hand. No further progress on the project.
Looking forward to integrating ubiquiOS and improving all the low-level OS functionality (software timers, memory allocation and debug output spring to mind!).
The source code for both BSOD and Meenakshee Mungro's reliable libprotoident collector has been added to the WAND GitHub page. Developers can freely clone these projects and make their own modifications or additions to the source code, while keeping up with any changes that we make between releases.
This is the first time we have released the libprotoident collector under the GPLv3 license. This project is a replacement for the lpi_live tool included with libprotoident, which should now be considered deprecated.
We're also more than happy to consider pull requests for code that adds useful features to either project.
WAND on GitHub
This week I've been continuing to look at OFLOPS (a testing framework for OpenFlow switches) and enquiring about hardware to test upon. Josh Bailey has put me in contact with the original OFLOPS authors, so I will work with them to figure out the best way to update OFLOPS.
I've been looking into which library is best for constructing and parsing OpenFlow messages on top of a measurement platform such as OFLOPS. I'm looking for something in a low-level language such as C/C++ for performance reasons, but I also need to support at least OpenFlow 1.0 and OpenFlow 1.3. Hopefully the differences between the two will be abstracted as much as possible, allowing the same code to create both OF 1.0 and 1.3 messages (and any future versions). Ideally the resulting code will also be concise.
I'm currently looking at libfluid, Floodlight's LoxiGen and OFConnect, and writing some simple cases in each to see how easy they are to use.
The IS0 Doubletree chapter was updated. It now includes data on link usage and global stop set hits. This information is available because IS0 is more complex than the trace-based simulator, and it also helps with validation to some extent.
Fuller results were added for the trace-based Doubletree simulator that uses CAIDA data.
In the Megatree chapter, the graph formatting was improved and the introduction and validation sections were double-checked and updated.
A new run of the black hole detector was initiated. Also all of my warts files were compressed.
My NNTSC live queue continued to keep up satisfactorily, so I've turned my attention back to testing AS-based event grouping in netevmon. Updated the dashboard to use AS names rather than numbers to describe event groups. Replaced the "top sources" and "top targets" graph with a "top networks" graph.
Spent Thursday hosting one of the candidates applying for a position with STRATUS.
Added BSOD to our GitHub on Friday. Tried to get the client running on the big TV, but ran into some issues with our video card no longer being supported by fglrx. Attempting to get the client to build and run on the Mac was not much more successful, since Xcode seems to have lost track of some of our dynamic libraries.
I have most of the association procedure implemented, meaning nodes are able to associate with the coordinator. However, I'm having problems with my custom OS. Rather than spend more time on that venture, the plan is to use ubiquiOS for now. I can always write a new OS later - or use something like Contiki.
I have a fair bit of experience with ubiquiOS so this will be the fastest path forward.
Not much will get done until at least the end of next week. The next step is to integrate ubiquiOS and make sure error states are handled with the association. Then it's on to security!