Shane Alcock's Blog

03 Oct 2016

Finished the draft of my NNTSC paper. Got some initial feedback from Brendon which I've been able to incorporate into the paper.

Still not entirely happy with Influx-NNTSC and netevmon running on the same machine, as the combined memory usage will push skeptic's current hardware to its limit. Experimented with running netevmon on a separate VM just to make sure that a remote event database does actually work, so we at least have the option of moving netevmon onto its own dedicated machine.

Finished my implementation of the imprecise pattern mining algorithm. Started working on a more homegrown algorithm for detecting repeated sequences of syscalls within a larger trace, based on existing techniques for using a suffix tree to find repeated substrings within strings.
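As a rough illustration of the repeated-sequence idea, the sketch below finds the longest run of syscalls that occurs at least twice in a trace. It is not the algorithm described above: it swaps the suffix tree for a naive sorted list of suffixes (a simple suffix array), which is easier to show in a few lines but much slower on large traces. The function name and the sample trace are made up for the example.

```python
def longest_repeated_run(calls):
    """Return (run, starts) for the longest run of syscalls seen at least twice.

    `starts` holds the two start offsets that were compared; the two
    occurrences are allowed to overlap.
    """
    n = len(calls)
    # Naive suffix array: sort suffix start offsets by the suffix contents.
    suffixes = sorted(range(n), key=lambda i: calls[i:])
    best_len, best_starts = 0, []
    # Any repeated run is a common prefix of two suffixes that end up
    # adjacent in sorted order, so only neighbours need comparing.
    for a, b in zip(suffixes, suffixes[1:]):
        k = 0
        while a + k < n and b + k < n and calls[a + k] == calls[b + k]:
            k += 1
        if k > best_len:
            best_len, best_starts = k, sorted((a, b))
    if best_len == 0:
        return [], []
    start = best_starts[0]
    return calls[start:start + best_len], best_starts


trace = ["open", "read", "close", "stat", "open", "read", "close", "mmap"]
run, starts = longest_repeated_run(trace)
print(run, starts)
```

A real suffix tree would report every repeated run and all of its positions in one pass; this version only reports the single longest one.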

23

Sep

2016

Returned to my half-written NNTSC paper with an eye towards submitting it to PAM in a few weeks. Paper is now around 75% finished, including a couple of nice diagrams showing the NNTSC architecture and the database schema. Space is starting to get a bit tight, so I'll have to revisit some of my earlier writing and cleanse it of unnecessary waffle.

With the help of an explanation from Harris, I've been able to decipher the temporal property mining algorithms. Managed to implement the simple version this week, which seems to be doing the right thing, and started working on a more complicated variant that allows for some imperfections in the source data (e.g. 9/10 times a close follows an open, but every now and again someone forgets to call close before opening something else).
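The "9 times out of 10" idea can be sketched as a satisfaction ratio with a threshold. This is a simplified stand-in, not the algorithm from the papers: it ignores file descriptors and just asks how often one event is followed by another before the first event occurs again. The function names and the 0.9 default threshold are assumptions made for the example.

```python
def follow_ratio(trace, first, second):
    """Fraction of `first` events followed by `second` before the next `first`."""
    total = satisfied = 0
    pending = False  # True while a `first` is still waiting for its `second`
    for event in trace:
        if event == first:
            total += 1
            pending = True
        elif event == second and pending:
            satisfied += 1
            pending = False
    return satisfied / total if total else 0.0


def holds_approximately(trace, first, second, threshold=0.9):
    """Accept the property if it holds for at least `threshold` of the cases."""
    return follow_ratio(trace, first, second) >= threshold


# Ten opens, one of which is never closed before the next open.
trace = ["open", "close"] * 4 + ["open"] + ["open", "close"] * 5
print(follow_ratio(trace, "open", "close"))
print(holds_approximately(trace, "open", "close"))
```

With a threshold of 1.0 this collapses back to the strict version, where a single missed close rejects the property.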

19 Sep 2016

Kept tinkering with my mock skeptic install. I was a little concerned about the memory usage of anomaly_ts, so I went back over some previous work where I had measured the relative accuracy of each detector under a variety of parameter settings, looking for settings that work well for each detector while storing a minimal amount of history.

Spent a bit of time reading over some papers on mining temporal properties from sequences of function calls. The algorithms that these people are using are a bit tricky to decipher -- the explanation is a bit terse and I don't really have the background in the area to fill in the gaps -- so hopefully Harris will be able to get further than I did.

Continued building FSMs for common syscall patterns. Started working with the user study data, which is not at all well covered by my existing FSMs. This appears to be mostly because of various Gnome / X processes and widgets that are continuously polling and receiving events. The syscalls generated by these processes drown out everything else, so it is hard to find the actions that the users actually performed during the study.

Arranged travel and accommodation for my upcoming trip to IMC.

12 Sep 2016

Finished up the libtrace4 and wandio releases and pushed them out.

Installed a mock version of skeptic on an openstack VM to test how InfluxDB copes with the full public AMP dataset. In general, InfluxDB seems to be coping OK when inserting / browsing data but the memory requirements of anomaly_ts are a bit larger than I would like so that's an avenue to chase up in the near future.

Continued implementing syscall FSMs manually to find out about other cases we need to consider when trying to automate the process. Added the ability to express a state as another FSM so we can build more complex machines from the smaller ones. Documented the code and put it into bitbucket so other people can start working with it.
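Expressing a state as another FSM might look something like the sketch below. It is not the code that went into bitbucket: the class name, the linear (no branching) machine and the syscall sequences are all assumptions chosen to keep the example short.

```python
class SequenceFSM:
    """Linear FSM that accepts when every step matches in order.

    Each step is either a syscall name or another SequenceFSM, so larger
    machines can be assembled from smaller ones.
    """

    def __init__(self, name, steps):
        self.name = name
        self.steps = steps

    def match(self, events, pos=0):
        """Return the index just past the match starting at `pos`, or None."""
        for step in self.steps:
            if isinstance(step, SequenceFSM):
                # Hand control to the nested machine and resume where it stops.
                pos = step.match(events, pos)
                if pos is None:
                    return None
            elif pos < len(events) and events[pos] == step:
                pos += 1
            else:
                return None
        return pos


# Hypothetical patterns: a small machine reused as one state of a bigger one.
read_file = SequenceFSM("read_file", ["open", "fstat", "read", "close"])
load_config = SequenceFSM("load_config", ["stat", read_file])

events = ["stat", "open", "fstat", "read", "close", "brk"]
print(load_config.match(events))
```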

Also started trying to use the FSMs on another dataset that Alan had collected. Turns out this dataset had a bunch of new syscalls that my previous parser hadn't seen before so it required a bit of updating.

05 Sep 2016

Libtrace 4.0.0 is now out of beta and considered ready for general release.

We've fixed quite a few bugs over the course of the beta. More details can be found on the ChangeLog page on the libtrace wiki. However, while we're no longer in beta, there may still be a few bugs out there -- don't hesitate to report any problems you find to us at contact [at] wand [dot] net [dot] nz.

Another major change since the beta release is that we've re-licensed libtrace and libpacketdump to be under the LGPL v3 (rather than the GPL v2). Hopefully this will encourage people who were turned off by the restrictions of the GPL to now adopt libtrace for their packet capture and analysis needs.

This version of libtrace includes an all new API that resulted from Richard Sanger's Parallel Libtrace project, which aimed to add the ability to read and process packets in parallel to libtrace. Libtrace can now also better leverage any native parallelism in the packet source, e.g. multiple streams on DAG, DPDK pipelines or packet fanout on Linux interfaces.

Please note that the old libtrace 3 API is still entirely intact and will continue to be supported and maintained throughout the lifetime of libtrace 4. All of your old libtrace 3 programs should still build and run happily against libtrace 4; please let us know if this turns out to not be the case so we can fix it!

Learn about the new API and how parallel libtrace works by reading the Parallel Libtrace HOWTO.

Download the new release from the libtrace website.

05 Sep 2016

Libwandio 1.0.4 has been released today.

The main change in this release is that the licensing has moved from GPL v2 to LGPL v3.

The other major change is that we've hopefully finally fixed all of the segmentation faults that would occur if you used wandio on a 32-bit system.

More details on the changes in this release can be found in the Changelog file included with the libwandio source code.

You can download the new version of libwandio from our website.

02 Sep 2016

Released new versions of libprotoident and libflowmanager with the new LGPL licensing. Also re-licensed and tested potential libtrace and wandio releases but haven't quite got to the stage where I want to push out the releases just yet.

Continued messing around with deriving FSMs from common system call patterns and turning them into runnable code. I've got 8 FSMs drawn up and have implemented 5 of them. Developed a bit of backend for applying my FSMs to the log data so that I can implement new FSMs with the least amount of coding possible (e.g. common actions like checking fd consistency and making sure parameters match expected values are all done within a parent FSM class and the child classes just list the relevant data to compare against). Hopefully this will help move towards automated generation of the FSM code.
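The parent / child split described above could be sketched roughly as follows. Everything here is an assumption for illustration: the class names, the dict-based event format and the example syscall pattern are not taken from the actual backend.

```python
class SyscallFSM:
    """Parent class: all the generic checking lives here."""

    # Subclasses list (syscall name, required parameter values) pairs.
    EXPECTED = []

    def matches(self, events):
        if len(events) != len(self.EXPECTED):
            return False
        fd = None
        for event, (name, params) in zip(events, self.EXPECTED):
            if event["name"] != name:
                return False
            # Parameters must match the expected values.
            if any(event.get(key) != value for key, value in params.items()):
                return False
            # fd consistency: every event carrying an fd must use the same one.
            if "fd" in event:
                if fd is None:
                    fd = event["fd"]
                elif event["fd"] != fd:
                    return False
        return True


class ReadConfigFile(SyscallFSM):
    """Child class: only data, no matching logic."""

    EXPECTED = [("open", {"flags": "O_RDONLY"}), ("read", {}), ("close", {})]


good = [
    {"name": "open", "flags": "O_RDONLY", "fd": 3},
    {"name": "read", "fd": 3},
    {"name": "close", "fd": 3},
]
print(ReadConfigFile().matches(good))
```

Because the child classes are pure data, generating them automatically would only require emitting the `EXPECTED` list.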

Had a few meetings where we discussed the FSM approach (and RA3 in general) with a few of the industry partners and they seem reasonably pleased with what we are trying to achieve so that's reassuring.

Helped Brendon try to debug some issues with data not appearing on graphs on the recently updated deployment. As a result of this, we've realised we need to re-think how we are storing and presenting traceroute data so that we can avoid these problems in the future.

29 Aug 2016

A new version of libflowmanager has also been released today.

Once again, the main change is that the licensing has moved from GPL v2 to LGPL v3.

We've also made some changes to make it easier to experiment with different flow expiry algorithms. Flow expiry behaviour is now implemented as separate plugins, rather than being hard-coded into libflowmanager itself. This means if you like the structure of libflowmanager but don't agree with our timeouts for inactive flows, you are able to write your own without having to touch the core of the library. We also added a couple of other config options that allow you to further tweak timeout behaviour -- see the ChangeLog included with the source code for more details.
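To illustrate the general idea of pluggable expiry (and only the idea), here is a small sketch. It is not libflowmanager's actual interface: the class and method names are hypothetical and the 30 and 300 second timeouts are arbitrary values picked for the example.

```python
class ExpiryPolicy:
    """Hypothetical plugin interface: how long may an idle flow live?"""

    def idle_timeout(self, proto):
        raise NotImplementedError


class ShortUdpTimeout(ExpiryPolicy):
    """Example custom policy: expire idle UDP flows sooner than the rest."""

    def idle_timeout(self, proto):
        return 30 if proto == "udp" else 300


class FlowTable:
    """Core flow tracking; it never needs to know which policy is in use."""

    def __init__(self, policy):
        self.policy = policy
        self.flows = {}  # flow id -> (protocol, time the last packet was seen)

    def packet(self, flow_id, proto, now):
        self.flows[flow_id] = (proto, now)

    def expire(self, now):
        """Remove and return the ids of flows idle for too long."""
        expired = [
            flow_id
            for flow_id, (proto, last) in self.flows.items()
            if now - last >= self.policy.idle_timeout(proto)
        ]
        for flow_id in expired:
            del self.flows[flow_id]
        return expired


table = FlowTable(ShortUdpTimeout())
table.packet("a", "udp", now=0)
table.packet("b", "tcp", now=0)
print(table.expire(now=60))
```

Swapping in a different policy means writing another small class and passing it to the table, without touching the table itself.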

You can download the new version of libflowmanager from our website.

We've also put libflowmanager up on our github, so you can follow any future libflowmanager development more closely.

29 Aug 2016

Libprotoident 2.0.9 has been released today.

The biggest change in this release is that libprotoident is now using the LGPL v3 license rather than the GPL v2 license. We hope that this will be welcome news to some people who had previously wanted to use libprotoident in their software but were put off by the restrictions of the GPL license. Note that we are aware that our other libraries (libtrace, libflowmanager, wandio) that libprotoident depends on are still GPL -- rest assured that LGPL versions of these libraries will appear soon.

We've also added support for another 12 new application protocols, including Facebook Messenger, Facebook Zero, Overwatch and Baidu Yun P2P. We've improved the rules for a further 16 protocols such as Google Hangouts, Minecraft, QUIC, World of Warcraft and DOTA2.

As always, the full list of changes can be found in the libprotoident ChangeLog.

Download libprotoident 2.0.9 here!

26 Aug 2016

Started looking at the most common patterns in my example sysdig logs. It's pretty obvious that we can easily recognise some low-level actions based on the sequence of system calls and produce models that can be used to identify them. For example, loading a .so shared library will generally result in the same sequence of system calls (with some minor variations) and therefore that can be expressed as a finite state machine.

Developed FSMs for four low level actions: loading a .so library, loading a python module, receiving a typed character via ssh and reading a modprobe config file. Implemented the SSH action as code so I can now find and replace those sequences in the logs with a single SSHCharInput action.
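Replacing matched sequences with a single action could look roughly like this. The sketch matches an exact, fixed pattern rather than running a full FSM, and the syscall pattern used for SSHCharInput is a made-up placeholder, not the real sequence found in the logs.

```python
def collapse(events, pattern, action):
    """Replace each non-overlapping occurrence of `pattern` with `action`."""
    out = []
    i = 0
    n = len(pattern)
    while i < len(events):
        if n and events[i:i + n] == pattern:
            out.append(action)
            i += n  # skip over the whole matched sequence
        else:
            out.append(events[i])
            i += 1
    return out


# Placeholder pattern; the actual SSH character sequence will differ.
ssh_char = ["select", "read", "write"]
log = ["open", "select", "read", "write", "select", "read", "write", "close"]
print(collapse(log, ssh_char, "SSHCharInput"))
```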

Helped Brendon install NNTSC, ampy and amp-web packages on one of our existing deployments on Thursday. We ended up with a problem where NNTSC would not return query data to the website and it took a lot of time (and debugging) to find the source of our problem: mismatched versions of psycopg2 in pip vs the Debian package.
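A quick way to spot this kind of mismatch is to ask Python which copy of a module it actually imported. The helper below is a generic sketch (the function name is made up, and it uses `importlib.metadata`, which needs Python 3.8 or newer); it is demonstrated on the standard library's `json` module so that it runs anywhere, but the same call with `"psycopg2"` would show the version and install path in use.

```python
import importlib
from importlib import metadata


def describe_module(name):
    """Return (imported version, file path, installed distribution version)."""
    module = importlib.import_module(name)
    try:
        dist_version = metadata.version(name)
    except metadata.PackageNotFoundError:
        # No installed distribution metadata under this name.
        dist_version = None
    return (
        getattr(module, "__version__", None),
        getattr(module, "__file__", None),
        dist_version,
    )


# The file path shows which installation was picked up at import time.
print(describe_module("json"))
```

If the imported version and the distribution version disagree, or the path points somewhere unexpected, two installs are probably shadowing each other.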

Started prepping a libprotoident release. libprotoident is moving to an LGPL license so I've had to replace the blurb at the top of every source file. Been working through the usual pre-release testing and ChangeLog updating.

Spent Wednesday at the Honours conference. I thought all of our students presented well and gave good accounts of their work so far.