Shane Alcock's Blog

06 Jan 2017

Libprotoident 2.0.10 has been released.

This release includes rules to match new traffic patterns for many of the protocols that we introduced in the 2.0.9 release. We've also added two new protocols: BACnet and Maxicloud.

This release also no longer treats TCP keepalive packets as payload-bearing.

The full list of updated protocols can be found in the new libprotoident ChangeLog.

Download libprotoident 2.0.10 here!

19 Dec 2016

Tidied up and documented the FSM extraction code, so that I'll be able to remember how it works when I start working on it again in earnest next year.

Finished the matrix layout / selection changes and merged them back into develop. Hopefully we will get a chance to roll these out early next year once Brendon builds some new packages.

Last week I ran a test capture for a few days to make sure that some changes Richard had made to libtrace had not broken the DAG and RT inputs. Ran the resulting traces through libprotoident to see if there were any new protocols worth investigating. Managed to make a few improvements to the rules for existing protocols to catch cases that we were missing, but otherwise nothing particularly exciting cropped up.

12 Dec 2016

In Wellington for the STRATUS forum on Monday. Had a few interesting chats -- there are definitely a lot of people out there interested in anomaly detection in a variety of contexts.

Continued refining my FSM generation code. Managed to get rid of most of the obviously incorrect transitions in my test cases now. There's still a bit of work to do in terms of tidying up some orphaned states that are left over as a result of the code realising they are redundant and trying to choose better start states, but my main focus before the end of the year will be tidying up the code and making sure it is sufficiently documented so I'll be able to pick it up again in the new year.

Fixed a bunch of small problems with amp-web and NNTSC that we've known about for a while. Started working on replacing the matrix selection tabs with dropdowns and combining related "tabs" into a single matrix type, e.g. http duration and http page size are combined into a single "http" matrix with the ability to change the metric using a dropdown.

06 Dec 2016

Continued working on the automatic FSM generation code. Managed to implement my earlier solution to the problem of loop recognition -- the core principle is that every candidate sequence is turned into its own suffix tree, which we can use to identify repeated patterns within the candidate itself. I've also placed an upper limit on the length of candidates extracted from my original suffix tree, so that we do not waste time dealing with candidates that are obviously too long to represent a single action.
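
As a rough illustration of loop recognition inside a single candidate, the sketch below looks for the tandem repeat (a unit repeated back to back) that covers the most of a sequence. It uses a brute-force scan rather than a per-candidate suffix tree, so it only shows the kind of answer being looked for, not the approach described above. The function names and the max_len parameter are hypothetical.

```python
def find_tandem_repeat(seq, min_repeats=2):
    """Return (start, unit, count) for the tandem repeat covering the most
    elements of seq, or None if nothing repeats back to back.

    Brute force: try every start position and every unit length.
    Ties are resolved in favour of the earliest start and shortest unit.
    """
    seq = list(seq)
    n = len(seq)
    best = None  # (start, unit, count, covered)
    for start in range(n):
        for unit_len in range(1, (n - start) // min_repeats + 1):
            unit = seq[start:start + unit_len]
            count = 1
            pos = start + unit_len
            # Count how many times the unit repeats immediately afterwards.
            while seq[pos:pos + unit_len] == unit:
                count += 1
                pos += unit_len
            covered = count * unit_len
            if count >= min_repeats and (best is None or covered > best[3]):
                best = (start, tuple(unit), count, covered)
    return None if best is None else best[:3]


def drop_long_candidates(candidates, max_len):
    """Discard candidates longer than max_len (a hypothetical cut-off)."""
    return [c for c in candidates if len(c) <= max_len]
```

For a candidate such as open, read, write, read, write, read, write, close, this reports that the unit (read, write) starts at index 1 and repeats three times, which is the information needed to draw a loop rather than three separate paths.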

Once I had that working, I added some pydot code to generate visual representations of my state machines. Not only does this give me something to show people, it is also very handy for spotting incorrect transitions introduced by my code. Spent the rest of the week chasing down various errors in my machines and testing across a handful of different input datasets.
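
For anyone wanting to try something similar, here is a minimal sketch that writes a state machine out as Graphviz DOT text. It builds the text by hand instead of going through pydot, and the representation of the machine (a dict mapping (state, syscall) to the next state) is an assumption made for this example, as is the choice to mark the start state with a double circle.

```python
def fsm_to_dot(transitions, start, name="fsm"):
    """Render an FSM as DOT text.

    transitions: dict mapping (source_state, label) -> destination_state.
    This layout is an assumption for the sketch, not the real data structure.
    """
    lines = [f"digraph {name} {{", "    rankdir=LR;"]
    # Highlight the start state so it is easy to find in the picture.
    lines.append(f'    "{start}" [shape=doublecircle];')
    for (src, label), dst in sorted(transitions.items()):
        lines.append(f'    "{src}" -> "{dst}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    example = {
        ("S0", "open"): "S1",
        ("S1", "read"): "S1",   # self-loop: repeated reads
        ("S1", "close"): "S0",
    }
    print(fsm_to_dot(example, "S0"))
```

Saving the output to a file and running it through the Graphviz command line (for example dot -Tpng fsm.dot -o fsm.png) produces an image, and a stray edge between two states tends to stand out immediately.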

28 Nov 2016

Back to work on Wednesday after my brief holiday in LA. Caught up on a backlog of email and bug reports, then started getting back into my FSM generation work for STRATUS.

Fixed some bugs where my code was incorrectly introducing loops into the auto-generated FSM. Continued testing with some of the logs that Alan had collected from specific applications and making sure that the resulting machines appear sane.

The most obvious outstanding problem now is that my original pattern extraction approach does not recognise cases where a pattern consists of a much shorter subpattern that is being repeated. I'd rather be using the subpattern to create my FSMs, otherwise I end up with FSMs that have a path for pattern X being repeated 4 times, X being repeated 5 times, etc. based on what combinations I saw in my original log. After drawing a lot of suffix trees on the whiteboard, I think I've come up with a solution that might work -- next week's job is to implement and evaluate it.
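
To make the problem concrete, the sketch below reduces a pattern that is nothing but repetitions of a shorter unit down to that unit, then groups observed patterns by unit so that "X four times" and "X five times" land in the same bucket. It is a simple divisor check, not the suffix tree idea mentioned above, and the function names are made up for the example.

```python
from collections import defaultdict


def smallest_repeating_unit(seq):
    """Return (unit, count) such that unit repeated count times equals seq.

    A pattern with no internal repetition comes back as (seq, 1).
    """
    seq = tuple(seq)
    n = len(seq)
    for unit_len in range(1, n + 1):
        # Only lengths that divide n can tile the whole pattern.
        if n % unit_len == 0 and seq[:unit_len] * (n // unit_len) == seq:
            return seq[:unit_len], n // unit_len
    return seq, 0  # only reached for an empty pattern


def group_by_unit(patterns):
    """Map each repeating unit to the set of repeat counts that were seen."""
    groups = defaultdict(set)
    for pattern in patterns:
        unit, count = smallest_repeating_unit(pattern)
        groups[unit].add(count)
    return dict(groups)
```

With X = (read, write), feeding in X repeated 4 times and X repeated 5 times gives a single entry for X with counts {4, 5}, which is a much better starting point for one FSM with a loop than two separate paths.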

23 Nov 2016

Double report due to being away at IMC.

Gave a practice run of my IMC talk. Still needed to carve out some content and streamline it a bit, so spent more time working on that.

Documented the process for adding a new metric (i.e. a new graph/matrix for an existing collection) to the AMP website. Worked through an example by adding HTTP page size to my development website. Identified a number of issues with some of the terminology in the amp-web code that need to be fixed in the long run, but this will probably require a decent bit of code re-architecting.

Attended IMC in Santa Monica. Talk seemed to go over pretty well and it was great to catch up with people I had met at AIMS earlier this year, as well as some familiar faces from previous IMCs. Took the opportunity to have a brief holiday in L.A. afterwards.

07 Nov 2016

Continued reading over versions of Dimeji's and Richard's papers and providing them with (hopefully) useful feedback.

Continued working on my IMC talk. All of the content is in place now, but it's probably a bit long. Will try to refine more before I leave.

More tweaking of the syscall analysis code to try and make sure the output matches my expectations in each example that I've tested it against. The algorithm still doesn't work too well in the presence of multi-syscall loops, so I'll need to think of an approach to recognise and represent loops rather than treating each iteration of the loop as a branch.

28 Oct 2016

Spent a couple of days reading over Richard S's paper and providing feedback.

Continued keeping an eye on the influx-nntsc test deployment. Pretty happy with it so Brendon and I will start working on packaging everything and rolling it out to skeptic next week.

Started working on an outline for my IMC talk.

Got some initial results back to Harris and Alan for their experiment using my suffix tree code. Had to rewrite a previously recursive algorithm to be iterative so that it would work with some of the larger syscall logs, since Python is hopeless at deep recursion.
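
For context, CPython's default recursion limit is 1000 frames, so a recursive walk over a deep tree fails with RecursionError long before memory runs out. The usual fix is an explicit stack. The sketch below shows the pattern on a tree stored as nested dicts, which is an assumption for the example and not how the suffix tree code stores its nodes.

```python
def max_depth(tree):
    """Return the depth of the deepest node in a nested-dict tree.

    Uses an explicit stack instead of recursion, so the depth of the tree
    is limited by memory rather than by the interpreter's recursion limit.
    """
    deepest = 0
    stack = [(tree, 0)]
    while stack:
        node, depth = stack.pop()
        deepest = max(deepest, depth)
        for child in node.values():
            stack.append((child, depth + 1))
    return deepest


def build_chain(length):
    """Build a degenerate tree: a single chain of the given length."""
    root = {}
    node = root
    for _ in range(length):
        child = {}
        node["x"] = child
        node = child
    return root
```

A chain of 5000 nodes is handled without trouble by this version, whereas the naive recursive equivalent would hit the recursion limit unless it was raised with sys.setrecursionlimit.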

Migrated the iterative version back into my automatic FSM construction code, which I resumed looking at on Friday. Still finding plenty of cases where variant patterns are not being combined into the original FSM correctly, so this has mostly involved a lot of debugging. The code has started to sprawl a bit, so had to take some time to refactor it into a manageable state.

21 Oct 2016

Spent most of my week working on turning system call patterns extracted by my suffix tree into workable FSMs. So far the focus has been on recognising which patterns are variations on previously seen patterns and creating "branches" in my internal representations of those patterns that incorporate the allowed variations (ideally, without creating any invalid transitions). I also have to account for situations where the pattern has been "shifted", so naively looking at the edit distance between two patterns doesn't work too well.
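
A small sketch of why "shifted" patterns defeat a naive position-by-position comparison: two sequences taken from the same loop can differ at every index and still be the same cycle started at a different point. The helper below (a hypothetical name, not taken from the real code) checks whether one sequence is a rotation of another and reports the offset.

```python
def rotation_offset(a, b):
    """Return i such that rotating a left by i gives b, or None.

    Works by sliding a window over a + a, which contains every rotation of a.
    """
    a, b = list(a), list(b)
    if len(a) != len(b):
        return None
    if not a:
        return 0
    doubled = a + a
    n = len(a)
    for i in range(n):
        if doubled[i:i + n] == b:
            return i
    return None


def mismatches(a, b):
    """Count positions where two equal-length sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)
```

For read, write, poll versus write, poll, read, every position mismatches, yet rotation_offset reports a shift of 1, so the two are better treated as the same pattern than as distant variants.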

The other challenge that I've run into, especially with shifted variants where the pattern repeats, is trying to determine the correct start state. In some cases, there are other variants to the pattern that indicate where the start could be -- i.e. the first system call that is common to all variants is probably the starting point -- but this extra information will not always be available.
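
One literal reading of that heuristic, sketched under the assumption that each variant is simply a list of system call names: take the first call in one variant that also appears in every other variant. The function name is hypothetical, and the choice of which variant to scan is arbitrary, which is part of why this only works when the variants happen to carry enough information.

```python
def guess_start_call(variants):
    """Return the first call in variants[0] that occurs in every variant.

    Returns None when there are no variants or no call is shared by all.
    """
    if not variants:
        return None
    # Calls that appear in every variant, regardless of position.
    common = set(variants[0]).intersection(*(set(v) for v in variants[1:]))
    for call in variants[0]:
        if call in common:
            return call
    return None
```

With a single variant every call is trivially "common", so the function just returns the first call, which mirrors the case where the extra information is not available.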

End result: I've got an algorithm that seems to work as expected on the first couple of examples I've looked at. It'll need more testing on a wider variety of cases and there are still some outstanding situations that I know are not dealt with as well as I would like, e.g. loops that contain multiple distinct system calls.

Changed direction a bit to help Harris and Alan with an experiment they are running that tries to map application log entries to system call patterns. Again using my suffix tree code, I am pulling out common system call patterns and reporting the pids and start times of all instances where those patterns appear in the system call logs. Alan will then see how well those correlate with the entries in the application logs.
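
The reporting step can be sketched roughly as follows. The log format here, a time-ordered list of (timestamp, pid, syscall) tuples, is an assumption for the example, and the matching is a plain sliding window rather than a suffix tree lookup.

```python
from collections import defaultdict


def find_pattern_instances(events, pattern):
    """Return (pid, start_time) for every occurrence of pattern.

    events: iterable of (timestamp, pid, syscall), already in time order.
    A pattern only matches consecutive calls made by the same pid.
    """
    pattern = tuple(pattern)
    if not pattern:
        return []
    # Split the interleaved log into one call sequence per process.
    per_pid = defaultdict(list)
    for ts, pid, call in events:
        per_pid[pid].append((ts, call))
    hits = []
    for pid, calls in per_pid.items():
        names = tuple(call for _, call in calls)
        for i in range(len(names) - len(pattern) + 1):
            if names[i:i + len(pattern)] == pattern:
                hits.append((pid, calls[i][0]))
    # Report in order of start time, then pid.
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))
```

The resulting (pid, start time) pairs are the sort of thing that can then be lined up against timestamps in an application log.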

Spent some time reading over Dimeji's paper and offering feedback.

17 Oct 2016

Added a new collection to NNTSC for storing traceroute path lengths. This allows me to store the path lengths in Influx (for fast matrix aggregation), while keeping the full traceroute data in postgres. Updated ampy and amp-web to use the new collection, so we now have better matrix performance for all data types. NNTSC memory usage still seems to be fairly stable, which is good news.

Made a few final tweaks to my NNTSC paper before submitting on Friday.

Started looking into how I can use the common sequences extracted by my suffix tree code to recognise syscall patterns that can be turned into FSMs. The interesting challenge is identifying and combining variants of the same syscall pattern -- this is still a work in progress but early signs are promising (at least for the one example I've got so far!).
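
To give a feel for what "common sequences" means here, the sketch below counts every short run of consecutive system calls and keeps the ones seen more than once. It is a naive n-gram count standing in for the suffix tree, so it only scales to short patterns, and the parameter names and default values are made up for the example.

```python
from collections import Counter


def common_sequences(calls, min_len=2, max_len=4, min_count=2):
    """Return {sequence: count} for runs of calls seen at least min_count times.

    Every window of min_len..max_len consecutive calls is counted, so
    overlapping occurrences are included in the totals.
    """
    counts = Counter()
    for n in range(min_len, max_len + 1):
        for i in range(len(calls) - n + 1):
            counts[tuple(calls[i:i + n])] += 1
    return {seq: c for seq, c in counts.items() if c >= min_count}
```

On a log of open, read, close, open, read, close, stat this keeps (open, read), (read, close) and (open, read, close), each seen twice. Deciding that the first two are just fragments of the third is exactly the variant-combining problem described above.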