Project Members

Shane Alcock admin

Libprotoident

Libprotoident is a library that performs application layer protocol identification for flows. Unlike many techniques that require capturing the entire packet payload, only the first four bytes of payload sent in each direction, the size of the first payload-bearing packet in each direction and the TCP or UDP port numbers for the flow are used by libprotoident. Libprotoident features a very simple API that is easy to use, enabling developers to quickly write code that can make use of the protocol identification rules present in the library without needing to know anything about the applications they are trying to identify.
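
As a rough illustration of the idea (this is not libprotoident's actual API or rule set), the sketch below classifies a flow using only the inputs listed above. The `FlowSummary` type, the `guess_protocol` function and the three rules are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class FlowSummary:
    """The only per-flow inputs the description above relies on."""
    client_payload: bytes  # first four payload bytes sent by the client
    server_payload: bytes  # first four payload bytes sent by the server
    client_size: int       # size of the first payload-bearing client packet
    server_size: int       # size of the first payload-bearing server packet
    server_port: int
    client_port: int
    transport: str         # "tcp" or "udp"


def guess_protocol(flow: FlowSummary) -> str:
    """Hypothetical rules, for illustration only."""
    if flow.transport == "tcp":
        # HTTP requests begin with a method; responses begin with "HTTP".
        if flow.client_payload in (b"GET ", b"POST", b"HEAD"):
            if flow.server_payload in (b"", b"HTTP"):
                return "HTTP"
        # Both SSH endpoints open with an "SSH-" version banner.
        if flow.client_payload == b"SSH-" and flow.server_payload == b"SSH-":
            return "SSH"
    # A rule may also combine port and payload size: a DNS message
    # always carries at least its 12-byte header.
    if flow.transport == "udp" and flow.server_port == 53 and flow.client_size >= 12:
        return "DNS"
    return "Unknown"
```

A flow that matches none of the rules is reported as "Unknown".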

13 Nov 2017

Continued working on tweaking nDAG to both improve performance and add some handy features such as the ability for clients to recognise when an nDAG monitor has restarted and therefore may have missed some packets. Still got one or two ideas on how to improve performance further, so will try those out before merging the code back into mainline libtrace.
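
One way a client could notice a restarted monitor is sketched below. It assumes every record carries an identifier for the monitor's start time plus a per-stream sequence number. Both fields and the `StreamTracker` class are hypothetical and do not describe the real nDAG format.

```python
class StreamTracker:
    """Tracks one stream; the field names are assumptions, not the nDAG format."""

    def __init__(self):
        self.monitor_start = None  # start-time identifier last seen
        self.next_seq = None       # sequence number expected next
        self.lost = 0              # records known to be missing
        self.restarts = 0          # monitor restarts observed

    def observe(self, monitor_start: int, seq: int) -> str:
        if self.monitor_start is None:
            status = "first"
        elif monitor_start != self.monitor_start:
            # A different start identifier means the monitor restarted,
            # so an unknown number of packets may have been missed.
            self.restarts += 1
            status = "restarted"
        elif seq == self.next_seq:
            status = "ok"
        elif seq > self.next_seq:
            # Same monitor instance, but some records never arrived.
            self.lost += seq - self.next_seq
            status = "gap"
        else:
            # Older than expected (duplicate or reordered): keep our state.
            return "stale"
        self.monitor_start = monitor_start
        self.next_seq = seq + 1
        return status
```

The point of the split is that a restart and an ordinary gap look different to the client: a gap tells it how many records went missing, while a restart only tells it that some unknown amount of traffic may have been lost.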

Started thinking a bit more about how my ETSI monitor is going to work and how much of it will intersect with libtrace. Will probably need to add an etsilive: read format to libtrace with suitable libpacketdump decoders to help with testing and validation, so that seems like a useful starting point.

Added a feature to my daily libprotoident analysis program to tell me what proportion of traffic on the campus network remains unidentified.
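
The calculation itself is simple. A minimal sketch, assuming the analysis produces a byte count per protocol label and that unidentified flows are grouped under a label called "Unknown" (both are assumptions for this example):

```python
def unknown_proportion(bytes_by_protocol: dict, unknown_label: str = "Unknown") -> float:
    """Fraction of all observed bytes that carry the unidentified label."""
    total = sum(bytes_by_protocol.values())
    if total == 0:
        return 0.0  # no traffic observed, so nothing is unidentified
    return bytes_by_protocol.get(unknown_label, 0) / total
```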

06 Nov 2017

Finished adding the core of nDAG client support to libtrace. Still a little bit of polish required before it is officially finished, but it seems to work. Managed to get around 3.5 - 4 Gbps of multicast to the libtrace client without losing anything, which is not too bad. When I increase the data rate beyond that, it looks like the switch is dropping multicast packets rather than the client itself, so I may be starting to run into some hardware limitations.

Spent a bit of time playing around with libtasn1 and the ETSI ASN.1 specification to see how I can use the library to create some ETSI headers for packet encapsulation. Went public with a proposal for an open-source ETSI lawful intercept tool on Friday and have already got some encouraging responses.

Still seeing new patterns in the Waikato traffic, so libprotoident continues to improve. Reached 450 supported protocols this week -- next landmark is 500.

30 Oct 2017

Managed to get the new telescope software running at a decent packet rate. So far we can capture and multicast ~12 million packets per second without issues. The main limitation that prevents us from going any higher is the capacity of the 10Gb interface that we are multicasting on. Pretty happy with that result and now I can focus on ensuring that the clients will be able to keep up.
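
To put the interface limit in perspective, here is a back-of-the-envelope calculation of Ethernet line rate. It counts only the standard per-frame overhead on the wire (8 bytes of preamble and start-of-frame delimiter plus a 12-byte inter-frame gap). The packet sizes and multicast encapsulation overhead in the actual test are not stated above, so this is a sketch rather than a reconstruction of it.

```python
ETHERNET_OVERHEAD = 20  # bytes: preamble + SFD (8) and inter-frame gap (12)


def max_packets_per_second(link_bps: float, frame_bytes: int) -> float:
    """Upper bound on frames per second for a given frame size."""
    return link_bps / ((frame_bytes + ETHERNET_OVERHEAD) * 8)


def max_frame_bytes(link_bps: float, packets_per_second: float) -> float:
    """Largest average frame size that still fits at a given packet rate."""
    return link_bps / (packets_per_second * 8) - ETHERNET_OVERHEAD


# Minimum-size (64-byte) frames on a 10Gb link: about 14.88 million per second.
print(round(max_packets_per_second(10e9, 64)))
# Average frame size that a 10Gb link can carry at 12 million packets per second.
print(round(max_frame_bytes(10e9, 12e6), 2))
```

The second figure comes out at roughly 84 bytes per frame, so at 12 million packets per second a 10Gb link has room for only about 20 bytes per packet beyond a minimum-size frame.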

Started adding nDAG read support to libtrace. This is mostly a matter of adapting my existing test client code to work within the libtrace structure, as well as making sure that there are suitable code paths for each of the three APIs: parallel, single-threaded and event-driven.

Still seeing new protocols every week on the campus network, even with the decreasing number of people who are present on campus. 3 new protocols this week; starting to get close to the 450 mark.

24 Oct 2017

Continued developing the new telescope software. nDAG records are now created and multicast out a specific interface. I also have a test client that is able to join the multicast groups and receive the packet streams. There's also a control channel that is used by the telescope to announce the ports that the streams will be transmitted on.
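
For readers unfamiliar with multicast clients, joining a group with the standard sockets API looks roughly like this. This is a generic sketch, not the test client described above; the group address and port in the usage note are placeholders, whereas a real client would learn the ports from the control channel.

```python
import socket


def membership_request(group: str, local_ip: str = "0.0.0.0") -> bytes:
    """Build the ip_mreq structure: group address, then local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(local_ip)


def join_group(group: str, port: int, local_ip: str = "0.0.0.0") -> socket.socket:
    """Open a UDP socket bound to `port` and subscribe it to `group`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # Allow several clients on the same host to receive the same stream.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group, local_ip))
    return sock
```

A client would then call something like `join_group("239.1.2.3", 9001)` once per announced stream and read datagrams with `recvfrom`.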

Continued tinkering with adding new libprotoident rules. Added another 6 new protocols this week, all games. Updated a few other existing rules as well to cover new variants or fix minor errors.

Had some meetings on Monday re: a possible open-source ETSI-compliant lawful intercept implementation. There's definitely some interest in the community for something open-source to exist.

17 Oct 2017

Helped Jayden with polishing up the final version of his Honours report. Hopefully he is happy with the final result!

Started testing the initial prototype of the DAG multicaster on our development boxes. Had a few issues getting dpdk pktgen to do exactly what I wanted (not helped by the terrible documentation!) but eventually managed to happily capture 10Gb of small packets split across 4 DAG streams with no real issues. Next step is to start encapsulating and multicasting some nDAG records.

Went to the STRATUS forum on Friday, flying down to Wellington on Thursday afternoon. Forum seemed to go pretty well; plenty of people that I spoke to thought that our work so far was interesting.

Released a new version of libprotoident.

09 Oct 2017

Libprotoident 2.0.12 has been released.

This release is mostly a protocol database update. We've added 26 new protocols and updated a further 33 others since the last release.

We've also added a new category for IP Camera protocols. Some already existing protocols have been moved into this category to better reflect their purpose.

The full list of updated protocols can be found in the libprotoident ChangeLog.

Download libprotoident 2.0.12 here!

09 Oct 2017

Started working on the DAG multicaster for STARDUST. Designed an encapsulation protocol for the multicaster and wrote some prototype code using the libdag API to start grabbing bunches of records and give them to the as yet unimplemented multicaster to encapsulate and send.
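
To give a feel for what an encapsulation protocol of this kind involves, the sketch below prepends a small fixed header to a batch of captured records before sending them as one datagram. The header layout (magic, version, stream id, sequence number, record count) is invented for this example and is not the format that was actually designed.

```python
import struct

# Hypothetical layout, network byte order, no padding:
# 4-byte magic, 1-byte version, 2-byte stream id,
# 4-byte sequence number, 2-byte record count.
HEADER = struct.Struct("!4sBHIH")
MAGIC = b"XMPL"


def encapsulate(stream_id: int, sequence: int, records: list) -> bytes:
    """Prefix a batch of raw records with the example header."""
    header = HEADER.pack(MAGIC, 1, stream_id, sequence, len(records))
    return header + b"".join(records)


def parse_header(datagram: bytes):
    """Return (header fields, remaining payload) or raise ValueError."""
    magic, version, stream_id, sequence, count = HEADER.unpack_from(datagram)
    if magic != MAGIC:
        raise ValueError("not an encapsulated record batch")
    fields = {"version": version, "stream_id": stream_id,
              "sequence": sequence, "record_count": count}
    return fields, datagram[HEADER.size:]
```

Batching several records behind one header keeps the per-packet overhead small, which matters when the outgoing interface is the bottleneck.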

Spent some time reading over Jayden's honours report and gave him some (hopefully useful) feedback. The work he has done this year is really interesting; just needs a bit of literary polish so that his markers can fully appreciate it :)

Continued slowly working towards a libprotoident release. The code itself looks ready to go, so I just need to prepare the release announcements. I've updated my paper to include the extra 20 or so protocols that I've added since I started writing up the results -- the paper now covers 435 application protocols.

03 Oct 2017

Spent a decent portion of my week working on my reworked cluster evaluation code for STRATUS. The new version seems to be producing labels that are much more useful, so my ability to evaluate clusters and identify the least conforming members has improved greatly.

Continued to tweak and improve the libprotoident rules. Started working towards a possible 2.0.12 release by updating documentation and running some basic build tests on various operating systems.

25 Sep 2017

Back at work after a couple of weeks disrupted by illness. Spent most of the week working on my application protocol paper. Managed to produce a few interesting looking graphs and am now starting to get a rough idea of how my narrative is going to come together. Essentially, modern application protocols are vague and therefore require a lot more work and expertise to identify. However, they are still possible to identify and there are still plenty of new protocols appearing every year, so DPI hasn't outlived its usefulness entirely yet.

Had a meeting with Alistair from CAIDA about the first steps on the STARDUST project, which is essentially a redevelopment of their telescope to support 10G capture and multiple live clients. Obviously, this is going to build a lot on our experiences so far with parallel libtrace / wdcap -- one of my key jobs will be to develop a new parallel, multicast RT protocol as the old RT protocol simply won't be fast enough anymore.

04 Sep 2017

Ranked every libprotoident protocol by observed bytes in three trace sets (from 2012, 2015 and 2017) as part of an attempt to demonstrate the relative transience of application protocols -- i.e. how protocols can appear, die off, surge or plummet in popularity. Some of the ranks didn't quite make sense, so I had to go back and validate a few of the rules. There were a couple of errors, which meant that I had to re-run the ranking analysis once I had fixed them.
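
The ranking step can be sketched as follows, assuming each trace set has already been reduced to a byte count per protocol. The function names are assumptions for this example, and the protocol names and byte counts in the usage note are placeholders, not measured results.

```python
def rank_by_bytes(bytes_by_protocol: dict) -> dict:
    """Map each protocol to its rank (1 = most bytes); ties break by name."""
    ordered = sorted(bytes_by_protocol.items(), key=lambda kv: (-kv[1], kv[0]))
    return {name: rank for rank, (name, _) in enumerate(ordered, start=1)}


def rank_history(trace_sets: dict) -> dict:
    """For every protocol, list its rank in each trace set (None if absent).

    `trace_sets` maps a label such as a year to a bytes-per-protocol dict;
    ranks are listed in the order the trace sets were supplied.
    """
    ranks = {label: rank_by_bytes(counts) for label, counts in trace_sets.items()}
    protocols = sorted({p for counts in trace_sets.values() for p in counts})
    return {p: [ranks[label].get(p) for label in trace_sets] for p in protocols}
```

For instance, `rank_history({"2012": {"ProtoA": 500, "ProtoB": 300}, "2017": {"ProtoB": 100, "ProtoC": 900}})` shows ProtoA disappearing and ProtoC appearing at the top, which is the kind of transience the ranking is meant to expose.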

Helped Brad with some stress testing of the Endace probe. This had the unfortunate side effect of making Vision unusable for a few days afterwards, but hopefully the process was helpful to Endace in the long run.

Continued working on a method of automating myself.

28 Aug 2017

Back at work on Wednesday after a couple of weeks away. Spent most of my week catching up on emails and preparing for a STRATUS workshop coming up on Monday.

Did manage to spend a little bit of time looking at unknown traffic on the Waikato capture point again. Added a couple of new protocols: Smite and Fliggy. I've also found what appears to be another IP sharing "tool" similar to IPSharkk -- this one looks like it is installed as malware, so the network user is probably unaware that their machine is being used to proxy other people's traffic. Will try to dig a little more into this next week.

31 Jul 2017

Continued researching and writing for my application protocol paper. Added quite a lengthy background section summarising some of the key events and trends in the history of application protocols, which will add a lot of context to my paper.

Also kept investigating new application protocol patterns that continue to appear on the Waikato passive monitor. Added another 5 new protocols this week, so progress continues to be made.

28 Jun 2017

Libprotoident 2.0.11 has been released.

Firstly, this release updates the existing tools to be compatible with both libflowmanager 3 and parallel libtrace. This means that the tools can now take advantage of any parallelism in the traffic source, e.g. streams on a DAG card or a DPDK-capable NIC.

Secondly, we've added 61 new application protocols to our set of detectable protocols, bringing the total supported number of applications to 407. A further 25 existing protocols have been updated to better match new observed traffic patterns.

Finally, there have been a couple of minor bug fixes as well.

Note that this release will require both libflowmanager 3 and libtrace 4, which means that you will likely have to upgrade these libraries prior to installing libprotoident 2.0.11. If this is problematic for you but you still want the new application protocol rules, you can use the '--with-tools=no' option when running ./configure to prevent the tools (which are the reason for the upgraded dependencies) from being built.

The full list of updated protocols can be found in the libprotoident ChangeLog.

Download libprotoident 2.0.11 here!

02 Jun 2017

Libflowmanager 3.0.0 has been released today.

The libflowmanager API has been re-written to be thread-safe (and therefore compatible with parallel libtrace), hence the major version number change.

The old libflowmanager API has been removed entirely; there is no backwards compatibility with previous versions of libflowmanager. If you choose to install libflowmanager 3 then you will need to update your existing code to use the new API. This should not be too onerous in most cases, as most of the old global API functions have simply been replaced with method calls to a FlowManager class instance. The README and example programs demonstrate and explain the new API in detail.

Note that much of our other software that relies on libflowmanager, such as the libprotoident tools and lpicollector, has NOT yet been officially released with libflowmanager 3 support. If you are currently using any of this software, you should continue to use libflowmanager 2.0.5 until we are able to test and release new libflowmanager 3 compatible versions.

You can download both libflowmanager 3 and libflowmanager 2.0.5 from our website.

08 May 2017

Added another 5 protocols to libprotoident -- having a slightly more powerful PC for installing and running various candidate applications has helped quite a bit. Updated the rules for several more protocols as well.

Made some more progress on my protocol taxonomy -- I'm up to 'P' for the TCP protocols so I'm probably about 1/4 of the way through now.

Continued re-factoring the FSM generation code. Getting close to done, although I suspect the number of changes and variable renames will require a fair bit of testing to make sure I've transferred everything across correctly.

Added the ability to choose between TCP and HTTP throughput data on the AMP matrix. To do this, I had to bring the amp-web/nntsc install on prophet back up to date after a few months of being untouched. As always, there were a few issues with dependencies and versioning which slowed everything down, but eventually Brendon and I got it all working correctly.

19 Apr 2017

Slightly disrupted week, with Easter and cyclones having an impact on productivity. Most of my time ended up being spent hunting down more previously unknown protocols. Just three new protocols this week, along with fixes for three more.

On the STRATUS side, I worked on creating a way to "combine" the suffix trees for each individual process so that we can account for sequences that appear frequently in the whole dataset but never more than once or twice within a given process. The original implementation would not recognise those sequences as frequent, because it considered each process individually. I think I've got this working now -- but I'm yet to look at the results too closely.
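
The difference between per-process and combined counting can be shown with a much simpler stand-in for suffix trees: plain n-gram counting. The sketch below is not the STRATUS implementation; the function names, the fixed sequence length `n` and the threshold `min_count` are all assumptions for this example.

```python
from collections import Counter


def ngrams(seq: list, n: int) -> list:
    """All contiguous subsequences of length n, as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def frequent_sequences(processes: dict, n: int, min_count: int):
    """Return (frequent within some single process, frequent across all).

    `processes` maps a process name to its list of system calls.
    """
    combined = Counter()
    per_process = set()
    for calls in processes.values():
        counts = Counter(ngrams(calls, n))
        combined.update(counts)  # accumulate counts over the whole dataset
        per_process.update(g for g, c in counts.items() if c >= min_count)
    overall = {g for g, c in combined.items() if c >= min_count}
    return per_process, overall
```

A sequence such as `("open", "read")` that occurs once in each of three processes is missed when every process is considered on its own with `min_count=3`, but it is found once the counts are combined.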

10 Apr 2017

Continued delving into the unknown traffic on the campus network. Had a mix of frustrating days and successful days -- one protocol (N2Ping) took nearly two days to track down but I got there in the end. Another 8 new protocols added this week, so we're starting to get close to 400 supported protocols in libprotoident.

Another week of refinement on the FSM code. Most of the effort has been focused on loop recognition, particularly in terms of making sure we don't ignore candidates that can be used to identify loops.

03 Apr 2017

Have been using my new daily libprotoident email to make some good progress in terms of adding new protocols to libprotoident. Another 8 protocols added this week, with 5 existing protocols improved as well.

Found a few new bugs in my FSM tandem-repeat code after running it against my full test dataset and doing an initial validation of the resulting machines. Finished up a set of slides describing (broadly) what I'm doing overall with the FSM project and how I'm going about it, i.e. suffix trees, pattern extraction, variant detection and machine building.

Started looking into a parallel RT implementation for libtrace / wdcap, with an eye towards removing the combiner bottleneck from wdcap.

27 Mar 2017

Finished implementing tandem repeat detection within my existing pattern extraction code. The initial results look promising, i.e. the code has been able to identify "write,read" as a repeat in the FTP system call log with no obvious false positives. Next job will be to repeat the machine validation and make sure that I have improved the results overall.
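
To illustrate what a tandem repeat is, here is a brute-force detector. It is not the algorithm used in the pattern extraction code, just a simple scan that compares each candidate unit with the elements immediately following it. The function name, the parameters and the system call log in the usage note are made up for this example.

```python
def find_tandem_repeats(seq: list, max_unit: int = 4, min_copies: int = 2) -> list:
    """Return (start, unit, copies) for runs of back-to-back repeated units.

    Brute force: try every unit length up to `max_unit` at every position.
    A unit that is itself a repetition of a shorter unit may also be reported.
    """
    results = []
    n = len(seq)
    for unit_len in range(1, max_unit + 1):
        i = 0
        while i + 2 * unit_len <= n:
            unit = seq[i:i + unit_len]
            copies = 1
            # Extend the run while the next block equals the unit.
            while seq[i + copies * unit_len:i + (copies + 1) * unit_len] == unit:
                copies += 1
            if copies >= min_copies:
                results.append((i, tuple(unit), copies))
                i += copies * unit_len  # skip past the whole run
            else:
                i += 1
    return results
```

Given the log `["open", "write", "read", "write", "read", "write", "read", "close"]`, the function reports a single repeat: the unit `("write", "read")` occurring three times from index 1.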

Wrote a libprotoident program to perform daily monitoring of unknown payload patterns on the Waikato capture point and send me an email every morning with the 25 "biggest" patterns by payload, as well as a few example flows matching each pattern. Using this data, I've already been able to add a few new patterns to libprotoident and look forward to being more proactive about keeping libprotoident up to date.
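
The aggregation behind a report like this might look as follows. This is a sketch under stated assumptions: each unknown flow is represented as a dict with a `pattern` key (for example, the first four payload bytes in each direction as hex strings), a `bytes` count and a `flow_id`. None of these names come from the real program.

```python
from collections import defaultdict


def top_unknown_patterns(flows, limit: int = 25, examples: int = 3) -> list:
    """Rank payload patterns by total bytes and keep a few example flows each."""
    totals = defaultdict(int)
    samples = defaultdict(list)
    for flow in flows:
        key = flow["pattern"]
        totals[key] += flow["bytes"]
        if len(samples[key]) < examples:
            samples[key].append(flow["flow_id"])
    # Largest byte totals first; ties are broken by the pattern itself.
    ranked = sorted(totals, key=lambda k: (-totals[k], k))[:limit]
    return [(k, totals[k], samples[k]) for k in ranked]
```

Keeping a handful of example flow identifiers alongside each pattern makes it possible to go back to the trace and inspect the matching flows by hand.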

20 Mar 2017

Finished porting the remaining libprotoident tools to be parallel-compatible. Spent a couple of days looking at unknown payload patterns in some recent Uni traffic -- unfortunately I wasn't able to make much tangible progress on identifying the unknown traffic.

Worked on implementing an algorithm for finding tandem repeats in strings, with the eventual aim of porting it over to work with my system call sequences. The published algorithm consists of three phases, but each of those phases has either involved looking up and implementing several other string processing algorithms (LZ-decomposition, longest common extension) or has required modifications to my existing suffix tree code (extracting a suffix array, bottom-up traversal, storing the longest child suffix in each node). Therefore, I'm about half-way through implementing the algorithm.
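
Two of the building blocks mentioned above are easy to show in naive form. The sketch below builds a suffix array by sorting suffixes directly and computes a longest common extension by direct comparison. These are deliberately simple stand-ins, not the suffix-tree-based versions described above, and the function names are assumptions for this example.

```python
def suffix_array(seq) -> list:
    """Start positions of all suffixes, in lexicographic order of the suffixes."""
    return sorted(range(len(seq)), key=lambda i: seq[i:])


def longest_common_extension(seq, i: int, j: int) -> int:
    """Length of the longest common prefix of seq[i:] and seq[j:]."""
    length = 0
    while (i + length < len(seq) and j + length < len(seq)
           and seq[i + length] == seq[j + length]):
        length += 1
    return length
```

The link to tandem repeats is that a unit of length p is repeated at position i exactly when `longest_common_extension(seq, i, i + p) >= p`. In "banana", the extension between positions 1 and 3 has length 3, which is at least 2, so "an" repeats there.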

Moved libtrace into its own github organization to reflect that libtrace is now going to be more of a community project than a WAND project. I'll still be helping out with maintaining it for now, but now the workload can be shared amongst a group of trusted libtrace users (including people outside of WAND). This will hopefully keep libtrace well looked-after, even as my available time gets more and more restricted.