Libprotoident is a library that performs application layer protocol identification for flows. Unlike many techniques that require capturing the entire packet payload, only the first four bytes of payload sent in each direction, the size of the first payload-bearing packet in each direction and the TCP or UDP port numbers for the flow are used by libprotoident. Libprotoident features a very simple API that is easy to use, enabling developers to quickly write code that can make use of the protocol identification rules present in the library without needing to know anything about the applications they are trying to identify.
Managed to write libprotoident rules for a couple of new applications, WeChat and Funshion. Released a new version of libprotoident (2.0.7).
Added support for the AMP DNS test to NNTSC, netevmon and amp-web. Wrote a new detector that looks for changes in response codes, e.g. the DNS response going from NOERROR to REFUSED or some other error state. This should also be useful for the HTTP test in the future.
Fixed a bug in the ChangepointDetector where it wasn't dealing well with streams that featured large values (i.e. >100,000). Also spent a bit more time tweaking the Plateau detector, mainly dealing with problems that show up when either the mean or the standard deviation are very small.
This release adds support for 14 new protocols including League of Legends, WhatsApp, Funshion, Minecraft, Kik and Viber. A new category for Caching has also been added.
A further 13 protocols have had their rules refined and improved including Steam, BitTorrent UDP, RDP, RTMP and Pando.
This release also fixes the bug where flows were erroneously being classified as No Payload, despite payload being present.
The full list of changes can be found in the libprotoident ChangeLog.
Short week due to remaining in Aus for a holiday after LCN.
Upon my return, I spent a bit of time trying to capture traffic for WhatsApp and other mobile messaging services. I had earlier found some flows that were possibly WhatsApp in some traffic I had captured before going away and wanted to confirm it.
It turned out to be a bit trickier to get this traffic than originally anticipated. WhatsApp required a mobile phone number to register an account so we needed to acquire a couple of new 2degrees SIM cards and receive the confirmation text messages on them. Also, the Android VM that we had created for this purpose wouldn't install WhatsApp because the image was intended for a tablet rather than a phone so we had to use Blue Stacks instead.
I also captured traffic for Kik, another similar application, and found that we were erroneously classifying Kik traffic as Apple Push notifications as they both use SSL on port 5223. Fortunately, some very subtle differences in the SSL handshake allowed me to write a rule that could reliably identify Kik traffic. Also tried to capture GroupMe traffic but could not reliably receive the text message required to register an account.
Spent most of Friday going over events reported by the Plateau detector in netevmon and made a number of tweaks which should hopefully make it quicker to pick up on obvious changes in latency time series as well as more reliable than before.
Spent most of the week in Sydney attending the LCN 2013 conference. Gave my presentation in the Workshop on Network Measurement to little fanfare.
Learned a few things at the conference:
* Named Data Networking exists and some people are taking it seriously: http://named-data.net/ . My first thoughts were that we've had enough trouble getting people to adopt a new IP version, let alone a system that completely changes how routers work.
* Lots of people are still using ns-2 to validate their research.
* The bar for publication is pretty low in some conferences / workshops, as long as you do something "innovative".
Spent most of the week preparing for my Sydney trip. Wrote the talk I will be presenting this coming Thursday and gave a practice rendition on Friday.
The rest of my time was spent fixing minor issues in Cuz -- trying not to break anything major before I go away for a week. Replaced the bad SQLAlchemy code in the ampy netevmon engine with some psycopg2 code, which should make us slightly more secure. Also tweaked some of the event display stuff on the dashboard so that useful information is displayed in a sensible format, i.e. less '|' characters all over the place.
Had a useful meeting with Lightwire on Wednesday. Was pleased to hear that their general impression of our software is good and will start working towards making it more useful to them over the summer.
Open-source payload-based trafﬁc classifiers are frequently used as a source of ground truth in the trafﬁc classification research ﬁeld. However, there have been no comprehensive studies that provide evidence that the classifications produced by these software tools are sufﬁciently accurate for this purpose. In this paper, we present the results of an investigation into the accuracy of four open-source trafﬁc classifiers (L7 Filter, nDPI, libprotoident and tstat) using packet traces captured while using a known selection of common Internet applications, including streaming video, Steam and World of Warcraft. Our results show that nDPI and libprotoident provide the highest accuracy among the evaluated trafﬁc classiﬁers, whereas L7 Filter is unreliable and should not be used as a source of ground truth.
To be published at the Workshop on Network Measurements (WNM 2013), October 2013.
Copyright (C) IEEE 2013.
Made a number of minor changes to my paper on open-source traffic classifiers in response to reviewer comments.
Modified the NNTSC exporter to inform clients of the frequency of the datapoints it was returning in response to a historical data request. This allows ampy to detect missing data and insert None values appropriately, which will create a break in the time series graphs rather than drawing a straight line between the points either side of the area covered by the missing data. Calculating the frequency was a little harder than anticipated, as not every stream records a measurement frequency (and that frequency may change, e.g. if someone modifies the amp test schedule) and the returned values may be binned anyway, at which point the original frequency is not suitable for determining whether a measurement is missing.
Added support for the remaining LPI metrics to NNTSC, ampy and amp-web. We are now drawing graphs for packet counts, flow counts (both new and peak concurrent) and users (both active and observed), in addition to the original byte counts. Not detecting any events on these yet, as these metrics are very different to what we have at the moment so a bit of thought will have to go into which detectors we should use for each metric.
Added support for the Libprotoident byte counters that we have been collecting from the red cable network to netevmon, ampy and amp-web. Now we can visualise the different protocols being used on the network and receive event alerts whenever someone does something out of the ordinary.
Replaced the dropdown list code in amp-web with a much nicer object-oriented approach. This should make it a lot easier to add dropdown lists for future NNTSC collections.
Managed to get our Munin graphs showing data using a Mbps unit. This was trickier than anticipated, as Munin sneakily divides the byte counts it gets from SNMP by its polling interval but this isn't very prominently documented. It took a little while for myself, Cathy and Brad to figure out why our numbers didn't match those being reported by the original Munin graphs.
Chased down and fixed a libtrace bug where converting a trace from any ERF format (including legacy) to PCAP would result in horrendously broken timestamps on Mac OS X. It turned out that the __BYTE_ORDER macro doesn't exist on BSD systems and so we were erroneously treating the timestamps as big endian regardless of what byte order the machine actually had.
Migrated wdcap and the LPI collector to use the new libwandevent3
Changed the NNTSC exporter to create a separate thread for each client rather than trying to deal with them all asynchronously. This alleviates the problem where a single client could request a large amount of history and prevent anyone else from connecting to the exporter until that request was served. Also made NNTSC and netevmon behave more robustly when a data source disappears -- rather than halting, they will now periodically try to reconnect so I don't have to restart everything from scratch when I want to apply changes to one component.
Finally, my paper on comparing the accuracy of various open-source traffic classifiers was accepted for WNM 2013. There's a few minor nits to possibly tidy up but it shouldn't require too much work to get camera-ready.
There is currently an increasing demand for accurate and reliable traffic classiﬁcation techniques. Libprotoident is a library developed at the WAND Network Research Group (WAND) that uses four bytes of payload for the classiﬁcation of ﬂows. Testing has shown that Libprotoident achieves similar classiﬁcation accuracy to other approaches, while also being more efficient in terms of speed and memory usage. However, the primary weakness of Libprotoident is that it lacks the visualisation component required to encourage adoption of the library.
This report describes the implementation of a reliable real-time collector for Libprotoident that will form the back-end component to support a web-based visualisation of the statistics produced by the library. The collector has been designed and implemented to support the classiﬁcation of ﬂows and exporting of application usage statistics to multiple clients over the network in separate threads, whilst operating asynchronously so as to achieve high performance when measuring multi-gigabit networks.
Spent a little time reviewing my old YouTube paper in preparation for discussing it in 513.
Tracked down and fixed a few outstanding bugs in my new and improved anomaly_ts. The main problem was with my algorithm for keeping a running update of the median -- I had a rather obscure bug when inserting a new value that was between the two values I was averaging to calculate the median that was causing all sorts of problems.
Added an API to ampy for querying the event database. This will hopefully allow us to add little event markers on our time series graphs. Also integrated my code for querying data for Munin time series into ampy.
Churned out a revised version of my L7 filter paper for the IEEE Workshop on Network Measurements. I have repositioned the paper as an evaluation of open-source payload-based traffic classifers rather than a critique of L7 filter. I also spent a fair chunk of time replacing my nice pass-fail system for representing results with the exact accuracy numbers because apparently reviewers found the former confusing.
Tried to continue my work in tidying up and releasing various trace sets, but ran into some problems with my rsyncs being flooded out over the faculty network. This was quite a nuisance so we need to be more careful in future about how we move traces around (despite it not really being our fault!).
Very short week this week, but managed to get a few little things sorted.
Added a new dataparser to NNTSC for reading the RRDs used by Munin, a program that Brad is using to monitor the switches in charge of our red cables. The data in these RRDs is a lot noisier than smokeping data, so it will be interesting to see how our anomaly detection goes with that data. Also finally got the AMP data actually being exported to our anomaly detector - the glue program that converted NNTSC data into something that can be read by anomaly_ts wasn't parsing AMP records properly.
Spent a bit of time working on adding some new rules to libprotoident to identify previously unknown traffic in some traces sent to me by one of our users.
Spent Friday afternoon talking with Brian Trammell about some mutual interests, in particular passive measurement of TCP congestion window state and large-scale measurement data collection, storage and access. In terms of the latter, it looks many of the design decisions we have reached with NNTSC are very similar to those that he had reached with mPlane (albeit mPlane is a fair bit more ambitious than what we are doing) which I think was pretty reassuring for both sides. Hopefully we will be able to collaborate more in this space, e.g. developing translation code to make our data collection compatible with mPlane.
Exporting from NNTSC is now back to a functional state and the whole event detection chain is back online. Added table and view descriptions for more complicated AMP tests; traceroute, http2 and udpstream are now all present. Hopefully we can get new AMP collecting and reporting data for these tests soon so we can test whether it actually works!
Had some user-sourced libtrace patches come in, so spent a bit of time integrating these into the source tree and testing the results. One simply cleans up the libpacketdump install directory to not create as many useless or unused files (e.g. static libraries and versioned library symlinks). The other adds support for the OpenBSD loopback DLT, which is actually a real nuisance because OpenBSD isn't entirely consistent with other OS's as to the values of some DLTs.
Helped Nathan with some TCP issues that Lightwire were seeing on a link. Was nice to have an excuse to bust out tcptrace again...
Looks like my L7 Filter paper is going to be rejected. Started thinking about ways in which it can be reworked to be more palatable, maybe present it as a comparative evaluation of open-source traffic classifiers instead.
Made a few modifications to Brendon's detectors which make them perform better across a variety of AMP time-series. In particular, the Plateau detector no longer uses a fixed percentage of the trigger buffer mean as its event threshold - instead it uses several standard deviations from the history buffer. Also fixed some problems we were having with being in an event and treating all the following measurements that are similar to those that triggered the event as anomalous. This is a problem in cases where the "event" is actually the time series moving to a new normality: our algorithm just kept us in the event state the whole time!
Once I was happy with that, got the eventing code up and running against the events reported by the anomaly detection stage. Had to make a couple of modifications to the protocol used to communicate between the two to get it working properly (there were some hard-coded entries in Brendon's database that needed a more automated way of being inserted). Tried to get the graphing / visualisation stuff going after that, but there are quite a few issues there so that may have to wait a bit.
Started looking into packaging and documenting the usage of all the tools in the chain that we've now got working. First up was Nathan's code, which is proving a bit tricky so far because a) it's python so no autotools and b) his code is rather reliant on other scripts being in certain locations relative to the script being run.
Added another protocol to libprotoident: League of Legends.
Spent a day messing around with the event detection software, mainly seeing how Brendon's detectors work with the existing AMP data. The new "is it constant" calculation seems to be working reasonably well, but there are still a lot of issues with some of the detectors. Need to spend a bit of uninterrupted time with it to really see how it all works.
Had a quick look at the latest ISP traces with libprotoident to see if there are any obvious missing protocols I can add to the library. Added one new protocol (Minecraft) and tweaked a few existing protocols.
Spent the rest of the week at NZNOG, catching up on the state of the Internets. Most of the talks were pretty interesting and it was good to meet up with a few familiar faces.
Decided to replace the PACE comparison in my L7 Filter paper with Tstat, a somewhat well-known open-source program that does traffic classification (along with a whole lot of other statistic collection). Tstat's results were disappointing - I was hoping they would be a lot better so that the ineptitude of L7 Filter would be more obvious, but I guess this does make libprotoident look even better.
Fixed a major bug in the lpicollector that was causing us to insert duplicate entries in our IP and User maps. Memory usage is way down now and our active IP counts are much more in line with expectations. Also added a special PUSH message to the protocol so that any clients will know when the collector is done sending messages for the current reporting period.
Spent a fair chunk of time refining Nathan to a) just work as intended, b) be more efficient and c) be more user-friendly / deployable. I've got it reading data properly from LPI, RRDs and AMP and exporting data in an appropriate format for our event detection code to be able to read.
Started toying with using the event detection code on our various inputs. Have run into some problems with the math used to determine whether a time series is relatively constant or not - this is used to determine which of our detectors should be run against the data.
Got the bad news that the libprotoident paper was rejected by TMA over the weekend. A bit disappointed with the reviews - felt like they were too busy trying to find flaws with the 4-byte approach rather than recognising the results I presented that showed it to be more accurate, faster and less memory-intensive than existing OSS DPI classifiers. Regardless, it is back to the drawing board on this one - looks like it might be the libtrace paper all over again.
Spent most of my week working with Meenakshee's LPI collector. The first step was to move it out of libprotoident and into its own project, complete with trac - this meant that future libprotoident releases are not dependent on the collector being in a usable state. Added support to the collector to track the number of local IP addresses "actively" using a given protocol. This is in addition to the current counter that simply looks at the number of local IP addresses involved in flows using a given protocol - an IP receiving BitTorrent UDP traffic but not responding would not count as actively using the protocol (i.e. the new counter), but would count as having been involved in a flow for that protocol (i.e. the old counter).
After meeting with Lightwire, it was suggested that a LPI collector that could give a protocol breakdown per customer would be very useful. As a result, I added support for this to the collector. In terms of the increased workload, the actual collection process seems to manage ok, but exporting this data over the network to the Nathan database client doesn't work so well.
Added some basic transaction support to Nathan's code, so that all of the insertions from the same LPI report are now inserted using a single transaction. Ideally, though, we need to be able to create transactions that cover multiple LPI reports - perhaps by extending the LPI protocol to be able to send some sort of "PUSH" message to the client to indicate that a batch of reports is complete.
Went over the collector with callgrind to find bottlenecks and suboptimal code. Found a number of optimisations that I could make in the collector, such as caching the name strings and lengths for the supported protocols rather than asking libprotoident for them each time we want to use them. I also had a frustrating battle with converting my byteswap64 function into a macro - got there in the end thankfully.
Finished up the draft of my L7 Filter paper.
Just a lonely two day week while everyone else was still on holiday.
Released a new version of libtrace (3.0.16) - now Richard's ring buffer code is out amongst the wide world and hopefully our users won't find too many bugs in it.
Got back into writing my paper on L7 Filter. Most of the content is there now, although I'm not entirely convinced that the way I have structured the paper is quite right. It's much more readable the way I have it now, but it looks more like a bulleted list than a technical paper.
Meenakshee's LPI collector worked pretty well running on some trace files over the break, which was pleasing. Next step is to get it working on our newly functional ISP capture point. Tested the capture point out by running some captures over the weekend - aside from a bug in the direction tagging everything looks good, so we have at least one working capture point.
Started writing a paper on my L7 Filter results - managed to get through an introduction and background before running out of steam.
Developed a module for Nathan's data collector that connects to Meena's LPI collector, receives data records, parses them and writes appropriate entries into a postgresql database. Ran into a bit of a design flaw in Nathan's collector - streams (i.e. the identifying characteristics for a measurement) have to be pre-defined before starting the collector. This doesn't work too well with LPI, where there are 250 protocols x 10 metrics x however many monitors one is running. Even worse, the number of protocols will grow with new LPI releases and we don't want to have to stop the collector to add code describing the resulting new streams.
Managed to hack my way around Nathan's code enough to add support for adding new streams whenever a new protocol / metric / monitor combination is observed by my module. Seems to work fairly well (at the second attempt - the first one ran into horrible concurrency problems due to a shared database connection).
Tried deploying the LPI collector at our ISP box, only to find that they've been playing with their core network a lot recently and now we don't see any useful traffic :(
Managed to get native BPF socket capture exporting correctly over the RT protocol. Changed the build system to make it possible to export captures taken using a native socket interface over RT to a machine running a different OS to the capture host, e.g. capture using Linux Native, export to a FreeBSD box.
WDCap now builds and runs on both Mac OS X and FreeBSD. Also changed the way the disk output module names files, based on some code submitted by Alistair King. You now specify your output filename format using strftime-style conversion modifiers, which offers a bit more flexibility to users rather than them being stuck with our particular file naming convention.
Continued working closely with Meenakshee on the new collector. Designed a binary format for exporting our collector messages called the libprotoident collector protocol (or LPICP for short).
Finished collecting traces for most of the protocols I wanted to test with L7 Filter and collated the initial results. Wrote a blog post about it (https://secure.wand.net.nz/content/case-against-l7-filter) and started working on a paper.
L7 Filter is used as a source of ground truth in the traffic classification field because it has been around for a long time and is widely known. However, my experiences with L7 Filter had raised a few questions in my mind with regard to its accuracy. After looking online, I did not find any evidence that L7 Filter is actually an accurate or reliable traffic classifier. In this blog post, I present some preliminary results from my own investigation into the correctness (or lack thereof) of L7 Filter's classifications using packet traces containing traffic for only a single known application.