Shane Alcock's Blog
Spent the early part of my week reading over Dan's and Darren's revised Honours reports and offering a final batch of suggestions.
Continued poking at libprotoident and the unknown traffic on various Web ports. Finally managed to get Blade and Soul (a Chinese MMO) installed and running and was able to confirm that it was responsible for some of my unknown flows.
Started turning my attention towards our STRATUS research this week. Initially, we are going to look at general metrics that we can extract from cloud infrastructure and see if any of our existing event detection techniques are useful for finding anomalous behaviour. For a start, we are using data collected by the Ceilometer module on the Waikato OpenStack instance. Spent some time bringing Harris up to speed on NNTSC and netevmon so that he can experiment with the data within our system. In the meantime, I'm going to take a closer look at the data that we've collected to see which series will be most suitable to focus on in the short term.
Gave more details about our STRATUS work / goals to the designers who will be producing a poster about our research for the upcoming STRATUS forum.
Also played with a service called ThisData which claimed to offer something similar to what we have envisioned from STRATUS. ThisData is certainly pretty, but doesn't really seem to offer much more than daily revision control for your cloud data.
Spent a fair chunk of my week proof-reading, first a document responding to questions about the BTM project, then Dan and Darren's Honours reports.
Tracked down and fixed a bug in parallel libtrace where ticks were messing with the ordered combiner, causing some packets to be sent to the reporter out of order. Also managed to replicate and fix the memory leak bug that was causing Yindong's wdcap on wraith to invoke the OOM killer.
Continued poking at unknown port 443 and port 80 traffic in libprotoident. Most of my time was spent trying to install and capture traffic from various Chinese applications that I had reason to suspect were causing most of my remaining unknown traffic, with mixed success.
Finally released the libtrace4 beta on Tuesday, after doing some final testing with the DAG cards in the 10G dev machines.
Managed to find a few more protocols to add to libprotoident, but am now trying to move towards releasing a new version. Started having a closer look at TCP port 80 and TCP port 443 traffic in my Waikato traces, with the aim of trying to get as much traffic correctly classified as I can prior to doing an in-depth analysis of what is actually using those ports.
Spent Friday afternoon reading over Darren's honours report and providing some hopefully useful feedback.
The long-awaited libtrace 4 is now available for public consumption! This version of libtrace includes an all new API that resulted from Richard Sanger's Parallel Libtrace project, which aimed to add the ability to read and process packets in parallel to libtrace. Libtrace can now also better leverage any native parallelism in the packet source, e.g. multiple streams on DAG, DPDK pipelines or packet fanout on Linux interfaces.
At this stage, we are considering the software to be a beta release, so we reserve the right to make any major API-breaking changes we deem necessary prior to a final release. That said, I'm fairly confident that the beta will be close to the final product. Now is also a good time to try the new API and let us know if there are any problems with it, as it will be difficult to make API changes once libtrace 4 moves out of beta.
Please note that the old libtrace 3 API is still entirely intact and will continue to be supported and maintained throughout the lifetime of libtrace 4. All of your old libtrace 3 programs should still build and run happily against libtrace 4; please let us know if this turns out to not be the case so we can fix it!
Learn about the new API and how parallel libtrace works by reading the Parallel Libtrace HOWTO.
Download the beta release from the libtrace website.
Send any questions, bug reports or complaints to contact [at] wand [dot] net [dot] nz
Fixed the issues with BSD interfaces in parallel libtrace. Ended up implementing a "bucket" data structure for keeping track of buffers that contain packets read from a file descriptor. Each bucket effectively maintains a reference counter that is used to determine when libtrace has finished with all the packets stored in a buffer. When the buffer is no longer needed, it can be freed. This allows us to ensure packets are not freed or overwritten without needing to memcpy the packet out of the buffer it was read into.
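As a rough illustration of the bucket idea (a minimal Python sketch, not the actual libtrace C code), the class below pairs a read buffer with a reference count. The names `Bucket`, `add_packet`, `release_packet` and `retire` are hypothetical, and the sketch assumes a buffer may only be freed once the reader has stopped filling it and every packet handed out from it has been released.

```python
class Bucket:
    """A read buffer plus a count of packets that still point into it."""

    def __init__(self, size, on_free):
        self.buffer = bytearray(size)
        self.refs = 0          # packets handed out and not yet released
        self.retired = False   # True once the reader has moved to a new buffer
        self.freed = False
        self.on_free = on_free # hypothetical callback standing in for free()

    def add_packet(self, offset, length):
        """Return a zero-copy view of one packet and count the reference."""
        self.refs += 1
        return memoryview(self.buffer)[offset:offset + length]

    def release_packet(self):
        """Called when the user has finished with one packet."""
        self.refs -= 1
        self._maybe_free()

    def retire(self):
        """Called when the reader will not place more packets in this buffer."""
        self.retired = True
        self._maybe_free()

    def _maybe_free(self):
        # Free only when nothing can reference the buffer any more.
        if self.retired and self.refs == 0 and not self.freed:
            self.freed = True
            self.on_free(self)
```

The packet views share storage with the buffer, so no copy is made when a packet is handed out; the cost is that the whole buffer stays alive until its last packet is released.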
Added bucket functionality to both RT and BSD interfaces. After a few initial hiccups, it seems to be working well now.
Continued testing libtrace with various operating systems / configurations. Replaced our old DAG configuration code, which used a deprecated API call, with code that uses the CSAPI. Just need to get some traffic on our DAG development box so I can make sure the multiple-stream code works as expected.
Managed to add another two protocols to libprotoident: Google Hangouts and Warthunder.
Finished the parallel libtrace HOWTO guide. Pretty happy with it and hopefully it should ease the learning curve for users who want to move over to the parallel API once released.
Continued working towards the beta release of libtrace4. Started testing on my usual variety of operating systems, fixing any bugs or warnings that cropped up along the way. It looks like there are definitely some issues with using the parallel API with BSD interfaces, so that will need to be resolved before I can do the release.
Now that I've got a full week of Waikato trace, I've been occasionally looking at the output from running lpi_protoident against the whole week and seeing if there are any missing protocols I can identify and add to libprotoident. Managed to add another 6 new protocols this week, including Diablo 3 and Hearthstone.
Met with Rob and Stephen from Endace on Thursday morning and had a good discussion about how we are using the Endace probe and what we can do to get more out of it.
Fixed the bug that was causing my disk-writing wdcap to crash. The problem was a slight mismatch between the record length stored in the ERF header and the amount of packet data actually available, so we were occasionally touching bad memory. Since fixing that, wdcap has been happily capturing since Tuesday morning without any crashes or dropped packets.
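The general shape of that class of bug can be sketched as follows. This is an illustrative Python sketch, not the actual wdcap fix: it assumes the standard 16-byte ERF header with the big-endian record length (`rlen`) at byte offset 10, and the helper name `safe_record_length` is hypothetical. The point is simply never to trust `rlen` beyond the bytes that were really read.

```python
import struct

# Base ERF header: timestamp(8) type(1) flags(1) rlen(2) lctr(2) wlen(2)
ERF_HEADER_LEN = 16
RLEN_OFFSET = 10

def safe_record_length(buf, offset=0):
    """Return how many bytes of the ERF record at `offset` may be touched.

    Returns None if the header is incomplete or rlen is implausibly small.
    """
    available = len(buf) - offset
    if available < ERF_HEADER_LEN:
        return None
    (rlen,) = struct.unpack_from(">H", buf, offset + RLEN_OFFSET)
    if rlen < ERF_HEADER_LEN:
        return None  # a record cannot be shorter than its own header
    # Clamp to what is actually in the buffer rather than trusting the header.
    return min(rlen, available)
```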
Continued working towards a releasable libtrace4. Polished up the parallel tools and replaced the old rijndael implementation in traceanon with the libcrypto code from wdcap. Made sure all the examples build nicely and added a full skeleton example that implements most of the common callbacks. Fixed all the outstanding compiler warnings and made sure all the header documentation was correct and coherent.
Started developing a HOWTO guide for parallel libtrace that introduces the new API, one step at a time, using a simple example application. I'm about half-way through writing this.
Added anonymisation to the new wdcap, using libcrypto to generate the AES-encrypted bits that we use as a mask to anonymise IP addresses. This has two major benefits: 1) we don't need to maintain or support our own implementation of AES and 2) libcrypto is much more likely to be able to detect and use AES CPU instructions.
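To show what "encrypted bits used as a mask" can look like, here is a minimal Python sketch of a prefix-preserving scheme in the style of CryptoPAn. Two loud assumptions: the post does not say exactly which scheme wdcap uses, and the sketch substitutes HMAC-SHA256 from the Python standard library for AES, purely so that it runs without libcrypto. The function names are hypothetical.

```python
import hashlib
import hmac
import ipaddress

def prf_bit(key, prefix_bits, prefix_len):
    """One pseudorandom bit derived from an address prefix.

    Stand-in for the AES step: a real implementation would encrypt the
    padded prefix with AES and take one bit of the ciphertext.
    """
    msg = prefix_len.to_bytes(1, "big") + prefix_bits.to_bytes(4, "big")
    return hmac.new(key, msg, hashlib.sha256).digest()[0] >> 7

def anonymise_ipv4(key, addr):
    """XOR the address with a mask whose i-th bit depends only on the
    first i bits of the original address, which preserves shared prefixes."""
    original = int(ipaddress.IPv4Address(addr))
    mask = 0
    for i in range(32):
        prefix = original >> (32 - i)   # the top i bits (0 when i == 0)
        mask = (mask << 1) | prf_bit(key, prefix, i)
    return str(ipaddress.IPv4Address(original ^ mask))
```

Because each mask bit depends only on the bits before it, two addresses in the same /24 still share a /24 after anonymisation, while the mapping stays deterministic for a given key.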
Deployed the new wdcap and libtrace on the Endace box. I've managed to get wdcap running but the client that is writing the packets to disk has a tendency to crash after an hour or two of capture, so there's still a wee way to go before it's ready. The wdcap server that is capturing off the DAG card and exporting to the client has been pretty robust.
Started on polishing up libtrace4 for a beta release. There's still quite a bit of fiddly outstanding work to do before we can release. Fixed the tick bug that I discovered a couple of weeks back and finished up the new callback API, which I had proposed to Richard a few months back and which he had mostly implemented. Ported tracertstats and tracestats to use the new API.
Continued working on the new wdcap. Wrote and tested both the disk and RT output modules. Re-designed the RT protocol to fix a number of oversights in the original design -- unfortunately this means that RT from wdcap4 is not compatible with libtrace3 and vice-versa.
Added parallel support for RT input to libtrace4 -- prior to this, RT packets were not read in a way that was thread-safe so using RT with the parallel API would result in some major races. The RT packets are now copied into memory owned by the libtrace packet before being returned to the caller; this adds an extra memcpy but ensures concurrent reads won't overwrite each other. Using a well-managed ring buffer could probably get rid of the memcpy, so I'll probably look into that once I've got everything else up and running.
Spent Wednesday at the Honours conference.
Continued working on the new WDCap. Most of my time was spent writing the packet batching code that groups processed packets into batches and then passes them off to the output threads.
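The batching pattern described here can be sketched in a few lines of Python. This is only an illustration of the general technique, not the WDCap code: the `Batcher` class, the `output_worker` function and the use of `None` as a shutdown marker are all hypothetical choices.

```python
import queue
import threading

class Batcher:
    """Collects processed packets and hands them over in fixed-size batches."""

    def __init__(self, out_queue, batch_size):
        self.out_queue = out_queue
        self.batch_size = batch_size
        self.current = []

    def add(self, packet):
        self.current.append(packet)
        if len(self.current) >= self.batch_size:
            self.flush()

    def flush(self):
        """Pass on whatever has accumulated, e.g. at shutdown."""
        if self.current:
            self.out_queue.put(self.current)
            self.current = []

def output_worker(in_queue, sink):
    """Output thread: drain batches until the None shutdown marker arrives."""
    while True:
        batch = in_queue.get()
        if batch is None:
            break
        sink.extend(batch)  # stands in for writing to disk or the network
```

Handing over whole batches means the queue is touched once per batch rather than once per packet, which is the usual motivation for batching between processing and output threads.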
Along the way I discovered a bug with parallel libtrace when using ticks with a file that causes the ordered combiner to get stuck. Spent a bit of time with Richard trying to track this one down.
Listened to our Honours students' practice talks on Thursday and Friday afternoon. Overall, was fairly impressed with the projects and what the students had achieved and am looking forward to seeing their improved talks on Wednesday.