Shane Alcock's Blog
Continued making progress with my unidentified mice flows in libprotoident. Added a whole pile of new rules, mostly for various Chinese apps again. Have probably done enough now that I can draw a line under this and start writing the paper itself; there are a few obvious patterns that I would like to identify but this has consumed a lot of time already.
Answered a handful of questions from 513 students -- mostly intelligent ones, so I'm reasonably confident about how the class is going overall. Due date is this coming Friday, so we'll know for sure soon enough.
Helped finish off the funding proposal in the first half of the week.
Continued working with libprotoident. This week I gave up on the elephant flows and started looking at the mice flows. Found some interesting stuff; the highlight being a huge number of flows on TCP port 80 that seem to be associated with the Baidu web browser. The behaviour of these flows is particularly odd: connect to server, send a FIN with seqno N, retransmit FIN a few times, send a non-FIN packet with 1 byte of payload (0x00) and seqno N-1 (incredibly invalid TCP behaviour!), server sends a RST. End result is > 150,000 flows over a week on port 80 with a single outgoing byte of payload.
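To make the odd packet sequence above concrete, here is a rough Python sketch of how one might flag a flow showing it. This is not the libprotoident rule itself. The `OutPkt` record and the function name are made up for illustration, and it assumes the outgoing packets of a flow have already been pulled out in order.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OutPkt:
    """Hypothetical record for one outgoing TCP packet in a flow."""
    seq: int        # TCP sequence number
    fin: bool       # True if the FIN flag is set
    payload: bytes  # TCP payload (may be empty)


def looks_like_fin_then_byte(pkts: List[OutPkt]) -> bool:
    """Return True if the outgoing packets match the pattern described above:
    a FIN with seqno N (possibly retransmitted), followed by a non-FIN packet
    carrying a single 0x00 byte with seqno N-1, with no other outgoing payload.
    """
    fin_seq = None
    saw_odd_byte = False
    total_payload = 0
    for p in pkts:
        total_payload += len(p.payload)
        if p.fin and fin_seq is None:
            # First FIN seen; remember its sequence number (N).
            fin_seq = p.seq
        elif (fin_seq is not None and not p.fin
              and p.payload == b"\x00"
              and p.seq == (fin_seq - 1) % 2**32):
            # Non-FIN packet with one 0x00 byte at seqno N-1 (mod 2^32).
            saw_odd_byte = True
    return saw_odd_byte and total_payload == 1
```

The check on total payload reflects the "single outgoing byte of payload" observation. A real rule would also want to confirm the server's RST and the port number.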
Added some filters on the Endace probe to see if we can find people doing this traffic on campus, as the Baidu browser is pretty well-known for having a tendency to leak all sorts of private data back to its masters. Found multiple staff PCs that appear to be doing this sort of traffic, so Brad and I will try to prepare a report for ITS next week.
Met with Nathan at Lightwire on Thursday afternoon re: AMP and netevmon. Came away with plenty of ideas and suggestions for improvements we can make, and hopefully we also helped Nathan understand parts of our system better. The good news is that netevmon seems to mostly be picking up valid events, but even so the number and frequency of these events can be overwhelming, so we need better control over which events are shown to the user.
Worked on the next MBIE funding proposal document. Still got a fair way to go so this will probably eat up a lot of next week too.
Continued trying to identify the remaining Unknown applications in the Waikato Sept 15 traces. Only managed to identify one new protocol (Xunlei Accelerated) but this did account for 14G of unknown traffic on TCP port 8080 so that has gotten rid of the biggest outstanding quantity of unknown traffic. The rest are looking like they might get the better of me -- it's almost all Chinese in origin and I can identify the parent company (Tencent, CERNET, Taobao etc) but actually figuring out which of the myriad of apps these companies own is mostly just trial and error at this stage.
Continued working away at the Unknown traffic from my libprotoident port study. Added new protocols for Telegram Messenger and Kuguo, as well as improved DNS (especially TCP DNS) and NTP matching. I still have a bit more Unknown traffic to identify before I'd be comfortable putting the results in a paper, but we're getting closer.
Gave my 513 lectures this week. Looking forward to seeing how the class get on with my assignment.
Met with Ryan Jones who is doing an Honours project that will use netevmon to try and find events in the CSC data. Gave him access to the code and a few hints to start out, but I imagine I'll have to dedicate some more time to this over the course of the year.
My fixes to Andy's InfluxDB code seem to be resulting in consistent and correct bins being stored in the rollup tables. Threw netevmon at the development system to see if it can cope, which it seems to be doing OK. There's still a bit of a concern around long-term memory usage, but I'll see how that pans out over the next couple of weeks.
Spent the rest of my week concentrating on finishing up JP's summer study on unexpected traffic on typically open ports. Managed to improve a few existing rules to recognise more traffic, as well as add new rules for QQ video chat and what appears to be a C&C covert channel for some Chinese malware using UDP port 53. Started framing up a paper for IMC based on this study.
Did some final prep work for the libtrace lectures and assignment for 513.
Arrived back in NZ on Monday, back at work on Tuesday. Brought Brendon and Richard N. up to speed on the things I learned at AIMS and the potential collaboration opportunities I discussed with people there. Spent a bit of time writing emails to chase up on some of these opportunities.
Deployed Andy's InfluxDB code on prophet. Spent much of the rest of the week playing around with the continuous query system to try and fix some outstanding issues caused by Influx's design decision to never automatically backfill the aggregated series when older / lagged data is received (e.g. when restarting NNTSC after an outage or AMP results arriving 40 seconds later than their timestamp due to timeouts). This was a bit trickier than you would think because there's no obvious way to find out when the last automatic continuous query ran (they don't happen exactly on the bin boundary) so I have to guess based on the current time, the time the bin should have ended and the timestamp of the current result.
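As a rough illustration of the kind of guess involved, here is a minimal Python sketch. It is not the code that went into NNTSC. The function names and the `cq_slack` parameter (an allowance for the continuous query not firing exactly on the bin boundary) are assumptions made for the example, and all times are plain Unix timestamps in seconds.

```python
def bin_end(result_ts: int, bin_size: int) -> int:
    """Return the time at which the bin containing result_ts should end."""
    return (result_ts // bin_size + 1) * bin_size


def needs_manual_backfill(result_ts: int, now: int, bin_size: int,
                          cq_slack: int = 0) -> bool:
    """Guess whether the automatic continuous query has probably already run
    for the bin that this result belongs to. If it has, the aggregated series
    will not pick the result up by itself, so it must be backfilled manually.

    cq_slack is a hypothetical grace period after the bin boundary.
    """
    return now >= bin_end(result_ts, bin_size) + cq_slack
```

For example, with 60 second bins, a result stamped 125 belongs to the bin ending at 180. If it arrives at time 200, the guess is that the bin has already been aggregated. Picking a sensible slack value is the hard part, because a wrong guess either misses data or redoes work.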
Spent my week in San Diego attending the BGP hackathon and the AIMS workshop.
The hackathon went really well. I was so intimidating that nobody wanted to join my team, but I still managed to add a lot of useful filtering capabilities to CAIDA's BGPStream software. Will try to write a more detailed blog post on what I did at some point, but it was enough to win myself a prize for being one of the top teams.
The AIMS workshop was also very valuable, as there was definitely some interest in what we have been doing with both AMP and NNTSC. In particular, it seems that AMP might have some value for some big ISPs outside of New Zealand. Looking forward to seeing what comes from the discussions I had with various workshop attendees.
Finished prep for AIMS and hackathon. Gave a practice run of both of my talks on Tuesday and got some useful feedback. Incorporated this into the talks on Wednesday.
Flew out to San Diego on Thursday and had a quiet day to rest and recover on Friday before the hackathon on the weekend.
Continued prepping for the trip to San Diego. Wrote a talk on AMP to present at AIMS, since Brendon won't be able to attend. Managed to finally settle on a project that I'll be working on at the BGP hackathon: adding useful filtering to the BGPStream software.
Met with Harris to talk about the CSC dataset and how he can go about looking for interesting events in the dataset. Wrote some example code to extract a metric from the data (syscalls per second for each major type) and added a module to NNTSC for the new metric. Hopefully Harris will be able to use that to start adding his own metrics.
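For a flavour of what extracting such a metric looks like, here is a small Python sketch that counts syscalls per second for each type. It is not the example code I gave Harris. It assumes the data has already been parsed into (timestamp, syscall type) pairs, which is a made-up format and not how the CSC dataset is actually laid out.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def syscalls_per_second(
        events: Iterable[Tuple[float, str]]) -> Dict[int, Counter]:
    """Count syscalls per one-second bin, broken down by syscall type.

    events: (timestamp_in_seconds, syscall_type) pairs in a hypothetical
    pre-parsed form. Returns a mapping of second -> Counter of types.
    """
    counts: Dict[int, Counter] = defaultdict(Counter)
    for ts, kind in events:
        # Truncate the timestamp to its whole second to pick the bin.
        counts[int(ts)][kind] += 1
    return counts
```

Each per-second count for a given type could then be reported as one data point in a time series for that type.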
Noticed that there was a lot of variation in my rtstats performance test results. The achievable packet rate for ring: seems to fluctuate from test to test, but remains constant within a test. For example, in one test run I'll get 1 million packets per second consistently for 60 seconds, and in the next test run I'll get 40,000 packets per second for 60 seconds. Spent a lot of time looking into this further (including an afternoon with Richard S. trying out various things), but we're still unable to account for this inconsistency.
Ran some full experiments with the stats and rtstats workloads (using ring: to capture) to make sure the numbers match up with what Richard was seeing in his earlier tests. So far, we're getting the expected behaviour: adding more threads makes stats perform worse (due to threading overhead outweighing any performance gain), but helps with rtstats. Wrote the section in the paper that describes our evaluation methodology, so now I just need to fill it in with some results!
Wrote my talk on NNTSC for AIMS. It's a 10 minute talk that is meant to provoke some discussion, so it is pretty light on implementation details. At least I hope people will come away from the talk knowing that there are some battles in this space that we've already fought so they won't repeat our mistakes.
Helped Andy get started with NNTSC so he can try implementing some InfluxDB support for storing data. The idea so far is to keep postgres around for doing the things it does well (streams, traceroute data) and use Influx for the rest.