Shane Alcock's Blog

07 Mar 2016

Continued working away at the Unknown traffic from my libprotoident port study. Added new protocol rules for Telegram Messenger and Kuguo, and improved DNS (especially TCP DNS) and NTP matching. I still have a bit more Unknown traffic to identify before I'd be comfortable putting the results in a paper, but we're getting closer.

Gave my 513 lectures this week. Looking forward to seeing how the class get on with my assignment.

Met with Ryan Jones who is doing an Honours project that will use netevmon to try and find events in the CSC data. Gave him access to the code and a few hints to start out, but I imagine I'll have to dedicate some more time to this over the course of the year.

01 Mar 2016

My fixes to Andy's InfluxDB code seem to be resulting in consistent and correct bins being stored in the rollup tables. Threw netevmon at the development system to see if it can cope, which so far it seems to be doing OK. There's still a bit of a concern around long-term memory usage, but I'll see how that pans out over the next couple of weeks.

Spent the rest of my week concentrating on finishing up JP's summer study on unexpected traffic on typically open ports. Managed to improve a few existing rules to recognise more traffic, as well as add new rules for QQ video chat and what appears to be a C&C covert channel for some Chinese malware using UDP port 53. Started framing up a paper for IMC based on this study.

Did some final prep work for the libtrace lectures and assignment for 513.

22 Feb 2016

Arrived back in NZ on Monday, back at work on Tuesday. Brought Brendon and Richard N. up to speed on the things I learned at AIMS and the potential collaboration opportunities I discussed with people there. Spent a bit of time writing emails to chase up on some of these opportunities.

Deployed Andy's InfluxDB code on prophet. Spent much of the rest of the week playing around with the continuous query system, trying to fix some outstanding issues caused by Influx's design decision to never automatically backfill the aggregated series when older / lagged data is received (e.g. when restarting NNTSC after an outage, or when AMP results arrive 40 seconds after their timestamp due to timeouts). This was a bit trickier than you would think, because there's no obvious way to find out when the last automatic continuous query ran (they don't happen exactly on the bin boundary), so I have to guess based on the current time, the time the bin should have ended and the timestamp of the current result.
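For the curious, here's a rough sketch of the kind of guesswork involved. This isn't the actual NNTSC code; the bin size and the slack allowed for the continuous query are illustrative numbers only.

    import time

    BIN_SIZE = 300  # rollup bin size in seconds (illustrative)

    def bin_start(ts, size=BIN_SIZE):
        """Start of the aggregation bin that a timestamp falls into."""
        return ts - (ts % size)

    def needs_manual_backfill(result_ts, now=None, size=BIN_SIZE):
        """Guess whether the continuous query has already aggregated
        the bin this result belongs to.

        Influx never re-runs a CQ over old bins, so if the result's
        bin ended long enough ago that a CQ should have fired since
        then, the stored aggregate is stale and we have to recompute
        the bin ourselves.
        """
        if now is None:
            now = time.time()
        # CQs don't fire exactly on the bin boundary, so allow some
        # slack after the boundary before assuming the bin is "done".
        cq_slack = 30  # seconds; a guess, tuned to the observed delay
        result_bin_end = bin_start(result_ts, size) + size
        return now > result_bin_end + cq_slack

The awkward part is the slack: because the CQ fires at some unknown point after the bin boundary, any threshold like this is a heuristic rather than a guarantee.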

16 Feb 2016

Spent my week in San Diego attending the BGP hackathon and the AIMS workshop.

The hackathon went really well. I was so intimidating that nobody wanted to join my team, but I still managed to add a lot of useful filtering capabilities to CAIDA's BGPStream software. Will try to write a more detailed blog post on what I did at some point, but it was enough to win a prize as one of the top teams.

The AIMS workshop was also very valuable, as there was definitely some interest in what we have been doing with both AMP and NNTSC. In particular, it seems that AMP might have some value for some big ISPs outside of New Zealand. Looking forward to seeing what comes from the discussions I had with various workshop attendees.

11 Feb 2016

Finished prep for AIMS and hackathon. Gave a practice run of both of my talks on Tuesday and got some useful feedback. Incorporated this into the talks on Wednesday.

Flew out to San Diego on Thursday and had a quiet day to rest and recover on Friday before the hackathon on the weekend.

02 Feb 2016

Continued prepping for the trip to San Diego. Wrote a talk on AMP to present at AIMS, since Brendon won't be able to attend. Managed to finally settle on a project that I'll be working on at the BGP hackathon: adding useful filtering to the BGPStream software.
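To give a flavour of what that sort of filtering enables, here's roughly how a consumer would ask for a narrow slice of BGP data through the Python bindings. This is a sketch using the modern pybgpstream interface rather than anything from the hackathon itself, and the exact API details may differ from what existed at the time.

    import pybgpstream

    # Ask for one hour of update messages from a single RIS collector,
    # restricted to more-specifics of one prefix. The filter expression
    # syntax here is illustrative.
    stream = pybgpstream.BGPStream(
        from_time="2016-02-01 00:00:00",
        until_time="2016-02-01 01:00:00",
        collectors=["rrc00"],
        record_type="updates",
        filter="prefix more 192.0.2.0/24",
    )

    for elem in stream:
        print(elem)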

Met with Harris to talk about the CSC dataset and how he can go about looking for interesting events in it. Wrote some example code to extract a metric from the data (syscalls per second for each major type) and added a module to NNTSC for the new metric. Hopefully Harris will be able to use that to start adding his own metrics.
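The gist of the example code is something like the sketch below. The input format and the grouping of syscalls into major types are hypothetical here; the real code works against whatever shape the CSC data actually has.

    from collections import Counter, defaultdict

    # Hypothetical grouping of syscalls into "major types"; the real
    # categories depend on how the CSC data labels its events.
    SYSCALL_TYPES = {
        "read": "file", "write": "file", "open": "file", "close": "file",
        "sendto": "network", "recvfrom": "network", "connect": "network",
        "fork": "process", "execve": "process", "exit_group": "process",
    }

    def syscalls_per_second(events):
        """events: iterable of (unix timestamp, syscall name) pairs.
        Returns a dict mapping each second to a Counter of major types."""
        bins = defaultdict(Counter)
        for ts, name in events:
            bins[int(ts)][SYSCALL_TYPES.get(name, "other")] += 1
        return bins

    events = [(1452470400.1, "read"), (1452470400.7, "sendto"),
              (1452470401.2, "read")]
    print(syscalls_per_second(events))
    # {1452470400: Counter({'file': 1, 'network': 1}),
    #  1452470401: Counter({'file': 1})}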

Noticed that there was a lot of variation in my rtstats performance test results. The achievable packet rate for ring: fluctuates from test to test, but remains constant within a test. For example, one test run I'll get 1 million packets per second consistently for 60 seconds; the next test run I'll get 40,000 packets per second for 60 seconds. Spent a lot of time looking into this further (including an afternoon with Richard S. trying out various things), but we're still unable to account for this inconsistency.

25 Jan 2016

Ran some full experiments with the stats and rtstats workloads (using ring: to capture) to make sure the numbers match up with what Richard was seeing in his earlier tests. So far, we're getting the expected behaviour: adding more threads makes stats perform worse (due to threading overhead outweighing any performance gain), but helps with rtstats. Wrote the section in the paper that describes our evaluation methodology, so now I just need to fill it in with some results!

Wrote my talk on NNTSC for AIMS. It's a 10 minute talk that is meant to provoke some discussion, so it is pretty light on implementation details. At least I hope people will come away from the talk knowing that there are some battles in this space that we've already fought so they won't repeat our mistakes.

Helped Andy get started with NNTSC so he can try implementing some InfluxDB support for storing data. The idea so far is to keep postgres around for doing the things it does well (streams, traceroute data) and use Influx for the rest.
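The planned split boils down to a dispatch decision like the sketch below. This is not NNTSC's actual code, and the collection names are illustrative.

    # Data that is relational or awkward to express as flat time series
    # (stream metadata, traceroute paths) stays in postgres; plain
    # time-series measurements go to InfluxDB.
    POSTGRES_COLLECTIONS = {"streams", "amp-traceroute"}

    def backend_for(collection):
        """Pick the storage backend for a given collection."""
        return "postgres" if collection in POSTGRES_COLLECTIONS else "influx"

    assert backend_for("amp-traceroute") == "postgres"
    assert backend_for("amp-icmp") == "influx"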

15 Jan 2016

Tracked down a segfault in the Ostinato drone whenever I tried to halt packet generation on the DPDK interfaces. This took a lot longer than it normally would have, since valgrind doesn't work too well with DPDK and there are about 10 threads active when the problem occurs. It eventually proved to be a simple case of a '<=' being used instead of a '<', but that was enough to corrupt the return pointer for the function that was running at the time, causing the segfault.

Once I fixed that, I was able to write some scripts to orchestrate sending packets at specific rates for a period of time, while having a libtrace program running at the other end of the link trying to capture and process those packets. Once the packet generation is over, the libtrace program is halted. This will form the basis of my experiments to determine how much traffic we can capture and process with parallel libtrace. The experiments will use different capture methods (ring, DAG, DPDK, PF_RING etc.), different packet rates, different numbers of processing threads (from 1 to 16) and different workloads, ranging from just counting packets to cryptopan anonymisation.
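The orchestration amounts to iterating over that parameter space, something like the sketch below. The command names and parameter values are placeholders standing in for the real scripts, not the scripts themselves.

    import itertools
    import subprocess

    # Parameter space for the capture experiments; the exact values
    # used in the real runs may differ.
    CAPTURE_METHODS = ["ring:", "dag:", "dpdk:", "pfring:"]
    PACKET_RATES = [100000, 500000, 1000000]   # packets per second
    THREAD_COUNTS = list(range(1, 17))
    WORKLOADS = ["count", "cryptopan"]
    REPETITIONS = 5   # repeat each combination to smooth out variance

    def run_experiment(method, rate, threads, workload, duration=60):
        """Run one capture experiment. './capture-workload' and
        './ostinato-send' are placeholders for the real scripts that
        start the libtrace program and drive Ostinato, respectively."""
        capture = subprocess.Popen(
            ["./capture-workload", method, workload, str(threads)])
        subprocess.run(
            ["./ostinato-send", "--rate", str(rate),
             "--duration", str(duration)], check=True)
        capture.terminate()   # halt the libtrace program once sending ends
        capture.wait()

    for params in itertools.product(CAPTURE_METHODS, PACKET_RATES,
                                    THREAD_COUNTS, WORKLOADS):
        for rep in range(REPETITIONS):
            run_experiment(*params)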

My initial tests have shown that the number of dropped packets is not particularly consistent across captures with otherwise identical parameters, so I'll have to run each experiment multiple times to get statistically valid results.

Also spent a bit of time helping Brendon capture some traces of his ICMP packets to help figure out whether his timing issues are network-based or host-based.

11 Jan 2016

Added a new graph type to the AMP website for showing loss as a percentage over time. This graph is now shown when clicking on a cell in the loss matrix, and can also be reached through the graph browser. Fixed a complaint regarding the matrix: clicking on a cell in an IPv4-only matrix would take you to a graph showing lines for both IPv4 and IPv6, which meant you would never get the smokeping-style colouring via the matrix.

Started messing around with Ostinato scripting on the 10G dev boxes, using DPDK to generate packets at 10G rates. Had a few issues initially because I was using an old version of the DPDK-enabled Ostinato that Richard had lying around; updating to Dan's most recent version seemed to fix that.

Spent a bit of time looking at the data collected during the CSC and how it might be used as ground truth for developing some security event detection techniques.

18 Dec 2015

Finished up the implementation chapter of the libtrace paper. Added a couple of diagrams to augment some of the textual explanations. Got Richard S. to read over what I've got so far and made a few tweaks based on his feedback.

Spent a decent chunk of time looking at Unknown UDP port 80 traffic in libprotoident. Found a clear pattern that was contributing most of the traffic, which I traced back to Tencent. Unfortunately, Tencent publishes a lot of applications, so that knowledge wasn't conclusive on its own.

My initial suspicion was that it might have been game traffic so I downloaded and played a few popular multiplayer games via the Tencent games client, capturing the network traffic and comparing it against my current unknown traffic. No luck, but then I had the bright idea to look a bit more closely at video call traffic in WeChat (a messaging app). Sure enough, once I was able to successfully create two WeChat accounts and get a video call going between them, I started seeing the traffic I wanted.

Also added rules for Acer Cloud and OpenTracker over UDP.