Shane Alcock's Blog




Worked on the camera-ready version of my IMC paper. Managed to add some nice content to address the reviewer feedback we got, only to find that I had been using a font size that was too small (i.e. the default font size that every previous IMC has used). Unfortunately, switching to the bigger font size would mean removing almost all of the new content I had added, so I'm hoping the PC chairs will change their minds and revert to the old font size.

Wrote the basic architecture for a provenance log parsing library that can be used with both live progger records and sysdig log files. This will replace the old progger-central, which I had written as a hacky PoC and which was otherwise in danger of becoming production code.

Got my script to extract common patterns from Sysdig logs working reasonably well. Took a few attempts to get some nicely formatted output that contains all the information I should need to track down what actions are causing the repeated patterns.

Spent a fair bit of time helping CROW get a handle on the Endace Probe, what it can do and how it might fit into their research goals.

Listened to our 520 student practice talks on Thursday. The projects themselves are pretty good -- the only real problem is the usual one of students underselling how much actual work they had put into the development side of their project.




A short paper by myself, JP Möller and Richard Nelson titled "Sneaking Past the Firewall: Quantifying the Unexpected Traffic on Major TCP and UDP Ports" has been accepted for publication at this year's upcoming IMC. We'll post the final version of the paper once I've finished making the final revisions, but feel free to get in touch if you want a sneak peek.

As part of this research, we spent a lot of time investigating traffic on TCP and UDP ports 53, 80, 443, 8080 and 8000 that did not match the 'expected' application protocol for that port. At the outset of this work, libprotoident was unable to identify the vast majority of that traffic, so we ended up adding or improving quite a few libprotoident rules. Our reviewers were particularly interested in the new rules that we created, but space limitations in the paper itself mean that we are unable to include much detail about the new rules in the text.
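To make the idea of 'unexpected' traffic concrete, here is a minimal Python sketch of the kind of check involved. It is not the analysis code used for the paper: the `EXPECTED` table, the protocol labels and the flow tuples are all assumptions made for illustration, and the labels are not libprotoident's own identifiers.

```python
from collections import defaultdict

# Assumed labels for the "expected" protocol on each studied port.
# These names are illustrative, not libprotoident's own identifiers.
EXPECTED = {
    53: {"DNS"},
    80: {"HTTP"},
    443: {"HTTPS"},
    8080: {"HTTP"},
    8000: {"HTTP"},
}


def is_unexpected(port, protocol):
    """True if `protocol` is not an expected protocol for a studied port.

    Ports outside the studied set are never flagged.
    """
    expected = EXPECTED.get(port)
    if expected is None:
        return False
    return protocol not in expected


def unexpected_bytes(flows):
    """Sum bytes of unexpected traffic per port and protocol.

    `flows` is an iterable of (port, protocol, nbytes) tuples.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for port, protocol, nbytes in flows:
        if is_unexpected(port, protocol):
            totals[port][protocol] += nbytes
    return {port: dict(protos) for port, protos in totals.items()}
```

Feeding this a list of classified flows gives a per-port breakdown of which protocols account for the unexpected bytes, which is roughly the shape of question the paper asks.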

Therefore, this page is intended to serve as an addendum to the published paper by explicitly stating which protocols were identified as a result of the research and providing links to the source code in libprotoident where the new rules are defined.

Entirely New Applications (21)

  • the purpose of this protocol is not entirely clear but the remote hosts involved are typically owned by (a Chinese antivirus company).

  • 360 Safeguard: update protocol used by 360 Safeguard, a Chinese antivirus.

  • Airdroid: Application for remotely controlling Android devices from a desktop computer.

  • Bad Baidu: Strange behaviour observed on hosts with the Baidu web browser installed. Appears to be some sort of phone-home protocol, but manages to blatantly violate TCP specs in the process.

  • Dianping: Chinese online-shopping and establishment rating smartphone app. Also has a UDP protocol.

  • Kakao: Korean messaging and chat for smartphones.

  • Kankan: Chinese video streaming service. Also has a UDP protocol.

  • Kuaibo: Chinese video streaming service.

  • Kugou: Chinese music streaming service.

  • Norton Backup: Backup and recovery service run by Norton, better known for their antivirus products.

  • QQ Download: File downloading software created by Tencent, who are also behind the popular Chinese messaging tool, QQ.

  • QQ PC Manager: Anti-malware software created by Tencent.

  • Telegram: Cloud-based messaging service with an emphasis on security.

  • Tensafe: Anti-cheating software that is integrated with major online games published by Tencent in China (such as Blade and Soul).

  • Weibo: Chinese microblogging service.

  • Wolfenstein: Enemy Territory: Free online multiplayer game, released in 2003 but still played.

  • Xiami: Chinese music streaming service, owned by Alibaba.

  • Xunlei JSQ: Game acceleration service from the company behind Xunlei (a.k.a. Thunder).

  • Xunlei VIP: Fast download service for VIP users of Xunlei (Thunder), which pulls cached content from Xunlei servers rather than the standard P2P from other Xunlei users.

  • Xunyou: Chinese game acceleration service.

Existing Protocols Improved (10)

  • DNS: Protocol for mapping hostnames to IP addresses. If you're reading this, you should know what DNS is for.

  • Fortinet: Protocol for updating Fortinet network appliances.

  • Kaspersky: Russian security software.

  • NTP: Time synchronisation protocol.

  • QQ: Very popular Chinese instant messaging application.

  • QUIC: Protocol originally developed by Google for transferring streamed content (especially YouTube video) over UDP.

  • Taobao: Chinese online marketplace, similar to Amazon.

  • WeChat: Another popular Chinese messaging application.

  • Xunlei: Also known as Thunder. A Chinese file sharing system which also leverages other P2P technologies, e.g. BitTorrent, eDonkey etc.

  • Youku: Chinese video hosting / streaming service, somewhat analogous to YouTube.




Got NNTSC and amp-web working with the sysdig data that Harris gave me, so we have a simple proof-of-concept. After talking with Harris some more, he is interested in finding patterns in the syscall logs that are "predictable" so that we can build models of known specific actions on a system (e.g. opening a file with vim, starting a python interpreter etc). Started working on a script to identify common patterns in the sysdig logs so that we can get an idea as to what these patterns look like and how hard they will be to recognise and identify.
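One simple way to look for repeated patterns in a syscall log is to count every run of N consecutive events and keep the runs that show up more than once. The sketch below does that in Python. It is not my actual script: the function name, the window length and the assumption that the log has already been reduced to a flat list of event names are all made up for the example.

```python
from collections import Counter


def common_patterns(events, length=3, min_count=2):
    """Count every run of `length` consecutive event names.

    Returns (pattern, count) pairs for patterns seen at least
    `min_count` times, most frequent first.
    """
    names = list(events)
    # Slide a fixed-size window over the sequence; an input shorter
    # than `length` simply produces no windows.
    counts = Counter(
        tuple(names[i:i + length])
        for i in range(len(names) - length + 1)
    )
    return [(p, c) for p, c in counts.most_common() if c >= min_count]
```

For example, `common_patterns(["open", "read", "close", "open", "read", "close", "stat"])` reports the `open, read, close` run as occurring twice. A real log would need per-process separation and some tolerance for noise between events, which this sketch ignores.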

Continued tracking down unknown traffic patterns with libprotoident. Managed to nail one pattern that had been bugging me for a long time: the Baidu Yun P2P protocol. Also added rules for YY, Overwatch, Zoom TCP and NetCat CCTV.




My IMC paper on unexpected traffic on well-known ports was accepted, which is great news. Spent Monday going over reviewer feedback and thinking about what revisions I need to make for the camera-ready version.

Continued working on integrating STRATUS with NNTSC. Spent way too much time trying to figure out why my data was not being inserted into the Influx database -- turns out the timestamp for the test data I was using was too old for the default retention policies so it was being automatically discarded. Fudged the test data times to be more recent and it finally worked.
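A quick sanity check would have saved me a lot of time here. The sketch below is a hedged Python illustration of the two things I ended up doing by hand: checking whether a timestamp falls inside a retention window, and shifting a batch of test timestamps so that the newest one is "now". The function names and the seven-day window are assumptions for the example, not values taken from our InfluxDB configuration.

```python
from datetime import datetime, timedelta, timezone


def within_retention(point_time, retention, now=None):
    """True if `point_time` is no older than `retention` before `now`."""
    now = now or datetime.now(timezone.utc)
    return point_time >= now - retention


def shift_to_recent(timestamps, now=None):
    """Shift all timestamps so the newest lands on `now`.

    The spacing between points is preserved.
    """
    now = now or datetime.now(timezone.utc)
    offset = now - max(timestamps)
    return [t + offset for t in timestamps]


# Hypothetical retention window, purely for illustration.
ASSUMED_RETENTION = timedelta(days=7)
```

Running `within_retention` over the test data before inserting it would flag any points that the database is going to throw away.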

Added file operations metric support to ampy and amp-web so we can now look at simple graphs of open frequency data. Found some scalability issues with our modal dialogs in cases where the number of possible options for a dropdown is very high, so I've gone back and added pagination support to all modal dropdowns so they only load 30 or so options at a time. This had some interesting flow-on effects, especially for the latency modal dialog which had a lot of custom code for populating the tabs for the different latency metrics. I think I've ironed out all of the extra wrinkles now.
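The pagination itself is conceptually simple. A minimal Python sketch of the server-side slicing might look like the following; the function name, the 1-based page numbering and the `has_more` flag are assumptions for illustration rather than the actual ampy or amp-web interface.

```python
def paginate(options, page, page_size=30):
    """Return (items, has_more) for a 1-based page of dropdown options."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * page_size
    items = options[start:start + page_size]
    # has_more tells the dropdown whether to request another page.
    has_more = start + page_size < len(options)
    return items, has_more
```

The dropdown only ever asks for one page at a time, so the cost of opening a modal no longer depends on how many options exist in total.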

Spent a little more time with the July traces to track down some more unknown protocols. Added a rule for the Netcore vulnerability scan (which happens a lot!) and updated rules for a lot of (mostly game-related) protocols.




Started working on integrating some of the STRATUS metrics into NNTSC so that we can explore using time-series based event detection to highlight potentially interesting file interactions. Going forward, I'm going to be splitting my time 50:50 between STRATUS development and WAND research work -- existing research might progress a bit slower as a result.

Continued poking at unknown flows in the July trace data. Added protocols for Final Fantasy XIV and Facebook Messenger. Noticed that the vDAG pipe on the probe that services wdcap is still dropping packets, so our captures are sometimes incomplete. Moving IP encryption off onto wraith seems to have helped with this, but is not an ideal solution.




Short week after taking leave on Monday and Tuesday.

Spent most of my remaining week looking at some new captures I took using the upgraded Probe. The main aim was to see whether there were any new protocols that libprotoident should be able to identify. Managed to find a handful of new protocols (Facebook Zero, Forticlient SSL VPN and Discord) and made some improvements to the rules for existing protocols (including the AMP throughput test!).

Most of my time was actually spent unsuccessfully hunting down what appears to be a new Chinese P2P protocol, which is a shame because it was contributing a very large amount of unknown traffic in my sample dataset.

Using BSOD on the live traffic feed also allowed me to spot a student who was doing vast quantities of torrenting on the campus network (which Brad reported to ITS) and our WITS FTP server being hammered with tons of download attempts from China. Fair to say, we've gotten some good mileage out of the upgraded Probe already.

Fixed a couple of outstanding bugs in amp-web. Should be ready to push some new packages out to skeptic and lamp early next week now.




Ported my event group pruning code from amp-web to a separate daemon that runs as part of netevmon. Rather than tweaking the event groups prior to displaying them on the dashboard, the daemon periodically fetches the most recent event groups from the database and checks for any redundancies that can be pruned. If any are found, the database itself is updated in place.

The benefits of this approach over the amp-web approach are that we can save on space in the event database and we don't need to do the full redundancy processing every time someone loads the dashboard. The one downside is that any merges are effectively permanent so I have to be very careful about testing my redundancy checks before rolling them out live.

Found and fixed some more InfluxDB memory problems when using the matrix. Most of the problems related to us using the last() function, which for some reason can result in InfluxDB loading the whole table into memory. I've managed to rewrite the queries that used last() so that they don't require anywhere near as much memory (or processing time) so tooltips, in particular, should be a lot faster to process and less likely to push the server into swap.

Got the waikato capture point back up and running after its disks were replaced on Thursday. Used it to demo BSOD to various visitors who were here for the CSC.




Continued reading over Stephen's thesis.

Further refined my event dashboard improvements. Added an algorithm that should recognise redundant event groups based on ASNs that the groups have in common with other groups that occur at the same time. This allows us to get rid of a large number of the vague UoW-REANNZ-AARNet, REANNZ-AARNet and UoW-REANNZ groups that were cluttering up the dashboard on prophet. Found and fixed a few bugs with the self-updating dashboard that were causing event groups to disappear or appear in the wrong order.
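As a rough illustration of the ASN-based check, the Python sketch below treats a group as redundant when every ASN in it also appears in another group from the same time window. This is only one possible reading of the idea and not the code running on prophet; the function name, the use of plain strings for ASNs and the "ExampleNet" entry are assumptions for the example.

```python
def redundant_groups(groups):
    """Return the names of groups covered by another concurrent group.

    `groups` maps a group name to the set of ASNs it involves; all
    groups are assumed to come from the same time window. A group is
    considered redundant if its ASN set is a proper subset of another
    group's ASN set.
    """
    redundant = set()
    for name, asns in groups.items():
        for other, other_asns in groups.items():
            if other != name and asns < other_asns:
                redundant.add(name)
                break
    return sorted(redundant)


concurrent = {
    "UoW-REANNZ": {"UoW", "REANNZ"},
    "REANNZ-AARNet": {"REANNZ", "AARNet"},
    "UoW-REANNZ-AARNet": {"UoW", "REANNZ", "AARNet"},
    # Hypothetical more specific group that covers the vague ones.
    "UoW-REANNZ-AARNet-ExampleNet": {"UoW", "REANNZ", "AARNet", "ExampleNet"},
}
```

With the sample data, the three vague groups are all flagged and only the most specific group survives. Using a proper subset test means two groups with identical ASN sets are both kept, which would need separate handling.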

Added a working summary graph to the traceroute path map view, with the added benefit of making the selector appear and actually work for this graph.

Continued to battle with InfluxDB's memory usage on prophet. Experimented with tuning a variety of configuration options to try and avoid some of the surges that we occasionally see. Since these surges usually eventually result in the OOM killer being invoked, we need to be able to better control the memory usage before we can consider rolling InfluxDB into production.




Spent most of my week looking into methods for reducing some of the redundant event groups that appear on the amp-web dashboard. Came up with an algorithm for detecting smaller groups that are already covered by one large group, as well as one for detecting when a large group should be removed in favour of the smaller sub-groups.

Implemented my techniques on prophet, but the range of event groups that I get there is too limited for me to be sure that everything is working correctly. Next week I may look into grabbing a copy of skeptic's event database to see how well things work on a more diverse set of event groups.

Spent some time reading over Stephen's revised thesis.




Back into it after a couple of weeks spent moving house.

Worked with Brendon to get nntsc, ampy and amp-web upgraded on skeptic. Also got netevmon running on skeptic so we now have event detection running on the public AMP mesh.

While I was away, InfluxDB ran out of memory and died on prophet. Trying to catch up on the backlog of data kept causing InfluxDB to use ridiculous amounts of memory so I had to spend a decent chunk of my week chasing the cause down. At this point, my biggest wish is that someone will add sensible memory management to InfluxDB.

Did a bit of preliminary writing of a possible paper on NNTSC. Organised some of my thoughts on network measurement ecosystems and turned them into a blog post.