This study investigates the traffic destined for hosts that had been previously active but had not sent any packets for some time (so are presumably now inactive) to see whether ISP customers were receiving, and therefore being charged for, significant quantities of traffic when they were not actively using their Internet connection.
We also look at the applications that were responsible for any such traffic, as this may indicate why customers were receiving the traffic even though they were no longer actively using their connection. Finally, we examine traffic sent to IP addresses that were never active so that we can distinguish between regular unsolicited background traffic and the traffic patterns that result from previous customer activity.
Rather disrupted week, only in for three days total.
Draft version of the new sessions paper is nearly finished. Thankfully, running the old analysis against new traces has produced similar results so I can "borrow" most of the text from an old rejected paper on outbound session analysis.
Checked the results of my sleeper analysis using the longer idle time threshold. Again, not much change to the overall results but I can feel more comfortable with the distribution of idle period lengths now. Have processed the 2009 and 2011 datasets using the new threshold.
Created some anonymised versions of the ISP 2009 traces for Asad. In the process I found a weird libtrace threaded I/O bug where the last block of compressed data won't be written out before the file is closed under very specific circumstances. This one is going to be a pain to track down...
Received a new version of NAVL from Vineyard, but unfortunately there is still a problem with double entries in the internal flow cache. I've created a NAVL-only version of the program I've been using and sent that off to them along with a small sample trace that should replicate the problem.
Got some good news in that our ATNAC paper has been recommended for publication in Telecommunication Systems. However, we need at least 40% new content on top of what we've already got and it needs to be ready by Jan 22. Richard suggested we chuck in the work I did measuring outbound TCP and UDP sessions for the SPNAT study, so I started running the analysis against some more recent traces and changing the introductory material to talk about outbound sessions as well as inbound.
Got my degradation graphs looking the way I wanted them to, but a bit of extra analysis revealed that I may have set my sleeper threshold too low. Most of the "sleeping" periods were only just longer than the original threshold of 5 minutes. I've repeated some of the earlier analysis with a threshold of 30 minutes to see how much of a difference that makes.
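To make the effect of the threshold concrete, here is a minimal Python sketch (not the actual analysis code) that extracts idle periods from the times at which a host sent packets. The function name `idle_periods` and the sample timestamps are invented for illustration; only the 5 and 30 minute thresholds come from the entry above.

```python
from typing import List, Tuple

def idle_periods(send_times: List[float], threshold: float) -> List[Tuple[float, float]]:
    """Return (start, end) gaps between consecutive sent packets
    that are strictly longer than `threshold` seconds."""
    times = sorted(send_times)
    gaps = []
    for prev, cur in zip(times, times[1:]):
        if cur - prev > threshold:
            gaps.append((prev, cur))
    return gaps

# Hypothetical timestamps (in seconds) of packets sent by one host.
sends = [0, 60, 400, 420, 2500, 2520]

short = idle_periods(sends, 5 * 60)    # original 5 minute threshold
long_ = idle_periods(sends, 30 * 60)   # revised 30 minute threshold
print(short)
print(long_)
```

With the made-up data, the 340 second gap counts as a "sleeping" period under the 5 minute threshold but disappears under the 30 minute one, which is the kind of borderline period described above.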
Continued looking into properties of sleeper traffic, primarily the rate at which sleeper traffic quantities degrade as the host continues to be idle. This has proved a bit tricky to visualise well, but finally managed to come up with what I think should be a useful graphing approach. This did require a lot of battling with R, though.
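One possible way to prepare data for a degradation graph is to bin inbound bytes by how long the host had been idle when each packet arrived. The Python sketch below is only an illustration of that idea under assumed inputs; it is not the R code or the graphing approach referred to above, and `bytes_by_idle_age`, the 60 second bin width and the sample packets are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

def bytes_by_idle_age(last_sent: float,
                      inbound: Iterable[Tuple[float, int]],
                      bin_width: float = 60.0) -> Dict[int, int]:
    """Sum inbound bytes per bin of 'seconds since the host last sent a packet'.

    `inbound` holds (timestamp, size_in_bytes) pairs. Packets that arrived
    before `last_sent` are ignored because the host was not yet idle.
    """
    bins: Dict[int, int] = defaultdict(int)
    for ts, size in inbound:
        age = ts - last_sent
        if age < 0:
            continue
        bins[int(age // bin_width)] += size
    return dict(sorted(bins.items()))

# Hypothetical host that last sent a packet at t=1000s.
packets = [(1010, 500), (1050, 300), (1070, 200), (1200, 100), (990, 999)]
profile = bytes_by_idle_age(1000, packets)
print(profile)
```

Plotting the per-bin totals against idle age would show whether the volume of sleeper traffic falls away as the idle period lengthens.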
The fixed version of NAVL was not available last week, but I was able to continue looking at cases where PACE was able to identify traffic that libprotoident could not. Brad set me up with a Windows VM so that I could download various apps and capture their traffic while using them. This lets me confirm PACE's classifications and add or update libprotoident's rules so that we can match the traffic as well. This meant I got to have a bit of fun playing Second Life and hanging out in chatrooms...
Started moving towards a new release of libprotoident, seeing as I've now added or updated the rules for quite a few protocols.
Started collating the results of my analysis of dark and sleeper traffic in the ISP traces. It's not finished yet, but the results I have so far can be viewed at http://www.wand.net.nz/~salcock/sleepers/
CCR rejected my libprotoident paper, primarily due to a reviewer stating that we had not compared against the "state of the art" described in a paper from 2006 (http://www-rp.lip6.fr/site_npa/site_rp/_publications/737-conextFinal.pdf). This particular technique requires no packet payload, but is only able to identify 10 different TCP application protocols (although I can supposedly create new models for other TCP applications).
I tested the default models against some ISP traffic and found that the technique performed much better than I had expected, but was still less accurate than the weakest of the OSS DPI techniques. Its failure rate (in terms of misclassified bytes) was 24%, compared with 4.5% for libprotoident.
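For clarity on what a failure rate "in terms of misclassified bytes" means, here is a small Python sketch of a byte-weighted error rate. It assumes each flow has a known true protocol, a classifier's guess and a byte count; the function name and the sample flows are hypothetical and are not the data behind the 24% and 4.5% figures.

```python
from typing import Iterable, Tuple

def misclassified_byte_rate(flows: Iterable[Tuple[str, str, int]]) -> float:
    """Fraction of all bytes that belong to flows whose guessed protocol
    differs from the true protocol. Each flow is (truth, guess, bytes)."""
    flows = list(flows)
    total = sum(size for _, _, size in flows)
    if total == 0:
        return 0.0
    wrong = sum(size for truth, guess, size in flows if truth != guess)
    return wrong / total

# Hypothetical flows: (true protocol, classifier's guess, bytes carried).
sample = [
    ("HTTP", "HTTP", 700),
    ("BitTorrent", "HTTP", 200),  # misclassified
    ("DNS", "DNS", 100),
]
rate = misclassified_byte_rate(sample)
print(f"{rate:.1%} of bytes misclassified")
```

Weighting by bytes rather than by flow count means that one large misclassified flow matters more than many small ones.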
Started integrating Vineyard's NAVL library into my traffic classification evaluation tool. Started out OK, but ran into a problem: I cannot force NAVL to expire its internal entries for UDP flows once I have decided that a flow has ended. If the same 5-tuple reappears later, NAVL returns an error when I try to create a new NAVL connection for that flow, because NAVL believes the flow already exists. I've filed a support request, so hopefully I'll get some sort of solution in the next day or two.
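The 5-tuple reuse problem can be reproduced with a toy flow table. The Python sketch below is a stand-in written purely for illustration: `FlowCache`, `FiveTuple`, `create` and `expire` are hypothetical names and say nothing about how the real NAVL API looks or behaves internally.

```python
from typing import Dict, NamedTuple

class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int  # IP protocol number, 17 = UDP

class FlowCache:
    """Toy flow table keyed by 5-tuple (hypothetical, not the NAVL API)."""

    def __init__(self) -> None:
        self._flows: Dict[FiveTuple, int] = {}
        self._next_id = 0

    def create(self, key: FiveTuple) -> int:
        # Refuse to create a flow whose 5-tuple is still in the table.
        if key in self._flows:
            raise KeyError(f"flow already exists: {key}")
        self._flows[key] = self._next_id
        self._next_id += 1
        return self._flows[key]

    def expire(self, key: FiveTuple) -> None:
        # Explicit expiry: the operation the caller needs to be able to force.
        self._flows.pop(key, None)

key = FiveTuple("10.0.0.1", "192.0.2.53", 40000, 53, 17)
cache = FlowCache()
first = cache.create(key)

try:
    cache.create(key)           # same 5-tuple reappears before expiry
    duplicate_rejected = False
except KeyError:
    duplicate_rejected = True

cache.expire(key)               # caller decides the UDP flow has ended
second = cache.create(key)      # now the 5-tuple can be reused
```

Because UDP has no teardown handshake, only the caller's own timeout can decide that a flow is finished, which is why an explicit expiry hook matters here.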
Continued integrating Simon's OSPF code into libtrace.
Started looking into the traffic sent to "sleeper" hosts, i.e. IP addresses that have been active but are now inactive. Still putting together the initial results, but there is definitely a difference between the traffic observed heading to "dark" hosts vs the traffic observed heading to sleepers.
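The distinction between dark, sleeper and active destinations can be sketched in a few lines of Python. This is an assumed formulation rather than the code used for the analysis: `classify_destination` and the idle threshold parameter are hypothetical, with the threshold standing for whatever period of silence is taken to mean the host has gone inactive.

```python
from typing import Optional

def classify_destination(last_sent: Optional[float],
                         now: float,
                         idle_threshold: float) -> str:
    """Label the destination of an inbound packet seen at time `now`.

    `last_sent` is when the destination address last sent a packet itself,
    or None if it has never been seen sending anything.
    """
    if last_sent is None:
        return "dark"       # never active
    if now - last_sent > idle_threshold:
        return "sleeper"    # previously active, now idle
    return "active"

THRESHOLD = 300.0  # seconds; an assumed value for this example

labels = [
    classify_destination(None, 1000.0, THRESHOLD),
    classify_destination(100.0, 1000.0, THRESHOLD),
    classify_destination(900.0, 1000.0, THRESHOLD),
]
print(labels)
```

Keeping dark and sleeper destinations in separate categories is what allows the two traffic profiles to be compared.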
During the sleeper analysis, I've been able to improve a few of the libprotoident rules to correctly match more of the traffic I've been looking at.
Began integrating Simon's OSPF parsing code into libtrace. Been slightly trickier than I had anticipated due to major differences between OSPFv2 (which Simon's code parses) and OSPFv3 (which we may want to parse in future).
Had a brief phone meeting with Vineyard Networks. They've agreed to give us access to their NAVL library for evaluation.