Short final week for the year, as I had to take a couple of days of leave.
Finished fixing the highlighting of segments on the AS traceroute graph. I ended up going with a borderless approach, as it was very difficult to get the border drawing right in a number of cases. Instead, the highlighted segment becomes slightly brighter, which has much the same effect.
Added AS names to both the AS traceroute and monitor map graphs. These come from querying the Team Cymru whois server via its netcat interface and are heavily cached, so we shouldn't have to make too many queries. The monitor map has also been updated to use the same colour to draw nodes that belong to the same AS.
Migrated the last of the old libtrace trac wiki pages over to our GitHub wiki.
Kept working through improving the SSL code to exchange keys and operate
the control socket on the amplet client. It now validates the
Diffie-Hellman parameters before using them (aborting if they are not up
to scratch), and I disabled compression to avoid another known attack
vector. Validated that the options I was setting were set, and the
protocols/ciphers I enabled were in fact the only ones being used. Spent
some time refactoring the code to be cleaner and easier to follow and
also added more logging to help make it obvious what was going on at
each step where something could fail.
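The idea of setting options and then reading them back can be sketched with Python's standard ssl module. This is a stand-in for illustration only: the amplet client code is not shown here, the cipher string is an assumed example, and the Diffie-Hellman parameter validation is not covered.

```python
import ssl


def make_control_context(cipher_string="ECDHE+AESGCM"):
    """Build a server-side TLS context with a restricted configuration."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocols
    ctx.options |= ssl.OP_NO_COMPRESSION          # disable TLS compression
    ctx.set_ciphers(cipher_string)                # raises ssl.SSLError if nothing matches
    return ctx


def check_context(ctx):
    """Read the settings back rather than trusting that they were applied."""
    problems = []
    if not ctx.options & ssl.OP_NO_COMPRESSION:
        problems.append("compression still enabled")
    if ctx.minimum_version < ssl.TLSVersion.TLSv1_2:
        problems.append("old protocol versions still allowed")
    if not ctx.get_ciphers():
        problems.append("no ciphers enabled")
    return problems


if __name__ == "__main__":
    context = make_control_context()
    print(check_context(context))
    # List what is actually enabled so it can be compared with what was asked for.
    for cipher in context.get_ciphers():
        print(cipher["protocol"], cipher["name"])
```

Printing the enabled ciphers matters because some libraries keep extra suites enabled (for example the TLS 1.3 ones) regardless of the cipher string.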
Found a test that was failing when building packages for 32-bit
Debian and spent some time trying to fix it. Some constants I was using
to test edge cases in scheduling were too large and overflowed the
variables holding them, giving incorrect results. Forcing them all to be the expected
size fixed that and the tests all pass.
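A small illustration of this kind of failure, using ctypes to mimic fixed-width variables. The constant is hypothetical (the real test values are not given here); it is only chosen to need more than 32 bits.

```python
import ctypes

# Hypothetical edge-case constant: a schedule offset in milliseconds.
LARGE_OFFSET_MS = 50 * 24 * 60 * 60 * 1000  # 50 days = 4,320,000,000 ms


def store_as_int32(value):
    """Mimic assignment to a 32-bit signed variable (wraps silently)."""
    return ctypes.c_int32(value).value


def store_as_int64(value):
    """Mimic assignment to a variable forced to be 64 bits wide."""
    return ctypes.c_int64(value).value


if __name__ == "__main__":
    print(store_as_int32(LARGE_OFFSET_MS))  # wrapped value, not the constant
    print(store_as_int64(LARGE_OFFSET_MS))  # the constant, unchanged
```

The 32-bit version gives no error, just a different number, which is why the symptom was a test producing wrong results rather than a crash.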
Finished updating the AMP latency event ground truth to include our new detectors. Generated some fresh probabilities for use in Meena's DS code. Also generated some probabilities based on the magnitude of the change in latency for an event so that we are more likely to recognise a large change as significant even if only one or two detectors fire for the event.
Updated the tooltips on the amp-web graphs to show the timestamp and value for the portion of the graph that the mouse is hovering over.
Started looking into fixing the bad border drawing on the AS path graphs, which would result in borders being drawn between segments that should be combined.
I did some benchmarking of TPACKETV2 and TPACKETV3 and found that TPV3 performs much the same as TPV2 but with considerably lower CPU usage (due to processing blocks rather than individual packets). However, with really low traffic rates TPV3 was dropping packets. This can be replicated by sending just two packets to both TPV3 and TPV2: TPV3 never sees them.
This is a kernel bug! TPACKETV3 has an issue where if the kernel marks a block as timed out, it doesn't notify any of the poll watchers. If the kernel marks every block as timed out, the next packet to be received on the interface will be dropped (I'm not sure if this extends to multiple packets, but I think it does). I wasn't able to track this down during the week and plan to continue it on Monday (this blog is written in retrospect).
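The sending side of the two-packet reproduction could look something like the sketch below. The use of UDP, the payload and the destination are all assumptions; the TPACKETV2/TPACKETV3 capture programs that would be listening on the receiving interface are not shown.

```python
import socket


def send_packets(dest, count=2, payload=b"tpv3-test"):
    """Send `count` small UDP datagrams to `dest`; return how many were sent."""
    sent = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for _ in range(count):
            sock.sendto(payload, dest)
            sent += 1
    return sent


if __name__ == "__main__":
    # Hypothetical destination: replace with a host reached via the
    # interface being captured on.
    send_packets(("192.0.2.1", 9))
```

With no further traffic after the two packets, nothing else arrives to nudge the capture along, which is the situation where the problem showed up.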
Further work has been done on my thesis. I have written a section on suggested usage for the many sources to many destinations scenario. This is based on the cases that we have available which are few to many and the inferred many to few case. The time frame for data collection is based on comparison of published results with my results. This comparison suggests that there is a high rate of temporary change of Internet topology but that there is a slower rate of permanent change. This makes a more workable rate of data collection seem feasible.
By removing DTLS I could communicate with the CoAP server (cc2538 device) via the Copper client on my host VM. I can get the uptime, and turn each LED on the device on and off.
I've been testing the er-rest-server and er-rest-client code provided by Contiki (which uses the CoAP protocol) on the devices to understand what works with the cc2538 devices and to get a better understanding of the code. The aim is to have the devices wake up at regular intervals, publish their data to a CoAP server (on the 6lbr router), then sleep after receiving an ACK to conserve power.
Did a bit of research into the ContikiMAC driver, which allows the radio on the device to be switched off for more than 99% of the time whilst still being able to relay packets. This would be ideal to further limit power consumption in the WSN.
Spent the last 2 weeks implementing and testing event grouping and writing those events to a Postgres database on Prophet. Shane finished collecting ground truth and generating an updated set of probabilities including the newest detectors (BinSegChangepoint and SeriesModeDetector), so I worked on calculating the accuracy of the new probabilities. He also calculated a set of probabilities for using the magnitude as another source of input for Dempster-Shafer: the probability that a magnitude from a certain bin (e.g. 0-10, 11-20, 21-30, etc.) is significant, a FP or undecided. I added those values to the DS calculations and ensured that the magnitude probabilities would only be used as a source of input once, i.e. the DS score is calculated for the detectors that had fired at the time, and then the magnitude is combined with the DS score.
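A rough sketch of that calculation using Dempster's rule of combination over the three outcomes (significant, FP, undecided), with "undecided" treated as mass on the whole frame. This is the textbook rule, not necessarily how the project code is written, and every probability below is made up for illustration.

```python
from functools import reduce


def combine(m1, m2):
    """Dempster's rule for masses given as (significant, fp, undecided)."""
    s1, f1, u1 = m1
    s2, f2, u2 = m2
    conflict = s1 * f2 + f1 * s2  # mass where the two sources disagree
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    norm = 1.0 - conflict
    sig = (s1 * s2 + s1 * u2 + u1 * s2) / norm
    fp = (f1 * f2 + f1 * u2 + u1 * f2) / norm
    und = (u1 * u2) / norm
    return (sig, fp, und)


def score_event(detector_masses, magnitude_mass=None):
    """Combine the detectors that fired, then fold in the magnitude once."""
    vacuous = (0.0, 0.0, 1.0)  # "know nothing" starting point
    score = reduce(combine, detector_masses, vacuous)
    if magnitude_mass is not None:
        score = combine(score, magnitude_mass)
    return score


if __name__ == "__main__":
    # Hypothetical masses for two detectors and one magnitude bin.
    detectors = [(0.6, 0.1, 0.3), (0.7, 0.1, 0.2)]
    magnitude = (0.5, 0.2, 0.3)
    print(score_event(detectors, magnitude))
```

Keeping the magnitude as a separate final step means that recomputing the score when another detector fires cannot count the magnitude twice.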
Because of previous experience with indexing by streams, we decided to test the use of a separate table for each stream. I implemented the creation of new tables as new streams are encountered and the adding of events to each stream's table, and started working on updating the appropriate event after a detection occurs. Testing was quite sluggish, so will need to bug Brad about running Postgres on spectre instead of prophet.
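A minimal sketch of the table-per-stream idea. It uses the standard library's sqlite3 instead of Postgres so that it runs anywhere, and the table naming scheme and columns are hypothetical rather than the real event schema.

```python
import sqlite3


def table_for(stream_id):
    """Table name for a stream; int() keeps arbitrary text out of the SQL."""
    return "events_stream_%d" % int(stream_id)


def add_event(conn, stream_id, ts, description):
    """Create the stream's table on first use, then insert the event."""
    table = table_for(stream_id)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS %s ("
        "id INTEGER PRIMARY KEY, ts INTEGER NOT NULL, description TEXT)" % table
    )
    cur = conn.execute(
        "INSERT INTO %s (ts, description) VALUES (?, ?)" % table,
        (ts, description),
    )
    return cur.lastrowid


def update_event(conn, stream_id, event_id, description):
    """Update an existing event after a later detection."""
    conn.execute(
        "UPDATE %s SET description = ? WHERE id = ?" % table_for(stream_id),
        (description, event_id),
    )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    event_id = add_event(conn, 7, 1418000000, "latency increase")
    update_event(conn, 7, event_id, "latency increase (2 detectors)")
    print(conn.execute("SELECT * FROM events_stream_7").fetchall())
```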
I have concluded the hard investigation for my literature review, and am now focusing on including more details in my summaries and general polish to the writing.
I am also investigating data stream mining after having a conversation with Sam Sarjent about the topic. I have arranged a meeting with Bernhard Pfahringer to discuss this area of research further, and see if it can be used for this project.
This week I shall conclude writing the literature review, and continue investigating data stream mining techniques for flow classification. Hopefully by the end of the week I shall be able to make an informed decision about which type of flow classifier would be practical and accurate enough to build an implementation of.
Continued work on the DPDK format this week. I ran into a few more issues with perf crashing the kernel, so upgraded one machine to jessie with the 3.16 kernel. So far this has been working well, and it appears the developers have also added support for more CPU counters.
I've removed some extra header information from the dpdk header which mainly existed for rt capabilities. A discussion with Shane and Dan resulted in the proposal that rt should be removed from libtrace itself and become supported via a libtrace application. For now rt might be broken and rt support will be reworked once formats are working efficiently.
Most remaining performance issues with the dpdk format code appear to be simply due to running too many instructions per packet in the libtrace library, rather than an excessive number of cache/branch/TLB misses etc. I will continue to look at reducing this. For now, the easiest place to optimise would be to reduce the number of calls to clock_gettime() to one per batch or less and essentially create fake timestamps. What we are currently doing (calling clock_gettime() in a loop for a batch) is not any better. Of course, 1Gbit cards that support hardware timestamping of every packet will be able to get a more accurate timestamp; sadly no Intel 10Gbit cards support this currently.
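One way the "one clock read per batch" idea could work is sketched below. The even spacing between the previous batch's clock reading and the current one is an assumption of this sketch (no particular scheme has been settled on), and the function names are hypothetical.

```python
import time


def read_clock():
    """One real clock read; intended to be called once per batch."""
    return time.clock_gettime(time.CLOCK_REALTIME)


def stamp_batch(batch_size, prev_ts, now_ts):
    """Spread `batch_size` fake timestamps evenly over (prev_ts, now_ts]."""
    if batch_size <= 0:
        return []
    step = (now_ts - prev_ts) / batch_size
    return [prev_ts + step * (i + 1) for i in range(batch_size)]


if __name__ == "__main__":
    prev = read_clock()
    # ... a batch of 32 packets is received here ...
    now = read_clock()
    stamps = stamp_batch(32, prev, now)
    print(stamps[0], stamps[-1])
```

The last packet in each batch gets the real clock value and the rest are interpolated, so timestamps stay monotonic across batches while the per-packet cost drops to a multiply and an add.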
The event and time based simulator of Doubletree and Traceroute was extended to cover sources windows of 1, 2, 10 and 100. These all performed well right down to 1. A sources window of one is different to no sources window because each trace is processed one by one, whereas the latter processes them all at the same time. It should also be noted that there is a delay before control information is sent, making these two situations more different from each other in terms of total traffic.
Further work was carried out on the related work section of my thesis. This included load balancer turnover and structure as well as Doubletree and black hole analysis.