After speaking with Bernhard at the end of the year, I have been working on a presentation to be made at the next Machine Learning Group meeting.
The presentation is intended to elicit any ideas that members of the ML group might have regarding a solution to the problem.
It summarises the problem that my research attempts to address and the progress I have made so far in my investigation, especially items taken from my literature review.
The feedback I receive from this presentation will inform the direction that any development takes.
I finished up performance optimisations for the DPDK format for the time being. A libtrace application with no other load (tracestats with no filters) can capture almost 100% of 64-byte packets at 10Gbit/s on a single thread.
This is not possible via the parallel interface on a single thread, but is reached easily with 2 threads.
A summary of the last couple of weeks' work on DPDK, and the changes that have made the biggest impact:
* Rewriting the DPDK format to use batch reads; this has been written into the parallel interface and makes a huge difference.
* Rewrites of the libtrace parallel pipeline to avoid duplicating work.
* Introducing pauses after failed reads helps to reduce load on memory and allows for faster capture rates.
* By default, CPU cores on the same NUMA node as the PCI card are now preferred, which helped reach 100% capture.
* Time stamping is now done per batch, with packet spacing assumed to be at line rate. This greatly reduces the number of gettimeofday/clock_gettime calls, which were slow.
* Simplified the layout of the DPDK packet by moving our additional header to be straight after the DPDK header rather than prepended to the start of the packet.
* Surprisingly, NUMA configuration of the m_buf memory seemed to make no difference, along with the number of memory channels that you tell DPDK about. I'm not sure why.
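The per-batch timestamping above can be sketched as follows. This is a hypothetical illustration (the function and variable names are mine, not libtrace's): one clock read per batch, with each subsequent packet offset by its line-rate spacing on the wire.

```python
# Hypothetical sketch of per-batch timestamping (names are illustrative,
# not libtrace's actual API).
import time

def wire_time_ns(frame_len, link_bps=10_000_000_000):
    """Nanoseconds one frame occupies on the wire, including the
    8-byte preamble and 12-byte inter-frame gap."""
    return (frame_len + 20) * 8 * 1_000_000_000 // link_bps

def timestamp_batch(frame_lens):
    """One slow clock call per batch instead of per packet; packets
    after the first are offset by their assumed line-rate spacing."""
    base = time.monotonic_ns()   # the single clock read for this batch
    stamps, offset = [], 0
    for length in frame_lens:
        stamps.append(base + offset)
        offset += wire_time_ns(length)
    return stamps

stamps = timestamp_batch([64] * 4)
# Back-to-back 64-byte frames at 10Gbit/s are 67ns apart on the wire.
assert stamps[1] - stamps[0] == 67
```

The trade-off is accuracy for speed: the spacing is only an estimate, but the clock call count drops from one per packet to one per batch.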
This report is for the last week of last year.
I spent the week tracking down the bug in TPACKET_V3. This involved tracing the code path through net/packet/af_packet.c and also figuring out how the poll function works.
Poll works by first checking whether any of the requested events have already happened. If they have, poll returns immediately. If not, poll puts the calling task to sleep (for at most the specified poll timeout). To break the sleep, something needs to signal the kernel that an event has occurred; the kernel then wakes up poll and it rescans for events.
In TPACKET_V3, poll wasn't alerted when a block expired, meaning that, while waiting for data to be received, all of the blocks could time out, leaving no space for the next packet to be received. The received packet would be dropped, and the code would then wake up and clear all of the blocks. If you sent packets slowly enough, you would drop all of them.
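The wait/wake cycle described above can be illustrated in miniature with a pipe instead of a packet socket: poll first scans for ready events, sleeps if there are none, and only wakes early if something signals an event on the watched descriptor.

```python
# Minimal illustration of the poll wait/wake cycle, using a pipe rather
# than an AF_PACKET socket (which would need root).
import os, select, threading, time

r, w = os.pipe()
p = select.poll()
p.register(r, select.POLLIN)

# No data yet: poll sleeps for the full 50ms timeout and returns nothing.
# This mirrors a TPACKET_V3 block expiring without anyone waking poll.
assert p.poll(50) == []

# A write from another thread is the "event" that wakes the sleeper.
threading.Thread(target=lambda: (time.sleep(0.05),
                                 os.write(w, b"x"))).start()
events = p.poll(5000)            # sleeps until the write wakes it
assert events and events[0][1] & select.POLLIN
os.close(r); os.close(w)
```

The bug was precisely the missing wakeup: the kernel retired a block (an event the reader cared about) without signalling it, so the sleeper stayed asleep.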
I made the patch and submitted it to David Miller, who then forwarded it to Linus Torvalds. WAND helped me out a lot with actually submitting the patch, as it was rejected several times due to formatting problems. It was accepted, and can now be seen in the kernel: https://github.com/torvalds/linux/commit/da413eec729dae5dcb150e2eb34c5e7...
Started work on a program to help manage signing certificate requests
from amplet clients, similar to the puppet CA. Got most of the required
behaviour implemented - listing outstanding requests, signing them, and
revoking signed certificates. There is also still some thinking to be
done around how the amplet clients deal with revocation (how best to do
OCSP or similar) and how to reissue certificates that have expired.
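The request lifecycle the tool manages can be modelled roughly as below. This is a toy sketch, not the amplet code: the class, method names, and "certificate" strings are all illustrative.

```python
# Toy model of the CSR lifecycle: list outstanding requests, sign them,
# revoke signed certificates. All names here are hypothetical.
class ToyCA:
    def __init__(self):
        self.pending = {}     # client name -> certificate request
        self.signed = {}      # client name -> issued certificate
        self.revoked = set()  # revoked client names (a CRL of sorts)

    def submit(self, name, csr):
        self.pending[name] = csr

    def outstanding(self):
        """The requests awaiting an operator's decision."""
        return sorted(self.pending)

    def sign(self, name):
        csr = self.pending.pop(name)          # no longer outstanding
        self.signed[name] = f"cert-for:{csr}" # stand-in for real signing
        return self.signed[name]

    def revoke(self, name):
        self.signed.pop(name)                 # certificate is withdrawn
        self.revoked.add(name)

ca = ToyCA()
ca.submit("amplet1", "csr1")
assert ca.outstanding() == ["amplet1"]
ca.sign("amplet1")
ca.revoke("amplet1")
assert "amplet1" in ca.revoked and not ca.outstanding()
```

The open questions in the paragraph above sit at the last step: how clients learn about membership of the revoked set (OCSP or similar) and how an expired entry gets back into the pending queue.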
Short final week for the year, as I had to take a couple of days of leave.
Finished fixing the highlighting of segments on the AS traceroute graph. I ended up going with a borderless approach, as it was very difficult to get the border drawing right in a number of cases. Instead, the highlighted segment becomes slightly brighter, which has much the same effect.
Added AS names to both the AS traceroute and monitor map graphs. These come from querying the Team Cymru whois server via its netcat interface and are heavily cached, so we shouldn't have to make too many queries. The monitor map has also been updated to use the same colour to draw nodes that belong to the same AS.
Migrated the last of the old libtrace trac wiki pages over to our GitHub wiki.
Kept working through improving the SSL code to exchange keys and operate
the control socket on the amplet client. It now validates the
Diffie-Hellman parameters before using them (aborting if they are not up
to scratch), and I disabled compression to avoid another known attack
vector. Validated that the options I was setting were set, and the
protocols/ciphers I enabled were in fact the only ones being used. Spent
some time refactoring the code to be cleaner and easier to follow and
also added more logging to help make it obvious what was going on at
each step where something could fail.
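The hardening steps above can be sketched with Python's ssl module (the amplet client itself is C against OpenSSL, but the options map across directly); the cipher string is an illustrative choice, not the client's actual configuration.

```python
# Sketch of the SSL hardening described above, using Python's ssl module
# as a stand-in for the C/OpenSSL calls.
import ssl

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)

# Disable TLS compression to close off the CRIME-style attack vector.
ctx.options |= ssl.OP_NO_COMPRESSION

# Restrict the protocol floor and the cipher list (example values).
ctx.minimum_version = ssl.TLSVersion.TLSv1_2
ctx.set_ciphers("ECDHE+AESGCM")

# Verify the options actually took effect rather than trusting that the
# calls succeeded -- the same check described in the paragraph above.
assert ctx.options & ssl.OP_NO_COMPRESSION
names = [c["name"] for c in ctx.get_ciphers()]
assert names and not any("RC4" in n or "3DES" in n for n in names)
```

Reading the configuration back after setting it is cheap insurance: a typo in a cipher string can silently leave a context weaker than intended.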
Found a test that was failing when building packages for 32-bit Debian
and spent some time fixing it. Some constants I was using to test edge
cases in scheduling were too large, overflowing the variables and giving
incorrect results. Forcing them all to be the expected size fixed that,
and the tests all pass.
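The failure mode can be shown in a few lines; the constant below is hypothetical, not one of the actual test values, but the mechanism is the same: a value that fits comfortably in 64 bits wraps to a negative number when stored in a 32-bit variable.

```python
# Illustration of the 32-bit overflow described above, using a made-up
# scheduling constant (not the actual test values).
import ctypes

big_offset = 3_000_000_000              # fine as a 64-bit value
as_32bit = ctypes.c_int32(big_offset).value
assert as_32bit != big_offset           # wrapped on 32-bit platforms
assert as_32bit < 0                     # the test saw garbage

# Forcing the expected width avoids the wrap.
as_64bit = ctypes.c_int64(big_offset).value
assert as_64bit == big_offset
```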
Finished updating the AMP latency event ground truth to include our new detectors. Generated some fresh probabilities for use in Meena's DS code. Also generated some probabilities based on the magnitude of the change in latency for an event so that we are more likely to recognise a large change as significant even if only one or two detectors fire for the event.
Updated the tooltips on the amp-web graphs to show the timestamp and value for the portion of the graph that the mouse is hovering over.
Started looking into fixing the bad border drawing on the AS path graphs, which would result in borders being drawn between segments that should be combined.
I did some benchmarking of TPACKET_V2 and TPACKET_V3 and found that TPACKET_V3 performs much the same as TPACKET_V2, but with considerably lower CPU usage (due to processing blocks rather than individual packets). However, with really low traffic rates, TPACKET_V3 was dropping packets. This can be replicated by sending just two packets to each: TPACKET_V2 sees them, but TPACKET_V3 never does.
This is a kernel bug! TPACKET_V3 has an issue where, if the kernel marks a block as timed out, it doesn't notify any of the poll watchers. Once the kernel has marked every block as timed out, the next packet to be received on the interface will be dropped (I'm not sure if this extends to multiple packets, but I think it does). I wasn't able to track this down during the week and plan to continue on Monday (this blog is written in retrospect).
Further work has been done on my thesis. I have written a section on suggested usage for the many-sources-to-many-destinations scenario. This is based on the cases we have available, which are few-to-many, and the inferred many-to-few case. The time frame for data collection is based on a comparison of published results with my own. This comparison suggests that there is a high rate of temporary change in Internet topology but a slower rate of permanent change, which makes a more workable rate of data collection seem feasible.
After removing DTLS, I could communicate with the CoAP server (a cc2538 device) via the Copper client on my host VM. I can get the uptime, and turn each LED on the device on and off.
I've been testing the er-rest-server and er-rest-client code provided by Contiki (which uses the CoAP protocol) on the devices, to understand what works with the cc2538 devices and to get a better understanding of the code. The aim is to have the devices wake up at regular intervals, publish their data to a CoAP server (on the 6lbr router), and then sleep after receiving an ACK to conserve power.
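For reference, the fixed 4-byte header that every CoAP message (including the er-rest exchanges above) starts with is small enough to sketch directly from RFC 7252; the message ID below is an arbitrary example value.

```python
# Sketch of the 4-byte CoAP header (RFC 7252): a confirmable (CON) GET
# with no token. The message ID is an arbitrary example.
import struct

VERSION, TYPE_CON, TKL = 1, 0, 0     # version 1, CON message, no token
CODE_GET = 0x01                      # class 0, detail 01 -> "0.01 GET"
message_id = 0x1234                  # used to match the ACK to the request

byte0 = (VERSION << 6) | (TYPE_CON << 4) | TKL
header = struct.pack("!BBH", byte0, CODE_GET, message_id)
assert header == b"\x40\x01\x12\x34"

# Decoding it back recovers the same fields.
b0, code, mid = struct.unpack("!BBH", header)
assert (b0 >> 6, code, mid) == (1, CODE_GET, 0x1234)
```

The CON type and message ID are what make the publish-then-sleep plan work: the device can stay awake just long enough for the ACK carrying the matching message ID to come back.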
Did a bit of research into the ContikiMAC radio duty-cycling driver, which allows the radio on a device to be switched off for more than 99% of the time whilst still being able to relay packets. This would be ideal for further limiting power consumption in the WSN.