Continued working on oflops this week. I've worked through the code fixing many issues reported by valgrind, such as correcting the shutdown order of threads so that memory is no longer freed by one thread while another is still accessing it, along with a number of other memory leaks.
I've also looked at the kernel module packet generator (pktgen), which can be used to produce packets instead of using userspace pcap, and as expected it can produce higher packet rates than the userspace approach. Pktgen uses a delay between packets for rate limiting and will insert the current timestamp into each packet. However, it was applying the delay after timestamping the packet, which offset the timestamp by roughly the length of the delay. The offset is not guaranteed to equal the delay either, since the delay's length varies depending on how long transmission and creating the next packet take. I've rebuilt the module with the ordering fixed. When I get the time I would like to submit this fix to the kernel.
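To make the ordering issue concrete, here's a small Python sketch with a fake clock (an illustration only, not the actual pktgen code): with the delay applied after timestamping, the recorded timestamp lags the actual transmit time by the full delay, while with the order fixed the two match.

```python
# Illustration (not actual pktgen code): a deterministic fake clock
# shows why the inter-packet delay must come *before* timestamping.

class FakeClock:
    """Deterministic clock, counting in microseconds."""
    def __init__(self):
        self.now = 0
    def sleep(self, us):
        self.now += us

DELAY_US = 100  # configured inter-packet delay

def send_buggy(clock):
    ts = clock.now          # timestamp taken first...
    clock.sleep(DELAY_US)   # ...then the rate-limiting delay
    tx_time = clock.now     # packet actually leaves here
    return tx_time - ts     # timestamp is stale by the delay

def send_fixed(clock):
    clock.sleep(DELAY_US)   # delay first
    ts = clock.now          # timestamp just before transmit
    tx_time = clock.now
    return tx_time - ts     # timestamp matches transmit time

assert send_buggy(FakeClock()) == DELAY_US  # stale by the full delay
assert send_fixed(FakeClock()) == 0         # accurate
```

In the real module the delay also varies with how long building and transmitting the previous packet took, which is why the offset wasn't even a constant you could correct for afterwards.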
More work on the dashboard this week:
* added the ability to remove "common" events from the recent event list and made the graphs collapsible.
* added a table that shows the most frequently occurring events in the past day, e.g. "increased latency from A to B (ipv4)".
* polished up some of the styling on the dashboard and moved the dashboard-specific CSS (of which there is now quite a lot) into its own separate file.
Started thinking about how to include loss-related events in the event groups, as these are ignored at the moment.
The new capture point came online on Wednesday, so the rest of my week was spent playing with the packet captures. This involved:
* learning to operate EndaceVision.
* installing wdcap on the vDAG VM.
* adding the ability to anonymise only the local network in wdcap.
* performing a short test capture.
* getting BSOD working again, which required the application of a little "in-flow" packet sampling to run smoothly.
* running libprotoident against the test capture to see what new rules I can add.
Defining the schema for the network layer:
- using RA with a DNS fallback instead of DHCP (stateless configuration is easier to implement and less resource-heavy)
- write up, for a given setup, what should happen when a new node connects to the gateway.
Moving on to the application layer:
- Californium is a Java CoAP API (I'm more familiar with Java)
- investigate what is possible with the API
- potentially look into a dummy client/server setup on a test machine (not the RPis)
A modification was made to the diamond detection part of Megatree: a minor upgrade to better find the convergence points of asymmetric load-balancer diamonds. In particular, a step was added to find the maximum probe TTL value for a given link node. Doing this seems to reduce the chance of mistaking a node further downstream from the convergence interface for the convergence point. The updated analysis is running at present, and the counts of convergence points found for a given divergence point will be compared with the results from before the modification.
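My reading of the new step, as a hedged Python sketch (the data layout, function names and example topology are my assumptions, not the actual Megatree code): record the maximum probe TTL at which each node was observed across the diamond's branches, then, among the nodes common to every branch and below the divergence point, pick the one seen earliest.

```python
# Hypothetical sketch of the convergence-finding step; not Megatree code.

def find_convergence(branches, divergence):
    """branches: one list of (node, ttl) observations per diamond branch.
    Returns the earliest node common to all branches that sits below
    the divergence point, judged by the *maximum* TTL seen per node."""
    max_ttl = {}
    for path in branches:
        for node, ttl in path:
            max_ttl[node] = max(max_ttl.get(node, 0), ttl)
    common = set.intersection(*(set(n for n, _ in p) for p in branches))
    candidates = [n for n in common if max_ttl[n] > max_ttl[divergence]]
    # Taking the maximum TTL per node is the step described above; it
    # biases the choice toward the true convergence point rather than
    # a node an asymmetric branch happens to reach at a small TTL.
    return min(candidates, key=lambda n: max_ttl[n])

branches = [
    [("A", 1), ("B", 2), ("D", 3), ("E", 4)],
    [("A", 1), ("C", 2), ("X", 3), ("D", 4), ("E", 5)],  # longer branch
]
assert find_convergence(branches, "A") == "D"
```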
I finished the initial checking of my thesis draft and so I am now hoping for some feedback to help finalise things.
I got AES going! 802.15.4 uses the CCM* block cipher mode with AES as the cipher (using 128-bit keys).
This was really difficult! Small changes in the input resulted in wild changes in the output (as you would expect!). I wrote a tool with libtomcrypt to create some reference data (something that I would then try to replicate with the CC2538 libraries). In the end, I managed to make this work.
I also got this going with Wireshark (which highlighted an endianness issue with my libtomcrypt tool). Wireshark will now accept my AES encrypted messages which gives me confidence that the implementation is correct.
The implementation is a bit hacked together at the moment (as I was hacking about to get it going), but tidying it up should be simple. Then I can move on to decryption.
As for the security, I have no idea how well this implementation would stand up to side-channel attacks such as power analysis. This is something I'm not even going to consider trying to defend against at this stage! In the future, however, it might be important.
Spent some time investigating unusual data to make sure it wasn't
occurring in the amplet tests. Monitoring of management connections
found one that was sharing a physical link with a test connection. Some
HTTP tests were having unusually long run times, which appear to be
caused by the server infrastructure and not our own DNS lookups or
Started testing a new version of the amplet client for deployment on the
NZ mesh. Ran into an issue with our large schedule files where a count
variable was too small and overflowing. Results were collected fine, but
most of them were being thrown away when reported. Split all report
messages into smaller chunks as a short term solution that doesn't
require updating the server side code (still aim to move to something
smarter like protocol buffers).
Made no useful progress on getting Chromium to fetch/modify headers
without crashing. There are newer versions I need to try, but they
require more recent versions of libraries than I have.
I've been working on getting an OpenFlow switch testing platform ready this week; the plan is to have this complete by the end of the month, ready to run some tests.
I've decided to use OFLOPS-turbo as a starting point and add support for the newer versions of OpenFlow. I've opted to replace the OpenFlow connection handling with libfluid-base, which will handle the OpenFlow handshake and echoes as required, regardless of the OpenFlow version in use. The modules can then use their library of choice, such as rofl or libfluid, to construct and parse OpenFlow messages. I've also been fixing other issues with the code, such as high CPU usage caused by busy-polling gettimeofday.
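The spirit of the gettimeofday fix, as a Python stand-in for the C code (the function names are mine): rather than spinning on the clock asking "is it time yet?", compute how long until the next scheduled event and block for that long; in the real code that would be a poll()/select() timeout rather than a sleep.

```python
# Busy-polling the clock vs. blocking until a deadline.
import time

def wait_busy(deadline):
    """Burns CPU: repeatedly polls the clock, like a tight
    gettimeofday() loop."""
    polls = 0
    while time.monotonic() < deadline:
        polls += 1
    return polls

def wait_blocking(deadline):
    """Blocks in the kernel until the deadline, like poll() with a
    computed timeout."""
    remaining = deadline - time.monotonic()
    if remaining > 0:
        time.sleep(remaining)
    return 1  # a single wakeup, no spinning

assert wait_busy(time.monotonic() + 0.005) >= 1   # many polls in 5 ms
deadline = time.monotonic() + 0.05
assert wait_blocking(deadline) == 1               # one wakeup
assert time.monotonic() >= deadline               # and never early
```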
I made good progress with the application. I have set up most of the back end, particularly the values to be used to query the database. I can request flow information for the network as a whole or on a per-device basis. I have also added the ability to assign a name to a MAC address. I was thinking of using the hostnames of the hosts, but that would be kind of hacky given only the MAC address, and reverse DNS would have to be set up on the local network.
Although sFlow was useful since it supports MAC addresses, its sampling isn't ideal for getting an accurate picture of the devices' behaviour on the network. I found a program called softflowd, which listens on an interface and can export NetFlow v9. It looks like NetFlow v9 is going to be the only protocol usable with my application, since it supports everything I require: in particular MAC addresses, direction information and application information. Currently I inspect the interface index to determine direction in my parser script, which means SNMP must be configured to assign these values, something that isn't likely on a home network. I hope to get port mirroring set up on a switch and use softflowd to construct and export NetFlow v9 packets to my collector. This won't affect the application itself because the database schema will be the same.
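For what the interface-index direction logic looks like in practice, here's a hedged Python sketch: the ifIndex values are assumptions about one particular router's configuration, and the two arguments mirror the input/output SNMP interface fields a NetFlow v9 record carries.

```python
# Classifying flow direction from NetFlow v9 input/output ifIndex
# fields, relative to an assumed LAN-facing interface.

LAN_IFINDEX = 2   # assumed ifIndex of the LAN-facing interface
WAN_IFINDEX = 3   # assumed ifIndex of the WAN-facing interface

def flow_direction(input_if, output_if):
    if input_if == LAN_IFINDEX and output_if == WAN_IFINDEX:
        return "outbound"
    if input_if == WAN_IFINDEX and output_if == LAN_IFINDEX:
        return "inbound"
    return "local/unknown"

assert flow_direction(2, 3) == "outbound"
assert flow_direction(3, 2) == "inbound"
assert flow_direction(2, 2) == "local/unknown"
```

This is exactly the part that breaks on a home network: unless SNMP assigns stable, known indices, the two constants above have nothing reliable to be set to, which is why port mirroring plus softflowd is the more portable plan.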
Continued testing and tweaking the event grouping in netevmon. My main problem was the creation of seemingly duplicate groups in cases where further (usually out-of-order) detections were observed for events that were members of an already expired group. Eventually I tracked the problem down to the event being deleted when its group expired, so a later detection re-created both the event and its parent group.
Started looking into methods for determining whether an event is "common" or "rare", so that we can allow users to filter events that occur regularly from the dashboard. This meant I had to change our method of populating the dashboard -- previously, we just grabbed the appropriate events in the pyramid view and passed them into the dashboard template, but now we need to be able to dynamically update the dashboard depending on whether the common events are being filtered or not.
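One simple way to draw the common/rare line, sketched in Python (an illustration of the idea, not the netevmon implementation; the threshold and event signatures are assumptions): count how often each event signature has occurred over a recent window and compare against a threshold.

```python
# Frequency-threshold classification of events as "common" or "rare".
from collections import Counter

COMMON_THRESHOLD = 5  # assumed: >= 5 occurrences per window == common

def classify(history, signature):
    """history: list of event signatures seen in the window."""
    counts = Counter(history)
    return "common" if counts[signature] >= COMMON_THRESHOLD else "rare"

history = ["latency A->B"] * 7 + ["path change C->D"] * 2
assert classify(history, "latency A->B") == "common"
assert classify(history, "path change C->D") == "rare"
```

Whatever the final method, it has to run at query time, which is the reason the dashboard now fetches events dynamically instead of having them baked into the template.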
Added some nice little icons to each event group to show what type of events are present within the group without having to click on the group. The current icons show latency increases, latency decreases and path changes.
Have fully integrated the new packet scheduler. This works great! I have the default number of link-layer retransmissions enabled (3) and this gives great reliability to my ping demo. I also added some debug GPIO so I knew what was going on when and things line up well enough (4 - 5ms of error between the coordinator and device slot timers ain't bad!).
I'm now looking into the AES-CCM*-based security supplicant. I already had a supplicant that accepted messages to be encrypted/decrypted, but it didn't actually do any transformation on the data. The CC2538 has an AES core built in: you pipe data in one end and encrypted data comes out the other. TI provides a library for this, so I'm in the process of wiring it up. This is actually fairly simple once I figured out how the CCM mode operates (and understood the parameters L, M, a, m and c!). I expect that once I get encryption working, decryption will be trivial (you simply input the ciphertext rather than the plaintext and run the same algorithm on the data).
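For reference, the CCM parameter relationships that took me a while to untangle, per RFC 3610 (a worked note, not the CC2538 code): L is the byte width of the message-length field, M is the MIC size, the nonce fills the remaining 15 - L bytes of the 16-byte counter block, and a, m and c are the additional authenticated data, the message and the ciphertext respectively.

```python
# CCM parameter arithmetic from RFC 3610.

def ccm_nonce_len(L):
    """One flags byte + L length bytes + the nonce must fill a 16-byte
    block, so the nonce gets 15 - L bytes."""
    assert 2 <= L <= 8, "RFC 3610 allows L in [2, 8]"
    return 15 - L

# 802.15.4 uses L = 2 (frames well under 2^16 bytes): a 13-byte nonce.
assert ccm_nonce_len(2) == 13

# M (the MIC length) must be an even value from 4 to 16 in plain CCM;
# CCM* additionally allows M = 0 for encryption-only frames.
VALID_M = {4, 6, 8, 10, 12, 14, 16}
assert 8 in VALID_M
```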
I'm pretty pleased with progress to date!