I made changes to the RheaFastpath code to reduce the number of rules installed and I've started working on IPv6 route conversion to OpenFlow rules. So far, I've established connectivity between an IPv6 host on the OF switch and its mapped port on the Rhea VS (dp0). I've encountered a bug where IPv6 routes are not being dropped in the code; I'll be fixing that this week and then getting started with mapping received routes to installed OpenFlow rules.
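As a rough illustration of the route conversion, here is a minimal Python sketch that turns an IPv6 route into a rule. It is not the Rhea code: the `route_to_rule` helper, the dict layout and the `base_priority` value are all assumptions made for the example. The only fixed facts are the IPv6 EtherType (0x86DD) and the usual trick of giving longer prefixes a higher priority so that the switch emulates longest-prefix match.

```python
import ipaddress

ETH_TYPE_IPV6 = 0x86DD  # EtherType carried by IPv6 frames


def route_to_rule(prefix, out_port, base_priority=1000):
    """Build a hypothetical OpenFlow-style rule (a plain dict) for one route.

    The dict layout is illustrative only; a real controller would build
    a flow-mod with its own library.
    """
    net = ipaddress.IPv6Network(prefix, strict=False)
    return {
        # Longer prefixes win, which emulates longest-prefix matching.
        "priority": base_priority + net.prefixlen,
        "match": {
            "eth_type": ETH_TYPE_IPV6,
            # Destination as (address, mask), as a masked match would need.
            "ipv6_dst": (str(net.network_address), str(net.netmask)),
        },
        "actions": [("output", out_port)],
    }


rules = [
    route_to_rule("2001:db8::/32", out_port=1),
    route_to_rule("2001:db8:1::/48", out_port=2),
]
```

With these two routes the /48 rule ends up with the higher priority, so traffic to 2001:db8:1::/48 leaves on port 2 while the rest of 2001:db8::/32 leaves on port 1.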
Marked the 513 libtrace assignments. Some students performed very well and I was glad to see that the investigative task proved to be very doable.
Started working on adding the ability to filter events and event groups on the amp-web dashboard. Most of my effort so far has been in producing a mock-up of the interface, which I showed to Nathan and Chris on Thursday afternoon. Started replacing some hard-coded filtering settings with a dynamic template that uses user preferences stored in a database on Friday.
Fixed a few little netevmon issues that cropped up when trying to restart netevmon on prophet prior to starting work on the dashboard filtering, mostly in relation to ensuring that the 'purge event database' option works sensibly.
Short week due to Easter holidays.
Went through the protocol buffer descriptions for all of the AMP test formats and wrote some short documentation about the fields, and some sample code to fetch, unpack and understand test result messages.
While working on the documentation I found a few instances of test options that were inconsistent or didn't quite behave correctly. In particular I made the TCP ping size parameter behave the same as the ICMP one (total packet size rather than payload size), and made sure that setting just the UDP payload size in the DNS test would correctly add an EDNS header.
I got fastpath working on Rhea last week and also modified my code so that interfaces added to the virtual switch are one end of a VETH pair created during the mapping process, rather than OVS internal interfaces that are created and added. This allows packets hitting those interfaces on the virtual switch (dp0) to be visible to the Linux kernel. I also had a discussion with Richard Sanger regarding the fastpath implementation on Rhea. He suggested that the rules being installed were too specific and would increase the number of rules installed on the switches. This may have unintended consequences, as hardware switches have limited capacity to process OpenFlow rules.
I will be making changes to the fastpath code to fix this during the week, and will then proceed to add the conversion of IPv6 routes to rules.
Started writing up a short paper on the unexpected traffic analysis I've been doing for the past few weeks. Made decent progress -- I've got a mostly complete draft, just missing a conclusion and an abstract.
Spent a decent chunk of Thursday dealing with the fallout from upgrading influxdb to 0.11 on prophet. This broke most of our existing rollup tables, as the data type that we were now inserting (int) was no longer compatible with the data type that we apparently used to insert (float). Compounding matters was influxdb's lack of visibility into what data types are associated with any given column. Ended up trashing and re-creating the database (somewhat by accident) which fixed the problem, but not an ideal solution if we ever roll this out in production.
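To illustrate the type clash: in InfluxDB's line protocol an integer field value carries a trailing `i`, while a bare number is stored as a float, and a field keeps whichever type it was first written with. The sketch below is a hypothetical helper, not our insertion code; the `to_line` name and the `rollup_latency` measurement are made up, and it skips tags and escaping. It forces every numeric field to a single type before writing.

```python
def to_line(measurement, fields, timestamp_ns, as_float=True):
    """Format one point in line protocol with a consistent numeric type.

    Assumes every field value is numeric and that names need no escaping.
    """
    parts = []
    for key in sorted(fields):
        value = fields[key]
        if as_float:
            # A bare number such as 12.0 is written as a float field.
            parts.append(f"{key}={float(value)}")
        else:
            # A trailing 'i' marks the value as an integer field.
            parts.append(f"{key}={int(value)}i")
    return f"{measurement} {','.join(parts)} {timestamp_ns}"


line = to_line("rollup_latency", {"mean": 12, "count": 3}, 1460000000000000000)
```

Here `line` is `rollup_latency count=3.0,mean=12.0 1460000000000000000`, so a Python `int` can no longer slip into a field that was originally created as a float.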
513 assignment was due at 5pm on Friday, so dealt with a few final queries from students. 20 submissions in the end, so a bit of marking to do next week.
I don't have much to report this week as I haven't been able to get fastpath working properly. I have fastpath labels properly applied and I've been able to install rules for the fastpath ports, but ARP packets sent to the controller by all switches are not delivered properly. Debugging continues...
Updated some of the signals used with the amplet client to provide
better management - as well as being able to reload configuration from
disk, it can now force a refetch of remote schedule files with a SIGUSR2.
Also made sure that all children (tests, servers, etc) have their
signals unblocked and the signal handler restored to the default.
Libwandevent sets all these in the main process, which was being
propagated to the children and causing some unexpected behaviour. The
init scripts now try to kill the entire process group of the amplet
client, which means children should now get the signal too.
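The child-side cleanup can be sketched as follows. This is a Python illustration of the general technique (restore default handlers and clear the inherited signal mask right after forking), not the amplet client's own code; the helper names and the list of signals are assumptions.

```python
import os
import signal

# Signals the parent is assumed to handle or block in this sketch.
MANAGED = (signal.SIGINT, signal.SIGTERM, signal.SIGHUP,
           signal.SIGUSR1, signal.SIGUSR2)


def reset_child_signals():
    """Restore default handlers and unblock every signal in the child."""
    for sig in MANAGED:
        signal.signal(sig, signal.SIG_DFL)
    # An empty mask means nothing inherited from the parent stays blocked.
    signal.pthread_sigmask(signal.SIG_SETMASK, set())


def spawn_child(work):
    """Fork, clean up signal state in the child, run work(), then exit."""
    pid = os.fork()
    if pid == 0:
        try:
            reset_child_signals()
            work()
        finally:
            os._exit(0)
    return pid


def signal_group(pid, sig=signal.SIGTERM):
    """Send sig to the whole process group of pid, as an init script might."""
    os.killpg(os.getpgid(pid), sig)
```

Signalling the group only reaches children that are still in the parent's process group, which is the default after a fork.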
Renamed server processes in ps so that it was obvious what task they
were performing.
Refactored some more of the repeated server code out of the
udpstream/throughput tests so they are now a lot cleaner. Moved some of
the test server control message code around so that it was grouped
together in a sensible place.
Continued making progress with my unidentified mice flows in libprotoident. Added a whole pile of new rules, mostly for various Chinese apps again. Have probably done enough now that I can draw a line under this and start writing the paper itself; there are a few obvious patterns that I would like to identify but this has consumed a lot of time already.
Answered a handful of questions from 513 students -- mostly intelligent ones, so I'm reasonably confident about how the class is going overall. Due date is this coming Friday, so we'll know for sure soon enough.
I don't have much to report this week. I've been working on FastPath for Rhea and I hope to get it completed before next week.
Spent some time updating unit tests to work properly with the new
watchdog and control API. Improved checks to make sure that only valid
control messages are being parsed. Other small fixes to make sure that
errors are caught and reported properly.
Started refactoring the test control connections to use an SSL BIO so
that exactly the same code paths can be used to read and write control
messages whether SSL is in use (amplet, standalone tests) or not
(standalone tests), which has removed/simplified a lot of code. Also
figured out how to properly do non-blocking IO when the BIO functions
behave differently to normal read/write.
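The non-blocking difference can be shown with a small sketch. It uses Python's ssl module instead of the OpenSSL BIO calls, so it illustrates the same idea and not the actual amplet code; `try_read` and the two sentinels are hypothetical names. A plain non-blocking socket that has no data raises `BlockingIOError`, whereas an SSL socket may ask to be retried once the underlying socket is readable or, less obviously, writable.

```python
import ssl

WANT_READ = "want_read"    # retry once the socket is readable
WANT_WRITE = "want_write"  # retry once the socket is writable


def try_read(conn, nbytes):
    """Attempt a single read on a non-blocking plain or SSL socket.

    Returns the bytes read (b"" means the peer closed the connection),
    or one of the sentinels above when the caller should wait and retry.
    """
    try:
        return conn.recv(nbytes)
    except (ssl.SSLWantReadError, BlockingIOError):
        return WANT_READ
    except ssl.SSLWantWriteError:
        # An SSL read can need to write first, e.g. during a handshake.
        return WANT_WRITE
```

The caller would then wait with select or poll on whichever direction was requested before calling `try_read` again, so one code path serves both kinds of connection.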
Went with Shane to visit Lightwire on Thursday and had a discussion
about how we can make event detection, measurements, graphs etc work
better for them.