Blogs

23 Oct 2017

Last week I carried on reading through the papers that I had found on error detection and recovery in SDN, and wrote quick summaries of all the papers I read. At the end of the week, I started looking at traffic engineering and how it can tie into error detection and recovery. On Thursday I attended the compulsory postgrad workshop.

This week I want to quickly tidy up the summaries that I have so far. I also want to look a bit more into traffic engineering and perhaps other potential enhancements that can be applied to error recovery and detection. I would also like to try and run a few experiments in Mininet with some of the presented systems in the papers I have looked at. This will hopefully help me to gain a better understanding of what has been done so far.

17 Oct 2017

Helped Jayden with polishing up the final version of his Honours report. Hopefully he is happy with the final result!

Started testing the initial prototype of the DAG multicaster on our development boxes. Had a few issues getting dpdk pktgen to do exactly what I wanted (not helped by the terrible documentation!) but eventually managed to happily capture 10Gb of small packets split across 4 DAG streams with no real issues. Next step is to start encapsulating and multicasting some nDAG records.

Went to the STRATUS forum on Friday, flying down to Wellington on Thursday afternoon. Forum seemed to go pretty well; plenty of people that I spoke to thought that our work so far was interesting.

Released a new version of libprotoident.

09 Oct 2017

Libprotoident 2.0.12 has been released.

This release is mostly a protocol database update. We've added 26 new protocols and updated a further 33 others since the last release.

We've also added a new category for IP Camera protocols. Some already existing protocols have been moved into this category to better reflect their purpose.

The full list of updated protocols can be found in the libprotoident ChangeLog.

Download libprotoident 2.0.12 here!

09 Oct 2017

Started working on the DAG multicaster for STARDUST. Designed an encapsulation protocol for the multicaster and wrote some prototype code using the libdag API to start grabbing bunches of records and give them to the as yet unimplemented multicaster to encapsulate and send.

Spent some time reading over Jayden's honours report and gave him some (hopefully useful) feedback. The work he has done this year is really interesting; just needs a bit of literary polish so that his markers can fully appreciate it :)

Continued slowly working towards a libprotoident release. The code itself looks ready to go, so I just need to prepare the release announcements. I've updated my paper to include the extra 20 or so protocols that I've added since I started writing up the results -- the paper now covers 435 application protocols.

03 Oct 2017

Spent a decent portion of my week working on my reworked cluster evaluation code for STRATUS. The new version seems to be producing labels that are much more useful, so my ability to evaluate clusters and identify the least conforming members has improved greatly.

Continued to tweak and improve the libprotoident rules. Started working towards a possible 2.0.12 release by updating documentation and running some basic build tests on various operating systems.

02 Oct 2017

This week I finished the route entry C implementation and ran several tests to make sure everything is working as expected. I then started integrating the new C modules with the Python code and tested the combination to make sure everything is working correctly. At the end of the week, I re-ran the performance tests and collected performance stats to compare the new implementation to the old Python modules.

28 Sep 2017

This week I finished implementing and testing the new prefix module written in C (as a new Python type). After some preliminary testing, it seems that implementing the module in C offers a significant increase in performance and also a decrease in memory usage. Prefixes, however, use far less memory than route entry objects. I then started implementing the route entry module in C as well, to further improve performance and hopefully resolve the memory issues we are having.

28 Sep 2017

Rewrote the prefix filtering code to use a radix trie rather than the naive list I started with. It was an interesting challenge to allow for a range of prefix matches that might not even be close to the prefix specified in the rule itself. The new filtering is approximately 300 times faster than the old code and uses much less memory.
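To illustrate the idea of matching a range of prefix lengths against a trie of rules, here is a minimal sketch. It is not the router's actual code: it uses a plain (uncompressed) binary trie rather than a true radix trie, and the `PrefixTrie` class, its method names, and the `min_len`/`max_len` rule parameters are all assumptions made for the example.

```python
import ipaddress


def _new_node():
    return {"children": [None, None], "rules": []}


class PrefixTrie:
    """Hypothetical binary trie keyed on the bits of each rule's prefix."""

    def __init__(self):
        # Separate roots so IPv4 and IPv6 rules never share nodes.
        self.roots = {4: _new_node(), 6: _new_node()}

    @staticmethod
    def _bits(net):
        # Yield the first `prefixlen` bits of the network address, MSB first.
        value = int(net.network_address)
        width = net.max_prefixlen
        for i in range(net.prefixlen):
            yield (value >> (width - 1 - i)) & 1

    def add_rule(self, prefix, min_len=None, max_len=None):
        # A rule covers routes inside `prefix` whose length is in
        # [min_len, max_len]; both default to an exact-length match.
        net = ipaddress.ip_network(prefix)
        min_len = net.prefixlen if min_len is None else min_len
        max_len = net.prefixlen if max_len is None else max_len
        node = self.roots[net.version]
        for bit in self._bits(net):
            if node["children"][bit] is None:
                node["children"][bit] = _new_node()
            node = node["children"][bit]
        node["rules"].append((min_len, max_len))

    @staticmethod
    def _hit(node, length):
        return any(lo <= length <= hi for lo, hi in node["rules"])

    def matches(self, prefix):
        # Walk the route's bits; every node passed is a rule prefix that
        # contains the route, so only the length range needs checking.
        net = ipaddress.ip_network(prefix)
        node = self.roots[net.version]
        if self._hit(node, net.prefixlen):
            return True
        for bit in self._bits(net):
            node = node["children"][bit]
            if node is None:
                return False
            if self._hit(node, net.prefixlen):
                return True
        return False


trie = PrefixTrie()
trie.add_rule("10.0.0.0/8", min_len=16, max_len=24)
print(trie.matches("10.1.2.0/24"), trie.matches("10.0.0.0/8"))
```

A lookup costs at most one step per bit of the route's prefix regardless of how many rules are loaded, which is where the gain over scanning a list comes from.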

Changed the format of messages passed between peer and table processes in the BGP router to allow tables to also talk to other tables. This allows filters to be applied in stages, or export routes to peers at different stages. If desired, work specific to groups of peers can all be performed at once by a single table rather than multiple times by each individual peer.

Started to write up some basic design documentation describing the current state of the BGP router, what it is capable of, and how it all fits together.

28 Sep 2017

Spent some time chasing down issues in my BIRD configuration in my BGP resiliency testbed that meant routes were being shared inappropriately between peers (missing filters, which wouldn't have been a problem except I'm also messing with settings allowing the local AS to appear). Added further peers and edge devices in different configurations to make sure that they are all properly isolated. Everything looks to be working pretty well, and enabling/disabling specific peering sessions causes the appropriate route updates. Adding a second controller to the test correctly keeps the best routes available even when one of them is unavailable.

Noticed that as the number of peers increased, the number of full route recalculations was getting large, so tried to remove some extraneous causes of updates being sent. Often we already had enough information in a peer process to do the work without asking the table to do work as well (and possibly triggering it to send unnecessary updates to other peers). Also added a very short dampening period to updates so that many consecutive messages in a short time period cause only a single route recalculation.
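The dampening idea can be sketched roughly as follows. This is only an illustration under assumptions: the `DampenedRecalc` class, its `notify`/`poll` methods, and the default delay are hypothetical names and values, not taken from the router. The first message in a burst starts a short timer, later messages are folded into the same pending batch, and one recalculation runs once the timer expires.

```python
import time


class DampenedRecalc:
    """Hypothetical helper that coalesces bursts of updates into one recalculation."""

    def __init__(self, recalc, delay=0.05, clock=time.monotonic):
        self.recalc = recalc      # callback invoked with the batched prefixes
        self.delay = delay        # dampening period in seconds (assumed value)
        self.clock = clock        # injectable clock, so the logic is testable
        self.deadline = None
        self.pending = set()

    def notify(self, prefix):
        # Record the change; only the first message of a burst arms the timer,
        # so a steady stream of updates cannot postpone the work forever.
        self.pending.add(prefix)
        if self.deadline is None:
            self.deadline = self.clock() + self.delay

    def poll(self):
        # Called from the event loop; runs at most one recalculation per burst.
        if self.deadline is None or self.clock() < self.deadline:
            return False
        batch, self.pending, self.deadline = self.pending, set(), None
        self.recalc(batch)
        return True
```

Arming the timer on the first message rather than resetting it on every message bounds the extra latency to one dampening period.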

Fixed a few bugs that would allow saved routes to be modified by filters, meaning the next time these routes ran through filters the results would be cumulative. Hopefully the saved raw/original routes and filtered routes for distribution are now quite separate.
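One common way to avoid this class of bug is to run filters on a copy of the saved route, never on the saved route itself. The sketch below assumes a hypothetical `Route` dataclass, `apply_filters` helper, and example filter; none of these names come from the router's code.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Route:
    # Hypothetical route record with a couple of mutable attributes.
    prefix: str
    local_pref: int = 100
    communities: set = field(default_factory=set)


def bump_and_tag(route):
    # Example filter that mutates the route it is given.
    route.local_pref += 10
    route.communities.add((65000, 1))  # made-up community value
    return route


def apply_filters(saved, filters):
    # Work on a deep copy so the saved original is never touched,
    # and repeated filter runs cannot accumulate changes.
    working = copy.deepcopy(saved)
    for f in filters:
        working = f(working)
        if working is None:  # a filter may reject the route outright
            return None
    return working
```

Running `apply_filters` twice on the same saved route gives the same result both times, because each run starts from the untouched original.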

28 Sep 2017

Had a very interesting chat with Perry about the BGP router project, and how I was going about trying to make it more resilient. He suggested a few ways to go about it that were much simpler than what I was planning, and that also did away with the nastiness of shared state between the redundant controllers. Each controller can independently do its own thing and use BGP route selection on the managed devices to settle any differences that arise.

Started setting up a test environment so that I can trial the changes made to help make the BGP router more resilient to failures. It's currently a simple network using Docker containers running BIRD to act as my peers/routers, with another couple running redundant instances of my code. Most of the work so far has gone into getting my edge devices running BIRD to do the right thing with the routes, importing and exporting using the correct tables to make sure they don't get inadvertently modified or shared at the wrong location.

Updated the router to better track which peers are in an active state, and to add communities to exported routes when peers are missing in order to flag the degraded state to the recipient.
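As a rough sketch of the flagging step, assuming a hypothetical `export_communities` helper and a made-up `DEGRADED` community value (the post does not say which communities are actually used):

```python
DEGRADED = (65000, 999)  # hypothetical community marking a degraded state


def export_communities(base, expected_peers, active_peers, degraded=DEGRADED):
    # Return the communities to attach to an exported route, adding the
    # degraded marker whenever an expected peer is not currently active.
    tagged = set(base)
    if set(expected_peers) - set(active_peers):
        tagged.add(degraded)
    return tagged
```

The recipient can then match on that community in its own import policy, for example to lower the preference of routes learned while the sender is degraded.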