Christopher Lorier's blog
Thinking about the logistics of polling paths made me realise that if I want to have paths in place to fail over to, I am going to want to create those paths using as few openflow rules as possible. This can be done using two rules per destination per switch, but discovering these rules is complicated.
So I have been having fun with graph theory: proving that it is always possible to achieve this with only 2 rules per switch, then trying to find an algorithm that does this tidily. I think my current algorithm is polynomial, though probably not a very nice order of polynomial, and it's a bit messy so I am not entirely confident of that fact.
I also started to get worried today that it won't actually find a result at all in some cases. This is all kinda peripheral to my project, so I don't really want to spend too much time on it. But it is interesting and I can probably spin a chapter out of it.
This week I tried to help Joe track down the OVS stats bug, but didn't really make any progress with that.
So then I started planning a hello-based system for fault detection, and how best to fit it into the RouteFlow architecture. I can push a lot of it pretty close to rfproxy, which hopefully will mean that if the switch can implement its own fault detection, like BFD for instance, then there won't need to be much change to RFServer.
This also means that the switches can respond to problems slightly quicker. This has caused me to slightly rethink the architecture I was using for the stats-poller-based fault detection, and I think moving more of that to rfproxy might be a good idea as well.
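A minimal sketch of the hello-based idea above, in Python. Everything here is an assumption for illustration: the `HelloMonitor` name, the 1 second hello interval, and the rule that a link is declared failed after three missed hellos are not taken from RouteFlow or rfproxy.

```python
from dataclasses import dataclass, field


@dataclass
class HelloMonitor:
    """Tracks the last hello seen per link and flags links that go quiet."""

    hello_interval: float = 1.0  # assumed seconds between hellos
    dead_multiplier: int = 3     # assumed missed hellos before a link is failed
    last_seen: dict = field(default_factory=dict)

    def record_hello(self, link, now):
        """Note that a hello arrived on `link` at time `now` (seconds)."""
        self.last_seen[link] = now

    def failed_links(self, now):
        """Return links whose most recent hello is older than the dead time."""
        timeout = self.hello_interval * self.dead_multiplier
        return sorted(
            link for link, seen in self.last_seen.items() if now - seen > timeout
        )


monitor = HelloMonitor()
monitor.record_hello(("s1", "s2"), now=0.0)
monitor.record_hello(("s2", "s3"), now=0.0)
monitor.record_hello(("s1", "s2"), now=3.0)  # only s1-s2 keeps sending hellos
print(monitor.failed_links(now=3.5))
```

Keeping the timer state this small is what would make it plausible to run close to the switch, and a switch with native detection such as BFD could replace `record_hello` with its own failure signal.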
Dean Pemberton has written a different routeflow LSP plugin, which in some ways is a bit nicer than mine. It allows for arbitrarily assigning ports to rfvms and creates paths based on egress ports rather than datapaths. In his system he sets the source and destination MAC addresses for the packet once as it enters the system and then just forwards on labels, so I have looked a bit at using MAC addresses as LSP labels. It's nice because it is such a massive address space. It does feel like something that could go horribly wrong though.
I'm currently merging this with my multi-table stuff to use that as the basis for the fault detection.
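To make the MAC-as-label idea concrete, here is a small Python sketch of one possible encoding. The scheme is an assumption, not the one used in Dean's plugin: the first octet is fixed at `0x02` (locally administered, unicast) so the labels cannot collide with vendor-assigned addresses, and the remaining 40 bits carry the label. The function names are hypothetical.

```python
LOCAL_ADMIN_PREFIX = 0x02  # locally administered bit set, multicast bit clear
LABEL_BITS = 40            # bits left over after the fixed first octet


def label_to_mac(label: int) -> str:
    """Encode an integer label as a locally administered unicast MAC address."""
    if not 0 <= label < 2 ** LABEL_BITS:
        raise ValueError("label does not fit in 40 bits")
    value = (LOCAL_ADMIN_PREFIX << LABEL_BITS) | label
    # Emit the six octets from most significant to least significant.
    return ":".join(f"{(value >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


def mac_to_label(mac: str) -> int:
    """Recover the label from a MAC address produced by label_to_mac."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6 or octets[0] != LOCAL_ADMIN_PREFIX:
        raise ValueError("not a label-carrying MAC address")
    label = 0
    for octet in octets[1:]:
        label = (label << 8) | octet
    return label


print(label_to_mac(0x1234))
print(mac_to_label(label_to_mac(0x1234)))
```

Even with one octet reserved, 40 bits is far more room than the 20 bits of an MPLS label, which is the appeal. The "horribly wrong" risk is anything in the path that learns or filters on MAC addresses.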
Sam Russel solved my vlan problem for me. The issue was I was using vlan 1 and the pronto treats it as native. I tried my tests again using vlan 2 and everything worked perfectly.
I have, however, found a bigger issue with the pronto. The stats counters are extremely inaccurate, far worse than the problems I have been having with openvswitch.
Speaking of openvswitch, Joe has been trying to fix the counter inaccuracy for me, so I have been trying to help with that. I'm starting to come to grips with how the stats reporting is performed in OVS, but I still haven't got any ideas about what is causing the problem with the counters.
On Monday this week we set up the pronto so that I can examine its foibles in practice.
It turns out there are a few of them. Multiple tables is a wash: it simply doesn't seem to work.
I am currently testing what I can do with vlans. The documentation says vlan stripping doesn't work; however, as far as I can tell my packets are leaving the switch with no vlan tags. I figure this is something I am doing wrong with my test, because it really makes no sense to me at all otherwise, but I can't track the problem down.
I also thought a bit about what to do when you discover packet loss. I can definitely identify whether packet loss is caused by congestion or more mysterious means provided the level of congestion is not too much more significant than the amount of mysterious packet loss.
I got the congestion packet loss to work by separating the ovs bridges with veth links and using tc to limit the throughput. As expected you see a small amount of loss across the paths.
In the process I have found that the issue with occasional packets being miscounted by ovs in the change from one rule to another is fairly significant. It turns out to be a fraction of a percent of all packets, but that still means thousands of packets getting miscounted. On the other hand, that level of inaccuracy will likely be buried under the packets lost due to standard tcp behaviour. You won't be able to see the exact levels of packet loss that you would expect, however.
My next plan is to get the pica8 set up so I can test what openflow functions exist in picos that aren't in ovs. Supposedly MPLS and groups work, which is fairly important for what I am doing. I'm not sure I am entirely confident about this though.
I started looking more deeply at the approach to failure recovery that I had been planning on implementing, and for a couple of reasons it may not be appropriate for my purposes. Firstly, it's very computationally expensive, which wouldn't be the worst thing in the world, but secondly, it requires a lot of configuration. Since that is exactly what I am trying to avoid, I may need to use a much simpler system. On the upside, that hopefully will mean way less computation.
I also continued with testing things. Polling once per second works fine. Polling any more often than that does not work, however, as packet counts are only updated once per second. I also tried to introduce packet loss due to congestion within a virtual network, but the packet loss only seems to occur as packets leave the network. My impression is this is because when you have multiple ovs bridges on a single machine they still function as a single bridge. So the congestion I am generating within the switch doesn't actually cause it to drop packets until it tries to output from the switch to one of the hosts.
So over the break I was trying to fix ovs, but after finally talking to the guy who wrote the ovs mpls branches this week, I am now giving up on that.
So instead I have the polling working with vlan tags and unique flows for each pair of nodes. It is currently just printing out values for packets sent and received, but it is counting them correctly and not losing any packets.
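As a rough illustration of the comparison described above, the Python sketch below takes two successive polls of cumulative counters and works out how many packets went missing per node pair in that interval. The data layout (a dict keyed by `(src, dst)` holding `(sent, received)` totals) and the function name are assumptions, not the format the poller actually uses.

```python
def loss_between_polls(previous, current):
    """Return packets lost per (src, dst) pair between two counter polls.

    Both arguments map (src, dst) -> (sent_packets, received_packets), where
    the values are cumulative counters read from the ingress and egress flows.
    """
    lost = {}
    for pair, (sent_now, received_now) in current.items():
        # A pair seen for the first time is treated as starting from zero.
        sent_before, received_before = previous.get(pair, (0, 0))
        sent = sent_now - sent_before
        received = received_now - received_before
        lost[pair] = sent - received
    return lost


first_poll = {("s1", "s3"): (1000, 1000), ("s2", "s3"): (500, 500)}
second_poll = {("s1", "s3"): (2000, 1990), ("s2", "s3"): (900, 900)}
print(loss_between_polls(first_poll, second_poll))
```

Packets still in flight when the counters are read can make a single interval look slightly lossy (or even negative), so in practice the differences would probably need to be summed over several polls before drawing conclusions.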
So then I started reading a few papers on passive monitoring techniques to focus on how they tested them. They've actually been fairly interesting. A couple use very similar techniques to mine.
I tracked down the ovs bug. I have got it doing what I wanted it to, but it is currently failing ovs test suite tests to do with bfd and lacp for whatever reason. These are taking quite a long time to run, but I will double check that they aren't also affecting the branch of ovs I am using without my changes, and then have a go at running it with routeflow. Hopefully I can get all this sorted, then start the new year with my routeflow path polling all set up and ready to do some tests on.
Haven't been super productive lately. I am still digging away at openvswitch, as well as reading things relating to what I am going to do if I actually discover packet loss (that is, packet loss not caused by problems in experimental branches of openvswitch).
I spent the week dealing with the disappearing packets. My initial attempts to recreate the problem in a simpler setup kept resulting in kernel panics. To try to expedite the process of diagnosing the kernel panics we set up a virtual machine. This didn't particularly help with diagnosis, however, as it fixed the problem immediately.
The mystery of the disappearing packets seems to be related to recirculating packets. That is, when you add or remove an mpls label the packet is recirculated to update the header information.
When I push an mpls label and send the packet to another table, if the flow on that table attempts to push another label the packet will be dropped instead. The flow count for the second flow is incremented but its actions don't seem to occur.
This only seems to happen with pushing labels. Other actions, like updating the mpls label fields (ttl or label), don't seem to cause this.
There are some other bizarre outcomes I am coming across. Popping mpls labels seems to be popping the innermost label: I end up with packets with one mpls tag, without the bottom of stack bit set and with the wrong label, arriving at the hosts.
Also in some cases the flow counters are not getting updated, which could be a big problem for me.
All of this also is occurring in what appears to me at least to be a delightfully non-deterministic fashion.
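For checking symptoms like the single tag arriving without its bottom of stack bit, a small decoder for the MPLS label stack can help when looking at captured packets. The Python sketch below relies only on the standard 32-bit entry layout (20-bit label, 3-bit traffic class, 1-bit bottom of stack flag, 8-bit TTL); the function and class names are hypothetical, and it assumes the bytes passed in start at the first label stack entry.

```python
import struct
from typing import NamedTuple


class MplsEntry(NamedTuple):
    label: int
    traffic_class: int
    bottom_of_stack: bool
    ttl: int


def pack_mpls_entry(label, traffic_class, bottom_of_stack, ttl):
    """Build one 4-byte label stack entry in network byte order."""
    word = (label << 12) | (traffic_class << 9) | (int(bottom_of_stack) << 8) | ttl
    return struct.pack("!I", word)


def parse_mpls_stack(data: bytes):
    """Decode entries until the bottom of stack bit is seen or data runs out."""
    entries = []
    for offset in range(0, len(data) - 3, 4):
        (word,) = struct.unpack_from("!I", data, offset)
        entry = MplsEntry(word >> 12, (word >> 9) & 0x7, bool((word >> 8) & 1), word & 0xFF)
        entries.append(entry)
        if entry.bottom_of_stack:
            break
    return entries


def stack_is_well_formed(entries):
    """A valid stack is non-empty and ends with the bottom of stack bit set."""
    return bool(entries) and entries[-1].bottom_of_stack


good = pack_mpls_entry(100, 0, False, 64) + pack_mpls_entry(200, 0, True, 64)
bad = pack_mpls_entry(100, 0, False, 64)  # outer label left behind on its own
print(stack_is_well_formed(parse_mpls_stack(good)))
print(stack_is_well_formed(parse_mpls_stack(bad)))
```

The `bad` case mirrors what would be left if a pop removed the inner entry of a two-label stack instead of the outer one.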
I emailed the guy who maintains the branch I am using, but haven't heard back yet. Hopefully he can shed some light on things.
But, in general, things are not going as well as they might be just at the moment.