Christopher Lorier's blog

10 Sep 2013

I've been writing the proposal and thinking about how things will scale.

To poll both ends of a path I can add an extra flow on each switch specifying the ingress switch, so that when a packet leaves the system it is counted with the other packets that entered the fabric at that switch. This will require tables and stacked MPLS labels (three layers of MPLS), though it could probably be made to work with two.
This way I can poll both ends of a path, but in between I am aggregating paths, since the alternative is to create a number of path flows on each switch that is quadratic in the number of switches in the fabric. The aggregation is going to complicate accurately locating problems just by polling counters.
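
To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. The counting model is my assumption, not something measured: in the worst case every ordered (ingress, egress) pair needs its own flow on a switch, whereas with aggregation a switch needs one forwarding flow per egress switch plus one counting flow per ingress switch. The function names are made up for illustration.

```python
def per_pair_flows(n_switches: int) -> int:
    """Worst-case flows on one switch if every ordered (ingress, egress)
    pair of switches gets its own path flow (assumed model)."""
    return n_switches * (n_switches - 1)


def aggregated_flows(n_switches: int) -> int:
    """Flows on one switch if paths are aggregated in the middle:
    one forwarding flow per other egress switch, plus one counting
    flow per other ingress switch (assumed model)."""
    return (n_switches - 1) + (n_switches - 1)


if __name__ == "__main__":
    for n in (5, 10, 50):
        print(n, per_pair_flows(n), aggregated_flows(n))
```

Under these assumptions the per-pair approach grows quadratically while the aggregated one grows linearly, which is the reason for giving up per-path counters in the middle of the fabric.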

03 Sep 2013

Since doing my presentation I have done a bit more reading and have just started writing the proposal.

So the idea is to build fault detection into the distributed router used for Cardigan, looking at packet counts hitting various flows/ports and injecting packets to determine when there is a problem. I have to look at how to make it quick to react without over-reporting or overwhelming the controller.

Ideally this would use OpenFlow groups, which have failover mechanisms; however, as far as I am aware nobody has implemented these yet.
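
A minimal sketch of the counter-comparison idea, to make the "quick but not over-reporting" tension concrete. Everything here is an assumption for illustration: the `PathMonitor` class, the 5% loss threshold and the rule of needing three bad polls in a row are invented, not part of Cardigan or RouteFlow.

```python
from dataclasses import dataclass


@dataclass
class PathMonitor:
    """Compares packet counters polled at the two ends of a path."""
    loss_threshold: float = 0.05  # hypothetical: tolerated loss fraction per poll
    required_strikes: int = 3     # hypothetical: bad polls in a row before reporting
    _last_in: int = 0
    _last_out: int = 0
    _strikes: int = 0

    def update(self, in_count: int, out_count: int) -> bool:
        """Feed the latest cumulative counters; return True if a fault
        should be reported to the controller."""
        sent = in_count - self._last_in
        received = out_count - self._last_out
        self._last_in, self._last_out = in_count, out_count

        if sent <= 0:
            # No traffic entered this interval, so the counters say nothing.
            # This is where injected probe packets would be needed.
            return self._strikes >= self.required_strikes

        loss = (sent - received) / sent
        if loss > self.loss_threshold:
            self._strikes += 1
        else:
            self._strikes = 0
        return self._strikes >= self.required_strikes


if __name__ == "__main__":
    monitor = PathMonitor()
    for counters in [(100, 100), (200, 150), (300, 200), (400, 250)]:
        print(counters, monitor.update(*counters))
```

Requiring several bad polls in a row cuts down false reports from packets that are still in flight when the counters are read, but it also slows the reaction by the same number of polling intervals.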

23 Aug 2013

So this week I have worked on my conference presentation and read a lot of stuff about fault detection in and outside of SDN.

The SDN stuff seems to mostly agree that to be fast you need fast-failover groups, which aren't in Open vSwitch yet.

For the non-SDN side I read up on EOAM, BFD and RFC 6374, as well as some other random stuff that wasn't much use.

19 Jul 2013

Handed in on Monday. Here's a copy if for whatever reason someone decides they actually want to have a look at this.

03 Jul 2013

I think it is safe to assume my blog entries will consist of me just writing up over the next couple of weeks.

An extremely drafty first draft will be ready today.

26 Jun 2013

So I have everything working now, and have started writing the report. I should have a first draft of the first chapter finished today.

Some of the code is not as pretty as I would like it to be; I completely broke my interface between the load balancing module and rfserver.

18 Jun 2013

Having boldly claimed to have finished debugging the path learning last week, when I tried to get it to put the paths onto switches I found that it wasn't actually deleting the paths properly; the way I was testing it just made it look like it had.

So I ended up completely overhauling how I am storing these things, and am just working out the last few kinks in that now. But hopefully that means that when it comes time to actually use them the whole process should be a lot easier... Hopefully...

28 May 2013

Finished debugging the path learning. So now I correctly learn paths between switches with Dijkstra's algorithm, but I don't do anything with them yet.
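
For anyone unfamiliar with it, here is a small standalone sketch of Dijkstra's algorithm over a switch topology. This is not the rfserver code; the topology, the link costs and the function names are made up for illustration.

```python
import heapq


def dijkstra(graph, source):
    """Return (dist, prev) for shortest paths from source.
    graph maps each switch to a dict of {neighbour: link_cost}."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter route was already found
        for v, cost in graph.get(u, {}).items():
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def path(prev, source, target):
    """Rebuild the hop list from source to target, or None if unreachable."""
    if target != source and target not in prev:
        return None
    hops = [target]
    while hops[-1] != source:
        hops.append(prev[hops[-1]])
    return hops[::-1]


# Hypothetical non-full-mesh topology: a chain s1-s2-s3-s4 plus a costly s1-s4 link.
topology = {
    "s1": {"s2": 1, "s4": 5},
    "s2": {"s1": 1, "s3": 1},
    "s3": {"s2": 1, "s4": 1},
    "s4": {"s3": 1, "s1": 5},
}

if __name__ == "__main__":
    dist, prev = dijkstra(topology, "s1")
    print(dist["s4"], path(prev, "s1", "s4"))
```

Running it once per switch as the source gives paths between every pair of switches, which is cheap enough for a small test network.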

15 May 2013

I started testing the path learning, which took a bit of effort: I had to change the test network so it is no longer a full mesh, and add scripts to bring switches down and back up again when I need to. So most of the week was spent debugging this.

I also did my in-class presentation, which went fine.

06 May 2013

So my plan this week had been to add the ability to re-add flows to a switch when it comes back up in a full mesh topology. But I decided that the full mesh solution would be different enough from a non-mesh solution that I wasn't going to gain much in the longer term by adding this. So instead I started adding support for non-full-mesh topologies.

So rfserver now calculates shortest paths between switches, but doesn't actually use them. It will update whenever a switch comes up, but doesn't fix paths broken when a switch goes down.