Blogs

28 Apr 2016

Lots of minor fixes this week. Fixed the init script commands so that they properly kill the entire process group when stopping the AMP client. Still need a cleaner way to do this as part of the main process. Updated the AMP schedule fetching to follow HTTP redirects, which was required to make it work on the Lightwire deployment. Fixed the tcpping test to properly match response packets when the initial SYN contains payload. In some cases RSTs would acknowledge a different sequence number from the one a SYN-ACK would, and only one of these was being checked for.
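As a rough illustration of the tcpping matching fix, a response can be accepted if it acknowledges either of the sequence numbers that might come back. This is only a Python sketch, not the actual test code: the function name is made up, and the choice of ISN + 1 and ISN + 1 + payload length as the two candidate values is an assumption.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def ack_matches_probe(ack, isn, payload_len):
    """Return True if `ack` could be a response to our SYN.

    Assumption: a responder may acknowledge just the SYN (isn + 1) or the
    SYN plus its payload (isn + 1 + payload_len), so both are accepted.
    """
    syn_only = (isn + 1) % SEQ_MOD
    syn_and_payload = (isn + 1 + payload_len) % SEQ_MOD
    return ack in (syn_only, syn_and_payload)
```

Checking both values means the same probe is matched whether or not the responder counted the payload.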

Updated all the tests to report the DSCP settings that they used. They are not currently saved into the database, but they are being sent to the collector now.

Set the default packet interval of the udpstream test to 20ms, which is closer to VoIP than the global AMP minimum interval that it was using. Also wrote most of the code for the test to calculate Mean Opinion Scores based on the ITU recommendations. I just need to add a latency measure to complete the calculation.
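For reference, the last step of the E-model converts the transmission rating factor R into an estimated MOS. The sketch below uses the R-to-MOS mapping given in ITU-T G.107. It is not the udpstream code, the function name is hypothetical, and calculating R itself (which is where the latency measure comes in) is not shown.

```python
def r_factor_to_mos(r):
    """Map an E-model rating factor R to an estimated MOS (ITU-T G.107)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    # Cubic mapping used between the two clamped ends of the scale.
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
```

For example, an R of 93.2 maps to a MOS of roughly 4.41.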

27 Apr 2016

Started working through setting up and running a handful of OpenFlow applications, starting with switches. For this I'm trying to keep everything contained within Docker containers, with scripts to remind me how to run each, as well as keeping things as portable as possible. I'm using mininet to simulate a small number of hosts on each.

I've set up the ONOS Docker container again, which includes a simple switch and a simple mininet network. I've also configured Valve, a VLAN switch, running from a Docker container with a VLAN'd network. I wrote Dockerfiles for Faucet, another VLAN switch, and fixed a couple of bugs; the fixes have been merged back on GitHub. Faucet is based upon Valve, but provides an interesting case by being a multi-table application, unlike Valve and ONOS's switch.

I've spent some time manually going through the resulting flow tables from the switches tested, and it seems hard to make many improvements to the single-table rules, such as converting them to a multi-table layout similar to Faucet's. A single-table switch reactively installs rules connecting two hosts only when they try to talk to each other; if it did not, it would end up with a rule for each src/dst pair, i.e. it scales with hosts^2. A multi-table switch like Faucet, by contrast, maintains a learning table and a forwarding table with each host in both, scaling with 2*hosts. Because the single-table learning is reactive, not all src/dst pairs are installed, which makes the jump to separate src and dst tables invalid: it would install rules for src/dst pairs that did not exist in the original.
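A small Python sketch of the scaling argument and of why the naive split is invalid. The function names and the three-host example are made up for illustration; nothing here is taken from Faucet, Valve or ONOS.

```python
def single_table_rule_count(num_hosts):
    """Rules needed if every ordered (src, dst) pair gets its own rule."""
    return num_hosts * (num_hosts - 1)


def two_table_rule_count(num_hosts):
    """Rules needed with one learning and one forwarding entry per host."""
    return 2 * num_hosts


def pairs_allowed_by_split(pair_rules):
    """Pairs matched after naively splitting pair rules into src and dst tables."""
    srcs = {src for src, _ in pair_rules}
    dsts = {dst for _, dst in pair_rules}
    return {(s, d) for s in srcs for d in dsts if s != d}


# Reactively installed rules: A<->B and B<->C have talked, A and C have not.
installed = {("A", "B"), ("B", "A"), ("B", "C"), ("C", "B")}
extra = pairs_allowed_by_split(installed) - installed
```

Here `extra` contains the A/C pairs in both directions, which the original single table never had rules for.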

I'm also working through recent literature and re-reading some existing papers in relation to the problem. I've just started compiling an updated document with possible approaches from the literature.

26 Apr 2016

Only worked three days this week -- on leave for the rest.

Continued developing the event filtering mechanism for the amp-web dashboard. Managed to make all of the filtering options work properly, including AS-based filtering and filtering based on the number of affected endpoints.

Changed event loading to happen in batches, so if the selected time range covers a lot of events we will only load 20 at a time. A new batch is loaded each time the user scrolls to the bottom of the event list. This means that we can now replicate the old infinite scrolling event list behaviour on the dashboard, so I've removed the former page.
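The batching idea can be sketched as follows. This is a simplified Python illustration with a hypothetical function name, where an in-memory list stands in for the real event query, and is not the amp-web code itself.

```python
BATCH_SIZE = 20  # events loaded per request, as described above


def load_batch(events, offset, batch_size=BATCH_SIZE):
    """Return the next batch of events and the offset to request next.

    The returned offset is None once the list is exhausted.
    """
    batch = events[offset:offset + batch_size]
    next_offset = offset + len(batch)
    if next_offset >= len(events):
        next_offset = None
    return batch, next_offset
```

Each time the user reaches the bottom of the list, the client would ask for the batch starting at the last returned offset, and stop asking once that offset is None.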

Added automatic fetching of new events to the dashboard, so the event list is now self-updating rather than requiring a refresh of the whole page to see any new events.

20 Apr 2016

Did some reading around calculating mean opinion scores for VoIP and started to add code to the udpstream test to calculate them both the Cisco way and the ITU E-model way. Neither of them explicitly takes jitter into account, which seems unusual; my best guess so far is that they count jitter as part of the delay. Other models I've found do include jitter as part of the delay calculation.

Spent some time writing more documentation about installing and configuring an amplet client. The install process, configuration options and schedule file options all now have a first-draft description, hopefully enough to help people install monitors with minimal assistance, though I expect they will need to be expanded. Updated example configuration files to agree with the new documentation.

Various small fixes, including updating the standalone icmp and tcpping tests to print human-readable ICMP errors rather than printing the type and code, and using Python .egg format in the ampsave packages.
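The human-readable error printing amounts to a lookup from ICMP type and code to a description. The Python sketch below covers only a few common destination unreachable (type 3) and time exceeded (type 11) codes. The table contents, wording and function name are illustrative and are not taken from the AMP tests.

```python
ICMP_ERRORS = {
    (3, 0): "Destination network unreachable",
    (3, 1): "Destination host unreachable",
    (3, 2): "Destination protocol unreachable",
    (3, 3): "Destination port unreachable",
    (3, 4): "Fragmentation needed and DF set",
    (3, 13): "Communication administratively prohibited",
    (11, 0): "TTL exceeded in transit",
    (11, 1): "Fragment reassembly time exceeded",
}


def describe_icmp_error(icmp_type, icmp_code):
    """Return a readable description, falling back to the raw type and code."""
    fallback = "ICMP type %d code %d" % (icmp_type, icmp_code)
    return ICMP_ERRORS.get((icmp_type, icmp_code), fallback)
```

Unknown combinations still print the numeric type and code, so no information is lost.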

Merged my scheduling parts of the website back into the main branch so that others can start using the features I've added.

20 Apr 2016

Started looking at translating OpenFlow rules to fit new pipelines. The first part of this is to understand the types of rules controllers are installing and to look manually at what changes could be made. The first step is collecting runtime traces of a number of controllers working in realistic networks, so that we can identify rules that scale with hosts versus one-time setup rules, etc.

As such I've worked on my quickly hacked-together passthrough OpenFlow controller and reworked the threading to use a processing thread with a publish/process architecture, rather than spawning a new thread per message. I've then used libtrace to record the OpenFlow conversations. I've also added simple support to try and group similar sets of rules and count their frequencies.
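One possible way to group similar rules is by the table they are installed in and the set of fields they match on, ignoring the match values. The Python sketch below assumes that definition of "similar" and a made-up dict layout for rules. It is not the code in the passthrough controller.

```python
from collections import Counter


def rule_signature(rule):
    """Describe a rule by its table and which fields it matches on."""
    return (rule["table_id"], tuple(sorted(rule["match"])))


def count_rule_groups(rules):
    """Count how many rules share each signature."""
    return Counter(rule_signature(rule) for rule in rules)


# Hypothetical flow table: two per-host rules and one setup rule.
rules = [
    {"table_id": 0, "match": {"eth_dst": "00:00:00:00:00:01"}},
    {"table_id": 0, "match": {"eth_dst": "00:00:00:00:00:02"}},
    {"table_id": 0, "match": {"eth_type": 0x88CC}},
]
groups = count_rule_groups(rules)
```

Signatures whose counts grow as hosts are added would point to rules that scale with hosts, while signatures with a fixed count look like one-time setup rules.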

Next week I will be compiling a set of test OpenFlow applications and collecting traces and flow tables.

19 Apr 2016

Continued working on the event filtering mechanism for amp-web. Added support for an ASN->AS name mapping database, which will be used to manage the list of ASes that can be filtered on, as well as to label our traceroute graphs (instead of querying whois.cymru.org, which can fail from time to time).

Changes to event filters are now posted back to the amp-web server and saved for the next time the user loads the event dashboard.

Started working on actually filtering the events based on the user's selections. I've got filtering working for time period, maximum event groups, event types, sources and targets. One interesting side effect of filtering is that the removal of certain events from event groups can create situations where we have duplicate event groups (because the events that made those groups distinct are no longer on the dashboard). Removing events can also change the start time of an event group and therefore event groups no longer appear in chronological order. As a result, I've had to re-work the event processing to correct for these issues.
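The corrections described above can be sketched like this. It is a simplified Python illustration rather than the amp-web implementation: the event dicts with "id" and "ts" keys, the output layout and the function name are all hypothetical.

```python
def filter_event_groups(groups, keep_event):
    """Apply an event filter to event groups and repair the result.

    `groups` is a list of lists of event dicts; `keep_event` is a predicate.
    """
    seen = set()
    result = []
    for group in groups:
        events = [ev for ev in group if keep_event(ev)]
        if not events:
            continue  # every event in this group was filtered out
        key = frozenset(ev["id"] for ev in events)
        if key in seen:
            continue  # now identical to a group we already kept
        seen.add(key)
        # The start time may have moved if the earliest event was removed.
        start = min(ev["ts"] for ev in events)
        result.append({"start": start, "events": events})
    # Start times may have shifted, so restore chronological order.
    result.sort(key=lambda g: g["start"])
    return result
```

Duplicates are detected by comparing the set of surviving event ids, and the final sort deals with groups whose start time changed.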

13 Apr 2016

Worked with Brad to get access to the Netspace amplet and bring it slightly more up to date (site firewalls had been interfering with us making changes until now). That's the last amplet that was reporting to erg now upgraded.

Merged my amplet client control socket changes back into the main branch without too much apparent trouble. Need to do some more testing to make sure everything is sensible. After merging, added options to set DSCP bits for the new udpstream test like all the existing tests had.
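For readers unfamiliar with DSCP options, setting the DSCP bits on an IPv4 socket usually means writing the former TOS byte. The sketch below uses Python's socket module on Linux rather than the amplet client's own code, and the helper name is hypothetical.

```python
import socket


def set_dscp(sock, dscp):
    """Set the DSCP bits on an IPv4 socket.

    DSCP occupies the upper six bits of the former TOS byte, so the value
    is shifted left by two before being passed to IP_TOS.
    """
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp << 2)


sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
set_dscp(sock, 46)  # 46 is Expedited Forwarding (EF), commonly used for voice
tos = sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS)
sock.close()
```

IPv6 sockets use a different option (IPV6_TCLASS), which is not shown here.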

Continued working on documentation and tidied up the sample program showing how to read data from a RabbitMQ queue and extract the AMP messages.

Had a quick visit to Lightwire on Thursday which generated some more interesting ideas, especially for me around automation of test target selection. Some of these line up nicely with wishlist items I already had, so hopefully I might be able to find time to work on those features soon.

13 Apr 2016

Arrived back in the weekend and spent some time catching up on things I had missed. I spent the majority of the week working through the suggestions from supervisors on the OpenFlow packet_in and out paper. I spent some time considering where my PhD is going, and what the best starting place would be.

I also spent an afternoon working through a performance-related libtrace DPDK issue reported on GitHub. As a result, we now use the default values provided by the device, which is a new option in more recent versions of DPDK. I also confirmed that DPDK 2.0 appears to work. Others still appear to be working on porting libtrace to newer versions of DPDK.

11 Apr 2016

I made changes to the RheaFastpath code to reduce the number of rules installed, and I've started working on IPv6 route conversion to OpenFlow rules. So far, I've established connectivity between an IPv6 host on the OF switch and its mapped port on the Rhea VS (dp0). I've encountered a bug where IPv6 routes are not being dropped in the code. I'll be fixing that this week and getting started with mapping received routes to installed OpenFlow rules.
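One common way to express a route as an OpenFlow rule is to match on the destination prefix and use the prefix length to set the rule priority, so that longer prefixes win. The Python sketch below illustrates that idea only. The dict layout, base priority and function name are hypothetical and are not taken from RheaFastpath or any controller API.

```python
import ipaddress

ETH_TYPE_IPV6 = 0x86DD
BASE_PRIORITY = 1000  # hypothetical base; longer prefixes get higher priority


def route_to_flow(prefix, out_port):
    """Build a dict describing an OpenFlow rule for one IPv6 route."""
    net = ipaddress.IPv6Network(prefix)
    return {
        "priority": BASE_PRIORITY + net.prefixlen,
        "match": {
            "eth_type": ETH_TYPE_IPV6,
            # Destination address plus mask, as a (value, mask) pair.
            "ipv6_dst": (str(net.network_address), str(net.netmask)),
        },
        "actions": [("output", out_port)],
    }
```

Because the priority rises with prefix length, a /64 route takes precedence over a covering /32, which approximates longest-prefix matching in a flow table.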

11 Apr 2016

Marked the 513 libtrace assignments. Some students performed very well and I was glad to see that the investigative task proved to be very doable.

Started working on adding the ability to filter events and event groups on the amp-web dashboard. Most of my effort so far has been in producing a mock-up of the interface, which I showed to Nathan and Chris on Thursday afternoon. Started replacing some hard-coded filtering settings with a dynamic template that uses user preferences stored in a database on Friday.

Fixed a few little netevmon issues that cropped up when trying to restart netevmon on prophet prior to starting work on the dashboard filtering, mostly in relation to ensuring that the 'purge event database' option works sensibly.