This week I updated my script to run the latest tests, and added support for multiple devices and OpenFlow versions so that testing is as automated as possible.
I've run the benchmarks on the Pica and started running the tests on the Brocade; however, the Brocade has not been entirely stable and has failed during testing.
I've been looking closely at getting encrypted OpenFlow working on the Brocade and HP. The Brocade refuses any client certificate I give it, but I have been able to get things working by disabling validation and forcing an AES cipher suite. The HP, on the other hand, will accept client certificates, yet the negotiation still always seems to fail; I'm not sure why, but I suspect the issue is with certificate verification on the switch.
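For anyone reproducing the Brocade workaround, the TLS settings involved can be sketched with Python's ssl module. The cipher string and the commented-out address are illustrative assumptions on my part, not taken from the actual switch configuration:

```python
import ssl

# Sketch of the workaround: skip certificate validation and restrict
# negotiation to AES-based cipher suites. The cipher string and the
# address below are placeholders, not real switch config.
ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
ctx.check_hostname = False       # must be disabled before CERT_NONE
ctx.verify_mode = ssl.CERT_NONE  # disable certificate validation
ctx.set_ciphers("AES")           # only offer AES-based suites

# Probing the switch's OpenFlow TLS port would then look something like:
# import socket
# with socket.create_connection(("192.0.2.1", 6633)) as sock:
#     with ctx.wrap_socket(sock) as tls:
#         print(tls.cipher())  # shows which suite was negotiated
```

Seeing which suite actually gets negotiated (or where the handshake aborts) is a quick way to narrow down whether the failure is in certificate verification or cipher selection.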
Next week I'll be at the PhD conference, so I will divert some attention back to that and continue to run tests in the background.
Worked with Brad to update a test amplet and the first 2 production
amplets from Debian Lenny to Wheezy. Everything has gone well so far,
though some of the older machines have a lot of cruft to tidy up (some
have already been through multiple Debian upgrades in the past!).
Hopefully we can get the rest of the machines sorted over the next few weeks.
Built new 32bit amplet2 client packages for deployment on the NZ AMP
mesh as the machines are updated. Extracted all the current
configuration from the database on erg to use as a configuration guide
while updating them.
Spent some time getting all the AMP server components
(website/events/storage/etc) installed on the new Lightwire server. This
is the first time that most of these components have been installed in
this configuration and the first time on Jessie, so the process wasn't
particularly smooth. Everything is now installed and running without any
clients, so the next step will be to see if I can configure a new client
using the new web interface.
Tested and fixed my vanilla PF_RING libtrace code. I've been able to get performance comparable to the pfcount tool included with the PF_RING release, so I'm fairly happy with that. Started working on adding support for the ZC version of the PF_RING driver, which uses an entirely different API.
Helped Harris get his head around how NNTSC works so that he could add support for the Ceilometer data. Set myself up with an OpenStack VM so that I can start working on the web graphs to display the data now that it is in a NNTSC database. Also spent a bit of time writing up an explanation of how netevmon works so that Harris can start looking into running our detectors against the Ceilometer data.
Worked with Brendon on Friday to get NNTSC and netevmon installed and running on the lamp machine.
I've worked on tidying up the reactive test and updating its output to include the metrics I want, along with adding control options similar to those of the other tests. I've also attempted to run all the tests on the HP, which did not require any major modifications; however, working through the configuration of the HP takes some time.
A handful of bugs have been fixed, including one where packets were being created with incorrect IP checksums, which caused them to take a rate-limited path on the HP. I've also worked through some negotiation issues in the libfluid library and submitted the fixes as a pull request; this allows testing to be run on both versions of OpenFlow without having to reconfigure the switch in some cases.
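For reference, the checksum in question is the standard RFC 1071 ones'-complement sum over the IPv4 header. A minimal version, using a well-known example header rather than one of my test packets, looks like this:

```python
import struct

def ipv4_checksum(header: bytes) -> int:
    """RFC 1071 ones'-complement checksum over an IPv4 header.

    The checksum field (bytes 10-11) must be zeroed before calling.
    """
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:                          # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# Widely used worked example (checksum field zeroed at bytes 10-11):
header = bytes.fromhex("4500003c 1c464000 40060000 ac100a63 ac100a0c")
print(hex(ipv4_checksum(header)))  # -> 0xb1e6
```

Recomputing the checksum over a header that already contains the correct value yields zero, which is the usual way receivers validate it; a hardware path that rate-limits packets failing this check would explain the behaviour seen on the HP.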
Brad added a new select dropdown widget that includes filtering of the
option list, and I spent some time adding missing functionality to it.
Keyboard navigation should all work as expected now -
pageup/pagedown/home/end all move around the list, and tab will select
and move on to the next input element. I also integrated this with all
of the dropdowns I've added for scheduling and site management which
involved making sure all the onchange events were properly hooked up and
that they properly followed visibility changes as dynamic forms are updated.
Spent some time tidying up labels, styling, etc on the scheduling pages
to make sure they are consistent with each other, and showing the right
level of information. Found and fixed a few instances where similar
fields were named differently between meshes and sites, leading to
missing data being displayed.
Wrote some scripts to do some basic exploration of the Ceilometer data to check which collections and series are most suitable for using with netevmon. I've found that around 30-35% of the series for CPU utilisation, network byte rates and disk read/write rates are at least long enough to be worth using -- this works out to ~600 series for each metric, so we'll have a reasonable sample size. The spacing between measurements is more of a concern, as it is very inconsistent. There are some parts of NNTSC and netevmon that assume a fairly constant measurement rate, so these will need to be re-evaluated.
Started adding PF_RING support to libtrace. For a start, I'm just working with the standard PF_RING driver (not the ZC extension) and I've written code that should work with the old API. Once I've tested that, I'll start adding native parallel support using one thread per receive queue in the driver.
Also spent a bit of time planning a paper on parallel libtrace. I anticipate the main narrative will be about how we've achieved better potential performance by adding parallelism (depending on the workload and the number of threads), while still maintaining the key design goals of the library (e.g. abstraction of complexity, format agnosticism, etc.). We'll show that the same parallel libtrace code can achieve better performance across multiple input formats, i.e. DAG, ring, and PF_RING (once complete).
Spent most of my time working on input validation when editing schedules
and sites, and making sure that buttons and fields were enabled
appropriately based on user actions. Also updated some of the templates
to use the longer display names where possible (rather than short
internal names), and link them to the appropriate configuration pages.
Added permissions to the security model to allow separation between
users that are allowed to view the data and those that are allowed to
make changes.
Confirmed that the data I was getting from the throughput test was the
same from both the 32bit and 64bit amplet clients. Initially they were
reporting quite different data, but after comparing TCP settings I
discovered that the 64bit VM had been tuned somewhat; after applying
the same values to the 32bit VM they now agree.
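A quick way to do that comparison is to dump the relevant sysctls on each VM and diff the output. The list below is my guess at the likely suspects, not the settings that actually differed:

```python
from pathlib import Path

# Common TCP-tuning sysctls worth comparing between two machines.
# (Illustrative list; not necessarily the values that differed.)
SYSCTLS = [
    "net/ipv4/tcp_rmem",
    "net/ipv4/tcp_wmem",
    "net/core/rmem_max",
    "net/core/wmem_max",
    "net/ipv4/tcp_congestion_control",
]

def dump_tcp_tuning(proc="/proc/sys"):
    """Read each sysctl from /proc/sys; missing entries report as None."""
    out = {}
    for name in SYSCTLS:
        p = Path(proc) / name
        out[name] = p.read_text().strip() if p.exists() else None
    return out

if __name__ == "__main__":
    for name, value in dump_tcp_tuning().items():
        print(f"{name} = {value}")
```

Running this on both VMs and diffing the two outputs makes any tuning discrepancy immediately obvious, rather than spotting it indirectly through throughput results.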
Spent the early part of my week reading over Dan's and Darren's revised Honours reports and offering a final batch of suggestions.
Continued poking at libprotoident and the unknown traffic on various Web ports. Finally managed to get Blade and Soul (a Chinese MMO) installed and running and was able to confirm that it was responsible for some of my unknown flows.
Started turning my attention towards our STRATUS research this week. Initially, we are going to look at general metrics that we can extract from cloud infrastructure and see if any of our existing event detection techniques are useful for finding anomalous behaviour. For a start, we are using data collected by the Ceilometer module on the Waikato OpenStack instance. Spent some time bringing Harris up to speed on NNTSC and netevmon so that he can experiment with the data within our system. In the meantime, I'm going to take a closer look at the data that we've collected to see which series will be most suitable to focus on in the short term.
Gave more details about our STRATUS work / goals to the designers who will be producing a poster about our research for the upcoming STRATUS forum.
Also played with a service called ThisData which claimed to offer something similar to what we have envisioned from STRATUS. ThisData is certainly pretty, but doesn't really seem to offer much more than daily revision control for your cloud data.
I have spent the last few weeks working on my thesis. Last night I submitted, which means my honours project has come to a close.
I need to remove ubiquiOS so that I can release the project as open source. I will license it under the 3-clause BSD license and make it freely available on my GitHub.
Spent most of the week continuing to work on the test scheduling web
interface. The lists of meshes and sites are now the primary entry
points, and if you click through then you have access to the meta data
about the site/mesh and the specific schedule that applies. These can
all be edited to change the names displayed in the results interface,
and schedules that are updated are made available for amplet clients to fetch.
The layout and flow are mostly settled now, though will likely be
updated after more frequent use. I've got the base functionality working
and have started adding some of the nice features that help make sure
the right data gets added, or inform the user what is expected. Slow and steady.