I have run all the common queries and their equivalents on both PostgreSQL and InfluxDB and made a table of the results. InfluxDB came out ahead in only a few cases, but these were some of the most common queries, and the gains were reasonably significant.
I have noticed that InfluxDB queries seem to have a lower bound of about 2.5 milliseconds. I've also noticed that the InfluxDB process itself takes a much bigger portion of the CPU than PostgreSQL during testing, so my results may be partially CPU-limited.
Also used run-length encoding to save space on traceroute data in InfluxDB, and assigned a unique id to each AS path with the help of a second table that stores the id-to-path mapping. This is using InfluxDB for something it isn't designed for (a relational database), but it seems to be working for the limited purpose of reading a dictionary of already-encountered paths and ids into memory before beginning to insert new data.
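The path-id scheme described above can be sketched roughly as follows; the names (`encode_path`, `PathRegistry`) are illustrative, not the actual code:

```python
def encode_path(as_path):
    """Run-length encode a list of AS numbers, e.g.
    [64500, 64500, 64501] -> [(64500, 2), (64501, 1)]."""
    encoded = []
    for asn in as_path:
        if encoded and encoded[-1][0] == asn:
            # Same AS as the previous hop: bump the run length.
            encoded[-1] = (asn, encoded[-1][1] + 1)
        else:
            encoded.append((asn, 1))
    return encoded

class PathRegistry:
    """Maps encoded AS paths to stable integer ids (the second table)."""
    def __init__(self, known=None):
        # 'known' would be the id/path table read into memory at startup.
        self.ids = dict(known or {})
        self.next_id = max(self.ids.values(), default=0) + 1

    def lookup(self, as_path):
        """Return the id for a path, allocating a new one if unseen."""
        key = tuple(encode_path(as_path))
        if key not in self.ids:
            self.ids[key] = self.next_id
            self.next_id += 1
        return self.ids[key]
```

The dictionary lookup keeps inserts fast; only genuinely new paths ever touch the id table.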
Updated the amplet client packages to set appropriate permissions on the
key directories and to set the umask correctly so that rabbitmq is able
to use downloaded keys.
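The permissions fix amounts to something like the following sketch (the directory path here is a placeholder, not the actual package layout, and the real change lives in the packaging scripts rather than Python):

```python
import os
import stat
import tempfile

# Restrict the umask so newly created key files are not world-readable
# (files: rw-r-----, directories: rwxr-x---).
os.umask(0o027)

# Stand-in for the real key directory created by the package.
keydir = os.path.join(tempfile.mkdtemp(), "keys")
os.makedirs(keydir)

# Make the directory group-accessible so rabbitmq can read the keys.
os.chmod(keydir, stat.S_IRWXU | stat.S_IRGRP | stat.S_IXGRP)  # 0o750
```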
Worked with Brad to get Andrew some useful AMP data so that he can
perform some comparison tests between the existing AMP database and the
new ones that he is investigating.
Lots of little fixes to things in the AMP scheduling interface: fetching
values properly, making sure redundant information isn't being
presented, and a little bit of styling. Tried to be smarter in a few places
with selecting default values so that something useful will get
displayed in the matrix.
Started looking at systemd in order to get init scripts working in
Jessie for netevmon and nntsc. The existing init scripts can mostly be
used as-is, though some recent changes to netevmon meant a few things
needed tidying up.
Per Richard's suggestion during the weekly meeting, I've added a version of the addflows test with packet-ins but without reactive installation, i.e. adding flows as fast as possible while receiving packet-ins. This was a minor modification to the existing test code to allow these arguments to be set in this way.
I've run the tests including this updated one on one of Josh's switches.
I've also been continuing to work through processing the results, adding statistical measures such as 95% confidence intervals using scipy, and in general producing all the numbers you could possibly want. I've found the odd test that failed to run correctly, but these appear to be very rare; I'm still not sure of the root cause.
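A minimal sketch of the 95% confidence interval calculation with scipy; the sample values here are made up for illustration:

```python
import numpy as np
from scipy import stats

def confidence_interval(samples, confidence=0.95):
    """Return (mean, lower, upper) using the Student's t distribution."""
    samples = np.asarray(samples, dtype=float)
    mean = samples.mean()
    sem = stats.sem(samples)  # standard error of the mean
    lower, upper = stats.t.interval(confidence, len(samples) - 1,
                                    loc=mean, scale=sem)
    return mean, lower, upper

# Illustrative latency samples in milliseconds.
mean, lo, hi = confidence_interval([10.2, 9.8, 10.5, 10.1, 9.9])
```

The t distribution (rather than the normal) is the right choice here because the per-test sample counts can be small.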
Last week, I worked on port registration and mapping between OF switches and the Rhea VM. I got the port registration part working, but at my weekly progress meeting today it was suggested that I should look into adding ports to the OVS virtual switch based on the number of ports reported by the OF switch, using sub-process calls.
Next I will be working on port registration (take two) and examining the OF message creation process.
Spent the first half of the week cleaning up and refactoring some of my code. I have begun benchmarking query speeds for InfluxDB against our current database. I have been provided with a list of common queries, so I am making my way through those and comparing them against equivalent queries on InfluxDB. Preliminary results seem to show that PostgreSQL is generally faster for most queries, but the queries that make use of InfluxDB's Continuous Queries are sped up quite a bit and often beat PostgreSQL significantly. Continuous Queries are like views that continuously aggregate data and store the results.
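The benchmarking loop amounts to something like this hypothetical helper (real runs would wrap psycopg2 or influxdb client calls in the callable):

```python
import time
import statistics

def time_query(run_query, repeats=20):
    """Run a query callable repeatedly and return the median
    wall-clock time in milliseconds."""
    elapsed = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        elapsed.append((time.perf_counter() - start) * 1000.0)
    # Median is less sensitive to one-off cache or GC spikes than mean.
    return statistics.median(elapsed)
```

Taking the median over repeats matters here given the ~2.5 ms floor and the CPU contention noted above.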
Finished up the demo for STRATUS forum and helped Harris put together both a video and a live website.
Spent a bit of time trying to fix some unintuitive traceroute events that we were seeing on lamp. The problem arose when a normally unresponsive hop responded to traceroute, which inserted an extra AS transition into our "path".
Rebuilt DPDK and Ostinato on 10g-dev2 after Richard upgraded it to Jessie so that I can resume my parallel libtrace development and testing once he's done with his experiments.
Installed and tested a variety of Android emulators to try to set up an environment where JP and I can more easily capture mobile app traffic. Bluestacks on my iMac turned out to be the most useful, as the others I tried either lacked the Google Play Store (so finding and installing the "official" apps would be hard) or needed more computing power than I had available.
Made some changes to the amplet client in response to things I observed
while installing test clients for the Lightwire machine. Changed the log
level of some informational messages to avoid filling logfiles,
rearranged startup to create the pidfile earlier to work better with
puppet and added some more smarts to guessing the ampname when one isn't
supplied. Also rearranged some directory structure to better represent
the python modules involved.
Found and fixed a few bugs in various things on the server side as well.
Values from the new dropdowns weren't being fetched appropriately in
some cases, percentage loss was sometimes calculated incorrectly and
incomplete traceroute paths weren't being stored correctly.
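The percentage-loss fix boils down to guarding the division; a hypothetical version, not the actual server code:

```python
def loss_percent(sent, received):
    """Percentage of probes lost, e.g. 10 sent, 8 received -> 20.0."""
    if sent <= 0:
        # No probes were sent, so loss is undefined rather than zero.
        return None
    return (sent - received) / sent * 100.0
```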
Got the event detection systems up and running on the Lightwire machine,
which was delayed due to issues with embedded R behaving slightly
differently in the Jessie version. Also spent some time with Shane
chasing up some unusual looking events and unusual merging of event groups.
Brad and I finished updating the last of the reachable amplets to Debian
Wheezy, which brings us up to 13 monitors all running the new code now.
I spent last week examining the vandervecken code to understand how the interfaces on the OpenFlow switches and the OVS switch in the Vandervecken VM are mapped together.
This week, my focus is to code the mapping and port association functionalities needed for the new RouteFlow.
I've re-run a couple of tests this week: firstly the add-flow tests on the Pica8, because these had a threading bug in them. I then re-ran the HP tests, as these were unintentionally run on 100Mbit ports; for consistency they were moved to the 1Gbit ports. I also ran the tests against OVS.
I've been working through processing the results, verifying them and looking for any anomalies that I might need to take into account. In some cases I found some very low figures from the Brocade tests; these could be related to the switch becoming unstable and crashing (either as the cause, or, if it is a gradual leak, simply coinciding in timing).
Some aspects of the tests that I have been looking at include receiving packets out of order along with rates and latencies.
Had a meeting with Tony McGregor about the introduction. Took away some critique sheets and used these to make changes and improvements.
Continued to read through the rest of the chapters so that I can pass on a draft and get some critiques from Matthew and Richard on the later chapters in particular.