Brendon Jones's blog
Short week due to holidays and illness.
Made a few more fixes to the ampweb packages that were installed last week to help fix some issues we noticed during installation, including updating default configuration files to have more sensible values.
Spent some time updating documentation for the amplet client and writing man pages for some binaries that were missing them.
Spent most of the week finishing up packaging all the server-side components for AMP. Had to add a couple of patches as part of the package build process to set logging and pidfile locations, which I should go back and fix in a more correct manner. Spent some time installing and uninstalling packages to make sure postinst scripts were working properly to create users, databases, etc. Updated the Lightwire AMP server with the new packages, which went fairly smoothly but did have some issues with missing Python dependencies (not in Debian, or too old) that were only required in some code paths.
Also updated the New Zealand AMP mesh to the newest client version. Went pretty smoothly which was nice.
Fixed up the RabbitMQ queues/bindings that I had broken when I accidentally used a queue resource in place of a routing key when exploring the Erlang commands. Figured out the syntax I needed to properly create the binding to an existing queue and now have all the configuration sorted.
Spent the rest of the week packaging the remaining software that we deploy on the AMP servers - ampweb, ampy, nntsc, ampsave, netevmon. Quite a few of them have progressed a lot since they were last packaged, so lots of dependencies needed to be updated, users created, databases created and populated, etc. Almost have all the packages built and ready for a test deployment now.
Spent some time working on issues a new user had installing the amplet client on a new Ubuntu machine. The Jessie packages almost work fine, though there seems to be a change in behaviour (or a bug) in the libcurl-gnutls library which prevents curl from working with any of our SSL sites. In the end after a lot of chasing things around, the easiest workaround seems to be to build against the OpenSSL flavour of libcurl instead.
Got the amplet client building on CentOS again so that I could update a server used as an endpoint for cooperative tests. It is now running throughput and udpstream tests to get sample data for prophet, as well as testing out the new client build.
Started to look over all the old Debian packaging I had done for the server-side software to get it all working for the current versions. Updated packages for the simple parts (pywandevent, libnntsc, amppki). Got stuck writing Erlang to declare queues, exchanges and bindings on the server because the default tools can't do that (and I don't want to install extra tools). Currently have declarations working fine, but in creating the bindings I've created a broken queue that breaks the web interface.
Spent most of the week working on installing the server-side components of AMP, which took a lot longer than I thought it would. Ran into issues with deploying ampweb when it's not at the root of the website - a lot of URLs were absolute and URL parsing was expecting a particular layout or number of elements which was no longer the case. Fixed all the obvious ones, but more were still showing up a few days later and have also been fixed.
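The sort of URL handling that copes with a non-root deployment can be sketched roughly as below. This is only an illustration in Python, not the actual ampweb code: the function name `split_app_path`, the `/amp` prefix and the example URLs are all made up for the example. The idea is to strip whatever prefix the site is mounted under first, and only then look at the remaining path elements, rather than assuming a fixed number of elements from the root.

```python
from urllib.parse import urlsplit


def split_app_path(url, prefix="/"):
    """Return the path elements of url relative to a deployment prefix.

    Hypothetical helper: 'prefix' is wherever the site is mounted,
    e.g. "/" for the root of the website or "/amp" for a subdirectory.
    """
    path = urlsplit(url).path
    # Normalise the prefix so "amp", "/amp" and "/amp/" all behave the same.
    prefix = "/" + prefix.strip("/")
    if prefix != "/":
        # Require a whole path element to match, so "/ampweb" is not
        # mistaken for something living under "/amp".
        if path != prefix and not path.startswith(prefix + "/"):
            raise ValueError("%r is not under prefix %r" % (path, prefix))
        path = path[len(prefix):]
    # Drop empty elements caused by leading, trailing or doubled slashes.
    return [part for part in path.split("/") if part]
```

With this approach the same view code sees the same list of elements whether the site lives at the root or under `/amp`, so nothing downstream needs to know the layout.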
Found and fixed a few small edge cases in recent ampweb/ampy features that hadn't been tested with the sort of data I was wanting to look at. Also had issues with trying to test the changes I had made to make sure they worked elsewhere, as the influx database on prophet was misbehaving and making it difficult to fetch data in some circumstances.
Spent some time tidying up control messages and configuration when scheduling tests that require cooperation from the server. As part of the previous changes the port number was no longer being sent to tests, which meant it could only operate using the default port - this is now fixed and works for both scheduled and standalone tests. Also fixed up some parameter parsing when running standalone tests where empty parameter lists were not being created properly.
Wrote some basic unit tests for the udpstream test and its control messages. Fixed a possible memory leak when failing to send udpstream packets. Made sure documentation and protobuf files agreed on default values of test parameters.
Started to install the server-side components of AMP on another machine for a test deployment so that I can use the documentation I write as I go to help build/update the packaging for the most recent versions.
Added a latency measure to the udpstream test by reflecting probe packets at the receiver. The original sender can combine the RTT information with jitter and loss to calculate Mean Opinion Scores. This was slightly annoying to implement as (depending on the test direction) the remote end of the test now has to collate and send back partial result data. Updated the ampsave function to reflect the new data reported by the test.
Updated the display of tcpping test information in the scheduling website to reflect the new packet size options. Worked with Shane to update the lamp deployment to the newest version of all the event detection and web display/management software.
Tidied up some more documentation and sent it to a prospective AMP user. Will hopefully get some feedback next week as they try to install it and I can see which areas of the documentation are still lacking.
Lots of minor fixes this week. Fixed the commands to properly kill the entire process group when stopping the AMP client using the init scripts. Still need a cleaner way to do this as part of the main process. Updated the AMP schedule fetching to follow HTTP redirects, which was required to make it work on the Lightwire deployment. Fixed the tcpping test to properly match response packets when the initial SYN contains payload. Different behaviour was observed in some cases where RSTs would acknowledge a different sequence number compared to a SYN ACK, and only one of these was being checked for.
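The tcpping matching problem can be illustrated with a small sketch. This is Python rather than the C the test is written in, the function names are invented for the example, and the two acknowledgement values are my assumption about what "a different sequence number" means here: a response that ignores the SYN payload acknowledges the sequence number plus one, while a response that counts the payload acknowledges the sequence number plus one plus the payload length. Accepting either value means both SYN ACK and RST responses match the probe.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def expected_acks(sent_seq, payload_len):
    """Acknowledgement numbers that could plausibly answer our SYN.

    Assumption for this sketch: the responder either acknowledges only
    the SYN itself, or the SYN plus the payload it carried.
    """
    syn_only = (sent_seq + 1) % SEQ_MOD
    syn_and_payload = (sent_seq + 1 + payload_len) % SEQ_MOD
    return {syn_only, syn_and_payload}


def matches_probe(sent_seq, payload_len, ack):
    """True if a response's ack number corresponds to the probe we sent."""
    return ack % SEQ_MOD in expected_acks(sent_seq, payload_len)
```

Checking only one of the two values is enough when the SYN is empty (they are the same number), which is why the problem only shows up once the SYN carries payload.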
Updated all the tests to report the DSCP settings that they used. They are not currently saved into the database, but they are being sent to the collector now.
Set the default packet interval of the udpstream test to 20ms, which is closer to VoIP than the global AMP minimum interval that it was using. Also wrote most of the code for the test to calculate Mean Opinion Scores based on the ITU recommendations, just need to add a latency measure to complete the calculation.
Did some reading around calculating Mean Opinion Scores for VoIP and started to add code to the udpstream test to calculate them both the Cisco way and the ITU E-model way. Neither of them explicitly takes jitter into account, which seems unusual. My best guess so far is that they count jitter as part of the delay. Other models I've found do include jitter as part of the delay calculation.
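For reference, a simplified E-model calculation looks something like the sketch below. This is not the code in the udpstream test, just an illustration in Python under stated assumptions: the R to MOS conversion is the one given in ITU-T G.107, the base value of 93.2 is the R factor with all E-model parameters at their defaults, the delay impairment uses the commonly quoted Cole and Rosenbluth simplification rather than the full G.107 formula, and the codec values (`ie=0`, `bpl=4.3`) are assumed figures for G.711 without packet loss concealment. Treating jitter as extra delay via `jitter_buffer_ms` is only my guess at how jitter might be folded in.

```python
def r_factor(one_way_delay_ms, loss_percent, jitter_buffer_ms=0.0,
             ie=0.0, bpl=4.3):
    """Simplified E-model transmission rating (R factor).

    Assumptions: default E-model parameters (base R of 93.2), random
    packet loss, and jitter counted purely as additional delay.
    """
    d = one_way_delay_ms + jitter_buffer_ms
    # Delay impairment: linear, with an extra penalty past 177.3 ms.
    delay_impairment = 0.024 * d
    if d > 177.3:
        delay_impairment += 0.11 * (d - 177.3)
    # Effective equipment impairment, which grows with packet loss.
    ie_eff = ie + (95.0 - ie) * loss_percent / (loss_percent + bpl)
    return 93.2 - delay_impairment - ie_eff


def mos_from_r(r):
    """Convert an R factor to an estimated Mean Opinion Score."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
```

If only an RTT is available, halving it to estimate the one-way delay is a further assumption that the path is roughly symmetric.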
Spent some time writing more documentation about installing and configuring an amplet client. Install process, configuration options and schedule file options all get a first draft description, hopefully enough to help people install monitors with minimal assistance but I expect they will need to be expanded. Updated example configuration files to agree with the new documentation.
Various small fixes, including updating the standalone icmp and tcpping tests to print human-readable ICMP errors rather than printing the type and code, and using Python .egg format in the ampsave packages.
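Turning ICMP type and code pairs into readable text is essentially a table lookup. The sketch below shows the idea in Python (the tests themselves are not written in Python). The table only covers a handful of common errors, and the name `describe_icmp_error` and the exact wording of the messages are made up for the example.

```python
# A few common ICMP error (type, code) pairs; deliberately not exhaustive.
ICMP_ERRORS = {
    (3, 0): "Destination network unreachable",
    (3, 1): "Destination host unreachable",
    (3, 2): "Destination protocol unreachable",
    (3, 3): "Destination port unreachable",
    (3, 4): "Fragmentation needed but DF set",
    (3, 13): "Communication administratively prohibited",
    (11, 0): "TTL exceeded in transit",
    (11, 1): "Fragment reassembly time exceeded",
}


def describe_icmp_error(icmp_type, icmp_code):
    """Return a readable description, falling back to the raw numbers."""
    return ICMP_ERRORS.get(
        (icmp_type, icmp_code),
        "ICMP type %d code %d" % (icmp_type, icmp_code),
    )
```

Falling back to the raw type and code means anything missing from the table still prints the same information the tests used to show.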
Merged my scheduling parts of the website back into the main branch so that others can start using the features I've added.
Worked with Brad to get access to the Netspace amplet and bring it slightly more up to date (site firewalls had been interfering with us making changes until now). That's the last amplet that was reporting to erg now upgraded.
Merged my amplet client control socket changes back into the main branch without too much apparent trouble. Need to do some more testing to make sure everything is sensible. After merging, added options to set DSCP bits for the new udpstream test like all the existing tests had.
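For anyone unfamiliar with how DSCP bits end up on a packet: the DSCP value occupies the top six bits of the old IPv4 TOS byte, so it gets shifted left by two before being handed to the socket option. A minimal Python sketch of that follows. The amplet client does this in C, and the function names here are invented for the example.

```python
import socket


def dscp_to_tos(dscp):
    """Convert a 6-bit DSCP value into the value for the IP TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    # DSCP is the upper six bits; the lower two bits are used for ECN.
    return dscp << 2


def set_dscp(sock, dscp):
    """Mark IPv4 packets sent on this socket with the given DSCP value."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
```

For example, Expedited Forwarding is DSCP 46, which becomes a TOS byte of 184 (0xb8). IPv6 sockets would need the traffic class option instead, which this sketch does not cover.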
Continued working on documentation and tidied up the sample program showing how to read data from a RabbitMQ queue and extract the AMP messages.
Had a quick visit to Lightwire on Thursday which generated some more interesting ideas, especially for me around automation of test target selection. Some of these line up nicely with wishlist items I already had, so hopefully I might be able to find time to work on those features soon.