Brendon Jones's blog
Found and fixed a bug in the test scheduling while dealing with some user queries about it. The default value for test frequency (used if not explicitly specified) had the wrong units, which caused tests to be scheduled more frequently than intended. Also fixed a couple of places where wrapping could occur, and wrote some unit tests to cover those code paths.
Removed an unused option and related code paths from the traceroute test that weren't adding a lot except some hideous looking code. While looking at this I tightened up the timers being used for sending/timing out packets to make sure that timeouts were correctly based on the oldest outstanding probe, and that probes were being sent as close to the desired rate as possible. Also added some knobs for changing how many targets are probed at once, and now randomise the initial TTL probed to try not to hammer any nearby hops quite so hard.
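A minimal sketch of the two timing ideas above, not the actual traceroute test code: the timeout is derived from the oldest outstanding probe, and the first TTL is picked at random. The function names, the `outstanding` mapping and the 1 to 8 TTL range are all hypothetical, chosen only for illustration.

```python
import random


def next_timeout(outstanding, now, probe_timeout):
    """Return seconds until the oldest outstanding probe times out.

    `outstanding` maps a probe id to the time it was sent (hypothetical
    structure). Returns None when there is nothing to wait for.
    """
    if not outstanding:
        return None
    oldest_sent = min(outstanding.values())
    # Never return a negative wait if the probe has already expired.
    return max(0.0, oldest_sent + probe_timeout - now)


def initial_ttl(low=1, high=8, rng=random):
    """Pick a random starting TTL so nearby hops are not always hit first.

    The range is an assumption, not the one used by the real test.
    """
    return rng.randint(low, high)
```

With probes sent at 10.0 and 12.5, a 5 second timeout and the clock at 13.0, the next timer fires after 2.0 seconds, because the probe sent at 10.0 is the one that expires first.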
Removed some old and unused files from the amplet2 repository. Updated more documentation to be accurate with the current state of things.
Fixed up another couple of minor issues that had been reported. Fixed the loss timer in the tcpping test to start after the last packet is sent so that long interpacket delays are possible if desired. Tidied up a regex to match certificate filenames more accurately. Made more documentation updates. Tried to improve packaging to make sure that default configuration files were as usable as possible without manual edits.
Built new packages and pushed them out to one of our test deployments. Worked through a few issues with getting the HTTP tests and target meshes lined up properly so that they display. Still need to figure out the correct way to fix this so that users don't need to worry about this special case.
Spent some time reworking my Debian package build system after accidentally building with incorrect source due to some release candidate versioning suffixes being missed. The new system will also better deal with release specific Debian directories.
Made lots of small changes based on things that had been reported by users or that I had noticed behaving incorrectly in the last week. Fixed a cap on a retry timer that was alternating between two different values. Updated the HTTP test to always store the full URL including scheme, even if the user didn't explicitly specify it. Updated some error messages to try to be more useful and accurate.
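As a rough illustration of the URL change, the sketch below adds a scheme when the user left it off. It is not the HTTP test's own code; the `with_scheme` name and the choice of `http` as the default are assumptions.

```python
import re

# A scheme is a letter followed by letters, digits, "+", "." or "-",
# then "://". Only a match at the very start of the string counts.
_SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*://")


def with_scheme(url, default="http"):
    """Return `url` unchanged if it has a scheme, else prepend `default`."""
    if _SCHEME.match(url):
        return url
    return f"{default}://{url}"
```

Matching only at the start means a URL that merely contains another URL in its query string still gets a scheme added.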
Fixed the apache2 configuration in the amppki packages to work properly once everything is installed in the correct system locations. There were issues with the Python path being incorrect and unable to find the libraries, as well as naming collisions with the ampweb WSGI processes.
Spent some time with Shane trying to track down the cause of some missing data in the web graphs. Found the cause of the missing DNS data (wrong column names being used) and why some sites didn't have path length data available (it's only sourced from one style of traceroute test).
Tried to expose through the web interface the ability to force the address family to use when resolving test targets. This was a bit more complicated than expected, because new targets get automatically added to the database with the various suffixes used internally to represent address families still attached.
Put together some new server packages to test the new changes and started working through verifying that they worked, ahead of another release.
Spent some time running the amplet2 client in a few different ways on my test machines, with all the tests running to different sorts of targets. Found a few more slight issues with tests that I investigated and fixed, including one where a test server would be listening on all interfaces and addresses even if the client that spawned it was bound to something specific.
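The listening problem can be pictured with a small socket sketch: if the client was bound to a specific address, the server it spawns should bind to that address too rather than the wildcard. This is only an illustration with hypothetical names, not the amplet2 server code.

```python
import socket


def start_test_server(bind_addr=None, port=0):
    """Listen on `bind_addr` if given, otherwise on all IPv4 addresses.

    Port 0 lets the operating system choose a free port.
    """
    # Treat anything containing ":" as an IPv6 literal (a simplification).
    family = socket.AF_INET6 if bind_addr and ":" in bind_addr else socket.AF_INET
    sock = socket.socket(family, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # An empty host string means the wildcard address for AF_INET.
    sock.bind((bind_addr or "", port))
    sock.listen(1)
    return sock
```

Calling `start_test_server("127.0.0.1")` produces a socket that only accepts connections on loopback, while calling it with no address gives the listen-everywhere behaviour described above.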
Tidied up some memory management issues as reported by valgrind. Freeing the easy and obvious allocations before exiting makes the actual leaks or errors a lot easier to find. Also did some general tidying of data structures, removing some duplication that is no longer required and putting some limits on others.
Built some new server packages with updates from the last few weeks and installed them on one of our test deployments. Ran into some issues with dependencies not being correct, as ubuntu/wheezy don't have them all packaged.
Backported the current librabbitmq-c packages from stretch to use with our wheezy and jessie amplet2 packages, to replace the ones previously patched to include external authentication support. Fixed a few uses of deprecated/changed functions with the new version and deployed them on some test sites to check that they still worked as expected.
Found and fixed a few edge cases where failure to connect to remote test servers was causing crashes rather than gracefully exiting. Tidied up a bit more documentation and some compiler warnings. Had another look at boot dependency ordering, this time around upstart. Couldn't find an easy way to make my init scripts work nicely, but forcibly delaying the start of my init scripts when run under upstart will do the trick until we move on to systemd.
Exposed configuration options for the udpstream test in the web interface so that it can now be scheduled. Made a few other small bug fixes here as well, including updating install documentation/scripts and allowing IPv6 addresses as valid names.
Investigated boot ordering in sysvinit and systemd to try to fix a problem observed in one deployment where the amplet client sometimes starts too soon and doesn't have access to DNS (and occasionally rabbitmq). Attempted some fixes and built new packages, but have yet to hear if the changes made any improvement.
Rewrote the icmp and dns tests to use libwandevent to manage sending and receiving probe packets, which brings them into line with many of the other tests and removes some code that may have had uncertain licensing. Should be able to factor out a lot more of the similarities between these simple tests (icmp, dns, tcpping) at a later date.
Lots of minor tidy ups, quietened some log messages that weren't very relevant. Updated sample configuration files, code documentation and some licensing.
Continued tidying up parts of the amplet2-client code that I had been meaning to look at for a while, but hadn't had the time. Reworked some of the server handling to need less information passed around to configure interfaces, addresses etc, as this was already available in other ways. Made the way that the tests use this information a lot more consistent. Split some of the server connecting code into SSL vs non-SSL sections so that they could be more easily reused rather than duplicating a lot of work each time.
Spent too much time trying to determine why some very simple code was giving incorrect results when I removed debugging output. Compiling without optimisations would also fix it. Ended up changing the way I called it to stop the compiler optimising it out.
Investigated and replaced a few small sections of code that had come from various sources, with implementations using more compatible licenses.
Lots of small improvements to try to polish everything slightly. Quietened some warnings that were being printed during normal test operation. Tidied up some memory usage to make valgrind happier. Updated the amplet2 Debian packaging to fix a few warnings that lintian had thrown up. Removed or tidied up some portions of code that were no longer used, or had been mostly duplicated to get slightly different effects.
Spent some time trying to make the output from the standalone tests consistent with each other, and removing duplicated code involved in printing various parts of the usage statements that all tests share. Tidied up the getopt code around it as well, so that all the tests have a consistent ordering which should make it easier to add new options in the future.
Worked with Shane to install new amp server side packages on the websites that we maintain.
Fixed a heap of smaller bugs in the amplet2 client that had been ignored for a while, including making the sample skeleton tests work again, properly logging errors when failing to configure rabbitmq, and dealing with EADDRINUSE in test server code.
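One common way to deal with EADDRINUSE is to catch it and try the next port. The sketch below shows that approach; whether the amplet2 test servers do exactly this is an assumption, and `bind_first_free` is a hypothetical name.

```python
import errno
import socket


def bind_first_free(host, first_port, attempts=10):
    """Bind a listening TCP socket to the first free port in a small range.

    Returns (socket, port). Errors other than EADDRINUSE are re-raised.
    """
    for port in range(first_port, first_port + attempts):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError as exc:
            sock.close()
            if exc.errno != errno.EADDRINUSE:
                raise
            continue  # port is taken, try the next one
        sock.listen(1)
        return sock, port
    raise OSError(errno.EADDRINUSE, "no free port in the range tried")
```

Only EADDRINUSE is treated as recoverable here, so a permissions problem or a bad address still surfaces as an error instead of being silently retried.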
Removed some deprecated code and comments from the amplet2 client and rearranged source files to more sensible locations as part of a general tidy-up ahead of release. Double checked to confirm that the licenses of the libraries we use will allow us to do so in the way we want.
Built new Debian packages for netevmon, ampweb and ampy, ready for deployment next week.
Updated the control protocol for starting tests and servers to expect and report messages about success/failure, rather than simply closing the connection when things go wrong. This means that you can now get a log message informing you that you aren't allowed to perform the action instead of wondering why it didn't work. Also spent some time tidying up some of the confusion between controlling specific test instances and the high level client control, and renaming functions to more clearly describe what they do.
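The idea of replying with an explicit result, rather than dropping the connection, might look something like the sketch below. The `Response` type, the `handle_start_test` function and the message wording are all hypothetical and do not reflect the real control protocol's message format.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """Explicit result sent back to the peer (hypothetical message shape)."""
    ok: bool
    message: str


def handle_start_test(test_name, allowed_tests):
    """Answer a request to start a test with a success or failure message."""
    if test_name not in allowed_tests:
        # Previously the peer would only have seen the connection close.
        return Response(False, f"not permitted to run test '{test_name}'")
    return Response(True, f"started test '{test_name}'")
```

The caller can then log `message` whenever `ok` is false, which is what makes a refused request visible to the user.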
Fixed up a few bugs around amplet packaging including removing lots of unused cruft from the Jessie init scripts. Fixed up a few bugs in ampweb, including some css issues that had been annoying me, and schedule last-modified times not being updated for certain change types. Schedule last-modified time is now tracked per site rather than per schedule item, which makes it easier to flag sites that need to fetch new configuration (especially for sites that have been removed from a mesh).