Brendon Jones's blog
Added configuration modals for dns, smokeping, munin and some lpi data,
so that multiple data series (of the same type) can easily be viewed at
once and compared. Refactored the initial modal implementation used by
icmp and traceroute data so that it is much cleaner and the new data
types are easier to integrate.
Updated the legend labels describing the currently displayed data to use
the more detailed information Shane has made available. Included in this
is a line number that is now used to fix the colouring order, making
sure that the line colour matches what the label describes.
Spent some time reworking small details on the newest AMP Debian
packages to install and run properly when installed by puppet on the new
Decided that we needed to simplify the database schema for storing
traceroute data, so spent some time working on that. The new schema
works better with the existing aggregation functions and is faster to
query. Moved all the existing data to the new schema.
Merged in the rainbow traceroute graphs that Brad created and got them
using data from the new traceroute schema. Changed the default view of
combined traceroutes to a smokeping-style graph rather than a basic line
graph, to better show what is happening with multiple addresses.
General tidy-up of code that had become a bit crufty, removing some
sections that were duplicated or no longer required. Started work on
moving the DNS test to use views.
Finished up the code to turn a single stream id into a view, for use
with events where we want to see the anomalous data. Merged all the view
changes back into the main branch, which exposed a few broken cases in
things I hadn't considered (netevmon). Worked with Shane to get those
sorted out and working again fairly quickly.
Put together a nice query that will aggregate traceroute data to the
most common path within a binning period. Added a function to fetch this
in NNTSC which works fine for periodic data, but ran into some
difficulties extending this to a single, most recent block of data. It
shouldn't be difficult to get this working - hopefully a fresh look at
it on Monday will get it sorted.
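
The aggregation described above is done with a database query; the same idea can be sketched in plain Python. This is only an illustration, assuming each measurement is a hypothetical `(timestamp, path)` pair where `path` is a list of hop addresses. The function name and data layout are not taken from NNTSC.

```python
from collections import Counter, defaultdict


def most_common_paths(measurements, binsize):
    """Return {bin_start: (path, count)} for the most common path per bin.

    measurements -- iterable of (timestamp, path) pairs (assumed layout)
    binsize      -- width of each bin, in the same units as the timestamps
    """
    bins = defaultdict(Counter)
    for timestamp, path in measurements:
        # Round the timestamp down to the start of its bin.
        bin_start = timestamp - (timestamp % binsize)
        # Lists are not hashable, so count the path as a tuple.
        bins[bin_start][tuple(path)] += 1

    result = {}
    for bin_start in sorted(bins):
        # most_common(1) gives the single (path, count) with the top count.
        path, count = bins[bin_start].most_common(1)[0]
        result[bin_start] = (path, count)
    return result
```

When two paths tie within a bin, `Counter.most_common` keeps whichever was seen first, which may or may not be the behaviour wanted from the real query.
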
First week with our new summer students, so spent some time working with
them and getting them all set up.
Updated the views database to be slightly more complex, making it much
easier to add or remove line groups to/from a view. Also wrote the
supporting code that actually enables users to do this - the label
describing each line group can now be clicked on to remove it from the
graph. The matrix will now generate any views that it needs when the
page is loaded so this no longer needs to be done by hand.
Started to add a streams-to-view interface so that events can be plotted
easily. Events are based on single streams rather than groups of
streams, so they need to be viewable individually.
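
One way to picture a view made of line groups, with a group removed by clicking its label and a single stream wrapped as its own view, is sketched below. The class and function names are hypothetical and the real views database will differ; this only shows the shape of the idea.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class LineGroup:
    """One line on a graph: a label plus the streams aggregated into it."""
    label: str
    stream_ids: Tuple[int, ...]


@dataclass
class View:
    """A set of line groups, keyed by label so one can be removed by name."""
    groups: Dict[str, LineGroup] = field(default_factory=dict)

    def add_group(self, group: LineGroup) -> None:
        self.groups[group.label] = group

    def remove_group(self, label: str) -> Optional[LineGroup]:
        # What a click on a legend label would call; unknown labels are ignored.
        return self.groups.pop(label, None)

    def stream_ids(self) -> List[int]:
        # Every stream that has to be fetched to draw this view.
        return sorted({s for g in self.groups.values() for s in g.stream_ids})


def view_from_stream(stream_id: int) -> View:
    """Wrap a single stream (e.g. the one an event fired on) in its own view."""
    view = View()
    view.add_group(LineGroup(label="stream %d" % stream_id,
                             stream_ids=(stream_id,)))
    return view
```

With this shape, an event on stream 42 can be plotted through the same code path as any other view by calling `view_from_stream(42)`.
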
Spent a bit of time tidying up the new AMP packages to be more
consistent with how files and directories are named. Logging, config,
etc. should all use the same name rather than 2-3 different ones. Fixed a
couple of small bugs in tests/reporting that Shane found while adding
them to NNTSC.
Lots of small fixes to things that use the new view interface. Fixed a
few more caching problems where the list of stale streams to fetch was
being ignored and instead all streams were being fetched. Updated the
tooltips on the matrix to use the new API and to split IPv4 and IPv6
results. Updated traceroute graphs to use the new API.
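
The stale-stream fix comes down to querying only the streams that are missing from the cache. A minimal sketch of that idea follows, assuming a plain dict as the cache and a hypothetical `fetch_many` callable standing in for the real query; neither is the actual ampweb code.

```python
def fetch_streams(requested, cache, fetch_many):
    """Return {stream_id: data} for the requested streams.

    cache      -- dict of stream_id -> data already held (assumed structure)
    fetch_many -- callable taking a list of stream ids and returning a
                  dict of stream_id -> data (stand-in for the real query)
    """
    # Streams we do not hold yet; these are the only ones worth querying.
    stale = [s for s in requested if s not in cache]
    if stale:
        # Passing `requested` here instead would refetch everything.
        cache.update(fetch_many(stale))
    return {s: cache[s] for s in requested if s in cache}
```
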
Replaced most of the dropdowns for the amp-icmp data with a modal
bootstrap window that allows selecting which streams (or groups of
streams) to display. Wrote most of the code to insert these views into
the database if they have not been seen before, and to fetch them out
when required. There are a couple of edge cases around determining the
members of a view that look like they will require a slight redesign of
the database.
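
Inserting a view only if it has not been seen before is a get-or-create lookup. The sketch below uses an in-memory SQLite database purely as a stand-in; the table layout, column names and the comma-separated stream key are all assumptions made for illustration, not the real schema.

```python
import sqlite3


def open_views_db():
    """Create a throwaway in-memory database with a hypothetical views table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE views ("
        " view_id INTEGER PRIMARY KEY,"
        " collection TEXT NOT NULL,"
        " streams TEXT NOT NULL,"
        " UNIQUE (collection, streams))"
    )
    return conn


def get_or_create_view(conn, collection, stream_ids):
    """Return the id of the view for these streams, inserting it if new."""
    # Sort and de-duplicate so the same set of streams always maps to one key.
    key = ",".join(str(s) for s in sorted(set(stream_ids)))
    row = conn.execute(
        "SELECT view_id FROM views WHERE collection = ? AND streams = ?",
        (collection, key),
    ).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute(
        "INSERT INTO views (collection, streams) VALUES (?, ?)",
        (collection, key),
    )
    conn.commit()
    return cur.lastrowid


def streams_in_view(conn, view_id):
    """Return the stream ids stored for a view, or None if it is unknown."""
    row = conn.execute(
        "SELECT streams FROM views WHERE view_id = ?", (view_id,)
    ).fetchone()
    if row is None:
        return None
    return [int(s) for s in row[0].split(",")] if row[0] else []
```

Storing membership as a single string keeps the sketch short, but it is exactly the kind of layout that makes questions about view membership awkward to answer in SQL.
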
All percentile and aggregation data is now fetched as view labels rather
than by stream id, which means the database does the heavy lifting when
calculating stats across multiple streams. Tests that only display data
from a single stream are converted into the view format and use mostly
the same code path. Started work on the database to store the view
descriptions so that they are no longer hard-coded for testing.
Fixed a couple of bugs that meant the same time period could be queried
and displayed twice in some graphs. Fixed some caching problems where
the cache couldn't differentiate between having no cached data and having
cached an empty result.
Continued working through writing instructions for installing the AMP
client and server software onto a machine running NNTSC. During testing
I found a few cases where the URL handling for the matrix display was
expecting a hardcoded prefix, so refactored that to work properly with
other locations and removed a lot of messy duplicated code. Found a few
parts of the database code that were not set up properly as they had
been written since it was last installed.
Spent some time looking for obvious slow points in the display of AMP
graphs and fixed some locations in the smokeping style graphs where
redundant work was being done to generate line colours. A large portion
of the time fetching data is spent sorting it (using disk), so had a
quick look at how to best keep this in memory. Giving postgresql a lot
more memory to work with cut the sort time from around 3s to 100ms, but
this might not be scalable. Ideally we can find ways to limit the amount
of data that actually needs to be sorted.
Started to work on describing graphs and fetching data based on "views"
that contain a number of streams, rather than having ampweb list all the
streams that need to be fetched. Updated the amp-icmp smokeping style
graphs to work with a simple fixed mapping of view to stream ids for
Fixed a couple of bugs in the event grouping code that meant it was
running much slower than it should when groups got large. It should now
be a lot smarter about excluding attributes from the grouping process if
there is no way that using them could result in better groups.
Had a good meeting with Lightwire on Wednesday and got good feedback
about our software. Spent some time talking with Nathan trying to fix
issues they were having with it, and putting together
packages/instructions so that they can install AMP alongside their other
monitoring. This is looking much more complicated than it should be, so
will have to see how much of this can be taken care of in pre/post
install scripts etc. Most of the work is in setting up the server
though, so only needs to be performed once.
Finished reformatting the data to remove some mess and unnecessary
layers of nesting that had crept in while trying different things. It
should now be set up to deal properly with representing multiple lines,
split up or grouped however the backend wants. Updated all the tests to
use the new data format.
Spent an afternoon with Shane and Brad designing how we are going to
represent graphs with multiple lines, in a way that will let us merge
and split data series based on how the user wants to view the data.
Tidied up the autogenerated colours for the smokeping graphs to use
consistent series colours across the summary and detail views, while
also being able to use the default smokeping colouring if there is only
a single series being plotted.
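
Consistent colours across the summary and detail views follow naturally if the colour depends only on the series index and the number of series. A rough sketch, assuming evenly spaced hues and a `None` return meaning "use the default smokeping colouring"; the function name and the saturation and value constants are arbitrary choices for illustration.

```python
import colorsys


def series_colour(index, count):
    """Colour for series `index` out of `count`, as a "#rrggbb" string.

    Returns None when there is only one series, which the caller is
    assumed to treat as "use the default smokeping colouring".
    """
    if count <= 1:
        return None
    # Spread hues evenly around the colour wheel.
    hue = (index % count) / count
    r, g, b = colorsys.hsv_to_rgb(hue, 0.9, 0.8)
    return "#%02x%02x%02x" % (round(r * 255), round(g * 255), round(b * 255))
```

Because the function has no state, both graphs get the same colour for a series as long as they agree on its index.
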
Moved the multiple series line graphs back to using the smokegraph
module, but with colouring based on the series rather than to indicate
loss. This appears to work well for the smaller data series that I've
tested on, though I have yet to get a sensibly aggregated set of data
for those graphs with very large numbers of streams.
The new graphs with arbitrary numbers of data series had caused event
labels to be triggered on mouseover for almost all series except the
first, which I fixed. Now only a dummy series triggers mouse events, so
that the graph doesn't try to display information about every single data point
on the graph. Through profiling I also found many extraneous loops and
checks for events that could be prevented by properly disabling events
on the summary graph as well.
Also spent some time reading and critiquing honours reports; not long to go!