Brendon Jones's blog
Kept working with Chromium to try to get complete information on object
fetch timings. It looks like I should be able to get full timing
information for every object if I can set the Timing-Allow-Origin
header. Currently stymied by the library crashing in its memory freelist
implementation when I try to modify HTTP response headers.
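The Timing-Allow-Origin check itself is simple: per the Resource Timing spec, detailed timings for a cross-origin resource are only exposed if the response header lists the requesting origin (or "*"). A minimal sketch of that decision, with an illustrative function name not taken from any real code:

```python
# Sketch: decide whether full resource timing details may be exposed for a
# cross-origin fetch, based on the Timing-Allow-Origin response header.
# The function name is illustrative.

def timing_allowed(header_value, requesting_origin):
    """Return True if the Resource Timing check would expose full timings."""
    if header_value is None:
        return False
    allowed = [v.strip() for v in header_value.split(",")]
    return "*" in allowed or requesting_origin in allowed
```

So a server that wants its objects fully measurable by third parties just needs to send `Timing-Allow-Origin: *`.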
Had a closer look at the behaviour of wget to try to confirm the test
methodology used by some other data sources I'm looking at. It turns out
that wget measures only the amount of time spent reading from the
socket and ignores everything else, reporting a very misleading time.
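To illustrate how much that approach can understate things, here is a sketch with made-up phase durations (the numbers are illustrative, not measurements):

```python
# Sketch: why timing only the socket read understates a fetch. The phase
# durations (in milliseconds) are made-up illustrative numbers.

def total_fetch_time(phases):
    """Total wall-clock time across all phases of an HTTP fetch."""
    return sum(phases.values())

def wget_style_time(phases):
    """Only the time spent reading the response body from the socket."""
    return phases["read"]

phases = {"dns": 50, "connect": 120, "request": 10, "ttfb": 300, "read": 80}
```

With those numbers the fetch takes 560ms of wall-clock time but a read-only measurement reports just 80ms, missing DNS resolution, connection setup and time to first byte entirely.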
Dug further into some MTU issues we were seeing to confirm the
behaviour we were reporting. Something in the path has only a 1400-byte
MTU but doesn't always send "packet too big" messages, which is causing
lots of connection failures.
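The arithmetic behind the failures is worth spelling out. A quick sketch (IPv4 with no options, header sizes from the protocol specs):

```python
# Sketch: payload sizes with a 1400-byte path MTU (IPv4, no IP/TCP options).
IP_HEADER = 20
TCP_HEADER = 20

def max_tcp_payload(mtu):
    """Largest TCP payload that fits in a single packet at this MTU."""
    return mtu - IP_HEADER - TCP_HEADER

# A host assuming a 1500-byte MTU sends 1460-byte segments. With a
# 1400-byte link in the path (max payload 1360) and no "packet too big"
# messages coming back, those segments are silently dropped and the
# connection stalls once it tries to send full-sized packets.
```

This is the classic path MTU discovery black hole: small packets (like the TCP handshake) get through, so the connection appears fine until bulk data starts flowing.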
Spent some time proofreading reports.
Made a new dump of up-to-date data for analysis, including all test
types this time. Spent some time talking to Ray about it.
Generated some graphs to compare the latency between two connections.
Some connections that should be quite similar differ by a few
milliseconds to the same target at the same time, but the difference is
quite consistent across all targets and all times. Connections that are
known to be different also showed very similar latency to some targets,
but across multiple targets and over time there are clear differences.
Spent some more time trying to get to grips with the embedded Chromium
library and how to implement my own versions of the URL fetching functions.
Spent some more time looking into using embedded Chromium as part of the
HTTP test. I've managed to successfully extract all of the
navigation/resource timing information from the browser after the page
has loaded, which is very useful. Getting access to headers looks like it
will require implementing my own resource handlers and processing
requests manually, but should be doable. Also, I still haven't managed
to completely decouple the browser from GTK - something is still trying
to initialise it even though there is no need for it and nothing is ever
drawn to the screen.
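Once the timing entries are out of the browser, turning them into per-object fetch durations is straightforward. A sketch using attribute names from the W3C Resource Timing interface (the sample data is made up):

```python
# Sketch: summarising Resource Timing entries pulled out of the browser.
# Attribute names follow the W3C Resource Timing interface; the sample
# entries are made up.

def object_durations(entries):
    """Map each fetched object to its total fetch duration in ms."""
    return {e["name"]: e["responseEnd"] - e["startTime"] for e in entries}

entries = [
    {"name": "http://example.com/", "startTime": 0.0, "responseEnd": 310.0},
    {"name": "http://example.com/style.css", "startTime": 320.0,
     "responseEnd": 450.0},
]
```

The same entries also carry the intermediate milestones (DNS, connect, time to first byte), so the per-phase breakdowns fall out the same way.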
Tidied up some more configuration parsing and parts of the main loop in
the amplet client, removing the need for a few more global variables
that were convenient at the time of writing.
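The pattern behind that refactoring is simple: collect the options that used to live in globals into one configuration object and hand it to whatever needs it. Amplet itself is C, but the equivalent Python sketch (with made-up option names) shows the shape of it:

```python
# Sketch of the refactoring pattern (amplet is C; this is an illustrative
# Python equivalent): options that used to be globals live in one config
# object that is passed explicitly. Option names are made up.

from dataclasses import dataclass

@dataclass
class Config:
    collector: str
    interval: int = 60
    nameserver: str = "127.0.0.1"

def run_test(cfg):
    """Tests read their settings from the config they are handed."""
    return "reporting to {} every {}s".format(cfg.collector, cfg.interval)
```

Besides tidiness, the win is that tests and the main loop no longer depend on initialisation order of far-away globals.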
Helped Brad configure a new measurement machine to be sent out.
Reconfigured some of the existing machines to swap the management ports
around so we can test them without the reporting traffic interfering.
Short week as I was off sick on Monday and Tuesday.
Spent some time looking into using a headless web testing environment
as an alternative to the current HTTP test. This would give us access
to page objects that the current test doesn't see (due to them being
generated programmatically or obfuscated). Not
all of the headless testing software appears to give full access to the
events that I'm interested in, while some are written such that they
will be awkward to integrate into an AMP test. Currently looking at
embedded Chromium as most likely to be useful.
Started refactoring some of the configuration parsing code in amplet to
remove some unnecessary globals and remove some cruft from the main loop
that didn't really need to be there.
Updated the website authentication to make it easy to toggle on and off,
as we don't want to protect the public site. Merged this and the rest of
the recent changes (raw data fetching etc) back into the develop branch.
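The toggle is just a settings flag consulted before enforcing a login. A minimal Python sketch of the idea (the flag name and decorator are hypothetical, not from the real site code):

```python
# Sketch: a settings-driven toggle for site authentication. The flag name
# and decorator are hypothetical, not from the real website code.

SETTINGS = {"auth.enabled": False}

def require_login(view):
    """Wrap a view so it demands a user only when auth is enabled."""
    def wrapped(user=None):
        if SETTINGS["auth.enabled"] and user is None:
            return "403 Forbidden"
        return view()
    return wrapped

@require_login
def index():
    return "public data"
```

With the flag off, the public site behaves exactly as before; flipping it on protects every wrapped view without touching the views themselves.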
Spent some time looking into what appear to be periodic MTU issues on
one of our test connections that are preventing the throughput test from
running. Confusing matters further, I'm not sure how well the route
cache deals with network namespaces - sometimes it appears to be shared
between all connections and sometimes it doesn't. It's possible
these symptoms would go away with a newer kernel version (route cache
was removed, better network namespace support).
Fixed the URL parsing to allow partial specification of the desired
data. If the URL is incomplete then the user is returned a list of valid
values for the next portion. Default values are automatically selected
if there is only a single possible value. If the URL is missing all
parameters then the user is presented with documentation giving a basic
overview of the API.
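The resolution logic described above can be sketched as a small function: walk the expected parameters in order, auto-select any parameter with a single possible value, and stop at the first genuinely ambiguous one. The schema and sample streams here are illustrative, not the real API's:

```python
# Sketch of the partial-URL behaviour: an incomplete URL yields the valid
# values for the next parameter, auto-filling any parameter with only one
# choice. Schema and sample data are illustrative.

SCHEMA = ["test", "source", "destination"]
STREAMS = [
    {"test": "icmp", "source": "ampz-auckland", "destination": "ampz-waikato"},
    {"test": "icmp", "source": "ampz-auckland",
     "destination": "ampz-wellington"},
]

def resolve(parts):
    """Return ("data", selection) when complete, else ("options", values)."""
    selected = dict(zip(SCHEMA, parts))
    while len(selected) < len(SCHEMA):
        key = SCHEMA[len(selected)]
        matches = [s for s in STREAMS
                   if all(s[k] == v for k, v in selected.items())]
        options = sorted({s[key] for s in matches})
        if len(options) != 1:
            return ("options", options)
        selected[key] = options[0]    # only one possible value: default it
    return ("data", selected)
```

An empty URL falls straight into the same machinery (here it auto-fills "icmp" and the only source, then offers the destinations); the real interface additionally shows the API overview documentation in that case.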
Added some smarts to deal with all the different data columns that the
amp-latency tests can return (icmp, dns, tcpping are all slightly
different). This keeps a consistent order of columns and makes sure that
the labels all line up appropriately with the data.
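The core of that alignment is projecting every row onto one fixed column order, padding anything a particular test type doesn't report. A sketch with illustrative column names:

```python
# Sketch: aligning rows from latency tests that report slightly different
# columns (icmp, dns, tcpping), keeping one consistent column order and
# padding missing values with None. Column names are illustrative.

COLUMNS = ["timestamp", "rtt", "loss", "query_len"]

def align(rows):
    """Return each row as a list in COLUMNS order, None where absent."""
    return [[row.get(col) for col in COLUMNS] for row in rows]
```

An icmp row simply gets None in the dns-only columns, so every label always sits over the right column regardless of which test produced the row.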
Updated the way data is fetched to use a more sensible JSON format
that can easily be converted to CSV, so that both formats can be supported.
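With the results as a list of flat JSON objects, the CSV conversion falls out of the standard library almost for free. A sketch (field names are illustrative):

```python
# Sketch: converting a list of flat JSON objects to CSV using the stdlib.
# Field names in the sample are illustrative.

import csv
import io
import json

def to_csv(json_text):
    """Render a JSON array of flat objects as CSV with a header row."""
    rows = json.loads(json_text)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=rows[0].keys())
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Keeping the objects flat (no nesting) is what makes supporting both formats cheap.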
Spent some time checking that normal behaviour was not impacted by some
small changes I had made to ampy, and tidying up a couple of places
where changes had accidentally affected graph drawing.
Continued to work on the raw data interface to fetch AMP data through
the website. It took some time to find the appropriate place to deviate
from the normal aggregated fetching used for the graphs, but with
minimal code changes there is now a path that will follow the full data
fetching used by standalone programs (e.g. netevmon).
Fetching now works for data described by a stream id, following almost
the same path as usual for graphs. To allow some degree of data
exploration and easy generation of URLs it's also important to deal with
data described by the human readable stream properties. I'm currently in
the process of converting a URL with stream properties into a stream id,
and alerting the user to missing properties that are required to define
a stream.
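That conversion is essentially a lookup keyed on the required properties, with the error path reporting exactly which ones are absent. A sketch with made-up data:

```python
# Sketch: resolving human-readable stream properties to a stream id and
# reporting which required properties are missing. Data is illustrative.

REQUIRED = {"test", "source", "destination"}
STREAMS = {
    ("icmp", "ampz-auckland", "ampz-waikato"): 1301,
}

def stream_id(props):
    """Look up the stream id, or complain about missing properties."""
    missing = REQUIRED - props.keys()
    if missing:
        raise ValueError("missing properties: " + ", ".join(sorted(missing)))
    return STREAMS[(props["test"], props["source"], props["destination"])]
```

Listing the missing properties (rather than just failing) is what makes URL-based data exploration workable.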
Updated the HTTP test to not include time spent fetching objects that
eventually timed out, as all that was doing was recording the curl
timeout duration. Instead, we need to report the number of failed
objects, last time an object was successfully/unsuccessfully fetched,
and possibly try to update the timeouts to match those commonly used by
web browsers.
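The reporting change amounts to splitting timed-out fetches out of the duration sum and counting them instead. A sketch (field names are illustrative):

```python
# Sketch of the reporting change: timed-out fetches no longer contribute
# their (fixed) timeout duration to the total; they are counted as
# failures instead. Field names are illustrative.

def summarise(objects):
    """Total duration of successful fetches plus a failure count."""
    ok = [o for o in objects if not o["timed_out"]]
    return {
        "duration": sum(o["duration"] for o in ok),
        "failed_objects": len(objects) - len(ok),
    }
```

Previously a single timed-out object would add the entire curl timeout to the page time, swamping the real signal.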
Switched the meaning of "in" and "out" for throughput tests, as
somewhere along the way this got switched. This involved updating
existing data in the database as well as the code that saves the data.
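The fix itself is a simultaneous swap, which needs a little care so one value doesn't clobber the other before it's read; the database update needs the same care. A Python sketch with illustrative field names:

```python
# Sketch: swapping the stored direction labels. Simultaneous assignment
# avoids clobbering one value with the other; the equivalent database
# UPDATE needs the same care. Field names are illustrative.

def swap_direction(row):
    """Exchange the in/out measurements on a single result row."""
    row["bytes_in"], row["bytes_out"] = row["bytes_out"], row["bytes_in"]
    return row
```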
Added a bit more information to log messages to help identify the
specific amplet client that was responsible, as it was becoming
confusing in situations with multiple clients running on the same machine.
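The general technique is to bind the client's identity to the logger once, so every message carries it automatically. A sketch using the stdlib logging module (the client name shown is made up):

```python
# Sketch: tagging log messages with the amplet client they came from, so
# multiple clients on one machine can be told apart. Uses the stdlib
# logging module; the client name is made up.

import logging

def get_logger(client_name):
    """Logger whose records carry the originating client's name."""
    logger = logging.getLogger("amplet." + client_name)
    return logging.LoggerAdapter(logger, {"client": client_name})
```

A formatter that includes %(client)s (or simply the per-client logger name) then disambiguates the output without changing any call sites.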
Started adding an interface to download raw data from the graph pages.
Partway through, as it was taking longer than expected, I took a slight
detour and wrote a standalone tool to dump the data from NNTSC.
Thursday was my first day back after my break, so spent some time
catching up on things that had happened while I was away.
Shane and Brad had found some unusual data being reported, so I looked
into that and updated schedules to help solve some of the problems. Also
exposed some more tuning knobs so that we can change inter-packet delay
when sending probes (we were sending too fast in some cases) and merged
in some fixes that Shane had written.
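The inter-packet delay knob boils down to deriving probe send times from a configurable spacing. A trivial sketch (values in milliseconds, all illustrative):

```python
# Sketch: deriving probe send times from a configurable inter-packet
# delay, the tuning knob mentioned above. Values are in milliseconds
# and are illustrative.

def send_times(start, count, delay):
    """Timestamps at which each probe in a burst should be sent."""
    return [start + i * delay for i in range(count)]
```

Spacing probes out matters because back-to-back probes can self-induce queueing and loss, skewing the very latency they are trying to measure.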
Built some new Debian packages with these changes and pushed them out,
which appears to have immediately improved the quality of the data we
are collecting.
Configured the third throughput test target and updated the test
schedule to properly include all three throughput test targets. Went
through all the results to make sure that all are reporting - found and
fixed a couple where incorrect HTTP targets had been set and redirects
were happening. Double checked that some unusual throughput results were
correct (they appear to be).
Spent some time investigating some connections that appeared to be up,
but wouldn't forward my data. The modem seems to think everything is
fine and there is nothing obviously wrong at my end, so I've asked for
it to be investigated.