Released a new version of libtrace on Tuesday that contains the most recent batch of bug fixes. Started moving the libtrace wiki from trac to github; only the tool pages are left to migrate.
Updated netevmon to support the new family-based streams in NNTSC. Since this new approach results in one time series per stream (as opposed to multiple streams having to be aggregated into each time series), this greatly simplified the anomalyfeed script. Added event detection for changes in AS paths which operates in much the same way as the old IP path event detection.
Started adding the ability to specify a subset of streams / collections for event detection in netevmon, rather than automatically running against all streams. The streams / collections of interest are provided via a config file and a SIGHUP will cause the file to be re-read and any necessary changes made. This also meant I had to add unsubscribe support to the NNTSC exporter, so that it would stop sending live updates for streams that had been removed from the config file.
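As a rough sketch of the reload logic described above (not the actual netevmon code): the signal handler only sets a flag, and the main loop later re-reads the file and works out which streams to subscribe to or unsubscribe from. The file format (one stream ID per line), the `StreamWatcher` class and the `subscribe` / `unsubscribe` callbacks are all hypothetical names chosen for illustration.

```python
import signal


def load_streams(path):
    """Read one stream ID per line, skipping blank lines and '#' comments.

    The one-ID-per-line format is an assumption made for this sketch.
    """
    with open(path) as f:
        return {
            line.strip()
            for line in f
            if line.strip() and not line.lstrip().startswith("#")
        }


def diff_streams(current, wanted):
    """Return (to_subscribe, to_unsubscribe) as two sets."""
    return wanted - current, current - wanted


class StreamWatcher:
    """Hypothetical helper that tracks which streams are subscribed."""

    def __init__(self, path, subscribe, unsubscribe):
        self.path = path
        self.subscribe = subscribe      # hypothetical callback: start updates
        self.unsubscribe = unsubscribe  # hypothetical callback: stop updates
        self.active = set()
        self.reload_pending = True      # forces an initial load

    def on_sighup(self, signum, frame):
        # Keep the handler trivial; the real work happens in apply_pending().
        self.reload_pending = True

    def apply_pending(self):
        """Call from the main loop; does nothing unless a reload was requested."""
        if not self.reload_pending:
            return
        self.reload_pending = False
        wanted = load_streams(self.path)
        to_add, to_remove = diff_streams(self.active, wanted)
        for stream in sorted(to_remove):
            self.unsubscribe(stream)
        for stream in sorted(to_add):
            self.subscribe(stream)
        self.active = wanted


def install(watcher):
    """Register the SIGHUP handler (SIGHUP is only available on Unix)."""
    signal.signal(signal.SIGHUP, watcher.on_sighup)
```

Deferring the work to the main loop means the file is never parsed inside the signal handler itself, and streams that appear in both the old and new config are left untouched.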
Continued to work on the interface for scheduling tests. As well as
adding new tests to a site, you can now modify an existing test. Full
details on a specific test can be viewed in a modal window very similar
to that used to create the test, and options/scheduling can be modified
there. Extra destinations can be added and existing destinations can be
removed, and the test itself can be completely deleted.
Added backend support to deal with all the above - including
adding/deleting test destinations, deleting tests, modifying test
arguments, modifying test schedules.
it was very similar and didn't warrant being entirely separate, but had
diverged enough to be annoying. The templates for the modals will need a
similar job done on them.
I started on preparing the slides for the internal PhD conference. I made some new slides for Megatree and prepared some graphs from the CAIDA Doubletree data.
I made a repository for the PAM version of the churn paper to share with Richard. The graph titles have been removed and put in the text of the paper instead. It currently needs to be shortened by one more page. I also found some journals which would take the full-length paper and some that would do so after it had been presented at PAM. These are a possibility.
I downloaded the black hole detector data and analysed it for black holes in load balancers. The detector has also had its address files updated and has then been restarted.
I found a few more bugs in the non-event-driven Doubletree simulator, so it has been restarted after the fixes were made. I am gradually getting a good data set for the three simulation types: Doubletree, CAIDA Doubletree and Megatree.
Updated the DPDK format in classic libtrace to support the latest library versions, per Shane's request. We now have a fairly nice way of dealing with the differences between library versions. Also ported some other patches from my branch, such as supporting multiple libtrace instances each running DPDK on different interfaces. Updated documentation for the DPDK format and moved this to github. Given that I have some 10Gbit machines on the way that I'll be wanting to try with DPDK, this is good to get into again. Some of these changes I still need to pull into my own branch.
I've also been working on refactoring the combiner step in my parallel libtrace (between the perpkt threads and the reporter thread) to provide an API so users can supply their own combiner. This removes the ability to call trace_get_results(), in favour of delivery directly to the reporter function.
Libtrace 3.0.21 has been released today.
This release fixes many bugs that have been reported by our users, including:
* trace_interrupt() now works properly for int, bpf, dag and ring formats.
* fixed double-counting of accepted packets when using the event API.
* fixed incorrect filtered packet counts for bpf format.
* fixed crash when performing very large reads with libwandio.
* fixed inconsistent behaviour if a bad filter string is used with int and dag formats.
* fixed potential infinite loop when combining filters, the event API and the pcapint format.
* fixed incorrect wire lengths when using SNAPLEN config option to truncate packets captured using the int format.
The full list of changes in this release can be found in the libtrace ChangeLog.
You can download the new version of libtrace from the libtrace website.
Finished up a draft of the PAM paper, eventually managing to squeeze it into the 12 page limit.
Spent a bit of time learning about DPDK while investigating a build bug reported by someone trying to use libtrace's DPDK support. Turns out we were a little way behind current DPDK releases, but Richard S has managed to bring us more up-to-date over the past few days. Spent my Friday afternoon fixing up the last outstanding known issue in libtrace (trace_interrupt not working for most live formats) in preparation for a release in the next week or two.
After discussion with Richard I took initial steps to put the churn paper into PAM submission format. Initially this resulted in 22 pages, against a limit of 10 pages. After removing two sections and the related discussion, removing graph captions, and switching to sub-caption formatting, the result was 11 pages.
An alternative to the severe restrictions of PAM is the "International Journal of Computer Networks & Communications", which has a limit of 20-25 pages and is published bi-monthly. I will need to find out more about this journal and its suitability.
In the non-event-based Doubletree simulator, the method currently used to let many sources make use of local stop sets runs for excessively long periods of time. It already uses text file lists of source addresses that occur more than once, to limit the array sizes for these values. So far I have not thought of other ways to further improve the run times. Fortunately, the benefits of local stop sets are expected to be small, with only a few runs occurring from each source. Furthermore, it may be possible to examine the savings in a few thousand cases, for varying numbers of traces from a given source between 2 and 12, and predict the overall savings for hundreds of thousands of cases.
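One simple way such a prediction could work, sketched here purely as an assumption rather than as the method actually used: group the sampled cases by the number of traces per source, take the mean saving in each group, and scale each mean by the number of sources in the full data set with that many traces. The function name and the two input dictionaries are hypothetical.

```python
from statistics import mean


def estimate_total_savings(sampled, population):
    """Scale per-group mean savings up to the full population.

    sampled:    {traces_per_source: [saving observed in each sampled case]}
    population: {traces_per_source: number of sources with that many traces}
    Both structures are assumptions made for this sketch.
    """
    total = 0.0
    for traces_per_source, n_sources in population.items():
        observed = sampled.get(traces_per_source)
        if not observed:
            raise ValueError(
                f"no sampled cases for {traces_per_source} traces per source"
            )
        total += n_sources * mean(observed)
    return total
```

The estimate is only as good as the sample: every trace count that occurs in the population (2 to 12 here) needs enough sampled cases for its mean to be representative.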
Built new Debian and CentOS packages for the updated libwandevent code,
and used those to build new amplet2 packages for CentOS. Debian packages
still need a bit more work to build in my new environment. Deployed a
couple of the new packages to further test some of the new traceroute
reporting for Shane.
Hooked up the rest of the test arguments in the form to schedule a new
test, so they are all now properly added to the database when the form
Filtered the YAML output to only include meshes that are used in the
schedule to reduce file size. Added code to track the time that
schedules were last updated, so that I can return a 304 Not Modified to
clients that request the YAML when there have been no changes.
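A minimal sketch of both ideas, assuming the client sends a standard `If-Modified-Since` header and that the schedule is a list of tests naming their source and target meshes. The data shapes and the function names `used_meshes` and `respond` are hypothetical and are not taken from the real code.

```python
from email.utils import formatdate, parsedate_to_datetime


def used_meshes(meshes, schedule):
    """Keep only the meshes that some scheduled test actually refers to.

    meshes:   {mesh_name: [members]}                      (assumed shape)
    schedule: [{"sources": [...], "targets": [...]}, ...]  (assumed shape)
    """
    referenced = set()
    for test in schedule:
        referenced.update(test.get("sources", []))
        referenced.update(test.get("targets", []))
    return {name: members for name, members in meshes.items()
            if name in referenced}


def respond(last_updated, if_modified_since, render_body):
    """Return (status, headers, body) for a request for the schedule.

    last_updated:      Unix timestamp of the most recent schedule change
    if_modified_since: raw header value from the client, or None
    render_body:       callable that builds the YAML, only called for a 200
    """
    # HTTP dates have one-second resolution, so compare whole seconds.
    updated = int(last_updated)
    headers = {"Last-Modified": formatdate(updated, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since).timestamp()
        except (TypeError, ValueError):
            since = None  # unparseable header: fall through to a full reply
        if since is not None and updated <= since:
            return 304, headers, b""
    return 200, headers, render_body()
```

Passing the body as a callable means the YAML is never generated when a 304 is returned, which is where most of the saving comes from.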
Spent Wednesday watching student honours presentations. Well done to our
students who presented.
Spent most of my week writing up a paper for PAM on the event detectors we've implemented in netevmon.
Wrote and tested a script to ease the transition from the current per-address stream format to a per-family stream format. We've already accepted that we're not going to try and migrate any existing collected data for the affected collections, so it is mostly a case of making sure we drop all the right tables (and don't drop any wrong ones).
Spent Wednesday at the student Honours conference. Our students did fairly well and were much improved on their practice talks.
Successful week this week:
I recompiled everything from 6lbr's development branch (i.e. native 6lbr, mbxxx slip-radio and mbxxx 6lbr-demo), explicitly setting PAN IDs to 0xabcd for the mbxxx builds (for consistency with other platforms). I also made sure to reset the MAC and RDC layers of each build to use the "null" drivers, which ensure that packets should be transmitted/received as quickly as the hardware allows (with no regard for power usage). This resulted in excellent quality packet transmission, with the possibility of 0% packet loss over several minutes of pinging the devices, even for fragmented packets. The RTT is a quite respectable 60-70ms for pings split into two fragments.
Using the development branch for the mbxxx platform meant that the memory footprint increased again, which was a big problem for the 6lbr-demo app. To offset this, I completely disabled TCP (since it isn't needed by CoAP), and halved most of the buffers used by RPL. Thanks to gcc's magical -fpack-struct=1 there is now a working CoAP server on the device that can return its uptime in seconds!
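To illustrate why packing structures saves memory, here is a small sketch using Python's `struct` module rather than the C compiler itself. The layout (one byte followed by a 32-bit integer) is a made-up example and is not taken from the Contiki or 6lbr sources.

```python
import struct

# Roughly equivalent to: struct { uint8_t flag; uint32_t value; } in C.
# "@" uses the platform's native alignment, so padding is typically
# inserted after the single byte to align the 32-bit field.
native_size = struct.calcsize("@BI")

# "=" uses standard sizes with no alignment padding, which is similar in
# effect to compiling the C struct with -fpack-struct=1.
packed_size = struct.calcsize("=BI")

# Bytes saved for every instance of this structure held in memory.
saved_per_entry = native_size - packed_size
```

On most platforms the native layout is larger than the packed one, so tables holding many such entries shrink noticeably. The trade-off is that packed fields may be unaligned, which some architectures handle slowly or not at all.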