The Network Measurement Ecosystem
We've been doing a lot of collaborative work with our ISP partners lately and one thing that has become increasingly apparent to me is the disconnect between what ISPs expect from measurement / monitoring software and what researchers typically have the time and energy to implement.
More specifically, researchers are very good at developing new or improved measurement techniques but they are not so great at developing the necessary infrastructure around the measurements to make it easy for ISPs to deploy and use the new techniques in a production environment. As a result, the ISPs tend to fall back on tried and true monitoring software (e.g. Smokeping) even though our conversations with operators suggest that they would prefer more than just the simple metrics and graphs that such tools provide.
The act of performing a network measurement is only a small part of what I term the network measurement ecosystem, i.e. the complete set of components necessary for a network measurement technique to be deployed in a live production network, such as an ISP network. Only when all components of the ecosystem are in place will a network operator feel comfortable with the idea of using a particular tool or technique to measure and monitor their network. Measurement tools that do not have a complete ecosystem surrounding them will seldom be deployed in a production network, regardless of how useful or relevant that tool is to the operator.
In my view, the network measurement ecosystem consists of five key components which are described below:
Scheduling: It must be possible for a user to schedule regular continuous measurements. Once scheduled, the measurements should occur automatically without requiring any further user interaction. The system can also support ad-hoc or manual measurements for troubleshooting specific problems.
Collection: This is the process that runs the measurement itself and produces a recordable result. Most of the development of this component will have occurred during the research phase, although some additional polish may still be necessary to ensure the collection software is robust and reliable.
Storage: This component parses the results produced by the collection process and writes them into a queryable database to allow future access to the recorded data. This data can then be used to produce graphs or reports on request.
Accessibility: The ecosystem must include a standard API for fetching data from the storage backend, ideally without the user having to know the query language used by the database. An interface should also be present that allows users to receive a live feed of measurement results as they are reported by the collector; this can be used for real-time analytics, e.g. anomaly detection.
Visualisation: The visualisation component will provide a graphical front-end that allows users to explore and interact with the collected measurement data. This would include the ability to draw custom graphs or view tables that summarise important aspects of the data.
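To make the division of labour between the five components a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the names (collect_rtt, Store, DataAPI, run_schedule, summarise), the in-memory SQLite table and the made-up RTT values do not come from any real tool. The "measurement" is a deterministic stand-in and the schedule runs on a simulated clock so the example finishes instantly.

```python
import sqlite3
from statistics import mean
from typing import Callable, List, Tuple

Row = Tuple[float, str, float]  # (timestamp, target, rtt_ms)


def collect_rtt(target: str, ts: float) -> float:
    """Collection: stand-in for a real test; returns a made-up RTT in ms."""
    return 10.0 + (len(target) + int(ts // 60)) % 5


class Store:
    """Storage: writes each result into a queryable database."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE rtt (ts REAL, target TEXT, rtt_ms REAL)")

    def write(self, row: Row) -> None:
        self.db.execute("INSERT INTO rtt VALUES (?, ?, ?)", row)


class DataAPI:
    """Accessibility: historical queries plus a live feed, with no SQL exposed."""

    def __init__(self, store: Store) -> None:
        self.store = store
        self.listeners: List[Callable[[Row], None]] = []

    def subscribe(self, callback: Callable[[Row], None]) -> None:
        self.listeners.append(callback)

    def publish(self, row: Row) -> None:
        # Persist the result, then push it to every live-feed subscriber.
        self.store.write(row)
        for callback in self.listeners:
            callback(row)

    def fetch(self, target: str, start: float, end: float) -> List[Row]:
        cur = self.store.db.execute(
            "SELECT ts, target, rtt_ms FROM rtt "
            "WHERE target = ? AND ts BETWEEN ? AND ? ORDER BY ts",
            (target, start, end),
        )
        return cur.fetchall()


def run_schedule(api: DataAPI, targets: List[str], start: float,
                 interval: float, count: int) -> None:
    """Scheduling: run every test at a fixed interval (simulated clock)."""
    for i in range(count):
        ts = start + i * interval
        for target in targets:
            api.publish((ts, target, collect_rtt(target, ts)))


def summarise(rows: List[Row]) -> str:
    """Visualisation: a text summary standing in for a graph."""
    rtts = [row[2] for row in rows]
    return f"n={len(rtts)} min={min(rtts):.1f} mean={mean(rtts):.1f} max={max(rtts):.1f}"


# Usage: a live-feed consumer flags slow results while the schedule runs.
api = DataAPI(Store())
alerts: List[Row] = []
api.subscribe(lambda row: alerts.append(row) if row[2] > 13.0 else None)
run_schedule(api, ["a.example", "bb.example"], start=0.0, interval=60.0, count=5)
print(summarise(api.fetch("a.example", 0.0, 300.0)))
```

Note that the code which actually performs the measurement is a single short function, while everything else exists only to make the results schedulable, stored, accessible and viewable. That imbalance is exactly the point of the ecosystem model.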
Obviously, developing all of these components independently for a single measurement technique requires a significant amount of extra time and effort, and few research projects can afford to do so. As a result, even when the research project itself is successful, the resulting measurement approach may see little use outside of the academic community.
But what if we could instead build relatively generic ecosystems that provide all of the necessary components without being tied to a specific metric or tool? This could greatly reduce the amount of effort required to turn a research tool into a complete measurement system that would appeal to network operators.
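One way to picture "not tied to a specific metric" is a design where each measurement describes itself to the shared layers instead of the shared layers knowing about each measurement. The sketch below is purely hypothetical: Metric, Ecosystem and the two parsers are names I have invented for illustration, and the raw result formats are made up. It is not how any particular system does it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, float]


@dataclass
class Metric:
    """Everything the generic layers need to know about one measurement."""
    name: str
    fields: List[str]
    parse: Callable[[str], Record]  # turns raw collector output into a record


class Ecosystem:
    """Generic storage and access code that never mentions a specific metric."""

    def __init__(self) -> None:
        self.metrics: Dict[str, Metric] = {}
        self.data: Dict[str, List[Record]] = {}

    def register(self, metric: Metric) -> None:
        self.metrics[metric.name] = metric
        self.data[metric.name] = []

    def ingest(self, name: str, raw: str) -> None:
        metric = self.metrics[name]
        record = metric.parse(raw)
        missing = set(metric.fields) - set(record)
        if missing:
            raise ValueError(f"{name} result is missing fields: {sorted(missing)}")
        self.data[name].append(record)

    def column(self, name: str, field_name: str) -> List[float]:
        """Return one field of every stored record, e.g. for graphing."""
        return [record[field_name] for record in self.data[name]]


# Adding a metric means writing a parser and registering it, nothing more.
def parse_rtt(raw: str) -> Record:
    return {"rtt_ms": float(raw)}


def parse_throughput(raw: str) -> Record:
    mbps, retransmits = raw.split(",")
    return {"mbps": float(mbps), "retransmits": float(retransmits)}


eco = Ecosystem()
eco.register(Metric("rtt", ["rtt_ms"], parse_rtt))
eco.register(Metric("throughput", ["mbps", "retransmits"], parse_throughput))
eco.ingest("rtt", "12.5")
eco.ingest("throughput", "940.2,3")
print(eco.column("throughput", "mbps"))
```

The researcher's per-metric effort shrinks to the parser and the field list; scheduling, storage, access and visualisation code would be written once and shared.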
PerfSONAR is an example of measurement infrastructure that attempts to provide an ecosystem that can be reused and extended to support new metrics. PerfSONAR can be a bit awkward to work with though, especially if you are simply looking to integrate a new metric into the PerfSONAR ecosystem. As a result, we elected to develop our own ecosystem for use with the AMP project.
Much of our development work has been aimed at building an ecosystem around the AMP measurement platform that is easy for us to extend as we develop new measurements, while still appealing to our ISP partners by being easy to install and use. The ecosystem is not restricted to AMP as a source of measurement data, however; it has also been used with our other measurement projects. We are therefore confident that other researchers can use it to make their own tools more appealing to operators and other users.
In terms of software, our ecosystem consists of several distinct logical components that can be used separately or as an interconnected whole, depending on how much of our ecosystem is required. Roughly, our software maps to the ecosystem model as described below:
Scheduling: amp-web for schedule management, amplet for enforcing the schedule on individual nodes
Collection: amplet and ampsave
Accessibility: libnntsc-client and ampy
In particular, the NNTSC, ampy and amp-web components have been designed with the ecosystem model in mind and can be easily extended to support new metrics. This has been hugely beneficial to us thus far and we look forward to being able to use our ecosystem to rapidly develop and deploy new measurement methodologies that can directly benefit our ISP partners.