The Network Measurement Ecosystem

21 Jun 2016

We've been doing a lot of collaborative work with our ISP partners lately and one thing that has become increasingly apparent to me is the disconnect between what ISPs expect from measurement / monitoring software and what researchers typically have the time and energy to implement.

More specifically, researchers are very good at developing new or improved measurement techniques but they are not so great at developing the necessary infrastructure around the measurements to make it easy for ISPs to deploy and use the new techniques in a production environment. As a result, the ISPs tend to fall back on tried and true monitoring software (e.g. Smokeping) even though our conversations with operators suggest that they would prefer more than just the simple metrics and graphs that such tools provide.

The act of performing a network measurement is only a small part of what I term the network measurement ecosystem, i.e. the complete set of components necessary for a network measurement technique to be deployed in a live production network, such as an ISP network. Only when all components of the ecosystem are in place will a network operator feel comfortable with the idea of using a particular tool or technique to measure and monitor their network. Measurement tools that do not have a complete ecosystem surrounding them will seldom be deployed in a production network, regardless of how useful or relevant that tool is to the operator.

In my view, the network measurement ecosystem consists of five key components which are described below:

Scheduling: It must be possible for a user to schedule regular continuous measurements. Once scheduled, the measurements should occur automatically without requiring any further user interaction. The system can also support ad-hoc or manual measurements for troubleshooting specific problems.

Collection: This is the process that runs the measurement itself and produces a recordable result. Most of the development of this component will have occurred during the research phase, although some additional polish may still be necessary to ensure the collection software is robust and reliable.

Storage: This component parses the results produced by the collection process and writes them into a queryable database to allow future access to the recorded data. This data can then be used to produce graphs or reports on request.

Accessibility: The ecosystem must include a standard API for fetching data from the storage backend, ideally without the user having to know the query language used by the database. An interface should also be present that allows users to receive a live feed of measurement results as they are reported by the collector; this can be used for real-time analytics, e.g. anomaly detection.

Visualisation: The visualisation component will provide a graphical front-end that allows users to explore and interact with the collected measurement data. This would include the ability to draw custom graphs or view tables that summarise important aspects of the data.
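To make the division of labour concrete, here is a minimal Python sketch of the five components wired together. Everything in it is an assumption made for illustration: the `Result` shape, the `Store` class, the fixed RTT returned by `collect_latency` and the text summary standing in for a graphical front-end are all hypothetical, and none of it reflects how any real measurement system is implemented.

```python
import sqlite3
import time
from typing import Callable, Dict, List

# Hypothetical result shape, assumed for this sketch only.
Result = Dict[str, object]


def collect_latency(target: str) -> Result:
    """Collection: stand-in for a real measurement; returns a fixed RTT."""
    return {"metric": "latency", "target": target,
            "ts": int(time.time()), "value": 12.5}


class Store:
    """Storage and accessibility: a queryable database plus a live feed."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE results "
            "(metric TEXT, target TEXT, ts INTEGER, value REAL)")
        # Callbacks that receive each result as soon as it is saved.
        self.subscribers: List[Callable[[Result], None]] = []

    def save(self, r: Result) -> None:
        self.db.execute(
            "INSERT INTO results VALUES (?, ?, ?, ?)",
            (r["metric"], r["target"], r["ts"], r["value"]))
        for notify in self.subscribers:
            notify(r)

    def fetch(self, metric: str, target: str) -> List[float]:
        """Fetch values without the caller writing any SQL."""
        rows = self.db.execute(
            "SELECT value FROM results "
            "WHERE metric = ? AND target = ? ORDER BY rowid",
            (metric, target))
        return [row[0] for row in rows]


def run_schedule(store: Store, targets: List[str], rounds: int) -> None:
    """Scheduling: run every test for a fixed number of rounds.
    A real scheduler would sleep between rounds and run indefinitely."""
    for _ in range(rounds):
        for target in targets:
            store.save(collect_latency(target))


def summarise(values: List[float]) -> str:
    """Visualisation stand-in: a one-line text summary instead of a graph."""
    return f"n={len(values)} min={min(values)} max={max(values)}"
```

A caller would create a `Store`, append a callback to `store.subscribers` to receive the live feed, call `run_schedule`, and then pass the output of `store.fetch` to `summarise`. The point of the sketch is only that each stage talks to its neighbours through a narrow interface.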

Obviously, developing all of these components independently for a single measurement technique requires a significant amount of extra time and effort, and few research projects can afford to do so. In practice, this means that while the project itself may be successful, the resulting measurement approach may see little use outside of the academic community.

However, what if we can instead build relatively generic ecosystems that provide all of the necessary components without being tied to a specific metric or tool? This could greatly reduce the amount of effort required to turn a research tool into a complete measurement system that would appeal to network operators.
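One way to picture "not being tied to a specific metric" is a registry in which each metric supplies only a small parser, while the rest of the system stays generic. The sketch below is purely illustrative: the `PARSERS` registry, the `register` decorator and the raw input formats are all assumptions, not the design of any existing measurement system.

```python
from typing import Callable, Dict

# Maps a metric name to a function that turns raw tool output into a
# flat dict of numeric fields. All names here are hypothetical.
PARSERS: Dict[str, Callable[[str], Dict[str, float]]] = {}


def register(metric: str):
    """Decorator that adds a parser to the registry under `metric`."""
    def wrap(fn: Callable[[str], Dict[str, float]]):
        PARSERS[metric] = fn
        return fn
    return wrap


@register("latency")
def parse_latency(raw: str) -> Dict[str, float]:
    # Assumed raw format: "rtt_ms=12.5"
    return {"rtt_ms": float(raw.split("=", 1)[1])}


@register("throughput")
def parse_throughput(raw: str) -> Dict[str, float]:
    # Assumed raw format: "bytes=1000000 seconds=2"
    fields = dict(part.split("=", 1) for part in raw.split())
    bits = float(fields["bytes"]) * 8
    return {"mbps": bits / float(fields["seconds"]) / 1e6}


def ingest(metric: str, raw: str) -> Dict[str, float]:
    """Generic entry point: storage and display code only ever see
    the flat dict, never the metric-specific raw format."""
    if metric not in PARSERS:
        raise KeyError(f"no parser registered for metric {metric!r}")
    return PARSERS[metric](raw)
```

Under these assumptions, supporting a new research tool means writing one more decorated parser; scheduling, storage and display code would not need to change.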

PerfSONAR is an example of measurement infrastructure that attempts to provide an ecosystem that can be reused and extended to support new metrics. PerfSONAR can be a bit awkward to work with though, especially if you are simply looking to integrate a new metric into the PerfSONAR ecosystem. As a result, we elected to develop our own ecosystem for use with the AMP project.

Much of our development work has been aimed at ensuring that the ecosystem around the AMP measurement platform is easy for us to extend as we develop new measurements, while still appealing to our ISP partners by being easy to install and use. However, the ecosystem is not restricted to AMP as a source of measurement data. It has also been used with our other measurement projects, so we are confident that other researchers can use it to make their own tools more appealing to operators.

In terms of software, our ecosystem consists of several distinct logical components that can be used separately or as an interconnected whole, depending on how much of our ecosystem is required. Roughly, our software maps to the ecosystem model as described below:

Scheduling: amp-web for schedule management, amplet for enforcing the schedule on individual nodes
Collection: amplet and ampsave
Storage: NNTSC
Accessibility: libnntsc-client and ampy
Visualisation: amp-web

In particular, the NNTSC, ampy and amp-web components have been designed with the ecosystem model in mind and can be easily extended to support new metrics. This has been hugely beneficial to us thus far and we look forward to being able to use our ecosystem to rapidly develop and deploy new measurement methodologies that can directly benefit our ISP partners.

15 Aug 2016

network measurement ecosystem

Hi Shane,
I agree with your point that ISPs need to have an entire ecosystem around the measurement, and the 5 components you list are bang on. But I don't think building an entire ecosystem is the answer - rather, make a modular system that allows any one component to be reused. With open interfaces, it would be easier to integrate a new scheduling & collection component into an existing data storage and visualisation system than to introduce an entirely new data storage and visualisation system into an existing workflow. At least for this ISP, we're all about minimising tools.
chris

16 Aug 2016

re: network measurement ecosystem

I agree with you completely, Chris. I originally wrote this blog post from the academic perspective, where we love to come up with new stuff from scratch and then never go anywhere with it because adding all of the rest of the stuff is hard and/or boring. In that case, having a whole ecosystem available and ready to go can make a huge difference.

The opposite perspective, as you mention, is one where you've already invested a lot of time and energy into an existing monitoring system and prefer something that can be easily integrated into that. Don't worry -- we're definitely thinking about that side of it too :)

You'll notice that each of the five components I mention is implemented as a completely separate piece of software. This is to support the idea you mention of being able to pick and choose which parts you need to complete your own ecosystem.

If you just want, for instance, the AMP measurements but want to retain the rest of your monitoring system, then the ampsave library will provide a standard interface that you can use to link the two components without having to take on our whole ecosystem. Write a little bit of wrapping code to convert the AMP results into the format expected by your storage backend and you're done! The same theory applies to most of the other cross-component communications: NNTSC provides a client library that can be used to query the database without needing ampy, ampy returns simple JSON objects that are largely self-documenting and can be easily parsed, etc.
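As a rough illustration of what such wrapping code might look like, here is a Python sketch that flattens one measurement result into generic (series, timestamp, value) rows that an existing time-series backend could ingest. The shape of `example_result`, its field names and the `to_rows` helper are all hypothetical; this is not the actual ampsave output format or API.

```python
from typing import List, Tuple

# Hypothetical shape of a decoded measurement result. This is NOT the
# real ampsave output format; it is assumed purely for illustration.
example_result = {
    "source": "probe1.example",
    "test": "icmp",
    "timestamp": 1466467200,
    "results": [
        {"target": "a.example", "rtt_us": 1200, "loss": 0},
        {"target": "b.example", "rtt_us": None, "loss": 1},
    ],
}

# (series name, UNIX timestamp, value)
Row = Tuple[str, int, float]


def to_rows(result: dict) -> List[Row]:
    """Flatten one result into rows for a generic time-series store."""
    rows: List[Row] = []
    ts = result["timestamp"]
    for item in result["results"]:
        prefix = f'{result["test"]}.{result["source"]}.{item["target"]}'
        # Loss is always recorded, even when no response was received.
        rows.append((prefix + ".loss", ts, float(item["loss"])))
        # Only record an RTT when the probe was actually answered.
        if item["rtt_us"] is not None:
            rows.append((prefix + ".rtt_ms", ts, item["rtt_us"] / 1000.0))
    return rows
```

Calling `to_rows(example_result)` yields two rows for the answered target and a single loss row for the unanswered one. The only backend-specific part would be the final write of each row, which is left out here because it depends entirely on the monitoring system already in place.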

Thanks for your interest in my post and I hope this reply reassures you that we're not going too far down the wrong track!