First, to catch up on the previous weeks: I created my brief and proposal (see attached for more info).
This week my goal was to become familiar with using the tool Mallet which I will use to analyse log files.
There aren't many tutorials on Mallet, but I found some good ones covering the command line interface (CLI), and I should be able to accomplish most of what I want with just the CLI. I will use it for now to start my testing, and will possibly move on to the Java API if I need to do something more complicated than the CLI is capable of.
I then looked further into Mallet's Java API. There aren't really any tutorials on how to use it, but if I do need to accomplish something more complicated, a combination of the example files and the documentation should be sufficient to build a useful program.
The next step is to get some log files from syslog or /var/log, convert them to Mallet's ".mallet" file format (the input can be a single file or an entire directory), and then run the data through some machine learning methods.
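As a sketch of that step, the Mallet CLI can be driven from a small Python wrapper; the `bin/mallet` launcher path and the log directory below are assumptions about the local setup, and only the command construction is shown:

```python
# Sketch of driving the Mallet CLI from Python.  "bin/mallet" and the
# log directory are assumptions about the local setup.
import subprocess

MALLET = "bin/mallet"  # assumed path to the Mallet launcher script

def import_command(log_dir, out_file="logs.mallet"):
    """Build the command that converts a directory of log files
    into Mallet's .mallet instance format."""
    return [MALLET, "import-dir",
            "--input", log_dir,
            "--output", out_file,
            "--keep-sequence"]  # sequence data is needed for topic modelling

def train_command(instances="logs.mallet", topics=10):
    """Build the command that runs topic modelling over the imported data."""
    return [MALLET, "train-topics",
            "--input", instances,
            "--num-topics", str(topics)]

def run(cmd):
    # Only works on a machine where Mallet is actually installed.
    return subprocess.run(cmd, check=True)
```

Keeping the commands as plain argument lists makes it easy to swap in the Java API later without changing the calling code.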
Continued developing code to group events by common AS path segments. Managed to add an "update tree" function to the suffix tree implementation I was using and then changed it to use ASNs rather than characters to reduce the number of comparisons required. Also developed code to query NNTSC for an AS path based on the source, destination and address family for a latency event, so all of the pieces are now in place.
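A minimal sketch of the idea, using a naive suffix trie rather than the real suffix tree implementation: nodes key on ASNs instead of characters, update() adds a new path without rebuilding the structure, and shared segments fall out of the counts (quadratic in path length, which is tolerable for short AS paths):

```python
# Naive generalised suffix trie over AS paths (lists of ASNs).  A
# simplified stand-in for the suffix tree described above: the nested
# dicts support prefix lookups, while counts tracks how many paths
# share each segment.
class ASPathTrie:
    def __init__(self):
        self.root = {}     # nested dicts: ASN -> child node
        self.counts = {}   # segment (tuple of ASNs) -> number of paths

    def update(self, path):
        """Insert every suffix of an AS path without rebuilding the trie."""
        seen = set()
        for i in range(len(path)):
            node = self.root
            for j in range(i, len(path)):
                node = node.setdefault(path[j], {})
                seg = tuple(path[i:j + 1])
                if seg not in seen:        # count each path once per segment
                    seen.add(seg)
                    self.counts[seg] = self.counts.get(seg, 0) + 1

    def common_segments(self, min_paths=2, min_len=2):
        """Segments shared by at least min_paths paths, for grouping events."""
        return [s for s, c in self.counts.items()
                if c >= min_paths and len(s) >= min_len]
```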
In testing, I found a problem where live NNTSC exporting would occasionally fall several minutes behind the data that was being inserted into the database. Because this would only happen occasionally (and usually overnight), debugging this problem has taken a very long time. Found a potential cause in an unhandled EWOULDBLOCK on the client socket, so I've fixed that and am waiting to see if it has resolved the problem.
Did some basic testing of libtrace 4 for Richard, mainly trying to build it on the various OSes that we currently support. This has created a whole bunch of extra work for him due to the various ways in which pthreads are implemented on different systems. Wrote my first parallel libtrace program on Friday -- there was a bit of a learning curve but I got it working in the end.
Started work on the RPi adaptation of the base station:
Working on getting the required tools and libraries into the latest version of Raspbian in order to use the 802.15.4 interface. So far I have installed the libnl library and the iwpan tools from source, but have yet to test whether they work with the interface.
The RPi board needs to be set up for ease of access so I don't have to continuously switch peripherals between the PC and the RPi: it needs a static IP (working for the moment, but it breaks when a network connection is shared to the board) and a VNC server on boot. This is set up using TightVNC for the moment, with a viewer on a Windows 7 laptop.
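One way to pin the static IP down without relying on DHCP (which is what breaks when the connection is shared) is a static stanza in /etc/network/interfaces on Raspbian; the addresses below are placeholders for whatever the local LAN actually uses:

```
# /etc/network/interfaces -- placeholder addresses, adjust to the local LAN
auto eth0
iface eth0 inet static
    address 192.168.1.50
    netmask 255.255.255.0
    gateway 192.168.1.1
```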
Next is to set up a sniffing program (with a USB dongle) and attempt to use the add-on board to transmit packets, to test whether the add-on is accessible.
The earlier half of this week was tied up with other paper commitments. I've now been set up with access to comet, along with some snapshots of VMs I will be using to test controllers and topologies (OpenFlow, Mininet). I've had a bit of a play with some example topologies and controller applications.
I've also managed to get hold of Craig Osborne's code, which appears to have some good concepts and some code I will be able to recycle and adapt for my own project.
My next step is to design the type of topology that is ideal for an ISP environment but also works for other deployments, such as university campuses. The primary factors are the ability to scale and redundancy. AAA services also need to be considered: not so much accounting, but authentication and authorisation.
My first point of contact for this was Chris Browning from Lightwire, but it turns out he is going on leave for the next few weeks, so I'll likely consult Scott Raynel to see what the current topology is like and whether he knows what Chris' direction was. My next point of contact after that is likely to be Brad Cowie.
I also need to find out how provisioning of hardware, physical or virtual, works, as so far it's been pretty much all Python code.
Made the HTTP test more consistent with the other tests when reporting
test failure due to name resolution or similar (rather than the target
failing to respond). Also added the option to suppress parsing the
initial object as HTML and to just fetch it.
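A sketch of the distinction being reported, using Python's standard library as a stand-in for the actual HTTP test; the outcome labels are illustrative:

```python
# Distinguish a name-resolution failure from the target failing to
# respond, which is the reporting distinction described above.  The
# labels returned here are illustrative, not the actual test's codes.
import socket
import urllib.request
import urllib.error

def classify_fetch(url, timeout=5):
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return ("ok", resp.status)
    except urllib.error.URLError as e:
        if isinstance(e.reason, socket.gaierror):
            return ("dns-failure", None)    # name resolution failed
        return ("target-failure", None)     # resolved, but no response
```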
Found and fixed a problem with long interface names being used inside my
network namespaces. Linux appears to allow longer interface names than
dhclient can deal with, so I've had to shorten some of my more
descriptive interface names.
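The workaround amounts to truncating names while keeping them unique; a sketch, assuming Linux's 15-character (IFNAMSIZ minus the NUL) limit, with the tighter limit that dhclient tolerates still to be checked:

```python
# Shorten descriptive interface names to a safe length while keeping
# them unique.  15 is Linux's IFNAMSIZ-1; the limit dhclient actually
# copes with may be smaller and would need to be confirmed.
def shorten_ifname(name, limit=15, used=None):
    used = used if used is not None else set()
    short = name[:limit]
    suffix = 0
    while short in used:                       # disambiguate after truncation
        tail = str(suffix)
        short = name[:limit - len(tail)] + tail
        suffix += 1
    used.add(short)
    return short
```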
Spent some time measuring the quantity of test data to get an estimate
of how much database storage will be required for the new test clients.
Also looked at how much throughput test data is likely to be used
(multiple TB of data a month), and possible locations that might be
suitable to test to without hitting rate limits or interfering with
other traffic.
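The back-of-envelope arithmetic behind that kind of estimate looks something like this; the rate, duration and schedule below are assumptions, not the real test configuration:

```python
# Back-of-envelope estimate of throughput-test data volume.  The rate,
# duration and schedule are illustrative assumptions only.
def monthly_volume_tb(rate_mbps, duration_s, tests_per_day, days=30):
    bytes_per_test = rate_mbps * 1e6 / 8 * duration_s
    return bytes_per_test * tests_per_day * days / 1e12

# e.g. a 10-second test at 800 Mbps, run hourly in both directions:
estimate = monthly_volume_tb(800, 10, 24 * 2)   # ~1.4 TB a month
```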
Continued to check various parts of the process chain to make sure that
they perform robustly when bits go away, networking is lost, machines
restart, and so on.
It was decided to simulate an extra TTL value, using the IS0 event-driven Internet simulator, to better optimise this parameter. Several runs were initiated and the data files are being collected.
Work has begun on the Megatree chapter. A description of Megatree was written and some of the appropriate graphs have been accumulated in the thesis document. Megatree avoids discovering the same load balancer more than once: when a load balancer divergence point is encountered in a new trace, the stored information indicates where possible convergence points might be, in terms of hop count and IP address.
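The bookkeeping that idea relies on can be sketched as a simple lookup table; the names here are illustrative, not Megatree's actual structures:

```python
# Remember, per divergence point, where the corresponding convergence
# point was seen, so a load balancer found in a new trace need not be
# rediscovered.  Field names are illustrative only.
class LoadBalancerStore:
    def __init__(self):
        # divergence IP -> (hops to convergence, convergence IP)
        self.known = {}

    def record(self, div_ip, hop_count, conv_ip):
        self.known[div_ip] = (hop_count, conv_ip)

    def lookup(self, div_ip):
        """If this divergence point was seen before, return where the
        trace is expected to converge again, else None."""
        return self.known.get(div_ip)
```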
The collection of per-destination ICMP traces is still underway. Two types are being collected sequentially in time: one gathered with seven flow IDs and one with 128. Of interest is whether they count the same number of load balancer divergence points and how much traffic is involved in doing so.
This report will cover the progress to date. I've also attached the proposal I submitted in case anyone is interested in reading that.
In order to make a start there are two areas that need to be considered. The first, and likely most important, is how the provisioning flow is going to work. IEEE 802.15.4 has some built-in AES security features as well as different channels (much like Wi-Fi has channels). At the link layer there needs to be a way to negotiate some kind of connection, which will involve scanning for a channel both nodes can talk on as well as some kind of key exchange to encrypt the communication. Above that, in the networking layer, we probably want some kind of authentication and encryption to ensure that nodes are who they say they are. Finally, the application layer will need to support some kind of key store so users can add the new node before trying to connect it to the network.
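A toy sketch of that provisioning flow, where the channel scan, the key-store check and especially the "key exchange" are all placeholders rather than real IEEE 802.15.4 mechanisms:

```python
# Toy provisioning flow: find a channel both nodes share, check the node
# is pre-registered, then hand out a link key.  The key generation is a
# placeholder, NOT a real key exchange.
import secrets

IEEE_802_15_4_CHANNELS = range(11, 27)   # 2.4 GHz channel page 0

def pick_common_channel(ours, theirs):
    """Scan for the lowest channel both nodes can use."""
    common = sorted(set(ours) & set(theirs))
    return common[0] if common else None

def provision(new_node_id, key_store, ours, theirs):
    """Admit a node only if the user pre-registered it in the key store."""
    if new_node_id not in key_store:
        return None                       # unknown node: refuse to provision
    channel = pick_common_channel(ours, theirs)
    if channel is None:
        return None                       # no channel in common
    link_key = secrets.token_bytes(16)    # placeholder for real key exchange
    return {"channel": channel, "key": link_key}
```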
The other side that needs to be considered is which platform the project should be built on. The CC2538 hardware has already been decided: not only does it have IEEE 802.15.4 radios, it is also supported by all the major WS platforms I could find. The other convenient aspect is that we already have development boards for it. The software, at this stage, will be either RIOT OS or ContikiOS. The issue with ContikiOS is that I've seen Brad and Isabelle working with it and it seems like there are a number of challenges along the way. RIOT OS lacks gateway software, which would make it difficult to get any of the application layer stuff working. A gateway is also useful for sniffing network traffic, but that can be done with any of the nodes and a serial port.
I think I'll start by reading through the IEEE 802.15.4 standard to gain some understanding of how the channels and encryption work, as those are likely to be key to the project. I may as well take the opportunity to draw some diagrams at the same time (which will help with report writing!).
Back after a week on holiday. Spent a decent chunk of time catching up on emails, mostly from students having trouble with the 513 libtrace assignment.
Continued tweaking and testing the new eventing code. Discovered an issue where the "live" exporter was operating several hours behind the time data was arriving. Looks like there is a bottleneck with one of the internal queues when a client subscribes to a large number of streams, but still investigating this one.
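A sketch of the suspected failure mode, assuming a bounded internal queue (the names and threshold are illustrative, not NNTSC's actual code): instead of silently blocking when a slow client's queue fills, the producer can detect the lag:

```python
# Detect a slow consumer on a bounded internal queue instead of silently
# blocking behind it.  Threshold and names are illustrative only.
import queue

def enqueue_with_lag_check(q, item, drop_threshold=0.9):
    """Refuse to enqueue when the queue is nearly full, so the caller
    can log the lag or shed load rather than fall further behind."""
    if q.qsize() >= q.maxsize * drop_threshold:
        return False
    q.put_nowait(item)
    return True
```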
prophet started to run out of disk space again, so I had to stop our test data collection, purge some old data, and wait for the database to finish vacuuming to regain some disk space. Discovering that we had a couple of GB of rabbit logs wasn't ideal either.
While fixing the prophet problem, did some reading and experimenting with suffix trees created from AS paths, with the aim of identifying common path segments that could be used to group latency events. There doesn't appear to be a Python suffix tree module that does exactly what I want, but I'm hoping I can tweak one of the existing ones. The main thing I'm missing is the ability to update an existing suffix tree after concatenating a new string, rather than having to create a whole new tree from scratch.
Wrote my proposal. Have changed the project slightly to get away from the idea that it is strictly about NFV and making a vBNG.
Installed and configured nntsc and postgres on the new test machine in
order to keep a local copy of all the data that will be collected. This
also has the ability to duplicate the incoming data and send it off to
another rabbitmq server for backup/visualisation.
Made some minor changes to the amplet client for the new test machine
that required building new packages, which are now available in a custom
repository to keep them separate from the others. Installed the amplet
client on the new test machine and configured it to run multiple
clients, testing the network namespace code.
Continued to test some of the infrastructure code around
starting/stopping amplet clients, creating network namespaces etc. Found
and worked around a few small problems where tools were not properly
namespace aware, or would not create files belonging to a namespace (but
would happily use them if they already existed).