Storage of Network Monitoring and Measurement Data

I am designing and building a system for storing and retrieving large amounts of network measurement and monitoring data. It needs to be flexible enough to deal with a wide range of data, such as polled data and flow data, and fast enough to cope with high rates of live network data. The end goal is to feed this information to an anomaly detection algorithm that can detect changes in the network and alert system administrators to the exact problem, as well as presenting the information as graphs via a web interface.

01 Oct 2012

Pretty full-on with assignments this week, and starting to work through writing my report. I hacked up a webpage, with a little help from Joe, so people can see our progress (http://wand.net.nz/~no15/index.html). Hoping to get most of the WAND 520 students on there too.

Had a play with PyPy (a JIT compiler for Python). It should be handy for getting more performance out of my collector. I also thought about creating an importer specifically for bulk importing, as running the collector can be slow when importing existing data.

19 Sep 2012

Another busy week with assignments. It's getting to that crunch time of the year, and what with the 520 final write-up I suspect it's going to be a pretty hard slog.

Attended the HPC Roadshow on Thursday along with Brad, Brendon, and Shane. We met up with Jamie there too. Some interesting talks and it's good to build knowledge about subjects outside my usual domain.

09 Sep 2012

Had my honours presentation this week. Overall I was pretty happy with how things went. I felt the feedback I received during my practice talk made a major difference to the final presentation. I'd like to thank Richard (my supervisor), Scott (from Lightwire), Tony, Brendon, Shane, and all the other WAND people who helped critique my original presentation.

Looking at my 520 now, I need to find some time to finish the final major component, the connector. Then comes testing and benchmarking to see what I need to do performance-wise. Making some rough notes for the final report is probably not a bad idea either.

05 Sep 2012

Spent last week preparing for my 520 presentation. Practice talks went well.

29 Aug 2012

Last week wasn't very productive. I was mostly tied up with lectures and revising. I did give Shane my project to start using. He quickly got the hang of how things worked and wrote a module to push data into a database. Hopefully I'll find some time to do some initial benchmarking on reading the data to see how things actually perform.

20 Aug 2012

Didn't get any 520 work done last week. Pretty flat out playing catch up at the moment after my time off. Slowly getting back on top of assignments and lectures.

12 Aug 2012

Haven't done a weekly report in a while due to being sick. Luckily I kept working on my 520 so I'm not horribly behind.

Firstly I made some pretty dramatic changes to the main code structure to improve performance. This seems to have paid off, making improvements to not only the performance but also the flow of the internal code.

All the database implementation is done now. I've made all the module code as agnostic as possible so any module can be loaded without changes to the main code base. In the end I used SQLAlchemy for the database abstraction layer. The framework is incredibly powerful and makes it easy to add other features at a later date.
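As a rough illustration of loading modules without touching the main code base, here is a minimal sketch using Python's importlib. The hook name store(), the module name demo_output, and the stand-in module itself are all hypothetical and are not taken from the project; a real plugin would live in its own file.

```python
import importlib
import sys
import types


def load_module(name):
    """Import a plugin by name and check it exposes the expected hook."""
    module = importlib.import_module(name)
    # Hypothetical contract: every output module provides a callable store().
    if not callable(getattr(module, "store", None)):
        raise TypeError(f"{name} does not define a callable store()")
    return module


# Stand-in for a real plugin file so the sketch runs on its own.
demo = types.ModuleType("demo_output")
demo.rows = []
demo.store = lambda source_id, timestamp, value: demo.rows.append(
    (source_id, timestamp, value)
)
sys.modules["demo_output"] = demo

# The main code only knows the module's name, not its implementation.
plugin = load_module("demo_output")
plugin.store("smokeping-1", 1344729600, 12.5)
```

Because the main code only depends on the name and the hook, swapping the database backend would mean changing a configuration value rather than the core.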

The forwarding of the data had to be re-engineered when I restructured the main code so this needs to be re-implemented. Hopefully that's not going to be too much of a task at this point.

Due to being sick I have a fair amount of University work to catch up on so I'm not sure how much I'll be able to get done next week.

21 Jul 2012

Started pulling things together over the past few weeks. The modules are now all integrated into the main application. Some of the forking and threading code needs some TLC but things are at least functional at this point. The main problem is scalability. I'm starting a thread for each data source, which falls over once you follow a few hundred data sources. I need to be cleverer about identifying resources and ensuring I'm not reading from them more than once for multiple streams. I may also need some changes or additions to the AMP API to allow me to request multiple data items in a single request.
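One possible way to avoid reading the same resource once per stream is to keep a registry of subscribers per resource and fan each read out to all of them. This is only a sketch of that idea: the class name ResourceHub, the read_resource callable, and the file names are made up for illustration.

```python
from collections import defaultdict


class ResourceHub:
    """Read each resource once per cycle and fan the result out to subscribers."""

    def __init__(self, read_resource):
        # Hypothetical reader: callable(resource) -> value
        self._read = read_resource
        self._subscribers = defaultdict(list)

    def subscribe(self, resource, callback):
        self._subscribers[resource].append(callback)

    def poll_once(self):
        for resource, callbacks in self._subscribers.items():
            value = self._read(resource)  # one read, however many streams want it
            for callback in callbacks:
                callback(resource, value)


reads = []


def fake_read(resource):
    """Stand-in reader that records which resources were touched."""
    reads.append(resource)
    return len(resource)


hub = ResourceHub(fake_read)
seen_a, seen_b = [], []
hub.subscribe("ping.rrd", lambda r, v: seen_a.append((r, v)))
hub.subscribe("ping.rrd", lambda r, v: seen_b.append((r, v)))
hub.subscribe("dns.rrd", lambda r, v: seen_a.append((r, v)))
hub.poll_once()
```

A single polling loop like this replaces the thread per data source, so the number of reads grows with the number of distinct resources rather than the number of streams.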

Next, Brendon wants me to stream the data (from multiple sources) in order. I'm restructuring the code so I can get at the timestamps together, but I've been sick on and off this week, which hasn't helped progress. After that I'll try to get the database implementation going.
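If each source already yields its own data in time order, merging them into one ordered stream can be sketched with heapq.merge from the standard library. The tuple layout (timestamp, source, value) and the sample values are assumptions made for the example.

```python
import heapq
from operator import itemgetter


def merge_in_time_order(*sources):
    """Lazily merge already-sorted sources by their timestamp (first field)."""
    return heapq.merge(*sources, key=itemgetter(0))


# Made-up sample data; each list is already sorted by timestamp.
smokeping = [(100, "smokeping", 12.1), (160, "smokeping", 11.8)]
amp = [(90, "amp", 3), (130, "amp", 4), (170, "amp", 2)]

merged = list(merge_in_time_order(smokeping, amp))
```

heapq.merge only holds one pending item per source, so it also works with generators that are still producing data.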

03 Jun 2012

This week I didn't manage any more on my 520. I was ill on Tuesday which pretty much put me out of action for a few days. So in the end I barely got anything done all week.

Last week was more productive, however. I managed to get a daemon going which accepts connections on a Unix domain socket. At this point it expects an ID and a timestamp, and then feeds back all data for that source from that timestamp onwards. It then leaves the connection open and sends any new data as it arrives. I also implemented a simple command line and a way to reload the modules individually, which should avoid having to restart the whole daemon all the time.
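The replay-then-follow behaviour described above could look something like the sketch below. The wire format (a single "ID TIMESTAMP" request line, then "timestamp value" lines), the in-memory history dict, and the use of a queue for live data are all assumptions for illustration. The demo uses socket.socketpair() in place of a bound socket path so it can run without touching the filesystem.

```python
import os
import queue
import socket
import threading


def handle_client(conn, history, live):
    """Read 'ID TIMESTAMP', replay stored data, then forward new data."""
    with conn:
        with conn.makefile("r", encoding="utf-8") as reader:
            source_id, since = reader.readline().split()
        for ts, value in history.get(source_id, []):
            if ts >= int(since):  # "from that timestamp onwards"
                conn.sendall(f"{ts} {value}\n".encode("utf-8"))
        while True:
            item = live.get()  # blocks until new data arrives
            if item is None:  # sentinel: stop following
                break
            ts, value = item
            conn.sendall(f"{ts} {value}\n".encode("utf-8"))


def serve(path, history, live):
    """Accept clients on a Unix domain socket, one at a time in this sketch."""
    if os.path.exists(path):
        os.unlink(path)
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
        server.bind(path)
        server.listen()
        while True:
            conn, _ = server.accept()
            handle_client(conn, history, live)


# Demo with a connected socket pair instead of a real socket path.
history = {"link-1": [(100, 1.0), (200, 2.0), (300, 3.0)]}
live = queue.Queue()
client, server_side = socket.socketpair()
worker = threading.Thread(target=handle_client, args=(server_side, history, live))
worker.start()
client.sendall(b"link-1 200\n")
live.put((400, 4.0))  # new data arriving after the replay
live.put(None)
worker.join()
with client, client.makefile("r", encoding="utf-8") as stream:
    received = stream.read().splitlines()
```

A real daemon would need one live queue per client and some form of concurrency, both of which are left out here to keep the sketch short.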

21 May 2012

Actually starting to make some progress on my 520 again. Turns out that's what evenings are for.

Tidied up my RRD implementation and fixed a bug where some of the data was ignored because I was misinterpreting the timestamp somewhere along the way. Started dealing with returning only the new data each time the program is run. At the moment I'm just putting the last timestamp I read and the filename into a text file.
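A minimal sketch of the text-file approach, assuming one "timestamp filename" pair per line. The function names, the file name last_seen.txt, and the sample rows are hypothetical.

```python
import os
import tempfile


def load_checkpoints(path):
    """Return {filename: last_timestamp} from a 'timestamp filename' text file."""
    checkpoints = {}
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                timestamp, filename = line.rstrip("\n").split(" ", 1)
                checkpoints[filename] = int(timestamp)
    return checkpoints


def save_checkpoints(path, checkpoints):
    """Write to a temporary file first so a crash cannot leave a half-written file."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as handle:
        for filename, timestamp in sorted(checkpoints.items()):
            handle.write(f"{timestamp} {filename}\n")
    os.replace(tmp, path)


def new_rows(rows, filename, checkpoints):
    """Keep only rows newer than the checkpoint, then advance the checkpoint."""
    last = checkpoints.get(filename, -1)
    fresh = [(ts, value) for ts, value in rows if ts > last]
    if fresh:
        checkpoints[filename] = max(ts for ts, _ in fresh)
    return fresh


with tempfile.TemporaryDirectory() as workdir:
    state_file = os.path.join(workdir, "last_seen.txt")
    rows = [(100, 1.0), (200, 2.0)]

    # First run: nothing recorded yet, so everything counts as new.
    state = load_checkpoints(state_file)
    first_run = new_rows(rows, "ping.rrd", state)
    save_checkpoints(state_file, state)

    # Second run: only the row added since the last run is returned.
    rows.append((300, 3.0))
    state = load_checkpoints(state_file)
    second_run = new_rows(rows, "ping.rrd", state)
```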

I'm now working on my Interim Report. Any guidance or suggestions are most welcome.

15 May 2012

Another week of fairly full on assignments (OpenCL and Wireless Sensor Nodes) which prevented me making progress on my 520. Trying to make an effort to spend more time at uni next week.

08 May 2012

Didn't really get anything done last week. I've been working flat out with assignments. Let's hope next week goes better.

01 May 2012

Spent a fair bit of my time doing my assignment for 513 last week. I did get some starting code towards building the databases while house sitting for my parents at the end of the week.

23 Apr 2012

I made some serious progress last week. I've now got my input plugins completed. I had to ditch the library I was using for parsing the RRDs as it was returning data in a totally broken way and I didn't have time to fix it. Instead I wrote a simple parser for the XML output of rrdtool dump, which seems to work nicely. I also wrote an input plugin for AMP, which had a really nice API to work from. I started learning what I need to do for the database side of things. I had a chat with Christopher, as he has done some investigation into this area already, and I may be able to incorporate some of his work into the project.
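For reference, a parser along these lines can be sketched with xml.etree.ElementTree. The sample XML is a trimmed, hand-written imitation of rrdtool dump output, and the data source names are made up. ElementTree drops the comments in which rrdtool prints each row's time, so this sketch assumes the row timestamps can be rebuilt from lastupdate, step and pdp_per_row.

```python
import math
import xml.etree.ElementTree as ET

# Hand-written sample in the shape of `rrdtool dump` output (not real data).
SAMPLE_DUMP = """<rrd>
  <version>0003</version>
  <step>300</step>
  <lastupdate>1335139500</lastupdate>
  <ds><name> latency </name><type> GAUGE </type></ds>
  <ds><name> loss </name><type> GAUGE </type></ds>
  <rra>
    <cf>AVERAGE</cf>
    <pdp_per_row>1</pdp_per_row>
    <database>
      <row><v>1.2000000000e+01</v><v>0.0000000000e+00</v></row>
      <row><v>NaN</v><v>NaN</v></row>
      <row><v>1.3500000000e+01</v><v>1.0000000000e+00</v></row>
    </database>
  </rra>
</rrd>"""


def parse_rrd_dump(xml_text, cf="AVERAGE"):
    """Yield (timestamp, {ds_name: value}) from the first RRA using `cf`."""
    root = ET.fromstring(xml_text)
    step = int(root.findtext("step"))
    last_update = int(root.findtext("lastupdate"))
    names = [ds.findtext("name").strip() for ds in root.findall("ds")]
    for rra in root.findall("rra"):
        if rra.findtext("cf").strip() != cf:
            continue
        interval = step * int(rra.findtext("pdp_per_row"))
        rows = rra.findall("database/row")
        # Assumption: the last row ends on the interval boundary at or
        # before lastupdate, and earlier rows step back one interval each.
        end = last_update - last_update % interval
        for index, row in enumerate(rows):
            timestamp = end - (len(rows) - 1 - index) * interval
            values = [float(v.text) for v in row.findall("v")]
            if all(math.isnan(value) for value in values):
                continue  # skip rows where nothing was recorded
            yield timestamp, dict(zip(names, values))
        return  # only the first matching RRA in this sketch


points = list(parse_rrd_dump(SAMPLE_DUMP))
```

On a real file the input would come from running rrdtool dump on the RRD and passing its output to the parser.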

17 Apr 2012

Sadly I didn't get much done last week. I spent most of the time finishing other assignments and catching up on various things. I did have a look at how I was going to store the data in the database, particularly with respect to datatypes and table layouts. I'm hoping to work on my 520 full time this week.

10 Apr 2012

Finally wrote a script to pull apart an RRD. I've tried to be as flexible as possible. One thing that is going to be interesting is working out what schema the program inserting the data into the RRD is using. Column titles for the few RRDs I've pulled apart so far range from difficult to decipher to just plain useless (col1, col2, etc.).

I also had a play with listening on a socket and passing the data over it. Next task on the list is to come up with a nice flexible protocol for transferring the data to the server.
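One simple candidate for such a protocol is length-prefixed JSON: each message is a 4-byte big-endian length followed by a UTF-8 JSON payload. This is only a sketch of one option, not the protocol the project ended up with, and the record fields shown are made up.

```python
import json
import struct

HEADER = struct.Struct("!I")  # 4-byte unsigned length, network byte order


def encode_message(record):
    """Serialise one record as <length><JSON payload>."""
    payload = json.dumps(record).encode("utf-8")
    return HEADER.pack(len(payload)) + payload


def decode_messages(buffer):
    """Return (complete_records, leftover_bytes) from a receive buffer."""
    records = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        start = offset + HEADER.size
        if len(buffer) - start < length:
            break  # payload not fully received yet
        records.append(json.loads(buffer[start:start + length].decode("utf-8")))
        offset = start + length
    return records, buffer[offset:]


wire = encode_message({"source": "ping", "ts": 100, "value": 12.1})
wire += encode_message({"source": "ping", "ts": 160, "value": 11.8})

# Simulate the data arriving in two chunks, split mid-message.
first, rest = decode_messages(wire[:10])
second, rest = decode_messages(rest + wire[10:])
```

JSON keeps the payload flexible enough for different kinds of sources, while the length prefix lets the receiver cope with messages that are split or joined by the socket.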

03 Apr 2012

Ended up spending way more time than I should have on an assignment last week.

I did, however, spend some time researching different storage options, and read an interesting paper on a storage solution from Google called Dremel.

Annoyingly I have slowly fallen behind schedule. I will need to commit a large amount of time over the holidays to get back on track with my project.

26 Mar 2012

Spent most of last week writing my proposal, which you can either find here:
http://goo.gl/fA1QI
or attached to this post. I managed to find quite a few documents commenting on the lack of scalability of RRDtool, and one from the original designer stating its actual use case, which was pretty helpful.

I also got some time last week to learn how RRDs work (with some help from Brendon). I can now use Python to pull apart RRDs. Sadly the documentation on pulling raw data out of RRDs is somewhat scarce, so I'm having to work a lot of it out as I go. Hopefully by next week I will have a nice script to pull values from Smokeping RRDs.

20 Mar 2012

I was sick for most of last week so wasn't able to get much done. I did manage to get some planning done towards the end of the week. I spent some time discussing my project with Brendon, in particular what has been attempted before and what is new ground. We've decided to avoid modifying existing applications where possible and instead slot in neatly around them. This requires a modular and flexible structure for getting input data from things like Smokeping, collectd, and RRDBot.