
Web-based Application Protocol Monitor

Knowing the application protocol of packets is useful in a number of ways, notably in traffic shaping to optimise performance and reduce latency, and in analysing Internet usage patterns. Current techniques rely on Deep Packet Inspection (DPI), which examines the entire payload of a packet as it passes through an inspection point; this introduces a number of concerns, particularly with regard to privacy and security. As an alternative, the WAND group has developed a library called “Libprotoident” that identifies the application protocol (e.g. HTTP, Skype) using only a small amount of packet payload. However, Libprotoident only produces textual output, which is difficult to explore and interpret.

Hence, the objective of my project is to create a web-based application protocol monitor that visualises the results produced by Libprotoident. This includes producing real-time interactive graphs that support functions such as zooming and panning, and allowing users to customise which protocols are displayed.

This honours project has been completed, but the work towards a web-based application protocol monitor has now been merged into the Cuz project.

29 Jul 2013

Table partitioning is now up and running inside of NNTSC. Migrated all the existing data over to partitioned tables.

Enabled per-user tracking in the LPI collector and updated Cuz to deal with multiple users sensibly. Changed the LPI collector to not export counters that have a value of zero -- the client now detects which protocols were missing counters and inserts zeroes accordingly. Also changed NNTSC to only create LPI streams when the time series has a non-zero value occur, which avoids the problem of creating hundreds of streams per user which are entirely zero because the user never uses that protocol.
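
To illustrate the zero-filling, here is a minimal sketch of the client side; the names are made up for illustration and the real logic lives in the NNTSC client:

    /* The collector omits zero-valued counters, so any protocol that
     * sent no counter this period must have been zero. */
    #include <stdint.h>
    #include <string.h>

    #define MAX_PROTOCOLS 512

    static uint64_t counters[MAX_PROTOCOLS]; /* one counter per protocol id */
    static uint8_t  seen[MAX_PROTOCOLS];     /* set when a counter arrives */

    void start_report(void) {
        memset(seen, 0, sizeof(seen));
    }

    void on_counter(uint16_t proto_id, uint64_t value) {
        counters[proto_id] = value;
        seen[proto_id] = 1;
    }

    void end_report(unsigned int num_protocols) {
        for (unsigned int i = 0; i < num_protocols; i++) {
            if (!seen[i])
                counters[i] = 0;  /* restore the suppressed zero */
        }
    }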

Added ability to query NNTSC for a list of streams that had been added since a given stream was created. This is needed to allow ampy to keep up to date with streams that have been added since the connection to NNTSC was first made. This is not an ideal solution as it adds an extra database query to many ampy operations, but I'm hoping to come up with something better soon.

Revisited and thoroughly documented the Shewhart S-based event detection code in netevmon. In the process, I made a couple of tweaks that should reduce the number of 'unimportant' events that we have been getting.
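
For reference, the textbook idea behind a Shewhart-style detector looks roughly like this; netevmon's actual implementation has considerably more machinery around it:

    /* Flag a measurement as an event when it falls more than k standard
     * deviations from a running baseline. */
    #include <math.h>

    typedef struct {
        double mean;   /* running mean, seeded from initial history */
        double var;    /* running variance, seeded likewise */
        double alpha;  /* EWMA weight, e.g. 0.05 */
        double k;      /* threshold in standard deviations, e.g. 3.0 */
    } shewhart_t;

    int shewhart_update(shewhart_t *s, double value) {
        int event = fabs(value - s->mean) > s->k * sqrt(s->var);

        /* Fold the new value into the baseline (EWMA update) */
        double diff = value - s->mean;
        s->mean += s->alpha * diff;
        s->var = (1 - s->alpha) * (s->var + s->alpha * diff * diff);
        return event;
    }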

15 Jul 2013

Made a number of minor changes to my paper on open-source traffic classifiers in response to reviewer comments.

Modified the NNTSC exporter to inform clients of the frequency of the datapoints it was returning in response to a historical data request. This allows ampy to detect missing data and insert None values appropriately, which will create a break in the time series graphs rather than drawing a straight line between the points either side of the area covered by the missing data. Calculating the frequency was a little harder than anticipated, as not every stream records a measurement frequency (and that frequency may change, e.g. if someone modifies the amp test schedule) and the returned values may be binned anyway, at which point the original frequency is not suitable for determining whether a measurement is missing.
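
The gap-handling idea in miniature; the real code is in ampy (and in Python), so everything here is illustrative:

    /* 'freq' is the datapoint frequency now reported by the exporter. */
    #include <stdint.h>
    #include <stdio.h>

    void print_with_gaps(const uint32_t *ts, const double *val, int n,
            uint32_t freq) {
        for (int i = 0; i < n; i++) {
            /* A jump larger than the frequency means datapoints are
             * missing; emit a NaN so the graph draws a break instead of
             * a straight line. A real implementation would tolerate
             * some timestamp jitter. */
            if (i > 0 && ts[i] - ts[i - 1] > freq)
                printf("%u NaN\n", ts[i - 1] + freq);
            printf("%u %.2f\n", ts[i], val[i]);
        }
    }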

Added support for the remaining LPI metrics to NNTSC, ampy and amp-web. We are now drawing graphs for packet counts, flow counts (both new and peak concurrent) and users (both active and observed), in addition to the original byte counts. We are not detecting any events on these yet, as these metrics are very different to what we have at the moment, so some thought will have to go into which detectors we should use for each metric.

05 Jul 2013

Added support for the Libprotoident byte counters that we have been collecting from the red cable network to netevmon, ampy and amp-web. Now we can visualise the different protocols being used on the network and receive event alerts whenever someone does something out of the ordinary.

Replaced the dropdown list code in amp-web with a much nicer object-oriented approach. This should make it a lot easier to add dropdown lists for future NNTSC collections.

Managed to get our Munin graphs showing data using an Mbps unit. This was trickier than anticipated, as Munin sneakily divides the byte counts it gets from SNMP by its polling interval, but this isn't very prominently documented. It took a little while for myself, Cathy and Brad to figure out why our numbers didn't match those being reported by the original Munin graphs.
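
For the record, the conversion that finally made our numbers match; the only trick is knowing that Munin's stored values are already bytes per second:

    /* Munin quietly divides the SNMP octet counters by its polling
     * interval, so getting Mbps is just a factor of 8 / 1,000,000. */
    double bytes_per_sec_to_mbps(double bytes_per_sec) {
        return bytes_per_sec * 8.0 / 1000000.0;
    }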

Chased down and fixed a libtrace bug where converting a trace from any ERF format (including legacy) to PCAP would result in horrendously broken timestamps on Mac OS X. It turned out that the __BYTE_ORDER macro doesn't exist on BSD systems and so we were erroneously treating the timestamps as big endian regardless of what byte order the machine actually had.
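
The essence of the fix, in simplified form (not the exact libtrace patch):

    /* __BYTE_ORDER is a glibc-ism, so on BSD-derived systems such as
     * Mac OS X the old #if quietly failed and the big-endian branch was
     * always taken. Portable code has to use each platform's own macros. */
    #if defined(__APPLE__) || defined(__FreeBSD__) || \
        defined(__NetBSD__) || defined(__OpenBSD__)
    #include <machine/endian.h>     /* BSDs: BYTE_ORDER, BIG_ENDIAN */
    #define ERF_BYTE_ORDER BYTE_ORDER
    #define ERF_BIG_ENDIAN BIG_ENDIAN
    #else
    #include <endian.h>             /* glibc: __BYTE_ORDER, __BIG_ENDIAN */
    #define ERF_BYTE_ORDER __BYTE_ORDER
    #define ERF_BIG_ENDIAN __BIG_ENDIAN
    #endif

    #if ERF_BYTE_ORDER == ERF_BIG_ENDIAN
    /* ERF timestamps are little-endian on the wire, so byteswap here */
    #endif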

Migrated wdcap and the LPI collector to use the new libwandevent3.

Changed the NNTSC exporter to create a separate thread for each client rather than trying to deal with them all asynchronously. This alleviates the problem where a single client could request a large amount of history and prevent anyone else from connecting to the exporter until that request was served. Also made NNTSC and netevmon behave more robustly when a data source disappears -- rather than halting, they will now periodically try to reconnect so I don't have to restart everything from scratch when I want to apply changes to one component.
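
The thread-per-client structure, as a minimal sketch; NNTSC itself is written in Python, so this just shows the shape of the change:

    /* One slow client streaming a large history no longer blocks the
     * accept loop, so other clients can still connect. */
    #include <pthread.h>
    #include <stdlib.h>
    #include <sys/socket.h>
    #include <unistd.h>

    static void *serve_client(void *arg) {
        int fd = *(int *)arg;
        free(arg);
        /* ... read requests from fd and stream history back ... */
        close(fd);
        return NULL;
    }

    void accept_loop(int listenfd) {
        for (;;) {
            int *fd = malloc(sizeof(int));
            *fd = accept(listenfd, NULL, NULL);
            if (*fd < 0) {
                free(fd);
                continue;
            }
            pthread_t tid;
            pthread_create(&tid, NULL, serve_client, fd);
            pthread_detach(tid); /* thread cleans up after itself */
        }
    }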

Finally, my paper comparing the accuracy of various open-source traffic classifiers was accepted for WNM 2013. There are a few minor nits to tidy up, but it shouldn't require too much work to get camera-ready.

26 Jun 2013

There is currently an increasing demand for accurate and reliable traffic classification techniques. Libprotoident is a library developed at the WAND Network Research Group (WAND) that uses four bytes of payload for the classification of flows. Testing has shown that Libprotoident achieves similar classification accuracy to other approaches, while also being more efficient in terms of speed and memory usage. However, the primary weakness of Libprotoident is that it lacks the visualisation component required to encourage adoption of the library.

This report describes the implementation of a reliable real-time collector for Libprotoident that will form the back-end component to support a web-based visualisation of the statistics produced by the library. The collector has been designed and implemented to support the classification of flows and exporting of application usage statistics to multiple clients over the network in separate threads, whilst operating asynchronously so as to achieve high performance when measuring multi-gigabit networks.

Author(s): 
Meenakshee Mungro

24 Mar 2013

Got a full draft of the report, with a final version of Chapters 2-5. Nine chapters spread over 56 pages of (hopefully) dry enough material.

Plan for the next 2 days is to edit, edit and then edit some more.

Glad it'll be over soon.

18 Mar 2013

Spent the week fixing up chapters 2-5 and emailed Richard a copy on Friday night (including chapter 6). Didn't get to work on any new chapters during the weekend, as I had assignments to catch up on.

Plan for this week is to finish chapter 7 by Wednesday, and start on chapters 8 and 9.

12 Mar 2013

Spent the last week working on the report. Have a draft version of Chapters 2-5 that have been checked at least once by Shane, and added most of the content for Chapter 6.

The plan for the coming week is to finish chapter 6, get it checked asap and give Richard a copy of chapters 2-6.
Then, I have a decently sized chapter to work on (7 - Threaded Network Export) and two smaller ones (8 - Testing/Evaluation and 9 - Future Work/Conclusion). There's also the intro that I need to edit at some point.

11 Mar 2013

Added a data parser module to NNTSC to process the tunnel user count data that we got from Lightwire. Managed to get the data going all the way through to the event detection program which spat out a ton of events. Spent a bit of time combing through them manually to see whether the reported events were actually worth reporting -- in a lot of cases they weren't, so I've refined the old Plateau and Mode algorithms a bit to hopefully resolve the issues. I also employed the Plunge detector on all time series types, rather than just libprotoident data, and this was useful in reporting the most interesting behaviours in the tunnel user data (i.e. all the users disappearing).

Added a couple of new features to the libtrace API. The first was the ability to ask libtrace to give you the source or destination IP address as a string. This is quite handy because normally processing IP addresses in libtrace involves messing around with sockaddrs which is not particularly n00b-friendly. The second API feature was the ability to ask libtrace to calculate the checksum at either layer 3 or 4 based on the current packet contents. This was already done (poorly) inside the tracereplay tool, but is now part of the libtrace API. This is quite useful for checksum validation or if you've modified the packet somehow (e.g. modified the IP addresses) and want to recalculate the checksum to match.
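
A quick usage sketch of the two additions; I am writing the function names from memory, so treat the exact signatures as assumptions and consult libtrace.h before relying on them:

    #include <libtrace.h>
    #include <stdio.h>

    static void per_packet(libtrace_packet_t *packet) {
        char space[128];
        uint16_t csum;

        /* Source IP as a string, without any sockaddr juggling */
        if (trace_get_source_address_string(packet, space, sizeof(space)))
            printf("src: %s\n", space);

        /* Recalculate the layer 4 checksum from the current packet
         * contents, e.g. after rewriting the IP addresses */
        trace_checksum_transport(packet, &csum);
        printf("l4 checksum: 0x%04x\n", csum);
    }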

Also spent a decent bit of time reading over chapters from Meenakshee's report and offering plenty of constructive criticism.

05 Mar 2013

Spent the first half of the week working on the collector. Implemented exporting expired flow records and designed another protocol header and subheader for these records. Cleaned up some repetitive code and added a function to export the ongoing flow buffer when the timer expires (before checking for new ongoing flows). Also added some documentation.

Started working on the report in the middle of the week and, so far, have a draft version of the first four chapters (excluding the intro). Shane has checked a couple of them already, so the plan for the coming week is to tidy up those chapters and get as much writing done as possible.

26 Feb 2013

Shane suggested sending the protocol names once only, to reduce the amount of redundant data sent each time and also save on FIFO space and bandwidth requirements. I designed a new protocol subheader for exporting protocol details (id, name, name_len); these are sent to a client as soon as it connects to the server. Then, I had to change the old exporting code, removing the parts that added the name and name length and adding the appropriate code for the protocol IDs.
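
An illustrative layout for the protocol-details record; field order and widths here are my own sketch, not the exact wire format:

    /* One record per protocol, sent once at connection time; subsequent
     * counter records then only need the two-byte id. */
    #include <stdint.h>

    struct lpicp_proto_record {
        uint16_t id;        /* numeric protocol id used by later records */
        uint16_t name_len;  /* bytes of protocol name that follow */
        /* 'name_len' bytes of name follow, not NUL-terminated */
    } __attribute__((packed));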

Then, I started working on exporting expired flow records to clients every X seconds (where X is 3 minutes or a value chosen by the user). I created a subheader for expired protocol records, and a structure for an expired flow record. Each time a flow expired, it was sent to be exported and its data added to the appropriate buffer. The buffer was then written to the FIFO when it filled up.

After I made sure that expired flow records were being exported correctly, I set up a timer which would export these records every X seconds, regardless of whether the buffer was full or not.

Also got my Background chapter back from Shane, and started making the proposed changes.

18 Feb 2013

Spent a major part of the week reading up on threading and libfifo, and adding both to the collector.

First, I added support for libfifo in order to write the buffer to a memory-backed FIFO with a default size of 100MB (which can be changed via options). Then, I wrote out the FIFO to each of the clients through their fd, using the functions provided in the libfifo API. Tested it and got it working like before.

Initially, clients that connected to the server were sent statistics every X seconds (where X is a number specified in the options). Concurrency issues would arise when clients tried connecting or disconnecting during a stat export, which meant that the client list would need to be updated while the exporting process was iterating over it. After discussing this with Shane, we decided to use threading and to protect the client list with a mutex whenever it was being read from or written to. The server can now handle disconnects and new connections while exporting statistics without crashing.
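
The locking pattern in outline (a sketch, not the collector's exact code):

    /* The accept path and the export path take the same mutex before
     * touching the client list, so a connect or disconnect can no longer
     * corrupt an export iteration. */
    #include <pthread.h>
    #include <stddef.h>
    #include <unistd.h>

    struct client {
        int fd;
        struct client *next;
    };

    static struct client *clients = NULL;
    static pthread_mutex_t client_lock = PTHREAD_MUTEX_INITIALIZER;

    void add_client(struct client *c) {
        pthread_mutex_lock(&client_lock);
        c->next = clients;
        clients = c;
        pthread_mutex_unlock(&client_lock);
    }

    void export_stats(const char *buf, size_t len) {
        pthread_mutex_lock(&client_lock);
        for (struct client *c = clients; c != NULL; c = c->next) {
            if (write(c->fd, buf, len) < 0) {
                /* mark c for removal; unlink it after the loop */
            }
        }
        pthread_mutex_unlock(&client_lock);
    }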

Spent Friday at home and got started on writing the Background and Libprotoident chapters of the report (ch. 2 & 3 respectively). Worked during the weekend too, and am nearly done with the Background and almost halfway through the third chapter.

Plan for the next week is to get the Background section's draft done asap, move it to LaTeX and get it checked before the end of the week if possible. I also have a list of features that I need to tackle in the collector.

13 Feb 2013

Spent the first few days of the week working on my presentation and then spent the whole Friday taking care of some tickets.

Previously the server was not handling disconnects from clients, so it would still try to send data to their file descriptors. I fixed that first and then worked on not sending statistics for deprecated (NULL) protocols, which saves on bandwidth and effort.

For the next week, I need to tackle threading. Which I am not looking forward to.

11 Feb 2013

Made some significant modifications to the structure of NNTSC so that it can be packaged and installed nicely. It is now no longer dependent on scripts or config files being in specific locations, and handles configuration errors robustly rather than crashing into a Python exception. Still a few bugs and tidy-ups to do, particularly relating to processes hanging around even after killing the main collector.

Managed to get some tunnel user counts from Scott at Lightwire to run through the event detection code. Added a new module to NNTSC for parsing the data, but have not quite got the data into the database for processing yet.

Spent a decent chunk of time helping Meenakshee write and practice her talk for Thursday. Once the talk was done, we got back into the swing of development by fixing some obvious problems with the current collector.

02 Feb 2013

Spent the whole week implementing BGP and Table Dump V2 parsing for the new types, and got it working at the end. Also integrated data from BGP table lookups into Meenakshee's LPI collector (the important fields being source and destination AS, next hop address and AS), and this is now stored as part of the flow structure. Tried to get a working trie and prefix tree lookup for IPv6 prefixes, but encountered a lot of problems handling 128-bit addresses; the lookup is still not complete and errors when an address has runs of consecutive zeroes represented as "::". Will probably need a dedicated set of IPv6 prefixes to get this working.
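
One way to sidestep the "::" headaches is to let inet_pton() expand the zero-compressed form into the full 16 bytes before the trie ever sees the address, for example:

    /* "2001:db8::" becomes 16 raw bytes with the zero runs restored,
     * so the lookup only ever deals with fixed-width keys. */
    #include <arpa/inet.h>

    int v6_prefix_to_key(const char *prefix, unsigned char key[16]) {
        return inet_pton(AF_INET6, prefix, key) == 1;
    }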

21 Jan 2013

Decided to replace the PACE comparison in my L7 Filter paper with Tstat, a somewhat well-known open-source program that does traffic classification (along with a whole lot of other statistic collection). Tstat's results were disappointing - I was hoping they would be a lot better so that the ineptitude of L7 Filter would be more obvious, but I guess this does make libprotoident look even better.

Fixed a major bug in the lpicollector that was causing us to insert duplicate entries in our IP and User maps. Memory usage is way down now and our active IP counts are much more in line with expectations. Also added a special PUSH message to the protocol so that any clients will know when the collector is done sending messages for the current reporting period.

Spent a fair chunk of time refining Nathan to a) just work as intended, b) be more efficient and c) be more user-friendly / deployable. I've got it reading data properly from LPI, RRDs and AMP and exporting data in an appropriate format for our event detection code to be able to read.

Started toying with using the event detection code on our various inputs. Have run into some problems with the math used to determine whether a time series is relatively constant or not - this is used to determine which of our detectors should be run against the data.

Got the bad news that the libprotoident paper was rejected by TMA over the weekend. A bit disappointed with the reviews - felt like they were too busy trying to find flaws with the 4-byte approach rather than recognising the results I presented that showed it to be more accurate, faster and less memory-intensive than existing OSS DPI classifiers. Regardless, it is back to the drawing board on this one - looks like it might be the libtrace paper all over again.

14 Jan 2013

Spent most of my week working with Meenakshee's LPI collector. The first step was to move it out of libprotoident and into its own project, complete with trac - this meant that future libprotoident releases are not dependent on the collector being in a usable state. Added support to the collector to track the number of local IP addresses "actively" using a given protocol. This is in addition to the current counter that simply looks at the number of local IP addresses involved in flows using a given protocol - an IP receiving BitTorrent UDP traffic but not responding would not count as actively using the protocol (i.e. the new counter), but would count as having been involved in a flow for that protocol (i.e. the old counter).
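
A sketch of the distinction between the two counters; the field and function names are mine, not the collector's:

    /* "Observed" counts a local IP seen in any flow for the protocol;
     * "active" counts it only if it actually sent traffic in such a flow. */
    #include <stdint.h>

    struct ip_proto_state {
        uint8_t observed; /* IP appeared in a flow using this protocol */
        uint8_t active;   /* IP also sent packets in such a flow */
    };

    void update_ip_state(struct ip_proto_state *s, uint64_t packets_out) {
        s->observed = 1;
        if (packets_out > 0)
            s->active = 1; /* e.g. replying to BitTorrent UDP traffic,
                            * not merely receiving it */
    }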

After meeting with Lightwire, it was suggested that a LPI collector that could give a protocol breakdown per customer would be very useful. As a result, I added support for this to the collector. In terms of the increased workload, the actual collection process seems to manage ok, but exporting this data over the network to the Nathan database client doesn't work so well.

Added some basic transaction support to Nathan's code, so that all of the insertions from the same LPI report are now inserted using a single transaction. Ideally, though, we need to be able to create transactions that cover multiple LPI reports - perhaps by extending the LPI protocol to be able to send some sort of "PUSH" message to the client to indicate that a batch of reports is complete.

Went over the collector with callgrind to find bottlenecks and suboptimal code. Found a number of optimisations that I could make in the collector, such as caching the name strings and lengths for the supported protocols rather than asking libprotoident for them each time we want to use them. I also had a frustrating battle with converting my byteswap64 function into a macro - got there in the end thankfully.
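
For the curious, a byteswap64 macro of the kind described; this is my own version and the collector's may differ:

    /* Note that (x) is evaluated eight times, so never pass an
     * expression with side effects. */
    #include <stdint.h>

    #define BYTESWAP64(x) \
        ((((uint64_t)(x) & 0xff00000000000000ULL) >> 56) | \
         (((uint64_t)(x) & 0x00ff000000000000ULL) >> 40) | \
         (((uint64_t)(x) & 0x0000ff0000000000ULL) >> 24) | \
         (((uint64_t)(x) & 0x000000ff00000000ULL) >> 8)  | \
         (((uint64_t)(x) & 0x00000000ff000000ULL) << 8)  | \
         (((uint64_t)(x) & 0x0000000000ff0000ULL) << 24) | \
         (((uint64_t)(x) & 0x000000000000ff00ULL) << 40) | \
         (((uint64_t)(x) & 0x00000000000000ffULL) << 56))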

Finished up the draft of my L7 Filter paper.

07 Jan 2013

Just a lonely two-day week while everyone else was still on holiday.

Released a new version of libtrace (3.0.16) - now Richard's ring buffer code is out amongst the wide world and hopefully our users won't find too many bugs in it.

Got back into writing my paper on L7 Filter. Most of the content is there now, although I'm not entirely convinced that the way I have structured the paper is quite right. It's much more readable the way I have it now, but it looks more like a bulleted list than a technical paper.

Meenakshee's LPI collector worked pretty well running on some trace files over the break, which was pleasing. Next step is to get it working on our newly functional ISP capture point. Tested the capture point out by running some captures over the weekend - aside from a bug in the direction tagging everything looks good, so we have at least one working capture point.

17 Dec 2012

Spent the first half of the week working on my protocol implementation on the server, and tested it by adding the necessary code to parse the received bytes in the client. It can now send flow records to the client in the same format as the lpi_live output. There are a number of features still to add, but I'll work on those after I get back.
Also started working on adding some new counters for the number of protocols used by local and external IPs in a reporting period. It's not entirely working yet, but I'm leaving on holiday for 5 weeks and will try to get some work done while away.

Will be back in the first week of Feb, and also plan to start on the report when possible.

17 Dec 2012

Started writing a paper on my L7 Filter results - managed to get through an introduction and background before running out of steam.

Developed a module for Nathan's data collector that connects to Meena's LPI collector, receives data records, parses them and writes appropriate entries into a PostgreSQL database. Ran into a bit of a design flaw in Nathan's collector - streams (i.e. the identifying characteristics for a measurement) have to be pre-defined before starting the collector. This doesn't work too well with LPI, where there are 250 protocols x 10 metrics x however many monitors one is running. Even worse, the number of protocols will grow with new LPI releases and we don't want to have to stop the collector to add code describing the resulting new streams.

Managed to hack my way around Nathan's code enough to add support for adding new streams whenever a new protocol / metric / monitor combination is observed by my module. Seems to work fairly well (at the second attempt - the first one ran into horrible concurrency problems due to a shared database connection).

Tried deploying the LPI collector at our ISP box, only to find that they've been playing with their core network a lot recently and now we don't see any useful traffic :(

10 Dec 2012

Spent the whole week working on the collector and a simple client to test it.
Shane helped with working out a packet format which would be used to send details about flows over a network. After working out the format, I started gradually developing a script (lpicp_export.cc) which formats the data according to the required packet structure. Currently, it adds a header, the name of the monitor (or "unnamed" if not specified), and a subheader.
After that, I started working on a client which would read in the bytes and parse the values to extract the information sent by the server.
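
A rough illustration of the packet shape described above; this is my own sketch, not the actual layout used by lpicp_export.cc:

    /* A fixed header, then the monitor name, then a record-type-specific
     * subheader and its payload. */
    #include <stdint.h>

    struct lpicp_header {
        uint8_t  version;     /* protocol version */
        uint8_t  record_type; /* statistics, protocol names, expired flows... */
        uint16_t total_len;   /* length of the whole message, in bytes */
        uint16_t name_len;    /* length of the monitor name that follows */
    } __attribute__((packed));
    /* 'name_len' bytes of monitor name ("unnamed" if unspecified) follow,
     * then the subheader for the records being carried. */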

The plan for next week is to make the exporting script send out values from the counters and have the client parse the received bytes as before. I'll also look into using threads, so that a separate thread can write data out to the DB while the program reads values in from a trace or other input source.