Weekly Report - 17/4/15




Started using MALLET on the files that I have collected. Tested it on the entire directory at once: it created a .mallet file quickly and a topic model in 1 hour 30 minutes. I then ran it on the Bearwall logs, which I had unzipped, and that took a lot longer (over 3 hours), which makes me believe that it ignores the zipped files.
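One way to check the zipped-files guess is to measure how much of the directory is actually compressed. The sketch below is only an illustration: the function names are hypothetical, and it says nothing about what MALLET itself does with such files. It simply sorts files into plain or compressed by their leading bytes and totals their sizes.

```python
from pathlib import Path

# Leading "magic" bytes of a few common compressed formats.
MAGIC = {
    b"\x1f\x8b": "gzip",
    b"PK\x03\x04": "zip",
    b"BZh": "bzip2",
}


def compression_of(path):
    """Return the compression format guessed from the first bytes, or None."""
    with open(path, "rb") as fh:
        head = fh.read(4)
    for magic, name in MAGIC.items():
        if head.startswith(magic):
            return name
    return None


def summarise(directory):
    """Return {category: (file_count, total_bytes)} for all files under directory."""
    summary = {}
    for path in Path(directory).rglob("*"):
        if path.is_file():
            kind = compression_of(path) or "plain"
            count, size = summary.get(kind, (0, 0))
            summary[kind] = (count + 1, size + path.stat().st_size)
    return summary
```

If most of the bytes in the full directory turn out to be compressed, that would fit with the short run time there compared with the unzipped logs.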

Then I looked at the topic keys it had generated. It had stripped out all the numbers and kept only words, so looking at the topics didn't show anything useful.
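The effect of letters-only tokenising on a log line can be sketched in a few lines of Python. This is an approximation for illustration, not MALLET's own tokeniser: both patterns and the sample log line are made up. MALLET's import commands do document a `--token-regex` option, which may be the setting to try when re-testing.

```python
import re

# Letters only: digits and punctuation act as separators (approximation).
LETTERS_ONLY = re.compile(r"[^\W\d_]+")

# Letters and digits, with '.' and ':' allowed inside a token so that
# IP addresses and times stay in one piece (an assumed alternative).
WITH_NUMBERS = re.compile(r"[^\W_]+(?:[.:][^\W_]+)*")


def tokenise(line, pattern):
    """Return every non-overlapping match of pattern in line."""
    return pattern.findall(line)


# Hypothetical log line, not taken from the real logs.
sample = "Apr 17 10:32:01 fw01 DROP src=10.0.0.5 dpt=443"

print(tokenise(sample, LETTERS_ONLY))
print(tokenise(sample, WITH_NUMBERS))
```

With the letters-only pattern the timestamp, address and port all disappear, which matches what the topic keys looked like.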

So the next step is to look into other programs and methods that are better suited to log files, because it would be more useful to see events grouped together from multiple files of different applications. That's not to say the topic modeling that MALLET is suited to won't be helpful; I will also need to look into whether there is an option to retain numbers etc. and, if so, re-test.
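As a rough idea of what grouping events across files could look like, the sketch below masks the variable parts of each line (addresses, times, numbers) and groups lines that share the resulting template. It is one simple approach assumed for illustration, not a method that has been chosen or tested here; the masks and function names are hypothetical.

```python
import re
from collections import defaultdict

# Variable fields are replaced by placeholders, most specific pattern first,
# so that lines describing the same kind of event share one template.
MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b\d{2}:\d{2}:\d{2}\b"), "<TIME>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def template_of(line):
    """Return the line with variable fields replaced by placeholders."""
    for pattern, placeholder in MASKS:
        line = pattern.sub(placeholder, line)
    return line.strip()


def group_events(lines_by_source):
    """Map each template to the sources it appears in and how often."""
    groups = defaultdict(lambda: {"sources": set(), "count": 0})
    for source, lines in lines_by_source.items():
        for line in lines:
            entry = groups[template_of(line)]
            entry["sources"].add(source)
            entry["count"] += 1
    return dict(groups)
```

Passing in a dictionary of file name to lines gives back one entry per event template, along with which files it occurred in, so the same event seen by different applications ends up in one group.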