Weekly Report - 3/4/15

I went to see Brad and got the /var/log log files from ns1, zipped up, to start testing with. They will let me decide whether or not I need Syslog logs at the next stage. It is simpler to use the logs already available than to set up Syslog on a machine, so these will be a great start.

I unzipped them and am now trying to find the best way to combine them into a .mallet file. I tried running the import on the whole folder, but after a couple of hours it was still going. When I have time I will leave it running for a while, because it may simply take a while to process 400 MB of logs; for the moment I'll use subfolders while waiting. The examples I went through had all their files in the .txt format, but when testing on a single folder Mallet seems to be able to decompress and read files in the plain file format.
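
A minimal sketch of how the per-subfolder import could be scripted, so the small folders finish first while the full 400 MB import is left for later. It only builds the command lines for Mallet's import-dir command; it does not run them. The mallet path "bin/mallet", the folder "logs/var/log", the output folder "mallet-out", and all function names are assumptions for illustration, not what was actually used.

```python
import os
import shlex


def folder_size_bytes(path):
    """Return the total size of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def import_dir_command(input_dir, output_file, mallet_bin="bin/mallet"):
    """Build the argument list for one import-dir run.

    --keep-sequence is included because topic training needs the
    word sequences rather than plain feature vectors.
    """
    return [mallet_bin, "import-dir",
            "--input", input_dir,
            "--output", output_file,
            "--keep-sequence"]


def commands_for_subfolders(log_root, out_dir, mallet_bin="bin/mallet"):
    """Build one import command per immediate subfolder, smallest first."""
    subdirs = [entry.path for entry in os.scandir(log_root) if entry.is_dir()]
    subdirs.sort(key=folder_size_bytes)
    return [
        import_dir_command(
            d, os.path.join(out_dir, os.path.basename(d) + ".mallet"), mallet_bin)
        for d in subdirs
    ]


if __name__ == "__main__":
    log_root = "logs/var/log"  # hypothetical location of the unzipped logs
    if os.path.isdir(log_root):
        for cmd in commands_for_subfolders(log_root, "mallet-out"):
            # Print only; each line can be checked and then run by hand.
            print(" ".join(shlex.quote(part) for part in cmd))
```

Sorting by size means the quick subfolders can be tested first, and any single folder that stalls the import is easier to spot.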

I ran Mallet on the single subfolder /var/log/kernel to create a simple topic model. It worked well, but the folder did not contain enough data to show anything interesting, so I will look for bigger subsets to test on until I can get the entire log to combine.
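
A rough sketch of how the topic model run and its output could be scripted for the larger subsets. It builds the command line for Mallet's train-topics command and reads back the topic keys file. The topic count of 10, the file names, and the function names are placeholders, and the parser assumes each line of the keys file is tab-separated as topic number, weight, then space-separated words.

```python
import shlex


def train_topics_command(mallet_file, num_topics=10,
                         keys_file="topic-keys.txt",
                         doc_topics_file="doc-topics.txt",
                         mallet_bin="bin/mallet"):
    """Build the argument list for one train-topics run."""
    return [mallet_bin, "train-topics",
            "--input", mallet_file,
            "--num-topics", str(num_topics),
            "--output-topic-keys", keys_file,
            "--output-doc-topics", doc_topics_file]


def parse_topic_keys(lines):
    """Map topic number to its top words.

    Assumes lines of the form "<topic>\t<weight>\t<word> <word> ...";
    lines that do not fit that shape are skipped.
    """
    topics = {}
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3 or not parts[0].isdigit():
            continue
        topics[int(parts[0])] = parts[2].split()
    return topics


if __name__ == "__main__":
    # Hypothetical input file name; printed for inspection, not executed.
    cmd = train_topics_command("kernel.mallet")
    print(" ".join(shlex.quote(part) for part in cmd))
```

Reading the keys file into a dictionary makes it easy to compare the top words per topic across subsets of different sizes.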

Now I will start researching and testing different document clustering algorithms for finding patterns within these logs.