Tokenizes articles and creates a vocab file.
Tokenized files go in /corpora.
Vocab files go in /vocabs.
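
A minimal sketch of the tokenize-and-vocab step described above, not the original script. The token pattern, the file extensions (.tok, .vocab), the relative output directories, and all function names are assumptions made for illustration.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed token pattern (lowercase letters, digits, apostrophes); the
# original script's tokenization rules are not documented here.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the text and split it into word-like tokens."""
    return TOKEN_RE.findall(text.lower())


def build_vocab(token_lists):
    """Count tokens over all articles; sort by count (descending), then token."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(tokens)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def write_outputs(name, articles, corpora_dir="corpora", vocabs_dir="vocabs"):
    """Write one tokenized article per line and a 'token count' vocab file.

    File names and formats are hypothetical.
    """
    token_lists = [tokenize(a) for a in articles]
    Path(corpora_dir).mkdir(parents=True, exist_ok=True)
    Path(vocabs_dir).mkdir(parents=True, exist_ok=True)
    Path(corpora_dir, name + ".tok").write_text(
        "\n".join(" ".join(t) for t in token_lists) + "\n")
    Path(vocabs_dir, name + ".vocab").write_text(
        "".join(f"{tok} {n}\n" for tok, n in build_vocab(token_lists)))
```

Calling write_outputs("alt.atheism", list_of_article_strings) would then produce one file in each directory.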
Builds a single matrix and puts it into /smats.
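
The notes do not say what kind of matrix is built. The sketch below assumes a term-by-document count matrix (rows are vocab entries, columns are articles); that layout and the function name are guesses, and writing the result to /smats is left out.

```python
def build_matrix(token_lists, vocab):
    """Return a term-by-document count matrix as a list of rows.

    token_lists: one list of tokens per article (columns).
    vocab: ordered list of tokens (rows). Tokens not in vocab are ignored.
    """
    index = {tok: i for i, tok in enumerate(vocab)}
    matrix = [[0] * len(token_lists) for _ in vocab]
    for col, tokens in enumerate(token_lists):
        for tok in tokens:
            row = index.get(tok)
            if row is not None:
                matrix[row][col] += 1
    return matrix
```

For two articles tokenized as ["a", "b", "a"] and ["b", "c"] with vocab ["a", "b"], the row for "a" is [2, 0] and the row for "b" is [1, 1].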
24582 819.4 per day for alt.atheism.
Perl script that calculates traffic on a newsgroup by querying the campus
news server.
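
How the script gets its numbers is not stated. One possibility, sketched here in Python rather than Perl, is to read the estimated article count from an NNTP GROUP response ("211 count low high group") and divide by the number of days the server keeps articles. The 30-day window is an assumption, chosen only because 24582 / 30 = 819.4 matches the alt.atheism figures above; the function names and the low/high article numbers in the usage note are hypothetical.

```python
def parse_group_response(line):
    """Parse an NNTP GROUP reply: '211 count low high group'."""
    code, count, low, high, group = line.split()[:5]
    if code != "211":
        raise ValueError("unexpected NNTP response: " + line)
    return int(count), int(low), int(high), group


def articles_per_day(count, days):
    """Average articles per day over an assumed retention window."""
    return round(count / days, 1)
```

With a made-up reply such as "211 24582 100001 124582 alt.atheism" and an assumed 30-day window, articles_per_day(24582, 30) gives 819.4.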