gather

defunct
Newsgroup Gathering
/Data/ngroups
gather script
tokenizes articles and creates a vocab file
/corpora is where the tokenized files go
/vocabs is where the vocab files go
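
A rough sketch of what the tokenize-and-vocab step could look like. The original gather script is not shown here, so everything below is an assumption: the function names (tokenize, build_vocab), the lowercase alphabetic-only token rule, and the use of token counts as the vocab are all hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep runs of letters as tokens (assumed rule)."""
    return re.findall(r"[a-z]+", text.lower())

def build_vocab(articles):
    """Tokenize every article and count each token across all of them.

    Returns (tokenized_articles, vocab_counts). In the notes' layout the
    first would go under /corpora and the second under /vocabs.
    """
    counts = Counter()
    tokenized = []
    for article in articles:
        tokens = tokenize(article)
        tokenized.append(tokens)
        counts.update(tokens)
    return tokenized, counts

# Two made-up articles, for illustration only.
articles = ["God exists? Prove it.", "Prove what exists."]
tokenized, vocab = build_vocab(articles)
```

Here tokenized holds one token list per article and vocab maps each token to its total count.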

buildsmats
builds a single matrix and puts it into /smats
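
The notes do not say what kind of matrix buildsmats produces. As a sketch only, assuming it is an article-by-term count matrix built from tokenized articles and a vocab list, it might look like this (build_matrix and the sample data are hypothetical):

```python
def build_matrix(tokenized_articles, vocab_terms):
    """Build one count matrix: a row per article, a column per vocab term.

    Tokens that are not in vocab_terms are skipped.
    """
    column = {term: i for i, term in enumerate(vocab_terms)}
    matrix = [[0] * len(vocab_terms) for _ in tokenized_articles]
    for row, tokens in zip(matrix, tokenized_articles):
        for token in tokens:
            if token in column:
                row[column[token]] += 1
    return matrix

# Made-up input, for illustration only.
vocab_terms = ["prove", "exists", "god"]
docs = [["god", "exists", "prove", "it"], ["prove", "what", "exists"]]
smat = build_matrix(docs, vocab_terms)
```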

----------------------------------------
locutus:/Data/ngcorps 40>traf
alt.atheism
24582 819.4 per day for alt.atheism.
traf: perl script that calculates traffic on a newsgroup by checking the
news server on campus
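
The per-day figure in the output above is consistent with dividing the article count by a 30-day window (24582 / 30 = 819.4). The window length is inferred from the numbers, not stated in the notes, and how the count is fetched from the news server is not shown, so this sketch covers only the arithmetic. traffic_per_day is a hypothetical name.

```python
def traffic_per_day(article_count, days):
    """Average number of articles per day over the given window."""
    if days <= 0:
        raise ValueError("days must be positive")
    return article_count / days

group = "alt.atheism"
count = 24582   # article count from the session above
days = 30       # assumed window, inferred from 24582 / 819.4
rate = traffic_per_day(count, days)
line = f"{count} {rate:.1f} per day for {group}."
print(line)
```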