dist

/Data/corp/dist -d -m 1 /Data/tmp/t9 < infile > outfile  (or standard I/O)

slim or twoyear slim2

-d uses distances, not correlations

-m Minkowski value: 1 = city block; 2 = Euclidean
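
For reference, a minimal sketch of what the Minkowski value controls. This is not the dist program's own code, which these notes do not show; the function name minkowski and the example vectors are made up for illustration.

```python
def minkowski(u, v, m):
    """Minkowski distance between two equal-length numeric vectors.

    m = 1 gives the city block (Manhattan) distance,
    m = 2 gives the Euclidean distance.
    Illustrative only; not taken from the dist program.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    if m < 1:
        raise ValueError("Minkowski value must be >= 1")
    # Sum the m-th powers of the absolute differences, then take the m-th root.
    return sum(abs(a - b) ** m for a, b in zip(u, v)) ** (1.0 / m)


if __name__ == "__main__":
    u, v = [0, 0], [3, 4]       # hypothetical vectors
    print(minkowski(u, v, 1))   # city block
    print(minkowski(u, v, 2))   # Euclidean
```

With -m 1 the differences along each dimension are simply added up, so no single dimension dominates the way it can under the squared terms of -m 2.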

 

4.6     mean wds between items

3208    raw count of pairs (in order)

Always use the command exactly as written above, so that it reports distances (not correlations) and uses the city block metric.

 

588.1987          engine  6          coffee  6  1163706.0   523073.0

577.3949          coffee  6            belt  4   523073.0   304122.0

663.2789            belt  4        trousers  8   304122.0    35228.0

764.6849        trousers  8         chapter  7    35228.0   549283.0
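
A small sketch for reading rows like the ones above into a script. The notes do not name the seven columns, so the field names used here (value, item_a, n_a, item_b, n_b, x_a, x_b) are placeholders, not the program's own terminology; parse_row is likewise a made-up helper.

```python
def parse_row(line):
    """Split one whitespace-separated output row into its seven fields.

    Field names are placeholders; the notes do not say what each column means.
    """
    fields = line.split()
    if len(fields) != 7:
        raise ValueError("expected 7 fields, got %d" % len(fields))
    return {
        "value": float(fields[0]),   # first column (presumably the distance when -d is given)
        "item_a": fields[1],         # first item of the pair
        "n_a": int(fields[2]),       # integer printed after the first item
        "item_b": fields[3],         # second item of the pair
        "n_b": int(fields[4]),       # integer printed after the second item
        "x_a": float(fields[5]),     # sixth column
        "x_b": float(fields[6]),     # seventh column
    }


if __name__ == "__main__":
    row = parse_row("588.1987          engine  6          coffee  6  1163706.0   523073.0")
    print(row["item_a"], row["item_b"], row["value"])
```

Splitting on runs of whitespace keeps the parser indifferent to the column padding, which varies from row to row in the output.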