UnigramTextClassifier Compound List

Here are the classes, structs, unions and interfaces with brief descriptions:

std::UnigramTextClassifier

A text classifier based on single characters. The basic idea is that texts from the same class tend to have similar character (byte) frequencies. In information-theoretic terms, texts from the same class should require the same number of bits to encode them in a perfect encoding. We don't actually have to create the encoding; we only need the number of bits. The basic methods are learn (read a corpus and count the frequencies), dump (save the frequencies to a stream), and read (read the frequencies from a stream).

Generated on Fri Aug 8 15:44:40 2003 for UnigramTextClassifier by Doxygen 1.3.3