#include <UnigramTextClassifier.h>
Public Member Functions | |
UnigramTextClassifier () | |
Constructor for UnigramTextClassifier. Name of classification defaults to 'Unknown.'. | |
UnigramTextClassifier (const string classification) | |
Constructor for UnigramTextClassifier. | |
frequency_map | freqs () |
unsigned long | corpus_total () |
unsigned long | total () |
string | classification () |
void | setClassification (string &classification) |
void | UnigramTextClassifier::learn (istream &in) |
Learn the frequencies of characters in a corpus; may be called multiple times. | |
void | UnigramTextClassifier::learn (char *in) |
Learn the frequencies of characters in a corpus; may be called multiple times. | |
void | UnigramTextClassifier::dump (ostream &out) |
Dump the frequencies of characters in a corpus. | |
void | UnigramTextClassifier::dump (char *out) |
Dump the frequencies of characters in a corpus. | |
void | UnigramTextClassifier::read (istream &in) |
Read the frequencies of characters in a corpus; may be called multiple times. | |
void | UnigramTextClassifier::read (char *in) |
Read the frequencies of characters in a corpus; may be called multiple times. | |
float | UnigramTextClassifier::score (istream &in) |
float | UnigramTextClassifier::score (char *in) |
float | UnigramTextClassifier::bits_required (unsigned char ch) |
float | UnigramTextClassifier::bits_required (istream &in) |
float | UnigramTextClassifier::bits_required (char *in) |
Private Member Functions | |
float | UnigramTextClassifier::lg (float n) |
float | UnigramTextClassifier::info_value (float n) |
string | UnigramTextClassifier::ctime_string () |
Private Attributes | |
frequency_map | _freqs |
unsigned long | _corpus_total |
unsigned long | _total |
string | _classification |
UnigramTextClassifier (): Constructor for UnigramTextClassifier. Name of classification defaults to 'Unknown.'.
UnigramTextClassifier (const string classification): Constructor for UnigramTextClassifier.
bits_required (char *in): How many bits would it take to encode the characters in a file?
bits_required (istream &in): How many bits would it take to encode the characters in a stream?
bits_required (unsigned char ch): How many bits would it take to code a character?
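The bit counts described above can be illustrated with a minimal sketch. It assumes each character costs -lg(count / total) bits under the learned frequencies, and that a stream costs the sum of its characters. `FrequencyMap`, `bitsForChar`, `bitsForStream`, and the handling of unseen characters are hypothetical stand-ins, not the class's actual implementation.

```cpp
#include <cassert>
#include <cmath>
#include <istream>
#include <map>
#include <sstream>

// Hypothetical stand-in for frequency_map; the real typedef is not shown on this page.
using FrequencyMap = std::map<unsigned char, unsigned long>;

// Bits needed to code one character: -log2(count / total).
// Assumption: an unseen character is costed as one occurrence in a corpus
// of total + 1 characters, so the result stays finite.
float bitsForChar(const FrequencyMap& freqs, unsigned long total, unsigned char ch) {
    assert(total > 0);
    auto it = freqs.find(ch);
    if (it == freqs.end() || it->second == 0) {
        return -std::log2(1.0f / static_cast<float>(total + 1));
    }
    return -std::log2(static_cast<float>(it->second) / static_cast<float>(total));
}

// Bits needed for a whole stream: the sum of the per-character costs.
float bitsForStream(const FrequencyMap& freqs, unsigned long total, std::istream& in) {
    float bits = 0.0f;
    char c;
    while (in.get(c)) {
        bits += bitsForChar(freqs, total, static_cast<unsigned char>(c));
    }
    return bits;
}
```

With two equally frequent characters, each one costs one bit, so a four-character stream drawn from them costs four bits.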
ctime_string (): internal current time string
dump (ostream &out), dump (char *out): Dump the frequencies of characters in a corpus.
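The dump format is not documented on this page. As a rough sketch only, the following assumes a line-oriented text format with one "<character code> <count>" pair per line; `dumpCounts` and the format itself are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <ostream>
#include <sstream>

// Write one "<character code> <count>" line per character.
// The format is an assumption made for illustration.
void dumpCounts(const std::map<unsigned char, unsigned long>& freqs, std::ostream& out) {
    assert(out.good());  // caller must pass a writable stream
    for (const auto& entry : freqs) {
        // Cast the key so it prints as a number rather than a raw byte.
        out << static_cast<unsigned int>(entry.first) << ' ' << entry.second << '\n';
    }
}
```

Writing character codes as numbers keeps the output readable even when the corpus contains control characters or whitespace.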
info_value (float n): internal information value function, -lg(n)
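A minimal sketch of the two internal helpers, assuming `lg` is a plain base-2 logarithm and the information value is its negation. The bodies below are illustrative and may differ from the class's own code.

```cpp
#include <cassert>
#include <cmath>

// Base-2 logarithm computed from the natural logarithm.
float lg(float n) {
    assert(n > 0.0f);  // the logarithm is undefined for n <= 0
    return std::log(n) / std::log(2.0f);
}

// Information value of a probability n, in bits: -lg(n).
float infoValue(float n) {
    return -lg(n);
}
```

For example, a character with probability 0.25 carries two bits of information.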
learn (istream &in), learn (char *in): Learn the frequencies of characters in a corpus; may be called multiple times.
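Learning can be sketched as counting characters into a map while tracking a running total. The struct below is a hypothetical stand-in (`UnigramCounts`, `probability`), and it assumes that repeated calls accumulate into the same counts, which is one reading of "may be called multiple times".

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>

// Hypothetical stand-in for the classifier's learning state.
struct UnigramCounts {
    std::map<unsigned char, unsigned long> freqs;  // character -> count
    unsigned long total = 0;                       // characters seen so far

    // Count every character in the stream; later calls add to earlier ones.
    void learn(std::istream& in) {
        char c;
        while (in.get(c)) {
            ++freqs[static_cast<unsigned char>(c)];
            ++total;
        }
    }

    // Relative frequency of a character in everything learned so far.
    double probability(unsigned char ch) const {
        assert(total > 0);
        auto it = freqs.find(ch);
        return it == freqs.end() ? 0.0 : static_cast<double>(it->second) / total;
    }
};
```

Calling `learn` on "aab" and then on "bb" leaves a total of five characters, three of which are 'b'.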
lg (float n): internal base-2 logarithm
read (istream &in), read (char *in): Read the frequencies of characters in a corpus; may be called multiple times.
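Reading stored frequencies might look like the sketch below. It assumes a hypothetical text format of "<character code> <count>" pairs and assumes that repeated reads add to the existing counts; neither detail is confirmed by this page, and `LoadedCounts` is an illustrative name.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>

// Hypothetical stand-in for the classifier's stored state.
struct LoadedCounts {
    std::map<unsigned char, unsigned long> freqs;  // character -> count
    unsigned long total = 0;                       // sum of all counts

    // Parse "<character code> <count>" pairs until the input is exhausted.
    // Assumption: counts from several reads are added together.
    void read(std::istream& in) {
        unsigned int code = 0;
        unsigned long count = 0;
        while (in >> code >> count) {
            assert(code <= 255);  // codes must fit in one byte
            freqs[static_cast<unsigned char>(code)] += count;
            total += count;
        }
    }
};
```

Reading precomputed counts avoids rescanning a large corpus each time the classifier is started.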
score (istream &in), score (char *in): What's the score? How many bits would it take to encode the characters in a file?
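Since the score is described as a bit count, one plausible use is to score the same text against several trained classifiers and pick the one that needs the fewest bits. The sketch below assumes that "lower is better" reading; `Model`, `bitsFor`, and `classify` are hypothetical names, and the unseen-character cost is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one trained classifier.
struct Model {
    std::string classification;
    std::map<unsigned char, unsigned long> freqs;
    unsigned long total = 0;
};

// Total bits needed to encode the text under one model.
float bitsFor(const Model& m, const std::string& text) {
    assert(m.total > 0);
    float bits = 0.0f;
    for (unsigned char ch : text) {
        auto it = m.freqs.find(ch);
        // Assumption: an unseen character counts as one occurrence in total + 1.
        float p = (it == m.freqs.end())
            ? 1.0f / static_cast<float>(m.total + 1)
            : static_cast<float>(it->second) / static_cast<float>(m.total);
        bits -= std::log2(p);
    }
    return bits;
}

// Return the classification of the model that encodes the text most cheaply.
std::string classify(const std::vector<Model>& models, const std::string& text) {
    assert(!models.empty());
    const Model* best = &models.front();
    float bestBits = bitsFor(*best, text);
    for (const Model& m : models) {
        float b = bitsFor(m, text);
        if (b < bestBits) {
            best = &m;
            bestBits = b;
        }
    }
    return best->classification;
}
```

A text dominated by characters that are common in one corpus will encode in fewer bits under that corpus's model, which is what makes the comparison usable for classification.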
_classification: internal name of classifier
_corpus_total: internal total number of characters in corpus
_freqs: internal character->frequency map
_total: internal total number of characters in text