LBJ2.learn
Class NaiveBayes

java.lang.Object
  extended by LBJ2.classify.Classifier
      extended by LBJ2.learn.Learner
          extended by LBJ2.learn.NaiveBayes
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable

public class NaiveBayes
extends Learner

Naive Bayes is a multi-class learner that uses prediction value counts and feature counts given a particular prediction value to select the most likely prediction value. More precisely, a score s_v for a given prediction value v is computed such that e^{s_v} is proportional to

P(v) \prod_f P(f | v)
where \prod_f denotes the product over all features f. The value corresponding to the highest score is selected as the prediction. Feature values that were never observed given a particular prediction value during training are smoothed with a configurable constant that defaults to e^{-15}.
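The scoring rule above can be sketched in a few lines of plain Java. This is only an illustration under stated assumptions, not the LBJ2 implementation: the class name NaiveBayesSketch, the constant SMOOTHING, the use of strings for features and labels, and the estimate of P(f|v) as a simple count ratio are all hypothetical. It also assumes each feature appears at most once per example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; names and structure are assumptions, not LBJ2 code. */
class NaiveBayesSketch {
    // Natural log of the probability used for a feature never seen with a label,
    // so the smoothed probability itself is e^-15.
    static final double SMOOTHING = -15;

    // label -> number of training examples carrying that label
    private final Map<String, Integer> labelCounts = new HashMap<>();
    // label -> (feature -> number of examples with that label containing the feature)
    private final Map<String, Map<String, Integer>> featureCounts = new HashMap<>();
    private int total = 0;

    void learn(List<String> features, String label) {
        total++;
        labelCounts.merge(label, 1, Integer::sum);
        Map<String, Integer> counts =
            featureCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String f : features) counts.merge(f, 1, Integer::sum);
    }

    /** Log-space score: log P(v) plus the sum over features of log P(f|v). */
    double score(List<String> features, String label) {
        int n = labelCounts.getOrDefault(label, 0);
        if (n == 0) return Double.NEGATIVE_INFINITY; // label never observed
        double s = Math.log((double) n / total);
        Map<String, Integer> counts = featureCounts.get(label);
        for (String f : features) {
            Integer c = counts.get(f);
            // Unseen (feature, label) pairs fall back to the smoothing constant.
            s += (c == null) ? SMOOTHING : Math.log((double) c / n);
        }
        return s;
    }

    /** Returns the observed label with the highest score, or null if untrained. */
    String classify(List<String> features) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : labelCounts.keySet()) {
            double s = score(features, label);
            if (best == null || s > bestScore) {
                best = label;
                bestScore = s;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        NaiveBayesSketch nb = new NaiveBayesSketch();
        nb.learn(Arrays.asList("sunny", "hot"), "no");
        nb.learn(Arrays.asList("rainy", "mild"), "yes");
        nb.learn(Arrays.asList("sunny", "mild"), "yes");
        System.out.println(nb.classify(Arrays.asList("rainy", "mild"))); // prints: yes
    }
}
```

Working in log space turns the product into a sum, which avoids floating-point underflow when an example has many features.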

This Learner learns a discrete classifier from other discrete classifiers. Features coming from real classifiers are ignored. It is also assumed that a single discrete label feature will be produced in association with each example object. A feature taking one of the values observed in that label feature will be produced by the learned classifier.

This algorithm's user-configurable parameters are stored in member fields of this class. They may be set via either a constructor that names each parameter explicitly or a constructor that takes an instance of Parameters as input. The documentation in each member field in this class indicates the default value of the associated parameter when using the former type of constructor. The documentation of the associated member field in the Parameters class indicates the default value of the parameter when using the latter type of constructor.
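The two construction styles described above follow a common pattern, sketched here with a stand-in class so that it runs without LBJ2 on the classpath. ConfigurableLearnerSketch, its getSmoothing accessor, and the public smoothing field on the nested Parameters class are assumptions made for illustration; consult the NaiveBayes.Parameters documentation for the real field names and defaults.

```java
/** Hypothetical stand-in for a learner with one configurable parameter. */
class ConfigurableLearnerSketch {
    /** Default used by the constructors that name each parameter explicitly. */
    static final int DEFAULT_SMOOTHING = -15;

    /** Container for all configurable parameters; it carries its own defaults. */
    static class Parameters {
        double smoothing = DEFAULT_SMOOTHING;
    }

    protected double smoothing;

    /** Default constructor: every parameter takes its documented default. */
    ConfigurableLearnerSketch() { this(DEFAULT_SMOOTHING); }

    /** Explicit-parameter constructor. */
    ConfigurableLearnerSketch(double smooth) { smoothing = smooth; }

    /** Parameters-object constructor: copies each setting from the container. */
    ConfigurableLearnerSketch(Parameters p) { smoothing = p.smoothing; }

    double getSmoothing() { return smoothing; }

    public static void main(String[] args) {
        // Style 1: name the parameter directly.
        ConfigurableLearnerSketch a = new ConfigurableLearnerSketch(-10);

        // Style 2: fill in a Parameters object, then hand it over.
        Parameters p = new Parameters();
        p.smoothing = -10;
        ConfigurableLearnerSketch b = new ConfigurableLearnerSketch(p);

        System.out.println(a.getSmoothing() == b.getSmoothing()); // prints: true
    }
}
```

The Parameters style tends to scale better when a learner has many settings, since callers override only the fields they care about.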

See Also:
NaiveBayes.NaiveBayesVector, Serialized Form

Nested Class Summary
protected static class NaiveBayes.Count
          A Count object stores two doubles, one holding an accumulated count value and the other intended to hold the natural logarithm of that count.
protected  class NaiveBayes.NaiveBayesVector
          Keeps track of all the counts associated with a given label.
static class NaiveBayes.Parameters
          Simply a container for all of NaiveBayes's configurable parameters.
 
Field Summary
static int defaultSmoothing
          The default conditional feature probability is e^defaultSmoothing.
protected  java.util.LinkedHashMap network
          One NaiveBayes.NaiveBayesVector for each observed prediction value.
protected  double smoothing
          The exponential of this number is used as the conditional probability of a feature that was never observed during training; defaults to defaultSmoothing.
 
Fields inherited from class LBJ2.learn.Learner
extractor, labeler
 
Fields inherited from class LBJ2.classify.Classifier
containingPackage, name
 
Constructor Summary
NaiveBayes()
          Default constructor.
NaiveBayes(double smooth)
          Initializes the smoothing constant.
NaiveBayes(NaiveBayes.Parameters p)
          Initializing constructor.
NaiveBayes(java.lang.String n)
          Initializes the name of the classifier.
NaiveBayes(java.lang.String name, double smooth)
          Initializes the name and smoothing constant.
NaiveBayes(java.lang.String n, NaiveBayes.Parameters p)
          Initializing constructor.
 
Method Summary
 FeatureVector classify(java.lang.Object example)
          Prediction value counts and feature counts given a particular prediction value are used to select the most likely prediction value.
 java.lang.Object clone()
          Returns a deep clone of this learning algorithm.
 void forget()
          Clears the network.
 void learn(java.lang.Object example)
          Trains the learning algorithm given an object as an example.
 ScoreSet scores(java.lang.Object example)
          The scores in the returned ScoreSet are the posterior probabilities of each possible label given the example.
 void setLabeler(Classifier l)
          Sets the labeler.
 void setSmoothing(double s)
          Sets the smoothing parameter to the specified value.
 void write(java.io.PrintStream out)
          Writes the algorithm's internal representation as text.
 
Methods inherited from class LBJ2.learn.Learner
doneLearning, getExtractor, getLabeler, learn, save, setExtractor
 
Methods inherited from class LBJ2.classify.Classifier
allowableValues, binaryRead, binaryRead, binaryRead, binaryRead, binaryWrite, binaryWrite, classify, discreteValue, discreteValueArray, getCompositeChildren, getInputType, getOutputType, realValue, realValueArray, test, toString, valueIndexOf
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

defaultSmoothing

public static final int defaultSmoothing
The default conditional feature probability is e^defaultSmoothing.

See Also:
Constant Field Values

smoothing

protected double smoothing
The exponential of this number is used as the conditional probability of a feature that was never observed during training; defaults to defaultSmoothing.


network

protected java.util.LinkedHashMap network
One NaiveBayes.NaiveBayesVector for each observed prediction value.

Constructor Detail

NaiveBayes

public NaiveBayes()
Default constructor.


NaiveBayes

public NaiveBayes(double smooth)
Initializes the smoothing constant.

Parameters:
smooth - The exponential of this number is used as the conditional probability of a feature that was never observed during training.

NaiveBayes

public NaiveBayes(NaiveBayes.Parameters p)
Initializing constructor. Sets all member variables to their associated settings in the NaiveBayes.Parameters object.

Parameters:
p - The settings of all parameters.

NaiveBayes

public NaiveBayes(java.lang.String n)
Initializes the name of the classifier.

Parameters:
n - The classifier's name.

NaiveBayes

public NaiveBayes(java.lang.String name,
                  double smooth)
Initializes the name and smoothing constant.

Parameters:
name - The classifier's name.
smooth - The exponential of this number is used as the conditional probability of a feature that was never observed during training.

NaiveBayes

public NaiveBayes(java.lang.String n,
                  NaiveBayes.Parameters p)
Initializing constructor. Sets all member variables to their associated settings in the NaiveBayes.Parameters object.

Parameters:
n - The name of the classifier.
p - The settings of all parameters.
Method Detail

setSmoothing

public void setSmoothing(double s)
Sets the smoothing parameter to the specified value.

Parameters:
s - The new value for the smoothing parameter.

setLabeler

public void setLabeler(Classifier l)
Sets the labeler.

Overrides:
setLabeler in class Learner
Parameters:
l - A labeling classifier.

learn

public void learn(java.lang.Object example)
Trains the learning algorithm given an object as an example.

Specified by:
learn in class Learner
Parameters:
example - An example of the desired learned classifier's behavior.

forget

public void forget()
Clears the network.

Specified by:
forget in class Learner

scores

public ScoreSet scores(java.lang.Object example)
The scores in the returned ScoreSet are the posterior probabilities of each possible label given the example.

Specified by:
scores in class Learner
Parameters:
example - The object to make decisions about.
Returns:
A set of scores indicating the degree to which each possible discrete classification value is associated with the given example object.
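One way to turn log-space scores into posterior probabilities is to exponentiate and normalize them so they sum to 1. The sketch below shows that step in isolation; the class PosteriorSketch and its posteriors method are hypothetical, and this page does not state whether LBJ2 computes its ScoreSet this way.

```java
/** Illustrative sketch only: normalizes log scores into probabilities. */
class PosteriorSketch {
    /**
     * Given log scores, each equal to the log of an unnormalized probability,
     * returns probabilities that sum to 1. Assumes at least one finite score.
     */
    static double[] posteriors(double[] logScores) {
        // Subtract the maximum before exponentiating to avoid underflow.
        double max = Double.NEGATIVE_INFINITY;
        for (double s : logScores) max = Math.max(max, s);

        double[] p = new double[logScores.length];
        double sum = 0;
        for (int i = 0; i < p.length; i++) {
            p[i] = Math.exp(logScores[i] - max);
            sum += p[i];
        }
        for (int i = 0; i < p.length; i++) p[i] /= sum;
        return p;
    }

    public static void main(String[] args) {
        // Unnormalized probabilities 1 and 3 normalize to 0.25 and 0.75.
        double[] p = posteriors(new double[] { Math.log(1), Math.log(3) });
        System.out.println(p[0] + " " + p[1]);
    }
}
```

Subtracting the maximum score first leaves the normalized result unchanged but keeps the exponentials in a safe numeric range, which matters when scores are as low as a sum of several smoothing terms.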

classify

public FeatureVector classify(java.lang.Object example)
Prediction value counts and feature counts given a particular prediction value are used to select the most likely prediction value.

Specified by:
classify in class Classifier
Parameters:
example - The example object.
Returns:
A single discrete feature, set to the most likely value.

write

public void write(java.io.PrintStream out)
Writes the algorithm's internal representation as text.

Specified by:
write in class Learner
Parameters:
out - The output stream.

clone

public java.lang.Object clone()
Returns a deep clone of this learning algorithm.

Overrides:
clone in class Classifier
Returns:
A deep clone of this learning algorithm.