LBJ2.parse
Class FoldParser

java.lang.Object
  extended by LBJ2.parse.FoldParser
All Implemented Interfaces:
Parser

public class FoldParser
extends java.lang.Object
implements Parser

Useful when performing k-fold cross validation, this parser filters the examples coming from another parser. Conceptually, the examples from the original parser are first split into k "folds" (or partitions) according to the selected splitting strategy. One fold is then selected as the pivot, and this parser can be configured to return either exactly the examples in that fold or exactly the examples in all the other folds.

The k folds are referred to by their indexes, which are 0, 1, ..., k - 1. This index is used to select the pivot fold.
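The filtering idea described above can be sketched without the library. The class below is a hypothetical stand-in, not the FoldParser implementation: the names FoldFilterSketch, foldOf and select are invented for illustration, and the rule that assigns contiguous blocks of examples to folds is an assumption made only so that the example is concrete.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FoldFilterSketch {
    // Assumed assignment rule (not taken from the library): examples are
    // cut into k contiguous blocks of roughly equal size.
    static int foldOf(int index, int total, int k) {
        return (int) ((long) index * k / total);
    }

    // Returns the examples in the pivot fold when fromPivot is true,
    // and the examples in every other fold when it is false.
    static <T> List<T> select(List<T> examples, int k, int pivot, boolean fromPivot) {
        List<T> selected = new ArrayList<>();
        for (int i = 0; i < examples.size(); i++) {
            boolean inPivot = foldOf(i, examples.size(), k) == pivot;
            if (inPivot == fromPivot) selected.add(examples.get(i));
        }
        return selected;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
        System.out.println(select(data, 5, 1, true));   // prints [2, 3]
        System.out.println(select(data, 5, 1, false));  // prints [0, 1, 4, 5, 6, 7, 8, 9]
    }
}
```

With ten examples and k = 5, fold 1 holds the third and fourth examples, so the two calls return the held-out fold and its complement respectively.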

See Also:
FoldParser.SplitStrategy

Nested Class Summary
static class FoldParser.SplitStrategy
          Immutable type representing the way in which examples are partitioned into folds.
 
Field Summary
protected  int exampleIndex
          Keeps track of the index of the next example to be returned.
protected  int examples
          The total number of examples coming from parser.
protected  int fold
          Keeps track of the current fold; used only in manual splitting.
protected  boolean fromPivot
          Whether examples will come from the pivot fold or not.
protected  int K
          The total number of folds.
protected  int lowerBound
          A lower bound for an index relating to the pivot fold.
protected  Parser parser
          The parser whose examples are being filtered.
protected  int pivot
          The examples from this fold are exclusively selected for or excluded from the set of examples returned by this parser.
protected  int[] shuffled
          Used only by the random splitting strategy to remember which example indexes are in which folds.
protected  int shuffleIndex
          An index pointing into shuffled.
protected  FoldParser.SplitStrategy splitStrategy
          The way in which examples are partitioned into folds.
protected  int upperBound
          An upper bound for an index relating to the pivot fold.
 
Constructor Summary
FoldParser(Parser parser, FoldParser.SplitStrategy split, int pivot, boolean f)
          Constructor for when you know neither how many examples are in the data nor K, i.e., how many folds are in the data.
FoldParser(Parser parser, int K, FoldParser.SplitStrategy split, int pivot, boolean f)
          Constructor for when you don't know how many examples are in the data.
FoldParser(Parser parser, int K, FoldParser.SplitStrategy split, int pivot, boolean f, int e)
          Full constructor.
 
Method Summary
protected  boolean filter(java.lang.Object example)
          Determines whether the next example should be returned.
 int getK()
          Retrieves the value of K, which may have been computed in the constructor if the splitting strategy is manual.
protected  void increment(java.lang.Object example)
          Changes state to reflect retrieval of the next example from the parser.
 java.lang.Object next()
          Retrieves the next example object.
 void reset()
          Sets this parser back to the beginning of the raw data.
 void setFromPivot(boolean f)
          Sets the value of fromPivot, which controls whether examples will be taken from the pivot fold or from all other folds.
 void setPivot(int p)
          Sets the pivot fold, which also causes parser to be reset.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

parser

protected Parser parser
The parser whose examples are being filtered.


K

protected int K
The total number of folds.


splitStrategy

protected FoldParser.SplitStrategy splitStrategy
The way in which examples are partitioned into folds.


pivot

protected int pivot
The examples from this fold are exclusively selected for or excluded from the set of examples returned by this parser.


fromPivot

protected boolean fromPivot
Whether examples will come from the pivot fold or not.


examples

protected int examples
The total number of examples coming from parser.


exampleIndex

protected int exampleIndex
Keeps track of the index of the next example to be returned.


fold

protected int fold
Keeps track of the current fold; used only in manual splitting.


lowerBound

protected int lowerBound
A lower bound for an index relating to the pivot fold. The index variable in question may either be exampleIndex or shuffleIndex.


upperBound

protected int upperBound
An upper bound for an index relating to the pivot fold. The index variable in question may either be exampleIndex or shuffleIndex.


shuffled

protected int[] shuffled
Used only by the random splitting strategy to remember which example indexes are in which folds.


shuffleIndex

protected int shuffleIndex
An index pointing into shuffled.
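
One plausible reading of how shuffled, lowerBound and upperBound could work together under a random split is sketched below. This is an assumption about the internals, not a description of them: the class ShuffledFoldSketch, its method names, the fixed seed, and the rule used to compute the bounds are all hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

class ShuffledFoldSketch {
    // Builds a random permutation of 0..n-1 (Fisher-Yates shuffle).
    static int[] shuffle(int n, long seed) {
        int[] indexes = new int[n];
        for (int i = 0; i < n; i++) indexes[i] = i;
        Random rng = new Random(seed);
        for (int i = n - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);
            int tmp = indexes[i];
            indexes[i] = indexes[j];
            indexes[j] = tmp;
        }
        return indexes;
    }

    // Assumed rule: fold f owns positions [bound(f), bound(f + 1)) of the
    // shuffled array.
    static int bound(int fold, int n, int k) {
        return (int) ((long) fold * n / k);
    }

    // Example indexes that belong to the given fold, in ascending order so
    // they can be matched against a running example index.
    static int[] membersOf(int[] shuffled, int k, int fold) {
        int lowerBound = bound(fold, shuffled.length, k);
        int upperBound = bound(fold + 1, shuffled.length, k);
        int[] members = Arrays.copyOfRange(shuffled, lowerBound, upperBound);
        Arrays.sort(members);
        return members;
    }

    public static void main(String[] args) {
        int[] shuffled = shuffle(10, 42L);
        for (int fold = 0; fold < 3; fold++) {
            System.out.println("fold " + fold + ": " + Arrays.toString(membersOf(shuffled, 3, fold)));
        }
    }
}
```

Under these assumptions every example index lands in exactly one fold, and the pivot fold is identified by a single pair of bounds into the shuffled array.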

Constructor Detail

FoldParser

public FoldParser(Parser parser,
                  int K,
                  FoldParser.SplitStrategy split,
                  int pivot,
                  boolean f)
Constructor for when you don't know how many examples are in the data. Using a constructor that allows specification of the number of examples in the data saves computation only when the splitting strategy is either sequential or random.

Parameters:
parser - The parser whose examples are being filtered.
K - The total number of folds; this value is ignored if the splitting strategy is manual.
split - The way in which examples are partitioned into folds.
pivot - The index of the pivot fold.
f - Whether to extract examples from the pivot.

FoldParser

public FoldParser(Parser parser,
                  FoldParser.SplitStrategy split,
                  int pivot,
                  boolean f)
Constructor for when you know neither how many examples are in the data nor K, i.e., how many folds are in the data. This constructor can only be used when the splitting strategy is manual. Using a constructor that allows specification of the number of examples in the data saves computation only when the splitting strategy is either sequential or random.

Parameters:
parser - The parser whose examples are being filtered.
split - The way in which examples are partitioned into folds.
pivot - The index of the pivot fold.
f - Whether to extract examples from the pivot.

FoldParser

public FoldParser(Parser parser,
                  int K,
                  FoldParser.SplitStrategy split,
                  int pivot,
                  boolean f,
                  int e)
Full constructor.

Parameters:
parser - The parser whose examples are being filtered.
K - The total number of folds; this value is ignored if the splitting strategy is manual.
split - The way in which examples are partitioned into folds.
pivot - The index of the pivot fold.
f - Whether to extract examples from the pivot.
e - The total number of examples coming from parser, or -1 if unknown.

Method Detail

getK

public int getK()
Retrieves the value of K, which may have been computed in the constructor if the splitting strategy is manual.
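
To illustrate how K might be derived under manual splitting, the sketch below counts separator markers in a stream of examples. It rests on an assumption: that manual splitting means fold boundaries are marked by special objects placed between examples. The SEPARATOR constant and the class ManualFoldCountSketch are hypothetical and are not part of the documented API.

```java
import java.util.Arrays;
import java.util.List;

class ManualFoldCountSketch {
    // Hypothetical marker object; the real separator type is not shown in
    // this document.
    static final Object SEPARATOR = new Object();

    // K is one more than the number of separators in the stream.
    static int countFolds(List<?> stream) {
        int separators = 0;
        for (Object item : stream) {
            if (item == SEPARATOR) separators++;
        }
        return separators + 1;
    }

    // Fold index of the example at the given 0-based position, where
    // separators are not counted as examples.
    static int foldOfExample(List<?> stream, int exampleNumber) {
        int fold = 0;
        int seen = 0;
        for (Object item : stream) {
            if (item == SEPARATOR) {
                fold++;          // a separator starts the next fold
                continue;
            }
            if (seen++ == exampleNumber) return fold;
        }
        throw new IndexOutOfBoundsException("no example number " + exampleNumber);
    }

    public static void main(String[] args) {
        List<Object> stream = Arrays.asList("a", "b", SEPARATOR, "c", SEPARATOR, "d", "e");
        System.out.println(countFolds(stream));        // prints 3
        System.out.println(foldOfExample(stream, 2));  // prints 1
        System.out.println(foldOfExample(stream, 3));  // prints 2
    }
}
```

Counting this way needs a full pass over the data, which would be consistent with K being computed once in the constructor.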


setFromPivot

public void setFromPivot(boolean f)
Sets the value of fromPivot, which controls whether examples will be taken from the pivot fold or from all other folds.

Parameters:
f - The new value for fromPivot.

setPivot

public void setPivot(int p)
Sets the pivot fold, which also causes parser to be reset.

Parameters:
p - The index of the new pivot fold.

reset

public void reset()
Sets this parser back to the beginning of the raw data. This means arranging for all relevant state variables to be reset appropriately as well, since the value of pivot may have changed.

Specified by:
reset in interface Parser
See Also:
setPivot(int)
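
A typical cross validation loop changes the pivot once per round and reads the data twice, first from the other folds and then from the pivot fold. The sketch below shows that calling pattern with self-contained stand-ins so it can run without the library. Source, ListSource and FoldFilter are hypothetical; they only mirror the method names documented here. Two further assumptions are made: next() returns null when the data is exhausted, and example i belongs to fold i % k.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CrossValidationLoopSketch {
    // Hypothetical stand-in for the Parser interface.
    interface Source {
        Object next();   // assumed to return null when exhausted
        void reset();
    }

    // Hypothetical in-memory source of examples.
    static class ListSource implements Source {
        private final List<?> items;
        private int position = 0;
        ListSource(List<?> items) { this.items = items; }
        public Object next() { return position < items.size() ? items.get(position++) : null; }
        public void reset() { position = 0; }
    }

    // Hypothetical stand-in for FoldParser; example i is in fold i % k.
    static class FoldFilter implements Source {
        private final Source source;
        private final int k;
        private int pivot;
        private boolean fromPivot;
        private int exampleIndex = 0;

        FoldFilter(Source source, int k, int pivot, boolean fromPivot) {
            this.source = source;
            this.k = k;
            this.pivot = pivot;
            this.fromPivot = fromPivot;
        }

        void setPivot(int p) { pivot = p; reset(); }   // changing the pivot also resets
        void setFromPivot(boolean f) { fromPivot = f; }
        public void reset() { source.reset(); exampleIndex = 0; }

        public Object next() {
            for (Object e = source.next(); e != null; e = source.next()) {
                boolean inPivot = (exampleIndex++ % k) == pivot;
                if (inPivot == fromPivot) return e;
            }
            return null;
        }
    }

    // Reads a source until it is exhausted.
    static List<Object> drain(Source s) {
        List<Object> out = new ArrayList<>();
        for (Object e = s.next(); e != null; e = s.next()) out.add(e);
        return out;
    }

    // One round: element 0 is the training data, element 1 the held-out fold.
    static List<List<Object>> trainAndTest(FoldFilter folds, int pivot) {
        folds.setPivot(pivot);
        folds.setFromPivot(false);
        List<Object> train = drain(folds);
        folds.reset();                 // rewind before the second pass
        folds.setFromPivot(true);
        List<Object> test = drain(folds);
        return Arrays.asList(train, test);
    }

    public static void main(String[] args) {
        Source data = new ListSource(Arrays.asList("a", "b", "c", "d", "e", "f"));
        FoldFilter folds = new FoldFilter(data, 3, 0, false);
        for (int pivot = 0; pivot < 3; pivot++) {
            List<List<Object>> parts = trainAndTest(folds, pivot);
            System.out.println("pivot " + pivot + ": train=" + parts.get(0) + " test=" + parts.get(1));
        }
    }
}
```

In this sketch the explicit reset() between the two passes matters because setFromPivot only changes the flag, whereas setPivot rewinds the underlying data.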

filter

protected boolean filter(java.lang.Object example)
Determines whether the next example should be returned.

Parameters:
example - The next example object.
Returns:
true iff the next example should be returned.

increment

protected void increment(java.lang.Object example)
Changes state to reflect retrieval of the next example from the parser.

Parameters:
example - The previous example object.

next

public java.lang.Object next()
Retrieves the next example object.

Specified by:
next in interface Parser
Returns:
The next object parsed from the input data.