java.lang.Object
  LBJ2.parse.LinkedChild
    LBJ2.nlp.Sentence

public class Sentence extends LinkedChild
This representation of a sentence simply stores the entire text of the
sentence in a string. This may include any newlines present in the input,
depending on the parser (e.g., SentenceSplitter will leave them
in). However, this class also provides methods to convert that string to
other representations.
| Field Summary | |
|---|---|
| private boolean[] | inURL: Indicates whether the corresponding index in the text has been determined to be part of a URL; used by partOfURL(int). |
| private static java.lang.String[] | protocols: URL prefixes; used by partOfURL(int). |
| java.lang.String | text: The actual text of the sentence. |
| private static java.lang.String[] | topLevelDomains: Domain name suffixes; used by partOfURL(int). |
| Fields inherited from class LBJ2.parse.LinkedChild |
|---|
| end, next, parent, previous, start |
| Constructor Summary | |
|---|---|
| Sentence(java.lang.String t) | Constructs a sentence from its text. |
| Sentence(java.lang.String t, int s, int e) | Constructor that sets the character offsets of this sentence. |
| Method Summary | |
|---|---|
| private void | myAdd(java.util.LinkedList l, int i, java.lang.String description): For debugging purposes, it's useful to insert print statements here. |
| private boolean | partOfURL(int index): Does a simple check to determine if the symbol at the specified index in the specified string is likely to be part of a URL. |
| java.lang.String | toString(): The string representation of a Sentence is just its text. |
| LinkedVector | wordSplit(): Creates and returns a LinkedVector representation of this sentence in which every LinkedChild is a Word. |
| Methods inherited from class LBJ2.parse.LinkedChild |
|---|
| clone |
| Methods inherited from class java.lang.Object |
|---|
| equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Field Detail |
|---|

private static final java.lang.String[] protocols
URL prefixes; used by partOfURL(int). The values in this array need to be sorted in decreasing order of length for the regular expressions that use them to work properly.

private static final java.lang.String[] topLevelDomains
Domain name suffixes; used by partOfURL(int). The values in this array need to be sorted in decreasing order of length for the regular expressions that use them to work properly.
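
The sorting requirement on these arrays follows from how Java evaluates regex alternation: alternatives are tried left to right, and the first one that matches wins, even if a longer alternative would also match. The sketch below illustrates this with a hypothetical suffix list. The class name, the helper methods, and the values in SUFFIXES are assumptions made for illustration only, and are not taken from the Sentence source.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SuffixAlternation {
    // Hypothetical values for illustration; not the contents of topLevelDomains.
    static final String[] SUFFIXES = {"co", "com", "org"};

    /** Builds the pattern \.(a|b|...) from the values, in the order given. */
    static Pattern alternation(String[] values) {
        StringBuilder sb = new StringBuilder("\\.(");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) sb.append('|');
            sb.append(Pattern.quote(values[i]));
        }
        return Pattern.compile(sb.append(')').toString());
    }

    /** Returns a copy of the values sorted by decreasing length. */
    static String[] longestFirst(String[] values) {
        String[] copy = values.clone();
        Arrays.sort(copy, (a, b) -> Integer.compare(b.length(), a.length()));
        return copy;
    }

    /** Returns the suffix captured by the first match, or null if none. */
    static String firstSuffix(Pattern p, String text) {
        Matcher m = p.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "example.com";
        // Unsorted: "co" is tried before "com", so the match stops short.
        System.out.println(firstSuffix(alternation(SUFFIXES), text));               // prints "co"
        // Longest first: "com" is tried before "co".
        System.out.println(firstSuffix(alternation(longestFirst(SUFFIXES)), text)); // prints "com"
    }
}
```

With the shorter alternative listed first, the match on "example.com" captures only "co"; sorting by decreasing length lets the longer suffix win.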
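
private boolean[] inURL
Indicates whether the corresponding index in the text has been determined to be part of a URL; used by partOfURL(int).

public java.lang.String text
The actual text of the sentence.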
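
One way to picture a per-character flag array such as inURL is as a mask over the sentence text, with one boolean per character index. The following is a minimal sketch under stated assumptions: the UrlMask class, the mark method, and the deliberately crude URL pattern are all hypothetical, and this page does not show how Sentence actually fills or consults inURL.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlMask {
    // Crude, hypothetical URL pattern; the real heuristics in partOfURL(int)
    // are not documented on this page.
    private static final Pattern URL = Pattern.compile("(?:https?|ftp)://\\S+");

    /** Returns an array with true at every index covered by a URL match. */
    static boolean[] mark(String text) {
        boolean[] inUrl = new boolean[text.length()];
        Matcher m = URL.matcher(text);
        while (m.find()) {
            for (int i = m.start(); i < m.end(); i++) {
                inUrl[i] = true;
            }
        }
        return inUrl;
    }

    public static void main(String[] args) {
        String text = "See http://a.org now";
        boolean[] mask = mark(text);
        // Print the text with URL characters replaced by '^' to show the mask.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            sb.append(mask[i] ? '^' : text.charAt(i));
        }
        System.out.println(sb);
    }
}
```

A lookup by index, as partOfURL(int) performs, then reduces to reading one array element.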
| Constructor Detail |
|---|
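
public Sentence(java.lang.String t)
Constructs a sentence from its text.
Parameters:
t - The text of the sentence.

public Sentence(java.lang.String t,
int s,
int e)
Constructor that sets the character offsets of this sentence.
Parameters:
t - The text of the sentence.
s - The offset at which this child starts.
e - The offset at which this child ends.

| Method Detail |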
|---|

private void myAdd(java.util.LinkedList l,
int i,
java.lang.String description)
For debugging purposes, it's useful to insert print statements here.
Parameters:
l - The list to add to.
i - The item to add.
description - A string describing why the addition is happening.

public LinkedVector wordSplit()
Creates and returns a LinkedVector representation of this sentence in which every LinkedChild is a Word. Offset information is respected and propagated.
Returns:
A LinkedVector representation of this sentence.
See Also:
Word

private boolean partOfURL(int index)
Does a simple check to determine if the symbol at the specified index in the specified string is likely to be part of a URL.
Parameters:
index - The index of the symbol in question.
Returns:
true if and only if the specified symbol appears to be part of a URL.

public java.lang.String toString()
The string representation of a Sentence is just its text.
Overrides:
toString in class java.lang.Object
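
The statement that wordSplit() respects and propagates offset information can be illustrated without the LBJ2 classes. The sketch below is not the library's implementation: OffsetSplit and Token are hypothetical stand-ins, the split is on whitespace only (the real method also deals with URLs and presumably punctuation), and the end offset is treated as exclusive, which is an assumption since this page does not state the convention used by LinkedChild.

```java
import java.util.ArrayList;
import java.util.List;

class OffsetSplit {
    /** Hypothetical stand-in for a word with document-level offsets; not LBJ2's Word. */
    static final class Token {
        final String form;
        final int start; // offset of the first character in the document
        final int end;   // offset just past the last character (assumed exclusive)

        Token(String form, int start, int end) {
            this.form = form;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return form + "[" + start + "," + end + ")";
        }
    }

    /**
     * Splits text on whitespace and shifts each token's offsets by
     * sentenceStart, the offset of the sentence within its document.
     */
    static List<Token> split(String text, int sentenceStart) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            // Skip whitespace, including any newlines left in by the parser.
            while (i < text.length() && Character.isWhitespace(text.charAt(i))) i++;
            int begin = i;
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) i++;
            if (i > begin) {
                tokens.add(new Token(text.substring(begin, i),
                                     sentenceStart + begin, sentenceStart + i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // A sentence that starts at document offset 100.
        for (Token t : split("Hello big\nworld", 100)) {
            System.out.println(t);
        }
    }
}
```

Each token's offsets are relative to the document rather than to the sentence, which is what propagating the sentence's own start offset amounts to in this sketch.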