java.lang.Object
|
+--superwaba.ext.xplat.xml.XmlTokenizer
|
+--superwaba.ext.xplat.xml.XmlReader
|
+--superwaba.ext.xplat.html.HtmlReader
HtmlReader extends XmlReader in order to:
| Fields inherited from class superwaba.ext.xplat.xml.XmlReader |
converter,
tagNameHashId |
| Constructor Summary | |
HtmlReader()
Constructor |
|
| Method Summary | |
void |
foundEndTagName(byte[] buffer,
int offset,
int count)
Method called when an end-tag has been found. |
protected void |
foundReference(byte[] input,
int offset,
int count)
Method called when a reference has been found in content. |
void |
foundStartTagName(byte[] buffer,
int offset,
int count)
Method called when a start-tag has been found. |
protected int |
getTagCode(byte[] b,
int offset,
int count)
Method to compute the tag code identifying a tag name. |
| Methods inherited from class superwaba.ext.xplat.xml.XmlReader |
foundAttributeName,
foundAttributeValue,
foundCharacter,
foundCharacterData,
foundComment,
foundEndEmptyTag,
foundEndOfInput,
getContentHandler,
parse,
parse,
parse,
parse,
setAttributeListFilter,
setContentHandler,
setNewlineSignificant |
| Methods inherited from class superwaba.ext.xplat.xml.XmlTokenizer |
disableReferenceResolution,
foundDeclaration,
foundInvalidData,
foundProcessingInstruction,
foundStartOfInput,
getAbsoluteOffset,
isDataCDATA,
resolveCharacterReference,
setCdataContents,
setStrictlyXml,
tokenize,
tokenize,
tokenize,
tokenize |
| Methods inherited from class java.lang.Object |
clone,
equals,
finalize,
getClass,
hashCode,
notify,
notifyAll,
toString,
wait,
wait,
wait |
| Constructor Detail |
public HtmlReader()
| Method Detail |
protected void foundReference(byte[] input,
int offset,
int count)
The reference can be either a named or numeric character reference, or an entity reference. Because references have several syntaxes, the validity of the reference "name" is not verified beforehand.
For convenience, the static method
XmlTokenizer.resolveCharacterReference(byte[],int,int)
converts a character reference into its UCS-2 encoded value.
| Note: | foundReference is called only if
XmlTokenizer.disableReferenceResolution(boolean disable)
has first been called with disable set to true.
Otherwise, foundReference is never called,
and XmlTokenizer.foundCharacter(char) is called instead.
This design makes it easy to handle both simple XML documents,
which use only predefined named character entities and numeric character entities,
and documents that have user-defined internal or external entities,
as explained below.
|
When working with a set of externally defined entities,
call disableReferenceResolution(true)
to turn off automatic reference resolution.
Your code in foundReference can then
check whether the found reference is numeric.
If it is numeric (it starts with a # character),
call resolveCharacterReference.
If it is not numeric, check whether the reference belongs
to the list of entities defined for the parsed document.
If it does, do the substitution; if not, call
resolveCharacterReference, because it could be one of the
XML Predefined Entities.
By default, each character reference is
reported via XmlTokenizer.foundCharacter(char),
which, again, supersedes
the foundReference notification.
Derived classes may override this method.
Parameters:
input - byte array containing the reference name
offset - position of the first character of the reference name in the array
count - number of bytes the reference name is made of
See Also:
XmlTokenizer.setStrictlyXml(boolean toSet)
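The lookup order described above (numeric reference first, then the entities defined for the document, then the XML predefined entities) can be sketched as follows. This is a standalone illustration only: the class ReferenceSketch, its methods define and resolve, and the helper resolveNumeric are hypothetical names, the helper is a stand-in for XmlTokenizer.resolveCharacterReference rather than a reproduction of it, and the reference name is assumed to be ASCII.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Standalone sketch; it neither uses nor reproduces the SuperWaba classes. */
final class ReferenceSketch {
    // Hypothetical table of the entities defined for the parsed document.
    private final Map<String, String> documentEntities = new HashMap<>();

    // The five entities predefined by XML.
    private static final Map<String, String> PREDEFINED = new HashMap<>();
    static {
        PREDEFINED.put("lt", "<");
        PREDEFINED.put("gt", ">");
        PREDEFINED.put("amp", "&");
        PREDEFINED.put("apos", "'");
        PREDEFINED.put("quot", "\"");
    }

    void define(String name, String replacement) {
        documentEntities.put(name, replacement);
    }

    /** Returns the replacement text, or null when the reference is unknown. */
    String resolve(byte[] input, int offset, int count) {
        // A numeric reference starts with '#', for example "#65" or "#x41".
        if (count > 1 && input[offset] == '#') {
            return resolveNumeric(input, offset + 1, count - 1);
        }
        // Assumption: the reference name is plain ASCII.
        String name = new String(input, offset, count, StandardCharsets.US_ASCII);
        String replacement = documentEntities.get(name);
        // Fall back to the predefined entities when the document does not define it.
        return replacement != null ? replacement : PREDEFINED.get(name);
    }

    // Stand-in for a numeric resolver; decimal, or hexadecimal after 'x'.
    private static String resolveNumeric(byte[] input, int offset, int count) {
        String digits = new String(input, offset, count, StandardCharsets.US_ASCII);
        try {
            int codePoint = (digits.startsWith("x") || digits.startsWith("X"))
                    ? Integer.parseInt(digits.substring(1), 16)
                    : Integer.parseInt(digits, 10);
            return Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : null;
        } catch (NumberFormatException e) {
            return null; // malformed digits: treat the reference as unknown
        }
    }
}
```

In a real subclass, the body of resolve would sit inside foundReference, and it would only be reached after disableReferenceResolution(true) has been called.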
protected int getTagCode(byte[] b,
int offset,
int count)
The tag code is the value passed to the ContentHandler to report a tag name. Derived classes may override this method.
Parameters:
b - byte array containing the bytes to be hashed
offset - position of the first byte in the array
count - number of bytes to be hashed
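To illustrate what hashing count bytes starting at offset can look like, here is a minimal sketch of one possible tag code function. The hashing scheme used by HtmlReader is not documented here, so both the 31-multiplier hash and the ASCII case folding are assumptions, and TagCodeSketch and tagCode are hypothetical names.

```java
/** Standalone sketch of one possible tag code function (assumed scheme). */
final class TagCodeSketch {
    /**
     * Hashes count bytes of b starting at offset, folding ASCII upper case
     * to lower case so that "BODY" and "body" yield the same code.
     */
    static int tagCode(byte[] b, int offset, int count) {
        int hash = 0;
        for (int i = offset; i < offset + count; i++) {
            int c = b[i] & 0xFF;          // treat the byte as unsigned
            if (c >= 'A' && c <= 'Z') {
                c += 'a' - 'A';           // fold ASCII upper case
            }
            hash = 31 * hash + c;         // same recurrence as String.hashCode
        }
        return hash;
    }
}
```

With this scheme, the code of an ASCII tag name equals the hashCode of its lower-case String form, which makes it straightforward to precompute constants for known tags.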
public final void foundStartTagName(byte[] buffer,
int offset,
int count)
This method is declared final, so classes derived from HtmlReader cannot override it.
Parameters:
buffer - byte array containing the name of the tag that started
offset - position of the first character of the tag name in the array
count - number of bytes the tag name is made of
public final void foundEndTagName(byte[] buffer,
int offset,
int count)
This method is declared final, so classes derived from HtmlReader cannot override it.
Parameters:
buffer - byte array containing the name of the tag that ended
offset - position of the first character of the tag name in the array
count - number of bytes the tag name is made of
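The (buffer, offset, count) triplet used by foundStartTagName and foundEndTagName designates a slice of a larger input buffer, not a standalone array. The sketch below mimics the shape of the two callbacks in a hypothetical class, TagNameSketch, that does not extend any SuperWaba class; decoding the name as ISO-8859-1 is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Standalone sketch; mimics the callback signatures only. */
final class TagNameSketch {
    final List<String> events = new ArrayList<>();

    void foundStartTagName(byte[] buffer, int offset, int count) {
        events.add("start:" + name(buffer, offset, count));
    }

    void foundEndTagName(byte[] buffer, int offset, int count) {
        events.add("end:" + name(buffer, offset, count));
    }

    // Only the count bytes starting at offset belong to the tag name.
    private static String name(byte[] buffer, int offset, int count) {
        return new String(buffer, offset, count, StandardCharsets.ISO_8859_1);
    }
}
```

For the input `<p>hi</p>`, a tokenizer following this convention would report the start tag with offset 1 and count 1, and the end tag with offset 7 and count 1.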