xsmeral.semnet.crawler.model
Class EntityDocument

java.lang.Object
  extended by xsmeral.semnet.crawler.model.EntityDocument

public class EntityDocument
extends Object

Container for a document retrieved by HTMLCrawler and passed to a scraper (wrapper).
Holds the base URL for resolving links, the absolute URL of this document, a descriptor of the contained entity, and a TagNode containing the parsed document tree.
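
The description above says the base URL is kept for resolving links. As a minimal sketch of what that might look like inside a scraper, the example below resolves hrefs against the value of getBaseURL() using java.net.URI. DocumentUrls is a hypothetical stand-in that mirrors only the two URL getters of EntityDocument, and resolveLink and the example.org addresses are assumptions made for illustration, not part of this package.

```java
import java.net.URI;

public class LinkResolutionSketch {

    // Hypothetical stand-in exposing only the two URL getters listed on this page;
    // the real EntityDocument also carries an EntityDescriptor and a TagNode.
    static final class DocumentUrls {
        private final String baseURL;
        private final String url;

        DocumentUrls(String baseURL, String url) {
            this.baseURL = baseURL;
            this.url = url;
        }

        String getBaseURL() { return baseURL; }
        String getUrl() { return url; }
    }

    // Resolves an href found in the document against the base URL.
    // An href that is already absolute is returned unchanged.
    static String resolveLink(DocumentUrls doc, String href) {
        return URI.create(doc.getBaseURL()).resolve(href).toString();
    }

    public static void main(String[] args) {
        DocumentUrls doc = new DocumentUrls(
                "http://example.org/", "http://example.org/films/42");
        System.out.println(resolveLink(doc, "/people/7"));
        System.out.println(resolveLink(doc, "http://other.example.net/x"));
    }
}
```

With the real class, the same call would presumably take the value returned by getBaseURL() on the EntityDocument handed to the scraper.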

See Also:
HtmlCleaner

Constructor Summary
EntityDocument()
           
EntityDocument(String baseURL, String url, EntityDescriptor entDesc, TagNode document)
          Initializes all fields.
 
Method Summary
 boolean equals(Object obj)
           
 String getBaseURL()
          Returns the base URL of the host from which this document originated.
 TagNode getDocument()
          Returns the TagNode (HtmlCleaner) containing the document tree.
 EntityDescriptor getEntityDescriptor()
          Returns the entity descriptor describing this document.
 String getUrl()
          Returns the absolute URL of the document.
 int hashCode()
           
 void setBaseURL(String baseURL)
           
 void setDocument(TagNode document)
           
 void setEntityDescriptor(EntityDescriptor entDesc)
           
 void setUrl(String url)
           
 String toString()
           
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

EntityDocument

public EntityDocument()

EntityDocument

public EntityDocument(String baseURL,
                      String url,
                      EntityDescriptor entDesc,
                      TagNode document)
Initializes all fields.

Method Detail

getDocument

public TagNode getDocument()
Returns the TagNode (HtmlCleaner) containing the document tree.


setDocument

public void setDocument(TagNode document)

getBaseURL

public String getBaseURL()
Returns the base URL of the host from which this document originated.


setBaseURL

public void setBaseURL(String baseURL)

getUrl

public String getUrl()
Returns the absolute URL of the document.


setUrl

public void setUrl(String url)

getEntityDescriptor

public EntityDescriptor getEntityDescriptor()
Returns the entity descriptor describing this document.


setEntityDescriptor

public void setEntityDescriptor(EntityDescriptor entDesc)

equals

public boolean equals(Object obj)
Overrides:
equals in class Object

hashCode

public int hashCode()
Overrides:
hashCode in class Object
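
This page does not say which fields equals and hashCode compare. Purely to illustrate why the two are overridden together, the sketch below assumes (hypothetically) that a document is identified by its absolute URL alone. UrlKeyedDocument and countDistinct are stand-ins invented for this example and are not the real EntityDocument implementation.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class EqualitySketch {

    // Hypothetical stand-in: equality is ASSUMED to depend on the URL only.
    static final class UrlKeyedDocument {
        private final String url;

        UrlKeyedDocument(String url) {
            this.url = url;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (!(obj instanceof UrlKeyedDocument)) return false;
            return Objects.equals(url, ((UrlKeyedDocument) obj).url);
        }

        // Derived from the same field as equals, so equal objects share a hash.
        @Override
        public int hashCode() {
            return Objects.hashCode(url);
        }
    }

    // Counts distinct documents by adding them to a hash-based set.
    static int countDistinct(String... urls) {
        Set<UrlKeyedDocument> seen = new HashSet<>();
        for (String u : urls) {
            seen.add(new UrlKeyedDocument(u));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(countDistinct(
                "http://example.org/a", "http://example.org/a", "http://example.org/b"));
    }
}
```

Because both methods use the same field, hash-based collections such as HashSet treat two instances with the same URL as one element.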

toString

public String toString()
Overrides:
toString in class Object