xsmeral.semnet.crawler.model
Class URLEntry

java.lang.Object
  extended by xsmeral.semnet.crawler.model.URLEntry

public class URLEntry
extends Object

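The main entity used by URLManager. Contains the URL (host and path), crawl metadata (last visit date, visit count, update frequency), the entity flag with its matching pattern, and the working status and score.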


Field Summary
static int DAY_SEC
          Number of seconds in a day.
static int DEF_ENTITY_UPDATE
          Default update frequency for entity URLs.
static int DEF_SOURCE_UPDATE
          Default update frequency for source URLs.
static short SCORE_MIN
          The lower bound of score.
static int UPDATE_DAY
          Daily update frequency.
static int UPDATE_MONTH
          Monthly (30 days) update frequency.
static int UPDATE_WEEK
          Weekly update frequency.
static int UPDATE_YEAR
          Yearly (365 days) update frequency.
 
Constructor Summary
URLEntry()
           
URLEntry(int id, String host, String path, Date lastVisited, int visitCount, int updateFreq, boolean entity, String pattern, boolean working, short score)
          This constructor is meant to be called only by URLManager, which populates it with values from DB.
URLEntry(String host, String path, Date lastVisited, int visitCount, int updateFreq, boolean entity, String pattern, boolean working, short score)
          This constructor is used to create instances that are to be added to DB using URLManager.
 
Method Summary
 boolean equals(Object obj)
           
 String getHost()
          Returns host part of the URL.
 int getId()
          Returns ID (generated by DB).
 Date getLastVisited()
          Returns a Date indicating when this URL was last visited by the crawler.
 String getPath()
          Returns path part of the URL, relative to the host name.
 String getPattern()
          Returns a regex pattern (as a string) that matches the path of this URL and identifies the corresponding EntityDescriptor.
 short getScore()
          Returns a number indicating how likely the URL is to work.
 int getUpdateFreq()
          Returns update frequency in seconds.
 String getUrl()
          Returns the full URL.
 int getVisitCount()
          Returns number of times this URL has been visited by the crawler.
 int hashCode()
           
 boolean isEntity()
          Indicates whether this URL represents an entity or a source URL.
 boolean isWorking()
          Indicates whether this URL is working (that is, whether the last visit by the crawler completed without errors).
 void setEntity(boolean entity)
           
 void setHost(String host)
           
 void setId(int id)
           
 void setLastVisited(Date lastVisited)
           
 void setPath(String path)
           
 void setPattern(String pattern)
           
 void setScore(short score)
           
 void setUpdateFreq(int updateFreq)
           
 void setVisitCount(int visitCount)
           
 void setWorking(boolean working)
           
 String toString()
           
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

DEF_ENTITY_UPDATE

public static final int DEF_ENTITY_UPDATE
Default update frequency for entity URLs.

See Also:
Constant Field Values

DEF_SOURCE_UPDATE

public static final int DEF_SOURCE_UPDATE
Default update frequency for source URLs.

See Also:
Constant Field Values

DAY_SEC

public static final int DAY_SEC
Number of seconds in a day.

See Also:
Constant Field Values

UPDATE_DAY

public static final int UPDATE_DAY
Daily update frequency.

See Also:
Constant Field Values

UPDATE_WEEK

public static final int UPDATE_WEEK
Weekly update frequency.

See Also:
Constant Field Values

UPDATE_MONTH

public static final int UPDATE_MONTH
Monthly (30 days) update frequency.

See Also:
Constant Field Values

UPDATE_YEAR

public static final int UPDATE_YEAR
Yearly (365 days) update frequency.

See Also:
Constant Field Values
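The relationship between the update-frequency constants can be sketched as follows. This is a minimal illustration, not the class's actual source: the class name UpdateFrequencySketch is hypothetical, and it assumes the UPDATE_* constants are expressed in seconds as multiples of DAY_SEC, which is consistent with getUpdateFreq() returning seconds.

```java
// Hypothetical stand-in; the real constant values are listed under "Constant Field Values".
final class UpdateFrequencySketch {
    // Number of seconds in a day.
    static final int DAY_SEC = 24 * 60 * 60;

    // Assumption: each update frequency is a whole number of days, in seconds.
    static final int UPDATE_DAY = DAY_SEC;
    static final int UPDATE_WEEK = 7 * DAY_SEC;
    static final int UPDATE_MONTH = 30 * DAY_SEC;  // "Monthly (30 days)"
    static final int UPDATE_YEAR = 365 * DAY_SEC;  // "Yearly (365 days)"
}
```

Under these assumptions, even the yearly value (31,536,000 seconds) fits comfortably in an int.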

SCORE_MIN

public static final short SCORE_MIN
The lower bound of score. When a URL entry reaches this score, it is marked as non-working.

See Also:
Constant Field Values
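One way the interaction between the score and the working flag might look is sketched below. Everything here is an assumption made for illustration: the class ScoreSketch, the method recordFailure, the value -5 for SCORE_MIN, and the one-point penalty per failed visit are all hypothetical. The documentation only states that an entry reaching SCORE_MIN is marked as non-working.

```java
// Hypothetical sketch; not the URLManager's actual scoring logic.
final class ScoreSketch {
    // Assumed value for illustration only; see "Constant Field Values" for the real one.
    static final short SCORE_MIN = -5;

    short score = 0;        // assumed starting score
    boolean working = true; // mirrors isWorking()

    // Lowers the score by one after a failed visit, never going below SCORE_MIN,
    // and marks the entry as non-working once the lower bound is reached.
    void recordFailure() {
        if (score > SCORE_MIN) {
            score--;
        }
        if (score <= SCORE_MIN) {
            working = false;
        }
    }
}
```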

Constructor Detail

URLEntry

public URLEntry()

URLEntry

public URLEntry(int id,
                String host,
                String path,
                Date lastVisited,
                int visitCount,
                int updateFreq,
                boolean entity,
                String pattern,
                boolean working,
                short score)
This constructor is meant to be called only by URLManager, which populates it with values from DB. This constructor should be used instead

See Also:
URLManager

URLEntry

public URLEntry(String host,
                String path,
                Date lastVisited,
                int visitCount,
                int updateFreq,
                boolean entity,
                String pattern,
                boolean working,
                short score)
This constructor is used to create instances that are to be added to DB using URLManager.

See Also:
URLManager
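The call shape of the nine-argument constructor can be illustrated with a stand-in class. UrlEntryStandIn and newSourceEntry are hypothetical names that mirror only the documented parameter list. The initial values chosen for a not-yet-visited source URL (null last-visit date, zero visits, weekly update, null pattern, working, score 0) are assumptions, not documented defaults.

```java
import java.util.Date;

// Hypothetical stand-in mirroring the documented parameter list; not the real URLEntry.
final class UrlEntryStandIn {
    final String host;
    final String path;
    final Date lastVisited;
    final int visitCount;
    final int updateFreq; // seconds
    final boolean entity;
    final String pattern;
    final boolean working;
    final short score;

    UrlEntryStandIn(String host, String path, Date lastVisited, int visitCount,
                    int updateFreq, boolean entity, String pattern,
                    boolean working, short score) {
        this.host = host;
        this.path = path;
        this.lastVisited = lastVisited;
        this.visitCount = visitCount;
        this.updateFreq = updateFreq;
        this.entity = entity;
        this.pattern = pattern;
        this.working = working;
        this.score = score;
    }

    // Builds an entry for a source URL that has not been crawled yet.
    // All initial values below are assumptions for illustration.
    static UrlEntryStandIn newSourceEntry(String host, String path) {
        int weeklySec = 7 * 24 * 60 * 60;
        return new UrlEntryStandIn(host, path, null, 0, weeklySec,
                                   false, null, true, (short) 0);
    }
}
```

Note the explicit (short) cast on the score argument: an int literal is not narrowed to short automatically in a constructor call.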

Method Detail

getId

public int getId()
Returns ID (generated by DB). The ID is valid only if this instance was retrieved from URLManager.

See Also:
URLManager

setId

public void setId(int id)

getLastVisited

public Date getLastVisited()
Returns a Date indicating when this URL was last visited by the crawler.


setLastVisited

public void setLastVisited(Date lastVisited)

getUpdateFreq

public int getUpdateFreq()
Returns update frequency in seconds.
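Because the update frequency is in seconds while java.util.Date works in milliseconds, a re-visit check needs a unit conversion. The sketch below shows one way to combine getLastVisited() and getUpdateFreq(). The class RevisitSketch and the method isDue are hypothetical, and treating a null last-visit date as "due" is an assumption.

```java
import java.util.Date;

// Hypothetical helper; not part of the documented API.
final class RevisitSketch {
    // Returns true when at least updateFreqSec seconds have passed since lastVisited.
    static boolean isDue(Date lastVisited, int updateFreqSec, Date now) {
        if (lastVisited == null) {
            return true; // assumption: a URL that was never visited is always due
        }
        // Multiply by 1000L so the arithmetic is done in long and cannot overflow int.
        long dueAtMillis = lastVisited.getTime() + updateFreqSec * 1000L;
        return now.getTime() >= dueAtMillis;
    }
}
```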


setUpdateFreq

public void setUpdateFreq(int updateFreq)

getUrl

public String getUrl()
Returns the full URL.


getHost

public String getHost()
Returns host part of the URL.


setHost

public void setHost(String host)

getPath

public String getPath()
Returns path part of the URL, relative to the host name.
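The split of a URL into a host part and a path part, and its recombination into the full URL, can be sketched with java.net.URI. The class UrlPartsSketch and its methods are hypothetical. Whether URLEntry stores the scheme together with the host, and whether the query string belongs to the path, is not stated in this documentation, so both choices below are assumptions.

```java
import java.net.URI;

// Hypothetical helper; illustrates one possible host/path split.
final class UrlPartsSketch {
    // Assumption: the host part keeps the scheme, e.g. "http://example.org".
    static String hostPart(String url) {
        URI uri = URI.create(url);
        return uri.getScheme() + "://" + uri.getAuthority();
    }

    // Assumption: the path part includes the query string, if any.
    static String pathPart(String url) {
        URI uri = URI.create(url);
        String path = uri.getRawPath();
        String query = uri.getRawQuery();
        return query == null ? path : path + "?" + query;
    }

    // Recombines the two parts into the full URL.
    static String fullUrl(String host, String path) {
        return host + path;
    }
}
```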


setPath

public void setPath(String path)

getVisitCount

public int getVisitCount()
Returns number of times this URL has been visited by the crawler.


setVisitCount

public void setVisitCount(int visitCount)

isEntity

public boolean isEntity()
Indicates whether this URL represents an entity or a source URL.


setEntity

public void setEntity(boolean entity)

getPattern

public String getPattern()
Returns a regex pattern (as a string) that matches the path of this URL and identifies the corresponding EntityDescriptor.

See Also:
EntityDescriptor
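Since the pattern is returned as a string, a caller has to compile it before testing a path against it. The sketch below uses java.util.regex for that. PathPatternSketch and pathMatches are hypothetical names, and requiring the whole path to match (rather than a substring) is an assumption about how the pattern is applied.

```java
import java.util.regex.Pattern;

// Hypothetical helper; not part of the documented API.
final class PathPatternSketch {
    // Returns true when the entire path matches the given regex pattern string.
    static boolean pathMatches(String pattern, String path) {
        return Pattern.compile(pattern).matcher(path).matches();
    }
}
```

For example, the pattern "/movies/\\d+" would match the path "/movies/42" but not "/movies/about". When many paths are checked against the same pattern, compiling it once and reusing the Pattern instance avoids repeated compilation.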

setPattern

public void setPattern(String pattern)

isWorking

public boolean isWorking()
Indicates whether this URL is working (that is, whether the last visit by the crawler completed without errors).


setWorking

public void setWorking(boolean working)

getScore

public short getScore()
Returns a number indicating how likely the URL is to work.


setScore

public void setScore(short score)

equals

public boolean equals(Object obj)
Overrides:
equals in class Object

hashCode

public int hashCode()
Overrides:
hashCode in class Object

toString

public String toString()
Overrides:
toString in class Object