java.lang.Object
  xsmeral.semnet.crawler.HostManager
public class HostManager
Host manager for HTMLCrawler.
Manages HostDescriptors, EntityDescriptors, mapping of hosts to their IDs and persisting hosts in DB.
Mapping hosts to IDs is mainly a performance measure.
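The performance remark can be illustrated with a minimal sketch of a two-way mapping between host addresses and integer IDs. The class HostIdSketch and its methods are hypothetical and exist only for this illustration; they are not part of HostManager or HostManager.Mapper, and the sketch keeps everything in memory rather than in a DB.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the real HostManager.Mapper.
class HostIdSketch {
    private final Map<String, Integer> idsByHost = new HashMap<>();
    private final List<String> hostsById = new ArrayList<>();

    // Returns the existing ID for the host, or assigns the next free one.
    int idFor(String host) {
        Integer id = idsByHost.get(host);
        if (id == null) {
            id = hostsById.size();
            idsByHost.put(host, id);
            hostsById.add(host);
        }
        return id;
    }

    // Returns the host for the ID, or null if the ID is not mapped.
    String hostFor(int id) {
        return (id >= 0 && id < hostsById.size()) ? hostsById.get(id) : null;
    }
}
```

Once a host has an ID, other structures can key on a small integer instead of repeating and comparing the full address string, which is one plausible reading of the performance remark.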
Nested Class Summary | |
---|---|
static class |
HostManager.Mapper
Mapper is responsible for mapping hosts to IDs. |
Constructor Summary | |
---|---|
HostManager(RDBLayer db)
Creates manager instance for the specified DB layer. |
Method Summary | |
---|---|
int |
addHost(String address)
Adds the host with the specified address to the DB. |
void |
close()
Closes the DB connection. |
Collection<HostDescriptor> |
getDescriptors()
Returns descriptors of the managed hosts. |
EntityDescriptor |
getEntityDescriptor(int hostId,
String pattern)
Returns the EntityDescriptor associated with this URL pattern, or null if the pattern does not represent an entity in the provided host. |
Map<Pattern,EntityDescriptor> |
getEntityDescriptorMap(int hostId)
Returns the map between entity URL patterns and their entity descriptors for the host with given ID, or null if the ID is not mapped to any managed host. |
HostDescriptor |
getHostDescriptor(int id)
Returns descriptor of the host associated with the given ID or null if the ID is not mapped to any managed host. |
Collection<Integer> |
getHostIds()
Returns IDs of the managed hosts. |
HostMapper |
getMapper()
Returns the mapper instance associated with this HostManager. |
static HostMapper |
getMapper(RDBLayer db)
Returns a mapper instance for the specified DB. |
Pattern |
getPattern(int hostId,
String relativeUrl)
Returns the Pattern (entity or source) that matches the given relative URL or null if no match is found. |
Map<Pattern,Integer> |
getSourceURLMap(int hostId)
Returns the map between source URL patterns and their update frequencies for the host with given ID, or null if the ID is not mapped to any managed host. |
boolean |
isEntity(int hostId,
Pattern pattern)
Indicates whether the specified pattern represents an entity in the given host. |
boolean |
isSource(int hostId,
Pattern pattern)
Indicates whether the specified pattern represents a source URL in the given host. |
void |
loadHosts(Collection<HostDescriptor> hosts)
Initializes the manager with given set of hosts. |
static void |
main(String[] args)
Provides a CLI for simple management of hosts. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public HostManager(RDBLayer db) throws SQLException
Throws:
SQLException - In case of a problem with the DB layer.
Method Detail |
---|
public void loadHosts(Collection<HostDescriptor> hosts) throws SQLException
Throws:
SQLException - In case of a problem with the DB layer.
public static HostMapper getMapper(RDBLayer db)
HostManager.Mapper(db)
public HostMapper getMapper()
public final int addHost(String address) throws SQLException
Parameters:
address - String containing the URL of the host
Throws:
SQLException - If a SQL command fails
public Collection<HostDescriptor> getDescriptors()
public Collection<Integer> getHostIds()
public HostDescriptor getHostDescriptor(int id)
public Map<Pattern,Integer> getSourceURLMap(int hostId)
See Also:
HostDescriptor
public Map<Pattern,EntityDescriptor> getEntityDescriptorMap(int hostId)
public Pattern getPattern(int hostId, String relativeUrl)
Parameters:
hostId - ID of the host to search for patterns.
relativeUrl - URL to match
See Also:
HostDescriptor, EntityDescriptor
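A lookup of this kind can be sketched as follows. The sketch assumes that Pattern is java.util.regex.Pattern, that a match means the whole relative URL matches, and that entity patterns are checked before source patterns; none of these details is confirmed by this page. PatternLookupSketch and findPattern are hypothetical names, and the map values are simplified stand-ins for EntityDescriptor and the update frequency.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in for the per-host pattern data; not the real HostManager.
class PatternLookupSketch {
    // Entity URL patterns mapped to a placeholder entity name (assumption).
    final Map<Pattern, String> entityPatterns = new LinkedHashMap<>();
    // Source URL patterns mapped to an update frequency (unit is an assumption).
    final Map<Pattern, Integer> sourcePatterns = new LinkedHashMap<>();

    // Returns the first pattern that matches the whole relative URL, or null.
    Pattern findPattern(String relativeUrl) {
        for (Pattern p : entityPatterns.keySet()) {
            if (p.matcher(relativeUrl).matches()) {
                return p;
            }
        }
        for (Pattern p : sourcePatterns.keySet()) {
            if (p.matcher(relativeUrl).matches()) {
                return p;
            }
        }
        return null;
    }
}
```

With an entity pattern such as /item/\d+ registered, findPattern("/item/42") returns that Pattern instance, while an unregistered path such as "/about" yields null.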
public boolean isEntity(int hostId, Pattern pattern)
Parameters:
hostId - ID of the given host
pattern - The pattern to check
See Also:
EntityDescriptor
public boolean isSource(int hostId, Pattern pattern)
Parameters:
hostId - ID of the given host
pattern - The pattern to check
See Also:
HostDescriptor
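One detail is worth keeping in mind if Pattern here is java.util.regex.Pattern and the checks are backed by maps keyed by Pattern (both are assumptions, not confirmed by this page): java.util.regex.Pattern does not override equals or hashCode, so a map lookup only succeeds for the same Pattern instance that was stored, not for a separately compiled pattern with identical regex text. The sketch below uses the hypothetical class PatternKindSketch to show the effect.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration only; not the real HostManager.
class PatternKindSketch {
    final Map<Pattern, String> entities = new HashMap<>();
    final Map<Pattern, Integer> sources = new HashMap<>();

    // Both checks rely on Pattern's identity-based equals/hashCode.
    boolean isEntity(Pattern p) {
        return entities.containsKey(p);
    }

    boolean isSource(Pattern p) {
        return sources.containsKey(p);
    }
}
```

Under these assumptions, the safe usage is to pass along the Pattern object obtained from the manager (for example from getPattern) rather than compiling a new one from the same regex string.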
public EntityDescriptor getEntityDescriptor(int hostId, String pattern)
Returns the EntityDescriptor associated with this URL pattern, or null if the pattern does not represent an entity in the provided host.
Parameters:
hostId - ID of the host
pattern - The URL pattern
public void close()
See Also:
RDBLayer
public static void main(String[] args)
Commands include list, remove and reset.
More details are provided upon running the class without arguments.