A B C D E F G H I J L M N O P Q R S T U V W X

S

SCORE_MIN - Static variable in class xsmeral.semnet.crawler.model.URLEntry
The lower bound of score.
scrape(EntityDocument) - Method in class xsmeral.semnet.scraper.AbstractScraper
Scrapes one document and outputs any number of facts.
ScraperWrapper - Class in xsmeral.semnet.scraper
This processor works as a router, dispatching entity documents to scrapers.
ScraperWrapper() - Constructor for class xsmeral.semnet.scraper.ScraperWrapper
 
SesameWriter - Class in xsmeral.semnet.sink
Writes Statements to a Sesame database.
SesameWriter() - Constructor for class xsmeral.semnet.sink.SesameWriter
 
SesameWriter(RepositoryConnection) - Constructor for class xsmeral.semnet.sink.SesameWriter
Initializes with the specified connection.
setAttached(List<Configuration>) - Method in class xsmeral.semnet.manager.ProcessingJob
 
setBaseURL(String) - Method in class xsmeral.semnet.crawler.model.EntityDocument
 
setBaseURL(String) - Method in class xsmeral.semnet.crawler.model.HostDescriptor
 
setCharset(String) - Method in class xsmeral.semnet.crawler.model.HostDescriptor
 
setConnTimeout(int) - Static method in class xsmeral.semnet.crawler.util.ConnectionManager
Corresponds to URLConnection.setConnectTimeout(int)
setCrawlDelay(Integer) - Method in class xsmeral.semnet.crawler.model.HostDescriptor
 
setDBLayer(RDBLayer) - Method in class xsmeral.semnet.crawler.model.CrawlerConfiguration
The relational DB layer used by the crawler for state persistence (URL storage)
setDescription(String) - Method in class xsmeral.semnet.manager.ProcessingJob
 
setDocument(TagNode) - Method in class xsmeral.semnet.crawler.model.EntityDocument
 
setEntity(boolean) - Method in class xsmeral.semnet.crawler.model.URLEntry
 
setEntityDescriptor(EntityDescriptor) - Method in class xsmeral.semnet.crawler.model.EntityDocument
 
setEntityDescriptors(Collection<EntityDescriptor>) - Method in class xsmeral.semnet.crawler.model.HostDescriptor
 
setFakeReferrer(boolean) - Method in class xsmeral.semnet.crawler.HTMLCrawler
 
setFakeReferrer(boolean) - Method in class xsmeral.semnet.crawler.model.CrawlerConfiguration
Indicates whether the HTTP Referer header should be set to the base URL of the host.
setFollowRedirects(boolean) - Static method in class xsmeral.semnet.crawler.util.ConnectionManager
Corresponds to HttpURLConnection.setFollowRedirects(boolean)
setGlobalCrawlDelayMinimum(int) - Method in class xsmeral.semnet.crawler.HTMLCrawler
 
setGlobalCrawlDelayMinimum(int) - Method in class xsmeral.semnet.crawler.model.CrawlerConfiguration
Minimum crawl delay, in milliseconds.
setHost(String) - Method in class xsmeral.semnet.crawler.model.URLEntry
 
setHostDesc(HostDescriptor) - Method in class xsmeral.semnet.crawler.model.EntityDescriptor
Sets the owning HostDescriptor.
setHosts(Collection<HostDescriptor>) - Method in class xsmeral.semnet.crawler.model.CrawlerConfiguration
Hosts crawled by the crawler
setId(int) - Method in class xsmeral.semnet.crawler.model.URLEntry
 
setIgnoresPolicy(boolean) - Method in class xsmeral.semnet.crawler.HTMLCrawler
 
setLastVisited(Date) - Method in class xsmeral.semnet.crawler.model.URLEntry
 
setMapping(Mapping) - Method in class xsmeral.semnet.mapper.StatementMapper
 
setName(String) - Method in class xsmeral.semnet.crawler.model.HostDescriptor
 
setName(String) - Method in class xsmeral.semnet.manager.ProcessingJob
 
setPath(String) - Method in class xsmeral.semnet.crawler.model.URLEntry
 
setPattern(String) - Method in class xsmeral.semnet.crawler.model.URLEntry
 
setPolicyIgnored(boolean) - Method in class xsmeral.semnet.crawler.model.CrawlerConfiguration
Indicates whether the Robots Exclusion Protocol is adhered to or ignored.
setProcessorChain(List<Configuration>) - Method in class xsmeral.semnet.manager.ProcessingJob
 
setReadTimeout(int) - Static method in class xsmeral.semnet.crawler.util.ConnectionManager
Corresponds to URLConnection.setReadTimeout(int)
setRepository(Repository) - Method in class xsmeral.semnet.sink.RepositoryFactory
Called by the factory, sets the initialized repository.
setScore(short) - Method in class xsmeral.semnet.crawler.model.URLEntry
 
setSourceFirst(boolean) - Method in class xsmeral.semnet.crawler.model.HostDescriptor
 
setSourceURLPatterns(Map<Pattern, Integer>) - Method in class xsmeral.semnet.crawler.model.HostDescriptor
 
setThreadsPerHost(int) - Method in class xsmeral.semnet.crawler.model.CrawlerConfiguration
Number of crawling threads per host
setUpdateFreq(int) - Method in class xsmeral.semnet.crawler.model.URLEntry
 
setUrl(String) - Method in class xsmeral.semnet.crawler.model.EntityDocument
 
setUserAgent(String) - Static method in class xsmeral.semnet.crawler.util.ConnectionManager
Same as URLConnection.setRequestProperty("User-Agent", userAgent)
setVisitCount(int) - Method in class xsmeral.semnet.crawler.model.URLEntry
 
setWorking(boolean) - Method in class xsmeral.semnet.crawler.model.URLEntry
 
sourceFirst() - Method in interface xsmeral.semnet.crawler.URLManager.OrderClause
Orders by entity field, so that source URLs are returned first.
sourceFirst() - Method in class xsmeral.semnet.crawler.URLManager.QueryBuilderImpl
 
StatementMapper - Class in xsmeral.semnet.mapper
An object processor (a filter) that maps URIs in the subject or predicate of a statement to different URIs.
StatementMapper() - Constructor for class xsmeral.semnet.mapper.StatementMapper
 
StatementMapper(Mapping) - Constructor for class xsmeral.semnet.mapper.StatementMapper
 
StdErrWriter - Class in xsmeral.semnet.util
Simple processor that writes anything it receives to the standard error output.
StdErrWriter() - Constructor for class xsmeral.semnet.util.StdErrWriter
 
StdOutWriter - Class in xsmeral.semnet.util
Simple processor that outputs anything to the standard output.
StdOutWriter() - Constructor for class xsmeral.semnet.util.StdOutWriter
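Several ConnectionManager entries above (setConnTimeout, setReadTimeout, setFollowRedirects, setUserAgent) are described only by the java.net call they correspond to. The following is a minimal sketch of those standard java.net calls applied to a single connection. It does not show how ConnectionManager is implemented; the class name ConnectionSettingsSketch, the method configure, the example URL, and the user agent string are hypothetical and used for illustration only.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLConnection;

// Hypothetical helper, not part of xsmeral.semnet.
final class ConnectionSettingsSketch {

    /**
     * Creates an HTTP connection object with the given settings applied.
     * openConnection() does not contact the server; that only happens on
     * connect() or when the response is read.
     */
    static HttpURLConnection configure(String url, int connTimeoutMs, int readTimeoutMs,
                                       boolean followRedirects, String userAgent) {
        try {
            // Static, JVM-wide default used by HttpURLConnection instances created afterwards.
            HttpURLConnection.setFollowRedirects(followRedirects);

            URLConnection conn = URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(connTimeoutMs);            // URLConnection.setConnectTimeout(int)
            conn.setReadTimeout(readTimeoutMs);               // URLConnection.setReadTimeout(int)
            conn.setRequestProperty("User-Agent", userAgent); // sets the User-Agent request header
            return (HttpURLConnection) conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection c =
                configure("http://example.com/", 5000, 10000, true, "semnet-sketch/0.1");
        System.out.println("connect timeout (ms): " + c.getConnectTimeout());
        System.out.println("read timeout (ms): " + c.getReadTimeout());
        System.out.println("user agent: " + c.getRequestProperty("User-Agent"));
    }
}
```

Note that HttpURLConnection.setFollowRedirects(boolean) is static and affects the whole JVM, which matches the corresponding ConnectionManager entry being listed as a static method.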
 
