java.lang.Object
  org.jasig.portlet.athletics.dao.ScreenScrapingAthleticsDaoImpl
public class ScreenScrapingAthleticsDaoImpl
ScreenScrapingAthleticsDaoImpl provides a reusable athletics DAO implementation for collecting information from HTML content. Whenever possible, results should instead be retrieved from a well-structured web service or other high-quality data source. This implementation uses OWASP AntiSamy to clean and validate external HTML pages, then applies an XSLT to transform the cleaned HTML into the portlet's default athletics feed XML structure, at which point the data can be deserialized.
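The XSLT step described above can be illustrated with a minimal sketch that uses only the JDK's javax.xml.transform API. It assumes the input has already been cleaned into well-formed markup (the AntiSamy step is not shown). The class name XsltStepSketch, the stylesheet, and the sport and story element names are hypothetical placeholders, not the portlet's actual XSLT or feed schema.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltStepSketch {

    // Hypothetical stylesheet: copies the text of each <li class="story">
    // into a <story> element. Element names are placeholders only.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:template match=\"/\">"
      + "<sport><xsl:for-each select=\"//li[@class='story']\">"
      + "<story><xsl:value-of select=\"normalize-space(.)\"/></story>"
      + "</xsl:for-each></sport>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to already-cleaned, well-formed markup
    // and returns the transformed XML as a string.
    static String transform(String cleanHtml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(
            new StreamSource(new StringReader(cleanHtml)),
            new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Sample input standing in for a cleaned athletics page.
        String cleanHtml = "<html><body><ul>"
            + "<li class=\"story\">Team wins opener</li>"
            + "<li class=\"ad\">Buy tickets</li>"
            + "</ul></body></html>";
        System.out.println(transform(cleanHtml));
    }
}
```

Only list items marked as stories are carried into the output; everything else on the page is dropped. This is the main reason to put the page-specific selection logic in the stylesheet rather than in Java code.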
| Field Summary | |
|---|---|
| protected org.apache.commons.logging.Log | log |
| Constructor Summary | |
|---|---|
| ScreenScrapingAthleticsDaoImpl() | |
| Method Summary | | |
|---|---|---|
| protected String | getCleanedHtmlContent(String html) | Clean and validate raw HTML, returning valid XML. |
| AthleticsFeed | getFeed() | Return an athletics feed representing all current news stories and competitions for all known sports. |
| protected String | getHtmlContent(String url) | Get the raw HTML content for a specified URL. |
| Sport | getSport(String sportKey) | Return details, news stories, and competitions for an individual sport. |
| protected Sport | getSportForXml(String xml) | Deserialize athletics feed XML into a Sport object. |
| Map<String,String> | getSportUrls() | Get the mapping of URLs by sport. |
| protected String | getXml(String cleanHtml) | Transform clean and valid HTML into the portlet's default athletics format XML feed. |
| protected void | postProcessSport(Sport sport) | Optional post-processing method that allows subclasses to add custom cleanup logic after deserialization. |
| void | setPolicy(org.springframework.core.io.Resource config) | Set the AntiSamy policy file to be used to clean and validate external HTML. |
| void | setSportUrls(Map<String,String> urlMap) | Set the mapping of URLs for each sport. |
| void | setXslt(org.springframework.core.io.Resource xslt) | Set the XSLT to be used to transform the cleaned and validated HTML to the portlet's default XML structure. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
protected org.apache.commons.logging.Log log
| Constructor Detail |
|---|
public ScreenScrapingAthleticsDaoImpl()
| Method Detail |
|---|
public void setXslt(org.springframework.core.io.Resource xslt)
xslt -
public void setPolicy(org.springframework.core.io.Resource config)
throws org.owasp.validator.html.PolicyException,
IOException
config -
org.owasp.validator.html.PolicyException
IOException
public void setSportUrls(Map<String,String> urlMap)
urlMap -
public Map<String,String> getSportUrls()
public AthleticsFeed getFeed()
IAthleticsDao
getFeed in interface IAthleticsDao
public Sport getSport(String sportKey)
IAthleticsDao
getSport in interface IAthleticsDao
protected String getHtmlContent(String url)
throws org.apache.http.client.ClientProtocolException,
IOException
url -
org.apache.http.client.ClientProtocolException
IOException
protected String getCleanedHtmlContent(String html)
throws org.owasp.validator.html.ScanException,
org.owasp.validator.html.PolicyException
html -
org.owasp.validator.html.ScanException
org.owasp.validator.html.PolicyException
protected String getXml(String cleanHtml)
throws TransformerException,
IOException
cleanHtml -
TransformerException
IOException
protected Sport getSportForXml(String xml)
throws JAXBException
xml -
JAXBException
protected void postProcessSport(Sport sport)
sport -
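A subclass hook of this kind might be used as in the sketch below. Everything here is a stand-in for illustration: the Sport class with a single name property, the BaseDao class, its finish method, and the no-op default are assumptions, since this page does not show the real Sport fields or the body of ScreenScrapingAthleticsDaoImpl.

```java
class PostProcessSketch {

    // Stand-in for the portlet's Sport model. The name property is an
    // assumption; the real class's fields are not shown on this page.
    static class Sport {
        private String name;
        String getName() { return name; }
        void setName(String name) { this.name = name; }
    }

    // Stand-in for the DAO. Only the hook and a simplified caller are
    // modelled; the no-op default is an assumption.
    static class BaseDao {
        Sport finish(Sport sport) {
            postProcessSport(sport); // hook runs after deserialization
            return sport;
        }

        protected void postProcessSport(Sport sport) {
            // default: nothing to clean up
        }
    }

    // Hypothetical subclass that trims stray whitespace left by scraping.
    static class TrimmingDao extends BaseDao {
        @Override
        protected void postProcessSport(Sport sport) {
            if (sport.getName() != null) {
                sport.setName(sport.getName().trim());
            }
        }
    }

    public static void main(String[] args) {
        Sport sport = new Sport();
        sport.setName("  Baseball ");
        new TrimmingDao().finish(sport);
        System.out.println("[" + sport.getName() + "]");
    }
}
```

Keeping site-specific cleanup in an overridden hook leaves the fetch, clean, transform, and deserialize steps of the base class untouched.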