All Implemented Interfaces:
Configurable, Tool
public class DBCountPageView
extends Configured
implements Tool
This is a demonstration program that uses DBInputFormat to read the
input data from a database and DBOutputFormat to write the results
back to the database.
The program first creates the necessary tables, populates the input table,
and then runs the MapReduce job.
The input data is a mini access log with a <url,referrer,time>
schema. The output is the number of pageviews of each url in the log,
with the schema <url,pageview>.
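The aggregation described above can be sketched in plain Java, without Hadoop or a database. This is only an illustration of the <url,referrer,time> to <url,pageview> transformation, not the code of DBCountPageView itself; the names PageviewCountSketch, AccessRecord and countPageviews are hypothetical and chosen for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone sketch; the real job does this with a mapper and reducer.
public class PageviewCountSketch {

    /** One row of the input table, following the <url,referrer,time> schema. */
    static final class AccessRecord {
        final String url;
        final String referrer;
        final long time;

        AccessRecord(String url, String referrer, long time) {
            this.url = url;
            this.referrer = referrer;
            this.time = time;
        }
    }

    /** Counts the rows per url, yielding the <url,pageview> output schema. */
    static Map<String, Long> countPageviews(List<AccessRecord> log) {
        // TreeMap keeps the urls sorted so the result prints deterministically.
        Map<String, Long> counts = new TreeMap<>();
        for (AccessRecord record : log) {
            // Each log row contributes one pageview to its url.
            counts.merge(record.url, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<AccessRecord> log = Arrays.asList(
            new AccessRecord("/a", null, 1000L),
            new AccessRecord("/b", "/a", 1001L),
            new AccessRecord("/a", "/b", 1002L));
        System.out.println(countPageviews(log)); // prints {/a=2, /b=1}
    }
}
```

In the MapReduce version, the per-row "emit 1 for this url" step would correspond to the map phase and the summation per url to the reduce phase.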
When called with no arguments the program starts a local HSQLDB server, and
uses this database for storing/retrieving the data.
This program requires some additional configuration for HSQLDB.
The hsqldb jar should be added to the classpath:
export HADOOP_CLASSPATH=share/hadoop/mapreduce/lib-examples/hsqldb-2.0.0.jar
The hsqldb jar should also be passed with the -libjars
argument when running the program with hadoop:
-libjars share/hadoop/mapreduce/lib-examples/hsqldb-2.0.0.jar