Package org.dspace.app.statistics
Class LogAnalyser
java.lang.Object
org.dspace.app.statistics.LogAnalyser
This class performs all the actual analysis of a given set of DSpace log
files. Most input can be configured; use the -help flag for a full list
of usage information.
The output is plain text and forms an "aggregation" file, which the related ReportGenerator class can then use for display.
- Author:
- Richard Jones
-
Method Summary
Modifier and Type / Method / Description
static String[] analyseQuery(String query)
Take a search query string and pull out all of the meaningful information from it, giving the results in the form of a String array, a single word to each element.
static String createOutput()
Generate the analyser's output to the specified out file.
static String getConfigFile()
Get the current config file name.
static File[] getLogFiles(String logDir)
Get an array of file objects representing the passed log directory.
static LogLine getLogLine(String line)
Split the given line into its relevant segments if applicable (i.e. the line matches the required regular expression).
static Integer getNumItems(Context context)
Get the total number of items in the archive at time of execution, ignoring all other constraints.
static Integer getNumItems(Context context, String type)
Get the number of items in the archive which were accessioned between the provided start and end dates, with the given value for the DC field 'type' (unqualified).
static Integer increment(map, key)
Increment the value of the given map at the given key by one.
static void main(argv)
Main method to be run from the command line.
static LocalDate parseDate(date)
Take the standard date string requested at the command line and convert it into a date object.
static String processLogs(Context context, String myLogDir, String myFileTemplate, String myConfigFile, String myOutFile, LocalDate myStartDate, LocalDate myEndDate, boolean myLookUp)
Using the pre-configuration information passed here, analyse the logs and produce the aggregation file.
static void readConfig()
Read in the current config file and populate the class globals.
static void readConfig(String configFile)
Read in the given config file and populate the class globals.
static void setParameters(String myLogDir, String myFileTemplate, String myConfigFile, String myOutFile, LocalDate myStartDate, LocalDate myEndDate, boolean myLookUp)
Set the passed parameters up as global class variables.
static void setRegex(fileTemplate)
Set up the regular expressions to be used by this analyser.
static String unParseDate(LocalDate date)
Take the date object and convert it into a string of the form YYYY-MM-DD.
static void usage()
Print out the usage information for this class to standard output.
-
Method Details
-
main
Main method to be run from the command line. See the usage information (-help) for details of how to use the command line flags.
Parameters:
argv - the command line arguments given
Throws:
Exception - if error
SQLException - if database error
-
processLogs
public static String processLogs(Context context, String myLogDir, String myFileTemplate, String myConfigFile, String myOutFile, LocalDate myStartDate, LocalDate myEndDate, boolean myLookUp) throws IOException, SQLException, SearchServiceException
Using the pre-configuration information passed here, analyse the logs and produce the aggregation file.
Parameters:
context - the DSpace context object this occurs under
myLogDir - the passed log directory. Uses default if null
myFileTemplate - the passed file name regex. Uses default if null
myConfigFile - the DStat config file. Uses default if null
myOutFile - the file to which to output aggregation data. Uses default if null
myStartDate - the desired start of the analysis. Starts from the beginning otherwise
myEndDate - the desired end of the analysis. Goes to the end otherwise
myLookUp - force a lookup of the database
Returns:
aggregate output
Throws:
IOException - if IO error
SQLException - if database error
SearchServiceException - if search error
-
setParameters
public static void setParameters(String myLogDir, String myFileTemplate, String myConfigFile, String myOutFile, LocalDate myStartDate, LocalDate myEndDate, boolean myLookUp)
Set the passed parameters up as global class variables. This has to be done in a separate method because the API permits both running from the command line with arguments and calling the processLogs method statically from elsewhere.
Parameters:
myLogDir - the log file directory to be analysed
myFileTemplate - regex for log file names
myConfigFile - config file to use for dstat
myOutFile - file to write the aggregation into
myStartDate - requested log reporting start date
myEndDate - requested log reporting end date
myLookUp - requested look up force flag
-
createOutput
Generate the analyser's output to the specified out file.
Returns:
output
-
getLogFiles
Get an array of file objects representing the passed log directory.
Parameters:
logDir - the log directory in which to pick up files
Returns:
an array of file objects representing the given logDir
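The following is a minimal sketch of how log files in a directory might be selected by a file name regex. It is not the DSpace implementation: the class name LogFilesSketch is hypothetical, and the second parameter fileTemplate is an assumption made so the example is self-contained (in LogAnalyser the file name regex is supplied separately, through setRegex or setParameters).

```java
import java.io.File;
import java.util.regex.Pattern;

class LogFilesSketch {
    // Hypothetical helper: returns the files in logDir whose names match fileTemplate.
    // Passing the regex as a parameter is an assumption made for this sketch.
    public static File[] getLogFiles(String logDir, String fileTemplate) {
        Pattern template = Pattern.compile(fileTemplate);
        // listFiles returns null when logDir is not a readable directory
        File[] matches = new File(logDir).listFiles(
                (dir, name) -> template.matcher(name).matches());
        return matches == null ? new File[0] : matches;
    }

    public static void main(String[] args) {
        // Lists entries of the working directory whose names start with "dspace.log"
        for (File f : getLogFiles(".", "dspace\\.log.*")) {
            System.out.println(f.getName());
        }
    }
}
```

Returning an empty array for a missing directory is a choice made for this sketch; the real method may behave differently.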
-
setRegex
Set up the regular expressions to be used by this analyser. This exists mainly to keep the code separated and readable, and to ensure that the regular expressions only need to be set up once.
Parameters:
fileTemplate - the regex to be used to identify DSpace log files
-
getConfigFile
Get the current config file name.
Returns:
the name of the config file
-
readConfig
Read in the current config file and populate the class globals.
Throws:
IOException - if IO error
-
readConfig
Read in the given config file and populate the class globals.
Parameters:
configFile - the config file to read in
Throws:
IOException - if IO error
-
increment
Increment the value of the given map at the given key by one.
Parameters:
map - the map whose value we want to increase
key - the key of the map whose value to increase
Returns:
an integer object containing the new value
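As an illustration of the counting behaviour described here, the sketch below increments a map entry and returns the new value. The class name IncrementSketch, the Map<String, Integer> parameter type, and the choice to start a missing key at 1 are all assumptions; the page above does not state the parameter types or how a missing key is handled.

```java
import java.util.HashMap;
import java.util.Map;

class IncrementSketch {
    // Hypothetical stand-in for increment(map, key); the types are assumed.
    public static Integer increment(Map<String, Integer> map, String key) {
        // merge stores 1 for a missing key, otherwise adds 1 to the current value,
        // and returns the value now held in the map
        return map.merge(key, 1, Integer::sum);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        increment(counts, "view_item");
        Integer latest = increment(counts, "view_item");
        System.out.println(latest); // prints 2
    }
}
```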
-
parseDate
Take the standard date string requested at the command line and convert it into a date object. Throws an error and exits if the date does not parse.
Parameters:
date - the string representation of the date
Returns:
a date object containing the date, with the time set to 00:00:00
-
unParseDate
Take the date object and convert it into a string of the form YYYY-MM-DD.
Parameters:
date - the date to be converted
Returns:
a string of the form YYYY-MM-DD
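The round trip between a date string and a LocalDate can be sketched with java.time as follows. This is an illustration only: the class name DateSketch is hypothetical, and it assumes the command line date uses the same YYYY-MM-DD form that unParseDate produces, which the page above does not confirm. Unlike the documented parseDate, this sketch lets the parse exception propagate instead of exiting.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class DateSketch {
    // ISO_LOCAL_DATE corresponds to the YYYY-MM-DD form (assumed input format)
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ISO_LOCAL_DATE;

    // Hypothetical stand-in for parseDate; throws DateTimeParseException on bad input
    public static LocalDate parseDate(String date) {
        return LocalDate.parse(date, FORMAT);
    }

    // Hypothetical stand-in for unParseDate
    public static String unParseDate(LocalDate date) {
        return date.format(FORMAT);
    }

    public static void main(String[] args) {
        LocalDate start = parseDate("2024-03-07");
        System.out.println(unParseDate(start)); // prints 2024-03-07
    }
}
```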
-
analyseQuery
Take a search query string and pull out all of the meaningful information from it, giving the results in the form of a String array, a single word to each element.
Parameters:
query - the search query to be analysed
Returns:
the string array containing meaningful search terms
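One possible reading of "meaningful information" is sketched below: lower-case the query, split it on anything that is not a letter or digit, and drop a small set of filler words. The class name QuerySketch, the splitting rule, and the stop-word list are all assumptions made for illustration; the actual rules used by analyseQuery are not described on this page and presumably come from the analyser's configuration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class QuerySketch {
    // Assumed stop words, for illustration only
    private static final Set<String> STOP_WORDS = Set.of("and", "or", "not", "the", "of", "a");

    // Hypothetical stand-in for analyseQuery: one word per array element
    public static String[] analyseQuery(String query) {
        List<String> words = new ArrayList<>();
        // Split on runs of characters that are neither letters nor digits
        for (String token : query.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                words.add(token);
            }
        }
        return words.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String word : analyseQuery("The History of \"Open Access\" AND repositories")) {
            System.out.println(word);
        }
    }
}
```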
-
getLogLine
Split the given line into its relevant segments if applicable (i.e. the line matches the required regular expression).
Parameters:
line - the line to be segmented
Returns:
a LogLine object for the given line
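The idea of segmenting a line only when it matches a regular expression can be sketched as below. Everything specific here is hypothetical: the class LogLineSketch, the nested Segments type (standing in for LogLine), the simplified "date level message" line format, and the choice to return null for a non-matching line. The real DSpace log format and LogLine fields are not given on this page.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogLineSketch {
    // Hypothetical stand-in for LogLine, holding three assumed segments
    public static final class Segments {
        public final String date;
        public final String level;
        public final String message;

        Segments(String date, String level, String message) {
            this.date = date;
            this.level = level;
            this.message = message;
        }
    }

    // Assumed, simplified line format: "YYYY-MM-DD LEVEL message"
    private static final Pattern LINE =
            Pattern.compile("^(\\d{4}-\\d{2}-\\d{2}) (\\w+) (.*)$");

    // Returns the segments, or null when the line does not match (an assumption)
    public static Segments getLogLine(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new Segments(m.group(1), m.group(2), m.group(3));
    }

    public static void main(String[] args) {
        Segments s = getLogLine("2024-03-07 INFO view_item:handle=123");
        System.out.println(s.date + " | " + s.level + " | " + s.message);
    }
}
```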
-
getNumItems
public static Integer getNumItems(Context context, String type) throws SQLException, SearchServiceException
Get the number of items in the archive which were accessioned between the provided start and end dates, with the given value for the DC field 'type' (unqualified).
Parameters:
context - the DSpace context for the action
type - value for DC field 'type' (unqualified)
Returns:
an integer containing the relevant count
Throws:
SQLException - if database error
SearchServiceException - if search error
-
getNumItems
Get the total number of items in the archive at time of execution, ignoring all other constraints.
Parameters:
context - the DSpace context the action is being performed in
Returns:
an Integer containing the number of items in the archive
Throws:
SQLException - if database error
SearchServiceException - if search error
-
usage
public static void usage()
Print out the usage information for this class to standard output.
-