Tool for monitoring replication between remote data centers without connectivity beyond Riak K/V replication. The tool is configured by a properties file and raises an alarm when the difference between a local timestamp and a timestamp written into Riak K/V at a remote location exceeds the configured threshold. Alarms can be sent to the console, a log file, and an SNMP v1 trap. An out-of-tolerance alarm is generated whenever the configured tolerance is exceeded; an in-tolerance alarm is generated only when the value is in tolerance and the previous value was out of tolerance.
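The alarm rule described above can be sketched in a few lines of Java. This is only an illustration, not the tool's actual source: the class name `ToleranceCheck`, the `Alarm` enum, and the `previouslyOut` flag are hypothetical, and how the tool remembers the previous result between runs is not covered here.

```java
import java.util.Optional;

public class ToleranceCheck {
    // Hypothetical alarm kinds, named after the two alarms described above.
    enum Alarm { OUT_OF_TOLERANCE, IN_TOLERANCE }

    /**
     * Decide which alarm, if any, to raise for one remote data center.
     * previouslyOut is an assumed flag recording whether the previous
     * check was out of tolerance.
     */
    static Optional<Alarm> evaluate(long localMillis, long remoteMillis,
                                    long thresholdMillis, boolean previouslyOut) {
        long latency = localMillis - remoteMillis;
        if (latency > thresholdMillis) {
            // Raised on every check while the threshold is exceeded.
            return Optional.of(Alarm.OUT_OF_TOLERANCE);
        }
        // Raised only on the transition back into tolerance.
        return previouslyOut ? Optional.of(Alarm.IN_TOLERANCE) : Optional.empty();
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println(evaluate(now, now - 45_000, 30_000, false));
        System.out.println(evaluate(now, now - 5_000, 30_000, true));
        System.out.println(evaluate(now, now - 5_000, 30_000, false));
    }
}
```

With a 30000 ms threshold, the first call models a remote timestamp 45 seconds old, the second a recovery after an earlier alarm, and the third a steady healthy state that raises nothing.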
Using Maven:
mvn package
Builds the project and creates a distributable tarball in the
target/ directory.
Example app.properties file:
riak.hosts.count=1
riak.hosts.1.hostName=localhost
riak.port=18098
riak.useProtocolBuffers=false
riak.mdcStatsBucket=MDCStats
dataCenter.id=US_East
remoteDataCenter.count=1
remoteDataCenter.1.id=US_West
remoteDataCenter.latencyThreshold=30000
alarmHandler.console.enable=true
alarmHandler.log.enable=true
alarmHandler.log.filename=/var/log/mdc_monitor.log
alarmHandler.snmp.enable=true
alarmHandler.snmp.hostName=localhost
alarmHandler.snmp.port=162
alarmHandler.snmp.community=public
riak.hosts.count Specifies the number of local Riak nodes.
riak.hosts.NUMBER.hostName Specifies the hostname of a local Riak node,
where NUMBER counts up from 1 to riak.hosts.count.
riak.port Specifies the port on which to connect to Riak.
riak.useProtocolBuffers True/False. If false, HTTP will be used.
riak.mdcStatsBucket Specifies the bucket in Riak where timestamps will
be stored.
dataCenter.id Specifies the ID of the local data center, which will
be referenced by other data centers.
remoteDataCenter.count Specifies the number of remote data centers which
will be monitored.
remoteDataCenter.NUMBER.id Specifies the id of a remote data center where
NUMBER is the increasing count of remote data centers.
remoteDataCenter.latencyThreshold Specifies the number of milliseconds
beyond which an alarm will be generated by the tool.
alarmHandler.console.enable Enable console output.
alarmHandler.log.enable Enable log file output.
alarmHandler.log.filename Specify the log filename for the tool.
alarmHandler.snmp.enable Enable SNMP trap output.
alarmHandler.snmp.hostName Specify the host name of the SNMP Manager.
alarmHandler.snmp.port Specify the trap port of the SNMP Manager.
alarmHandler.snmp.community Specify the community used in the SNMP trap.
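The count-plus-numbered-key pattern used for riak.hosts and remoteDataCenter can be read with the standard `java.util.Properties` class. The following is a minimal sketch, assuming the numbering runs from 1 to the configured count as in the example file. The class `IndexedProps` and its `indexed` helper are hypothetical and are not part of the tool.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class IndexedProps {
    /** Collect prefix.1.suffix .. prefix.N.suffix, where N is read from countKey. */
    static List<String> indexed(Properties p, String countKey, String prefix, String suffix) {
        int count = Integer.parseInt(p.getProperty(countKey, "0").trim());
        List<String> values = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            String key = prefix + "." + i + "." + suffix;
            String value = p.getProperty(key);
            if (value == null) {
                // Fail early if the count promises more entries than exist.
                throw new IllegalArgumentException("Missing property: " + key);
            }
            values.add(value.trim());
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // Inline stand-in for an app.properties file.
        Properties p = new Properties();
        p.load(new StringReader(String.join("\n",
            "riak.hosts.count=1",
            "riak.hosts.1.hostName=localhost",
            "remoteDataCenter.count=1",
            "remoteDataCenter.1.id=US_West",
            "remoteDataCenter.latencyThreshold=30000")));
        System.out.println(indexed(p, "riak.hosts.count", "riak.hosts", "hostName"));
        System.out.println(indexed(p, "remoteDataCenter.count", "remoteDataCenter", "id"));
        System.out.println(Long.parseLong(p.getProperty("remoteDataCenter.latencyThreshold")));
    }
}
```

Throwing on a missing numbered key is a design choice of this sketch. It catches a count that is larger than the number of entries actually listed.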
java -jar riak-mdc-mon-0.0.1.jar -p <app.properties> [options]
Options:
-p <app.properties> Set the properties file, defined above, to configure
the tool. Required.
-t Test alarm generation with out of tolerance alarms based on the
configured values.
-n Test alarm generation with in tolerance alarms based on the configured
values.
Since this tool simply runs the check once and then exits, you'll want to
run it with a cron job. Example crontab entry to run the program
every 5 minutes:
*/5 * * * * java -jar /path/to/riak-mdc-mon/riak-mdc-mon-0.0.1.jar -p /path/to/riak-mdc-mon/conf/app.properties
Run the program in normal mode:
java -jar riak-mdc-mon-0.0.1.jar -p /path/to/app.properties
Run the program in test mode, testing out of tolerance alarms:
java -jar riak-mdc-mon-0.0.1.jar -p /path/to/app.properties -t
Run the program in test mode, testing in tolerance alarms:
java -jar riak-mdc-mon-0.0.1.jar -p /path/to/app.properties -n
Each entry written to the log file has the following tab-separated format:
{Date/Time}\t{RemoteDataCenterId}\t{MeasuredLatency}\t{LatencyThreshold}\n
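A line in the tab-separated format above can be produced and split apart as in the sketch below. The class `LogLine` is hypothetical. The exact date/time representation the tool writes is not specified here, so the date is passed through as an opaque string.

```java
public class LogLine {
    /** Build one log line: four tab-separated fields ending in a newline. */
    static String format(String dateTime, String remoteId, long latency, long threshold) {
        return String.join("\t", dateTime, remoteId,
                Long.toString(latency), Long.toString(threshold)) + "\n";
    }

    /** Split a log line back into its four fields. */
    static String[] parse(String line) {
        // Drop the trailing newline, if present, before splitting on tabs.
        String body = line.endsWith("\n") ? line.substring(0, line.length() - 1) : line;
        String[] fields = body.split("\t", -1);
        if (fields.length != 4) {
            throw new IllegalArgumentException("Expected 4 fields, got " + fields.length);
        }
        return fields;
    }

    public static void main(String[] args) {
        // The date format here is an assumption for illustration only.
        String line = format("2013-01-01 00:00:00", "US_West", 45000L, 30000L);
        String[] f = parse(line);
        long latency = Long.parseLong(f[2]);
        long threshold = Long.parseLong(f[3]);
        System.out.println(f[1] + " out of tolerance: " + (latency > threshold));
    }
}
```

Parsing the fields this way is useful when post-processing the log file, for example to list every entry where the measured latency exceeded the threshold.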
The in-tolerance and out-of-tolerance traps sent by the tool are defined in the MIB file located in the conf directory. These definitions will eventually be merged into the main Riak SNMP MIB but are currently distributed separately.