Risk Data Collectors REST API

albertosiena edited this page Apr 29, 2015 · 4 revisions

The risk data collectors (RDCs) are a set of software components of the RISCOSS platform. They retrieve data from other online services on behalf of the platform. Their main purpose is to support contributors who want to write custom data collectors and integrate them into the platform.

The RDCs are published on GitHub in the riscoss-data-collector repository (https://github.com/RISCOSS/riscoss-data-collector). The repository is organized into several folders, each with a corresponding project.

==== riscoss-rdr ====

Contains the functions and utility classes to interact with the risk data repository (RDR). Specifically, the riscoss-rdr folder contains the following classes:

  • ##Evidence##. Utility class to encapsulate the "evidence" data type used by the risk analysis engine.

  • ##Distribution##. Utility class to encapsulate the "distribution" data type used by the risk analysis engine.

  • ##RiskData##. Class that encapsulates the basic data to be sent to the platform's RDR. Its fields identify the retrieved data (e.g., "number_of_open_issues"), describe the data retrieval, and give the id with which the data has to be associated.

  • ##RDR##. Contains the functionality to create a JSON structure from a set of RiskData instances and to send that structure over the network, storing it directly in the RDR.
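As a rough illustration of what the RDR class does, the sketch below turns a list of records into a JSON array. The RiskDataSketch fields (id, target, value), the JSON layout, and all class names are assumptions made for this example; the actual RiskData fields and the format expected by the RDR may differ, and the network step is left out.

```java
import java.util.List;

// Hypothetical stand-in for RiskData: the field names are assumptions,
// not the actual fields of the riscoss-rdr class.
final class RiskDataSketch {
    final String id;      // name of the retrieved data, e.g. "number_of_open_issues"
    final String target;  // id of the entity the data is associated with
    final String value;   // retrieved value, kept as text for simplicity

    RiskDataSketch(String id, String target, String value) {
        this.id = id;
        this.target = target;
        this.value = value;
    }
}

final class RdrJsonSketch {
    // Minimal escaping: backslash and double quote only.
    // Control characters are not handled in this sketch.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Serializes the records as a JSON array of flat objects (assumed layout).
    static String toJson(List<RiskDataSketch> data) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < data.size(); i++) {
            RiskDataSketch d = data.get(i);
            if (i > 0) {
                sb.append(",");
            }
            sb.append("{\"id\":").append(quote(d.id))
              .append(",\"target\":").append(quote(d.target))
              .append(",\"value\":").append(quote(d.value))
              .append("}");
        }
        return sb.append("]").toString();
    }
}
```

A real collector would more likely rely on a JSON library than on hand-written string building; the manual version is used here only to keep the example free of dependencies.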

==== riscoss-rdc-api ====

Contains a set of utility classes to support the addition of custom RDCs. Specifically, the riscoss-rdc-api folder contains the following classes:

  • ##RDC##: an interface that encapsulates the minimal functionality that every RDC should implement.

Note that RDCs implementing the RDC interface are unaware of how they are used. In other words, an RDC implementation can be used either as an RDC called internally by the platform or as an external RDC that pushes its data to the RDR's REST API. The RDCRunner class is used to differentiate the two cases.

  • ##RDCParameter##: a utility class that communicates the following information about each parameter to the platform: the name of the parameter; a description of the parameter; an example value, to help the user specify it with the proper syntax; and the default value, which also determines whether the parameter is mandatory (the default value is null) or can be omitted (the default value is not null).

  • ##RDCRunner##: a utility class that provides a standardized entry point. Specifically, the provided entry point is able to:

  • read the parameters from the command line as well as from the standard input (stdin)

  • list the registered RDCs; this is done by specifying the -info argument on the command line

  • retrieve the parameter names and types required by a given RDC; this is done by specifying the -info argument

  • activate a given RDC, requesting its data retrieval service; this is done by specifying the -rdc=<rdc_name> argument on the command line

  • print on the standard output (stdout) the output of all the above functions; this is done by specifying the -print flag on the command line

  • send the retrieved data over the network, directly to the RDR's REST API; this is done by specifying the -rdr= argument on the command line.
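The option handling listed above, together with the mandatory-or-optional rule described for RDCParameter, could be sketched roughly as follows. This is not the RDCRunner source: the class and method names are hypothetical, and passing RDC parameters as -name=value pairs is an assumption made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of riscoss-rdc-api.
final class RunnerArgsSketch {
    // Parses arguments of the form "-key=value" or "-flag" into a map.
    // Flags without a value (e.g. "-print") are stored with an empty string.
    static Map<String, String> parse(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        for (String arg : args) {
            if (!arg.startsWith("-")) {
                continue; // ignore anything that is not an option
            }
            String body = arg.substring(1);
            int eq = body.indexOf('=');
            if (eq < 0) {
                options.put(body, "");
            } else {
                options.put(body.substring(0, eq), body.substring(eq + 1));
            }
        }
        return options;
    }

    // Resolves one parameter: a supplied value wins, otherwise the default
    // is used; a null default marks the parameter as mandatory.
    static String resolve(Map<String, String> options, String name, String defaultValue) {
        String value = options.get(name);
        if (value != null) {
            return value;
        }
        if (defaultValue != null) {
            return defaultValue;
        }
        throw new IllegalArgumentException("Missing mandatory parameter: " + name);
    }
}
```

With this sketch, parsing the arguments "-rdc=example -print" yields the entries rdc=example and an empty-valued print flag, and resolving a parameter that has no supplied value and a null default raises an error.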

All results of configuration requests are wrapped between the beginning tag -BEGIN CONFIGURATION DATA- and the ending tag -END CONFIGURATION DATA-. Anything before the beginning tag or after the ending tag is ignored. The content between these tags must be a JSON object.
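A caller reading the runner's output might isolate the JSON payload along these lines. The tag strings are copied from the description above; the class name and the choice to return null when a tag is missing are assumptions of this sketch, and parsing the extracted JSON is left to whatever library the caller uses.

```java
// Hypothetical helper for consumers of the runner's output.
final class ConfigurationDataSketch {
    static final String BEGIN = "-BEGIN CONFIGURATION DATA-";
    static final String END = "-END CONFIGURATION DATA-";

    // Returns the trimmed text between the two tags,
    // or null when either tag is missing.
    static String extract(String output) {
        int begin = output.indexOf(BEGIN);
        if (begin < 0) {
            return null;
        }
        int start = begin + BEGIN.length();
        int end = output.indexOf(END, start);
        if (end < 0) {
            return null;
        }
        return output.substring(start, end).trim();
    }
}
```

Searching for the ending tag only after the beginning tag means that log lines printed before or after the wrapped block do not affect the result.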

==== riscoss-rdc-app ====

A standard implementation of the RDCRunner. The RDCApp is aware at compile time of all the RDCs developed and deployed in the riscoss-rdc-app repository. Its standard behavior is to register all the available RDCs through the RDCFactory and run the standard entry point from the RDCRunner class.
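The registration step can be pictured with a small name-to-collector registry. The sketch below is only an illustration of the idea: RdcSketch and RdcRegistrySketch are hypothetical names, the single getName() method is an assumption, and neither type reflects the actual RDC interface or RDCFactory API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical, reduced stand-in for the RDC interface.
interface RdcSketch {
    String getName();
}

// Hypothetical registry playing the role described for the RDCFactory.
final class RdcRegistrySketch {
    // Insertion order is kept so that listings are stable.
    private final Map<String, RdcSketch> registered = new LinkedHashMap<>();

    // Makes a collector available under its own name.
    void register(RdcSketch rdc) {
        registered.put(rdc.getName(), rdc);
    }

    // Names of all registered collectors, in registration order.
    Set<String> names() {
        return registered.keySet();
    }

    // Looks up a collector by name; returns null when it is not registered.
    RdcSketch find(String name) {
        return registered.get(name);
    }
}
```

An application in the style of RDCApp would register every collector it knows at startup and then hand control to the shared entry point, which can list the registered names or look one up by the name given on the command line.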