
Onyx Task Bundle for Implementing Data Processing Tasks in R



Onyx Logo + R Logo = onyx-r




onyx-r provides an Onyx task bundle for running data processing tasks in R.

A typical use case is running R models (created via statistical or machine learning algorithms) in Onyx job workflows, at scale:

  1. A data scientist exports a model as an RData file.
  2. An Onyx developer configures an onyx-r task to load the model at job submit time and to use it to create predictions whenever batches of Onyx segments arrive at the task.
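The two steps above can be tried out locally in plain R before any Onyx job is involved. The following is a minimal sketch, not onyx-r's own code: the model (`lm` on the built-in `mtcars` data), the file name, the function name `predict_segment`, and the segment fields `wt` and `predicted_mpg` are all assumptions chosen for illustration, and the segment is assumed to arrive in R as a named list.

```r
# Step 1 (data scientist): fit a model and export it in RData format.
model <- lm(mpg ~ wt, data = mtcars)  # mtcars ships with R's datasets package
model_file <- file.path(tempdir(), "model.RData")
save(model, file = model_file)

# Step 2 (simulated task side): load the model, then score one segment.
rm(model)
load(model_file)  # restores the variable `model`

# Hypothetical segment function; assumes the segment is a named list
# with a numeric field `wt`.
predict_segment <- function(segment) {
  newdata <- data.frame(wt = segment$wt)
  segment$predicted_mpg <- unname(predict(model, newdata = newdata))
  segment
}

result <- predict_segment(list(id = 1, wt = 3.0))
```

The returned list keeps the original fields and adds the prediction, which matches the idea of returning a modified segment.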

Architecture Overview

Each Onyx peer runs an Rserve instance, and each virtual peer holds a connection to its local Rserve instance. onyx-r tasks are configured at job submit time through pure Clojure data in the Onyx catalog. They are implemented as pure R functions that take an Onyx segment as input and return a modified Onyx segment as output. For this to work seamlessly, onyx-r automatically translates between Clojure and R data structures. Each onyx-r task must be configured with the name of the R segment processing function to call.
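As an illustration of the "pure R function" contract described above, here is a small sketch of a segment processing function. It assumes that a segment (a Clojure map) reaches R as a named list; the exact translation rules are onyx-r's and are not shown here. The function name `add_total` and the fields `price`, `quantity`, and `total` are hypothetical.

```r
# Hypothetical segment function: takes a segment, returns a modified copy.
# Assumes the segment is represented in R as a named list.
add_total <- function(segment) {
  segment$total <- segment$price * segment$quantity
  segment
}

input <- list(price = 2.5, quantity = 4)
output <- add_total(input)
```

Because R copies lists on modification, `input` is left untouched and only `output` carries the new `total` field, so the function has no side effects on its argument.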

When an onyx-r task is prepared for execution on a virtual peer through Onyx lifecycles, it can be provided with R code to source, R data to load (in RData format, as exported from R via save), and Clojure values to assign to R variables. These configuration options are also supplied by the user at job submit time through the Onyx catalog.

Quick Start Guide

First, install Rserve on each Onyx peer as described at:


onyx-r is available on Clojars. Add this dependency to the :dependencies vector of your Leiningen project.clj:

[sourcewerk/onyx-r "0.1.0-SNAPSHOT"]

Running the Tests

Start a local Rserve server as documented at:

Then type lein test to run all tests for onyx-r.

onyx-r Task Options

The following Clojure code block shows how to configure an onyx-r task through add-task:

    :rfun ; name of the Onyx task 
    "rfun" ; name of the R function to call
    {:source ["rfun <- function(segment) list(segment = segment, assigned = c(bar, baz), loaded = testData)"] ; R code to source when the task is prepared for execution on a virtual peer
     :load [(onyx-r.util/slurp-bytes "testData.RData")] ; RData to load when the task is prepared for execution on a virtual peer
     :assign {:bar 42
              :baz "Hallo, Onyx!"}} ; R variables to assign when the task is prepared for execution on a virtual peer 

onyx-r.util/slurp-bytes loads (RData) files into a byte array, as expected by onyx-r's :load parameter.
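To see what the options in the example amount to on the R side, the sketch below reproduces the three preparation steps in a plain R session. It is only an analogue under stated assumptions: how onyx-r actually sends these steps to Rserve is not shown, and the contents of testData are invented here, since the original testData.RData file is not part of this text.

```r
# :source analogue: evaluate R code supplied as text.
code <- "rfun <- function(segment) list(segment = segment, assigned = c(bar, baz), loaded = testData)"
eval(parse(text = code))

# :load analogue: create a stand-in testData.RData (contents are made up),
# then load it the way exported RData would be restored.
testData <- c(1, 2, 3)
data_file <- file.path(tempdir(), "testData.RData")
save(testData, file = data_file)
rm(testData)
load(data_file)  # restores the variable `testData`

# :assign analogue: bind the configured values to R variables.
bar <- 42
baz <- "Hallo, Onyx!"

# Call the segment function as the task would for each segment.
result <- rfun(list(id = 1))
```

Note that `c(bar, baz)` mixes a number and a string, so R coerces both to character and `result$assigned` holds `"42"` and `"Hallo, Onyx!"`.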

Demo Code

The supplied demo jobs show how to use onyx-r's features in context:


Copyright © 2016 sourcewerk GmbH

Distributed under the Eclipse Public License, the same as Clojure and Onyx.


Commercial support is available through sourcewerk GmbH:


Email: [email protected]

