Currently, H2O features that produce vectors (Seq[Double]) have arity checking. This should become an option that is on by default but can be turned off.
See https://github.com/eHarmony/aloha/blob/master/aloha-h2o/src/main/scala/com/eharmony/aloha/models/h2o/json/H2oModelJson.scala#L109
Current Code
case class DoubleSeqH2oSpec(name: String, spec: String, size: Int, defVal: Option[Seq[Double]]) extends H2oSpec {
  type A = Seq[Double]
  def ffConverter[B] = f => DoubleSeqFeatureFunction(f, size)
  def refInfo = RefInfo[Seq[Double]]
  protected def sizeErr: String = s"feature '$name' output size != $size"

  // NOTE: override here and wrap spec in Option to avoid adding implicit Option lift for Seq[Double]
  override def compile[B](semantics: Semantics[B]): Either[Seq[String], FeatureFunction[B]] = {
    val wrappedSpec = s"Option($spec).map{x => require(x.size == $size, " + s""""$sizeErr"); x}"""
    semantics.createFunction[Option[Seq[Double]]](wrappedSpec, Option(defVal))(RefInfo[Option[Seq[Double]]]).right.map(f =>
      ffConverter(f.andThenGenAggFunc(_ orElse defVal)))
  }
}
What to do
- Add an arityChecking: Boolean = true parameter to the case class.
- Update the compile function to synthesize the proper code.
- Update the JSON format to be able to read and write the option.
- Write one test.
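The code synthesis step could look roughly like the sketch below. It is only an illustration, not the Aloha implementation: ArityCheckSketch and wrappedSpec are hypothetical names, and the assumption is that with arity checking turned off the spec is simply wrapped in Option with no require call. The checked branch reuses the string built in the current compile method.

```scala
// Hypothetical, standalone sketch; not part of Aloha.
object ArityCheckSketch {

  /** Builds the code string that compile would hand to semantics.createFunction.
    *
    * @param name          feature name, used only in the error message
    * @param spec          the feature specification (an expression producing Seq[Double])
    * @param size          the expected arity
    * @param arityChecking when true (the default), emit the size check
    */
  def wrappedSpec(name: String, spec: String, size: Int, arityChecking: Boolean = true): String = {
    if (arityChecking) {
      val sizeErr = s"feature '$name' output size != $size"
      // Same synthesized code as the current implementation.
      s"Option($spec).map{x => require(x.size == $size, " + s""""$sizeErr"); x}"""
    } else {
      // Assumption: without checking, only the Option wrapper is needed.
      s"Option($spec)"
    }
  }
}
```

Keeping the Option wrapper in both branches means the rest of compile (the createFunction call and the orElse defVal fallback) would not need to change.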
Motivation
When embeddings are produced from the specification file, they are produced after the model definition already exists. If an embedding of the specified arity can't be produced, the model will err because the data produced has the wrong arity. In this situation, Aloha does the right thing, but we want to allow it to be flexible for real-life practical usage.