Releases: RumbleDB/rumble
Rumble 1.6.3 "Yucca"
Interim release to address user requests.
- More informative error message when the wrong version of Java is used.
- More informative error messages in Jupyter notebooks for unexpected errors.
- User-defined functions can now work on parallel input. Rumble automatically detects it.
- Fixed a bug with local execution of nested order by clauses.
Rumble 1.6.2 "Yucca"
Interim release based on user feedback.
Adds a warning message in Jupyter notebooks, Python, and other host languages when materialization hits the cap for the final result. In the shell, the warning message is now displayed after the results, making it harder to overlook.
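The cap behavior described for this release can be pictured with a small Python model. This is only an illustrative sketch, not Rumble's implementation: the function name `materialize`, the `cap` parameter, and the choice to warn only when more items exist than the cap allows are all assumptions made for the example.

```python
import warnings
from itertools import islice


def materialize(items, cap):
    """Collect at most `cap` items and warn if the input was cut off.

    Hypothetical model: `materialize` and `cap` are illustrative names,
    not part of Rumble's API.
    """
    iterator = iter(items)
    result = list(islice(iterator, cap))
    # Probe for one more item to find out whether anything was left behind.
    sentinel = object()
    truncated = next(iterator, sentinel) is not sentinel
    if truncated:
        warnings.warn(f"Materialization cap of {cap} items reached; the result is incomplete.")
    return result


# Only the first three items are kept, and a warning is emitted.
first_items = materialize(range(10), cap=3)
```

Emitting the warning after the result has been collected mirrors the shell change in this release, where the message appears after the results rather than before them.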
Rumble 1.6.1 "Yucca"
Interim release that fixes multiline queries with the Rumble magic in Jupyter notebooks and explicitly lists the Joda-Time dependency, which some users reported was missing from their environment.
Rumble 1.6.0 Yucca
- the materialization of too many items now throws an error rather than just a warning, to avoid incorrect results
- a bug was fixed in the closure of inline functions. A function now correctly captures the values of the variables that are in scope where it is created.
- new functions: format-date, format-dateTime, format-time, current-date, current-dateTime, current-time, serialize
- parallelization of existing functions: flatten, intersect, descendant-objects, descendant-arrays, remove-keys, project, insert-before
- new input format: AVRO, with avro-file() functions
- global variables are now supported and dependency cycles are identified
- the shell is more colorful
- a local FLWOR with a return clause returning big sequences (i.e., backed by an RDD or DataFrame) is no longer materialized.
- support for the => operator, which passes the left-hand side as the first argument to a function (similar to a method call in object-oriented programming)
- support for simple map expressions (! operator) also in parallel
- conditional expressions, switch expressions, and comma expressions can now also run in parallel
- Rumble can run as an HTTP server for integration with any host language
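The semantics of the => and ! operators listed above can be modelled in a few lines of Python. This is a hedged sketch of the behavior only, not Rumble code: the helpers `arrow`, `simple_map`, and `substring_after` are hypothetical names introduced for illustration, and sequences are modelled as Python lists.

```python
def arrow(lhs, function, *args):
    """Model of `lhs => function(args)`: lhs becomes the first argument."""
    return function(lhs, *args)


def simple_map(sequence, expression):
    """Model of `sequence ! expression`: evaluate the expression once per
    item and concatenate the results in order."""
    result = []
    for item in sequence:
        value = expression(item)
        # The expression may yield several items or a single one.
        if isinstance(value, (list, tuple)):
            result.extend(value)
        else:
            result.append(value)
    return result


def substring_after(text, separator):
    """Stand-in for a two-argument built-in function (illustrative only)."""
    return text.partition(separator)[2]


# Equivalent to calling substring_after("key=value", "=") directly.
value = arrow("key=value", substring_after, "=")

# Each input item contributes two items to one flat output sequence.
pairs = simple_map([1, 2, 3], lambda item: (item, item * 10))
```

Because the simple map evaluates its right-hand side independently for each item, it is a natural candidate for the parallel execution mentioned in this release.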
Rumble 1.5.1 Southern Live Oak
This release unifies and stabilizes access to all file systems (S3, HDFS, local) for --query-path, --output-path, --log-path as well as the path passed to input functions.
Rumble 1.5.0 Southern Live Oak
Various bugfixes and stability improvements.
Support for more inputs (JSON, Parquet, ROOT, CSV, text) and sources (S3, Azure, local, HDFS).
More built-in functions.
More expressions and functions are parallelized.
Rumble 1.4.2 Willow Oak
Various bugfixes:
- variable bindings are now available and visible in all nested FLWOR clauses
- when only a count aggregation is performed on a non-grouping variable, the count can now be used in all following clauses (was: a DataFrame error on the Long to [B conversion).
- an error is now thrown early if an RDD evaluation is attempted within a big FLWOR expression (was: uninformative null pointer exception)
Rumble 1.4.1 Willow Oak
This is an interim release with bugfixes:
- invoking a count on a grouping key after a group-by now works in a let clause (was: "java.lang.Long cannot be cast to [B") in addition to return clauses. This will also be fixed in other clauses later on.
- the count clause is now more stable on large amounts of data (was: null pointer exception)
- variables passed from outside the FLWOR clause are now visible in where and return clauses (was: variable does not exist). This will also be fixed in other clauses later on.
Rumble 1.4 "Willow Oak"
- more type support: grouping and ordering on durations, dates, times, datetimes and binaries
- fixed bug in which more than one grouping key value was bound to the grouping key variable when there were equivalent, but not equal grouping keys (like 1 and 1.0)
- user-defined functions are supported (no type checking just yet)
- function items are supported (i.e., functions can be manipulated like any other items)
- support for position() and last() in predicates, which also works in parallel
- fixed a bug that caused the Effective Boolean Value to be ignored in the parallel execution of where clauses and filters
- new structured-json-file() function to increase performance on structured JSON Lines files (i.e., using DataFrames under the hood). This is a bootstrap, and actual optimizations will follow.
- filtering on a specific position by passing (or computing) a number as a predicate now works in parallel
- in where clauses and predicates executed in parallel, the effective boolean value is correctly taken
The jar is based on Java 8 and is compatible with all more recent Java versions.
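The effective boolean value mentioned in the notes for this release decides whether a where clause or predicate keeps an item. The Python sketch below models it under the usual JSONiq rules as an assumption, not as Rumble's actual code: sequences are Python lists, objects are dicts, arrays are lists, and null is None. The names `effective_boolean_value` and `where` are hypothetical.

```python
import math


def effective_boolean_value(sequence):
    """Return the effective boolean value of a sequence of items."""
    if not sequence:
        return False  # the empty sequence is false
    first = sequence[0]
    if isinstance(first, (dict, list)):
        return True  # a sequence starting with an object or array is true
    if len(sequence) > 1:
        raise TypeError("no effective boolean value for several atomic items")
    if first is None:
        return False  # null is false
    if isinstance(first, bool):  # check bool before int: bool subclasses int
        return first
    if isinstance(first, (int, float)):
        return not (first == 0 or math.isnan(first))  # zero and NaN are false
    if isinstance(first, str):
        return len(first) > 0  # the empty string is false
    raise TypeError(f"no effective boolean value for {type(first).__name__}")


def where(items, condition):
    """Keep the items whose condition has a true effective boolean value."""
    return [item for item in items if effective_boolean_value(condition(item))]


people = [{"name": "Ada"}, {"name": ""}, {}]
# The condition returns a sequence: the name if present, else nothing.
named = where(people, lambda p: [p["name"]] if "name" in p else [])
```

In this model only the first object is kept, because an empty string and an empty sequence both have a false effective boolean value.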
Rumble 1.3 White Oak
Various bug fixes.
New features:
- json-doc builtin function to open single JSON files locally even when the object is spread over multiple lines (returns a single item, will not automatically parallelize anything)
- parquet-file builtin function to open parquet files locally, on HDFS or on S3
- initial type support (date, dateTime, time, duration, dayTimeDuration, yearMonthDuration, base64Binary, hexBinary) as well as cast as, castable as, treat as and instance of. Please let us know if you find any bugs.