01 Jul 03:04

v1.8.0 Latest

What's Changed

Support GPU training
Support multiple versions of YARN (>= 2.6)
add faq by @yuyajian in #15
fix the board history display by @jiarunying in #17
reset worker num when less than inputfile number by @liyuance in #35
xlearning container allocate port from 20000 to 30000 by default by @FANNG1 in #45
add docker support by @SuperbDong in #64
Tensorflow example hangs when using multiple workers by @lshmouse in #62

New Contributors

@yuyajian made their first contribution in #15
@jiarunying made their first contribution in #17
@liyuance made their first contribution in #35
@FANNG1 made their first contribution in #45
@SuperbDong made their first contribution in #64
@lshmouse made their first contribution in #62

Full Changelog: v1.7.2...v1.8.0

Full Changelog: v1.4...v1.8.0

Contributors

lshmouse, liyuance, and 4 other contributors

24 Jun 10:46

v1.8.0-beta2 Pre-release

What's Changed

add faq by @yuyajian in #15
fix the board history display by @jiarunying in #17
reset worker num when less than inputfile number by @liyuance in #35
xlearning container allocate port from 20000 to 30000 by default by @FANNG1 in #45
add docker support by @SuperbDong in #64
Tensorflow example hangs when using multiple workers by @lshmouse in #62

New Contributors

@yuyajian made their first contribution in #15
@jiarunying made their first contribution in #17
@liyuance made their first contribution in #35
@FANNG1 made their first contribution in #45
@SuperbDong made their first contribution in #64
@lshmouse made their first contribution in #62

Full Changelog: https://github.com/Qihoo360/hbox/commits/v1.8.0-beta2

Contributors

lshmouse, liyuance, and 4 other contributors

16 May 08:29

jiarunying

XLearning 1.4

Release XLearning 1.4

Major Features And Improvements

Support running applications in Docker
Support MPI applications
ClusterDef is available for the TensorFlow Distribution Strategy API
Allow the amount of memory to be set separately for the chief and estimator workers of a TensorFlow application
Specify the YARN node label for job execution
Upload the output with multiple threads
Allow incremental upload of intermediate results
Support regular-expression matching for input paths

Bug Fixes and Other Changes

The memory usage adjustment prompt is displayed only when the application finishes with a succeeded status.

16 Oct 11:03

jiarunying

XLearning 1.3

Release XLearning 1.3

Major Features And Improvements

Support lightLDA; see examples/lightLDA for usage
Support xflow; see examples/xflow for usage
Support user-defined environment variables, set through a configuration parameter at submission
Set the last worker as the estimator role of a distributed TensorFlow application when the user sets tf-evaluator to true; see examples/tfEstimators for usage
Define the index of the single worker that saves the output by setting output-index
Optimize the port reservation mechanism
Add a container allocation priority mechanism for local data
Display resource request and usage information
Extend the ps role: more convenient rendering of metrics usage information, and output upload

Bug Fixes and Other Changes

Containers got stuck waiting for the remaining machines' port addresses after a Container failed in distributed mode
Redundant container requests are now released after the workers are allocated, and a remove-request operation is added
Application failed in PLACEHOLDER mode when the environment variable holding the input was too long
Control the conditions used to judge job execution failure
The status code was returned incorrectly when the Container exited successfully

05 Feb 02:56

jiarunying

XLearning 1.2

Release XLearning 1.2

Major Features And Improvements

The client prints the containers' status information when the state changes
Add the xlearning.localresource.timeout configuration to control the local resource download
Support VisualDL; see examples/mxnetVisualDL for usage
Support the local cache when the input strategy is inputformat with epoch greater than 1

Bug Fixes and Other Changes

Add exception handling for the board and metrics processes

09 Jan 07:56

jiarunying

XLearning 1.1

Release XLearning 1.1

Major Features And Improvements

Automatically scale worker or ps memory when the application retries after a failure
The application exits as failed when container allocation exceeds the time limit
Support the user's job jar via --jars at application submission
Add CPU metrics to the web display. Note: if your Hadoop version is lower than 2.6.4, see the FAQ first.
Support more distributed deep learning frameworks, such as XGBoost and LightGBM. See the FAQ for usage details.

Bug Fixes and Other Changes

Fix a null pointer error in the AppController
Add more examples, especially for distributed-mode applications
The FAQ provides detailed instructions on how to use the new features
