
Release 0.2.0

Released by @artyom-beilis on 04 Sep 21:30

What is new in 0.2.0

Bug/Issue Fixes

  • Fixed incorrect use of double constants in some operators
  • Fixed a crash when loading models that were saved on OCL devices
  • Fixed the default parameter of torch.ocl.synchronize (see the sketch after this list)
  • Fixed a failure when printing tensors on Intel devices that lack fp64 support
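To illustrate the synchronization pattern these fixes touch, here is a minimal sketch. It assumes the backend is imported as pytorch_ocl and exposes tensors under the ocl device type; torch.ocl.synchronize is the API named above, but the import name and device string are assumptions, not part of these notes.

```python
# Minimal sketch: queue work on the OpenCL device, then synchronize.
# Assumptions: the backend module is named pytorch_ocl and registers the
# "ocl" device type; torch.ocl.synchronize is the API named in these notes.
import torch
import pytorch_ocl  # assumed import name for this backend

dev = torch.device("ocl:0")            # first OpenCL device
x = torch.randn(1024, 1024, device=dev)
y = x @ x                              # kernels are queued asynchronously

# The 0.2.0 fix concerns the default parameter of synchronize();
# calling it with no arguments should wait on the default device.
torch.ocl.synchronize()
print(y.sum().item())
```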

New nets validated

Vision transformers: the vit_x_NN networks have been validated (a rough usage sketch follows).
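A rough illustration of how one of these vision transformers might be run on the backend. This is a sketch only: the torchvision model vit_b_16 is used as a stand-in for the vit_x_NN family, and the pytorch_ocl import name and ocl device string are assumptions.

```python
# Sketch: running a torchvision vision transformer on the OpenCL backend.
# Assumptions: module name pytorch_ocl, device type "ocl", and torchvision's
# vit_b_16 as one representative of the vit_x_NN networks mentioned above.
import torch
import pytorch_ocl                      # assumed import name
from torchvision.models import vit_b_16

model = vit_b_16(weights=None).eval().to("ocl:0")
img = torch.randn(1, 3, 224, 224, device="ocl:0")
with torch.no_grad():
    logits = model(img)
print(logits.shape)                     # expected: torch.Size([1, 1000])
```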

New operators implemented:

  • resize_, arange, mm, bmm, amin, amax, addmm, _native_multi_head_attention and transform_bias_rescale_qkv, round, maximum, minimum, prod, atan, dropout_native (several of these are exercised in the sketch after this list)
  • lt, le, gt, ge, eq, ne for tensors
  • bitwise ^, |, &, ~
  • upsample_2d: bilinear, nearest and nearest-exact, forward and backward
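A quick smoke test touching several of the operators above in one place. As before, the pytorch_ocl import and the ocl device string are assumptions; the operator calls themselves are standard PyTorch.

```python
# Sketch: exercising a handful of the newly implemented operators on the
# OpenCL device. Assumes the pytorch_ocl module and "ocl" device type.
import torch
import pytorch_ocl  # assumed import name

dev = "ocl:0"
a = torch.randn(4, 5, device=dev)
b = torch.randn(5, 3, device=dev)

c = torch.mm(a, b)                              # mm
d = torch.bmm(a.unsqueeze(0), b.unsqueeze(0))   # bmm
mn, mx = a.amin(dim=1), a.amax(dim=1)           # amin / amax
r = torch.arange(10, device=dev)                # arange
mask = a.gt(0) & a.lt(1)                        # gt / lt plus bitwise &
up = torch.nn.functional.interpolate(           # upsample_2d, bilinear
    a.reshape(1, 1, 4, 5), scale_factor=2, mode="bilinear")

print(c.shape, d.shape, mn.shape, r[-1].item(), mask.sum().item(), up.shape)
```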

Fixed operators

  • Fixed softmax and log_softmax support for a dim that is not the last dimension
  • Fixed the view operator and set_ storage handling
  • cat now supports mixed dtypes
  • Fixed handling of empty tensors with non-empty storage
  • Very limited half (fp16) tensor handling
  • Fixed tensor >, <, ==, != scalar ops (see the sketch after this list)
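A short sketch of the fixed behaviours: softmax along a non-last dimension, cat over mixed dtypes, and tensor-versus-scalar comparisons. The pytorch_ocl import and ocl device string remain assumptions.

```python
# Sketch: the fixed operator behaviours listed above.
# Assumes the pytorch_ocl module and "ocl" device type.
import torch
import pytorch_ocl  # assumed import name

dev = "ocl:0"
x = torch.randn(3, 4, device=dev)

s = torch.softmax(x, dim=0)              # softmax along a non-last dim
ls = torch.log_softmax(x, dim=0)

ints = torch.ones(3, 4, dtype=torch.int32, device=dev)
mixed = torch.cat([x, ints], dim=1)      # cat over mixed dtypes

v = x.view(4, 3)                         # view after the view/set_ fix
cmp = (x > 0.5) != (x == 0.0)            # tensor-vs-scalar comparison ops
print(s.shape, ls.shape, mixed.dtype, v.shape, cmp.any().item())
```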

New features:

  • Added support for profiling via the torch.ocl.profile API
  • Improved benchmark scripts (a hand-timing sketch follows this list)
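The exact signature of torch.ocl.profile is not shown in these notes, so the sketch below does not call it; it only shows the synchronize-then-time pattern that simple benchmarking relies on and that per-kernel profiling refines. The pytorch_ocl import and ocl device string are assumptions.

```python
# Sketch: hand-timing a GEMM on the OpenCL device.
# torch.ocl.profile (mentioned above) would give per-kernel detail, but its
# signature is not documented here, so only synchronize() is used.
import time
import torch
import pytorch_ocl  # assumed import name

dev = "ocl:0"
a = torch.randn(2048, 2048, device=dev)
b = torch.randn(2048, 2048, device=dev)

torch.ocl.synchronize()                  # finish any pending work first
t0 = time.time()
c = a @ b
torch.ocl.synchronize()                  # wait for the GEMM kernel
print(f"gemm time: {time.time() - t0:.4f}s", c.shape)
```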

Performance improvements

  • Intel Arc and UHD: enabled Winograd convolution, added support for OpenCL 3.0 floating-point add atomics, and enabled k-reduction for GEMM operators
  • NVIDIA: added use of native atomic float add (via PTX assembly)
  • GELU: major speedup by fixing a faulty use of double instead of float