{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":458861054,"defaultBranch":"main","name":"tvm","ownerLogin":"cattidea","currentUserCanPush":false,"isFork":true,"isEmpty":false,"createdAt":"2022-02-13T16:13:47.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/58100262?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1650957066.776412","currentOid":""},"activityList":{"items":[{"before":"4403379e3949e3339958ee01a41b9ece9c48ea8d","after":"d1ac1c0202b3d8cb2af268ce79c2ac710554152b","ref":"refs/heads/main","pushedAt":"2024-05-13T05:45:58.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[KVCache] Fix the aux data syncing order of paged KV cache (#16988)\n\nFix the aux data syncing order of paged KV cache","shortMessageHtmlLink":"[KVCache] Fix the aux data syncing order of paged KV cache (apache#16988"}},{"before":"825dc1ffb51c25506600136d2ec8fb336f476c84","after":"4403379e3949e3339958ee01a41b9ece9c48ea8d","ref":"refs/heads/main","pushedAt":"2024-05-12T20:45:18.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[JVM] Automatic Compatibility of JVM AttachCurrentThread (#16987)\n\nDifferent JDK may have different signature for AttachCurrentThread.\r\nThis can cause issues for example between code for android and normal java.\r\nThis PR uses a helper class to enable compact with both.","shortMessageHtmlLink":"[JVM] Automatic Compatibility of JVM AttachCurrentThread (apache#16987)"}},{"before":"fffd168d00100101a29188dd099fd67d5c002320","after":"825dc1ffb51c25506600136d2ec8fb336f476c84","ref":"refs/heads/main","pushedAt":"2024-05-11T01:39:54.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[TOPI] Remove `blockIdx.z` in topi sort (#16977)\n\nAs `blockIdx.z` is not allowed in WebGPU, this PR split `blockIdx.z`\r\ninto `blockIdx.y` to support WebGPU","shortMessageHtmlLink":"[TOPI] Remove blockIdx.z in topi sort (apache#16977)"}},{"before":"c0a47ed13999881d2e6ea68e3904f5c613bbdb94","after":"fffd168d00100101a29188dd099fd67d5c002320","ref":"refs/heads/main","pushedAt":"2024-05-09T18:49:47.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[Unity][BYOC] Use arith.Analyzer to check batch equality of matmul in cublas (#16982)\n\n* [Unity][BYOC] Use arith.Analyzer to check batch equality of matmul in cublas","shortMessageHtmlLink":"[Unity][BYOC] Use arith.Analyzer to check batch equality of matmul in…"}},{"before":"819b0023e46dd85a5ae8ce6294e5456abaf78f3c","after":"c0a47ed13999881d2e6ea68e3904f5c613bbdb94","ref":"refs/heads/main","pushedAt":"2024-05-08T14:24:36.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[CUBLAS][FP8] Enable R.matmul + R.multiply offloading (#16974)\n\nThis commit enables offloading of the next pattern to cuBLAS:\r\n mm = R.linear(data, weights)\r\n scale = R.multiply(a_scale, w_scale)\r\n out = R.multiply(mm, scale)\r\n out = R.cast(out, dtype)","shortMessageHtmlLink":"[CUBLAS][FP8] Enable R.matmul + R.multiply offloading (apache#16974)"}},{"before":"28d32b52cbde45600dc14a41af7f5ef9b6b778c5","after":"819b0023e46dd85a5ae8ce6294e5456abaf78f3c","ref":"refs/heads/main","pushedAt":"2024-05-07T20:43:06.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[Relax] Support nested ModuleList in nn.Module (#16971)","shortMessageHtmlLink":"[Relax] Support nested ModuleList in nn.Module (apache#16971)"}},{"before":"9cfebca136a6dd58e59deeb19690d37cc6e9426a","after":"28d32b52cbde45600dc14a41af7f5ef9b6b778c5","ref":"refs/heads/main","pushedAt":"2024-05-06T18:17:46.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[TIR] Support narrow dtype for let binding (#16947)\n\nThe current pass `ForceNarrowIndexToI32` fails to narrow dtype for let\r\nbinding. This PR fixes the issue.\r\n\r\nBTW, this PR addresses the comments in #16934","shortMessageHtmlLink":"[TIR] Support narrow dtype for let binding (apache#16947)"}},{"before":"944d180fba18660f7846eccf4ef4931284a7d38b","after":"9cfebca136a6dd58e59deeb19690d37cc6e9426a","ref":"refs/heads/main","pushedAt":"2024-05-05T21:59:32.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[TVMScript] Fix error reporting inside Macro func (#16967)","shortMessageHtmlLink":"[TVMScript] Fix error reporting inside Macro func (apache#16967)"}},{"before":"effa5d79930b1103c36d8cc53618a6dce1ba3760","after":"944d180fba18660f7846eccf4ef4931284a7d38b","ref":"refs/heads/main","pushedAt":"2024-05-04T19:46:50.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[SVE] Add get_active_lane_mask builtin (#16965)\n\nAdds a `get_active_lane_mask` builtin and lowering to\r\n`llvm.get.active.lane.mask` intrinsic. This will be used in subsequent\r\npatches for expressing predicated buffer loads/stores in TIR. Further\r\ninformation can be found in the [RFC](https://github.com/apache/tvm-rfcs/blob/main/rfcs/0104-scalable-vectors-in-tir.md#predication).\r\n\r\nCo-authored-by: Elen Kalda \r\nCo-authored-by: Neil Hickey \r\n\r\nChange-Id: Id9d65f9f11503ad35dd0b3db4bfc81249a76f701","shortMessageHtmlLink":"[SVE] Add get_active_lane_mask builtin (apache#16965)"}},{"before":"20d769617fa6ab561d7ed2b7cd61ed2b6b4710ba","after":"effa5d79930b1103c36d8cc53618a6dce1ba3760","ref":"refs/heads/main","pushedAt":"2024-05-04T08:08:49.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[CUBLAS] Enable offloading of R.matmul + R.dequantize (#16896)\n\nThis commit enables offloading of R.matmul + R.dequantize to cuBLAS\r\ncodegen. Dequantization scale is passed to runtime function and set to\r\nalpha parameter. If there is no dequantization, then alpha == 1.0.","shortMessageHtmlLink":"[CUBLAS] Enable offloading of R.matmul + R.dequantize (apache#16896)"}},{"before":"6252fa5802c94df522306519da94b874b3a45eda","after":"20d769617fa6ab561d7ed2b7cd61ed2b6b4710ba","ref":"refs/heads/main","pushedAt":"2024-05-02T05:01:54.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[Relax] Express dynamic arguments of strided_slice as arguments (#16826)\n\n* [Relax] Express dynamic arguments of strided_slice as arguments\r\n\r\nPrior to this commit, `relax.op.strided_slice` stored the `axes`,\r\n`begin`, `end`, and `strides` in the `CallNode::attrs`. However, the\r\nattributes are only intended to store static values. The indices used\r\nused for `relax.op.strided_slice` must frequently be in terms of\r\nsymbolic shape variables, which should not be stored in the\r\nattributes. While some utilities have special handling for\r\n`relax.op.strided_slice` (e.g. `tvm::relax::Bind`), many do\r\nnot (e.g. `tvm::relax::WellFormed` and\r\n`tvm::relax::FreeSymbolicVars`). As a result, the symbolic\r\nexpressions in `relax.op.strided_slice` will fail to be updated in\r\ngeneric utilities, and will fail to trigger safeguards when this\r\noccurs.\r\n\r\nThis commit changes the representation of `relax.op.strided_slice` to\r\nstore all arguments in the `relax::CallNode::args`, rather than the\r\n`relax::CallNode::attrs`. As mentioned in a comment from\r\nhttps://github.com/apache/tvm/pull/13987, which initially implemented\r\n`relax.op.strided_slice`, this was an intended refactor once\r\n`relax::PrimValue` was fully supported.\r\n\r\n* Undo unnecessary changes in const_int_bound\r\n\r\n* Remove unnecessary changes to rewrite_simplify\r\n\r\n* lint fixes\r\n\r\n* Fix unit tests\r\n\r\n* Improve error message\r\n\r\n* Fix additional unit tests\r\n\r\n* Mark MSC tests with xfail\r\n\r\n* remove commented-out code\r\n\r\n* Resolve failing unit test\r\n\r\n* Remove unused imports","shortMessageHtmlLink":"[Relax] Express dynamic arguments of strided_slice as arguments (apac…"}},{"before":"114ad70a22f29ec62ad3e883bae90cffc5fba254","after":"6252fa5802c94df522306519da94b874b3a45eda","ref":"refs/heads/main","pushedAt":"2024-04-30T13:45:37.000Z","pushType":"push","commitsCount":3,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[TIR] Enhance CLZ intrinsic support (#16952)","shortMessageHtmlLink":"[TIR] Enhance CLZ intrinsic support (apache#16952)"}},{"before":"97ff7cc4f197ef0fa21093448dd3e45e6f1fd2bc","after":"114ad70a22f29ec62ad3e883bae90cffc5fba254","ref":"refs/heads/main","pushedAt":"2024-04-29T23:10:52.000Z","pushType":"push","commitsCount":13,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[TOPI] Revert unification of conv2d NHWC hybrid scheduling for `arm_cpu` targets (#16951)\n\nThis patch partly reverts the unification of scalable and non-scalable scheduling of conv2d NHWC for `arm_cpu` targets introduced in #16899.\r\n\r\nThe non-scalable schedule for float32 splits the N axis (corresponding to number of output channels) by 16 in both the unified and the nonunified schedule versions, and then additionally splits the inner partitions by 4 in only the nonunified version to which this patch is reverting (first added in #16106). The two versions' behaviour would be equivalent if none of the padding on the N axis was removed during lowering, however we allow for that to happen as it proved to increase performance for very small convolutions.\r\n\r\nAs it stands, there seems to be a regression in cases where the datatype is float32 and the number of output channels is greater than 16, a multiple of 4, and not a multiple of 16, because even with the removed padding the nonunified schedule is able to vectorise over 4 elements, while the unified version cannot vectorise over 16 elements anymore.\r\n\r\nSince all of the conv2d NHWC hybrid topi test cases used numbers of output channels either less than 16 or divisible by 16, this patch also adds a new case which falls in the aforementioned regression area.","shortMessageHtmlLink":"[TOPI] Revert unification of conv2d NHWC hybrid scheduling for `arm_c…"}},{"before":"278a6af085d1a149bc9ae4ff4a7ac4b33fc6b6bb","after":"97ff7cc4f197ef0fa21093448dd3e45e6f1fd2bc","ref":"refs/heads/main","pushedAt":"2024-04-27T02:16:01.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[VM][OPENCL] Take advantage of OpenCL host ptr for improved copy (#16929)\n\nWe can use OpenCL mapped pointer for these copies for\r\nimproved performance.","shortMessageHtmlLink":"[VM][OPENCL] Take advantage of OpenCL host ptr for improved copy (apa…"}},{"before":"51cfb70f868c057d0d73aa60bc96b99ce722ecd2","after":"278a6af085d1a149bc9ae4ff4a7ac4b33fc6b6bb","ref":"refs/heads/main","pushedAt":"2024-04-26T19:18:14.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[Relax][TIR] Introduce new `cumsum` op for gpu (#16934)","shortMessageHtmlLink":"[Relax][TIR] Introduce new cumsum op for gpu (apache#16934)"}},{"before":"39f2482580b57fa5b1f6c1a1dc0e6f5e823ee4c0","after":"51cfb70f868c057d0d73aa60bc96b99ce722ecd2","ref":"refs/heads/main","pushedAt":"2024-04-26T06:06:00.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[Fix][Dlight] Fix GeneralReduction for log-sum-exp (#16923)\n\nThis PR fixes the GeneralReduction dlight rule so that it can support\r\nscheduling log-sum-exp function.\r\n\r\nPrior to this issue, the rule makes a strong assumption on the pattern\r\nof the given function, which allows scheduling softmax, but fails to\r\nschedule log-sum-exp due to pattern mismatch. This PR enhances the rule\r\nand makes it able to match the pattern of log-sum-exp and apply\r\nsubsequent scheduling.\r\n\r\nA regression test is added.","shortMessageHtmlLink":"[Fix][Dlight] Fix GeneralReduction for log-sum-exp (apache#16923)"}},{"before":"4f8c03fad393c360008f1fb208f117c66c04090c","after":"39f2482580b57fa5b1f6c1a1dc0e6f5e823ee4c0","ref":"refs/heads/main","pushedAt":"2024-04-25T19:34:37.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[Fix] Fix SSA conversion for SizeVar retention (#16924)\n\nThis PR fixes the var construction in IRConvertSSA, which always casts\r\nSizeVar to Var. This behavior leads to expr not being able to get\r\nsimplified in the LowerIntrin pass later on. Specifically, if not using\r\nSizeVar, the LowerIntrin pass loses the information of the non-negative\r\nvar information, and cannot simply a bunch of FloorDiv/FloorMod\r\nexpressions.\r\n\r\nOne regression test for SplitHostDevice is added to ensure the retention\r\nof SizeVar. Adding the test in SplitHostDevice because this is where\r\nthe SSA conversion is used.","shortMessageHtmlLink":"[Fix] Fix SSA conversion for SizeVar retention (apache#16924)"}},{"before":"11f2253b9cc22ff354e7f13df2d5a55feae01259","after":"4f8c03fad393c360008f1fb208f117c66c04090c","ref":"refs/heads/main","pushedAt":"2024-04-24T13:48:08.000Z","pushType":"push","commitsCount":3,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[TVMScript] Support `T.launch_thread` with i64 dtype (#16916)\n\nThis PR fixes the bug of mismatched dtype in `T.launch_thread` when the dtype is `i64`.","shortMessageHtmlLink":"[TVMScript] Support T.launch_thread with i64 dtype (apache#16916)"}},{"before":"b0143d106f53ed811ec81612b2c88bea988b4323","after":"11f2253b9cc22ff354e7f13df2d5a55feae01259","ref":"refs/heads/main","pushedAt":"2024-04-23T17:22:12.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"Restore \"pytest.mark.gpu\" for RELAX tests (#16741)\n\n* [TEST] Mark RELAX GPU tests with pytest.mark.gpu\r\n\r\nMissed pytest.mark.gpu prevents tests from launch in CI.\r\n\r\nSigned-off-by: Alexander Peskov \r\n\r\n* fix\r\n\r\nSigned-off-by: Alexander Peskov \r\n\r\n* Check fp8 compute capability\r\n\r\nSigned-off-by: Alexander Peskov \r\n\r\n* fix func signature\r\n\r\nSigned-off-by: Alexander Peskov \r\n\r\n* lint\r\n\r\nSigned-off-by: Alexander Peskov \r\n\r\n---------\r\n\r\nSigned-off-by: Alexander Peskov \r\nCo-authored-by: Alexander Peskov ","shortMessageHtmlLink":"Restore \"pytest.mark.gpu\" for RELAX tests (apache#16741)"}},{"before":"57316dae1497c36ed57732a7a610018a990f1927","after":"b0143d106f53ed811ec81612b2c88bea988b4323","ref":"refs/heads/main","pushedAt":"2024-04-22T17:40:37.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[CMAKE] Make LOG_BEFORE_THROW explicit (#16914)\n\nThis PR introduces an explicit option about log_fatal_before_throw.","shortMessageHtmlLink":"[CMAKE] Make LOG_BEFORE_THROW explicit (apache#16914)"}},{"before":"a2511cc5160fa73131517515c79144bef7f4b076","after":"57316dae1497c36ed57732a7a610018a990f1927","ref":"refs/heads/main","pushedAt":"2024-04-21T22:47:16.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[Web] Support string[] in setPackedFunc() and exceptionally long arrays (#16910)\n\nThere are two changes in this PR.\r\n\r\n#### Change 1: Support `string[]` in `setPackedFunc()`\r\n\r\nPrior to this PR, we cannot pass in `string[]` from typescript to\r\na TVM PackedFunc and need to convert it to `TVMArray`\r\n(for instance in `getParamsFromCacheByName()`). This may not be the\r\nmost convenient if the PackedFunc's caller is not internal to tvmjs.\r\nThus, this PR moves the conversion to `setPackedFunc()` instead.\r\n\r\n#### Change 2: Support exceptionally long TVM arrays\r\n\r\nThe second change is dealing with exceptionally long TVM arrays.\r\nIn cases like passing in a token table, we need to pass in a long\r\n`string[]` (in Llama-3's case, of size 128000), leading to JS error\r\n`RangeError: Maximum call stack size exceeded` since we treat each\r\nstring element as an argument, shown in `this.ctx.arrayMake(...inputs)`. \r\n\r\nThis PR sets an empirical call stack limit of 30000 and chunks the\r\narray elements in `makeTVMArray()`, converting each chunk to its\r\nown TVMArray. Then we concatenate them with the newly implemented\r\n`runtime.ArrayConcat` that concatenates N TVMArrays.\r\n\r\nTested end-to-end in WebLLM.","shortMessageHtmlLink":"[Web] Support string[] in setPackedFunc() and exceptionally long arra…"}},{"before":"2978427c2a804888a0911a2dc78865871a0afcd1","after":"a2511cc5160fa73131517515c79144bef7f4b076","ref":"refs/heads/main","pushedAt":"2024-04-20T09:55:27.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[QoL][Relax] Use SeqExpr in IR types when SeqExpr is required (#16859)\n\n* [QoL][Relax] Use SeqExpr in IR types when SeqExpr is required\r\n\r\nThe Relax IR requires the `FunctionNode::body`, `IfNode::true_branch`,\r\nand `IfNode::false_branch` to be instances of `relax::SeqExpr`.\r\nIf these Relax requirements are violated, correctly-implemented\r\ntransformations may raise exceptsion\r\n(e.g. from `Downcast` in `Downcast(func->body)->blocks`), or\r\neven segfault (e.g. when `.as` returns a nullptr in\r\n`func->body.as()->blocks`). Debugging these failure\r\nmodes is also difficult, as even the TVMScript printer relies on the\r\nbody of the function being a `SeqExprNode`.\r\n\r\nThis commit updates the C++ type of `FunctionNode::body`,\r\n`IfNode::true_branch`, and `IfNode::false_branch` to be\r\n`relax::SeqExpr` instead of `relax::Expr`. This does not impact any\r\nwell-formed Relax IR, and allows this type of ill-formed Relax IR type\r\nto be checked at compile-time. A large number of checks applied\r\nduring TVM runtime can now be removed, as they duplicate the new\r\ncompile-time check.\r\n\r\nTo maintain backwards compatibility, this commit adds a new\r\nconstructor to `relax::SeqExpr`, which accepts a single `Expr body`\r\nargument. This constructor provides either an additional reference to\r\nthe same underlying `relax::SeqExprNode`, if `body` already contains a\r\n`relax::SeqExprNode`, and otherwise wraps the body in a\r\n`relax::SeqExpr`. For implementations that previously produced\r\nwell-formed Relax IR, this change has no effect. For implementations\r\nthat previously produced ill-formed Relax IR, this change results in\r\nthe equivalent well-formed Relax IR.\r\n\r\nAlternate implementations considered:\r\n\r\n* Perform the backwards-compatibility wrapping within the\r\n `relax::Function` and `relax::If` constructors. While this would\r\n provide the intended conversion when these constructors are used,\r\n Relax transforms make frequent use of copy-on-write\r\n (e.g. `func.CopyOnWrite()->body = new_body`), which does not use the\r\n constructor. Maintaining backwards compatibility for this usage\r\n requires the implicit conversion constructor that was chosen for\r\n this PR.\r\n\r\n* Remove the Relax IR requirement for these expressions to be\r\n `SeqExpr`. While this would make Relax more internally consistent,\r\n such a change would break backwards compatibility that relies on\r\n `SeqExpr` being present. While the callsites within TVM could be\r\n updated to resolve this breakage, callsites outside of TVM\r\n (e.g. MLC-LLM) could not. Exposing the special case within the C++\r\n type, as done in this PR, maintains backwards compatibility.\r\n\r\n* Resolve breakages in unit tests\r\n\r\nAll breakage was the result of callers relying on ill-formed Relax\r\nmaintaining that specific type form of ill-formed-ness.","shortMessageHtmlLink":"[QoL][Relax] Use SeqExpr in IR types when SeqExpr is required (apache…"}},{"before":"622bd150dd331780eb41a1c67c65aae802eb9b20","after":"2978427c2a804888a0911a2dc78865871a0afcd1","ref":"refs/heads/main","pushedAt":"2024-04-19T15:59:36.000Z","pushType":"push","commitsCount":3,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[Relax] Prevent to generate duplicate func in dispatch_sort_scan (#16904)\n\nThe current pass would generate multiple PrimFuncs even if they are\r\nstructural equal, which is because `bb.update_func` will not check\r\nwhether the new func is already in the list.\r\n\r\nThis PR apply dlight at the end of the dispatching instead of after\r\nevery function.","shortMessageHtmlLink":"[Relax] Prevent to generate duplicate func in dispatch_sort_scan (apa…"}},{"before":"de91c5ca94ae87030ac697fc49aea5f89ce375d0","after":"622bd150dd331780eb41a1c67c65aae802eb9b20","ref":"refs/heads/main","pushedAt":"2024-04-19T05:38:31.000Z","pushType":"push","commitsCount":4,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[Relax] Handle binary operations between Tensor and PrimValue (#16827)\n\n* [Relax] Handle binary operations between Tensor and PrimValue\r\n\r\nPrior to this commit, binary operations were only defined between two\r\ntensors. This commit allows binary operations to apply between a\r\ntensor and a `relax::PrimValue`.\r\n\r\nWhen inferring the output `StructInfo`, binary operations with a\r\n`PrimValue` produce the same output as using a 0-d tensor. When\r\nlegalizing operations containing a `PrimValue`, they are lowered to\r\nprimitive TIR arguments.\r\n\r\n* Fix unit tests\r\n\r\n* Restore ICHECK for scalar TIR variable\r\n\r\n* Fix a few more unit tests\r\n\r\n* Remove handling of ObjectStructInfo\r\n\r\n* Undo commenting-out of test cases\r\n\r\n* Update for improved error messages\r\n\r\n* Fix failing unit tests\r\n\r\n* Fix unit test","shortMessageHtmlLink":"[Relax] Handle binary operations between Tensor and PrimValue (apache…"}},{"before":"d4056ca79571d4265a12beeedd1b1565953df936","after":"de91c5ca94ae87030ac697fc49aea5f89ce375d0","ref":"refs/heads/main","pushedAt":"2024-04-18T03:06:18.000Z","pushType":"push","commitsCount":14,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[Bugfix] rocm shared memory issue on MI250 (#16901)\n\n* [Bugfix] rocm shared memory issue on MI250","shortMessageHtmlLink":"[Bugfix] rocm shared memory issue on MI250 (apache#16901)"}},{"before":"a64d1f1cc37da7f202d943c2bea7eb747e624599","after":"d4056ca79571d4265a12beeedd1b1565953df936","ref":"refs/heads/main","pushedAt":"2024-04-15T21:27:55.000Z","pushType":"push","commitsCount":2,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[SVE] Support splitting by vscale in `tir::split` and `te::split` (#16862)\n\nThis commit adds support for splitting via the compile-time unknown\r\nconstant `vscale`. Two main changes are introduced; they are described\r\nbelow.\r\n\r\nThe split scheduling primitive has a new parameter disable_predication\r\nthat allows the user to avoid introducing a block-level predicate when\r\nsplitting with a factor of `vscale`. This feature is useful when schedule\r\nwriters know that the loop they're splitting is a factor of the scalable\r\nvector length for their target. Otherwise, a predicate must be introduced\r\ndue to the nature of `vscale`.\r\n\r\nCanProve has been extended to prove expressions that use multiple\r\ninstances of `vscale`. Known possible scalar values of the `vscale`\r\nintrinsic are iterated over and substituted into the expression. If\r\nthe expression holds true for each possible value, we can conclude the\r\nexpression true. Currently only support for an SVE target has been\r\nadded, but it is possible to extend to other targets as/when needed. If\r\nthe analyzer becomes more powerful in the future and is able to deal\r\nwith multiple instances of a symbolic value in an expression, this\r\nfeature can be removed.\r\n\r\n---------\r\n\r\nCo-authored-by: Elen Kalda \r\nCo-authored-by: Neil Hickey ","shortMessageHtmlLink":"[SVE] Support splitting by vscale in tir::split and te::split (ap…"}},{"before":"64911ab5da3640be4d9fb675513e57b742e188b1","after":"a64d1f1cc37da7f202d943c2bea7eb747e624599","ref":"refs/heads/main","pushedAt":"2024-04-15T12:14:45.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[TIR] Make T.reinterpret nop when dtype is the same (#16879)\n\n* [TIR] Make T.reinterpret nop when dtype is the same\r\n\r\n* fix scalable vec handling","shortMessageHtmlLink":"[TIR] Make T.reinterpret nop when dtype is the same (apache#16879)"}},{"before":"0a3fe22208329edc596db0116752b3259f5d90a2","after":"64911ab5da3640be4d9fb675513e57b742e188b1","ref":"refs/heads/main","pushedAt":"2024-04-14T08:10:00.000Z","pushType":"push","commitsCount":4,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[Runtime] Implemented Datatype.itemsize() (#16880)\n\n* [Runtime] Implemented Datatype.itemsize()","shortMessageHtmlLink":"[Runtime] Implemented Datatype.itemsize() (apache#16880)"}},{"before":"c67a05538642d24c75555b02103800d7f6a1ceaf","after":"0a3fe22208329edc596db0116752b3259f5d90a2","ref":"refs/heads/main","pushedAt":"2024-04-12T22:31:31.000Z","pushType":"push","commitsCount":5,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[Relax] Enhance symbolic expr estimation in memory planning (#16872)\n\nThis PR enhances the symbolic expression upper bound estimation in\r\nstatic memory planning.\r\n\r\nPrior to this PR, we are not able to estimate the upper bound of\r\n`a * b` when `a` has an upper bound while `b` does not. This PR\r\nenhances the estimation with arith::IntSet.\r\n\r\nWe introduce another TIR attribute `tir_non_negative_var` to indicate\r\nthe non-negative TIR variables for memory planning use.\r\n\r\nA new unit test is introduced for this enhancement.","shortMessageHtmlLink":"[Relax] Enhance symbolic expr estimation in memory planning (apache#1…"}},{"before":"6748215b427fbfd7b7682836d4199a8a71ddb263","after":"c67a05538642d24c75555b02103800d7f6a1ceaf","ref":"refs/heads/main","pushedAt":"2024-04-11T09:13:42.000Z","pushType":"push","commitsCount":3,"pusher":{"login":"pull[bot]","name":null,"path":"/apps/pull","primaryAvatarUrl":"https://avatars.githubusercontent.com/in/12910?s=80&v=4"},"commit":{"message":"[BugFix][Target] Added null check to fix segfault at ->defined() in cpu.cc DetectSystemTriple() (#16766)\n\nI ran into a problem running a very simple ONNX compile, i would get a segfault at a FoldConstantExpr() call from TVMC. **This only happens if the compile flag `set(USE_LLVM OFF)` is OFF.**\r\n\r\n```\r\nThread 1 \"python3\" received signal SIGSEGV, Segmentation fault.\r\n0x00007fffc94ac78c in tvm::runtime::ObjectPtr::operator!=(decltype(nullptr)) const (this=0x0, null=) at /home/otto/tvm/include/tvm/runtime/object.h:470\r\n470 bool operator!=(std::nullptr_t null) const { return data_ != null; }\r\n```\r\n\r\nI had compiled TVM Using GCC:\r\n```\r\nCOLLECT_GCC=gcc\r\nCOLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper\r\nOFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa\r\nOFFLOAD_TARGET_DEFAULT=1\r\nTarget: x86_64-linux-gnu\r\nConfigured with: ../src/configure -v --with-pkgversion='Ubuntu 11.4.0-1ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-11 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-serialization=2\r\nThread model: posix\r\nSupported LTO compression algorithms: zlib zstd\r\ngcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04) \r\n```\r\n\r\nThis was caused by a call to defined() from DetectSystemTriple() in cpu.cc that was added in #16513. When the previous call\r\n```\r\nauto pf = tvm::runtime::Registry::Get(\"target.llvm_get_system_triple\");\r\n```\r\nwould fail, and return null. The consecutive call to defined() would segfault after trying to dereference the null value. This commit adds a check to see if the function pointer is null. This might not be the best solution, but it worked for me, so it might also help someone else struggling with this. Please suggest a better solution, if you know one.\r\n\r\nCo-authored-by: Luke Hutton ","shortMessageHtmlLink":"[BugFix][Target] Added null check to fix segfault at ->defined() in c…"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAESApE9AA","startCursor":null,"endCursor":null}},"title":"Activity · cattidea/tvm"}