[Fixbug] Fix dynamic memcpy bug (#427)
Minimal failure case:

```
resize_inputs: Tensor = symbol([1, 3, "h", "w"], dtype="int32", device="cpu")
resize_outputs = self.resize(resize_inputs.to(self.dtype, self.device))  # (float32, cuda)
resize_graph: FlowGraph = trace_from(resize_outputs, resize_inputs)
resize_graph.build()
```

This compiles to the following launch function, in which the symbols `h` and `w` are undefined:

```
DLL void hidet_launch_0(float * __restrict__ x, float * __restrict__ y) {
  cudaMemcpyAsync(y, x, (4 * ((3 * h) * w)), cudaMemcpyHostToDevice, (cudaStream_t)get_cuda_stream());
}
```

The fix is to add `exprs` to `BlackBoxStmt` so that the symbols used in those expressions can be visited during codegen.
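The idea behind the fix can be illustrated with a small, self-contained toy model. This is only a sketch and not hidet's actual IR or API: `ToyBlackBoxStmt`, `Symbol`, `Const`, `Mul`, `to_c`, `collect_symbols`, `symbols_used`, and `render` are all hypothetical names invented for illustration. It shows that when the byte-count expression is pre-rendered into the statement's text, a symbol-collecting pass finds nothing, whereas keeping the expression in `exprs` lets the pass discover `h` and `w`.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


# Hypothetical toy IR nodes (not hidet's real classes).
@dataclass(frozen=True)
class Symbol:
    name: str


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Mul:
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Symbol, Const, Mul]


@dataclass(frozen=True)
class ToyBlackBoxStmt:
    """Opaque statement text with '{}' placeholders filled from `exprs`."""
    template: str
    exprs: Tuple[Expr, ...] = ()


def to_c(e: Expr) -> str:
    """Render an expression as C source text."""
    if isinstance(e, Symbol):
        return e.name
    if isinstance(e, Const):
        return str(e.value)
    return f"({to_c(e.lhs)} * {to_c(e.rhs)})"


def collect_symbols(e: Expr, found: List[str]) -> None:
    """Append each symbol name in `e` to `found`, in first-seen order."""
    if isinstance(e, Symbol):
        if e.name not in found:
            found.append(e.name)
    elif isinstance(e, Mul):
        collect_symbols(e.lhs, found)
        collect_symbols(e.rhs, found)


def symbols_used(stmt: ToyBlackBoxStmt) -> List[str]:
    """A visitor can only see symbols that live in `stmt.exprs`."""
    found: List[str] = []
    for e in stmt.exprs:
        collect_symbols(e, found)
    return found


def render(stmt: ToyBlackBoxStmt) -> str:
    """Substitute the rendered expressions into the template."""
    return stmt.template.format(*(to_c(e) for e in stmt.exprs))


# Byte count of a float32 tensor of shape [1, 3, h, w].
nbytes = Mul(Const(4), Mul(Mul(Const(3), Symbol("h")), Symbol("w")))

# Expression baked into the text: the symbols are invisible to the visitor.
opaque = ToyBlackBoxStmt("cudaMemcpyAsync(y, x, " + to_c(nbytes) + ", kind, stream);")

# Expression kept as a node: the visitor can find `h` and `w`.
tracked = ToyBlackBoxStmt("cudaMemcpyAsync(y, x, {}, kind, stream);", (nbytes,))

print(symbols_used(opaque))
print(symbols_used(tracked))
print(render(tracked))
```

Both statements render to the same text, but only the second exposes its symbols to a traversal, which is what a code generator would need in order to declare or pass them.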