Implement support in Java for shared memory buffers #7
Hello @ctrueden,

JDLL creates a script that is able to recreate, on the Python side, the ImgLib2 array from the Java side:

```python
# Create the shared memory object pointing to the shm block of interest with the wanted size
input_06112023_151243_shm_a9b8d59c_0087_4394_a9dd_fb54e23235e6 = shared_memory.SharedMemory(name='b024e140-5780-4ec6-a45d-f51cfa2ce3e7', size=20545536)
# Recreate the data array with the data from the shm block
input_06112023_151243 = xr.DataArray(np.ndarray(5136384, dtype='float32', buffer=input_06112023_151243_shm_a9b8d59c_0087_4394_a9dd_fb54e23235e6.buf).reshape([1, 304, 512, 33]), dims=["b", "y", "x", "c"], name="output")
```

All the code above is then fed to the Python worker to be executed. As you can see, the memory location, size and shape are hardcoded on the Java side (done here). Then, once the result is obtained on the Python side, the shm block is created and the information needed to access that memory block is sent to Java, together with the data type, the shape and whether the array is in Fortran order or not. On the Python side:

```python
shm_out_list = []

def convertNpIntoDic(np_arr):
    shm = shared_memory.SharedMemory(create=True, size=np_arr.nbytes)
    aux_np_arr = np.ndarray((np_arr.size), dtype=np_arr.dtype, buffer=shm.buf)
    aux_np_arr[:] = np_arr.flatten()
    shm_out_list.append(shm)
    shm.unlink()
    return {"data": shm.name, "shape": np_arr.shape, "appose_data_type__06112023_151246": "np_arr", "is_fortran": np.isfortran(np_arr), "dtype": str(np_arr.dtype)}

def convertXrIntoDic(xr_arr):
    shm = shared_memory.SharedMemory(create=True, size=xr_arr.values.nbytes)
    aux_np_arr = np.ndarray((xr_arr.values.size), dtype=xr_arr.values.dtype, buffer=shm.buf)
    aux_np_arr[:] = xr_arr.values.flatten()
    shm_out_list.append(shm)
    shm.unlink()
    return {"data": shm.name, "shape": xr_arr.shape, "axes": "".join(xr_arr.dims), "name": xr_arr.name, "appose_data_type__06112023_151246": "tensor", "is_fortran": np.isfortran(xr_arr.values), "dtype": str(xr_arr.values.dtype)}
```

The dictionary returned is then used on the Java side to recreate a JDLL tensor or ImgLib2 array. Performance is greatly improved (at least on my old laptop) because stdout no longer needs to flush millions of numbers.

Similar logic could be implemented in Appose to support sending higher-level objects such as NumPy arrays or ImgLib2 images.

Regards,
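For readers skimming the thread, here is a minimal sketch of what a decoding step like the one described above might look like on the Java side. It is not the JDLL method referenced in the comment: it assumes a float32 array, assumes the flat data is already available as a `ByteBuffer` over the named shm block (which is exactly what the `SharedMemory` class discussed in this issue would provide), and ignores the Fortran-order flag for brevity.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

import net.imglib2.img.array.ArrayImg;
import net.imglib2.img.array.ArrayImgs;
import net.imglib2.img.basictypeaccess.array.FloatArray;
import net.imglib2.type.numeric.real.FloatType;

public class ShmDecodeSketch {

	/**
	 * Rebuilds an ImgLib2 image from the metadata sent by the Python worker:
	 * the dimensions from the "shape" field, plus the raw bytes of the named
	 * shm block (obtained elsewhere, e.g. via a SharedMemory-style wrapper).
	 */
	public static ArrayImg<FloatType, FloatArray> decodeFloat32(ByteBuffer shmBytes, long[] shape) {
		// Assumes a little-endian host; NumPy writes values in native byte order.
		shmBytes.order(ByteOrder.LITTLE_ENDIAN);

		// Copy the flat pixel data out of the shared block into a Java array.
		FloatBuffer floats = shmBytes.asFloatBuffer();
		float[] data = new float[floats.remaining()];
		floats.get(data);

		// Wrap the flat array as an ImgLib2 image. Depending on whether the
		// sender used C or Fortran order, the shape may need to be reversed.
		return ArrayImgs.floats(data, shape);
	}
}
```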
@carlosuc3m This is exciting news! I am especially happy to hear that this improved performance for you. How straightforward do you think it will be to migrate the logic from your JDLL implementation into Appose?
I do not think it would be difficult to implement the logic on the Python side, but we would need to decide how to approach it. Currently, the dictionary of variables is encoded, flushed and read by the other process, and once it is decoded we directly have the original variables. To send NumPy/RandomAccessibleInterval arrays we need to send the memory location (String), the data type (String), the shape (list of longs) and whether the array is in Fortran order or not (boolean). In the case of the Java worker sending a RAI to Python, the NumPy array decoding can be done in two ways, in my opinion.
And then in the encoding stage, still on the Java side:
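The original Java snippet is not reproduced in this thread, so what follows is only a rough sketch of what such an encoding step could look like, assuming a float32 `RandomAccessibleInterval`, an already-created shared memory block (name plus `ByteBuffer`), and field names mirroring the Python snippet earlier in the thread; none of this is actual JDLL or Appose code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.HashMap;
import java.util.Map;

import net.imglib2.Cursor;
import net.imglib2.RandomAccessibleInterval;
import net.imglib2.type.numeric.real.FloatType;
import net.imglib2.util.Intervals;
import net.imglib2.view.Views;

public class ShmEncodeSketch {

	/**
	 * Copies a float32 image into an already-created shared memory block and
	 * builds the metadata map that would be serialized and sent to Python.
	 * The block itself (represented here only by its name and ByteBuffer)
	 * would come from the SharedMemory class this issue is about.
	 */
	public static Map<String, Object> encodeFloat32(
			RandomAccessibleInterval<FloatType> image,
			String shmName, ByteBuffer shmBuffer) {
		shmBuffer.order(ByteOrder.LITTLE_ENDIAN);

		// Flatten the image into the shared block; ImgLib2 flat iteration
		// varies dimension 0 fastest, i.e. Fortran order for this shape.
		Cursor<FloatType> cursor = Views.flatIterable(image).cursor();
		while (cursor.hasNext()) {
			shmBuffer.putFloat(cursor.next().get());
		}

		// Metadata the Python side needs to rebuild the ndarray.
		Map<String, Object> dict = new HashMap<>();
		dict.put("data", shmName);                                  // shm block name
		dict.put("shape", Intervals.dimensionsAsLongArray(image));  // array shape
		dict.put("dtype", "float32");                               // element type
		dict.put("is_fortran", true);                               // see comment above
		return dict;
	}
}
```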
Finally, on the Python side after decoding, iterate over the variables and, if any of them contains the specific identifier, convert it to a NumPy array. A similar approach can be used to send outputs from Python to Java.
In both cases, the shared memory segments created by either process should be handled carefully to ensure that they are unlinked once they are no longer used.
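One way to make that lifecycle explicit on the Java side (purely a sketch; it assumes the planned `SharedMemory` class implements `AutoCloseable` and exposes `create`, `buffer` and `unlink` methods, which this issue does not confirm) would be try-with-resources:

```java
// Hypothetical usage sketch: SharedMemory, create(), buffer() and unlink()
// are placeholders for the API this issue is asking for, not real Appose code.
void sendTensor(long numBytes) {
	try (SharedMemory shm = SharedMemory.create("appose-tensor-0001", numBytes)) {
		java.nio.ByteBuffer buf = shm.buffer();
		// ... copy the pixel data into buf and send the shm name plus metadata ...
	} // close() releases the mapping; whichever side finishes last calls unlink().
}
```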
This issue has been addressed by apposed/appose-java#5. Big thanks to @tpietzsch and @carlosuc3m for their efforts. 🍻
And for the record, the Python implementation maintains feature parity with Java thanks to apposed/appose-python#1.
I started implementing a `SharedMemory` class in Appose Java, which is a direct translation of Python's handy `multiprocessing.shared_memory.SharedMemory` class. It leans on JNA for the low-level system calls. But it is not yet fully working. This needs to be finished, so that we can share memory easily between processes!

I'm hoping the JNA-based approach will work directly, but there are other options as well.
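As a rough illustration of what the JNA-based route involves (a sketch only, written against 64-bit Linux with hand-picked constants; it is not the actual Appose implementation, which would also need macOS/Windows handling, error codes and unmapping):

```java
import com.sun.jna.Library;
import com.sun.jna.Native;
import com.sun.jna.Pointer;

public class ShmJnaSketch {

	/** Minimal JNA binding of the POSIX shared memory calls (librt on Linux). */
	public interface LibRT extends Library {
		LibRT INSTANCE = Native.load("rt", LibRT.class);

		int shm_open(String name, int oflag, int mode);
		int shm_unlink(String name);
		int ftruncate(int fd, long length);
		Pointer mmap(Pointer addr, long length, int prot, int flags, int fd, long offset);
		int munmap(Pointer addr, long length);
		int close(int fd);
	}

	// Linux x86-64 values; these constants differ on macOS and need per-OS handling.
	private static final int O_RDWR = 0x0002, O_CREAT = 0x0040;
	private static final int PROT_READ = 0x1, PROT_WRITE = 0x2, MAP_SHARED = 0x01;

	/** Creates a named POSIX shared memory block and maps it into this process. */
	public static Pointer create(String name, long size) {
		int fd = LibRT.INSTANCE.shm_open(name, O_RDWR | O_CREAT, 0600);
		if (fd < 0) throw new RuntimeException("shm_open failed for " + name);
		if (LibRT.INSTANCE.ftruncate(fd, size) != 0)
			throw new RuntimeException("ftruncate failed");
		Pointer ptr = LibRT.INSTANCE.mmap(null, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
		// A real implementation would check for MAP_FAILED ((void*) -1) here.
		LibRT.INSTANCE.close(fd); // the mapping remains valid after closing the fd
		return ptr;
	}
}
```

A Windows path would have to go through CreateFileMapping/MapViewOfFile instead of shm_open/mmap, which is part of what makes a cross-platform `SharedMemory` class non-trivial.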