Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix pointers #322

Merged
merged 3 commits into from
Dec 13, 2024
Merged

Fix pointers #322

merged 3 commits into from
Dec 13, 2024

Conversation

NikoOinonen
Copy link
Collaborator

Fixes #321

The force field arrays were left as dangling pointers when running multiple computational steps in a row, which manifested in #316. This fixes the issue by setting some cleanup code to run on garbage collection. It's not very pretty since you also have to manually run the garbage collector to make sure that the pointers are deleted in time, but it works.

@ProkopHapala
Copy link
Collaborator

ProkopHapala commented Dec 11, 2024

Before we merge it, let me understand in more detail what is happening.

  1. the memory is allocated on the side of Python (as numpy array), there is no allocation on C++ side (?)
  2. The numpy arrays are garbage collected by python, C++ does not know tat, but does not care. As long as the pointer is not used in C++ after the numpy arrays is garbage collected, there should be no problem (?)
  3. when you run multiple computation, you pass new numpy array pointers to C++. Since C++ library has only single global state, you overwrite the old pointers by new ones. C++ does not care what happen with the old invalid pointer.

So where comes the problem ?

Basically my idea was that responsibility over memory allocation/deallocation should be on python side (since it is more clever, it has garbage collector), and C++ only "borrows" pointers to memory allocated in python, and since C++ is dumb I wanted to dessing it that C++ can be totally oblivious to the lifetime of the pointed memory (only python should care about that).

The only problem I see when python deallocate memory which is still used in C++, but that we can handle by keeping the variable around (in scope) until C++ calculation is finished.

If we initialize new arrays in python which creates new numpy arrays, we must not forget to update the pointers on C++ side (there are function setPointer() in the interface for exactly that purpose, but maybe these are not dano meticulously enough)

Another problem which cannot be solved this way, si to run multiple instances of the simulator in python, which is obviously not possible on C++ side as the library has only one global state. But I guess this is not the problem now.

@NikoOinonen
Copy link
Collaborator Author

when you run multiple computation, you pass new numpy array pointers to C++. Since C++ library has only single global state, you overwrite the old pointers by new ones. C++ does not care what happen with the old invalid pointer.

The problem is that you don't necessarily pass new pointers in the next computation for all of the old pointers. Specifically there was an issue where the pointer for the energy array was set in a previous computation, but the next computation does not want to compute the energy, so it does not allocate a new numpy array for it. However, the pointer for the energy array from the previous computation is still there, so the C++ code tries to compute it anyways since it does not know that the pointer is not valid anymore.

For the force field pointer this may not be an issue since all of the current routines always use it, but I set it to delete it anyways just in case.

@NikoOinonen
Copy link
Collaborator Author

Another problem which cannot be solved this way, si to run multiple instances of the simulator in python, which is obviously not possible on C++ side as the library has only one global state. But I guess this is not the problem now.

Yes, this is just a quick fix for the current problem. The more fundamental solution would be to keep a separate pointer for each new array on the C++ side instead of the global ones. But that's for later.

@ProkopHapala
Copy link
Collaborator

ProkopHapala commented Dec 11, 2024

The problem is that you don't necessarily pass new pointers in the next computation for all of the old pointers. Specifically there was an issue where the pointer for the energy array was set in a previous computation, but the next computation does not want to compute the energy, so it does not allocate a new numpy array for it. However, the pointer for the energy array from the previous computation is still there, so the C++ code tries to compute it anyways since it does not know that the pointer is not valid anymore.

OK, I understand now.
So the cleanest / most reliable solution seems to be to set to nullptr all pointers on the C++ after we finish the calculation.
This is what you do ?
Why you need to do so many operations of the python side then?

Sorry, I did not read the code in detail, just scan over it, and I'm a bit confused. But that is normal, these memory management issues python/C++ ownership are always messy.

@ProkopHapala
Copy link
Collaborator

OK, I looked at the code once more.

  • I see some complication is because you delete E and F pointers separately. I would maybe make single functions clean_all_pointers() so that we don't need to care so much. But perhaps there is not much difference as long as there is not too many of them.
  • It is still not quite clear why you need to call garbage collector explicitly. If C++ pointers are already deleted, the rest should be able python manage himself. But perhaps I'm missing something.

@yakutovicha
Copy link
Collaborator

My contribution here isn't necessary. I let you guys continue moving forward.

@yakutovicha yakutovicha removed their request for review December 11, 2024 11:37
@NikoOinonen
Copy link
Collaborator Author

It is still not quite clear why you need to call garbage collector explicitly. If C++ pointers are already deleted, the rest should be able python manage himself. But perhaps I'm missing something.

It's because the cleanup function is only called when the garbage collector runs, which is not guaranteed to happen before the next calculation starts, so it's called explicitly. I guess you could also call the core.delete... function explicitly at the end, but I thought that's more of an internal implementation thing. As I said, it's not ideal, but it works.

@ProkopHapala
Copy link
Collaborator

It's because the cleanup function is only called when the garbage collector runs, which is not guaranteed to happen before the next calculation starts, so it's called explicitly. I guess you could also call the core.delete... function explicitly at the end, but I thought that's more of an internal implementation thing.

Aha, I get it now.

Yes, my first instinct would be to call core.delete() explicitly at the end of calculation, or at the beginning of next calculation, since it is more transparent what's going on.

On the other hand I understand the motivation of this

core.setFF_Fpointer(FF)
weakref.finalize(FF, core.deleteFF_Fpointer)  

You want to book the destruction at the place of construction, so that programer does not have to think about (may forget).

I like it. But in that case maybe it would be best to incorporate weakref.finalize(FF, core.deleteFF_Fpointer) inside core.setFF_Fpointer so that we don't have to think about it at all ?

  • Note 1 : I was not familiar with weakref module, now I undrstand it is some standard python way to solve this problem

  • Note 2 : I approve the commit as it is. It is up to you if you want to polish it further.

Copy link
Collaborator

@ProkopHapala ProkopHapala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solve problem => approve

Possible alternatives / improvements:

  1. It is a question if it is not better / more transparent to call the core.delete* explicitly?
  2. If we want to automatize the destruction / cleanup, maybe we can set the weakref directly inside core.set*__pointer() ?

@NikoOinonen
Copy link
Collaborator Author

If we want to automatize the destruction / cleanup, maybe we can set the weakref directly inside core.set*__pointer() ?

Yes, this is a good idea. Implemented it just now.

@NikoOinonen NikoOinonen merged commit fb40490 into main Dec 13, 2024
13 checks passed
@NikoOinonen NikoOinonen deleted the fix-pointers branch December 13, 2024 09:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Running multiple computational steps in row partially with --energy option causes a segfault.
3 participants