skysat_overlap.py stuck at saving bound shapefile #25

Open
ShiruiH opened this issue Jun 23, 2023 · 6 comments

Comments

@ShiruiH

ShiruiH commented Jun 23, 2023

Hi Shashank,

Thanks for composing this great tool for processing SkySat data.

I've encountered an issue when running skysat_overlap.py. I'm processing 966 images, which I understand will generate 466095 combinations. When the script saves the bound shapefile to my predefined directory, it is extremely slow and always gets stuck (usually at 3% to 5%). I've tried changing the directory, but the issue remains.
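
For reference, the pair count quoted above is the number of unordered pairs that can be drawn from 966 images. A quick check using only the Python standard library (the variable names are illustrative):

```python
import math

n_images = 966

# Number of unordered image pairs: n * (n - 1) / 2
n_pairs = math.comb(n_images, 2)
print(n_pairs)  # 466095
```

Because the count grows quadratically with the number of images, doubling the image list roughly quadruples the number of pairs a brute-force approach has to test.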

Do you have a solution in mind that could resolve this issue?

Cheers,
Shirui

@ShashankBice
Member

Hi Shirui,
Thanks for reporting this. I can see why this procedure is slow: the logic right now is brute force, just spread across cores, so it will be slow if the machine has few cores. How many cores does the machine you are using to run these scripts have?
Regardless, I get much faster performance when finding intersections using the native rtree algorithm in geopandas in another related project, which I will port here in a day. I will let you know.
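
To illustrate why a spatial index helps, here is a minimal pure-Python sketch. It is not the code in skysat_overlap.py and it is not an R-tree: it swaps in a simple uniform grid as a stand-in so that it runs without geopandas. The footprints are hypothetical axis-aligned boxes, and `boxes_overlap`, `brute_force_pairs`, and `grid_index_pairs` are names made up for this example. The idea is the same as with an R-tree: only footprints that fall in the same neighbourhood are compared, instead of every possible pair.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical footprints as axis-aligned boxes: (xmin, ymin, xmax, ymax).
boxes = [
    (0.0, 0.0, 2.0, 2.0),
    (1.0, 1.0, 3.0, 3.0),
    (10.0, 10.0, 12.0, 12.0),
    (11.0, 9.0, 13.0, 11.0),
    (50.0, 50.0, 51.0, 51.0),
]


def boxes_overlap(a, b):
    """True if two boxes share interior area (touching edges do not count)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def brute_force_pairs(boxes):
    """Test every unordered pair: n * (n - 1) / 2 overlap checks."""
    return {
        (i, j)
        for i, j in combinations(range(len(boxes)), 2)
        if boxes_overlap(boxes[i], boxes[j])
    }


def grid_index_pairs(boxes, cell=5.0):
    """Bucket boxes into grid cells, then test only boxes sharing a cell."""
    grid = defaultdict(list)
    for idx, (xmin, ymin, xmax, ymax) in enumerate(boxes):
        for cx in range(int(xmin // cell), int(xmax // cell) + 1):
            for cy in range(int(ymin // cell), int(ymax // cell) + 1):
                grid[(cx, cy)].append(idx)

    pairs = set()
    for members in grid.values():
        # Indices were appended in increasing order, so i < j here.
        for i, j in combinations(members, 2):
            if boxes_overlap(boxes[i], boxes[j]):
                pairs.add((i, j))
    return pairs


print(sorted(grid_index_pairs(boxes)))  # [(0, 1), (2, 3)]
```

Both functions return the same pairs, but the indexed version never compares footprints that are far apart. In geopandas, the analogous candidate lookup is exposed through the `GeoDataFrame.sindex` spatial index, with an exact geometry test applied to the candidates afterwards.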

Cheers,
Shashank

@ShiruiH
Author

ShiruiH commented Jun 23, 2023

Hi Shashank,

Thanks for your prompt reply. I have 14 cores available to be used on my machine.

Looking forward to your new solution with the native rtree algorithm.

Cheers,
Shirui

@ShiruiH
Author

ShiruiH commented Jun 26, 2023

Hi Shashank,

May I know if you're running this process on an HPC with more cores? Also, do you have any update on using the rtree algorithm to find intersections?

I can see from your code that every pair of images is tested in a for loop over the whole image list, using the gpd.overlay() method to find their intersection. Are you proposing to substitute gpd.overlay() with rtree? Will you keep the for loop?

Cheers,
Shirui

@ShashankBice
Member

I am sorry for my long silence. I updated the function to use rtree for finding intersections, and it produces the same results as the earlier function with a ~13x improvement in execution wall time. If you installed the package as editable, the changes should be visible after a git pull.

Let me know how it goes,
Cheers,
Shashank

@ShiruiH
Author

ShiruiH commented Aug 1, 2023

Hi Shashank,

Sorry for the late reply. Thanks for improving the package; it works much better now!

I'm now at the ba_skysat.py step and am running into an error at line 308:
IndexError: index 1 is out of bounds for axis 0 with size 1

I think the reason is that, in my case, img_time_unique_list has only one element (sc00103), so it fails when it tries to access img_time_unique_list[1].

So I just want to confirm with you whether a string like "sc00103" is the img_time you intended to record. My files have names such as "1231553292.98343730_sc00103_c3_PAN_i0000000006". Apologies if the question is trivial!
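
The situation described above can be reproduced in isolation. This is only a sketch, not the code in ba_skysat.py: it assumes the identifier is taken from the second underscore-separated field of the file name, the helper `second_token` is made up for this example, and the second file name is invented for illustration.

```python
# Hypothetical file names; only the first is taken from the report above.
names = [
    "1231553292.98343730_sc00103_c3_PAN_i0000000006",
    "1231553293.10000000_sc00103_c2_PAN_i0000000007",
]


def second_token(name):
    """Return the second underscore-separated field (assumed layout)."""
    return name.split("_")[1]


unique_tokens = sorted({second_token(n) for n in names})
print(unique_tokens)  # ['sc00103']

# Guard before indexing past the first element.
if len(unique_tokens) < 2:
    print("only one unique token; indexing element 1 would fail")
```

With file names of this form, every image yields the same token, so any code that expects at least two unique values and indexes element 1 will raise an out-of-bounds error.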

Cheers,
Shirui

@ShashankBice
Member

ShashankBice commented Aug 23, 2023

Hi Shirui,
Looking at the filename, I suspect this is an L1A full-frame triplet stereo product? Please let me know. At this stage, we do not have scripts for processing L1A full-frame products, although we have the logic, and we will add them in the coming month or so. In the meantime, I added a note to the project README listing the currently supported products.


2 participants