Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added windows cross compile builds & fixed build issues #169

Open
wants to merge 30 commits into
base: main-dev
Choose a base branch
from

Conversation

ashbob999
Copy link
Contributor

@ashbob999 ashbob999 commented Sep 15, 2024

  • Added cross compiling builds (x64, x86, ARM64) for windows (MSVC), and fixed build issues.
  • Added support for using MSVC ARM intrinsics.
  • Added tests for sz_u32_clz which will help catch if the tests are run without BMI support.
  • Fixed build issues when building MacOS Universal2 binaries.
  • Disabled AVX for 32-bit binaries.
  • Fixed lots of issues when building a Windows 32-bit binary.
  • Updated release.yml to build x86/x64/arm64 windows versions, and included the .lib file in the archive.
  • Added checks for making sure stringzillite is built without any dependencies.
  • Fixed Windows stringzillite might be broken #185

@ashbob999 ashbob999 force-pushed the msvc-arm branch 10 times, most recently from 8279247 to 713afd3 Compare September 21, 2024 16:18
CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
@ashbob999
Copy link
Contributor Author

@ashvardanian
For the x86 (32-bit) builds, there are warnings about conversion loss of data, due to converison between sz_size_t and sz_u64_t.

sz_sorted_idx_t is defined as a sz_u64_t.

Functions where this happens: sz_partition, sz_merge, sz_sort_introsort_recursion, sz_sort_partial.

Plus there are other places where the sz_u64_t are used directly, but only to be converted to sz_size_t.

SZ_PUBLIC void sz_sort_insertion(sz_sequence_t *sequence, sz_sequence_comparator_t less) {
sz_u64_t *keys = sequence->order;
sz_size_t keys_count = sequence->count;
for (sz_size_t i = 1; i < keys_count; i++) {
sz_u64_t i_key = keys[i];
sz_size_t j = i;
for (; j > 0 && less(sequence, i_key, keys[j - 1]); --j) keys[j] = keys[j - 1];
keys[j] = i_key;
}
}

Other functions it occurs in: _sz_sift_down, sz_sort_introsort_recursion, sz_string_unpack.

So should all of these be changed to use sz_size_t?

Also should we be running the NumPy tests when building the python binaries?

@ashvardanian
Copy link
Owner

Hey @ashbob999! Those places require the unsigned integer to be 64 bit, so it shouldn't be changed.

@ashbob999
Copy link
Contributor Author

Hey @ashbob999! Those places require the unsigned integer to be 64 bit, so it shouldn't be changed.

Although, if I have understood the sz_sort_insertion function correctly and the sz_sequence_t struct.

That the order member is a pointer to an array of indexes, which specify the sorted order, and the count is the number of indexes in the array. Then don't we already have a mismatch, because we have a uint32 for the count and a uint64 for the index of each string?

And because the order pointer array is populated with the data ptr from a std::vector, which the vector max size is 32-bit.

@ashvardanian ashvardanian mentioned this pull request Sep 27, 2024
3 tasks
@ashbob999
Copy link
Contributor Author

@ashvardanian what should I do about the warnings for the 32-builds (for the sorting functions)?

@ashvardanian
Copy link
Owner

@ashbob999, what's the cleanest way to reproduce some of those warnings/errors with GCC/Clang? It would be easier to choose a path forward if I can better understand their nature.

@ashbob999
Copy link
Contributor Author

@ashvardanian

Just building it as 32-bit (m32 and adding -Wconversion although this adds other warnings for clang) should be enough. Although I cannot seem to reproduce some of the warnings that MSVC emits.

For example the warning about converting the result of sz_blend_u64 doesn't get emitted even though it is truncating from u64 to u32.

*space = sz_u64_blend(SZ_STRING_INTERNAL_SPACE, string->external.space, is_big_mask);

But only a very few actually get emitted (GCC).

[1/2] Building CXX object CMakeFiles/stringzilla_test_cpp14.dir/scripts/test.cpp.o
In file included from /usr/include/c++/11/cassert:44,
                 from ../include/stringzilla/stringzilla.hpp:57,
                 from ../scripts/test.cpp:21:
../scripts/test.cpp: In function ‘void test_sequence_algorithms()’:
../scripts/test.cpp:1347:90: warning: conversion from ‘__gnu_cxx::__alloc_traits<std::allocator<long long unsigned int>, long long unsigned int>::value_type’ {aka ‘long long unsigned int’} to ‘std::vector<std::__cxx11::basic_string<char> >::size_type’ {aka ‘unsigned int’} may change value [-Wconversion]
 1347 |             for (std::size_t i = 1; i != dataset_size; ++i) { assert(dataset[order[i - 1]] <= dataset[order[i]]); }
      |                                                                                          ^
../scripts/test.cpp:1347:111: warning: conversion from ‘__gnu_cxx::__alloc_traits<std::allocator<long long unsigned int>, long long unsigned int>::value_type’ {aka ‘long long unsigned int’} to ‘std::vector<std::__cxx11::basic_string<char> >::size_type’ {aka ‘unsigned int’} may change value [-Wconversion]
 1347 |             for (std::size_t i = 1; i != dataset_size; ++i) { assert(dataset[order[i - 1]] <= dataset[order[i]]); }
      |                                                                                                               ^
[2/2] Linking CXX executable stringzilla_test_cpp14

But for whatever reason (maybe my building of the 32-bit with GCC/Clang is broken) but if I build it as 64-bit but with sz_size_t and sz_ssize_t changed to be uint32_t and int32_t respectively, then I get the warnings that MSVC emitted.

@ashbob999
Copy link
Contributor Author

@ashvardanian that's fine, so it should then be fine to change the size of the values in the sorting functions to sz_size_t, because do correct me if i'm wrong but i don't remember seeing any logic that explicitly required 64-bit integer (as they are just used for indexes).

@ashbob999
Copy link
Contributor Author

@ashvardanian, Somehow I missed that for sz_sort_partial we are adding the first 4 bytes from each string into the upper half of the sz_u64_t. Which would make sense why it's limited to 4 billion (although I couldn't find anything that said it is limited) so it needs to be a sz_u64_t.

Would it make sense to change the type of count in sz_sequence_t from a sz_size_t to sz_u32_t as on 64-bit platforms giving more would cause it to behave incorrectly. Although this would break ABI with previous versions.

Can you think of any better way of forcing the 4B limit on the user?

@ashbob999 ashbob999 changed the title Added windows cross compile builds & some minor build changes Added windows cross compile builds & fixed build issues Oct 20, 2024
@ashvardanian
Copy link
Owner

Can you think of any better way of forcing the 4B limit on the user?

I would stick to size_t for sizes and focus on designing a better sorting algorithm. Long-term sustainable solutions are better than short-term gains 😉

@ashbob999 ashbob999 force-pushed the msvc-arm branch 2 times, most recently from e3e8491 to 4ee87f4 Compare October 26, 2024 21:55
@ashbob999 ashbob999 marked this pull request as ready for review October 27, 2024 16:23
@ashvardanian
Copy link
Owner

Hey @ashbob999! I'll have more time to look into this next week.

I want to thank you for following the git message style - it's very pleasing to see! Unlike most PRs, where I end up squashing the patches and merging under a new name, I would love to preserve your entire commit history. I'd just ask you to reformat/squash the last three commits that deviate from the style. Thanks again!

@ashbob999
Copy link
Contributor Author

ashbob999 commented Oct 30, 2024

@ashvardanian no problem, I purposely did not make them that way jsut because they were fixing issues that I noticed from previous commits in the merge.

I can squash them into a single one, labelled as minor fixes (or equivalent), do you also want me to rebate to fix one in the middle a35cc50?

@ashvardanian ashvardanian mentioned this pull request Dec 8, 2024
ashvardanian added a commit that referenced this pull request Dec 8, 2024
Imported from #169

Co-authored-by: ashbob999 <[email protected]>
ashvardanian added a commit that referenced this pull request Dec 8, 2024
Imported from #169

Co-authored-by: ashbob999 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants