SNMG ANN #231

Merged: 67 commits, Oct 3, 2024
Changes from 1 commit
Commits
cc1e45a
SNMG ANN
viclafargue Jul 18, 2024
279c345
nccl_clique as header
viclafargue Jul 18, 2024
b10d01d
update linking, build system and conda env
viclafargue Jul 18, 2024
d178155
Answered review
viclafargue Jul 19, 2024
4bc9d9c
Merge branch 'branch-24.08' into snmg-ann
viclafargue Jul 19, 2024
1459248
Apply review
viclafargue Jul 22, 2024
f3a65fc
Answer reviews + small changes
viclafargue Jul 25, 2024
ee2dcc3
Adding documentation
viclafargue Jul 26, 2024
5236cc2
Merge branch 'branch-24.08' into snmg-ann
viclafargue Jul 26, 2024
60bd621
removing unnecessary omp barriers
viclafargue Jul 29, 2024
17f62d2
int64_t change
viclafargue Jul 30, 2024
f523251
tree reduction merge implementation
viclafargue Jul 30, 2024
3e79a44
tree merge solidification
viclafargue Jul 31, 2024
d4cabe0
Adding bench code
viclafargue Aug 6, 2024
37f9755
Merge branch 'branch-24.08' into snmg-ann
viclafargue Aug 6, 2024
504b0c3
Auto max throughput for replicated search
viclafargue Aug 9, 2024
2d0a950
improve batching
viclafargue Aug 20, 2024
169eb15
branch-24.10 merge
viclafargue Sep 6, 2024
686f81d
answering reviews 1
viclafargue Sep 6, 2024
c8d3864
Updating params
viclafargue Sep 9, 2024
51291d8
iface free functions
viclafargue Sep 9, 2024
80cf875
free functions
viclafargue Sep 10, 2024
d60e583
NCCL clique from RAFT handle
viclafargue Sep 18, 2024
3419dfa
load balancing mechanism
viclafargue Sep 19, 2024
7970fdc
Merge branch 'branch-24.10' into snmg-ann
viclafargue Sep 19, 2024
6a220b5
update doc
viclafargue Sep 19, 2024
c5e955f
moving iface struct
viclafargue Sep 23, 2024
60fbef1
include fix
viclafargue Sep 23, 2024
5ea9b9b
small fixes
viclafargue Sep 24, 2024
8b0c8c7
RAFT handle update
viclafargue Sep 26, 2024
bcf97c9
RAFT handle update
viclafargue Sep 26, 2024
9418f7e
smallSearchBatchSize as constexpr
viclafargue Sep 27, 2024
fa457f4
Merge branch 'branch-24.10' into snmg-ann
viclafargue Sep 30, 2024
dc2ccdd
add half type
viclafargue Sep 30, 2024
ed68cd8
fix bench
viclafargue Sep 30, 2024
9e659c4
Update build system
viclafargue Oct 2, 2024
f3bc98a
update iface to only expose device-only search function
viclafargue Oct 2, 2024
d9a83e5
Adding replicated search mode (load-balancer and round-robin)
viclafargue Oct 2, 2024
e6a73c6
CAGRA bench consolidation
viclafargue Oct 2, 2024
d68f572
Adding --mg to conda recipes
viclafargue Oct 2, 2024
6a673c3
resolving merge conflict
viclafargue Oct 2, 2024
55fbb36
enable multi-GPU by default, add a CMake option to control it
jameslamb Oct 2, 2024
5649a49
empty commit to re-trigger CI
jameslamb Oct 2, 2024
a208d49
Merge branch 'branch-24.10' into snmg-ann
jameslamb Oct 2, 2024
e0c232a
revert CUVS_EXPLICIT_INSTANTIATE_ONLY re-introduction
jameslamb Oct 2, 2024
1a5a2f2
Merge branch 'snmg-ann' of github.com:viclafargue/cuvs into snmg-ann
jameslamb Oct 2, 2024
fef0fc9
Removing std comms
cjnolet Oct 2, 2024
c028dca
Remove UCP
cjnolet Oct 2, 2024
a43c4f9
Adding nccl to rapids_build
cjnolet Oct 2, 2024
3b2feb7
add back NCCL dependency, pin to NCCL>=2.19
jameslamb Oct 2, 2024
d77a4e9
Revert "Removing std comms"
cjnolet Oct 3, 2024
4af2c2e
Renaming comms source file
cjnolet Oct 3, 2024
cecb372
Merge branch 'snmg-ann' of github.com:viclafargue/cuvs into snmg-ann
cjnolet Oct 3, 2024
f7a73fd
Merge branch 'branch-24.10' into snmg-ann
cjnolet Oct 3, 2024
ceb6287
Adding ucp to cmakelists
cjnolet Oct 3, 2024
ce37b71
Merge branch 'snmg-ann' of github.com:viclafargue/cuvs into snmg-ann
cjnolet Oct 3, 2024
1f0f5e9
More renames
cjnolet Oct 3, 2024
cb8ed0c
Adding libucxx
cjnolet Oct 3, 2024
fe5b6f8
Adding ucxx
cjnolet Oct 3, 2024
e257282
Adding to run time
cjnolet Oct 3, 2024
b6cb776
Adding libucxx to libcuvs y
cjnolet Oct 3, 2024
ac26507
use raw nccl calls
viclafargue Oct 3, 2024
4a10a6c
Removing ucp from cmake
cjnolet Oct 3, 2024
c9515d5
changing serialization path and disabling sharded mode testing
viclafargue Oct 3, 2024
d77704c
round robin check improvement + temporary disable of CAGRA
viclafargue Oct 3, 2024
c2c810c
Merge branch 'branch-24.10' into snmg-ann
viclafargue Oct 3, 2024
4e7398a
fix merge
viclafargue Oct 3, 2024
4 changes: 4 additions & 0 deletions cpp/CMakeLists.txt
@@ -470,6 +470,10 @@ add_library(
   ${CUVS_MG_ALGOS}
 )

+if(BUILD_MG_ALGOS)
+  target_compile_definitions(cuvs PUBLIC CUVS_BUILD_MG_ALGOS=1)
+endif()
+
 target_compile_options(
   cuvs INTERFACE $<$<COMPILE_LANG_AND_ID:CUDA,NVIDIA>:--expt-extended-lambda
                  --expt-relaxed-constexpr>
24 changes: 3 additions & 21 deletions cpp/include/cuvs/neighbors/ann_mg.hpp
@@ -16,34 +16,16 @@

 #pragma once

-#include <nccl.h>
 #include <raft/core/device_resources.hpp>
 #include <rmm/mr/device/per_device_resource.hpp>

 #include <cuvs/neighbors/cagra.hpp>
 #include <cuvs/neighbors/ivf_flat.hpp>
 #include <cuvs/neighbors/ivf_pq.hpp>

-/**
- * @brief Error checking macro for NCCL runtime API functions.
- *
- * Invokes a NCCL runtime API function call, if the call does not return ncclSuccess, throws an
- * exception detailing the NCCL error that occurred
- */
-#define RAFT_NCCL_TRY(call) \
-  do { \
-    ncclResult_t const status = (call); \
-    if (ncclSuccess != status) { \
-      std::string msg{}; \
-      SET_ERROR_MSG(msg, \
-                    "NCCL error encountered at: ", \
-                    "call='%s', Reason=%d:%s", \
-                    #call, \
-                    status, \
-                    ncclGetErrorString(status)); \
-      throw raft::logic_error(msg); \
-    } \
-  } while (0);
+#ifndef NO_NCCL_FORWARD_DECLARATION
+class ncclComm_t {};
+#endif

 namespace cuvs::neighbors::mg {
 enum parallel_mode { REPLICATED, SHARDED };
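A note for readers of the header hunk above: the placeholder `class ncclComm_t {};` is compiled only when `NO_NCCL_FORWARD_DECLARATION` is not defined, so an implementation file can include the real NCCL header first and then suppress the placeholder. The sketch below imitates that arrangement in a single self-contained file. Every name in it (`comm_t`, `real_comm_impl`, `NO_COMM_FORWARD_DECLARATION`, `clique_sketch`) is a hypothetical stand-in, and the pointer typedef is an assumption modelled on how NCCL declares its communicator handle. It is not the cuVS code.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for the vendor header (hypothetical): an opaque struct and a
// pointer handle, similar in shape to NCCL's communicator handle.
struct real_comm_impl;
typedef real_comm_impl* comm_t;

// An implementation file that has already seen the real type defines the
// guard macro before including the public header...
#define NO_COMM_FORWARD_DECLARATION

// ---- begin simulated public header ----
#ifndef NO_COMM_FORWARD_DECLARATION
class comm_t {};  // placeholder, seen only by files that skip the vendor header
#endif

struct clique_sketch {
  comm_t comm{};  // the real handle in this file, the placeholder elsewhere
};
// ---- end simulated public header ----

// ...and removes it afterwards so it does not leak into later includes.
#undef NO_COMM_FORWARD_DECLARATION

// True when this translation unit sees the real pointer handle.
inline bool sees_real_handle() { return std::is_pointer<comm_t>::value; }

inline void self_check()
{
  assert(sees_real_handle());
  assert(clique_sketch{}.comm == nullptr);
}
```

One point to keep in mind with this pattern: a file that sees the empty placeholder and a file that sees the real pointer handle get different sizes for the same type, so any struct that stores the handle by value is worth checking for layout agreement between the two views.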
18 changes: 16 additions & 2 deletions cpp/src/neighbors/ann_mg/ann_mg.cuh
@@ -16,12 +16,18 @@

 #pragma once

+#ifdef CUVS_BUILD_MG_ALGOS
+
 #include "../detail/knn_merge_parts.cuh"
-#include <cuvs/neighbors/ann_mg.hpp>
-#include <cuvs/neighbors/common.hpp>
 #include <raft/core/serialize.hpp>
 #include <raft/util/cuda_dev_essentials.cuh>
+
+#include "nccl_helpers.cuh"
+#define NO_NCCL_FORWARD_DECLARATION
+#include <cuvs/neighbors/ann_mg.hpp>
+#undef NO_NCCL_FORWARD_DECLARATION
+#include <cuvs/neighbors/common.hpp>

 namespace cuvs::neighbors::mg {
 using namespace cuvs::neighbors;
 using namespace raft;
@@ -741,3 +747,11 @@ ann_mg_index<cagra::index<T, IdxT>, T, IdxT> distribute_cagra(
 }

 }  // namespace cuvs::neighbors::mg::detail
+
+#else
+
+static_assert(false,
+              "FORBIDEN_MG_ALGORITHM_IMPORT\n\n"
+              "Please recompile the cuVS library with MG algorithms.\n");
+
+#endif
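The hunk above wraps the whole implementation header in `#ifdef CUVS_BUILD_MG_ALGOS` and fails the build with a `static_assert` message otherwise, relying on the `CUVS_BUILD_MG_ALGOS=1` compile definition added in `cpp/CMakeLists.txt`. A minimal sketch of the same guard shape follows. The macro `DEMO_BUILD_MG_ALGOS`, the namespace `demo_mg`, and the functions are hypothetical, and the macro is defined inline only so the file compiles on its own; in a real build it would come from the build system.

```cpp
#include <cassert>

// Assumption: normally supplied by the build system as a compile definition.
// Defined here only to keep the sketch self-contained.
#define DEMO_BUILD_MG_ALGOS 1

#ifdef DEMO_BUILD_MG_ALGOS

namespace demo_mg {
// Placeholder for the guarded multi-GPU implementation.
inline int shard_count(int n_gpus)
{
  assert(n_gpus >= 0);
  return n_gpus;
}
}  // namespace demo_mg

#else

// Reached only when the feature macro is missing: compilation stops with a
// readable message instead of a cascade of missing-symbol errors.
static_assert(false, "Please rebuild with multi-GPU algorithms enabled.");

#endif

// Reports which branch of the guard this translation unit was compiled with.
inline bool mg_enabled()
{
#ifdef DEMO_BUILD_MG_ALGOS
  return true;
#else
  return false;
#endif
}
```

Because the definition is `PUBLIC` on the `cuvs` target, it should also reach targets that link against `cuvs`, which is what lets a guard of this kind fire in downstream code.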
2 changes: 0 additions & 2 deletions cpp/src/neighbors/ann_mg/ann_mg_cagra_float_uint32_t.cu
@@ -23,8 +23,6 @@
  *
  */

-#include <cuvs/neighbors/ann_mg.hpp>
-
 #include "ann_mg.cuh"

 namespace cuvs::neighbors::mg {
2 changes: 0 additions & 2 deletions cpp/src/neighbors/ann_mg/ann_mg_cagra_int8_t_uint32_t.cu
@@ -23,8 +23,6 @@
  *
  */

-#include <cuvs/neighbors/ann_mg.hpp>
-
 #include "ann_mg.cuh"

 namespace cuvs::neighbors::mg {
2 changes: 0 additions & 2 deletions cpp/src/neighbors/ann_mg/ann_mg_cagra_uint8_t_uint32_t.cu
@@ -23,8 +23,6 @@
  *
  */

-#include <cuvs/neighbors/ann_mg.hpp>
-
 #include "ann_mg.cuh"

 namespace cuvs::neighbors::mg {
2 changes: 0 additions & 2 deletions cpp/src/neighbors/ann_mg/ann_mg_flat_float_int64_t.cu
@@ -23,8 +23,6 @@
  *
  */

-#include <cuvs/neighbors/ann_mg.hpp>
-
 #include "ann_mg.cuh"

 namespace cuvs::neighbors::mg {
2 changes: 0 additions & 2 deletions cpp/src/neighbors/ann_mg/ann_mg_flat_int8_t_int64_t.cu
@@ -23,8 +23,6 @@
  *
  */

-#include <cuvs/neighbors/ann_mg.hpp>
-
 #include "ann_mg.cuh"

 namespace cuvs::neighbors::mg {
2 changes: 0 additions & 2 deletions cpp/src/neighbors/ann_mg/ann_mg_flat_uint8_t_int64_t.cu
@@ -23,8 +23,6 @@
  *
  */

-#include <cuvs/neighbors/ann_mg.hpp>
-
 #include "ann_mg.cuh"

 namespace cuvs::neighbors::mg {
2 changes: 0 additions & 2 deletions cpp/src/neighbors/ann_mg/ann_mg_pq_float_int64_t.cu
@@ -23,8 +23,6 @@
  *
  */

-#include <cuvs/neighbors/ann_mg.hpp>
-
 #include "ann_mg.cuh"

 namespace cuvs::neighbors::mg {
2 changes: 0 additions & 2 deletions cpp/src/neighbors/ann_mg/ann_mg_pq_int8_t_int64_t.cu
@@ -23,8 +23,6 @@
  *
  */

-#include <cuvs/neighbors/ann_mg.hpp>
-
 #include "ann_mg.cuh"

 namespace cuvs::neighbors::mg {
2 changes: 0 additions & 2 deletions cpp/src/neighbors/ann_mg/ann_mg_pq_uint8_t_int64_t.cu
@@ -23,8 +23,6 @@
  *
  */

-#include <cuvs/neighbors/ann_mg.hpp>
-
 #include "ann_mg.cuh"

 namespace cuvs::neighbors::mg {
1 change: 0 additions & 1 deletion cpp/src/neighbors/ann_mg/generate_ann_mg.py
@@ -37,7 +37,6 @@
  *
  */

-#include <cuvs/neighbors/ann_mg.hpp>
 """

 include_macro = """
6 changes: 5 additions & 1 deletion cpp/src/neighbors/ann_mg/nccl_clique.cu
@@ -14,9 +14,13 @@
  * limitations under the License.
  */

-#include <cuvs/neighbors/ann_mg.hpp>
 #include <raft/comms/std_comms.hpp>

+#include "nccl_helpers.cuh"
+#define NO_NCCL_FORWARD_DECLARATION
+#include <cuvs/neighbors/ann_mg.hpp>
+#undef NO_NCCL_FORWARD_DECLARATION
+
 namespace cuvs::neighbors::mg {

 nccl_clique::nccl_clique(const std::vector<int>& device_ids)
40 changes: 40 additions & 0 deletions cpp/src/neighbors/ann_mg/nccl_helpers.cuh
@@ -0,0 +1,40 @@
+/*
+ * Copyright (c) 2024, NVIDIA CORPORATION.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#pragma once
+
+#include <nccl.h>
+
+/**
+ * @brief Error checking macro for NCCL runtime API functions.
+ *
+ * Invokes a NCCL runtime API function call, if the call does not return ncclSuccess, throws an
+ * exception detailing the NCCL error that occurred
+ */
+#define RAFT_NCCL_TRY(call) \
+  do { \
+    ncclResult_t const status = (call); \
+    if (ncclSuccess != status) { \
+      std::string msg{}; \
+      SET_ERROR_MSG(msg, \
+                    "NCCL error encountered at: ", \
+                    "call='%s', Reason=%d:%s", \
+                    #call, \
+                    status, \
+                    ncclGetErrorString(status)); \
+      throw raft::logic_error(msg); \
+    } \
+  } while (0);
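
For reference, the `RAFT_NCCL_TRY` macro above follows the usual "evaluate the call once, throw on a non-success status" shape. The sketch below reproduces that shape without NCCL or RAFT so it can be compiled anywhere. `fake_result_t`, `fake_error_string`, `fake_op`, `FAKE_TRY`, and `run_checked` are all hypothetical stand-ins, and `std::logic_error` replaces `raft::logic_error`. It differs from the macro in the diff in one detail: it leaves out the semicolon after `while (0)`, so the caller's own semicolon ends the statement and the macro can sit in an unbraced `if`/`else`.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for ncclResult_t and ncclGetErrorString.
enum fake_result_t { fake_success = 0, fake_invalid_argument = 4 };

inline const char* fake_error_string(fake_result_t status)
{
  return status == fake_success ? "no error" : "invalid argument";
}

// Evaluates `call` exactly once and throws if it did not succeed.
// No semicolon after `while (0)`: the call site provides it.
#define FAKE_TRY(call)                                          \
  do {                                                          \
    fake_result_t const status_ = (call);                       \
    if (fake_success != status_) {                              \
      std::ostringstream msg_;                                  \
      msg_ << "error encountered at: call='" << #call           \
           << "', Reason=" << static_cast<int>(status_) << ":"  \
           << fake_error_string(status_);                       \
      throw std::logic_error(msg_.str());                       \
    }                                                           \
  } while (0)

// Hypothetical operation that succeeds or fails on demand.
inline fake_result_t fake_op(bool ok) { return ok ? fake_success : fake_invalid_argument; }

// Returns the exception text, or an empty string when the call succeeds.
inline std::string run_checked(bool ok)
{
  try {
    if (ok)
      FAKE_TRY(fake_op(true));
    else
      FAKE_TRY(fake_op(false));
  } catch (std::logic_error const& e) {
    return e.what();
  }
  return {};
}

inline void self_check() { assert(run_checked(true).empty()); }
```

With a trailing semicolon inside the macro, the unbraced `if`/`else` in `run_checked` would not compile, because the extra empty statement would separate the `if` from its `else`.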