[BUG] Eigen solver fails to converge to a solution #2758

@jnke2016

Describe the bug
cuGraph spectral clustering, which leverages raft::spectral::matrix::sparse_matrix_t, returns poor clustering results. This appears to be caused by the eigensolver failing to converge to a solution.

Steps/Code to reproduce bug

#include <rmm/device_uvector.hpp>
#include <rmm/device_vector.hpp>
#include <raft/core/handle.hpp>
#include <raft/spectral/partition.cuh>
#include <raft/core/resources.hpp>
#include <raft/spectral/eigen_solvers.cuh>
#include <raft/spectral/matrix_wrappers.hpp>
#include <raft/util/cudart_utils.hpp>  // raft::update_device, raft::print_device_vector

#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

int main(int argc, char** argv)
{

    raft::handle_t handle;

    using vertex_t    = int32_t;
    using edge_t      = int32_t;
    using weight_t    = float;

    std::vector<vertex_t> h_row_offsets = {0, 3, 5, 8, 11, 13, 15};
    std::vector<vertex_t> h_col_indices = {1, 2, 0, 2, 0, 1, 3, 2, 4, 5, 3, 5, 3, 4};
    // Unit-weight variant of the same edge list:
    // std::vector<weight_t> h_values = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1};
    std::vector<weight_t> h_values = {0.1, 0.2, 0.1, 1.2, 0.2, 1.2, 2.3, 2.3, 3.4, 3.5, 3.4, 4.5, 3.5, 4.5};


    rmm::device_uvector<vertex_t> d_row_offsets(h_row_offsets.size(), handle.get_stream());
    rmm::device_uvector<vertex_t> d_col_indices(h_col_indices.size(), handle.get_stream());
    rmm::device_uvector<weight_t> d_values(h_values.size(), handle.get_stream());

    raft::update_device(d_row_offsets.data(),
                        h_row_offsets.data(),
                        h_row_offsets.size(),
                        handle.get_stream());

    raft::update_device(d_col_indices.data(),
                        h_col_indices.data(),
                        h_col_indices.size(),
                        handle.get_stream());

    raft::update_device(d_values.data(),
                        h_values.data(),
                        h_values.size(),
                        handle.get_stream());

    raft::spectral::matrix::sparse_matrix_t<vertex_t, weight_t> const csr_m{
        handle,
        d_row_offsets.data(),
        d_col_indices.data(),
        d_values.data(),
        vertex_t(h_row_offsets.size() - 1),
        vertex_t(h_row_offsets.size() - 1),
        h_col_indices.size()};

    vertex_t num_eigenvectors       = 2;
    vertex_t num_clusters           = 2;
    weight_t evs_tolerance          = 0.001;
    vertex_t evs_max_iterations     = 100;
    weight_t k_means_tolerance      = 0.001;
    vertex_t k_means_max_iterations = 100;

    vertex_t restartIter_lanczos = 15 + num_eigenvectors;

    unsigned long long seed1{1234567};
    unsigned long long seed2{12345678};
    bool reorthog{false};

    raft::spectral::eigen_solver_config_t<vertex_t, weight_t, edge_t> eig_cfg{
        num_eigenvectors, evs_max_iterations, restartIter_lanczos, evs_tolerance, reorthog, seed1};
    

    raft::spectral::lanczos_solver_t<vertex_t, weight_t, edge_t> eig_solver{eig_cfg};

    raft::spectral::cluster_solver_config_t<vertex_t, weight_t, edge_t> clust_cfg{
        num_clusters, k_means_max_iterations, k_means_tolerance, seed2};

    raft::spectral::kmeans_solver_t<vertex_t, weight_t, edge_t> cluster_solver{clust_cfg};

    rmm::device_vector<weight_t> eig_vals(num_eigenvectors);
    rmm::device_vector<weight_t> eig_vects(num_eigenvectors * (h_row_offsets.size() - 1));

    rmm::device_uvector<vertex_t> clustering(h_row_offsets.size() - 1,
                                             handle.get_stream());
    // The call below reports: "eig.cuh: eigensolver couldn't converge to a solution."
    raft::spectral::partition(
        handle, csr_m, eig_solver, cluster_solver, clustering.data(), eig_vals.data().get(), eig_vects.data().get());


    raft::print_device_vector(
          "clustering", clustering.data(), clustering.size(), std::cout);
}
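
One thing worth ruling out before looking at the solver itself: the posted h_row_offsets ends at 15, while h_col_indices and h_values each hold 14 entries, so the last row would index past the end of both arrays. Whether this contributes to the convergence failure is not established here. A minimal host-side sanity check, using a hypothetical helper check_csr that is not part of RAFT and that assumes a square matrix, might look like this:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not a RAFT API). Returns an empty string when the CSR
// arrays are mutually consistent, otherwise a description of the first
// problem found. Assumes a square matrix (n_cols == n_rows).
std::string check_csr(const std::vector<int32_t>& row_offsets,
                      const std::vector<int32_t>& col_indices,
                      const std::vector<float>& values)
{
  if (row_offsets.empty()) return "row_offsets is empty";
  if (row_offsets.front() != 0) return "row_offsets must start at 0";
  if (col_indices.size() != values.size()) return "col_indices and values differ in length";

  // Offsets must never decrease from one row to the next.
  for (std::size_t i = 1; i < row_offsets.size(); ++i) {
    if (row_offsets[i] < row_offsets[i - 1]) return "row_offsets is not non-decreasing";
  }

  // The final offset must equal the number of stored entries (nnz).
  if (static_cast<std::size_t>(row_offsets.back()) != col_indices.size()) {
    return "last row offset " + std::to_string(row_offsets.back()) +
           " does not match nnz " + std::to_string(col_indices.size());
  }

  // Every column index must name an existing vertex.
  const auto n_rows = static_cast<int32_t>(row_offsets.size() - 1);
  for (int32_t c : col_indices) {
    if (c < 0 || c >= n_rows) return "column index out of range";
  }
  return "";
}
```

Called with the arrays from the reproducer, check_csr reports that the last row offset (15) does not match nnz (14). Offsets of {0, 2, 4, 7, 10, 12, 14} would pass the check, though that layout is only an assumption about the intended graph.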

Expected behavior
Previous RAFT versions produced better clustering and modularity results.
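
To make "better modularity" comparable across RAFT versions, the clustering printed by the reproducer can be scored on the host. The sketch below is a hypothetical helper, not a RAFT or cuGraph API. It assumes a symmetric CSR graph with valid offsets and uses the standard weighted modularity definition.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side helper (not a RAFT API): weighted modularity
//   Q = sum over clusters c of [ w_in(c) / W - (deg(c) / W)^2 ]
// where W is the sum of all stored CSR values (2m for a symmetric graph),
// w_in(c) is the sum of stored values with both endpoints in c, and
// deg(c) is the summed weighted degree of the vertices in c.
// Assumes valid CSR arrays and cluster labels in [0, num_clusters).
double modularity(const std::vector<int32_t>& row_offsets,
                  const std::vector<int32_t>& col_indices,
                  const std::vector<double>& values,
                  const std::vector<int32_t>& clustering,
                  int32_t num_clusters)
{
  if (row_offsets.empty()) return 0.0;

  std::vector<double> w_in(num_clusters, 0.0);
  std::vector<double> deg(num_clusters, 0.0);
  double total = 0.0;

  const std::size_t n = row_offsets.size() - 1;
  for (std::size_t u = 0; u < n; ++u) {
    const int32_t cu = clustering[u];
    for (int32_t e = row_offsets[u]; e < row_offsets[u + 1]; ++e) {
      const double w = values[e];
      total += w;
      deg[cu] += w;
      // Count the entry as internal when both endpoints share a cluster.
      if (clustering[col_indices[e]] == cu) w_in[cu] += w;
    }
  }
  if (total == 0.0) return 0.0;

  double q = 0.0;
  for (int32_t c = 0; c < num_clusters; ++c) {
    q += w_in[c] / total - (deg[c] / total) * (deg[c] / total);
  }
  return q;
}
```

For example, with unit weights, row offsets {0, 2, 4, 7, 10, 12, 14} (an assumed layout consistent with the 14 column indices) and the split {0, 0, 0, 1, 1, 1}, this evaluates to 5/14, roughly 0.357. Placing every vertex in a single cluster gives 0.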

Environment details (please complete the following information):

  • Environment location: [Bare-metal, Docker]
  • Method of RAFT install: [conda, Docker, or from source]

Additional context
I tried running similar eigen_solvers tests and observed the same issue. Looking at the CMakeLists, eigen_solvers is not part of the build.

Metadata

Labels: bug
Status: In Progress
Milestone: No milestone