
Why are the Chebyshev polynomials needed? #35

Open
zeal-up opened this issue Aug 23, 2018 · 16 comments

zeal-up commented Aug 23, 2018

When approximating the initial filter kernel $g_\theta(\Lambda) = \mathrm{diag}(\theta)$ with a polynomial

$$g_\theta(\Lambda) = \sum_{k=0}^{K-1} \theta_k \Lambda^k,$$

why can't we simply combine the eigenvector and eigenvalue matrices together, like

$$y = U g_\theta(\Lambda) U^T x = \sum_{k=0}^{K-1} \theta_k L^k x,$$

since

$$U \Lambda^k U^T = (U \Lambda U^T)^k = L^k?$$

The paper instead uses Chebyshev polynomials,

$$g_\theta(\Lambda) = \sum_{k=0}^{K-1} \theta_k T_k(\tilde{\Lambda}), \qquad \tilde{\Lambda} = 2\Lambda/\lambda_{\max} - I_n,$$

and also says that they are traditionally used in graph signal processing to approximate kernels. I'm quite confused here. I don't know why Chebyshev is needed?
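
A minimal numerical sketch of the identity the question relies on, checking that the spectral form $U g_\theta(\Lambda) U^T x$ equals the direct form $\sum_k \theta_k L^k x$; the graph, coefficients, and names here are illustrative only:

```python
import numpy as np

n, K = 8, 3
rng = np.random.default_rng(0)
A = rng.random((n, n))
A = (A + A.T) / 2                      # symmetric weighted adjacency
np.fill_diagonal(A, 0)
L = np.diag(A.sum(axis=1)) - A         # combinatorial Laplacian

lam, U = np.linalg.eigh(L)             # L = U diag(lam) U^T
theta = rng.random(K + 1)              # polynomial coefficients theta_k
x = rng.random(n)                      # a graph signal

# Spectral form: y = U g_theta(Lambda) U^T x
g_lam = sum(t * lam**k for k, t in enumerate(theta))
y_spectral = U @ (g_lam * (U.T @ x))

# Direct form: y = sum_k theta_k L^k x, no eigendecomposition needed
y_direct = sum(t * np.linalg.matrix_power(L, k) @ x for k, t in enumerate(theta))

print(np.allclose(y_spectral, y_direct))   # True: the two forms coincide
```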


ZiqianXie commented May 22, 2019

+1. I was doing a GCN paper review and read this paper today, and came up with the same question.
We can compute Lx, L^2x, ..., L^Kx sequentially using O(K|E|) operations (where |E| is the number of edges), and linearly combine them: the same computational complexity.
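
A minimal sketch of that recursion, assuming a scipy sparse Laplacian `L` and fixed coefficients `theta` (in a real model the theta_k would be learned):

```python
import numpy as np
import scipy.sparse as sp

def monomial_filter(L, x, theta):
    """Compute sum_k theta[k] * L^k x with K sparse mat-vecs, O(K|E|) total."""
    y = theta[0] * x        # k = 0 term: L^0 x = x
    z = x
    for t in theta[1:]:
        z = L @ z           # one O(|E|) sparse matrix-vector product
        y = y + t * z
    return y

# Illustrative usage on a tiny path graph
L = sp.csr_matrix(np.array([[1., -1., 0.], [-1., 2., -1.], [0., -1., 1.]]))
print(monomial_filter(L, np.ones(3), [0.5, 0.2, 0.1]))
```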

@ZiqianXie

I think in the cited wavelets-on-graphs paper (Hammond et al.), they used Chebyshev polynomials to approximate a known wavelet function, because Chebyshev interpolation has certain desirable properties. In the case where the filters are unknown, the Chebyshev polynomials are not necessary: any linear combination of Chebyshev polynomials up to the Kth degree can be represented by a Kth-degree polynomial.
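
That basis-equivalence claim is easy to check numerically with numpy's polynomial utilities (illustrative coefficients):

```python
import numpy as np
from numpy.polynomial import chebyshev, polynomial

c = np.array([0.3, -0.5, 0.8, 0.1, 0.4])   # Chebyshev coefficients, K = 4
p = chebyshev.cheb2poly(c)                  # same function in the monomial basis
x = np.linspace(-1, 1, 7)
print(np.allclose(chebyshev.chebval(x, c), polynomial.polyval(x, p)))  # True
```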


u42dani commented Mar 26, 2020

Has anyone tried node classification on a general graph without Chebyshev polynomials (i.e. using only the Laplacian and its powers)? What was the accuracy compared to the Chebyshev method?

@Keramatfar

See minute 26 of https://www.youtube.com/watch?v=v3jZRkvIOIM


Jovonni commented Apr 15, 2020

Has anyone tried node classification on a general graph without Chebyshev polynomials (i.e. using only the Laplacian and its powers)? What was the accuracy compared to the Chebyshev method?

I have used the Laplacian and the normalized Laplacian to train a convolutional NN. The accuracy on the training, testing, and validation sets was above 85% for all three.

The data I used was proxy-related and the model was for a security-related problem. However, doing that is how I found this work, and Xavier's work as a whole... so I've seen it work, but Xavier Bresson and his team build on those concepts so well...

So, not the best modeling approach, but it is tractable for certain problems.


ZiqianXie commented Apr 15, 2020

See minute 26 of https://www.youtube.com/watch?v=v3jZRkvIOIM

Yes, it is stable under coefficient perturbation but otherwise the complexities are the same.


mdeff commented Jul 20, 2020

Any polynomial (monomials, Chebyshev, Legendre, etc.) of order K has the same representation power and can be used. We used Chebyshev because we had experience with it and thought it would be easier to optimize as they form an orthogonal basis. There's however not much difference in practice.

Chebyshev polynomials are more orthogonal (under the identity measure, and truly orthogonal with measure 1/sqrt(1-x²)) than monomials, but less than Legendre (truly orthogonal with the identity measure):
[figure: polynomial_bases_spectrum]

In the vertex domain (for a ring/circle graph), Chebyshev and Legendre polynomials are also more orthogonal than monomials:
[figure: polynomial_bases_vertex]
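
For reference, a Chebyshev filter is evaluated with the three-term recurrence rather than explicit matrix powers. A minimal sketch, assuming `L_scaled` is the rescaled Laplacian 2L/lambda_max - I whose spectrum lies in [-1, 1]:

```python
import numpy as np

def chebyshev_filter(L_scaled, x, theta):
    """Compute sum_k theta[k] * T_k(L_scaled) x via the recurrence
    T_0(x) = 1, T_1(x) = x, T_k(x) = 2 x T_{k-1}(x) - T_{k-2}(x)."""
    t_prev = x                       # T_0(L) x = x
    y = theta[0] * t_prev
    if len(theta) > 1:
        t_curr = L_scaled @ x        # T_1(L) x = L x
        y = y + theta[1] * t_curr
        for t in theta[2:]:
            t_prev, t_curr = t_curr, 2 * (L_scaled @ t_curr) - t_prev
            y = y + t * t_curr
    return y
```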

@youkaichao

I have the same question as @zeal-up, and opened a question on Stack Exchange. After reading this issue, I think the question is clear now. Chebyshev polynomials may have better properties with respect to parameter orthogonality, but the computational complexity is the same as directly using the Laplacian matrix.


hazdzz commented Mar 30, 2021

Any polynomial (monomials, Chebyshev, Legendre, etc.) of order K has the same representation power and can be used. We used Chebyshev because we had experience with it and thought it would be easier to optimize as they form an orthogonal basis. There's however not much difference in practice.

Chebyshev polynomials are more orthogonal (under the identity measure, and truly orthogonal with measure 1/sqrt(1-x²)) than monomials, but less than Legendre (truly orthogonal with the identity measure):

[figure: polynomial_bases_spectrum]

In the vertex domain (for a ring/circle graph), Chebyshev and Legendre polynomials are also more orthogonal than monomials:

[figure: polynomial_bases_vertex]

Why did you choose the Chebyshev polynomials of the first kind instead of the Legendre polynomials?


mdeff commented Mar 30, 2021

Mostly for historical reasons. At first we were designing filters (e.g., to solve the diffusion of heat, the propagation of waves, and many others). As Chebyshev polynomials are excellent function approximators, we chose them to approximate those ideal filters we wanted to design. When learning filters, the choice of a polynomial basis doesn't matter for the expressivity of the filters, so we kept Chebyshev as we were familiar with them. We thought it could be easier to optimize, but that doesn't seem to make a difference in practice.


hazdzz commented Mar 30, 2021

Mostly for historical reasons. At first we were designing filters (e.g., to solve the diffusion of heat, the propagation of waves, and many others). As Chebyshev polynomials are excellent function approximators, we chose them to approximate those ideal filters we wanted to design. When learning filters, the choice of a polynomial basis doesn't matter for the expressivity of the filters, so we kept Chebyshev as we were familiar with them. We thought it could be easier to optimize, but that doesn't seem to make a difference in practice.

In your opinion, it doesn't matter which orthogonal polynomials are chosen, because in practice the performance doesn't depend on the kind of orthogonal polynomial. Am I understanding correctly?


mdeff commented Mar 30, 2021

Yes. It seems that SGD works well enough to find the best polynomial, whatever the basis. That might also be because we use low-order polynomials when learning.


hazdzz commented Mar 30, 2021

Yes. It seems that SGD works well enough to find the best polynomial, whatever the basis. That might also be because we use low-order polynomials when learning.

What is SGD?


mdeff commented Mar 30, 2021

Stochastic gradient descent.


hazdzz commented Mar 31, 2021

Mostly for historical reasons. At first we were designing filters (e.g., to solve the diffusion of heat, the propagation of waves, and many others). As Chebyshev polynomials are excellent function approximators, we chose them to approximate those ideal filters we wanted to design. When learning filters, the choice of a polynomial basis doesn't matter for the expressivity of the filters, so we kept Chebyshev as we were familiar with them. We thought it could be easier to optimize, but that doesn't seem to make a difference in practice.

I have another big question. Do the orthogonal polynomials cause a low-pass filter? Or is the Spectral CNN (Spectral Networks and Deep Locally Connected Networks on Graphs) itself a low-pass-filter GNN?


mdeff commented Mar 31, 2021

Neither spectral nor polynomial convolutions enforce low-pass filters. Polynomial convolutions enforce local filters (the locality is controlled by the order of the polynomials). Local filters can be high-pass (simply think about a multiplication by the Laplacian itself, a second-order derivative). The low-pass limitation is specific to GCN [Kipf et al.] as they merged the 0- and 1-neighborhood in the quest of learning a single parameter per filter.
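
As a concrete illustration of that point, here is a small sketch (a ring graph; all values illustrative) showing that multiplication by the Laplacian attenuates smooth signals and amplifies rough ones, i.e. acts as a high-pass filter:

```python
import numpy as np

n = 16
idx = np.arange(n)
A = np.zeros((n, n))
A[idx, (idx + 1) % n] = A[(idx + 1) % n, idx] = 1    # ring graph
L = np.diag(A.sum(axis=1)) - A

lam, U = np.linalg.eigh(L)
x_smooth = U[:, 1]     # low-frequency eigenvector (small eigenvalue)
x_rough = U[:, -1]     # high-frequency eigenvector (large eigenvalue)

# L x = lambda x for eigenvectors, so the output energy is the eigenvalue:
# small for smooth signals, large for rough ones -- a high-pass response.
print(np.linalg.norm(L @ x_smooth), np.linalg.norm(L @ x_rough))  # ~0.15 vs 4.0
```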
