
Cuda is slower than cpu #17

Closed
dhkdnduq opened this issue Mar 25, 2021 · 6 comments

Comments

@dhkdnduq

dhkdnduq commented Mar 25, 2021

gpu: RTX 3090
cpu: i5-10400F (6 cores), with OpenMP

Most of the code is identical across the implementations; only the Mahalanobis code differs. This started because my C++ port using libtorch (CUDA) turned out to be slower than Python's numpy.

Timings from image preprocessing through the Mahalanobis loop:

1. opencv (cpu): 1.5~2 sec
   (*GpuMat is not yet supported.)

2. libtorch (cuda): 0.4~0.45 sec

3. libtorch (cpu): 0.25~0.35 sec

4. eigen (cpu): 0.2~0.25 sec

I hope this helps.
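For context, the per-position Mahalanobis computation being timed can be sketched in numpy roughly as follows (a sketch only; the function and variable names are illustrative, not from the actual code):

```python
import numpy as np

def mahalanobis_map(emb, mean, cov_inv):
    """Per-position Mahalanobis distance.

    emb:     (N, C)    embedding vectors, one per spatial position
    mean:    (N, C)    per-position means learned from good samples
    cov_inv: (N, C, C) per-position inverse covariance matrices
    """
    delta = emb - mean                                   # u - v
    # d^2 = delta^T @ Sigma^-1 @ delta, evaluated per position n
    m = np.einsum("nc,ncd,nd->n", delta, cov_inv, delta)
    return np.sqrt(m)                                    # sqrt

# sanity check: with identity covariance this reduces to Euclidean distance
emb = np.array([[3.0, 4.0]])
mean = np.zeros((1, 2))
cov_inv = np.eye(2)[None]
print(mahalanobis_map(emb, mean, cov_inv))  # [5.]
```

Vectorizing the loop this way (one `einsum` over all positions) is usually where numpy and Eigen gain their edge over a naive per-pixel loop.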

@dhkdnduq dhkdnduq reopened this Mar 25, 2021
@dhkdnduq dhkdnduq changed the title Inference time in c++ Cuda is slower than cpu Mar 25, 2021
@DeepKnowledge1

@dhkdnduq, can you look here:

#8 (comment)

0- Make sure that both the model and the images are on the GPU.
1- You could also try training the model and testing directly, instead of loading the model pickle.
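One related pitfall when benchmarking this: CUDA kernels launch asynchronously, so a timer stopped before `torch.cuda.synchronize()` mis-measures the GPU, and the first CUDA call pays a one-time initialization cost. A hedged sketch of a fair measurement, assuming PyTorch is installed (the `Conv2d` is a toy stand-in for the real backbone):

```python
import time
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy stand-in for the real model; both model and input must live on `device`
model = torch.nn.Conv2d(3, 8, kernel_size=3).to(device).eval()
x = torch.randn(1, 3, 64, 64, device=device)

with torch.no_grad():
    model(x)  # warm-up: the first CUDA call pays init/allocation cost
    if device.type == "cuda":
        torch.cuda.synchronize()
    t0 = time.perf_counter()
    y = model(x)
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for async kernels before stopping the clock
    t1 = time.perf_counter()

print(f"{device}: {(t1 - t0) * 1e3:.3f} ms")
```

Without the warm-up and synchronization, a CPU run can easily look faster than a GPU run even when the GPU kernels themselves are quicker.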

@DeepKnowledge1

@dhkdnduq try to reduce the forward pass by cutting out the particular layers:
#13 (comment)

@dhkdnduq
Author

@DeepKnowledge1 thanks for replying. I think it's because of OpenMP; the GPU has little effect on this kind of parallelism.
I'll check #13.

@dhkdnduq
Author

dhkdnduq commented May 3, 2021

Thanks for your sharing! Could you tell me how to get the forward-propagation feature maps with libtorch?

  1. Save the good features and the ResNet weight file. I've been using JIT.
  2. Implement the model structure (wide_resnet) in C++, because there is no layer-hook function; I use the C++ torchvision.
  3. Implement Mahalanobis (u - v; matmul; dot; sqrt).

@dhkdnduq
Author

Thanks for your reply! Now I have a problem when loading the PyTorch-trained model (mean, cov) in C++. I saved the mean and cov as tensors with PyTorch, but I can't read the model with libtorch. 任秋霖

# python
class TrainFeature(torch.jit.ScriptModule):
    __constants__ = ['mean', 'conv_inv']

    def __init__(self, mean_, conv_inv_):
        super(TrainFeature, self).__init__()
        self.mean = mean_
        self.conv_inv = conv_inv_

    def forward(self):
        pass

// c++
auto anomaly_features = torch::jit::load(...);
... = anomaly_features.attr("mean").toTensor().to(at::kCPU);
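For completeness, the save side in Python could look like the sketch below (an assumption-laden sketch, not the author's exact code: here the tensors are registered as buffers, which guarantees they are serialized with the scripted module and readable from C++ via `.attr("mean")`):

```python
import torch

class TrainFeature(torch.nn.Module):
    def __init__(self, mean, conv_inv):
        super().__init__()
        # buffers are serialized with the module, unlike plain Python attributes
        self.register_buffer("mean", mean)
        self.register_buffer("conv_inv", conv_inv)

    def forward(self):
        pass

mean = torch.zeros(5, 3)
conv_inv = torch.eye(3).repeat(5, 1, 1)
torch.jit.script(TrainFeature(mean, conv_inv)).save("train_feature.pt")

# round-trip check from Python; C++ reads the same attributes with .attr("mean")
loaded = torch.jit.load("train_feature.pt")
assert torch.equal(loaded.mean, mean)
```

The file name `train_feature.pt` is illustrative; the C++ side loads it with `torch::jit::load` as shown above.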

@Leonardo0325

Thanks for your sharing! Could you tell me how to get the forward-propagation feature maps with libtorch?

  1. Save the good features and the ResNet weight file. I've been using JIT.
  2. Implement the model structure (wide_resnet) in C++, because there is no layer-hook function; use the C++ torchvision.
  3. Implement Mahalanobis (u - v; matmul; dot; sqrt).

Hello, could I borrow your libtorch code for PaDiM? Thank you very much indeed.
