Reinstalled System and Now Get Unexpected Results #153

Open
3DTOPO opened this issue Mar 15, 2018 · 13 comments

@3DTOPO

3DTOPO commented Mar 15, 2018

I had to rebuild my Ubuntu system after a kernel update, installed automatically for the Meltdown/Spectre vulnerabilities, broke my NVIDIA driver.

Starting from a clean install of Ubuntu 16.04.4, I installed CUDA 8.0, Torch, cuDNN 5.1, and the other requirements. Everything went smoothly.

But now when I run fast_neural_style.lua with any previously trained model, or with any of the sample trained models, the results are quite different from what is expected. For instance, this image was created with the candy.t7 model:

[Image: candy-640-cosmo]

I get the same results with or without the GPU enabled, and the same results on my Mac OS X machine, where I just installed all the required software (without GPU acceleration).

My best guess is something must have changed with the latest Torch7?

@3DTOPO
Author

3DTOPO commented Mar 15, 2018

I am trying to troubleshoot the issue as best I can. Interestingly, I get identical (unexpected) results if I comment out line 54 of fast_neural_style.lua, that is, if I change it from:
model:evaluate()

To:
-- model:evaluate()

So it seems that the critical evaluate() call is not doing anything now. Any suggestions?

@3DTOPO
Author

3DTOPO commented Mar 15, 2018

I just tried loading candy.t7 manually, and calling evaluate() on the result fails with an error saying that evaluate is a nil value. If I call evaluate() on a test net created in the same Torch session, I get no errors:

cd fast-neural-style
th

th> require 'nn'
th> require 'fast_neural_style.ShaveImage'
th> require 'fast_neural_style.TotalVariation'
th> require 'fast_neural_style.InstanceNormalization'
th> model=torch.load("models/candy.t7")
th> model:evaluate()
[string "_RESULT={model:evaluate()}"]:1: attempt to call method 'evaluate' (a nil value)
stack traceback:
[string "_RESULT={model:evaluate()}"]:1: in main chunk
[C]: in function 'xpcall'
/home/jeshua/torch/install/share/lua/5.1/trepl/init.lua:661: in function 'repl'
...shua/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
[C]: at 0x00405d50

th> net = nn.Sequential()
th> net:evaluate()

No Errors

@3DTOPO
Author

3DTOPO commented Mar 16, 2018

I tried an AWS image configured for Torch, and everything worked as expected. So I archived its torch directory, copied it to my machine, and ran the install script. It now works as expected on my machine too.

It still gives the same error as in my post directly above, however, so that was a red herring.

This leads me to believe that the current Torch repository is not compatible with fast-neural-style. It's a shame that Torch has no versioned releases, so one cannot simply install a version known to be compatible.

@flaushi

flaushi commented Mar 20, 2018

Let me guess: do you run into trouble only with instance normalization enabled? So do I...

@3DTOPO
Author

3DTOPO commented Mar 20, 2018

Quite possibly; all my models have instance normalization enabled, so I didn't even try without it.

@flaushi

flaushi commented Mar 21, 2018

Other people are encountering this problem too; see #137.
I followed all the workarounds mentioned there, but nothing really helped.

@3DTOPO
Author

3DTOPO commented Mar 21, 2018

Thanks, I had not seen that. I can't run CUDA 7.5 because my GPU is only supported from CUDA 8.0 onward. But I might try the update script mentioned there. Sorry, I can't offer any suggestions other than installing an older version of Torch (which worked for me).

@flaushi

flaushi commented Mar 21, 2018

Hmm, OK. Can you give me the ID of the latest git commit in your Torch checkout? I would then try checking that out.
You can print it with git log.

@3DTOPO
Author

3DTOPO commented Mar 21, 2018

I would if I knew it. I found an AWS image configured for Torch, archived its torch directory, copied it to my machine, and then ran the install script.

@flaushi

flaushi commented Mar 21, 2018

Run git log and you will see a list of commits; the topmost one is the one I need.

@3DTOPO
Author

3DTOPO commented Mar 21, 2018

Nice, thanks, I didn't know that!

Most recent:

commit c9b29cf41ec714ee45b4799c2bd76e82d1b1f267
Author: soumith <[email protected]>
Date: Wed Dec 21 07:25:57 2016 -0800

@ArtlyStyles

This looks like a color saturation problem. One step of the network subtracts a different constant value from each of the RGB channels. Somehow, it also switches the channel order from RGB to BGR. If the new Ubuntu setup does not need this, the colors will be messed up. To confirm this, you can swap the channels of the input image from RGB to BGR with a tool such as Python or Matlab, then use the new image as the input to the network.
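
For anyone who wants to try this check, here is a minimal Python sketch of the channel swap. It is only an illustration and not part of fast-neural-style: the function name swap_rgb_to_bgr is made up for this example, and it assumes the image is available as a packed 8-bit RGB buffer (three bytes per pixel, no alpha channel).

```python
def swap_rgb_to_bgr(pixels: bytes) -> bytes:
    """Return a copy of a packed 8-bit RGB buffer with R and B exchanged.

    Hypothetical helper for the check described above; it assumes exactly
    three bytes per pixel in R, G, B order.
    """
    if len(pixels) % 3 != 0:
        raise ValueError("expected packed RGB data, 3 bytes per pixel")
    out = bytearray(pixels)
    out[0::3] = pixels[2::3]  # blue values move into the first slot
    out[2::3] = pixels[0::3]  # red values move into the third slot
    return bytes(out)


# Two pixels: pure red, then an arbitrary color.
rgb = bytes([255, 0, 0, 10, 20, 30])
bgr = swap_rgb_to_bgr(rgb)
print(list(bgr))  # [0, 0, 255, 30, 20, 10]
```

If Pillow is installed (an assumption, it is not a requirement of this project), the buffer can be obtained with Image.open(path).convert("RGB").tobytes() and turned back into an image with Image.frombytes("RGB", size, data) before saving. If the stylized output from the swapped input looks as expected, that would point to a channel-order problem; if not, it can be ruled out.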

@3DTOPO
Author

3DTOPO commented Apr 19, 2018

Apparently it has something to do with normalization in the latest Torch. It's definitely not a saturation or swapped-channel problem.
