Server on Colab Pro #31

Open
EmrahErden opened this issue Jun 5, 2022 · 12 comments

Comments

@EmrahErden

According to the README.md file:

"DALL·E Flow needs one GPU with 21GB memory at its peak. All services are squeezed into this one GPU.
...
CPU-only environment is not tested and likely won't work. Google Colab is likely throwing OOM hence also won't work."

I'm planning to run the server on Colab Pro, which can provide up to 36 GB of RAM. Before buying the paid subscription: has anyone tried running the server there successfully?
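(For anyone weighing the subscription: a quick way to check how much RAM a given Colab runtime actually exposes, before launching anything. A minimal standard-library sketch; the ~21 GB figure is the peak requirement quoted from the README above.)

```python
import os

# Total physical memory visible to the runtime, in GiB.
# Run this in a Colab cell to see whether the instance meets
# dalle-flow's ~21 GB peak requirement before starting the server.
page_size = os.sysconf("SC_PAGE_SIZE")    # bytes per memory page
num_pages = os.sysconf("SC_PHYS_PAGES")   # total physical pages
total_gib = page_size * num_pages / 2**30
print(f"Total RAM: {total_gib:.1f} GiB")
```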

@jesse-lane-ai

My question is along the same lines: I have a Colab Pro account; how do I just run the model on that?

@xnohat

xnohat commented Jun 13, 2022

I ran the server successfully on Colab Pro, but you need the ngrok trick to expose the server to the outside world: https://colab.research.google.com/github/shawwn/colab-tricks/blob/master/ngrok-tricks.ipynb

```
!mkdir dalle && cd dalle && git clone https://github.com/jina-ai/dalle-flow.git && git clone https://github.com/JingyunLiang/SwinIR.git && git clone https://github.com/CompVis/latent-diffusion.git && git clone https://github.com/hanxiao/glid-3-xl.git

!cd dalle/latent-diffusion && pip install -e . && cd - && cd dalle/glid-3-xl && pip install -e . && cd -

!cd dalle/glid-3-xl && wget https://dall-3.com/models/glid-3-xl/bert.pt && wget https://dall-3.com/models/glid-3-xl/kl-f8.pt && wget https://dall-3.com/models/glid-3-xl/finetune.pt && cd -

!cd dalle/dalle-flow && pip install -U jax && pip install -U "jax[cuda]" -f https://storage.googleapis.com/jax-releases/jax_releases.html && pip install -r requirements.txt

!pip install jina

!cd dalle/dalle-flow && jina flow --uses flow.yml
```
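(One pitfall with the ngrok trick: if you point ngrok at the port before the Flow has finished loading its models, clients just see connection failures. A minimal sketch to confirm the gateway is actually listening first; the port number 51005 is an assumption taken from dalle-flow's default flow.yml, so adjust it to whatever your config exposes.)

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 60.0) -> bool:
    """Poll until something is listening on (host, port), or give up."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)
    return False

# 51005 is an assumption based on dalle-flow's default flow.yml;
# change it if your flow exposes a different port.
if wait_for_port("127.0.0.1", 51005, timeout=5.0):
    print("gateway is up, safe to start ngrok")
else:
    print("gateway not reachable yet")
```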

@Curtis-64

Curtis-64 commented Jun 19, 2022

I was able to build locally with only 8 GB VRAM, and it worked!

I have a GTX 1080 with 8 GB VRAM and 24 GB RAM total. GPU memory went to 100%, it printed "done with !", then CPU load went to 100%. VRAM never went above 100% of the 8 GB. The local run took 12 min vs. 5 min 30 s on the public server; however, the later steps only took about 30 s locally, while the public server took 4 min.

Total time for the first (longest) step was 12 min 30 s on the 1080 and a 12900K CPU running WSL 2 under Windows 11. It must be using the GPU, because Dalle Mega on GPU takes about 2-3 min for 6 images, while CPU-only took 40 min with dalle-playground.

I signed up for Google Colab Pro as well to experiment with it, and I'm looking for more info on how to make it work. I only want to access it for personal use.

One thing you can certainly do to speed things up is run every step locally except the first one.

@davisengeler

@Curtis-64 are you using Docker or did you run it natively? I’m struggling to get either to work for dalle-flow, despite being able to run dalle-playground (using the Mega model) natively using GPU. Any tips for me to get flow working? Thanks!

@Curtis-64

Curtis-64 commented Jun 20, 2022

@davisengeler That's very interesting, because I got dalle-playground running with Mega (not Mega full) with GPU acceleration in Docker on Win 11, but I'm having trouble getting it to recognize the GPU under Linux. I have dalle-playground running both in a Docker image on Windows and in a Docker image on Linux under WSL on Windows 11. However, the Linux image is no longer using the GPU: I thought it used it once, but now it reports the GPU as unrecognized, even though I was sure it worked once under Linux too. I hit the pocketFFT error and applied the one-line fix. It works on the CPU, but a render takes 40 min on an i9-12900K vs. 2-3 min on the 1080 GPU.

I have flow working in a Docker image on Ubuntu 20.04 under WSL 2 with apparent GPU acceleration, although part of it runs on the CPU. I also think playground produced better initial results, and the GLIDE stuff could be removed from the first stage. Do you have WSL 2?

I have a 1080 with 8 GB VRAM and 32 GB RAM on my system; make sure you have at least those specs, as I am close to maxing out the RAM. Also, I switched my primary display to my CPU's integrated GPU and output through it, so my Nvidia GPU is 100% free for processing.

If you don't already use Linux/WSL 2, expect to devote about a week to it. You need to get Linux to recognize your GPU using paravirtualization via the WSL toolkit, and then you also need to install the NVIDIA Docker GPU support. Do not install a driver under Linux; the host must have the latest drivers. You need a Pascal or newer video card.

See these resources:

https://docs.nvidia.com/cuda/wsl-user-guide/index.html
https://docs.nvidia.com/ai-enterprise/deployment-guide/dg-docker.html
https://docs.microsoft.com/en-us/windows/ai/directml/gpu-cuda-in-wsl
https://gist.github.com/tdcosta100/385636cbae39fc8cd0937139e87b1c74

@Curtis-64

Curtis-64 commented Jun 20, 2022

For some reason dalle-playground is running on my Linux with GPU support today, but it did not yesterday. This stuff is tough to get working unless you are an expert in Linux, Docker, WSL 2, GPUs, CUDA, etc.

But if you work at it for maybe 24 to 200 hours you might get it. It took me about 72 hours of sustained effort, and I'm a software engineer (though new to Linux and rarely use it).

@hanxiao
Member

hanxiao commented Jun 20, 2022

Hi all, since we are discussing running on Colab, I want to share two articles that we officially recommend for running any Jina app on Colab.

They include Google Colab examples that run out of the box.

Although these do not directly answer "how can one run dalle-flow on Colab", I believe that with these best practices, that day is not far away.

@davisengeler
Copy link

davisengeler commented Jun 21, 2022

Thanks @hanxiao. I was able to get the server started and exposed from a Colab Pro GPU notebook. The frontend notebook can connect to it, but the server notebook stops running before returning a result in the first generation step.

Is this expected for the time being, or have I done something wrong? Thanks!

@hanxiao
Member

hanxiao commented Jun 21, 2022

If the notebook suddenly stops working while you are connected, it is very likely that an OOM is happening.
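(One way to check the OOM theory from the server notebook is to log the process's peak memory while the flow is generating. A minimal standard-library sketch; note that `ru_maxrss` is reported in KiB on Linux, which is what Colab runs.)

```python
import resource

def peak_rss_gib() -> float:
    """Peak resident set size of this process, in GiB (Linux: ru_maxrss is KiB)."""
    kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return kib / 2**20

# Call this periodically (e.g. from a background thread) while the flow
# is generating; a value creeping toward the runtime's RAM limit right
# before the notebook dies points to an OOM kill.
print(f"peak RSS so far: {peak_rss_gib():.2f} GiB")
```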

@hanxiao
Member

hanxiao commented Jun 21, 2022

Hi all, just FYI: I'm working on a fully free Colab solution. Stay tuned over the coming days!

@GreekPhysique

@hanxiao Looking forward to it!

@ghost

ghost commented Sep 9, 2022

> I was able to build locally with only 8GB VRAM. And it worked! […]

This is really useful to know. I was put off trying until I saw this. I'll give it a go with my 1080 Ti!

Labels: none yet · Projects: none yet · No branches or pull requests · 8 participants