Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump FAISS container base to Debian 12 and fix up various issues #73

Merged
merged 2 commits into from
Jul 22, 2023

Conversation

MostAwesomeDude
Copy link
Contributor

Following up from #61. This is more work towards #49.

@cla-bot cla-bot bot added the CLA CLA signed? label Jul 16, 2023
Copy link
Member

@rstarmer rstarmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still some missing bits here, but a good direction. See comments on changes and additional elements of this build container.

platform/packaging/build/Dockerfile Show resolved Hide resolved
platform/packaging/build/Dockerfile Outdated Show resolved Hide resolved
platform/packaging/build/Dockerfile Outdated Show resolved Hide resolved
RUN mkdir /build
WORKDIR /build
ADD https://github.com/facebookresearch/faiss/archive/refs/tags/v1.7.4.tar.gz /build/
# XXX For whatever reason, Podman *doesn't* automatically unpack the source
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunate. You may want to wrap the tar in a test to ensure the file exists then, that way this should work with Docker (which I don't believe keeps the downloaded "archive" around) and in Podman, which apparently does.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is limited value in supporting multiple competing toolchains to produce our packaging.

Why should we do so?

ADD https://github.com/facebookresearch/faiss/archive/refs/tags/v1.7.4.tar.gz /build/
# XXX For whatever reason, Podman *doesn't* automatically unpack the source
# tarball. This is fine, but at odds with Docker's behavior.
RUN tar zxf v1.7.4.tar.gz

RUN virtualenv /build/local
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a virtualenv here? I know python complains about root owned files, but as this container is really a build container and not a runtime service, this may be overkill.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Containers should be written to run with unprivileged users. I can't find it, but I could have sworn that it was official Debian policy to recommend virtualenv over system pip in all cases.

That said, I don't think that we need a virtualenv, and I primarily reached for it because Debian.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that containers with very limited capabilities are of strong value.

We are using the container to provide build-time isolation while generating faiss. We have no intent of running the container, so run-time isolation is of no value.

Maybe we should simplify and remove the container entirely from the equation.

@@ -1,4 +1,4 @@
#!/bin/sh

. /opt/intel/oneapi/setvars.sh
cmake -B build faiss-1.7.3 -DBUILD_SHARED_LIBS=ON -DFAISS_ENABLE_GPU=ON -DFAISS_ENABLE_PYTHON=ON -DFAISS_ENABLE_RAFT=OFF -DBUILD_TESTING=ON -DBUILD_SHARED_LIBS=ON -DFAISS_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DFAISS_OPT_LEVEL=avx2 -Wno-dev
cmake -B build faiss-1.7.4 -DBUILD_SHARED_LIBS=ON -DFAISS_ENABLE_GPU=ON -DFAISS_ENABLE_PYTHON=ON -DFAISS_ENABLE_RAFT=OFF -DBUILD_TESTING=ON -DBUILD_SHARED_LIBS=ON -DFAISS_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DFAISS_OPT_LEVEL=avx2 -Wno-dev
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this step work for you to build the avx2 .so file? It didn't in my previous Debian12 test.

To that end, can we at least add a validation in the build process to determine if the resultant libraries and python whl file exist on disk? That would be a necessary precursor to uploading them to an archive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm still slowly working on building a Debian-style FAISS package. I am not good at Debian packaging, though, so it's taking me a while.

Copy link
Member

@sdake sdake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is shaping up. What we need are apt, whl, and deb artifacts. I am not clear how this build container outputs these to the user (us).

Any thoughts?

platform/packaging/build/Dockerfile Show resolved Hide resolved
platform/packaging/build/Dockerfile Outdated Show resolved Hide resolved
RUN mkdir /build
WORKDIR /build
ADD https://github.com/facebookresearch/faiss/archive/refs/tags/v1.7.4.tar.gz /build/
# XXX For whatever reason, Podman *doesn't* automatically unpack the source
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is limited value in supporting multiple competing toolchains to produce our packaging.

Why should we do so?

ADD https://github.com/facebookresearch/faiss/archive/refs/tags/v1.7.4.tar.gz /build/
# XXX For whatever reason, Podman *doesn't* automatically unpack the source
# tarball. This is fine, but at odds with Docker's behavior.
RUN tar zxf v1.7.4.tar.gz

RUN virtualenv /build/local
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that containers with very limited capabilities are of strong value.

We are using the container to provide build-time isolation while generating faiss. We have no intent of running the container, so run-time isolation is of no value.

Maybe we should simplify and remove the container entirely from the equation.

RUN apt-get update && \
apt-get install -y --no-install-recommends \
python3-dev python3-numpy-dev swig python3-virtualenv \
cmake git gpg default-jre
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unclear why JRE is needed, although not sure it matters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ca-certificates appears to fail to configure during post-install without a JRE present.

>> /etc/apt/sources.list.d/oneAPI.list
# Add Intel keyring.
ADD https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB /build/
RUN < GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB gpg --dearmor > /usr/share/keyrings/oneapi-archive-keyring.gpg
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does gpg --dearmor ./GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB > /usr/share/keyrings/oneapi-archive-keyring.gpg work here?

@rstarmer
Copy link
Member

@MostAwesomeDude podman build . -t faiss completes, and the image does have the libfaiss.so, libfaiss_python_callbacks.so, and _swigfaiss.so libraries, along with the faiss-1.7.4-py3-none-any.whl. I note that the libfaiss-avx2.so is missing. It might be good to just extract those files (the .so files and the whl) and store them as resultant artifacts in an OCI S3 bucket. Adding a github action to build this, and save the results becomes something that can be accomplished in another pull request.

@sdake
Copy link
Member

sdake commented Jul 20, 2023

Please note current repo does not build.

#75

@rstarmer rstarmer mentioned this pull request Jul 20, 2023
@sdake
Copy link
Member

sdake commented Jul 20, 2023

@MostAwesomeDude you had mentioned in this review that you are working on Debian packaging. Debian packages this upstream, although it may lack CUDA support.

The Debian installation is here:
git clone https://salsa.debian.org/deeplearning-team/faiss/

(the debian dir) contains the control files.

Copy link
Member

@rstarmer rstarmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

  • Tested on a debian with docker-engine environment
  • Was able to extract the required .whl and .so/.so.2 libraries (libfaiss, faiss_python, _swigfaiss, and I believe we could grab the MKL libraries as well)
  • Tested pytorch-retro on an Ubuntu 22.04 env.

@MostAwesomeDude MostAwesomeDude merged commit 5a5939f into main Jul 22, 2023
@MostAwesomeDude MostAwesomeDude deleted the faiss-debian12-fixup branch July 22, 2023 17:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA CLA signed?
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants