5 changes: 4 additions & 1 deletion .gitignore
@@ -1 +1,4 @@
build*/
build/
faiss*
Member:

I don't follow the need for these .gitignore changes. With this, a git status will not show the dirty state of the repository in these directories.

I don't think that is right.

faiss/
cuda*
57 changes: 57 additions & 0 deletions platform/build-faiss/README.md
@@ -0,0 +1,57 @@
# Set up an Ubuntu 22.04 machine to build FAISS

## Setup for build in Ubuntu 22.04 with podman

Install podman:

```sh
sudo apt install podman -y
```

Run the build in podman:
Member:

I am familiar with podman. Dan Walsh is a great fella, although, with Red Hat's recent changes, it is not clear to me that we should accept a dependency on any single-vendor project, especially those sponsored directly by Red Hat.

Contributor:

Good point. I personally would recommend ociTools if we want to efficiently declare containers, which would let us rely on a much better-supported project with a more diverse user base. For now, I view Dockerfiles as a happy intermediate way to use either Docker or Podman, and we can always just remember to type "docker" on Docker-based machines.

Member:

IMO, this is where it's at: https://github.com/containerd/nerdctl

We could have continuous battles about what tools we use. These battles, i.e. religious wars, are why (successful) tech firms build their own toolchain, top to bottom.

Are we, as a team of founding engineers, willing to adopt a pragmatic perspective?

Contributor:

Sure. Pragmatically, containers run on k8s, Oracle, and Podman. That's it. Docker is a fading bad dream. For me, what runs the containers is uninteresting, thanks to OCI. And what builds the containers is also uninteresting, as long as it's reproducible; I can learn whatever builder you want to use.

Member:

You state it well. We should, as a firm, attempt to eliminate ambiguity aversion. We do that by working on high-value areas. Packaging is not high value unless it directly relates to distribution.

Member:

I think we understand our differing points of view. I also believe we all agree that containerd is the runtime we will use, if we need to run container workloads. Some frontends are daemon-based (docker and a million clones, including cog); in addition there are rootless variants (podman, nerdctl, and many more).

containerd is one of many OCI runtimes. This runtime is widely used and has wide investment across the technology ecosystem. This component provides:

* Lifecycle management (LCM): start, stop, delete
* Runtime: exec, logs

FWIW, I agree: docker is and was a massive swing and a miss.


Member:

`console` is the only markup in .md that GitHub understands.

```sh
podman run --rm -it -v ${PWD}/dev/origin/:/origin ubuntu:22.04 /bin/bash /origin/build.sh
```

This should produce two files:

* `python*.whl`: a Python wheel for faiss deployment
* `faiss-libs.tgz`: a set of libraries for FAISS. Note that the Intel libraries are still required as well.

Member:

nice! I am pulling this PR now to see it in action!

## Setup for Ubuntu 22.04 bare metal in OCI
Assumptions:

* /dev/nvme0n1 exists and can be reformatted
* an NVIDIA GPU is installed

## Base setup

* Add python prerequisites
* Mount /dev/nvme0n1 on /models
* Link .cache and .local from the ubuntu user's home to /models

```sh
bash add_dev.sh
```

## Build prerequisites

Add the NVIDIA and Intel oneAPI libraries needed to build FAISS

```sh
bash build-prereqs.sh
```

## Build FAISS

Download the git repository and build it!

```sh
bash build-faiss.sh
```
21 changes: 21 additions & 0 deletions platform/build-faiss/add_dev.sh
@@ -0,0 +1,21 @@
#!/bin/bash

sudo apt-get update && sudo apt-get dist-upgrade -y

# mount nvme disk on /models
sudo mkdir /models
sudo mkfs.xfs /dev/nvme0n1
Member:

Not a fan of destructive operations in shell scripts. Especially those that could eat the entire filesystem if in error.

Member:

your assumption in line 30 sort of covers this. Sort of, as in no destructive operation should be done anywhere without extreme clarity. I would prefer you pull this destructive operation out, as I could never run add_dev.sh on beast, as it would wipe my main disk.
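A sketch of the kind of guard the reviewers are asking for (the function name and prompt wording are assumptions, not project policy): refuse to run the destructive command unless the operator types the exact device path.

```shell
# Hypothetical confirmation guard for the destructive mkfs step.
guard_format() {
    disk="$1"
    typed="$2"
    # Only proceed when the operator typed the exact device path.
    if [ "$typed" != "$disk" ]; then
        echo "refusing to format $disk" >&2
        return 1
    fi
    echo "ok to format $disk"
}

# In add_dev.sh this would wrap the destructive call:
#   printf 'Type %s to confirm formatting: ' /dev/nvme0n1; read -r reply
#   guard_format /dev/nvme0n1 "$reply" || exit 1
#   sudo mkfs.xfs /dev/nvme0n1
```

This keeps the script runnable unattended only when the target is stated twice, which is the failure mode the reviewer worries about.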

echo '/dev/nvme0n1 /models xfs defaults 0 2' | sudo tee -a /etc/fstab
sudo mount -a
sudo chmod 777 /models

# Add pointers to large data dirs into the 'ubuntu' user $HOME
mkdir /models/cache
mv ~/.cache ~/.cache.orig
ln -s /models/cache ~/.cache
mkdir /models/dev
ln -s /models/dev
mkdir /models/local
ln -s /models/local ~/.local
Member:

The goal of this script is a little mysterious. It appears like it is meant to create local storage for development? Is that faiss specific? If not, could you pull this file out, and put it in a different PR? Also, a comment at the top of the file describing the purpose would help.


echo 'export PATH=$HOME/.local/bin:$PATH' >> ~/.bashrc
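The relocation steps above fail on a second run (`mkdir` and `mv` both error once the links exist). An idempotent sketch (the function name and the base-directory parameter are assumptions; add_dev.sh hardcodes /models):

```shell
# Hypothetical idempotent version of the .cache/.local relocation.
link_into() {
    base="$1"
    name="$2"
    src="$HOME/$name"
    dst="$base/$name"
    mkdir -p "$dst"
    # Preserve an existing real directory once, then (re)point the symlink.
    if [ -e "$src" ] && [ ! -L "$src" ]; then
        mv "$src" "$src.orig"
    fi
    ln -sfn "$dst" "$src"
}

# usage: link_into /models .cache && link_into /models .local
```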
67 changes: 67 additions & 0 deletions platform/build-faiss/build-faiss.sh
@@ -0,0 +1,67 @@
#!/bin/bash

git clone https://github.com/facebookresearch/faiss
cd faiss

# Configure paths and set environment variables
Member:

nit: pointless comment. The problem with these types of comments is, over time, they introduce code smells.

export PATH=$PATH:$HOME/.local/bin:/usr/local/cuda/bin
source /opt/intel/oneapi/setvars.sh
# Optional compiler overrides, kept for reference:
#export CC=gcc-12
#export CXX=g++-12
#export CXX=g++-11

# Configure using cmake

LD_LIBRARY_PATH=/usr/local/lib MKLROOT=/opt/intel/oneapi/mkl/2023.2.0/ cmake -B build \
-DBUILD_SHARED_LIBS=ON \
-DBUILD_TESTING=ON \
-DFAISS_ENABLE_GPU=ON \
-DFAISS_OPT_LEVEL=avx2 \
-DFAISS_ENABLE_C_API=ON \
-DCMAKE_BUILD_TYPE=Release \
-DBLA_VENDOR=Intel10_64_dyn -Wno-dev .
# Alternative configuration, kept for reference:
#cmake -B build . \
#  -DBUILD_SHARED_LIBS=ON \
#  -DFAISS_ENABLE_GPU=ON \
#  -DFAISS_ENABLE_PYTHON=ON \
#  -DFAISS_ENABLE_RAFT=OFF \
#  -DBUILD_TESTING=ON \
#  -DFAISS_ENABLE_C_API=ON \
#  -DCMAKE_BUILD_TYPE=Release \
#  -DFAISS_OPT_LEVEL=avx2 -Wno-dev

Member:

22-31 are redundant with 14-21.

Member:

15-31 are one massive duplicate run-on sentence ;-)

Here is my prepare operation:

cmake -B _build -DBUILD_SHARED_LIBS=ON -DFAISS_ENABLE_GPU=ON -DFAISS_ENABLE_PYTHON=ON -DFAISS_ENABLE_RAFT=OFF -DBUILD_TESTING=ON -DBUILD_SHARED_LIBS=ON -DFAISS_ENABLE_C_API=ON -DCMAKE_BUILD_TYPE=Release -DFAISS_OPT_LEVEL=avx2 -DBLA_VENDOR=Intel10_64lp -Wno-dev .

The major difference is the BLA_VENDOR. I am actually not sure which is more correct.

# Now build faiss

make -C build -j$(nproc) faiss
make -C build -j$(nproc) swigfaiss
pushd build/faiss/python; python3 setup.py bdist_wheel; popd

# and install it. NOTE: this will install into the pyenv virtualenv 'aw' from the beginning of the script

sudo -E make -C build -j$(nproc) install
pip install --force-reinstall build/faiss/python/dist/faiss-1.7.4-py3-none-any.whl
cp build/faiss/python/dist/faiss-1.7.4-py3-none-any.whl ../
Member:

I want to standardize what is put where. Having output files randomly strewn in different locations makes it hard for new contributors to ramp up. No reason it should be.

In the case of any output from a build operation:

mkdir ${PWD}/target
cp -a build/faiss/python/dist/faiss-1.7.4-py3-none-any.whl ${PWD}/target


# add libraries to /usr/local/lib
Member (@sdake, Jul 16, 2023):

Please, try using this more robust approach:
DESTDIR
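A toy demonstration of that DESTDIR suggestion (the Makefile here is hypothetical; CMake-generated install targets such as faiss's honor DESTDIR the same way): stage the install into a scratch root instead of copying individual files by hand.

```shell
# Create a throwaway project whose install target honors DESTDIR.
demo=$(mktemp -d) && cd "$demo"
cat > Makefile <<'EOF'
install: ; mkdir -p $(DESTDIR)/usr/local/lib && touch $(DESTDIR)/usr/local/lib/libdemo.so
EOF
# Everything lands under ./stage instead of the live filesystem.
make install DESTDIR="$PWD/stage"
ls stage/usr/local/lib
```

The staged tree can then be tarred in one step (`tar -C stage -czf faiss-libs.tgz .`) or installed for real by re-running without DESTDIR.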

mkdir -p faiss-libs

for n in build/faiss/python/*so build/faiss/*so
do
    sudo cp $n /usr/local/lib/
    cp $n faiss-libs/
done
tar cfz ../faiss-libs.tgz faiss-libs/*
rm -rf faiss-libs

# Add ldconfig settings for intel and faiss libraries

echo '/opt/intel/oneapi/mkl/2023.1.0/lib/intel64' | sudo tee /etc/ld.so.conf.d/aw_intel.conf
Member:

add a drop-in file as part of the PR, and then copy the dropin. I do not prefer echo with tee operations. Dropins are far more robust than tee operations. They are also more easily packaged...
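A sketch of that drop-in approach (file and function names are assumptions). The .conf file would be committed alongside the PR and installed verbatim; the directory parameter keeps the sketch testable without root, while the real target is /etc/ld.so.conf.d.

```shell
# Install a committed ldconfig drop-in file rather than generating it with tee.
install_ld_dropin() {
    conf="$1"
    dir="${2:-/etc/ld.so.conf.d}"
    install -m 0644 "$conf" "$dir/$(basename "$conf")"
}

# usage (as root): install_ld_dropin aw_faiss.conf && ldconfig
```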

echo '/usr/local/lib' | sudo tee /etc/ld.so.conf.d/aw_faiss.conf

# Update the ld cache

sudo -E ldconfig
Member:

-E is not needed.


cd ..
rm -rf faiss
44 changes: 44 additions & 0 deletions platform/build-faiss/build-prereqs.sh
@@ -0,0 +1,44 @@
#!/bin/bash

set -e
export PATH=$HOME/.local/bin:$PATH
export DEBIAN_FRONTEND=noninteractive

cat <<EOF | sudo tee /etc/apt/sources.list.d/debian-contrib.list
Member:

I like this approach; I am borrowing it!

deb http://http.us.debian.org/debian bullseye contrib
deb-src http://http.us.debian.org/debian bullseye contrib
deb http://security.debian.org/debian-security bullseye-security contrib
deb-src http://security.debian.org/debian-security bullseye-security contrib
deb http://http.us.debian.org/debian bullseye-updates contrib
deb-src http://http.us.debian.org/debian bullseye-updates contrib
EOF

sudo -E apt-get update && sudo -E apt-get dist-upgrade -y

# Install python and build essentials and essential libraries
sudo -E apt-get install -y python3-venv python3-pip python3-wheel python3-dev build-essential libssl-dev libffi-dev libxml2-dev libxslt1-dev liblzma-dev libsqlite3-dev libreadline-dev libbz2-dev neovim curl git wget

# Add a couple Python prerequisites
sudo pip install -U pip setuptools wheel numpy swig torch

# Get Intel OneAPI for BLAS support
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB \
| gpg --dearmor | sudo -E tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo -E tee /etc/apt/sources.list.d/oneAPI.list

# ensure we're using the latest cmake
wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | sudo -E tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null
echo 'deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ jammy main' | sudo -E tee /etc/apt/sources.list.d/kitware.list >/dev/null

# add the cuda tools to build against
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo -E dpkg -i cuda-keyring_1.1-1_all.deb

# Update and install MKL, Cmake, and Cuda-toolkit
sudo -E apt-get update
sudo -E apt install intel-oneapi-mkl cmake cuda-11-8 -y

# Verify python and pytorch work

python3 -c 'import torch; print(f"Is CUDA Available: {torch.cuda.is_available()}")'
Member:

How about the message f"PyTorch and CUDA are operating correctly"? The problem case is when pytorch is not installed at all...
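A more defensive variant of the verification step above (a sketch; the message wording is an assumption): it distinguishes "torch missing" from "torch present but CUDA unavailable", which is the case the reviewer raises.

```shell
# Verify PyTorch is importable before asking it about CUDA.
python3 - <<'EOF'
try:
    import torch
except ImportError:
    print("PyTorch is not installed")
else:
    if torch.cuda.is_available():
        print(f"PyTorch {torch.__version__} and CUDA are operating correctly")
    else:
        print(f"PyTorch {torch.__version__} is installed, but CUDA is unavailable")
EOF
```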


106 changes: 106 additions & 0 deletions platform/build-faiss/build.sh
@@ -0,0 +1,106 @@
#!/bin/bash

if [ -d /origin ]; then
    cd /origin/platform/build-faiss
else
    echo "artificialwisdomai/origin project needs to exist"
    exit 1
fi

if [[ ! $(id -u) -eq 0 ]]; then
    echo "This needs to run as root"
    exit 1
fi

export PATH=$HOME/.local/bin:$PATH
export DEBIAN_FRONTEND=noninteractive

apt-get update && apt-get dist-upgrade -y

# Install python and build essentials and essential libraries
apt-get install -y python3-venv python3-pip python3-dev build-essential libssl-dev libffi-dev libxml2-dev libxslt1-dev liblzma-dev libsqlite3-dev libreadline-dev libbz2-dev neovim curl git wget

# Update Setuptools
python3 -m pip install -U pip setuptools wheel

# Add a couple Python prerequisites
pip install numpy swig torch

# Get Intel OneAPI for BLAS support
# From: https://www.intel.com/content/www/us/en/docs/oneapi/installation-guide-linux/2023-0/apt.html

# download the key to system keyring
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB \
| gpg --dearmor | tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null

# add signed entry to apt sources and configure the APT client to use Intel repository:
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | tee /etc/apt/sources.list.d/oneAPI.list

apt update
apt install dkms intel-basekit -y

## Get CUDA and install it

curl -sLO https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda_12.2.0_535.54.03_linux.run
Member:

why not use the apt packaging? Sure, the .run packaging works; was this just testing?

bash $PWD/cuda_*run --silent --toolkit --driver --no-man-page

# ensure we're using the latest cmake
wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | gpg --dearmor - | tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null

echo 'deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ jammy main' | tee /etc/apt/sources.list.d/kitware.list >/dev/null

# add the cuda tools to build against

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
dpkg -i cuda-keyring_1.1-1_all.deb
apt-get update
apt-get install cmake cuda-toolkit -y
Member:

strongly conflicts with 45.


# Verify python and pytorch work

python3 -c 'import torch; print(f"Is CUDA Available: {torch.cuda.is_available()}")'

git clone https://github.com/facebookresearch/faiss
cd faiss

# Configure paths and set environment variables
export PATH=$PATH:$HOME/.local/bin:/usr/local/cuda/bin
source /opt/intel/oneapi/setvars.sh

# Configure using cmake

LD_LIBRARY_PATH=/usr/local/lib MKLROOT=/opt/intel/oneapi/mkl/2023.2.0/ CXX=g++-11 cmake -B build \
-DBUILD_SHARED_LIBS=ON \
-DBUILD_TESTING=ON \
-DFAISS_ENABLE_GPU=ON \
-DFAISS_OPT_LEVEL=avx2 \
-DFAISS_ENABLE_C_API=ON \
-DFAISS_ENABLE_PYTHON=ON \
-DCMAKE_BUILD_TYPE=Release \
-DFAISS_ENABLE_RAFT=OFF \
-DBLA_VENDOR=Intel10_64_dyn -Wno-dev .

# Now build faiss

make -C build -j$(nproc) faiss
make -C build -j$(nproc) swigfaiss
pushd build/faiss/python; python3 setup.py bdist_wheel; popd

# and install it. NOTE: this will install into the pyenv virtualenv 'aw' from the beginning of the script

make -C build -j$(nproc) install
#pip install --force-reinstall build/faiss/python/dist/faiss-1.7.4-py3-none-any.whl
cp build/faiss/python/dist/faiss-1.7.4-py3-none-any.whl ../

# add libraries to /usr/local/lib
mkdir -p ../faiss-libs

for n in build/faiss/python/*so build/faiss/*so
do
    cp $n ../faiss-libs/
done
tar cfz ../faiss-libs.tgz ../faiss-libs/*
rm -rf ../faiss-libs

cd ..
#rm -rf faiss
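The .so copy loop above could be hardened slightly (a sketch, demonstrated on placeholder files in a temporary directory; the stand-in .so names are assumptions): quoting survives odd paths, and the `-e` test skips a glob that matched nothing instead of copying a literal `*so`.

```shell
# Demonstrate the hardened copy loop against stand-in build artifacts.
work=$(mktemp -d) && cd "$work"
mkdir -p build/faiss/python target
touch build/faiss/libfaiss.so build/faiss/python/_swigfaiss.so   # stand-ins
for n in build/faiss/python/*so build/faiss/*so; do
    [ -e "$n" ] && cp -a "$n" target/
done
ls target
```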
18 changes: 18 additions & 0 deletions retrieval/run_training.sh
@@ -0,0 +1,18 @@
#!/bin/bash

git clone https://github.com/istio/istio.io.git /tmp/istio/
mv /tmp/istio/content/en ./en

rm -rf chunks *json
mkdir -p ./chunks

if [ -d .venv ]; then
    source .venv/bin/activate
else
    python3 -m venv .venv
    source .venv/bin/activate
    pip install -U pip wheel -r requirements.txt
    pip install ../platform/build-faiss/faiss-1.7.4-py3-none-any.whl
fi

python3 train.py
4 changes: 2 additions & 2 deletions retrieval/train.py
@@ -80,7 +80,7 @@
retro=retro,
knn=2,
chunk_size=64,
documents_path="/home/sdake/en",
documents_path="./en",
# models/RedPajama-Data-1T-Sample",
glob="**/*.md",
chunks_memmap_path="./chunks/train.chunks.dat",
@@ -132,7 +132,7 @@
" [aw.a]•[/aw.a] [aw.b]retrieval_model[/aw.b][aw.a]=[/aw.a][aw.b]artificialwisdomai[/aw.b][aw.a]/[/aw.a][aw.b]retroformer [aw.a]•[/aw.a] [aw.b]foundation_model[/aw.b][aw.a]=[/aw.a][aw.b]mosaicml[/aw.b][aw.a]/[/aw.a][aw.b]mpt30b[/aw.b] [aw.a]•[/aw.a] "
)
for epoch in range(EPOCH_MAX):
dataloader = iter(wrapper.get_dataloader(batch_size=4, shuffle=True))
dataloader = iter(wrapper.get_dataloader(batch_size=2, shuffle=True))
task_id = progress_bar.add_task(
description="Epoch {}".format(epoch), loss="loss=nil", total=len(dataloader)
)