PSC Bridges2

Build PyTorch From Source on the Pittsburgh Supercomputing Center

Zhongxuan Song

In our research, we need to modify the PyTorch source code to improve the performance of deep neural network training.

This article explains how to build PyTorch from source on the Pittsburgh Supercomputing Center (PSC). The dependencies may change over time, so treat the versions listed here as a reference rather than a fixed recipe. The article may also be a good starting point for anyone who is new to the PSC or to remote clusters.

I followed the instructions on https://medium.com/repro-repo/build-pytorch-from-source-on-ubuntu-18-04-1c5556ca8fbf by Zhanwen Chen.

  1. Preparation
conda create --name pytorch-build python=3.7.3 numpy=1.16.3
interact -p GPU-shared --gres=gpu:2 -t 2:00:00
or
interact -p GPU --gres=gpu:8 -N 1 -t 1:00:00
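
Once the interactive session has started, it can be worth confirming that the requested GPUs are actually visible before spending time on a build. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH of the compute node (this article does not confirm that); `gpu_count` is a hypothetical helper name, not a PSC command.

```shell
# Hypothetical helper: print how many GPUs nvidia-smi can see.
# Prints 0 when nvidia-smi is not installed or reports no devices.
gpu_count() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    # "nvidia-smi -L" lists one line per device, each starting with "GPU ".
    nvidia-smi -L 2>/dev/null | grep -c '^GPU '
  else
    echo 0
  fi
}

# Usage, inside the interactive session:
#   gpu_count
```

With the first `interact` command above, the expectation would be a count of 2; a count of 0 suggests the session is not on a GPU node.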

Initial Dependencies

As you can see, the initial dependencies do not include torch.

Here is all the code we need to build PyTorch from source.

conda activate pytorch-build
conda install numpy pyyaml mkl mkl-include setuptools cmake cffi typing typing-extensions
conda install -c pytorch magma-cuda100
export CMAKE_PREFIX_PATH="$HOME/anaconda3/envs/pytorch-build"
mkdir /path_to_folder_store_the_source
git clone --recursive https://github.com/pytorch/pytorch
export CUDA_NVCC_EXECUTABLE="/usr/local/cuda-10.0/bin/nvcc"
export CUDA_HOME="/usr/local/cuda-10.0"
export CUDNN_INCLUDE_PATH="/usr/local/cuda-10.0/include/"
export CUDNN_LIBRARY_PATH="/usr/local/cuda-10.0/lib64/"
export LIBRARY_PATH="/usr/local/cuda-10.0/lib64"
export USE_CUDA=1 USE_CUDNN=1 USE_MKLDNN=1
cd pytorch/
python setup.py install

Using 2 GPUs:
start 10:52
end 11:20

python setup.py clean --all # use this command to clean the build

Walkthrough:

We first create an environment with python=3.7.3 and numpy=1.16.3, then activate it and install all the dependencies. Next, we set the build flags and clone the source code from GitHub. Finally, we build the code.
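
Because the build takes a long time, it can help to check that the exported flags are really set in the current shell before starting it. The sketch below is an illustration under assumptions: `check_build_env` is a hypothetical helper, and it only checks that each variable is exported and non-empty, not that the CUDA paths exist on the node.

```shell
# Hypothetical helper: report which of the named variables are exported
# and non-empty. Returns non-zero if any of them is missing.
check_build_env() {
  missing=0
  for name in "$@"; do
    # printenv only sees exported variables, which is what the build needs.
    value=$(printenv "$name" || true)
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    else
      echo "ok: $name=$value"
    fi
  done
  return "$missing"
}

# Usage, after the export lines and before "python setup.py install":
#   check_build_env CMAKE_PREFIX_PATH CUDA_HOME CUDA_NVCC_EXECUTABLE \
#                   CUDNN_INCLUDE_PATH CUDNN_LIBRARY_PATH LIBRARY_PATH
```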

Result:

With 2 GPUs allocated, the build took about 28 minutes (10:52 to 11:20). Before this run, I also built the source code using one GPU, but unfortunately I did not record the time.
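
To avoid losing the timing of a run, the build command can be wrapped so that the elapsed time is printed automatically. This is a minimal sketch rather than part of the original procedure; `timed_run` is a hypothetical helper that relies only on `date +%s`.

```shell
# Hypothetical helper: run any command and print how long it took to stderr.
# The wrapped command's exit status is passed through unchanged.
timed_run() {
  start=$(date +%s)
  if "$@"; then
    status=0
  else
    status=$?
  fi
  end=$(date +%s)
  elapsed=$(( end - start ))
  echo "elapsed: $(( elapsed / 60 )) min $(( elapsed % 60 )) s" >&2
  return "$status"
}

# Usage, from inside the cloned pytorch/ directory:
#   timed_run python setup.py install
```

The shell's built-in `time` keyword gives similar information; the wrapper is just a way to get the result as a single line that is easy to find in a long build log.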

In the figure below, we can see that torch has been correctly installed as a dependency in our environment.

We also pass the verification process provided in the official docs.
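
For reference, a check along the lines of the official verification can be run from the shell. This is a sketch, assuming the `pytorch-build` environment is active so that `python` resolves to its interpreter; `verify_torch` is a hypothetical helper name. It creates a random tensor and reports whether CUDA is available.

```shell
# Hypothetical helper: confirm that torch imports, then print its version,
# a random 5x3 tensor, and whether CUDA is available.
# Run it from outside the cloned pytorch/ directory so that Python imports
# the installed package rather than the source tree.
verify_torch() {
  if ! python -c "import torch" >/dev/null 2>&1; then
    echo "torch is not importable from: $(command -v python || echo 'python (not found)')"
    return 1
  fi
  python -c "import torch; print(torch.__version__); print(torch.rand(5, 3)); print(torch.cuda.is_available())"
}

# Usage:
#   cd "$HOME" && verify_torch
```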

All set.

Have a good one.

Zhongxuan
