Cylon can be built and used through a Conda environment. There are Conda packages for Cylon C++ and Python libraries (libcylon and pycylon).
The following command will install the latest version of Cylon.
Now you can run an example to see if everything is working fine.
Now let's try to build Cylon in a Conda environment.
- Ubuntu 16.04 or higher
First download and install Conda for your Linux distribution.
Here are some example commands for installing Conda. You can choose a different version of Conda.
After installing Conda, activate the Conda environment.
Here are the commands to build Cylon using conda-build, which must be installed first. These commands build the Cylon and PyCylon packages.
After that you can use PyCylon or libcylon as explained above.
Here, built files can be found in the
(the build directory can be specified from the command line with
Additionally, the Cylon libraries are installed to
GPU Cylon (gcylon) provides distributed dataframe processing on NVIDIA GPUs. There are two libraries for gcylon:
- a cpp library (libgcylon)
- a python library (pygcylon)
GCylon libraries depend on the Cylon libraries and on two NVIDIA libraries: cudatoolkit and cudf.
Since cudatoolkit and cudf libraries are rather large, we provide a separate conda environment for installing and compiling gcylon.
The easiest way to compile and run gcylon is through a conda environment. We provide a conda environment yml file that lists all dependencies.
- Clone the Cylon project to your machine from GitHub if you have not already done so.
- Make sure you have Anaconda or Miniconda installed. If not, install one of them first.
- Install cudatoolkit 11.0 or higher.
- Make sure your machine runs Linux and has:
  - NVIDIA driver 450.80.02+
  - A GPU with Pascal architecture or better (Compute Capability >=6.0)
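The driver and GPU requirements above can be checked with a small helper. This is a minimal sketch, not part of Cylon: the function names are hypothetical, and it assumes you have already read the driver version and compute capability as dotted strings from a tool such as nvidia-smi.

```python
def parse_version(text):
    """Turn a dotted version string such as '450.80.02' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Thresholds taken from the requirements list above.
MIN_DRIVER = parse_version("450.80.02")
MIN_COMPUTE_CAPABILITY = (6, 0)  # Pascal architecture or better


def meets_gpu_requirements(driver_version, compute_capability):
    """Return True when both the driver and the GPU satisfy the minimums.

    Both arguments are strings supplied by the caller (hypothetical inputs);
    this helper does not query the hardware itself.
    """
    driver_ok = parse_version(driver_version) >= MIN_DRIVER
    capability_ok = parse_version(compute_capability) >= MIN_COMPUTE_CAPABILITY
    return driver_ok and capability_ok


print(meets_gpu_requirements("460.32.03", "7.0"))
print(meets_gpu_requirements("440.33.01", "7.0"))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as "450.9" sorting after "450.80".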
Go to the Cylon project directory on the command line.
Check the version of your cudatoolkit installation with:
If your cudatoolkit version is not 11.2, update the cudatoolkit version in the file conda/environments/gcylon.yml.
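The comparison against 11.2 can also be scripted. The sketch below is an illustration under assumptions: it expects text in the format printed by `nvcc --version` (one common way to report the toolkit release, which may differ from the command this guide originally used), and the helper names are hypothetical. The sample line is illustrative, not captured from a real machine.

```python
import re


def cuda_release(nvcc_output):
    """Return (major, minor) found after the word 'release', or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def needs_yml_update(nvcc_output, expected=(11, 2)):
    """True when a release was detected and it differs from the expected one."""
    release = cuda_release(nvcc_output)
    return release is not None and release != expected


# Illustrative sample in the usual nvcc output format (an assumption).
sample = "Cuda compilation tools, release 11.4, V11.4.120"
print(cuda_release(sample))
print(needs_yml_update(sample))
```

To use it on a real machine, one option is to pass in the captured output of the version command, for example via `subprocess.run(..., capture_output=True, text=True).stdout`.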
Create the conda environment, install the dependencies, and activate the environment:
Compile and install the Cylon cpp and python packages:
Compile and install the GCylon cpp and python packages:
Check that the pycylon and pygcylon packages are installed after compilation:
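One way to do this check from Python, independent of any particular package manager, is to ask the import system whether each package can be located. This is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical, and the check only confirms that a package is discoverable in the active environment, not that it loads without errors.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["pycylon", "pygcylon"])
if missing:
    print("not installed:", ", ".join(missing))
else:
    print("pycylon and pygcylon are both installed")
```

Run it inside the activated conda environment so that the lookup uses that environment's Python.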
Run the join example from the gcylon examples directory with 2 MPI workers (-n 2) on the local machine:
To enable ucx, add the flags "--mca pml ucx --mca osc ucx" to the mpirun command.
To enable infiniband, add the flag "--mca btl_openib_allow_ib true" to the mpirun command.
To run the join example with both ucx and infiniband enabled on the local machine with two mpi workers:
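The way these flags combine can be sketched as a small command builder. This is an illustration rather than a Cylon utility: the function name is hypothetical, and the script path `python/pygcylon/examples/join.py` is an assumed file name inside the examples directory. The flags themselves are the ones quoted above.

```python
import shlex

# Flags quoted in the text above.
UCX_FLAGS = ["--mca", "pml", "ucx", "--mca", "osc", "ucx"]
INFINIBAND_FLAGS = ["--mca", "btl_openib_allow_ib", "true"]


def mpirun_command(script, workers=2, ucx=False, infiniband=False):
    """Build the mpirun argument list for running a Python script."""
    command = ["mpirun", "-n", str(workers)]
    if ucx:
        command += UCX_FLAGS
    if infiniband:
        command += INFINIBAND_FLAGS
    command += ["python", script]
    return command


# Hypothetical script name; adjust to the actual file in the examples directory.
JOIN_EXAMPLE = "python/pygcylon/examples/join.py"
print(shlex.join(mpirun_command(JOIN_EXAMPLE, ucx=True, infiniband=True)))
```

The printed line can be pasted into a terminal, or the list can be passed directly to `subprocess.run` from the project directory.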
Other examples in the python/pygcylon/examples/ directory can be run similarly.
In addition to the terminal, you can also use the Conda environment in your preferred IDE.
Open Cylon as a C++ project, and assign cylon/cpp/CMakeLists.txt as the main CMake file.
Set the CONDA_PREFIX=<path to env> environment variable for the IDE.
Add a CMake build directory (ex:
Use the following CMake options