Source Compilation
Cylon has a C++ core with Java and Python bindings. You can compile these in three steps.
Cylon can be built along with Apache Arrow (in which case the Cylon build compiles Arrow itself), or it can be built against an existing Arrow installation.
This document shows how to build Cylon on Linux and Mac OS. The first section shows how to install the required dependencies on Linux (Ubuntu) and Mac OS. Once the dependencies are installed, the build steps are similar on Linux and Mac OS.
Prerequisites
Here are the prerequisites for compiling Cylon.
- CMake 3.16.5 or higher
- OpenMPI 4.0.1 or higher (other MPI implementations should work as well; we tested with OpenMPI)
- Python 3.7 or higher
- A compiler supporting C++14 or higher
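Before installing anything, it can help to see which of these tools are already present. The following is a minimal sketch and not part of the Cylon build script: report_tool is a hypothetical helper, and g++ is only an assumed compiler choice (substitute clang++ or whichever compiler you use).

```shell
# report_tool is a hypothetical helper: print the first line of a tool's
# --version output, or note that the tool is missing from PATH.
report_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: $("$1" --version 2>&1 | head -n 1)"
  else
    echo "$1: not found"
  fi
}

# Tools named in the prerequisites list; g++ is an assumed compiler choice.
for tool in cmake mpirun python3 g++; do
  report_tool "$tool"
done
```

Compare the reported versions against the minimums listed above.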
Python Environment
The build script needs to be given a Python environment. If you are using a virtual environment, make sure to set the virtual environment path. Alternatively, you can specify /usr as the path if you are installing into the system path.
Create a virtual environment
The rest of this document assumes you have a Python environment and know its path, since that path must be passed to the build script.
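As a minimal sketch, a virtual environment can be created with Python's built-in venv module. The location stored in CYLON_ENV is a hypothetical choice made for this example and is not a path required by Cylon.

```shell
# Hypothetical location for the environment; any writable directory works.
CYLON_ENV="$HOME/cylon_env"

# Create the environment with Python's built-in venv module.
python3 -m venv "$CYLON_ENV"

# Activate it in the current shell session.
. "$CYLON_ENV/bin/activate"

# The interpreter now resolves inside the environment.
command -v python
```

Whatever directory you choose here is the Python environment path to use in the later build steps.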
Installing Dependencies on Ubuntu
Cylon uses MPI for distributed execution, so an MPI implementation must be installed on the system. There are many implementations of the MPI standard, such as MPICH and OpenMPI. We have tested Cylon with OpenMPI, and you should be able to use any other MPI implementation, such as MPICH, as well.
This document explains how to install OpenMPI. On Ubuntu, it can be installed from the system package repositories (for example, the libopenmpi-dev package). If you would like to build OpenMPI with custom options, please refer to the OpenMPI documentation or follow the quick tutorial at the end of this document.
Here are some of the other dependencies required.
We need a recent version of CMake. If the version on your system is older than 3.16.5, you can build CMake from source.
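To decide whether a source build of CMake is needed, the installed version can be compared against the minimum. This is a sketch under two assumptions: version_ge is a hypothetical helper defined here, and sort -V (version sort) is available, as it is in GNU coreutils.

```shell
# Minimum CMake version stated in the prerequisites.
REQUIRED_CMAKE="3.16.5"

# version_ge A B: succeed when version A is greater than or equal to B.
# Relies on `sort -V` (version sort), which is assumed to be available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v cmake >/dev/null 2>&1; then
  # `cmake --version` prints "cmake version X.Y.Z" on its first line.
  installed="$(cmake --version | head -n 1 | awk '{print $3}')"
  if version_ge "$installed" "$REQUIRED_CMAKE"; then
    echo "CMake $installed is new enough"
  else
    echo "CMake $installed is older than $REQUIRED_CMAKE; build a newer one from source"
  fi
else
  echo "CMake is not installed"
fi
```

A plain string comparison would rank 3.9 above 3.16, which is why the sketch uses a version-aware sort.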
Installing Dependencies on Mac OS
You need to install Xcode and an MPI implementation such as OpenMPI.
Once those are installed, you are ready to compile Cylon on Mac OS.
Build Cylon & PyCylon on Linux or Mac OS
Here we will walk you through building Cylon along with Apache Arrow.
We have provided a build script to make the build process easier. It is located in the root directory of the Cylon source tree.
Please note that Cylon will build Apache Arrow (both libarrow and pyarrow) alongside Cylon.
Build C++ APIs
Example:
Now let's run a C++ example to check that the compilation succeeded.
It will generate output like the following.
Build Python APIs
Cylon provides Python APIs through Cython. In this mode, the build compiles Cylon C++, Cylon Python, Arrow C++ and Arrow Python, and installs the Cylon and PyCylon libraries into the Python environment using pip. Installation through pip is supported only for source builds. If you want to use an existing Cylon binary, you need to use the Conda packages.
You can use the following command to build the Python library.
Here is an example command.
This command will install PyCylon and PyArrow into the virtual environment we specified.
Updating library path
Before running code from the base path of the cloned repository, you need to update the runtime library path. Linux and Mac OS use different environment variable names. The following two commands update the path on these operating systems.
Linux
Here is an example command.
Mac OS
Here is an example command.
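Both cases follow the same pattern: prepend the directory holding the built shared libraries to the loader's search path, using LD_LIBRARY_PATH on Linux and DYLD_LIBRARY_PATH on Mac OS. The following is a minimal sketch. CYLON_LIB_DIR is a hypothetical placeholder, and the actual directories depend on the build path you chose.

```shell
# Hypothetical placeholder: directory containing the built shared libraries.
CYLON_LIB_DIR="$HOME/cylon/build/lib"

case "$(uname -s)" in
  Darwin)
    # Mac OS: the dynamic loader reads DYLD_LIBRARY_PATH.
    export DYLD_LIBRARY_PATH="$CYLON_LIB_DIR${DYLD_LIBRARY_PATH:+:$DYLD_LIBRARY_PATH}"
    ;;
  *)
    # Linux: the dynamic loader reads LD_LIBRARY_PATH.
    export LD_LIBRARY_PATH="$CYLON_LIB_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    ;;
esac
```

The `${VAR:+:$VAR}` form appends the previous value only when one exists, which avoids leaving a trailing colon in the path.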
After this you can verify the build.
Here is an example PyCylon program to check whether the installation is working.
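Before running a full program, a quick import check can confirm that the packages are visible to the interpreter. This sketch assumes the virtual environment is active, so that python3 resolves to it. check_import is a hypothetical helper, and pycylon and pyarrow are the Python module names of the two packages.

```shell
# check_import is a hypothetical helper: try to import one module and
# report whether the active interpreter can find it.
check_import() {
  if python3 -c "import $1" >/dev/null 2>&1; then
    echo "$1: import OK"
  else
    echo "$1: import FAILED"
  fi
}

for module in pyarrow pycylon; do
  check_import "$module"
done
```

If an import fails even though the build succeeded, check the library path settings described above.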
Congratulations, you have now successfully installed PyCylon and Cylon.
Running Tests
You can run Cylon tests as follows.
For C++ tests
Here is an example command.
For Python tests
Here is an example command.
Building Cylon With An Existing Arrow Installation
If you already have an Arrow installation and want to use it for the build, you can do so by pointing the build to it.
Building PyCylon
Instead of building PyCylon and Apache Arrow together, you can use the pyarrow distribution from pip as follows.
This will build only the Cylon C++ and Python APIs. Here we will use the Arrow libraries from the PyArrow installation.
First, let's create a Python environment and install PyArrow in it.
Then we can build Cylon pointing to this pyarrow with the following command.
Here is an example command.
After this you can run the above PyCylon examples to make sure it is working.
Building OpenMPI From Source
In this section we will explain how to build and install OpenMPI 4.0.1 from source. The instructions can be used to build a higher version of OpenMPI as well.
- We recommend using OpenMPI 4.0.1 or higher.
- Download OpenMPI 4.0.1 from https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.1.tar.gz
- Extract the archive to a folder named openmpi-4.0.1.
- Also create a directory named build in some location. We will use this to install OpenMPI.
- Set the following environment variables. The later steps refer to $OMPI_401, the extracted source directory, and $BUILD, the installation directory.
- The instructions to build OpenMPI depend on the platform. Therefore, we highly recommend looking into the $OMPI_401/INSTALL file. Platform-specific build files are available in the $OMPI_401/contrib/platform directory.
- In general, please specify --prefix=$BUILD and --enable-mpi-java as arguments to the configure script. If Infiniband is available (highly recommended), specify --with-verbs=<path-to-verbs-installation>. Usually, the path to the verbs installation is /usr.
In summary, the following commands will build OpenMPI for a Linux system.
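The steps above can be sketched as follows. The two directory locations are hypothetical choices for this example, the job count passed to make is arbitrary, and --with-verbs is left out. Add it to the configure line if Infiniband is available. The build runs only if the extracted sources are found, so the snippet is safe to run before downloading.

```shell
# Hypothetical locations; adjust both to your system.
export OMPI_401="$HOME/openmpi-4.0.1"   # extracted source directory
export BUILD="$HOME/openmpi-build"      # installation prefix

# Make the installed binaries and libraries visible to later builds.
export PATH="$BUILD/bin:$PATH"
export LD_LIBRARY_PATH="$BUILD/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

if [ -x "$OMPI_401/configure" ]; then
  # Run in a subshell so the caller's working directory is unchanged.
  (
    cd "$OMPI_401" &&
    ./configure --prefix="$BUILD" --enable-mpi-java &&
    make -j 4 &&
    make install
  )
else
  echo "No configure script in $OMPI_401; download and extract OpenMPI first"
fi
```

Note that --enable-mpi-java needs a JDK on the system. Consult $OMPI_401/INSTALL for platform-specific options.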