Source Compilation
Cylon has a C++ core with Java and Python bindings. You can compile these in three steps.
Cylon can be built along with Apache Arrow (the Cylon build compiles Arrow for you), or it can be built against an existing Arrow installation.
This document shows how to build Cylon on Linux and Mac OS. The first section shows how to install the required dependencies on Linux (Ubuntu) and Mac OS. Once the required dependencies are installed, the build steps are similar on Linux and Mac OS.
# Prerequisites

Here are the prerequisites for compiling Cylon.
- CMake 3.16.5 or higher
- OpenMPI 4.0.1 or higher (other MPI implementations should work as well; we tested with OpenMPI)
- Python 3.7 or higher
- A compiler that supports C++14 or higher
# Python Environment

The build script needs to be given a Python environment. If you are using a virtual environment, make sure to pass the virtual environment path. If you are installing into the system path, you can specify /usr as the path instead.
# Create a virtual environment

The rest of this document assumes a single Python environment path, referred to as your Python ENV path.
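A minimal sketch of creating such an environment with Python's built-in venv module. The location `$HOME/cylon_venv` and the variable name `CYLON_VENV` are placeholders chosen for this example, not names defined by Cylon; substitute your own path.

```shell
# Hypothetical location for the virtual environment; change as needed.
export CYLON_VENV="$HOME/cylon_venv"

# Create the environment with the standard-library venv module.
python3 -m venv "$CYLON_VENV"

# Activate it so that python and pip refer to this environment.
. "$CYLON_VENV/bin/activate"

# Confirm which interpreter is active.
python --version
```

On Ubuntu, the venv module may first require the `python3-venv` package.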
# Installing Dependencies Ubuntu

Cylon uses MPI for distributed execution, so an MPI implementation must be installed on the system. There are several implementations of the MPI standard, such as MPICH and OpenMPI. We have tested Cylon with OpenMPI, but other MPI implementations such as MPICH should work as well.
In this document we explain how to install OpenMPI. You can use the following command to install OpenMPI on an Ubuntu system. If you would like to build OpenMPI with custom options, please refer to the OpenMPI documentation, or follow the quick tutorial at the end of this document.
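For the packaged OpenMPI on Ubuntu, a sketch along these lines should work. It assumes the standard Ubuntu packages `libopenmpi-dev` and `openmpi-bin`; the version they provide depends on your Ubuntu release, so check it against the 4.0.1 minimum.

```shell
# Refresh the package index, then install the OpenMPI headers and runtime.
sudo apt-get update
sudo apt-get install -y libopenmpi-dev openmpi-bin

# Verify that the MPI launcher and the compiler wrapper are on the PATH.
mpirun --version
mpicc --version
```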
Here are some of the other dependencies required.
We need a recent version of CMake. If the version on your system is older than 3.16.5, you can build CMake from source.
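To decide whether a source build of CMake is needed, the installed version can be compared against 3.16.5. The following is a small sketch; the helper name `version_ge` is made up for this example, and it assumes a `sort` that supports version sorting (`-V`), as GNU coreutils does.

```shell
# Minimum CMake version required by this guide.
REQUIRED_CMAKE="3.16.5"

# Succeeds (exit status 0) when version $1 is greater than or equal to $2.
# Assumes a sort that supports -V (version sort), as in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

if command -v cmake >/dev/null 2>&1; then
  # The first line of the output looks like "cmake version 3.22.1".
  INSTALLED_CMAKE="$(cmake --version | head -n 1 | awk '{print $3}')"
  if version_ge "$INSTALLED_CMAKE" "$REQUIRED_CMAKE"; then
    echo "CMake $INSTALLED_CMAKE is new enough"
  else
    echo "CMake $INSTALLED_CMAKE is older than $REQUIRED_CMAKE; build a newer one from source"
  fi
else
  echo "CMake is not installed"
fi
```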
# Installing Dependencies Mac OS

You need to install Xcode and an MPI implementation such as OpenMPI.
Once those are installed, you are ready to compile Cylon on Mac OS.
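One possible way to get these on Mac OS is sketched below. It assumes Homebrew is already installed, which this document does not otherwise require, and that the Xcode command line tools are sufficient for your setup; if they are not, install the full Xcode from the App Store.

```shell
# Install the Xcode command line tools unless a developer directory is already set.
xcode-select -p >/dev/null 2>&1 || xcode-select --install

# Install OpenMPI with Homebrew (assumption); the formula is named open-mpi.
brew install open-mpi

# Verify that the MPI launcher is on the PATH.
mpirun --version
```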
# Build Cylon & PyCylon on Linux or Mac OS

Here we walk you through building Cylon along with Apache Arrow.
We provide a build script to make the build process easier. It is located in the Cylon source root directory.
Please note that Cylon will build Apache Arrow (both libarrow and pyarrow) alongside Cylon.
# Build C++ APIs

Example:
Now let's run a C++ example to check whether the compilation was successful.
It will generate output like the following.
# Build Python APIs

Cylon provides Python APIs through Cython. In this mode the build compiles Cylon C++, Cylon Python, Arrow C++, and Arrow Python, and installs the Cylon and PyCylon libraries into the Python environment using pip. We support pip only through source builds. If you want to use an existing Cylon binary, you need to use the Conda packages.
You can use the following command to build the Python library.
Here is an example command.
This command installs PyCylon and PyArrow into the virtual environment we specified.
# Updating library path

Before running code from the base path of the cloned repo, you need to update the runtime library path. Linux and Mac OS use different environment variable names. The following two commands update the path on these operating systems.
# Linux

Here is an example command.
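On Linux the dynamic linker reads `LD_LIBRARY_PATH`. A minimal sketch, in which `CYLON_LIB_DIR` is a hypothetical variable and the directory it points to is an assumption about where your build placed the shared libraries:

```shell
# Hypothetical directory holding the built shared libraries; adjust to your build.
CYLON_LIB_DIR="$HOME/cylon/build/lib"

# Prepend it, keeping any existing entries and avoiding a trailing colon.
export LD_LIBRARY_PATH="$CYLON_LIB_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

echo "$LD_LIBRARY_PATH"
```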
# Mac OS

Here is an example command.
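On Mac OS the equivalent variable is `DYLD_LIBRARY_PATH`. As with the Linux case, `CYLON_LIB_DIR` is a hypothetical variable and its value is an assumption about your build layout:

```shell
# Hypothetical directory holding the built shared libraries; adjust to your build.
CYLON_LIB_DIR="$HOME/cylon/build/lib"

# Prepend it, keeping any existing entries and avoiding a trailing colon.
export DYLD_LIBRARY_PATH="$CYLON_LIB_DIR${DYLD_LIBRARY_PATH:+:$DYLD_LIBRARY_PATH}"

echo "$DYLD_LIBRARY_PATH"
```

Note that System Integrity Protection can remove `DYLD_*` variables when a protected system binary is launched, so a Python interpreter from your virtual environment is the safer choice for running the examples.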
After this you can verify the build.
Here is an example PyCylon program to check whether the installation is working.
Congratulations, you have now successfully installed PyCylon and Cylon.
# Running Tests

You can run Cylon tests as follows.
For C++ tests
Here is an example command.
For Python tests
Here is an example command.
# Building Cylon With An Existing Arrow Installation

If you already have an Arrow installation and want to use it for the build, you can do so by pointing the build to it.
# Building PyCylon

Instead of building PyCylon and Apache Arrow together, you can use the pyarrow distribution from pip as follows.
This builds only the Cylon C++ and Python APIs, using the Arrow libraries from the PyArrow installation.
First, let's create a Python environment and install PyArrow in it.
Then we can build Cylon pointing to this pyarrow with the following command.
Here is an example command.
After this, you can run the PyCylon examples above to make sure it is working.
# Building OpenMPI From Source

In this section we explain how to build and install OpenMPI 4.0.1 from source. The same instructions can be used to build a later version of OpenMPI.
- We recommend using OpenMPI 4.0.1 or higher.
- Download OpenMPI 4.0.1 from https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.1.tar.gz
- Extract the archive to a folder named openmpi-4.0.1
- Also create a directory named build in some location. We will use this to install OpenMPI
- Set the following environment variables
- The instructions to build OpenMPI depend on the platform. Therefore, we highly recommend looking into the $OMPI_401/INSTALL file. Platform-specific build files are available in the $OMPI_401/contrib/platform directory.
- In general, please specify --prefix=$BUILD and --enable-mpi-java as arguments to the configure script. If Infiniband is available (highly recommended), specify --with-verbs=<path-to-verbs-installation>. Usually, the path to the verbs installation is /usr. In summary, the following commands will build OpenMPI for a Linux system.
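The steps above can be sketched as a single shell session. The variable names `OMPI_401` and `BUILD` follow the text, but the directories they point to under `$HOME` are assumptions for this example, as is the use of `wget` for the download. Since `--enable-mpi-java` builds the Java bindings, a JDK must be installed.

```shell
# Locations are assumptions for this sketch; the variable names follow the text above.
export OMPI_401="$HOME/openmpi-4.0.1"   # extracted source tree
export BUILD="$HOME/build"              # installation prefix

# Download and extract the release archive; it unpacks into openmpi-4.0.1.
cd "$HOME"
wget https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.1.tar.gz
tar -xzf openmpi-4.0.1.tar.gz
mkdir -p "$BUILD"

# Configure, build, and install. --enable-mpi-java requires a JDK.
# If Infiniband is available, add --with-verbs=<path-to-verbs-installation>.
cd "$OMPI_401"
./configure --prefix="$BUILD" --enable-mpi-java
make -j "$(nproc)"
make install

# Make the new installation visible to the shell and the dynamic linker.
export PATH="$BUILD/bin:$PATH"
export LD_LIBRARY_PATH="$BUILD/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
mpirun --version
```

Add the two `export` lines at the end to your shell profile if you want the installation to be picked up in new terminals.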