Cylon

Data Engineering Everywhere!

Fast & Scalable

Cylon uses OpenMPI underneath. It provides core data processing operators that run many times more efficiently than current systems.

Designed to be Integrated

Cylon is designed to work across different data processing frameworks, deep learning frameworks, and data formats.

Powered by Apache Arrow

Cylon uses Apache Arrow underneath to represent data.

BYOL, Bring Your Own Language!

Write in the language you are already familiar with, yet experience the same native performance.

from pycylon import read_csv, DataFrame, CylonEnv
from pycylon.net import MPIConfig

# Set up a distributed Cylon environment backed by MPI.
config: MPIConfig = MPIConfig()
env: CylonEnv = CylonEnv(config=config, distributed=True)

# Load the two CSV files to join.
df1: DataFrame = read_csv('/tmp/csv1.csv')
df2: DataFrame = read_csv('/tmp/csv2.csv')

# Join the two frames on column 0 using the hash algorithm.
df3: DataFrame = df1.join(other=df2, on=[0], algorithm="hash", env=env)

print(df3)
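
The snippet above asks for a join with algorithm="hash". For readers unfamiliar with the term, here is a minimal single-process sketch of the general hash-join idea in plain Python. It is an illustration only, not Cylon's implementation: the `hash_join` function, its parameters, the sample rows, and the choice of inner-join semantics are all assumptions made for this example.

```python
from collections import defaultdict


def hash_join(left, right, left_key=0, right_key=0):
    """Inner-join two lists of row tuples on the given column indices.

    Hypothetical helper for illustration; not part of pycylon.
    """
    # Build phase: index the right-hand rows by their join key.
    index = defaultdict(list)
    for row in right:
        index[row[right_key]].append(row)

    # Probe phase: look up each left-hand row's key in the index
    # and emit one combined row per match.
    joined = []
    for row in left:
        for match in index.get(row[left_key], []):
            joined.append(row + match)
    return joined


# Made-up sample data: column 0 is the join key, as in on=[0] above.
left_rows = [(1, "a"), (2, "b"), (3, "c")]
right_rows = [(1, "x"), (3, "y"), (3, "z")]

print(hash_join(left_rows, right_rows))
```

The build phase touches each right-hand row once and the probe phase touches each left-hand row once, which is why hash joins are commonly preferred when the inputs are not already sorted on the key.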

Written with Performance & Scalability in Mind!