At Terality, we aim to build the fastest and easiest-to-use serverless data processing engine for data teams. We want to offer the same speed and scalability as Spark, but with the same syntax as pandas and in a fully serverless way.
pandas is the most widely used library among data science teams. We replicated its API so that you don't have to learn another syntax or change your existing code (unlike PySpark or Dask, for example).
The pandas API is massive, and reimplementing and optimizing every pandas function will take more time. Because we want everyone to be able to use Terality before we complete this task, we designed a solution that covers the vast majority of the pandas API.
Therefore, we have released a side engine that runs all the functions we haven't optimized and parallelized yet. Conceptually, this side engine behaves as if you had a large server with hundreds of gigabytes of memory.
When you call a pandas function from your notebook or IDE, it runs on our side, whatever the function.
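For instance, ordinary pandas code like the following is the kind of workload that runs unchanged. This sketch uses plain pandas; with Terality, the same calls would simply execute remotely (the Terality import itself is omitted here, as installation details live in the docs):

```python
import pandas as pd

# Plain pandas code: with Terality, these same calls run on the remote engine.
df = pd.DataFrame({"city": ["Paris", "Lyon", "Paris"],
                   "sales": [100, 50, 200]})

# A typical aggregation: total sales per city.
totals = df.groupby("city")["sales"].sum()
print(totals["Paris"])  # 300
```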
As a user, the switch between the two engines is entirely transparent to you. We keep Terality's promise: fully serverless, scalable, and compatible with the pandas API.
To help us prioritize which functions to implement in the main parallelized engine, our systems automatically notify our team whenever a function call falls back to the non-parallelized engine.
As of September 2021, we are in beta. Please reach out to us through the live chat on our website or by writing to firstname.lastname@example.org. We welcome all feedback, remarks, and questions.
Terality can now provision infrastructure in less than 500 milliseconds, allowing you to get started right away with much less latency.
In our first benchmarks, Terality is on average 10x faster than pandas on files of 1 GB and larger.
We have released a side engine that runs all the functions we haven't optimized and parallelized yet. This means you can execute (almost) any pandas function with Terality, even the ones we haven't hand-optimized. You don't have to do anything to use this side engine: our scheduler automatically falls back to it when needed.
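Conceptually, that dispatch can be pictured as a lookup against a registry of hand-optimized functions. This is a minimal sketch under that assumption, not Terality's actual implementation; all names here are hypothetical:

```python
# Hypothetical registry of functions that have been hand-optimized
# for the parallelized engine (illustrative only).
PARALLELIZED = {"read_csv", "groupby", "merge", "sum"}

def dispatch(function_name: str) -> str:
    """Route a pandas call to the optimized engine when available,
    otherwise fall back to the general-purpose side engine."""
    if function_name in PARALLELIZED:
        return "parallelized-engine"
    # Fallback: the call still succeeds on the side engine, and the
    # team is notified so the function can be prioritized next.
    return "side-engine"

print(dispatch("merge"))        # parallelized-engine
print(dispatch("interpolate"))  # side-engine
```

The point of the sketch is simply that routing happens per call and never surfaces to the user: either way, the function runs.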