Run Spark jobs faster and at a fraction of the cost.

Xonai accelerates Spark jobs in your existing cloud data platform or private cloud environment. Activate today without code changes or migrations.

Up to 80%

Reduced cloud costs for

Reduced server costs

Taking effect immediately on solution activation

Multi-cloud

No code changes

No platform migrations

No access to data

Xonai Accelerator

Xonai built a technology that taps into the full potential of data processing hardware and integrated it with the industry-leading big data analytics solution: the Apache Spark framework.

No application code changes required to deliver up to 80% job time reduction for enterprise-grade Spark workloads in your platform of choice

[Diagram labels: Spark SQL, DataFrame, Parquet, UDF; Platform: Amazon EMR, AWS, Azure, GCP, Google Dataproc, Private Cloud; Engine; Hardware]

Up to 80% job time reduction

Breakthrough acceleration

The Xonai Accelerator taps into the full potential of the underlying hardware to unlock breakthrough data processing performance and reduce enterprise-grade Spark job execution time by up to 80%.

Our solution is compatible with all popular Spark runtimes and integrates with Spark SQL, caching mechanisms and commonly used data sources for big data storage.

Spark API compatible

Hadoop & Kubernetes compatible

Runtime compatibility: Databricks, EMR, OSS

Spark SQL acceleration: up to 5X faster

Caching acceleration: up to 6X faster

Faster Parquet reads: Apache Spark

Up to 6X faster caching

Faster caching

The Xonai cache serializer is enabled by default and is fully compatible with the existing Apache Spark caching mechanism, requiring zero code changes to deliver up to 6x faster cache performance and up to 2x less cache storage requirements.

Different compression schemes are available to balance performance and storage requirements even for the most demanding workloads.

[Chart: cache speedup factor and cache storage reduction factor. Series: Spark (compressed, uncompressed); XONAI (lz4, zstd, uncompressed)]

No code changes

Seamless activation

The Xonai Accelerator is trivially activated via Spark 3 plugin properties and does not require any application code or execution environment changes.

For all supported cloud data platforms, such as Databricks and EMR, plugin properties are preconfigured and hidden after a one-time installation process. Enterprise-grade Spark jobs then run up to 80% faster immediately after activation.

spark-submit \
  --conf spark.executor.memoryOverhead=1000m \
  --conf spark.executor.memory=3000m \
  ...

spark-submit \
  --jars xonai-spark-plugin.jar \
  --conf spark.plugins=com.xonai.spark.SQLPlugin \
  --conf spark.executor.memoryOverhead=3000m \
  --conf spark.executor.memory=1000m \
  ...
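The activation command can also be assembled programmatically. The following is a minimal sketch, not an official Xonai tool: the JAR name, plugin class and memory values are taken from the command shown here, while the helper name `build_submit_command` and the application file `my_job.py` are hypothetical.

```python
import shlex

PLUGIN_JAR = "xonai-spark-plugin.jar"        # JAR name as shown on this page
PLUGIN_CLASS = "com.xonai.spark.SQLPlugin"   # plugin class as shown on this page


def build_submit_command(app: str, executor_memory_mb: int, overhead_mb: int) -> list[str]:
    """Return spark-submit arguments that load the plugin (hypothetical helper)."""
    return [
        "spark-submit",
        "--jars", PLUGIN_JAR,
        "--conf", f"spark.plugins={PLUGIN_CLASS}",
        "--conf", f"spark.executor.memoryOverhead={overhead_mb}m",
        "--conf", f"spark.executor.memory={executor_memory_mb}m",
        app,
    ]


# "my_job.py" is a placeholder application; no application code changes are implied.
command = build_submit_command("my_job.py", executor_memory_mb=1000, overhead_mb=3000)
print(shlex.join(command))
```

Building the argument list first and joining it only for display keeps each option a separate argument, which is the form `subprocess.run` expects if the command is launched from Python.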

[Chart: application run time. Spark: 1 hour; XONAI: 12 min. Up to 80% job time reduction.]
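To relate a job time reduction to a speedup factor, the arithmetic behind the run-time figures above can be sketched as follows. The function names are illustrative only, and the numbers are the 1 hour and 12 minute figures quoted on this page, not new measurements.

```python
def time_reduction(baseline_min: float, accelerated_min: float) -> float:
    """Fraction of the baseline run time that is saved."""
    return (baseline_min - accelerated_min) / baseline_min


def speedup(baseline_min: float, accelerated_min: float) -> float:
    """How many times faster the accelerated run is."""
    return baseline_min / accelerated_min


def speedup_from_reduction(reduction: float) -> float:
    """Convert a time reduction fraction r into a speedup factor 1 / (1 - r)."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1.0 / (1.0 - reduction)


# Figures quoted on this page: 1 hour with Spark, 12 minutes with the accelerator.
baseline, accelerated = 60, 12
print(f"reduction: {time_reduction(baseline, accelerated):.0%}")
print(f"speedup:   {speedup(baseline, accelerated):.1f}x")
```

Going from 1 hour to 12 minutes is an 80% reduction in run time, which is the same as a 5x speedup.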

No migrations

Environment independent

The Xonai Accelerator is fully API-compatible with all commonly used Apache Spark runtimes such as Databricks and EMR, and delivers true hardware acceleration without needing to change Spark query plans, cluster manager, platform or any aspect of the underlying execution environment.

The solution is intentionally designed to avoid surprises on activation and to be bit-for-bit compatible with Spark.

XONAI for Apache Spark

Frequently Asked Questions

Which Spark versions and platforms are supported?

Our solution integrates with the open-source Apache Spark 3 distribution and the following data platforms:

- Amazon EMR up to 6.12.0

- Databricks up to 15.4 LTS

- Dataproc 2.0.X, 2.1.X and 2.2.X release lines

Note that the Xonai Accelerator is frequently updated to support new Spark versions.
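As a rough illustration, the platform list above can be encoded as a small lookup. This is a sketch under stated assumptions: "up to" is read as an inclusive upper bound, no lower bound is checked because the list gives none, and the names `MAX_SUPPORTED`, `DATAPROC_LINES`, `parse_version` and `is_listed` are hypothetical.

```python
import re

# Upper bounds copied from the list above; "up to" is assumed to be inclusive.
MAX_SUPPORTED = {
    "emr": (6, 12, 0),
    "databricks": (15, 4),
}
# Dataproc support is listed as release lines rather than as an upper bound.
DATAPROC_LINES = {(2, 0), (2, 1), (2, 2)}


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the leading numeric components, e.g. '15.4 LTS' -> (15, 4)."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_listed(platform: str, version: str) -> bool:
    """Check a platform version against the list above (upper bounds only)."""
    parsed = parse_version(version)
    key = platform.lower()
    if key == "dataproc":
        return parsed[:2] in DATAPROC_LINES
    if key not in MAX_SUPPORTED:
        return False
    return parsed <= MAX_SUPPORTED[key]


print(is_listed("EMR", "6.12.0"))
print(is_listed("Databricks", "15.4 LTS"))
print(is_listed("Dataproc", "2.3.0"))
```

Because the accelerator is updated to support new versions, a hard-coded table like this goes stale; the vendor's current list is the authority.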

How does it plug into Spark?

The solution is activated by a Spark 3 plugin which runs physical plans equivalent to the ones selected by Spark runtimes. In practice, the spark-submit command points to a JAR provided by us via --jars and names the plugin class via the spark.plugins property.

Additionally, our engine requires moving a fraction of spark.executor.memory to the spark.executor.memoryOverhead setting. This change is currently needed because the Xonai engine allocates off-heap memory rather than JVM memory to process data. It will not be necessary in future releases, when both engines will share a unified memory architecture.
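The memory shift described above can be sketched as a small calculation. This is an illustration only: the helper `rebalance_executor_memory` is hypothetical, and how much memory to move is an assumption here, since the only figures given on this page are those in the spark-submit example.

```python
def rebalance_executor_memory(heap_mb: int, overhead_mb: int, moved_mb: int) -> dict[str, str]:
    """Move `moved_mb` from the JVM heap to off-heap overhead; the total is unchanged."""
    if not 0 <= moved_mb < heap_mb:
        raise ValueError("moved_mb must be non-negative and smaller than heap_mb")
    return {
        "spark.executor.memory": f"{heap_mb - moved_mb}m",
        "spark.executor.memoryOverhead": f"{overhead_mb + moved_mb}m",
    }


# Values from the spark-submit example on this page: 3000m heap and 1000m overhead
# become 1000m heap and 3000m overhead, so 2000m is moved.
conf = rebalance_executor_memory(heap_mb=3000, overhead_mb=1000, moved_mb=2000)
for key, value in conf.items():
    print(f"--conf {key}={value}")
```

The sum of the two settings stays at 4000m before and after, so the memory requested per executor does not grow; only the split between JVM heap and off-heap memory changes.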

How is it different from other cloud cost optimization tools?

Existing solutions tackle cloud spending reduction by improving resource provisioning and/or tuning application parameters, and may offer only a one-time benefit, and only for workloads that are not already optimally deployed.

Our solution accelerates Spark data processing speed far beyond the default Spark engine (Catalyst), and delivers seamless hardware acceleration and reduced resource utilization regardless of how optimally deployed Spark workloads already are.

Do I need to change code or Spark properties?

No. We intentionally designed our engine to be API-compatible with existing runtimes for Spark, including proprietary ones that may modify query plans to improve performance, such as the Databricks and EMR runtimes.

What type of queries work best with Xonai?

As Spark is an in-memory compute engine, the more time queries spend on physical computation between reads and writes, the more they are expected to benefit. These are typically compute-heavy data transformation jobs with heavy aggregation, join and sorting stages.