Run Spark jobs faster and at a fraction of the cost.

Xonai accelerates Spark jobs in your existing cloud data platform or private cloud environment. Activate today without code changes or migrations.

Up to 80%

Reduced cloud costs for

Reduced server costs

Taking effect immediately on solution activation
Multi-cloud
No code changes
No platform migrations
No access to data

No application code changes required to deliver up to 80% job time reduction for enterprise-grade Spark workloads in your platform of choice

Spark SQL
DataFrame
Parquet
UDF
Platform
Amazon EMR
AWS
Azure
GCP
Google Dataproc
Private Cloud
Engine
Powered by bleeding-edge compiler infrastructure
Hardware
Spark API compatible
Hadoop & Kubernetes compatible
Runtime compatibility
Spark SQL acceleration
Caching acceleration
Faster Parquet reads
Databricks, EMR, OSS
up to 5X faster
up to 6X faster
Apache Spark

Cache comparison (chart): Spark (compressed, uncompressed) versus XONAI (lz4, zstd, uncompressed), measured by cache speedup factor and cache storage reduction factor.

# Default Spark
spark-submit \
 --conf spark.executor.memoryOverhead=1000m \
 --conf spark.executor.memory=3000m \
 ...
# With the XONAI plugin
spark-submit \
 --jars xonai-spark-plugin.jar \
 --conf spark.plugins=com.xonai.spark.SQLPlugin \
 --conf spark.executor.memoryOverhead=3000m \
 --conf spark.executor.memory=1000m \
 ...
Application run time: 1 hour with Spark, 12 min with XONAI (up to 80% job time reduction)

Coming soon!

Xonai Dashboard

An open-source Grafana-based application to assist Big Data infrastructure optimization initiatives where Spark applications are a dominant cost driver.

The Xonai Dashboard aggregates Spark execution metrics and spending estimates, from entire clusters down to each individual application, with the goal of exposing optimization opportunities.

Gain detailed visibility over cost and performance metrics of EMR clusters.

Understand how XONAI reduces Spark job costs and improves resource utilization.

See detailed execution and performance metrics unlocked by our engine for Spark applications.

XONAI for Apache Spark

Frequently Asked Questions

Our solution integrates with the open-source Apache Spark 3 distribution and the following data platforms:

- Amazon EMR starting from 6.3.0

- Databricks up to 10.4 (preview)

- Dataproc 2.0.x and 2.1.x release lines (preview)

The solution is activated by a Spark 3 plugin that runs physical plans equivalent to the ones selected by Spark runtimes. In practice, the spark-submit command references a JAR provided by us and enables the plugin through the spark.plugins property.

Additionally, our engine requires moving a fraction of the spark.executor.memory to the spark.executor.memoryOverhead setting. This change is needed because our engine allocates off-heap memory to process data rather than JVM memory.
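
As a rough illustration of the two changes described above, the sketch below builds a spark-submit argument list that enables the plugin and moves part of the executor heap to memory overhead. The JAR name and plugin class are the ones shown on this page. The helper name `xonai_submit_args` and the amount of memory moved are assumptions made for the example, not a recommended sizing.

```python
from typing import List

# Names as shown in the spark-submit example on this page.
PLUGIN_JAR = "xonai-spark-plugin.jar"
PLUGIN_CLASS = "com.xonai.spark.SQLPlugin"


def xonai_submit_args(heap_mb: int, overhead_mb: int, moved_mb: int) -> List[str]:
    """Hypothetical helper: move `moved_mb` from executor heap to overhead.

    The total (heap + overhead) stays the same; only the split changes,
    because the engine allocates off-heap memory instead of JVM memory.
    """
    if not 0 < moved_mb < heap_mb:
        raise ValueError("moved_mb must be positive and smaller than heap_mb")
    new_heap = heap_mb - moved_mb
    new_overhead = overhead_mb + moved_mb
    return [
        "spark-submit",
        "--jars", PLUGIN_JAR,
        "--conf", f"spark.plugins={PLUGIN_CLASS}",
        "--conf", f"spark.executor.memoryOverhead={new_overhead}m",
        "--conf", f"spark.executor.memory={new_heap}m",
        # The application JAR and its arguments would follow here.
    ]


if __name__ == "__main__":
    # Assumed starting point: 3000m heap, 1000m overhead, 2000m moved.
    print(" ".join(xonai_submit_args(3000, 1000, 2000)))
```

With the assumed inputs, the resulting split matches the example command shown earlier on this page (1000m heap, 3000m overhead). The right fraction for a given workload is not stated here and would need to be confirmed.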

Existing solutions tackle cloud spending reduction by improving resource provisioning and/or tuning application parameters, and may deliver only a one-time benefit, and only for workloads that are not already optimally deployed.

Our solution accelerates Spark data processing far beyond the default Spark engine (Catalyst), and delivers hardware acceleration and reduced resource utilization regardless of how optimally Spark workloads are already deployed.

No. We intentionally designed our engine to be API-compatible with existing runtimes for Spark, including proprietary ones that may modify query plans to improve performance, such as the EMR runtime.

The more time a query spends on physical computation, the more benefit it is expected to get. These are typically queries with heavy aggregation, join and sort stages.
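
One way to reason about this relationship is Amdahl's law: if only the compute-bound share of a query is accelerated, the overall speedup is limited by the share that is not. The sketch below is a generic model, not a description of how the engine behaves. The function name `overall_speedup`, the compute fractions and the 5x engine factor are illustrative assumptions, not measured results.

```python
def overall_speedup(compute_fraction: float, engine_speedup: float) -> float:
    """Amdahl's law: speedup of a whole query when only a fraction is accelerated.

    compute_fraction: share of run time spent in physical computation (0..1).
    engine_speedup:   assumed speedup applied to that share only.
    """
    if not 0.0 <= compute_fraction <= 1.0:
        raise ValueError("compute_fraction must be between 0 and 1")
    if engine_speedup <= 0.0:
        raise ValueError("engine_speedup must be positive")
    untouched = 1.0 - compute_fraction
    accelerated = compute_fraction / engine_speedup
    return 1.0 / (untouched + accelerated)


if __name__ == "__main__":
    # Illustrative fractions only; real queries need to be profiled.
    for fraction in (0.2, 0.5, 0.9):
        speedup = overall_speedup(fraction, engine_speedup=5.0)
        print(f"compute share {fraction:.0%}: overall speedup {speedup:.2f}x")
```

Under this model, a query dominated by aggregations, joins or sorts (a high compute share) sees most of the assumed engine speedup, while a query dominated by other work sees much less.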

A drop-in solution that can be activated in your cloud environment with no code changes to reduce cloud costs and accelerate insight delivery.

Reduce Spark cloud costs today