Xonai accelerates Spark jobs in your existing cloud data platform or private cloud environment. Activate today without code changes or migrations.
Xonai built a technology to tap into the full potential of data processing hardware and integrated it with the industry-leading big data analytics solution: the Apache Spark framework.
No application code changes are required to deliver up to 80% job time reduction for enterprise-grade Spark workloads on your platform of choice.
Up to 80% job time reduction
The Xonai Accelerator taps into the full potential of the underlying hardware to unlock breakthrough data processing performance and reduce enterprise-grade Spark job execution time by up to 80%.
Our solution is compatible with all popular Spark runtimes and integrates with Spark SQL, caching mechanisms and commonly used data sources for big data storage.
Up to 6X faster caching
The Xonai cache serializer is enabled by default and is fully compatible with the existing Apache Spark caching mechanism, requiring zero code changes to deliver up to 6x faster cache performance and up to 2x lower cache storage requirements.
Different compression schemes are available to balance performance and storage requirements, even for the most demanding workloads.
Cache modes compared: Spark (compressed, uncompressed) and Xonai (lz4, zstd, uncompressed).
No code changes
The Xonai Accelerator is trivially activated via Spark 3 plugin properties and does not require any application code or execution environment changes.
Plugin properties are preconfigured and hidden after a one-time installation process for all supported cloud data platforms, such as Databricks and EMR, and enterprise-grade Spark jobs seamlessly run up to 80% faster immediately after activation.
No migrations
The Xonai Accelerator is fully API-compatible with all commonly used Apache Spark runtimes such as Databricks and EMR, and delivers true hardware acceleration without needing to change Spark query plans, cluster manager, platform or any aspect of the underlying execution environment.
The solution is intentionally designed to avoid surprises on activation and to be bit-for-bit compatible with Spark.
XONAI for Apache Spark
Our solution integrates with the open-source Apache Spark 3 distribution and the following data platforms:
- Amazon EMR up to 6.12.0
- Databricks up to 15.4 LTS
- Dataproc 2.0.X, 2.1.X and 2.2.X release lines
Note that the Xonai Accelerator is updated frequently to support new Spark versions.
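As a quick way to read the compatibility list above, the following minimal Python sketch checks a concrete platform version against the listed bounds. The helper names (parse_version, is_listed) are hypothetical and not part of any Xonai tooling. The list states only upper bounds for EMR and Databricks, so the sketch does not enforce a lower bound.

```python
# Bounds copied from the compatibility list; update them as new releases are supported.
MAX_VERSION = {"emr": (6, 12, 0), "databricks": (15, 4)}
DATAPROC_LINES = {(2, 0), (2, 1), (2, 2)}


def parse_version(text):
    """Turn '6.12.0' or '15.4 LTS' into a tuple of ints, ignoring any suffix."""
    numeric = text.split()[0]
    return tuple(int(part) for part in numeric.split("."))


def is_listed(platform, version):
    """Return True if the version falls within the listed range for the platform."""
    parsed = parse_version(version)
    platform = platform.lower()
    if platform == "dataproc":
        # Dataproc support is listed per release line (major.minor).
        return parsed[:2] in DATAPROC_LINES
    if platform in MAX_VERSION:
        # "Up to" is read as an inclusive upper bound (an assumption).
        return parsed <= MAX_VERSION[platform]
    return False


print(is_listed("emr", "6.9.0"))
print(is_listed("databricks", "15.4 LTS"))
print(is_listed("dataproc", "2.1.35"))
```

Platforms that are not in the list, such as a self-managed open-source Spark 3 cluster, return False here only because the sketch has no version bounds for them.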
The solution is activated by a Spark 3 plugin which runs physical plans equivalent to the ones selected by Spark runtimes. In practice, the spark-submit command loads the plugin from a JAR provided by us via the spark.plugins property.
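To make the activation step concrete, here is a minimal Python sketch that assembles such a spark-submit command. In stock Spark 3, spark.plugins takes a comma-separated list of plugin class names, and the JAR containing those classes is shipped with --jars. The JAR path and the plugin class name below are placeholders chosen for illustration, not the real Xonai artifact names. The function build_spark_submit is likewise hypothetical.

```python
import shlex


def build_spark_submit(app_path, plugin_jar, plugin_class, extra_conf=None):
    """Return a spark-submit argument list that loads a Spark 3 plugin."""
    conf = {"spark.plugins": plugin_class}
    conf.update(extra_conf or {})
    cmd = ["spark-submit", "--jars", plugin_jar]
    for key, value in conf.items():
        cmd += ["--conf", f"{key}={value}"]
    # The application itself is passed unchanged as the last argument.
    cmd.append(app_path)
    return cmd


# Placeholder values: substitute the JAR and class name from the vendor instructions.
cmd = build_spark_submit(
    app_path="my_job.py",
    plugin_jar="/opt/accelerator/plugin.jar",
    plugin_class="com.example.AcceleratorPlugin",
)
print(shlex.join(cmd))
```

The application path is passed through untouched, which reflects the point that only the submit-time configuration changes, not the job code.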
Additionally, our engine requires moving a fraction of the spark.executor.memory to the spark.executor.memoryOverhead setting. This change is currently needed because the Xonai engine allocates off-heap memory, rather than JVM memory, to process data. It will not be necessary in future releases, as both engines will share a unified memory architecture.
Existing solutions tackle cloud spending reduction by improving resource provisioning and/or tuning application parameters, and may have only a one-time benefit for workloads that are not optimally deployed.
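The memory adjustment can be sketched as simple arithmetic. The Python snippet below moves a chosen fraction of the executor heap into the overhead setting while keeping the total per-executor footprint constant. The 25% fraction and the starting sizes are illustrative assumptions only, and move_heap_to_overhead is a hypothetical helper. The right fraction depends on the workload and on vendor guidance.

```python
def move_heap_to_overhead(executor_memory_mib, overhead_mib, fraction):
    """Shift a fraction of spark.executor.memory into spark.executor.memoryOverhead."""
    if not 0 < fraction < 1:
        raise ValueError("fraction must be strictly between 0 and 1")
    moved_mib = int(executor_memory_mib * fraction)
    return {
        # JVM heap shrinks by the moved amount...
        "spark.executor.memory": f"{executor_memory_mib - moved_mib}m",
        # ...and the off-heap overhead grows by the same amount.
        "spark.executor.memoryOverhead": f"{overhead_mib + moved_mib}m",
    }


# Illustrative numbers: 16 GiB heap, 2 GiB overhead, move 25% of the heap.
settings = move_heap_to_overhead(16384, 2048, 0.25)
for key, value in settings.items():
    print(f"--conf {key}={value}")
```

Because the heap reduction equals the overhead increase, the container size requested from the cluster manager stays the same.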
Our solution accelerates Spark data processing speed far beyond the default Spark engine (Catalyst), and delivers seamless hardware acceleration and reduced resource utilization regardless of how optimally deployed Spark workloads already are.
No. We intentionally designed our engine to be API-compatible with existing runtimes for Spark, including proprietary ones that may modify query plans to improve performance, such as the Databricks and EMR runtimes.
As Spark is an in-memory compute engine, the more time queries spend on physical computation between reads and writes, the more benefit they are expected to get. These are typically compute-heavy data transformation jobs with heavy aggregation, join and sort stages.
A drop-in solution that can be activated in your cloud environment with no code changes to reduce cloud costs and accelerate insight delivery.