layout | title | custom_title | description | type | navigation
---|---|---|---|---|---
home | Home | Apache Spark™ - Unified Engine for large-scale data analytics | Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters. | page |
The most widely-used engine for scalable computing
Thousands of companies, including 80% of the Fortune 500, use Apache Spark™.
Over 2,000 contributors to the open source project from industry and academia.
Spark SQL engine: under the hood
Apache Spark™ is built on an advanced distributed SQL engine for large-scale data
Adaptive Query Execution
Spark SQL adapts the execution plan at runtime, for example by automatically setting the number of reducers and choosing join algorithms.
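As a rough illustration of how Adaptive Query Execution can be switched on explicitly, the sketch below applies three Spark SQL configuration keys when building a session. The keys are standard Spark SQL options; the `AQE_SETTINGS` name, the `build_session` helper, the app name, and the local master are assumptions made for this example, and recent Spark releases already enable AQE by default.

```python
# Illustrative settings only: the keys are standard Spark SQL options,
# while the dictionary name and the chosen values are assumptions.
AQE_SETTINGS = {
    # Master switch for Adaptive Query Execution.
    "spark.sql.adaptive.enabled": "true",
    # Let Spark merge small shuffle partitions at runtime.
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    # Let Spark split skewed partitions in sort-merge joins.
    "spark.sql.adaptive.skewJoin.enabled": "true",
}


def build_session(app_name="aqe-sketch"):
    """Create a local SparkSession with the AQE settings applied.

    Hypothetical helper: PySpark is imported lazily so the settings above
    can be inspected even where PySpark is not installed.
    """
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name).master("local[*]")
    for key, value in AQE_SETTINGS.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

With PySpark installed, `spark = build_session()` returns a session, and `spark.conf.get("spark.sql.adaptive.enabled")` can be used to confirm the setting.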
Support for ANSI SQL
Use the same SQL you’re already comfortable with.
Structured and unstructured data
Spark SQL works on structured tables and unstructured data such as JSON or images.
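A minimal sketch of querying JSON data with plain SQL might look like the following. The sample records, the `events` view name, and both helper functions are hypothetical; the sketch assumes an existing `SparkSession` is passed in and that the file uses the JSON Lines layout (one object per line) that `spark.read.json` expects by default.

```python
import json


def write_sample_json(path):
    """Write a few hypothetical records as JSON Lines and return them."""
    records = [
        {"name": "alpha", "n": 1},
        {"name": "beta", "n": 2},
        {"name": "alpha", "n": 3},
    ]
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
    return records


def query_json(spark, path):
    """Load the JSON file, register it as a view, and aggregate it with SQL.

    `spark` is assumed to be an existing SparkSession.
    """
    df = spark.read.json(path)  # schema is inferred from the data
    df.createOrReplaceTempView("events")
    return spark.sql(
        "SELECT name, SUM(n) AS total FROM events GROUP BY name ORDER BY name"
    )
```

Calling `query_json(spark, path).show()` on the sample file would display one aggregated row per distinct `name`.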
Join the community
Spark has a thriving open source community, with contributors from around the globe building features, writing documentation, and assisting other users.