    Repositories list

    • setup-cli

      Public
      Sets up the Databricks CLI in your GitHub Actions workflow.
      Shell
      Other
      10000 Updated Jul 31, 2024
    • MLServer

      Public
      An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more
      Python
      Apache License 2.0
      184000 Updated Jul 31, 2024
    • This repo provides a customizable stack for starting new ML projects on Databricks that follow production best-practices out of the box.
      Python
      Apache License 2.0
      158000 Updated Jul 25, 2024
    • delta

      Public
      An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
      Scala
      Apache License 2.0
      1.7k000 Updated Jul 18, 2024
    • kyuubi

      Public
      Kyuubi is an enhanced edition of Apache Spark's primordial Thrift JDBC/ODBC Server.
      Scala
      Apache License 2.0
      915000 Updated Jul 5, 2024
    • hive

      Public
      Apache Hive
      Java
      Apache License 2.0
      4.7k000 Updated Jun 26, 2024
    • SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
      Java
      Apache License 2.0
      1.8k101 Updated Jun 14, 2024
    • SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
      Java
      Apache License 2.0
      262001 Updated Jun 14, 2024
    • ceph

      Public
      Ceph is a distributed object, block, and file storage platform
      C++
      Other
      6k000 Updated Jun 12, 2024
    • scuttle

      Public
      A wrapper for applications to help with running Istio Sidecars
      Go
      MIT License
      53000 Updated May 14, 2024
    • ranger

      Public
      Apache Ranger - To enable, monitor and manage comprehensive data security across the Hadoop platform and beyond
      Java
      Apache License 2.0
      977000 Updated May 7, 2024
    • A re-implementation of Hadoop DistCP in Apache Spark
      Scala
      Apache License 2.0
      31000 Updated Apr 9, 2024
    • datahub

      Public
      The Metadata Platform for your Data Stack
      Java
      Apache License 2.0
      2.9k000 Updated Mar 20, 2024
    • nifi

      Public
      Apache NiFi
      Java
      Apache License 2.0
      2.7k000 Updated Feb 7, 2024
    • nes-rook

      Public
      Storage Orchestration for Kubernetes
      Go
      Apache License 2.0
      2.7k000 Updated Feb 6, 2024
    • Apache YuniKorn Web UI
      TypeScript
      Apache License 2.0
      64000 Updated Feb 4, 2024
    • Apache DolphinScheduler is a modern data workflow orchestration platform with a powerful user interface, dedicated to solving complex task dependencies in the data pipeline and providing various types of jobs available out of the box
      Java
      Apache License 2.0
      4.6k000 Updated Jan 26, 2024
    • quetz

      Public
      The Open-Source Server for Conda Packages
      Python
      BSD 3-Clause "New" or "Revised" License
      76000 Updated Jul 23, 2023
    • rnd

      Public
      kt NexR R&D Center
      3300 Updated Jun 30, 2022
    • This project provides a reverse proxy for Spark UI on Kubernetes
      Go
      Apache License 2.0
      5000 Updated May 17, 2022
    • Korean translation of the official Spark documentation
      Apache License 2.0
      11000 Updated Jan 28, 2022
    • API Examples for radosgw-admin-api
      Java
      0000 Updated Aug 10, 2020
    • dcos-log2loki

      Public archive
      Ship DC/OS Logs (from API) to Grafana Loki
      Go
      1000 Updated Dec 16, 2019
    • Terraform CloudStack provider
      Go
      Mozilla Public License 2.0
      34000 Updated Sep 23, 2019
    • loki

      Public archive
      Like Prometheus, but for logs.
      Go
      Apache License 2.0
      3.5k000 Updated Aug 5, 2019
    • dcos-log2es

      Public archive
      Ship DC/OS Logs (from API) to Elasticsearch
      Go
      0000 Updated Jul 24, 2019
    • Terasort for Spark
      Java
      5100 Updated May 27, 2019
    • Use the TPC-DS benchmark to test Spark SQL performance
      TSQL
      Apache License 2.0
      95000 Updated Apr 20, 2019
    • Apache Ranger Plugin for S3
      Java
      Apache License 2.0
      13000 Updated Feb 8, 2019
    • RHive

      Public
      RHive is an R extension facilitating distributed computing via Apache Hive.
      R
      63122532 Updated Jul 19, 2017