
Google Summer of Code Ideas 2025

Christie Jacob edited this page Feb 12, 2025 · 18 revisions

Google Summer of Code

⚠️ ⚠️ [READ APPLICATION PROTOCOL] ⚠️ ⚠️

Projects Index

  1. Implement efficient SmolVLM
  2. Apriltag and QR code detectors
  3. Pose Estimation algorithms
  4. Implement GPU image transforms
  5. Depth Estimation on Android Using Rust
  6. Implement modular VO Slam API

** Project ideas are ordered by soft priority

Projects

  1. Implement efficient SmolVLM

    • Description: This project aims to bring state-of-the-art lightweight visual language models into Kornia in Rust. Hugging Face recently released SmolVLM, a family of small visual language models that could be a game changer for building AI applications on embedded devices.

    • Expected Outcomes: Implement a Rust API that integrates the latest SmolVLM model, and evaluate which NN engine suits this task better: onnxruntime (via ORT-Pyke) or Candle. The project should also evaluate the feasibility of a native Rust implementation of the model. Tests and benchmarks on both desktop and NVIDIA Jetson are mandatory parts of the final delivery.

    • Resources:

    • Possible Mentors: Miquel Farre

    • Difficulty: Medium

    • Duration: 350 hours

  2. Apriltag and QR code detectors

    • Description: This project aims to implement in Kornia Rust a curated set of detection algorithms useful for localization and calibration. AprilTags are very commonly used for camera calibration, and QR code detectors for more advanced machine vision applications. In visual localization, common feature detectors are FAST (chosen because it is very fast) and ORB. With such methods, kornia-rs could be used in many robotics applications together with existing foundational libraries such as sophus-rs.

    • Expected Outcomes: Implement efficient CPU versions of the above detectors, with proper testing and benchmarking against common libraries such as OpenCV or VPI. The delivery is also expected to include work with the author of sophus-rs to show an integration, e.g., a calibration system using the AprilTag detector and/or a Visual SLAM implementation using FAST features.

    • Resources:

    • Possible Mentors: Hauke Strasdat

    • Difficulty: Hard

    • Duration: 350 hours
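
To give a sense of the scale of a CPU detector, here is a minimal sketch of the FAST segment test in plain Rust. It is an illustration only, not kornia-rs API: the names `CIRCLE` and `is_fast_corner` are hypothetical, the image is assumed to be a row-major 8-bit grayscale buffer, and the usual refinements (high-speed pre-test, non-maximum suppression) are left out.

```rust
/// Offsets (dx, dy) of the 16-pixel circle of radius 3, listed clockwise from the top.
const CIRCLE: [(i32, i32); 16] = [
    (0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
    (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3),
];

/// Returns true if pixel (x, y) passes the segment test: at least `n` contiguous
/// circle pixels are all brighter than center + threshold or all darker than
/// center - threshold. `img` is row-major grayscale; `n` must be in 1..=16.
fn is_fast_corner(
    img: &[u8],
    width: usize,
    height: usize,
    x: usize,
    y: usize,
    threshold: u8,
    n: usize,
) -> bool {
    assert!((1..=16).contains(&n));
    assert_eq!(img.len(), width * height);
    // The radius-3 circle must fit inside the image.
    if x < 3 || y < 3 || x + 3 >= width || y + 3 >= height {
        return false;
    }
    let center = img[y * width + x] as i16;
    let t = threshold as i16;
    let (mut brighter, mut darker) = (0usize, 0usize);
    // Walk the circle once plus n - 1 extra steps so runs that wrap around are counted.
    for i in 0..(16 + n - 1) {
        let (dx, dy) = CIRCLE[i % 16];
        let px = (x as i32 + dx) as usize;
        let py = (y as i32 + dy) as usize;
        let v = img[py * width + px] as i16;
        if v > center + t {
            brighter += 1;
            darker = 0;
        } else if v < center - t {
            darker += 1;
            brighter = 0;
        } else {
            brighter = 0;
            darker = 0;
        }
        if brighter >= n || darker >= n {
            return true;
        }
    }
    false
}
```

Calling `is_fast_corner(&img, width, height, x, y, 20, 9)` for every interior pixel gives a FAST-9 style candidate set; a benchmark against OpenCV or VPI would need the same threshold and arc length on both sides to be comparable.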

  3. Pose Estimation algorithms

    • Description: This project aims to enhance kornia-rs's capabilities in 3D registration and pose estimation by implementing and optimizing key algorithms, including:

      • Improve the existing Iterative Closest Point (ICP) implementation for point cloud alignment by adding a Point to Normal loss, or evaluate it against KISS-ICP. Additionally, implement a Truncated Signed Distance Function (TSDF).
      • Perspective-n-Point (PnP) and Efficient PnP (EPnP) for camera pose estimation
      • Affine and similarity transformations (e.g., cv2.AffineTransform3D) for rigid and non-rigid transformations
      • Potential optimizations for real-time performance and robustness
    • Expected Outcomes:

      • A set of efficient, well-tested Rust implementations of the above algorithms
      • Integration with kornia-rs for seamless use in robotics, AR, and SLAM applications
      • Benchmarking and comparison with existing implementations for accuracy and performance
      • Comprehensive documentation and examples
    • Resources:

      • Rust Book: https://doc.rust-lang.org/book/
      • OpenCV’s implementations of PnP, EPnP, and affine transformations
      • Research papers on pose estimation and 3D alignment
      • Existing Rust crates for linear algebra and optimization
    • Possible Mentors: Dmytro Mishkin

    • Difficulty: Hard

    • Duration: 350 hours
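
As an illustration of the ICP item above, the following is a minimal sketch of a normal-based residual, assuming "Point to Normal" refers to the usual point-to-plane error (the distance from the transformed source point to the tangent plane at the matched target point). All names here (`Vec3`, `Mat3`, `point_to_plane_residual`, `point_to_plane_cost`) are hypothetical and not part of kornia-rs; correspondence search and the pose update step are omitted.

```rust
type Vec3 = [f64; 3];
type Mat3 = [[f64; 3]; 3];

fn dot(a: Vec3, b: Vec3) -> f64 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

/// Applies the rigid transform p -> R * p + t (R given as rows).
fn transform(rot: &Mat3, t: Vec3, p: Vec3) -> Vec3 {
    [
        dot(rot[0], p) + t[0],
        dot(rot[1], p) + t[1],
        dot(rot[2], p) + t[2],
    ]
}

/// Signed distance from the transformed source point to the plane through
/// `dst` with unit normal `normal`.
fn point_to_plane_residual(rot: &Mat3, t: Vec3, src: Vec3, dst: Vec3, normal: Vec3) -> f64 {
    let p = transform(rot, t, src);
    dot([p[0] - dst[0], p[1] - dst[1], p[2] - dst[2]], normal)
}

/// Sum of squared residuals over matched (source, target, target normal) triples.
fn point_to_plane_cost(rot: &Mat3, t: Vec3, matches: &[(Vec3, Vec3, Vec3)]) -> f64 {
    matches
        .iter()
        .map(|&(src, dst, normal)| {
            let r = point_to_plane_residual(rot, t, src, dst, normal);
            r * r
        })
        .sum()
}
```

Unlike a point-to-point error, this residual is zero for any source point lying in the target's tangent plane, so surfaces are free to slide along each other during alignment. That property is the usual motivation for using normals on locally planar scenes.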

  4. Implement GPU image transforms

    • Description: The main objective of this project is to upgrade the existing image transformations in the kornia-imgproc crate, which are currently implemented natively on CPU, to run on GPU. Ideally, propose a plan to port the functionality in this crate to efficient GPU implementations via CubeCL or a similar framework that supports WGPU in native Rust. Examples include geometric sampling transformations such as resize or warp_affine/warp_perspective. Color transformations and Distance Transform would also be very welcome.

    • Expected Outcomes: Finished implementations of the functions with proper documentation, testing, and benchmarking against existing libraries such as OpenCV or NVIDIA VPI. Examples and tutorials are expected as part of the final delivery, as well as testing on NVIDIA Jetson devices.

    • Resources:

    • Possible Mentors: Edgar Riba

    • Difficulty: Hard

    • Duration: 350 hours
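
To make the target of this project concrete, here is a minimal CPU reference sketch of a bilinear resize on a single-channel `f32` image. It is not the kornia-imgproc implementation: the function name `resize_bilinear`, the row-major layout, and the pixel-center alignment convention are assumptions made for illustration. The point is that each output pixel depends only on its own coordinates and four source reads, which is the structure a GPU kernel would assign to one thread.

```rust
/// Bilinear resize of a row-major, single-channel image.
/// Assumes non-empty source and destination sizes.
fn resize_bilinear(src: &[f32], sw: usize, sh: usize, dw: usize, dh: usize) -> Vec<f32> {
    assert!(sw > 0 && sh > 0 && dw > 0 && dh > 0);
    assert_eq!(src.len(), sw * sh);
    let scale_x = sw as f32 / dw as f32;
    let scale_y = sh as f32 / dh as f32;
    let mut dst = vec![0.0f32; dw * dh];
    // Each (x, y) iteration is independent: on a GPU this loop body becomes the kernel.
    for y in 0..dh {
        for x in 0..dw {
            // Map the output pixel center back into source coordinates, clamped to the image.
            let fx = ((x as f32 + 0.5) * scale_x - 0.5).clamp(0.0, (sw - 1) as f32);
            let fy = ((y as f32 + 0.5) * scale_y - 0.5).clamp(0.0, (sh - 1) as f32);
            let (x0, y0) = (fx.floor() as usize, fy.floor() as usize);
            let (x1, y1) = ((x0 + 1).min(sw - 1), (y0 + 1).min(sh - 1));
            let (wx, wy) = (fx - x0 as f32, fy - y0 as f32);
            // Interpolate along x on the two neighboring rows, then along y.
            let top = src[y0 * sw + x0] * (1.0 - wx) + src[y0 * sw + x1] * wx;
            let bottom = src[y1 * sw + x0] * (1.0 - wx) + src[y1 * sw + x1] * wx;
            dst[y * dw + x] = top * (1.0 - wy) + bottom * wy;
        }
    }
    dst
}
```

A reference like this is also useful as the ground truth when testing a GPU port: the two outputs can be compared within a small floating-point tolerance before any benchmarking is done.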

  5. Depth Estimation on Android Using Rust

    • Description: This project aims to develop an Android application in Rust that performs real-time depth estimation with ONNX Runtime through its Rust bindings. By leveraging Rust’s performance and memory safety, the application will enable on-device AI processing for robotics, augmented reality (AR), SLAM, and 3D scene reconstruction.

    • Expected Outcomes:

      • A fully functional Rust-based Android application capable of real-time depth estimation.
      • Integration of ONNX Runtime via Pyke’s Rust bindings for optimized AI inference.
      • Utilization of Android’s official Rust build system for compatibility and performance.
      • A lightweight, user-friendly UI for depth visualization.
      • Performance benchmarking to compare execution efficiency.
      • Comprehensive documentation and a demo video showcasing real-world applications.
    • Resources:

      • Android’s Official Build System for Rust Modules: Overview
      • ONNX Runtime with Rust Bindings: ORT
      • Depth Estimation Models: [Depth-Anything, MiDaS, etc.]
    • Possible Mentors: Christie Jacob

    • Difficulty: Hard

    • Duration: 350 hours

  6. Implement modular VO Slam API