- Stephen James
- paper: Coarse-to-fine Q-attention with Tree Expansion
- We envision coarse-to-fine Q-attention as a tree that can be expanded and used to accumulate value estimates across the top-k voxels at each Q-attention depth.
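The tree-expansion idea can be illustrated with a minimal sketch. It is not the authors' implementation: `expand_tree`, `q_fn`, and the toy table are hypothetical names, and summing Q-values along a path is an assumed accumulation rule. At each depth the top-k voxels are kept, each is expanded at the next (finer) depth, and the path with the highest accumulated value is returned.

```python
import heapq


def expand_tree(q_fn, k, max_depth, depth=0, parent=None, acc=0.0, path=()):
    """Return (accumulated value, voxel path) of the best root-to-leaf path.

    q_fn(depth, parent) is a hypothetical stand-in for a Q-attention network:
    it returns {voxel: Q estimate} for the grid inside `parent` at `depth`.
    """
    scores = q_fn(depth, parent)
    # Keep only the k highest-valued voxels at this depth.
    top = heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])
    best = (float("-inf"), ())
    for voxel, q in top:
        total = acc + q  # assumed rule: sum estimates along the path
        new_path = path + (voxel,)
        if depth + 1 == max_depth:
            candidate = (total, new_path)
        else:
            candidate = expand_tree(q_fn, k, max_depth, depth + 1,
                                    voxel, total, new_path)
        best = max(best, candidate, key=lambda c: c[0])
    return best


# Toy two-level Q table (made-up numbers, for illustration only).
TOY_Q = {
    None: {"a": 1.0, "b": 0.75, "c": 0.0},
    "a": {"a0": 0.25, "a1": 0.125},
    "b": {"b0": 1.0, "b1": 0.5},
    "c": {"c0": 0.0},
}


def toy_q(depth, parent):
    return TOY_Q[parent]


greedy = expand_tree(toy_q, k=1, max_depth=2)    # follows only the best voxel
expanded = expand_tree(toy_q, k=2, max_depth=2)  # also explores the runner-up
```

With `k=1` the search is the usual greedy coarse-to-fine pass and ends in `("a", "a0")`; with `k=2` it also expands `"b"` and finds the higher-valued path `("b", "b0")`.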
- Srinath Sridhar
- CVPR work (ConDor): How can we train our neural networks to make 3D chairs, tables, and other objects upright? In our upcoming work we investigate the problem of "canonicalizing" the 3D pose of common object categories without any supervision.
- ConDor uses Tensor Field Networks (TFNs - neural networks that are equivariant to point permutation, 3D rotation, and translation) to estimate a canonical frame of reference that is learned using self-supervision losses on common shape categories. ConDor can also handle partial 3D shapes as shown below. Surprisingly, it can also consistently co-segment shapes without any supervision.
- Project Webpage: ivl.cs.brown.edu/ConDor/
- Xiaohua Zhai
- We release the Big Vision codebase, a JAX library originally used to develop ViT, Mixer, ViT-G, LiT, and more! Alongside it, we provide a better plain ViT-S/16 baseline (76.5% ImageNet, 90 epochs) as a simple and strong starting point. The codebase supports training large-scale vision models on Google Cloud TPUs and scales seamlessly from a single core up to 2048 cores!
- github
- Daqi Lin
- Want real-time global illumination beyond diffuse? Introducing ReSTIR Path Tracing (ReSTIR PT) that allows you to reuse paths through glass and other complex interactions, based on a new theory we develop - Generalized Resampled Importance Sampling.
- github
- Zhiqin Chen
- Announcing Neural Dual Contouring (NDC), a new data-driven approach to reconstructing meshes from all kinds of inputs: grids of signed or unsigned distances, binary voxels, or point clouds (without normals). Compared to our prior work Neural Marching Cubes, it is simpler, faster, more robust, and able to take unsigned inputs.
- github
- Andrea Tagliasacchi: The network is VERY simple. Given a multitude of input formats, use a neural network to regress:
- polygon existence on facets (... just a grid)
- vertex coordinates within cells (... just a grid)
- Then you stitch everything up with the classical dual contouring logic... Voilà!
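The stitching step described above can be sketched as follows. This is a simplified illustration of classical dual-contouring connectivity, not the NDC code: `stitch_quads`, `cell_vertices`, and `active_edges` are hypothetical names, the "predictions" are hand-written stand-ins for network outputs, and the existence flags are assumed to be stored per grid edge. Each flagged edge yields one quad joining the vertices of the four cells that share that edge; consistent face orientation is not handled.

```python
def stitch_quads(cell_vertices, active_edges):
    """Connect per-cell vertices into quads, dual-contouring style.

    cell_vertices: {(i, j, k): (x, y, z)}, one predicted vertex per cell.
    active_edges: iterable of (axis, i, j, k); the grid edge starts at grid
        point (i, j, k) and runs along `axis` (0 = x, 1 = y, 2 = z).
    Returns (vertices, quads) with quads as 4-tuples of vertex indices.
    """
    cells = sorted(cell_vertices)
    index = {cell: n for n, cell in enumerate(cells)}
    vertices = [cell_vertices[cell] for cell in cells]
    quads = []
    for axis, i, j, k in sorted(active_edges):
        # The two axes perpendicular to the edge.
        b, c = [a for a in range(3) if a != axis]
        ring = []
        # Walk around the edge so consecutive cells are face neighbours.
        for db, dc in ((-1, -1), (0, -1), (0, 0), (-1, 0)):
            cell = [i, j, k]
            cell[b] += db
            cell[c] += dc
            ring.append(tuple(cell))
        # Skip boundary edges that lack one of their four cells.
        if all(cell in index for cell in ring):
            quads.append(tuple(index[cell] for cell in ring))
    return vertices, quads


# Stand-in "predictions": a 2x2x1 block of cells with vertices at cell centres.
cell_vertices = {
    (0, 0, 0): (0.5, 0.5, 0.5),
    (1, 0, 0): (1.5, 0.5, 0.5),
    (0, 1, 0): (0.5, 1.5, 0.5),
    (1, 1, 0): (1.5, 1.5, 0.5),
}
# One interior z-edge shared by all four cells, plus one boundary edge.
active_edges = {(2, 1, 1, 0), (2, 0, 0, 0)}
vertices, quads = stitch_quads(cell_vertices, active_edges)
```

Here the interior edge produces a single quad over the four cell vertices, while the boundary edge is dropped because it does not have four neighbouring cells.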
- Zirui Wang
- CoCa: a new image-text foundation model subsuming single-encoder, dual-encoder and encoder-decoder. SOTA results on 19 unimodal/multimodal/alignment tasks including 86.3% zero-shot top-1 ImageNet, 90.6% with a frozen encoder, 91.0% when finetuned.
- link
- Iliyan Georgiev
- AK
- Rana Hanocka
- paper (#SIGGRAPH2022): GANimator: Neural Motion Synthesis from a Single Sequence
- GANimator can produce novel animations for unique creatures that don't have large motion datasets! For example, this hexapedal crab.
- Xiaowei Zhou
- paper (CVPR 2022 oral): Neural 3D Scene Reconstruction with the Manhattan-world Assumption
- Wenzel Jakob
- paper (SIGGRAPH'22): Differentiable Signed Distance Function Rendering
- We're excited to present a new method to render Signed Distance Functions (SDFs) in a differentiable manner, enabling high-fidelity image-based shape reconstruction.
- The IEEE Conference on Secure and Trustworthy Machine Learning (SaTML) will take place for the first time on Feb 8-10, 2023!
- This conference will focus on the theoretical and practical understandings of vulnerabilities inherent to ML systems, explore the robustness of ML algorithms and systems, and aid in developing a unified, coherent scientific community which aims to build trustworthy ML systems.
- We welcome submissions on the following topics that relate to ML systems: trustworthy data curation, attacks and defenses, forensic analysis, verifying properties, securely and safely integrating ML into systems, privacy, fairness, accountability, transparency, interpretability.
- talk recording
- Experience with Nvidia's Instant NeRFs - Hands on With Nvidia Instant NeRFs
- Course from Yi Ma - Geometry & Learning for 3D Vision
- Course from Kosta Derpanis - Deep Learning in Computer Vision
- We are looking for a PostDoc at the Computer Vision and Geometry Group (CVG) at ETH Zürich. The candidate should have strong expertise in 3D vision and/or mobile robotics and have papers published at top-tier ML, robotics, or compu -link
- Regular reminder that Qualcomm AI Research is hiring DL researchers and software engineers! -link