@DavidLMS
Contributor

@DavidLMS DavidLMS commented Jun 19, 2025

Problem Solved

  • Camera functionality was not working in teleoperation (appeared to be work in progress)
  • No way to detect and configure cameras for robot teleoperation and recording
  • Missing live camera streaming capabilities for real-time robot control
  • No persistence for camera configurations between sessions

Changes Made

  • Camera detection system: Direct OpenCV detection avoiding LeRobot's 60-index spam
  • Intel RealSense support: Optional RealSense camera detection with graceful fallbacks
  • Live streaming: HTTP multipart streaming for real-time camera feeds
  • Camera configuration: Complete CRUD operations for camera settings
  • Auto-preview capture: Automatic image capture during camera detection
  • Configuration persistence: Camera configs saved to ~/.cache/huggingface/lerobot/camera_configs/
  • Error handling: Robust error handling and logging throughout

Related PRs

New Files

  • app/camera_detection.py - Complete camera detection and configuration module
    • find_opencv_cameras() - Direct OpenCV detection
    • find_realsense_cameras() - Intel RealSense detection with fallbacks
    • capture_image_from_camera() - Preview image capture
    • create_camera_config_for_lerobot() - LeRobot-compatible config generation
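
A minimal sketch of what direct OpenCV detection might look like, assuming a small bounded range of indices is probed instead of scanning 60 of them. The function name matches `find_opencv_cameras()` above, but the signature, the `max_index` default, the injectable `open_capture` parameter, and the keys of the returned dictionaries are assumptions made for illustration, not the PR's actual code.

```python
from typing import Any, Callable, Dict, List, Optional


def _default_open_capture(index: int) -> Any:
    """Open a camera with OpenCV; imported lazily so this module loads without cv2."""
    import cv2  # requires the opencv-python package

    return cv2.VideoCapture(index)


def find_opencv_cameras(
    max_index: int = 5,  # assumed upper bound, not taken from the PR
    open_capture: Optional[Callable[[int], Any]] = None,
) -> List[Dict[str, Any]]:
    """Probe indices 0..max_index-1 and return those that actually deliver a frame."""
    opener = open_capture or _default_open_capture
    cameras: List[Dict[str, Any]] = []
    for index in range(max_index):
        cap = opener(index)
        try:
            if not cap.isOpened():
                continue
            ok, frame = cap.read()
            if not ok or frame is None:
                continue  # the device opened but produced no image
            cameras.append({
                "index": index,
                "type": "opencv",
                # 3 and 4 are cv2.CAP_PROP_FRAME_WIDTH and cv2.CAP_PROP_FRAME_HEIGHT
                "width": int(cap.get(3)),
                "height": int(cap.get(4)),
            })
        finally:
            cap.release()  # always free the device, even when it is skipped
    return cameras
```

Passing a custom `open_capture` makes it possible to exercise the probing logic without any camera attached.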

API Endpoints Added

# Camera Detection
GET  /cameras/detect                    # Detect all cameras (with optional type filter)
GET  /cameras/summary                   # Get camera detection summary
POST /cameras/test                      # Test specific camera configuration
POST /cameras/capture                   # Capture preview image from camera

# Camera Configuration
GET  /cameras/config                    # Get saved camera configuration
POST /cameras/config/save               # Save complete camera configuration
POST /cameras/config/update             # Update specific camera in config
DELETE /cameras/config/{camera_name}    # Remove camera from configuration
POST /cameras/create-config             # Create LeRobot-compatible camera config

# Live Streaming
GET  /cameras/stream/{camera_name}      # HTTP multipart live camera stream
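
As a rough illustration of the HTTP multipart streaming behind `/cameras/stream/{camera_name}`: a common way to do this is a `multipart/x-mixed-replace` response in which every part is one JPEG frame. The sketch below assumes frames are already JPEG-encoded; the `multipart_frames` name and the `frame` boundary are hypothetical and not taken from the PR.

```python
from typing import Iterable, Iterator

BOUNDARY = "frame"  # assumed boundary name
# Value to send as the response's Content-Type header.
STREAM_CONTENT_TYPE = f"multipart/x-mixed-replace; boundary={BOUNDARY}"


def multipart_frames(jpeg_frames: Iterable[bytes]) -> Iterator[bytes]:
    """Wrap each JPEG frame as one part of a multipart/x-mixed-replace body."""
    for jpeg in jpeg_frames:
        header = (
            f"--{BOUNDARY}\r\n"
            "Content-Type: image/jpeg\r\n"
            f"Content-Length: {len(jpeg)}\r\n\r\n"
        ).encode("ascii")
        # The trailing CRLF separates this part from the next boundary line.
        yield header + jpeg + b"\r\n"
```

With a framework that supports streaming responses, the generator can be returned directly, for example `StreamingResponse(multipart_frames(frames), media_type=STREAM_CONTENT_TYPE)` in FastAPI. Whether the PR does it this way is not stated in the description.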

Enhanced Files

  • app/main.py - Added all camera endpoints and streaming logic
  • app/config.py - Added camera configuration persistence functions
    • save_camera_config() - Save camera configurations
    • get_saved_camera_config() - Load saved camera configurations
    • update_camera_in_config() - Update specific camera settings
    • remove_camera_from_config() - Remove camera from configuration
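
The four persistence helpers could be sketched as a JSON file under the directory named in the description. The function names come from the list above; the signatures, the `cameras.json` file name, and the one-entry-per-camera-name layout are assumptions for illustration only.

```python
import json
from pathlib import Path
from typing import Any, Dict

# Directory named in the PR description; the file name is an assumption.
CONFIG_DIR = Path.home() / ".cache" / "huggingface" / "lerobot" / "camera_configs"
CONFIG_FILE = "cameras.json"


def save_camera_config(config: Dict[str, Any], config_dir: Path = CONFIG_DIR) -> None:
    """Write the whole configuration, creating the directory if needed."""
    config_dir.mkdir(parents=True, exist_ok=True)
    (config_dir / CONFIG_FILE).write_text(json.dumps(config, indent=2))


def get_saved_camera_config(config_dir: Path = CONFIG_DIR) -> Dict[str, Any]:
    """Load the saved configuration, or an empty dict if nothing was saved yet."""
    path = config_dir / CONFIG_FILE
    if not path.exists():
        return {}
    return json.loads(path.read_text())


def update_camera_in_config(
    name: str, settings: Dict[str, Any], config_dir: Path = CONFIG_DIR
) -> Dict[str, Any]:
    """Merge new settings into one camera's entry and persist the result."""
    config = get_saved_camera_config(config_dir)
    config[name] = {**config.get(name, {}), **settings}
    save_camera_config(config, config_dir)
    return config


def remove_camera_from_config(name: str, config_dir: Path = CONFIG_DIR) -> bool:
    """Delete one camera's entry; return False if it was not configured."""
    config = get_saved_camera_config(config_dir)
    if name not in config:
        return False
    del config[name]
    save_camera_config(config, config_dir)
    return True
```

Taking `config_dir` as a parameter keeps the helpers easy to try against a temporary directory instead of the real cache.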

Backward Compatibility

  • All changes are additive, no breaking changes to existing functionality
  • Camera functionality is optional and doesn't affect existing robot operations
  • Graceful fallbacks when camera libraries are not available

Dependencies

  • pyrealsense2: Optional, for Intel RealSense camera support
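
One way the optional dependency and its graceful fallback might be handled is a guarded import, so that a missing `pyrealsense2` simply means no RealSense cameras are reported. The function name matches `find_realsense_cameras()` from the PR; the returned keys and the choice to swallow runtime errors are assumptions.

```python
from typing import Dict, List

try:
    import pyrealsense2 as rs  # optional dependency
except ImportError:
    rs = None


def find_realsense_cameras() -> List[Dict[str, str]]:
    """Return connected RealSense devices, or an empty list if none are usable."""
    if rs is None:
        return []  # library not installed: behave as "no RealSense cameras found"
    try:
        devices = rs.context().query_devices()
        return [
            {
                "type": "realsense",
                "name": dev.get_info(rs.camera_info.name),
                "serial_number": dev.get_info(rs.camera_info.serial_number),
            }
            for dev in devices
        ]
    except RuntimeError:
        # Assumed policy: a failing SDK call is treated like "nothing detected".
        return []
```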

@nicolas-rabault
Owner

Hi @DavidLMS
I don't really understand why, but I can't manage to pull your branch to test it. It seems to be conflicting in some way...
Did you pull the camera thing I made yesterday?

@DavidLMS
Contributor Author

DavidLMS commented Jun 19, 2025

Oh! I just saw it. This is bad luck, because I have all the other commits made yesterday. I’ll try to fix it.

@DavidLMS
Contributor Author

Hey @nicolas-rabault

I noticed we're both working on camera functionality at the same time and wanted to reach out to coordinate and avoid conflicts.

I saw your commit 28117e4 that adds basic camera detection with /available-cameras, and also PR #4 for phone cameras with WebRTC. My PR does more comprehensive local camera detection (OpenCV + RealSense), persistent configuration, HTTP streaming, etc.

The thing is, we have some functional overlap: we both implement OpenCV camera detection, just in different ways. There are also conflicts in main.py, since we're both adding endpoints in similar areas of the file.

But I think our work is actually complementary:

  • Your approach: external phone cameras via WebRTC (which is really cool btw)
  • My approach: local USB/RealSense cameras + configuration management

What do you think about coordinating the merge? We could:

  1. Decide which PR goes first and I rebase the other to resolve conflicts
  2. Unify the APIs under a common /cameras/* namespace
  3. Integrate both approaches - my camera detection could include "Phone Camera" as a special type that triggers your WebRTC flow

Would you prefer your PR #4 goes first and I adapt mine? I'm happy to modify whatever's needed to make them work well together.

What do you think?

@DavidLMS
Contributor Author

Hi @nicolas-rabault!

I just rebased onto main to include the missing "Working cameras configuration" commit and enhanced the existing /available-cameras endpoint with a more advanced implementation.

Changes made:

  • Maintained the original endpoint nomenclature for compatibility.
  • Enhanced /available-cameras with comprehensive detection (OpenCV + RealSense).
  • Added preview images and detailed camera properties.
  • Kept all advanced camera management endpoints (/cameras/*).
  • Fixed missing imports and error handling.

The endpoint now provides both backward compatibility and advanced features like live streaming, configuration persistence, and robust camera detection.

Have a look! I think you'll like it

@nicolas-rabault
Owner

Hey @DavidLMS,
Thanks for being so active!
From what I understand, you’ve resolved the conflicts and managed to combine the features — great work.
Apologies, but I might not be able to test it before Monday.
About the phone thing: I opened that pull request almost as an issue, so that we could brainstorm about how to do it. If you want to work on this, feel free to break what I did and implement your own...

@DavidLMS
Contributor Author

OK @nicolas-rabault, just this morning I tried to check out your PR, but it was giving me problems. I won't be able to test with the SO-101 robot until Monday either (and I'm afraid next week will be the last week I'll have access to it until September), but I'll try to make it work. I think the management of the cameras connected to the PC is in good shape. As I understand it, the idea in your PR is to be able to add external cameras that are not connected directly to the PC. I can try to integrate that into the same management system; my knowledge of WebRTC is limited, but I will give it a try.

If you can, I would like you to merge my PR first, so that I can start from it and bring in what you have done in your PR. If I work only from my branch, I won't be able to reuse your work easily.

Thanks for all!

@nicolas-rabault
Owner

Makes sense.
I will find a few minutes to test it on a robot this afternoon so you can work on it.

@DavidLMS
Contributor Author

Great! If you can, that would be great. I think you'll like it. If you have a Mac and an iPhone set up with Continuity Camera nearby, it detects the iPhone and lists it automatically as one of the available cameras, along with virtual cameras from applications such as OBS Studio. Remember to grant camera access to the terminal where you run the Space.
It's a pity that we have limited access to the robot; it limits the development time we have 😅.

@nicolas-rabault
Owner

@DavidLMS I tested your PR with a robot, and I didn't manage to add the robot camera.
Your PR pros:

  • Camera configuration in Teleoperation mode
  • Recording of the camera configuration

Your PR cons:

  • In your version I believe that the time to configure a cam is a little bit longer and less straightforward, but I'm probably biased.
  • Can't deal with other cameras than camera 0
  • A few minor visual updates that are not in your PR

@DavidLMS
Contributor Author

DavidLMS commented Jun 20, 2025

Oh, my bad. I'm sorry!

That's strange, it worked quite well for me. Maybe it's the operating system? It uses the same detection script as LeRobot.

I need some clarification to fix it:

  • In your version I believe that the time to configure a cam is a little bit longer and less straightforward, but I'm probably biased.

I have nothing to compare it to because there were no cameras in the version I started with. Only four frames were ready for reception, but none were operational.

  • Can't deal with other cameras than camera 0

I don't know why. In my case, it detected up to four cameras (the robot camera, the laptop camera, the iPhone camera, and the virtual OBS Studio camera).

  • A few minor visual updates that are not in your PR

Which exactly?

Perhaps we should start making screen recordings, because I think we are not seeing the same versions. I will make one on Monday, to see if we can figure out what is happening.

Thank you very much for the feedback!

@nicolas-rabault
Owner

I'm using macOS.

I have nothing to compare it to because there were no cameras in the version I started with. Only four frames were ready for reception, but none were operational.

You can find the camera-compatible version I made on the Hugging Face Space if you want to.
But you have to go back to main on the backend to make it work.

Which exactly?
On the first page I moved some buttons and displayed "work in progress" for those that are difficult to use.

Let's continue on Discord, it will be easier to chat, and come back here when things are sorted out.
