Earlier this year, self-driving car datasets by Lyft (LyftLevel5 dataset) and Aptiv/NuTonomy (NuScenes dataset) were released. Along with the LyftLevel5 dataset release, a competition on Kaggle was announced. Long story short: I never participated in the challenge due to time constraints. However, I built a little script to generate XYZRGB point clouds by fusing camera and LiDAR data.

Why would I be insane enough to generate RGB point clouds from images and LiDAR data, especially since I never tire of saying that LiDAR has little real value in self-driving cars but a lot in high-precision robotics? Well:

  • my workstation was blocked by contract work and research
  • my notebook is not an option for deep learning
  • I’ve had bad experiences with kernels/notebooks on Kaggle (random disconnects)
  • cloud services are too expensive to play around with (IMHO they are also super expensive for production use…)

So I ended up with the only sane thing I could think of: building a pipeline that could be trained on a Raspberry Pi 3 Model B ;). Who needs ASICs? ;) Inference without real-time requirements on a Raspberry Pi is not that difficult, but training is a whole different story ;). Writing a (useful!) structure-from-motion pipeline that runs on an RPi 3 Model B would have been too time-consuming, so I merged LiDAR and camera data instead. Due to the nature of the dataset’s structure, I had to do some memory-intensive things - they really should build RPis with 32 GB of RAM… .

The ‘nuscenes-devkit’ contains some useful tools, including a function to project LiDAR points onto an image. We can use this function and simply extract the RGB values at the projected pixel locations. Since NuScenes also includes RADAR data, that could be used as well; I guess a little modification would make it work. Furthermore, we could also merge all points per scene.
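To illustrate the core idea without requiring the dataset, here is a minimal NumPy-only sketch of the projection-and-sampling step. It assumes a simple pinhole camera model with points already in the camera frame; the devkit’s own projection function additionally handles the ego-pose and sensor-calibration transforms, so treat the names and intrinsics below as hypothetical.

```python
import numpy as np

def fuse_lidar_rgb(points_xyz, image, K):
    """Project 3-D points into a camera image and attach RGB values.

    points_xyz : (N, 3) points in the camera frame (z pointing forward)
    image      : (H, W, 3) uint8 RGB image
    K          : (3, 3) pinhole intrinsics matrix
    Returns an (M, 6) array of XYZRGB points that land inside the image.
    """
    h, w = image.shape[:2]
    # Keep only points in front of the camera.
    pts = points_xyz[points_xyz[:, 2] > 0.1]
    # Perspective projection: (u, v, z)^T = K @ (x, y, z)^T, then divide by depth.
    uvw = (K @ pts.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]
    px = np.round(uv).astype(int)
    # Discard points projecting outside the image bounds.
    inside = (px[:, 0] >= 0) & (px[:, 0] < w) & (px[:, 1] >= 0) & (px[:, 1] < h)
    pts, px = pts[inside], px[inside]
    rgb = image[px[:, 1], px[:, 0]]  # note: row index = v, column index = u
    return np.hstack([pts, rgb.astype(float)])

# Tiny synthetic example: one point straight ahead of a 100x100 camera.
K = np.array([[100.0, 0, 50], [0, 100.0, 50], [0, 0, 1]])
img = np.zeros((100, 100, 3), dtype=np.uint8)
img[50, 50] = (255, 0, 0)  # red pixel at the image center
cloud = fuse_lidar_rgb(np.array([[0.0, 0.0, 5.0]]), img, K)
print(cloud)  # [[  0.   0.   5. 255.   0.   0.]]
```

With the real dataset, the same sampling step can be applied to the pixel coordinates returned by the devkit’s LiDAR-to-image projection, yielding one XYZRGB row per visible LiDAR return.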

Update: I removed the source code from this post and uploaded it to my “utils” repository on GitHub.

Another interesting approach would be to use the LiDAR data to warp all images into 3D space. I have done similar things with maps (draping them over a 3D digital elevation model). However, I’m afraid the point clouds in this dataset are too sparse and too unevenly distributed across space to yield useful results.
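The warping idea is essentially the inverse of the fusion above: back-project each pixel that has a (LiDAR-derived) depth value into 3D. A minimal sketch, again assuming a hypothetical pinhole camera and a dense-enough depth map; with the sparse LiDAR depth of these datasets, only a small fraction of pixels would survive the `depth > 0` filter, which is exactly the sparsity concern.

```python
import numpy as np

def backproject(depth, image, K):
    """Lift every pixel with a valid depth into a 3-D RGB point.

    depth : (H, W) array of metric depths, 0 where no depth is available
    image : (H, W, 3) uint8 RGB image
    K     : (3, 3) pinhole intrinsics matrix
    Returns (M, 6) XYZRGB points, one per pixel with depth > 0.
    """
    v, u = np.nonzero(depth > 0)          # pixel coordinates with valid depth
    z = depth[v, u]
    # Inverse projection: (x, y, z)^T = z * K^-1 @ (u, v, 1)^T
    pix = np.stack([u, v, np.ones_like(u)], axis=0).astype(float)
    xyz = (np.linalg.inv(K) @ (pix * z)).T
    rgb = image[v, u].astype(float)
    return np.hstack([xyz, rgb])

# Round-trip check of the single synthetic point from above.
K = np.array([[100.0, 0, 50], [0, 100.0, 50], [0, 0, 1]])
depth = np.zeros((100, 100))
depth[50, 50] = 5.0
img = np.zeros((100, 100, 3), dtype=np.uint8)
img[50, 50] = (0, 255, 0)
pts = backproject(depth, img, K)
print(pts)  # [[  0.   0.   5.   0. 255.   0.]]
```

Rendering these lifted pixels as a textured surface (rather than a point set) is where sparse, unevenly distributed LiDAR returns would start to hurt.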