UrbanScene3D: A Large Scale Urban Scene Dataset and Simulator

The ability to perceive the environment in different ways is essential to robotics research. This involves analyzing data from various sources, such as depth maps, visual images, and LiDAR scans. Related datasets in the 2D/2.5D image domains have been proposed [1, 2]. However, a comprehensive understanding of 3D scenes requires 3D data (e.g., point clouds and textured polygon meshes), which is still far from sufficient in the community.

We present a large-scale urban scene dataset, referred to as UrbanScene3D, together with a handy simulator based on Unreal Engine 4 [3] and AirSim [4]. The dataset consists of both man-made and real-world reconstructed scenes at different scales. The man-made scene models have compact structures and are carefully designed by professional modelers according to images and maps of the target areas; see the first row of Figure 1 for a glance. In contrast, UrbanScene3D also offers dense, detailed scene models reconstructed from aerial images via multi-view stereo (MVS) techniques; these scenes have realistic textures and meticulous structures, as shown in the second part of Figure 1. We also release the originally captured aerial images used to reconstruct the 3D scene models, as well as a set of 4K video sequences that can facilitate the design of algorithms such as SLAM and MVS; some samples are shown in the third and fourth parts of Figure 1.
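To give a concrete sense of how the simulator side is typically used, below is a minimal sketch (our illustration, not code from the authors) of connecting to an AirSim-based environment such as UrbanScene3D's, flying a drone to a viewpoint, and capturing an RGB frame. It assumes the simulator binary is already running on AirSim's default RPC port and that camera "0", AirSim's default front camera, is available.

```python
import airsim

# Connect to the running simulator (localhost:41451 by default).
client = airsim.MultirotorClient()
client.confirmConnection()
client.enableApiControl(True)
client.armDisarm(True)

# Take off and move to a viewpoint above the scene.
# AirSim uses the NED frame, so negative z means altitude above the origin.
client.takeoffAsync().join()
client.moveToPositionAsync(0, 0, -50, 5).join()

# Request one uncompressed RGB image from the front camera.
responses = client.simGetImages([
    airsim.ImageRequest("0", airsim.ImageType.Scene, False, False)
])
rgb = responses[0]
print(f"captured a {rgb.width}x{rgb.height} RGB frame")

client.armDisarm(False)
client.enableApiControl(False)
```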

Although 3D instance segmentation datasets exist, e.g., S3DIS [5], ScanNet [6], NYUv2 [7], and SceneNN [8], they are all collected from indoor scenes and are not sufficient for deep learning-based methods. Note that there is essentially no adequate data for learning 3D instance segmentation in outdoor scenes, especially in complicated urban regions. In this context, UrbanScene3D provides rich, large-scale 3D building annotations for outdoor instance segmentation research. To segment and label 3D urban architecture, we extract every individual building model from the entire urban scene. Each building model is then assigned a unique label, forming an instance segmentation map that serves as the ground truth for the instance segmentation task. The 3D ground-truth textured models with instance segmentation labels in UrbanScene3D allow users to derive whatever data they need: instance segmentation maps, depth maps at arbitrary resolution, 3D point clouds/meshes of both visible and occluded regions, etc. In addition, with the help of AirSim [4], users can simulate robots (cars/drones) to test a variety of autonomous tasks in the proposed city environments; see e.g., the bottom row of Figure 1.
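As a hedged illustration of the kind of per-instance rendering this enables, the sketch below uses AirSim's segmentation API to tag each building mesh with its own ID, then requests an instance segmentation map together with a float depth map in a single call. The mesh-name pattern "building_.*" is a hypothetical placeholder (actual mesh names depend on the loaded scene), and older AirSim releases name the planar depth type DepthPlanner rather than DepthPlanar.

```python
import airsim

client = airsim.VehicleClient()
client.confirmConnection()

# Give each matching building mesh its own segmentation ID (1-255) so the
# rendered segmentation map separates instances, not just classes.
buildings = client.simListSceneObjects("building_.*")  # hypothetical name pattern
for i, name in enumerate(buildings):
    client.simSetSegmentationObjectID(name, (i % 255) + 1, False)

# One call returns the instance map and a float planar-depth image.
responses = client.simGetImages([
    airsim.ImageRequest("0", airsim.ImageType.Segmentation, False, False),
    airsim.ImageRequest("0", airsim.ImageType.DepthPlanar, True, False),
])
seg, depth = responses
print(f"segmentation map: {seg.width}x{seg.height}")
print(f"depth range: {min(depth.image_data_float):.1f}-"
      f"{max(depth.image_data_float):.1f} m")
```

Because the depth image is requested as floats, its resolution and range are limited only by the camera settings, which is what makes "depth maps at arbitrary resolution" straightforward to produce from the released models.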
