Positioning is the most basic capability an indoor robot needs in order to localize itself and navigate autonomously. It means determining the robot's pose: its position relative to a global coordinate frame in a two-dimensional working environment, together with its own orientation. SLAM (Simultaneous Localization and Mapping) is the mainstream indoor robot positioning technology in the industry, and it can be divided into laser SLAM and visual SLAM.
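A two-dimensional pose of this kind is commonly written as (x, y, theta). The following is a minimal sketch, assuming that convention, of how a motion measured in the robot's own frame updates the pose in the global frame. The function names compose_pose and wrap_angle are illustrative and not taken from any particular SLAM library.

```python
import math


def wrap_angle(theta):
    """Wrap an angle in radians into the interval (-pi, pi]."""
    return math.atan2(math.sin(theta), math.cos(theta))


def compose_pose(pose, delta):
    """Apply a motion `delta` expressed in the robot frame to a global `pose`.

    Both arguments are (x, y, theta) tuples with theta in radians
    (an assumed convention for this sketch).
    """
    x, y, theta = pose
    dx, dy, dtheta = delta
    c, s = math.cos(theta), math.sin(theta)
    # Rotate the local displacement into the global frame, then add it.
    return (x + c * dx - s * dy,
            y + s * dx + c * dy,
            wrap_angle(theta + dtheta))


# A robot at (1, 2) facing the +y direction drives 1 m straight ahead.
new_pose = compose_pose((1.0, 2.0, math.pi / 2), (1.0, 0.0, 0.0))
```

In this example the robot ends up near (1, 3) with its heading unchanged, because "forward" in the robot frame points along the global y axis.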
Laser SLAM grew out of earlier range-based localization methods such as ultrasonic and infrared single-point ranging. The advent and popularity of lidar have made measurements faster, more accurate, and more informative. Lidar reports the objects it observes as a set of scattered points, each with accurate angle and distance information; this set is called a point cloud. In general, a laser SLAM system localizes the indoor mobile robot by matching two point clouds captured at different times and calculating the relative translation and rotation of the lidar between them.
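The matching step can be illustrated with a small sketch. It assumes the hard part, deciding which point in one cloud corresponds to which point in the other, has already been solved, and only shows the closed-form least-squares rotation and translation for two-dimensional points. Real scan matchers such as ICP repeat this step while re-estimating correspondences. The function name estimate_rigid_motion_2d is hypothetical.

```python
import math


def estimate_rigid_motion_2d(prev_pts, curr_pts):
    """Least-squares (theta, tx, ty) such that curr ~= R(theta) * prev + t.

    Assumes prev_pts[i] and curr_pts[i] are the same physical point seen in
    two scans (known correspondences), given as (x, y) tuples.
    """
    n = len(prev_pts)
    if n == 0 or n != len(curr_pts):
        raise ValueError("need two non-empty point lists of equal length")

    # Centroids of both clouds.
    pcx = sum(p[0] for p in prev_pts) / n
    pcy = sum(p[1] for p in prev_pts) / n
    qcx = sum(q[0] for q in curr_pts) / n
    qcy = sum(q[1] for q in curr_pts) / n

    # Accumulate dot and cross products of the centered point pairs.
    s_dot = 0.0
    s_cross = 0.0
    for (px, py), (qx, qy) in zip(prev_pts, curr_pts):
        ax, ay = px - pcx, py - pcy
        bx, by = qx - qcx, qy - qcy
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx

    # The optimal rotation angle maximizes cos(theta)*s_dot + sin(theta)*s_cross.
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)

    # The translation maps the rotated previous centroid onto the current one.
    tx = qcx - (c * pcx - s * pcy)
    ty = qcy - (s * pcx + c * pcy)
    return theta, tx, ty
```

Note that the result is the transform that aligns the earlier cloud with the later one. For a static scene, the motion of the lidar itself is the inverse of that transform.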
Lidar range measurement is accurate, its error model is simple, it operates stably except under direct strong light, and point clouds are relatively easy to process. At the same time, a point cloud directly encodes geometric relations, which makes path planning and navigation more intuitive. The theoretical research on laser SLAM is also relatively mature, and commercial products based on it are more plentiful.
Eyes are the primary source of information for humans, and visual SLAM works in a similar way. It can obtain a large amount of redundant texture information from the environment and has a strong scene identification ability. Early visual SLAM was based on filtering theory, and its nonlinear error model and heavy computational load obstructed its practical use. In recent years, advances in bundle adjustment, camera technology, and computing performance have made real-time visual SLAM practical.
Typically, a visual SLAM system consists of a front end and a back end. The front end is responsible for quickly computing the robot's position and orientation from incremental visual motion estimates. The back end is mainly responsible for two functions:
1. Detect loops, that is, cases where the indoor robot is judged to have returned to the vicinity of a previously visited place, and, when a loop is found, correct the positions and orientations estimated between the two visits.
2. Reposition the indoor autonomous robot from the visual texture information when the front end loses tracking.
In short, the front end handles fast positioning and the back end handles slower map maintenance.
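The correction applied after a loop is found can be illustrated with a deliberately crude sketch. It assumes the loop has already been detected and that matching against the earlier visit has produced a corrected position for the latest pose. It then spreads the accumulated drift linearly over the poses between the two visits. This linear spreading is a simple stand-in for illustration only; practical back ends solve an optimization problem (pose graph optimization or bundle adjustment) instead. The function name and its arguments are hypothetical, and headings are left out for brevity.

```python
def distribute_loop_correction(trajectory, loop_start, corrected_end):
    """Spread the drift found at loop closure over the poses inside the loop.

    trajectory    -- list of estimated (x, y) positions, oldest first
    loop_start    -- index of the earlier visit (assumed drift-free reference)
    corrected_end -- (x, y) the last pose should have, e.g. from matching
                     the current view against the earlier visit
    """
    end = len(trajectory) - 1
    span = end - loop_start
    if loop_start < 0 or span <= 0:
        raise ValueError("loop_start must lie before the last pose")

    # Drift accumulated between the two visits.
    ex = corrected_end[0] - trajectory[end][0]
    ey = corrected_end[1] - trajectory[end][1]

    corrected = list(trajectory[:loop_start])
    for i in range(loop_start, end + 1):
        # Weight grows from 0 at the earlier visit to 1 at the latest pose.
        w = (i - loop_start) / span
        x, y = trajectory[i]
        corrected.append((x + w * ex, y + w * ey))
    return corrected


# A square path that should end where it started but has drifted.
path = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.2, 0.1)]
fixed = distribute_loop_correction(path, loop_start=0, corrected_end=(0.0, 0.0))
```

Poses near the earlier visit barely move, while poses close to the end of the loop receive almost the full correction, which matches the intuition that drift accumulates over time.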
The advantage of visual SLAM is the rich texture information it leverages. For example, two billboards of the same size but with different contents cannot be distinguished by a point cloud-based laser SLAM algorithm, but visual SLAM can tell them apart easily. This gives visual SLAM a clear advantage in repositioning and scene classification. In addition, visual information can readily be used to track and predict dynamic targets in the scene, such as pedestrians and vehicles, which is crucial in complex dynamic scenes. Finally, the projection model of visual SLAM can in theory bring objects at an infinite distance into the visual scene. With a proper configuration (such as a stereo camera with a long baseline), visual SLAM can perform localization and map construction in large-scale scenes.
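Why a long baseline helps at large scale can be seen from the standard rectified stereo relation Z = f * B / d, where f is the focal length in pixels, B the baseline, and d the disparity in pixels. The sketch below assumes an ideal rectified pinhole stereo pair; the function names and the numbers in the example are illustrative assumptions, not values from any specific camera.

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth in meters of a point seen by an ideal rectified stereo pair."""
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px


def depth_error(focal_px, baseline_m, depth_m, disparity_error_px=1.0):
    """First-order depth uncertainty caused by a given disparity error.

    Follows from differentiating Z = f * B / d with respect to d:
    |dZ| ~= Z**2 / (f * B) * |dd|.
    """
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


# Compare a short and a long baseline for a point 20 m away
# (700 px focal length is an assumed example value).
short_baseline_error = depth_error(700.0, 0.1, 20.0)
long_baseline_error = depth_error(700.0, 0.5, 20.0)
```

Because the depth error grows with the square of the distance and shrinks in proportion to the baseline, widening the baseline from 0.1 m to 0.5 m cuts the error for the same point by a factor of five. Points whose disparity approaches zero have no usable depth in this model, although they can still be projected into the image.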