Low-Cost Vision-Based Navigation Stack for Off-Road Autonomous Systems

R. Rai and V. Rai
Clemson University, South Carolina, United States

Keywords: Autonomous Navigation, Off-Road Environments, Multi-task Perception, Frame Transformation, Motion Planning

Camera-based approaches have recently emerged as cost-effective alternatives to expensive LiDAR or real-time kinematic (RTK) systems for autonomous navigation, particularly in structured off-road environments such as orchards. Building a fully autonomous navigation stack on low-cost cameras and onboard computers poses several challenges: designing computationally efficient perception models, enabling rapid path planning without costly SLAM, and ensuring reliable obstacle avoidance. This work introduces a novel vision-based autonomous navigation stack designed around a low-cost stereo camera and inertial measurement unit (IMU). The system comprises three core modules: (1) a multi-task perception network, (2) a frame transformation algorithm, and (3) a motion planning module. The perception network simultaneously detects tree trunks, obstacles, and traversable paths from RGB images with high efficiency (69 FPS) and high accuracy (mAP@.5 of 96.7% and mIoU of 98.1%). The frame transformation algorithm fuses these navigation features and transforms them into the vehicle's frame of reference, while the motion planning module uses an optimized dynamic window approach (DWA) to compute and track optimal paths for maneuvers such as center-row navigation and U-turns.
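
To make the frame transformation step concrete, the sketch below shows one common way such a module can be realized: back-projecting a detection pixel with its stereo depth through the camera intrinsics, then mapping the resulting 3D point into the vehicle frame via camera-to-vehicle extrinsics. The function name, the intrinsic matrix K, and the extrinsics (R_vc, t_vc) are illustrative assumptions, not details taken from the paper.

    import numpy as np

    def pixel_to_vehicle(u, v, depth, K, R_vc, t_vc):
        """Back-project a detected pixel (u, v) with stereo depth into the
        camera frame, then map it into the vehicle frame via the assumed
        camera-to-vehicle extrinsics: p_v = R_vc @ p_c + t_vc."""
        p_cam = depth * (np.linalg.inv(K) @ np.array([u, v, 1.0]))
        return R_vc @ p_cam + t_vc

Similarly, the sketch below illustrates the generic dynamic window approach that the motion planning module builds on: sample velocities reachable within one control cycle, roll each candidate forward through a unicycle motion model, and score the rollouts by goal heading, obstacle clearance, and speed. All parameter values, cost weights, and names here are illustrative assumptions; the paper's specific DWA optimizations are not reproduced.

    import numpy as np

    # Illustrative vehicle and planner parameters (not from the paper)
    MAX_V, MAX_W = 1.5, 1.0          # velocity limits (m/s, rad/s)
    MAX_ACC, MAX_ANG_ACC = 0.5, 1.5  # acceleration limits (m/s^2, rad/s^2)
    DT = 0.1                          # control period (s)
    PREDICT_T = 2.0                   # rollout horizon (s)
    V_RES, W_RES = 0.05, 0.05         # velocity sampling resolution

    def motion(state, v, w, dt):
        """Unicycle model step; state = [x, y, yaw]."""
        x, y, yaw = state
        return np.array([x + v * np.cos(yaw) * dt,
                         y + v * np.sin(yaw) * dt,
                         yaw + w * dt])

    def rollout(state, v, w):
        """Simulate a constant (v, w) trajectory over the horizon."""
        traj = [state]
        for _ in range(int(PREDICT_T / DT)):
            traj.append(motion(traj[-1], v, w, DT))
        return np.array(traj)

    def dwa_plan(state, cur_v, cur_w, goal, obstacles):
        """Pick the (v, w) whose rollout best trades off heading toward
        the goal, clearance from obstacles, and forward speed."""
        # Dynamic window: velocities reachable within one control period
        v_lo = max(0.0, cur_v - MAX_ACC * DT)
        v_hi = min(MAX_V, cur_v + MAX_ACC * DT)
        w_lo = max(-MAX_W, cur_w - MAX_ANG_ACC * DT)
        w_hi = min(MAX_W, cur_w + MAX_ANG_ACC * DT)

        best, best_score = (0.0, 0.0), -np.inf
        for v in np.arange(v_lo, v_hi + 1e-9, V_RES):
            for w in np.arange(w_lo, w_hi + 1e-9, W_RES):
                traj = rollout(state, v, w)
                # Clearance: minimum distance from the rollout to any obstacle
                d = np.min(np.linalg.norm(
                    traj[:, None, :2] - obstacles[None, :, :], axis=2))
                if d < 0.3:  # reject colliding trajectories
                    continue
                heading = -np.linalg.norm(traj[-1, :2] - goal)
                # Cost weights are illustrative, not tuned values from the paper
                score = 1.0 * heading + 0.5 * min(d, 2.0) + 0.2 * v
                if score > best_score:
                    best, best_score = (v, w), score
        return best

    if __name__ == "__main__":
        state = np.array([0.0, 0.0, 0.0])                # x, y, yaw in the vehicle frame
        goal = np.array([4.0, 0.0])                      # e.g. next row-center waypoint
        obstacles = np.array([[2.0, 0.4], [2.5, -0.5]])  # e.g. trunk positions from perception
        v, w = dwa_plan(state, 0.5, 0.0, goal, obstacles)
        print(f"commanded v={v:.2f} m/s, w={w:.2f} rad/s")

Restricting the search to the dynamic window, i.e., to velocities reachable within a single control cycle, is what keeps this kind of planner cheap enough to run on a low-cost onboard computer without a global SLAM map.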