R. Rai and V. Rai
Clemson University, South Carolina, United States
Keywords: Autonomous Navigation, Off-Road Environments, Multi-task Perception, Frame Transformation, Motion Planning
Recently, camera-based approaches have emerged as cost-effective alternatives to expensive LiDAR or real-time kinematic (RTK) systems for autonomous navigation, particularly in structured off-road environments such as orchards. Developing a fully autonomous navigation stack using low-cost cameras and onboard computers presents several challenges: creating computationally efficient perception models, enabling rapid path planning without costly SLAM, and ensuring reliable obstacle avoidance. This work introduces a novel vision-based autonomous navigation stack designed specifically for a low-cost stereo camera and inertial measurement unit (IMU). The system is composed of three core modules: (1) a multi-task perception network, (2) a frame transformation algorithm, and (3) a motion planning module. The perception network simultaneously detects tree trunks and obstacles and segments traversable paths from RGB images, running at 69 FPS with high accuracy (mAP@.5 of 96.7% for detection and mIoU of 98.1% for segmentation). The frame transformation algorithm fuses the detected navigation features and transforms them into the vehicle's frame of reference, while the motion planning module uses an optimized dynamic window approach (DWA) to compute and track optimal paths, such as center-row navigation and U-turns.
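To make the last two modules concrete, the sketch below illustrates the two generic operations the abstract names: mapping perceived features from the camera frame into the vehicle frame via a homogeneous transform, and a single dynamic-window step that samples velocity commands and scores forward-simulated trajectories. This is a minimal sketch, not the paper's implementation; the transform `T_vc`, the unicycle motion model, and all cost weights and margins are assumptions introduced for illustration.

```python
import numpy as np

def camera_to_vehicle(points_cam: np.ndarray, T_vc: np.ndarray) -> np.ndarray:
    """Transform (N, 3) camera-frame points into the vehicle frame.

    T_vc is an assumed 4x4 homogeneous transform (vehicle <- camera),
    e.g. obtained from extrinsic calibration of the stereo camera.
    """
    homo = np.hstack([points_cam, np.ones((len(points_cam), 1))])  # (N, 4)
    return (T_vc @ homo.T).T[:, :3]

def dwa_step(pose, v_range, w_range, obstacles, goal,
             dt=0.1, horizon=2.0, n_samples=11):
    """One dynamic-window step: sample (v, w) pairs, roll out a unicycle
    model over the horizon, and score each trajectory by goal progress,
    obstacle clearance, and speed. Weights are placeholder values."""
    best_score, best_cmd = -np.inf, (0.0, 0.0)
    for v in np.linspace(*v_range, n_samples):
        for w in np.linspace(*w_range, n_samples):
            x, y, th = pose
            clearance = np.inf
            for _ in range(int(horizon / dt)):      # forward-simulate
                x += v * np.cos(th) * dt
                y += v * np.sin(th) * dt
                th += w * dt
                d = np.hypot(obstacles[:, 0] - x, obstacles[:, 1] - y)
                clearance = min(clearance, d.min())
            if clearance < 0.3:                     # assumed collision margin (m)
                continue                            # discard unsafe trajectory
            goal_cost = -np.hypot(goal[0] - x, goal[1] - y)
            score = goal_cost + 0.5 * clearance + 0.1 * v
            if score > best_score:
                best_score, best_cmd = score, (v, w)
    return best_cmd                                 # (v, w) to track this cycle
```

In a stack like the one described, a loop of this shape would typically run every control cycle: detected trunks and obstacles are transformed into the vehicle frame, and the planner re-selects a velocity command toward the current row-center or U-turn waypoint.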