Abstract
For centuries, khalasi (Gujarati for sailor) have skillfully harnessed ocean currents to navigate vast waters with minimal effort. Emulating this intuition in autonomous systems remains a significant challenge, particularly for Autonomous Surface Vehicles (ASVs) tasked with long-duration missions under strict energy budgets. In this work, we address energy-efficient surface vehicle navigation in vortical flow fields, where partial observability often undermines traditional path-planning methods. We present an end-to-end reinforcement learning framework based on Soft Actor–Critic (SAC) that learns flow-aware navigation policies using only local velocity measurements. Through extensive evaluation across diverse and dynamically rich scenarios, our method demonstrates substantial energy savings and robust generalization to previously unseen flow conditions, offering a promising path toward long-term autonomy in ocean environments. The navigation paths generated by our approach consume 30–50% less energy than those of existing state-of-the-art techniques.
Energy Results
| Method | Oscillating Single Cylinder | Static Single Cylinder | Static Double Cylinder |
|---|---|---|---|
| Khalasi (Ours) | 111.38 ± 28.48 | 80.97 ± 18.93 | 98.40 ± 21.93 |
| Grid Based | 164.79 ± 8.54 | 161.30 ± 9.23 | 166.47 ± 8.65 |
| RL Based | 183.68 ± 25.42 | 174.26 ± 17.31 | 179.72 ± 24.23 |
| Mean Efficiency | 35.89% | 51.67% | 43.07% |
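The page does not state how the Mean Efficiency row is computed. The reported values are consistent with averaging the relative energy saving of Khalasi over the two baselines, which the minimal sketch below checks. The formula and the function name `mean_efficiency` are assumptions made for illustration, not the authors' stated definition; the numbers are the table means copied as given.

```python
# Mean energy per method, copied from the Energy Results table above.
energy = {
    "Oscillating Single Cylinder": {"Khalasi": 111.38, "Grid Based": 164.79, "RL Based": 183.68},
    "Static Single Cylinder": {"Khalasi": 80.97, "Grid Based": 161.30, "RL Based": 174.26},
    "Static Double Cylinder": {"Khalasi": 98.40, "Grid Based": 166.47, "RL Based": 179.72},
}


def mean_efficiency(row, ours="Khalasi"):
    """Average relative energy saving (%) of `ours` over every other method.

    Assumed definition: mean over baselines of 100 * (1 - E_ours / E_baseline).
    """
    savings = [
        100.0 * (1.0 - row[ours] / baseline_energy)
        for name, baseline_energy in row.items()
        if name != ours
    ]
    return sum(savings) / len(savings)


for env, row in energy.items():
    print(f"{env}: {mean_efficiency(row):.2f}%")
# Oscillating Single Cylinder: 35.89%
# Static Single Cylinder: 51.67%
# Static Double Cylinder: 43.07%
```

Under this reading, the saving relative to the grid-based planner alone is somewhat lower than the row value in each environment, and the saving relative to the RL baseline alone is somewhat higher.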
Navigation accuracy and energy efficiency across different agent spawn locations in various environments. The agent is spawned at positions on a 10 × 10 grid covering the entire environment, while the target is kept static at (290, 50). Each spawn point value is averaged over five test runs with different flow initializations. For each environment, the top plot shows accuracy across spawn locations, and the bottom plot shows energy efficiency. Results are shown for (A) the single oscillating cylinder environment, (B) the static single cylinder environment, and (C) the double static cylinder environment.
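The grid evaluation described in this caption can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' evaluation code: the domain bounds (300 × 100), the choice of cell centres as spawn points, and the `run_episode` stub are all hypothetical. Only the 10 × 10 grid, the five runs per spawn point, and the target at (290, 50) come from the caption.

```python
import random
from statistics import mean

# Hypothetical domain bounds; the caption only gives the target position.
X_MIN, X_MAX, Y_MIN, Y_MAX = 0.0, 300.0, 0.0, 100.0
TARGET = (290.0, 50.0)          # from the caption
GRID_N, RUNS_PER_POINT = 10, 5  # from the caption


def spawn_grid(n=GRID_N):
    """Centres of an n x n grid of cells covering the (assumed) domain."""
    dx, dy = (X_MAX - X_MIN) / n, (Y_MAX - Y_MIN) / n
    return [
        (X_MIN + (i + 0.5) * dx, Y_MIN + (j + 0.5) * dy)
        for j in range(n)
        for i in range(n)
    ]


def run_episode(spawn, target, seed):
    """Stand-in for one rollout with a fresh flow initialization.

    Returns (reached_target, energy_used). The random values here are
    placeholders; a real version would run the policy in the simulator.
    """
    rng = random.Random(seed)
    return rng.random() < 0.9, rng.uniform(50.0, 150.0)


def evaluate(episode_fn=run_episode):
    """Average accuracy (%) and energy over RUNS_PER_POINT runs per spawn point."""
    results = {}
    for k, spawn in enumerate(spawn_grid()):
        runs = [
            episode_fn(spawn, TARGET, seed=k * RUNS_PER_POINT + r)
            for r in range(RUNS_PER_POINT)
        ]
        results[spawn] = {
            "accuracy": 100.0 * mean(float(ok) for ok, _ in runs),
            "energy": mean(e for _, e in runs),
        }
    return results
```

Each entry of the returned dictionary corresponds to one cell of the accuracy and energy plots: the top plot would show `accuracy` and the bottom plot `energy` at that spawn location.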
Sample agent trajectories in different flow environments. The color gradient along the path represents the energy used by the agent at each point. Note: the background flow shown in the figures corresponds to the initial flow at the start of each trial. (A) Trajectories in the oscillating cylinder environment. (B) Trajectories in the static single-cylinder environment (normal von Kármán vortex street). (C) Trajectories in the static double-cylinder environment.
Navigation Results
| Environment | Vertical Spawn | L-shaped Spawn | Grid Spawn (10×10) |
|---|---|---|---|
| Oscillating Single Cylinder | 95% | 80% | 94.33% |
| Static Single Cylinder | 100% | 90% | 92.98% |
| Static Double Cylinder | 70% | 95% | 95.3% |
Flow Generalization Results
Sample trajectories for double-gyre system testing. Green lines indicate successful trials and red lines indicate failed ones in both cases.
Sample trajectories for quad-gyre system testing. Green lines indicate successful trials and red lines indicate failed ones in both cases.
Example test trajectory and the corresponding energy consumption on the NOAA dataset. Note: the background flow shown corresponds to the initial flow at the start.