Impact of data augmentation on supervised learning for a moving mid-frequency source.
Two residual networks are implemented for source localization and seabed classification using data from a moving mid-frequency source recorded during the Seabed Characterization Experiment in 2017. The first model performs only classification to infer the seabed type, and the second model uses regression to estimate the source localization parameters. Training is performed on synthetic data generated by the ORCA normal mode model. The architectures are tested on both measured field data and simulated data with variations in the sound speed profile and seabed mismatch. Additionally, nine data augmentation techniques are implemented to study their effect on the network predictions. Network performance is quantified by the root mean square error for regression and by accuracy for seabed classification. The models give consistent source localization estimates and a seabed classification accuracy above 65% in the worst-case scenario. The data augmentation study shows that the more complex transformations, such as time warping, time masking, frequency masking, and combinations of these techniques, yield significant improvement in the results for both the simulated and measured data.
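As an illustration of two of the augmentations named above (time masking and frequency masking) and of the two reported metrics, a minimal Python sketch follows. It is not the implementation used in the study. It assumes the network input is a time-frequency array stored as a list of frequency rows, and that a mask replaces a random contiguous band with zeros. The function names, the `max_width` parameter, and the fill value are all hypothetical.

```python
import math
import random


def time_mask(spec, max_width, rng):
    """Zero a random run of consecutive time frames (columns).

    Assumption: spec is a list of frequency rows, each a list of floats.
    """
    n_time = len(spec[0])
    width = rng.randint(0, min(max_width, n_time))
    start = rng.randint(0, n_time - width)
    return [
        [0.0 if start <= t < start + width else v for t, v in enumerate(row)]
        for row in spec
    ]


def freq_mask(spec, max_width, rng):
    """Zero a random run of consecutive frequency bins (rows)."""
    n_freq = len(spec)
    width = rng.randint(0, min(max_width, n_freq))
    start = rng.randint(0, n_freq - width)
    return [
        [0.0] * len(row) if start <= f < start + width else list(row)
        for f, row in enumerate(spec)
    ]


def rmse(pred, true):
    """Root mean square error, the regression metric named in the abstract."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))


def accuracy(pred, true):
    """Fraction of correctly predicted class labels."""
    return sum(p == t for p, t in zip(pred, true)) / len(true)


# Toy input: 4 frequency bins by 6 time frames, all ones (hypothetical data).
rng = random.Random(0)
spec = [[1.0] * 6 for _ in range(4)]

# Combining the two masks, in the spirit of the "combination" case.
augmented = freq_mask(time_mask(spec, 3, rng), 2, rng)
```

Each function returns a new array and leaves its input unchanged, so several augmented copies can be drawn from one simulated sample. Time warping is omitted here because it requires an interpolation scheme that the abstract does not specify.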