FEATURES OF NEURAL NETWORK APPLICATION IN RECOGNIZING ANTHROPOGENIC OBJECTS WITH VARIABLE CONTOURS
Abstract and keywords
Abstract:
The study analyzes how effectively semantic segmentation and instance segmentation methods identify anthropogenic objects with varying degrees of boundary variability in aerospace imagery. Neural network models including U-Net, PSPNet, DeepLabv3+, SegFormer, Twins-PCPVT, ConvNeXt, YOLOv7, YOLOv8, YOLOv9, and YOLOv11 are used. The authors divide object contour variability into three levels and examine how it affects model accuracy and generalizability, with a focus on the relationship between contour variability and the effectiveness of deep learning approaches. The research involves annotating remote sensing data by degree of boundary variability, running experiments with the neural networks, and developing an algorithm for comparing the performance of networks that belong to different segmentation types. The paper also discusses segmentation quality metrics and the nuances of interpreting them. The results show that semantic segmentation models are more effective for detecting large-area objects with pronounced boundary variability, while instance segmentation models achieve high recognition accuracy for objects with minimal boundary variability. The authors conclude that contour variability plays a critical role in data preparation and in the choice of segmentation method, and that further research is needed to improve model training and the reliability of object detection.

Keywords:
semantic segmentation, instance segmentation, aerospace imagery, metrics, deep learning, dataset
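
The abstract mentions an algorithm for comparing networks of different segmentation types but does not describe it. As a minimal sketch of one common approach (an assumption, not the authors' method), per-object instance masks can be merged into a single class mask and scored with the same pixel-level Intersection over Union (IoU) used for semantic output. The function names and the toy 4x4 scene below are hypothetical and serve only as illustration.

```python
import numpy as np


def instances_to_semantic(instance_masks, shape):
    """Merge per-object boolean masks into one class mask (logical OR)."""
    merged = np.zeros(shape, dtype=bool)
    for mask in instance_masks:
        merged |= np.asarray(mask, dtype=bool)
    return merged


def iou(pred, target):
    """Pixel-level IoU of two boolean masks; defined as 1.0 if both are empty."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


# Hypothetical ground truth: two 2x2 objects in a 4x4 scene (8 pixels).
target = np.zeros((4, 4), dtype=bool)
target[0:2, 0:2] = True
target[2:4, 2:4] = True

# Hypothetical semantic prediction: finds only the first object, with spill-over.
semantic_pred = np.zeros((4, 4), dtype=bool)
semantic_pred[0:2, 0:3] = True

# Hypothetical instance predictions: first object exact, second object half found.
inst_a = np.zeros((4, 4), dtype=bool)
inst_a[0:2, 0:2] = True
inst_b = np.zeros((4, 4), dtype=bool)
inst_b[2:4, 2:3] = True

semantic_iou = iou(semantic_pred, target)
instance_iou = iou(instances_to_semantic([inst_a, inst_b], target.shape), target)
print(f"semantic IoU: {semantic_iou:.2f}, merged-instance IoU: {instance_iou:.2f}")
```

Merging discards object identity, so this comparison only measures area overlap; instance-level metrics such as average precision would need a separate, per-object matching step.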
References

1. Long J., Shelhamer E., Darrell T. Fully convolutional networks for semantic segmentation // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015. P. 3431–3440. DOI: https://doi.org/10.1109/CVPR.2015.7298965.

2. He K., Gkioxari G., Dollar P., et al. Mask R-CNN // Proceedings of the IEEE International Conference on Computer Vision. Venice: IEEE, 2017. P. 2980–2988. DOI: https://doi.org/10.1109/ICCV.2017.322.

3. Zhao H., Puig X., Xiao T., et al. Semantic Understanding of Scenes through the ADE20K Dataset // Preprint arXiv.org, 2016. [Electronic resource]. Available at: https://arxiv.org/pdf/1608.05442 (accessed: 11.07.2024).

4. Ronneberger O., Fischer Ph., Brox T. U-Net: Convolutional Networks for Biomedical Image Segmentation // Preprint arXiv.org, 2015. [Electronic resource]. Available at: https://arxiv.org/pdf/1505.04597 (accessed: 11.07.2024).

5. Zhao H., Shi J., Qi X., et al. Pyramid scene parsing network // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017. P. 6230–6239. DOI: https://doi.org/10.1109/CVPR.2017.660.

6. Chen L.C., Zhu Y., Papandreou G., et al. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation // Proceedings of 15th European Conference “Computer Vision – ECCV 2018”, Munich, September 8–14, 2018. Cham: Springer, 2018. P. 833–851. DOI: https://doi.org/10.1007/978-3-030-01234-2_49.

7. Xie E., Wang W., Yu Z., et al. SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers // Proceedings of the 35th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates, 2021. P. 12077–12090.

8. Chu X., Tian Z., Wang Y., et al. Twins: Revisiting the Design of Spatial Attention in Vision Transformers // Proceedings of the 35th International Conference on Neural Information Processing Systems. Red Hook, NY: Curran Associates, 2021. P. 9355–9367.

9. Liu Z., Mao H., Wu C.-Y., et al. A ConvNet for the 2020s // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022. P. 11966–11976. DOI: https://doi.org/10.1109/CVPR52688.2022.01167.

10. Wang C.Y., Bochkovskiy A., Liao H.Y.M. YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023. P. 7464–7475. DOI: https://doi.org/10.1109/CVPR52729.2023.00721.

11. Wang C.Y., Yeh I.H., Liao H.Y.M. YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information // Proceedings of 18th European Conference “Computer Vision – ECCV 2024”, Milan, September 29 – October 4, 2024. Cham: Springer, 2024. P. 1–21. DOI: https://doi.org/10.1007/978-3-031-72751-1_1.

12. Sohan M., Sai Ram T., Rami Reddy Ch.V. A Review on YOLOv8 and its Advancements // Proceedings of the International Conference on Data Intelligence and Cognitive Informatics. Singapore: Springer, 2024. P. 529–545. DOI: https://doi.org/10.1007/978-981-99-7962-2_39.

13. Terven J., Cordova-Esparza D.-M., Romero-González J.-A. A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS // Machine Learning and Knowledge Extraction. 2023. Vol. 5. No. 4. P. 1680–1716. DOI: https://doi.org/10.3390/make5040083.

14. Shorten C., Khoshgoftaar T.M. A survey on Image Data Augmentation for Deep Learning // Journal of Big Data. 2019. Vol. 6. P. 60. DOI: https://doi.org/10.1186/s40537-019-0197-0.

15. Zoph B., Cubuk E.D., Ghiasi G., et al. Learning data augmentation strategies for object detection // Preprint arXiv.org, 2019. [Electronic resource]. Available at: https://arxiv.org/pdf/1906.11172 (accessed: 29.04.2024).

16. Ghaffar M.A.A., McKinstry A., Maul T., et al. Data augmentation approaches for satellite image super-resolution // ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences. 2019. Vol. IV-2/W7. P. 47–54. DOI: https://doi.org/10.5194/isprs-annals-IV-2-W7-47-2019.

17. Chen L., Wu Y., Stegmaier J., et al. SortedAP: Rethinking evaluation metrics for instance segmentation // Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops. Paris: IEEE, 2023. P. 3925–3931. DOI: https://doi.org/10.1109/ICCVW60793.2023.00424.
