CURRENT STATE AND PROSPECTS OF INCREASING THE FUNCTIONALITY OF AUGMENTED REALITY USING NEURAL NETWORKS

I.V. Zhabokrytskyi

Èlektron. model. 2022, 44(5):73-89

https://doi.org/10.15407/emodel.44.05.073

ABSTRACT

The dynamics of modern society and rapid technological progress have created the need to interact with fast-changing, client-oriented information in real time. This need is met by augmented reality technology, which allows users to interact in real time with both the real physical world and the virtual digital world. The rapid digitization of human life has caused an exponential growth in the amount of available data, posing new challenges to the scientific community. At the same time, deep learning technology, which is successfully applied in various fields, holds considerable potential. The purpose of this study is to present the potential of combining augmented reality and deep learning technologies, their mutual improvement, and their further application in the development of modern, highly intelligent applications. The paper briefly introduces the concepts of augmented and mixed reality and describes deep learning technology. Based on a literature review, relevant studies on the development of augmented reality applications and systems using these technologies are presented and analyzed. After discussing how the integration of deep learning into augmented reality improves the quality and efficiency of applications and facilitates the daily life of their users, conclusions and suggestions for future research are provided.
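
Illustrative note: one common pattern behind the integration discussed above is to let a pretrained neural-network object detector supply anchors to which an augmented reality layer attaches virtual content. The Python sketch below assumes torchvision's pretrained Faster R-CNN detector; the detect_anchors helper and the score threshold are hypothetical names chosen for illustration, not components of the systems reviewed in the article.

# Minimal sketch (assumption: torchvision's pretrained Faster R-CNN as the detector).
# A deep-learning detector labels camera-frame regions; an AR layer could then
# anchor virtual overlays to the returned bounding boxes.
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn, FasterRCNN_ResNet50_FPN_Weights
from torchvision.transforms.functional import to_tensor
from PIL import Image

weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn(weights=weights).eval()
categories = weights.meta["categories"]

def detect_anchors(frame: Image.Image, score_threshold: float = 0.7):
    """Return (label, box) pairs an AR renderer could attach virtual content to."""
    with torch.no_grad():
        prediction = model([to_tensor(frame)])[0]
    anchors = []
    for label, box, score in zip(prediction["labels"], prediction["boxes"], prediction["scores"]):
        if score >= score_threshold:
            anchors.append((categories[int(label)], box.tolist()))
    return anchors

# In a full AR pipeline, every camera frame would pass through detect_anchors()
# and the surviving boxes would be used to position overlays in screen space.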

KEYWORDS

augmented reality; machine learning; deep learning; neural networks; virtual reality.

REFERENCES

  1. Dunleavy, M. (2014), “Design principles for augmented reality learning”, TechTrends, Vol. 58, no. 1, pp. 28-34.
    https://doi.org/10.1007/s11528-013-0717-2
  2. Enyedy, N., Danish, J.A. and DeLiema, D. (2015), “Constructing liminal blends in a collaborative augmented-reality learning environment”, Int. J. Comput.-Support. Collaborat. Learn., Vol. 10, no. 1, pp. 7-34.
    https://doi.org/10.1007/s11412-015-9207-1
  3. Lee, K. (2012), “Augmented reality in education and training”, TechTrends, Vol. 56, no. 2, pp. 13-21.
    https://doi.org/10.1007/s11528-012-0559-3
  4. Chen, P., Liu, X., Cheng, W. and Huang, R. (2017), “A review of using augmented reality in education from 2011 to 2016”, Innovations in smart learning, pp. 13-18.
    https://doi.org/10.1007/978-981-10-2419-1_2
  5. Di Serio, Á., Ibáñez, M.B. and Kloos, C.D. (2013), “Impact of an augmented reality system on students’ motivation for a visual art course”, Comput. Educ., Vol. 68, pp. 586-596.
    https://doi.org/10.1016/j.compedu.2012.03.002
  6. Wu, J., Ma, L. and Hu, X. (2017), “Delving deeper into convolutional neural networks for camera relocalization”, 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 5644-5651.
    https://doi.org/10.1109/ICRA.2017.7989663
  7. Azuma, R.T. (1997), “A survey of augmented reality”, Presence: Teleoper. Virtual Environ., Vol. 6, no. 4, pp. 355-385.
    https://doi.org/10.1162/pres.1997.6.4.355
  8. Billinghurst, M., Clark, A., Lee, G., et al. (2015), “A survey of augmented reality”, Found. Trends® Human–Comput. Interact., Vol. 8, no. 2–3, pp. 73-272.
    https://doi.org/10.1561/1100000049
  9. Furht, B. (2011), Handbook of Augmented Reality, Springer Science & Business Media.
    https://doi.org/10.1007/978-1-4614-0064-6
  10. Azuma, R.T., Baillot, Y., Behringer, R., Feiner, S., Julier, S. and MacIntyre, B. (2001), “Recent advances in augmented reality”, IEEE Comput. Graph. Appl., Vol. 21, no. 6, pp. 34-47.
    https://doi.org/10.1109/38.963459
  11. Carmigniani, J., Furht, B., Anisetti, M., Ceravolo, P., Damiani, E. and Ivkovic, M. (2011), “Augmented reality technologies, systems and applications”, Multimedia Tools Appl., Vol. 51, no. 1, pp. 341-377.
    https://doi.org/10.1007/s11042-010-0660-6
  12. Amin, D. and Govilkar, S. (2015), “Comparative study of augmented reality SDKs”, J. Comput. Sci. Appl., Vol. 5, no. 1, pp. 11-26.
    https://doi.org/10.5121/ijcsa.2015.5102
  13. Kim, H., Matuszka, T., Kim, J.-I., Kim, J. and Woo, W. (2017), “Ontology-based mobile augmented reality in cultural heritage sites: information modeling and user study”, Multimedia Tools Appl., Vol. 76, no. 24, pp. 26001-26029.
    https://doi.org/10.1007/s11042-017-4868-6
  14. Nowacki, P. and Woda, M. (2019), “Capabilities of ARCore and ARKit platforms for AR/VR applications”, International Conference on Dependability and Complex Systems, pp. 358-370.
    https://doi.org/10.1007/978-3-030-19501-4_36
  15. Milgram, P. and Kishino, F. (1994), “A taxonomy of mixed reality visual displays”, IEICE Trans. Inf. Syst., Vol. E77-D, no. 12, pp. 1321-1329.
  16. LeCun, Y., Bengio, Y. and Hinton, G. (2015), “Deep learning”, Nature, Vol. 521, no. 7553, pp. 436-444.
    https://doi.org/10.1038/nature14539
  17. Akgul, O., Penekli, H. and Genc, Y. (2016), “Applying deep learning in augmented reality tracking”, 2016 12th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), pp. 47-54.
    https://doi.org/10.1109/SITIS.2016.17
  18. Rublee, E., Rabaud, V., Konolige, K. and Bradski, G.R. (2011), “ORB: An efficient alternative to SIFT or SURF”, IEEE International Conference on Computer Vision (ICCV), pp. 2564-2571.
    https://doi.org/10.1109/ICCV.2011.6126544
  19. Limmer, M., Forster, J., Baudach, D., Schüle, F., Schweiger, R. and Lensch, H.P. (2016), “Robust deep-learning-based road-prediction for augmented reality navigation systems at night”, 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC), pp. 1888-1895.
    https://doi.org/10.1109/ITSC.2016.7795862
  20. Schüle, F., Schweiger, R. and Dietmayer, K. (2013), “Augmenting night vision video images with longer distance road course information”, 2013 IEEE Intelligent Vehicles Symposium, pp. 1233-1238.
    https://doi.org/10.1109/IVS.2013.6629635
  21. Risack, R., Klausmann, P., Krüger, W. and Enkelmann, W. (1998), “Robust lane recognition embedded in a real-time driver assistance system”, IEEE International Conference on Intelligent Vehicles, pp. 35-40.
  22. Farabet, C., Couprie, C., Najman, L. and LeCun, Y. (2012), “Learning hierarchical features for scene labeling”, IEEE Trans. Pattern Anal. Mach. Intell., Vol. 35, no. 8, pp. 1915-1929.
    https://doi.org/10.1109/TPAMI.2012.231
  23. Schröder, M. and Ritter, H. (2017), “Deep learning for action recognition in augmented reality assistance systems”, ACM SIGGRAPH 2017 Posters.
    https://doi.org/10.1145/3102163.3102191
  24. Long, J., Shelhamer, E. and Darrell, T. (2015), “Fully convolutional networks for semantic segmentation”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431-3440.
    https://doi.org/10.1109/CVPR.2015.7298965
  25. Abdi, L. and Meddeb, A. (2017), “Deep learning traffic sign detection, recognition and augmentation”, Proceedings of the Symposium on Applied Computing, pp. 131-136.
    https://doi.org/10.1145/3019612.3019643
  26. Cireşan, D., Meier, U., Masci, J. and Schmidhuber, J. (2012), “Multi-column deep neural network for traffic sign classification”, Neural Netw., Vol. 32, pp. 333-338.
    https://doi.org/10.1016/j.neunet.2012.02.023
  27. Stallkamp, J., Schlipsing, M., Salmen, J. and Igel, C. (2012), “Man vs. computer: benchmarking machine learning algorithms for traffic sign recognition”, Neural Netw., Vol. 32, pp. 323-332.
    https://doi.org/10.1016/j.neunet.2012.02.016
  28. Sermanet, P. and LeCun, Y. (2011), “Traffic sign recognition with multi-scale convolutional networks”, International Joint Conference on Neural Networks (IJCNN), pp. 2809-2813.
    https://doi.org/10.1109/IJCNN.2011.6033589
  29. Stallkamp, J., Schlipsing, M., Salmen, J. and Igel, C. (2011), “The German Traffic Sign Recognition Benchmark: a multi-class classification competition”, International Joint Conference on Neural Networks (IJCNN), pp. 1453-1460.
    https://doi.org/10.1109/IJCNN.2011.6033395
  30. Rao, J., Qiao, Y., Ren, F., Wang, J. and Du, Q. (2017), “A mobile outdoor augmented reality method combining deep learning object detection and spatial relationships for geovisualization”, Sensors, Vol. 17, no. 9, pp. 1951-1977.
    https://doi.org/10.3390/s17091951
  31. Wang, R., Lu, H., Xiao, J., Li, Y. and Qiu, Q. (2018), “The design of an augmented reality system for urban search and rescue”, 2018 IEEE International Conference on Intelligence and Safety for Robotics (ISR), pp. 267-272.
    https://doi.org/10.1109/IISR.2018.8535823
  32. Caelles, S., Maninis, K.-K., Pont-Tuset, J., Leal-Taixé, L., Cremers, D. and Van Gool, L. (2017), “One-shot video object segmentation”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 221-230.
    https://doi.org/10.1109/CVPR.2017.565
  33. Aliprantis, J., Kalatha, E., Konstantakis, M., Michalakis, K. and Caridakis, G. (2018), “Linked open data as universal markers for mobile augmented reality applications in cultural heritage”, Digital Cultural Heritage, pp. 79-90.
    https://doi.org/10.1007/978-3-319-75826-8_7
  34. Englert, M., Klomann, M., Weber, K., Grimm, P. and Jung, Y. (2019), “Enhancing the AR experience with machine learning services”, The 24th International Conference on 3D Web Technology, pp. 1-9.
    https://doi.org/10.1145/3329714.3338134
  35. Zhou, F., Duh, H.B.-L. and Billinghurst, M. (2008), “Trends in augmented reality tracking, interaction and display: a review of ten years of ISMAR”, Proceedings of the 7th IEEE/ACM International Symposium on Mixed and Augmented Reality, pp. 193-202.
  36. Krizhevsky, A., Sutskever, I. and Hinton, G.E. (2012), “ImageNet classification with deep convolutional neural networks”, Advances in Neural Information Processing Systems, pp. 1097-1105.
  37. Guenter, B., Finch, M., Drucker, S., Tan, D. and Snyder, J. (2012), “Foveated 3D graphics”, ACM Trans. Graph., Vol. 31, no. 6, pp. 1-10.
    https://doi.org/10.1145/2366145.2366183
