
Target tracking is a process that finds applications in several domains, such as video surveillance, robot navigation and human-computer interaction. In this work we consider the problem of tracking a moving object in a multi-agent environment. The environment is a rectangular space bounded by walls. The first agent is the target, which moves randomly through the space. The second agent must follow the target, staying as close as possible without colliding with it. It uses sensors to detect the position of the target; the sensor readings give the distance and the angle to the target. We use reinforcement learning to train the tracker to react to any change in the target's movement and to stay within a certain range of it. Reinforcement learning is a form of machine learning in which the agent learns by interacting with the environment: for each action taken, the agent receives a reward from the environment, which signals whether the behaviour was positive or negative. The goal of the agent is to maximise the total reward received during the interaction. This form of machine learning has applications in many areas, such as game playing (the best-known example being AlphaGo), robotics (for designing hard-to-engineer behaviours), traffic light control and personalised recommendations. The sensor readings may take continuous values, which makes the state space very large. We therefore approximate the value function with neural networks and use different reward functions for learning the best policy.
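
The paper itself gives no code, but the learning loop the abstract describes (a state built from the distance and angle sensor readings, a discrete set of tracker moves, a reward that favours staying within a given range of the target, and a neural network approximating the action-value function) can be sketched roughly as follows in Python. The action set, reward shape, desired-distance band, network size and all names are illustrative assumptions rather than the authors' actual configuration.

```python
# Illustrative sketch only: a small DQN-style learner on a (distance, angle) state.
# The action set, reward shape, desired-distance band and network size are
# assumptions made for this example, not the setup used in the paper.
import random
import torch
import torch.nn as nn

ACTIONS = ["forward", "back", "turn_left", "turn_right"]  # hypothetical tracker moves
DESIRED_RANGE = (0.5, 2.0)   # assumed band of acceptable distances to the target
GAMMA, EPSILON = 0.95, 0.1   # discount factor and exploration rate

# Q-network: maps the 2-dimensional sensor state to one value per action.
q_net = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, len(ACTIONS)))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def reward(distance: float) -> float:
    """+1 while the tracker stays inside the desired band, otherwise a penalty
    that grows with how far it strays (one of several reward shapes one could try)."""
    lo, hi = DESIRED_RANGE
    if lo <= distance <= hi:
        return 1.0
    return -min(abs(distance - lo), abs(distance - hi))

def select_action(state) -> int:
    """Epsilon-greedy choice over the Q-network's outputs."""
    if random.random() < EPSILON:
        return random.randrange(len(ACTIONS))
    with torch.no_grad():
        return int(torch.argmax(q_net(torch.tensor(state))).item())

def td_update(state, action: int, r: float, next_state) -> None:
    """One temporal-difference step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    q_pred = q_net(torch.tensor(state))[action]
    with torch.no_grad():
        q_target = r + GAMMA * torch.max(q_net(torch.tensor(next_state)))
    loss = loss_fn(q_pred, q_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In use, each simulation step would read the current (distance, angle) pair from the tracker's sensors, call select_action, apply the chosen move, observe the new reading and reward, and feed the transition to td_update. A replay buffer and a separate target network, as in standard deep Q-learning, would make the updates more stable but are omitted to keep the sketch short.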

