Alfimtsev A.N., Pitikin A.R. Emergent properties of multi-agent reinforcement learning // Informacionnye tehnologii i matematicheskoe modelirovanie v upravlenii slozhnymi sistemami: elektronnyj nauchnyj zhurnal [Information technology and mathematical modeling in the management of complex systems: electronic scientific journal]. 2023. No. 1(17). pp. 1–10. DOI: 10.26731/2658-3704.2023.1(17).1-10 [Accessed 31/03/23]
This paper presents ten emergent properties of multi-agent reinforcement learning. Each property is formalized in terms of Markov decision processes and stated as a formula. The authors suggest that such a formalization makes it possible to train a multi-agent system in a targeted way so that it acquires the required emergent properties. They also establish that emergence in multi-agent reinforcement learning is weak. Highly cited publications on multi-agent learning were analyzed to check for the presence of the formulated properties. Based on this analysis, a summary table of the properties is presented, indicating the algorithm in which each property was observed, the environment for which the algorithm was created, the architecture of the agent's neural network, and the reinforcement learning scheme used.
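As a rough illustration of the kind of formalization described in the abstract, the sketch below writes down the standard Markov-game (stochastic-game) model commonly used in multi-agent reinforcement learning and states an emergent property as a predicate over the joint policy. This is a minimal sketch under our own notation, not the authors' formulas from the paper itself.

```latex
% Minimal sketch, assuming the standard Markov-game formalization of
% multi-agent reinforcement learning. Notation is illustrative only and
% is not taken from Alfimtsev and Pitikin's paper.
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}

A multi-agent system is modeled as a Markov game
\[
  \mathcal{G} = \bigl\langle \mathcal{N},\, \mathcal{S},\,
      \{\mathcal{A}_i\}_{i \in \mathcal{N}},\, P,\,
      \{R_i\}_{i \in \mathcal{N}},\, \gamma \bigr\rangle,
\]
where $\mathcal{N}$ is the set of agents, $\mathcal{S}$ the state space,
$\mathcal{A}_i$ the action space of agent $i$,
$P : \mathcal{S} \times \mathcal{A}_1 \times \dots \times \mathcal{A}_n
  \to \Delta(\mathcal{S})$ the transition kernel,
$R_i$ the reward function of agent $i$, and $\gamma \in [0,1)$ the discount factor.

An emergent (system-level) property can then be expressed as a predicate
$\Phi$ that holds for the trained joint policy
$\pi = (\pi_1, \dots, \pi_n)$ but not for any individual policy in isolation:
\[
  \Phi(\pi) \;\wedge\; \neg\Phi(\pi_i) \quad \text{for all } i \in \mathcal{N}.
\]

\end{document}
```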
- Jiang H. et al. Applications and development of artificial intelligence system from the perspective of system science: A bibliometric review // Systems Research and Behavioral Science. 2022. Vol. 39, no. 3. pp. 361–378.
- Busoniu L., Babuska R., De Schutter B. A comprehensive survey of multiagent reinforcement learning // IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews). 2008. Vol. 38, no. 2. pp. 156–172. DOI: 10.1109/TSMCC.2007.913919
- Jaderberg M., Czarnecki W.M., Dunning I., et al. Human-level performance in 3D multiplayer games with population-based reinforcement learning // Science. 2019. Vol. 364, no. 6443. pp. 859–865. DOI: 10.1126/science.aau6249
- Du W., Ding S. A survey on multi-agent deep reinforcement learning: from the perspective of challenges and applications // Artificial Intelligence Review. 2021. Vol. 54, no. 5. pp. 3215–3238.
- Goodfellow I., Bengio Y., Courville A. Deep learning. Cambridge, MA: MIT Press, 2016. 800 p.
- Hernandez-Leal P., Kartal B., Taylor M.E. A survey and critique of multiagent deep reinforcement learning // Autonomous Agents and Multi-Agent Systems. 2019. Vol. 33, no. 6. pp. 750–797.
- Neural Information Processing Systems (NeurIPS) proceedings. URL: https://papers.nips.cc/
- Kalantari S., Nazemi E., Masoumi B. Emergence phenomena in self-organizing systems: a systematic literature review of concepts, researches, and future prospects // Journal of Organizational Computing and Electronic Commerce. 2020. Vol. 30, no. 3. pp. 224–265.
- O’Connor T. Emergent Properties // The Stanford Encyclopedia of Philosophy (Winter 2021 Edition), Zalta E.N. (ed.). URL: https://plato.stanford.edu/archives/win2021/entries/properties-emergent/
- Tsvetkov V.Ya. Emergence // International Journal of Applied and Fundamental Research. 2017. No. 2-1. pp. 137–138. URL: https://applied-research.ru/ru/article/view?id=11234
- Wilson J.M. Metaphysical Emergence: Weak and Strong // Bigaj T., Wüthrich C. (eds.). Metaphysical Emergence in Contemporary Physics. Amsterdam: Rodopi, 2015. pp. 251–306.
- Moncion T., Amar P., Hutzler G. Automatic characterization of emergent phenomena in complex systems // Journal of Biological Physics and Chemistry. 2010. Vol. 10. pp. 16–23.
- Zeigler B.P., Muzy A. Some modeling & simulation perspectives on emergence in system-of-systems // Spring Simulation Multi-conference (SpringSim'16). 2016. pp. 1–5.
- Chen C.C., Nagl S.B., Clack C.D. Specifying, detecting and analysing emergent behaviours in multi-level agent-based simulations // Summer Computer Simulation Conference (SCSC'07). 2007. Vol. 2. pp. 969–976.
- Alfimtsev A.N. Multi-agent reinforcement learning. BMSTU Publ., 2021. 224 p.
- Vinyals O. et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning // Nature. 2019. Vol. 575, no. 7782. pp. 350–354.
- Wang T., Dong H., Lesser V., Zhang C. ROMA: Multi-Agent Reinforcement Learning with Emergent Roles // International Conference on Machine Learning (ICML 2020). 2020. DOI: 10.48550/arXiv.2003.08039
- Wang T., Gupta T., Mahajan A., Peng B., Whiteson S., Zhang C. RODE: Learning Roles to Decompose Multi-Agent Tasks. 2020. DOI: 10.48550/arXiv.2010.01523
- Havrylov S., Titov I. Emergence of language with multi-agent games: Learning to communicate with sequences of symbols // Advances in Neural Information Processing Systems. 2017. Vol. 30. pp. 1–11.
- Ryu H., Shin H., Park J. Multi-Agent Actor-Critic with Hierarchical Graph Attention Network // AAAI Conference on Artificial Intelligence (AAAI 2020). 2020. URL: https://ojs.aaai.org/index.php/AAAI/article/view/6214
- Du Y., Han L., Fang M., Liu J., Dai T., Tao D. LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning // Advances in Neural Information Processing Systems (NeurIPS 2019). 2019.
- Leibo J.Z., Duéñez-Guzmán E., Vezhnevets A.S., Agapiou J.P., Sunehag P., Koster R., Matyas J., Beattie C., Mordatch I., Graepel T. Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot // International Conference on Machine Learning (ICML 2021). 2021. pp. 6187–6199.
- Liu S., Lever G., Merel J., Tunyasuvunakool S., Heess N., Graepel T. Emergent Coordination Through Competition // International Conference on Learning Representations (ICLR 2019). 2019. DOI: 10.48550/arXiv.1902.07151
- Johanson M.B., Hughes E., Timbers F., Leibo J.Z. Emergent Bartering Behaviour in Multi-Agent Reinforcement Learning. 2022. DOI: 10.48550/arXiv.2205.06760
- Agapiou J.P., Vezhnevets A.S., Duéñez-Guzmán E.A., Matyas J., Mao Y., Sunehag P., Köster R., Madhushani U., Kopparapu K., Comanescu R., Strouse DJ, Johanson M.B., Singh S., Haas J., Mordatch I., Mobbs D., Leibo J.Z. Melting Pot 2.0. 2022. DOI: 10.48550/arXiv.2211.13746