Many modern software systems are built as a set of autonomous software components (also called agents) that collaborate with each other and are situated in an environment. To keep these multiagent systems operational under abnormal circumstances, it is crucial to make them resilient. Existing solutions are often centralised and rely on information manually provided by experts at design time, making such solutions rigid and limiting the autonomy and adaptability of the system. In this work, we propose a cooperative strategy focused on the identification of the root causes of quality requirement violations in multiagent systems. This strategy allows agents to cooperate with each other in order to identify whether these violations come from service providers, associated components, or the communication infrastructure. From this identification process, agents are able to adapt their behaviour in order to mitigate and solve existing abnormalities with the aim of normalising system operation. This strategy consists of an interaction protocol that, together with the proposed algorithms, allow agents playing the protocol roles to diagnose problems to be repaired. We evaluate our proposal with the implementation of a service-oriented system. The results demonstrate that our solution enables the correct identification of different sources of failures, favouring the selection of the most suitable actions to be taken to overcome abnormal situations.
翻译:暂无翻译