URL/ACM: A unified reinforcement learning approach for autonomic cloud management


Cloud computing, unlocked by virtualization, is emerging as an increasingly important service-oriented computing paradigm. The goal of this project is to develop a unified learning approach, namely URL, to automate the configuration of virtual machines and of the applications running on them, and to adapt system configurations to the dynamics of the cloud. The URL approach features three innovations. First is a reinforcement learning (RL) methodology for auto-configuration of virtual machines (VMs) on distributed computing resources in real time. Second is a unified RL approach for auto-configuration of both VMs and multi-tier web appliances; it adapts VM resource budgets and appliance parameter settings in a coordinated way to cloud dynamics and changing workloads, with the objective of service quality assurance. Third is a distributed, cooperative RL approach that allows RL-based learning and optimization agents, running on different servers with independent action choices, to derive an optimal joint configuration policy for large-scale systems.
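To make the first innovation concrete, below is a minimal sketch of an RL agent for VM auto-configuration, assuming a tabular Q-learning formulation: the state is a discretized (vCPU, memory) allocation, actions step the allocation up or down, and the reward reflects whether a measured response time meets an SLA target. The resource levels, the SLA threshold, and helper names such as apply_action and reward are illustrative assumptions, not the project's actual implementation.

    import random
    from collections import defaultdict

    # Hypothetical discretized configuration space: indices into (vCPU, memory) levels.
    VCPU_LEVELS = [1, 2, 4, 8]          # number of virtual CPUs
    MEM_LEVELS = [1, 2, 4, 8]           # memory in GB
    ACTIONS = ["cpu_up", "cpu_down", "mem_up", "mem_down", "no_op"]

    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # learning rate, discount factor, exploration rate

    q_table = defaultdict(float)             # Q[(state, action)] -> estimated long-term value

    def apply_action(state, action):
        """Move to a neighboring configuration; clamp at the boundaries."""
        cpu_idx, mem_idx = state
        if action == "cpu_up":
            cpu_idx = min(cpu_idx + 1, len(VCPU_LEVELS) - 1)
        elif action == "cpu_down":
            cpu_idx = max(cpu_idx - 1, 0)
        elif action == "mem_up":
            mem_idx = min(mem_idx + 1, len(MEM_LEVELS) - 1)
        elif action == "mem_down":
            mem_idx = max(mem_idx - 1, 0)
        return (cpu_idx, mem_idx)

    def reward(response_time, sla=0.5):
        """Positive reward when the measured response time meets the SLA (threshold is hypothetical)."""
        return 1.0 if response_time <= sla else -1.0

    def choose_action(state):
        """Epsilon-greedy action selection over the current Q-table."""
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: q_table[(state, a)])

    def q_update(state, action, r, next_state):
        """Standard one-step Q-learning update."""
        best_next = max(q_table[(next_state, a)] for a in ACTIONS)
        q_table[(state, action)] += ALPHA * (r + GAMMA * best_next - q_table[(state, action)])

In use, such an agent would repeatedly observe the workload, pick a reconfiguration action with choose_action, apply it to the VM, measure the resulting response time, and feed the outcome back through q_update, gradually learning a configuration policy online.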

Deliverables emerging from this project will advance discovery and understanding of autonomic management of large-scale complex systems, with profound technical, economic, and societal impact. In addition, the project has an integral educational component: it will raise awareness of system management issues and of the power of machine learning techniques, and prepare students to enter industry with an adequate understanding of the challenges and opportunities in cloud computing. This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5).


This URL/ACM project was funded by the U.S. National Science Foundation under grant CNS-0914330 (9/2009-8/2012).

