URL/ACM: A unified reinforcement learning approach for autonomic cloud
management
Cloud computing, unlocked by virtualization, is emerging as an increasingly
important service-oriented computing paradigm. The goal of this project
is to develop a unified learning approach, named URL, that automates the
configuration of virtual machines and of the applications running on them,
and adapts the system configuration to the dynamics of the cloud.
The URL approach features three innovations:
The first is a reinforcement learning (RL) methodology for auto-configuration
of virtual machines (VMs) on distributed computing resources
in real time. The second is a unified RL approach for auto-configuration
of both VMs and multi-tier web appliances. It adapts the VM resource
budget and the appliance parameter settings in a coordinated way to cloud
dynamics and changing workloads in order to assure service quality.
The third is a distributed, cooperative RL approach that allows
RL-based learning and optimization agents, which run on different servers and
choose their actions independently, to arrive at an optimal joint
configuration policy in large-scale systems.
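To make the first of these ideas concrete, the following is a minimal sketch of tabular Q-learning applied to a toy VM configuration problem. It is not the project's actual method or code: the configuration ranges, the action set, the reward function, and every name below (`VCPUS`, `MEM_GB`, `step`, `reward`, `q_learning`, `greedy_rollout`) are assumptions made purely for illustration. The environment is deterministic, and the hyperparameters were picked so the toy problem converges quickly.

```python
import random

# Hypothetical toy setup: a VM configuration is (vcpus, mem_gb), and each
# action nudges one resource up or down, or leaves the configuration alone.
VCPUS = [1, 2, 3, 4]
MEM_GB = [1, 2, 3, 4]
ACTIONS = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # (d_vcpu, d_mem)


def step(state, action):
    """Apply an action, clamping to the allowed configuration range."""
    vcpu = min(max(state[0] + action[0], VCPUS[0]), VCPUS[-1])
    mem = min(max(state[1] + action[1], MEM_GB[0]), MEM_GB[-1])
    return (vcpu, mem)


def reward(state):
    """Made-up reward: throughput saturates at 3 vCPUs and 2 GB, and every
    allocated unit of CPU or memory carries a fixed cost."""
    vcpu, mem = state
    throughput = 10 * min(vcpu, 3) + 5 * min(mem, 2)
    cost = 2 * vcpu + 2 * mem
    return throughput - cost


def q_learning(episodes=3000, steps=20, alpha=0.5, gamma=0.5,
               epsilon=0.3, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {((v, m), a): 0.0 for v in VCPUS for m in MEM_GB for a in ACTIONS}
    for _ in range(episodes):
        # Each episode starts from a random configuration.
        state = (rng.choice(VCPUS), rng.choice(MEM_GB))
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            nxt = step(state, action)
            # Q-learning update: move toward reward plus discounted best
            # value of the next configuration.
            target = reward(nxt) + gamma * max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
    return q


def greedy_rollout(q, state, steps=10):
    """Follow the learned greedy policy from a starting configuration."""
    for _ in range(steps):
        action = max(ACTIONS, key=lambda a: q[(state, a)])
        state = step(state, action)
    return state


if __name__ == "__main__":
    table = q_learning()
    print(greedy_rollout(table, (1, 1)))
```

With these made-up numbers the reward is highest at 3 vCPUs and 2 GB, so a well-trained table should steer any starting configuration toward that point. A real deployment would replace `reward` with measured application performance and would face noisy, non-deterministic feedback, which this sketch does not model.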
Deliverables that emerge from this project will advance discovery and
understanding of autonomic management of large-scale complex systems
with profound technical, economic, and societal impact.
In addition, this project has an integral educational component.
It will raise the level of awareness of system management issues and
the power of machine learning technology, and prepare the students to
enter the industry with adequate understanding of the challenges and
opportunities in cloud computing. This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5).
This URL/ACM project was funded by the U.S. National Science Foundation under grant CNS-0914330 for the period 9/2009-8/2012.
People
- Cheng-Zhong Xu (faculty)
- Zhen Kong (postdoctoral fellow)
- Jia Rao (PhD student)
- Xiangping Bu (PhD student)
- Kun Wang (PhD student)
- Jiayu Gong (PhD student)
Publications
- X. Bu, J. Rao, and C.-Z. Xu, A reinforcement learning approach to online web systems management, ICDCS'09
- J. Rao, X. Bu, C.-Z. Xu, L. Wang, and G. Yin, VCONF: a reinforcement learning approach to online VM autoconfiguration, ICAC'09
- J. Rao and C.-Z. Xu, Online measurement of the capacity of multi-tier websites using hardware performance counters, ICDCS'09
- J. Rao and C.-Z. Xu, CoSL: a coordinated statistical learning approach to measuring the capacity of multi-tier websites, IPDPS'08