Dynamic scalability and contention prediction in public infrastructure using Internet application profiling

Abstract

Recent advances in cloud computing are attracting more customers to host their applications in public and private clouds, where resources are shared among several users by means of virtualization and exposed remotely as services. Infrastructure-as-a-Service (IaaS) builds on these services and gives customers more control over the provisioned resources, typically based on online monitoring of specific metrics (e.g., CPU, memory, and network). One of the key challenges in such shared environments is identifying bottlenecks in customers’ applications and predicting potential resource contention. Identifying real bottlenecks in enterprise applications first requires understanding how these applications behave. Predicting potential resource contention requires information beyond the traditional metrics used to guide resource provisioning. The challenge is exacerbated by the multi-tier architecture of enterprise applications, because the impact of contention in one tier can propagate to other tiers.

Intellectual Merit

Our approach targets customers (enterprises) who host, or are considering hosting, their applications in enterprise virtualized infrastructures (e.g., Amazon EC2) and who require high levels of application performance and reliability. For such cases, our approach involves:
• A mechanism for proactive scaling up/down of resources based on workload variations.
• A set of algorithms that predict and eliminate potential anomalies in the behavior of multi-tier enterprise applications.
One of the key innovations of our approach is the notion of “behavior models” that we build for each tier of the enterprise application under consideration. These models help us guide infrastructure scalability and discover (and abandon) VM instances that could degrade application performance.
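
The proactive scaling mechanism listed above can be illustrated with a minimal sketch. It is not the project's algorithm: it assumes a simple trend extrapolation as the workload forecast, and the names forecast_next and instances_needed, the per-instance capacity, the headroom factor, and the instance limits are all hypothetical values chosen for illustration.

```python
from math import ceil


def forecast_next(history, window=3):
    """Extrapolate the next interval's load from the recent trend.

    history: observed load per interval (e.g., requests per second).
    The forecast is the last observation plus the average slope over
    the last `window` observations (an assumed, deliberately simple model).
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    recent = history[-window:]
    if len(recent) < 2:
        return float(recent[-1])
    slope = (recent[-1] - recent[0]) / (len(recent) - 1)
    return max(0.0, recent[-1] + slope)


def instances_needed(predicted_load, capacity_per_instance,
                     headroom=0.25, min_instances=1, max_instances=20):
    """Number of VM instances to provision for the predicted load.

    capacity_per_instance, headroom, and the min/max limits are
    hypothetical parameters; a real deployment would calibrate them.
    """
    required = ceil(predicted_load * (1 + headroom) / capacity_per_instance)
    return max(min_instances, min(max_instances, required))


if __name__ == "__main__":
    load_history = [100, 150, 200]           # rising workload
    predicted = forecast_next(load_history)  # 250.0
    print(instances_needed(predicted, capacity_per_instance=100))
```

Because the decision is driven by the forecast rather than the current load, instances can be requested before the workload peak arrives; the same functions scale down when the trend is falling.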

Broader Impact

Typically, the Service Level Agreement (SLA) of an IaaS provider describes only the annual uptime of the instances; it does not address potential performance degradation caused by contention for resources. The aim of this research is to develop algorithms and components that allow users of public infrastructures to maintain the performance of their Internet applications.

Use of FutureGrid

I will run a benchmark, initially RUBiS, on many different machines. I need to be able to measure performance metrics including CPU, memory, I/O, and hard disk utilization. My algorithms will consume the APIs of the private cloud to scale resources dynamically according to workload variation.
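
As an illustration of the CPU part of this measurement, the sketch below computes utilization from two snapshots of the aggregate "cpu" line of /proc/stat. It assumes Linux guests, and the function names parse_cpu_line and cpu_fractions are hypothetical. It also reports the steal fraction (time the hypervisor gave to other guests), which is one possible contention signal; using it here is an assumption of this sketch, not a metric named in the project description.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into eight counters:
    user, nice, system, idle, iowait, irq, softirq, steal.
    Guest time is already included in user time, so it is not read."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:9]]
    values += [0] * (8 - len(values))  # older kernels report fewer fields
    return values


def cpu_fractions(before, after):
    """Return the non-idle and steal fractions between two snapshots."""
    deltas = [b - a for a, b in zip(parse_cpu_line(before),
                                    parse_cpu_line(after))]
    total = sum(deltas)
    if total <= 0:
        return {"busy": 0.0, "steal": 0.0}
    idle = deltas[3] + deltas[4]  # idle + iowait
    return {
        "busy": (total - idle) / total,  # all non-idle time, steal included
        "steal": deltas[7] / total,
    }


if __name__ == "__main__":
    # Two example snapshots; on a real VM these would be the first line
    # of /proc/stat read a few seconds apart.
    t0 = "cpu 100 0 50 800 50 0 0 0 0 0"
    t1 = "cpu 160 0 70 1500 60 0 0 10 0 0"
    print(cpu_fractions(t0, t1))
```

Sampling the counters at a fixed interval on each benchmark VM would give a utilization time series per tier, which is the kind of input a scaling algorithm could consume.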

Scale Of Use

Because the idea of my research is to maintain Internet application performance in spite of workload variation, I need 12 to 20 VM instances to be ready for provisioning throughout the project's run time.

Publications


FG-236
Wesam Dawoud
Hasso Plattner Institute
Active

Timeline

2 years 14 weeks ago