Cost-driven provisioning and execution of a computing-intensive service on the Amazon EC2

Abstract

The decision to migrate a service to a cloud-based system must take many different aspects into consideration. Among them, economic cost is one of the most important. This paper describes how a computing-intensive service, based on a bag-of-tasks approach, has been migrated from a grid infrastructure to the Amazon Elastic Compute Cloud (EC2) infrastructure. A cost-based model for evaluating the economic cost of providing the service on the cloud is also proposed, which considers computing costs as well as storage and platform deployment costs. The model includes a wide range of instance types and purchasing policies provided by Amazon EC2, as well as the deadline and problem size provided by the service user. The paper also shows how the proposed cost-based model is integrated into the framework used for the service deployment and execution, making it possible to interact with the Amazon Web Services to hire the required cloud resources and use them efficiently for the execution of service requests.

1. INTRODUCTION

Cloud computing offers several advantages for the deployment and execution of applications: it reduces the spending of organizations on IT infrastructures, facilitates their management, improves and scales the accessibility of applications, reduces the payment of salaries and energy-consumption bills, etc. Nevertheless, the decision to migrate an application to cloud-based hosting is complex [1–3]. Even when the migration has been decided, the multiple alternatives in which an application might be deployed using the facilities offered by the cloud providers must be evaluated. Without a doubt, the economic cost is a decisive factor that influences these decisions from the customer's point of view.
The first stage when planning the execution of an application in a cloud environment is to determine the resources that will be provisioned. Nowadays, Infrastructure-as-a-Service (IaaS) cloud providers offer a wide variety of computing and storage resources for executing applications. From an economic perspective, finding the best resource combination for executing an application is a challenge in the field of cloud resource provisioning. The price and performance of resources, the application requirements and the customer constraints (mainly budget and time constraints) are critical issues for determining a well-suited provisioning policy. Some research proposals define models to estimate those costs when the cloud resources are selected according to a concrete provisioning policy [4–6]. Nevertheless, and in spite of the interest of these solutions, the goal is to determine the cheapest provisioning of resources. References [7–15] define different methods for minimizing the cost of provisioning cloud resources. However, current proposals suffer from some relevant shortcomings: either the cost of the data involved in the application execution is ignored or only partially considered, or the variety of resources and payment models offered by public cloud providers is not considered, or the prototypes are used just to evaluate simple case studies, without being integrated into application engines. The first two shortcomings can cause the computed total cost to be inaccurate (especially for data-intensive applications), or cause some provisioning options to be ignored (the cost of executing a long-term application in the Amazon Elastic Compute Cloud (EC2), for instance, can vary drastically depending on the way instances are hired and paid for). The third shortcoming constrains the applicability of the proposals and the possibilities of validating the results in real scenarios.
In [16], we presented a bag-of-tasks application to annotate a repository containing more than 15 million educational resources using cluster and grid computing infrastructures. It was fully annotated in 178 days (more than 1.5 million CPU hours were required). The interest of those results motivated us to consider transforming the semantic-annotation application into a pay-per-use service executable in Amazon EC2. Each service customer will pay for the annotation of an input workload under certain user constraints, so that the final price will depend on the type of computing and storage resources that have been hired to complete the annotation process. Migrating the first version of the application to a cloud environment entailed addressing some of the previously discussed shortcomings of cost-aware resource provisioning. Our research focuses on two open research questions. First, how can the price of executing bag-of-tasks applications in IaaS cloud providers be reduced, considering all the cost factors involved and the different offers of the providers? Second, once a cost model has been defined, how can a cost-driven provisioning mechanism be designed and implemented for the efficient execution of computing-intensive and data-intensive applications? The contributions of this paper are aligned with those two questions. First, we propose a model for computing the execution costs of a bag-of-tasks service (or application) which considers all the factors that have an impact on the final price (computing and data storage resources, data transfer costs, networking, etc.) as well as the variety of resources offered by public cloud providers (the wide catalog of resources and services, the different ways to hire and pay for them, the possibility of hiring resources in different geographical areas, etc.).
Besides, the generality of the model facilitates its application to different cloud service providers, considering the resource heterogeneity and the variety of payment options offered by today's IaaS cloud platforms. Second, we have implemented a cost-driven engine for the execution of bag-of-tasks applications in cloud environments. The cost model has been programmed as a Mixed Integer NonLinear Programming (MINLP) problem and integrated into a framework for the execution of service-oriented applications [17]. A new cost-aware provisioning component manages the provisioning of resources as a part of the applications' life-cycle. Besides, a second component, responsible for adapting the framework configuration in order to optimize the management of hired cloud resources during the applications' execution, has also been implemented and integrated into the final solution. Finally, in the paper, we apply the proposed solution to the development of a network-accessible service for the annotation of big sets of terms.

The remainder of the paper is structured as follows. Section 2 describes the process by which the previous grid-based solution has been migrated to a cloud-based version. Section 3 introduces the method used to model the execution costs and provision resources with minimal cost. Once the set of required resources has been established, the system must be deployed. Section 4 depicts this process. Section 5 presents a review of the main published contributions related to the problem the paper concentrates on. Finally, Section 6 presents some conclusions and future work.

2. MOVING FROM A GRID-BASED SOLUTION TO A CLOUD-BASED ENVIRONMENT

This section introduces the motivation and the complete process for the migration of a previous grid-based solution to a cloud-based environment.

2.1. Motivation: migrating to Amazon AWS

In [16], a repository of more than 15 million educational resources was semantically annotated.
The ADEGA algorithm [18] and the categories of DBpedia [19] were used to create the metadata of these resources. For each educational resource, a set of significant terms was extracted and semantically annotated by means of an RDF-based graph constructed from the DBpedia instances. The resulting graphs were used to classify the corresponding educational resources. An application was programmed to annotate all the educational resources stored in the repository. This application consisted of a (big) set of tasks, each one in charge of analyzing a set of educational resources in order to identify a set of relevant terms and then associating an RDF graph with these terms. In that work, different and heterogeneous computing infrastructures (two grids and a cluster) were used to execute the annotation application for the whole set of learning units. Afterwards, we concentrated on estimating the cost of solving the same problem using on-demand instances in the Amazon EC2 cloud [20], with the result that a cloud-based solution would be considerably more expensive. That estimation motivated us to propose a model for computing more accurate computing costs [21]. The resulting model was very limited because it considered neither the cost of data storage and transfer nor the cost of managing the application's life-cycle when deploying it in an execution engine. These costs can have a significant impact on the total cost of executing complex applications, such as the annotation service considered in this paper. From a business point of view, most companies do not have access to grid environments. However, the pay-per-use cloud is a feasible way of obtaining computing resources as they are needed. This is the reason for the present work: could the (massive) annotation process be offered as an affordable service using cloud resources?
To answer this question, Amazon AWS [22] was chosen because of its maturity and stability as a commercial cloud and its rich catalog of facilities for the deployment of services and computing instances.

2.2. Life-cycle of a service request

Let us now depict the life-cycle of a service request for the annotation service. Figure 1 sketches the main phases of such a request. First, the customer negotiates the time and price constraints with the service at the Negotiation phase. The service will reach an agreement with the customer, and accept a request, if it is possible to reserve a set of cloud resources to solve it according to the constraints established by the customer. As a part of this decision, the service computes the cheapest combination of cloud resources that will be provisioned.

Figure 1. Phases of a service request.

After that, the Configuration phase requests the computing, storage and network resources from the provider. Once they are available, the service deploys the annotation application at the Execution phase. A Watching process is conducted in order to monitor and track the usage of the cloud resources and their derived costs. Finally, once the annotation process has been completed, the cloud resources are released and the results and the service invoice are sent to the client at the Completion phase.

2.3. Architectural design

Figure 2 depicts the high-level architecture of the proposed service. The service is offered via a Web services-based interface. A service request consists of two parameters: a URI to the collection of data to be annotated and the maximum time that the client is willing to wait for the service response. These parameters are used by the service to compute the minimum price at which the request can be solved.

Figure 2. Cloud-based architecture.
Internally, the service consists of two layers: the execution layer and the computing infrastructure layer. The execution layer integrates a set of components to support the main phases of a service request: it is responsible for making the correct decisions to meet service agreements, provisioning and releasing the required cloud resources, splitting a service request into a set of executable tasks and, finally, managing and monitoring the execution of tasks on the reserved resources. More specifically, the execution layer consists of the service front-end and a resource management framework. The front-end integrates a Cost-driven decision maker component in charge of evaluating whether the service will be able to successfully respond to the user request (considering the customer constraints). The Workflow execution environment implements the service execution logic. It first notifies the framework of the required computing and storage resources. Then, it creates a set of executable tasks and submits them to the framework for execution. Finally, it waits for the results of the submitted tasks. The description of the required resources, the tasks, and the partial and final results are stored in the message bus component. Before starting the execution of tasks, the framework must configure itself. The Self-configuration component reads the description of the required resources from the bus and then interacts with the mediation layer to create a mediator, which is responsible for the provisioning of computing, network and storage resources as well as the submission of pending annotation tasks (stored in the message bus). Internally, the mediator also schedules the mapping of these tasks to the provisioned cloud resources. Some additional and more general aspects of the life-cycle of these executable tasks are managed by other management components (scheduling, fault handling or data-movement components, for instance) integrated in the framework.
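The interaction among the workflow, the message bus and a mediator described above can be illustrated with a minimal in-memory sketch. It is not the framework's implementation: Python's `queue.Queue` stands in for the message bus, the names `submit_request`, `run_mediator` and `annotate` are hypothetical, and the default of 10 terms per task is only an illustrative choice.

```python
import queue


def annotate(term):
    # Hypothetical stand-in for the semantic annotation of a single term.
    return f"annotation-of-{term}"


def submit_request(bus, terms, terms_per_task=10):
    """Workflow side: split a request into tasks and put them on the bus."""
    n_tasks = 0
    for start in range(0, len(terms), terms_per_task):
        bus.put(terms[start:start + terms_per_task])
        n_tasks += 1
    return n_tasks


def run_mediator(bus, results):
    """Mediator side: take pending tasks from the bus until none are left."""
    while True:
        try:
            task = bus.get_nowait()
        except queue.Empty:
            break
        # In the real service the task would be submitted to a cloud instance.
        results.put([annotate(term) for term in task])


tasks_bus = queue.Queue()    # pending tasks
results_bus = queue.Queue()  # partial results written back by the mediator
terms = [f"term{i}" for i in range(25)]
n_tasks = submit_request(tasks_bus, terms)
run_mediator(tasks_bus, results_bus)
```

In this toy run the 25 terms are split into three tasks, and the mediator writes one result entry per task back to the results queue.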
The computing infrastructure layer provides the infrastructure and resources that will be used to execute the annotation tasks. This layer may be composed of different and heterogeneous computing infrastructures (grid, cluster or cloud environments, for instance). However, in this paper we concentrate on a cloud perspective. All the components of the framework, the message bus, the management components and the set of mediators, as well as other complementary services to support dynamic balancing, DNS handling, storage support, etc., have been deployed using cloud instances in Amazon EC2. Finally, we would like to highlight the new components that have been developed as a part of this work. In the framework, the self-configuring component and the mediators for the interaction with the cloud provider (including the provisioning, scheduling and usage-monitoring facilities integrated into them) have been added. In the service's front-end, the cost-driven decision maker and the workflow that implements the logic of the annotation process have also been implemented. The new method for minimizing the cost of provisioned cloud resources has been included as a part of the decision maker component. These new components are described in the following sections.

3. ESTIMATING COSTS OF THE CLOUD-BASED SOLUTION

Let us now introduce the cost factors that must be considered in order to estimate the budget for a service request. The price depends on three factors: the cost of executing the framework itself ($cost_{fw}$), the cost of cloud storage services ($cost_{data}$) and the cost of hiring the set of cloud computing instances needed to compute the request ($cost_{instances}$). The sum of these costs defines the total cost of the approach. For clarity, the main variables used in this section are summarized in Table 1.

Table 1. Variables used in the cost estimation.
| Variable | Definition |
| --- | --- |
| $N_{terms}$ | Number of terms to be annotated per request |
| $T_{max}$ | Deadline of the annotation request (hours) |
| $size_{input}$ | Size of input data (GB) |
| $size_{output}$ | Size of results (GB) |
| $n_i$ | Number of type-i instances |
| $t_i$ | Computing hours of type-i instances |
| $tph_i$ | Throughput of type-i instances |
| $cf_i$ | The fixed cost of type-i instances |
| $ch_i$ | The euro/hour cost of type-i instances |
| $size$ | The storage disk attached to each instance (GB) |
| $ct_i$ | The total cost of type-i instances |
| $price_j$ | The euro/hour cost of a type-j service |
| $price_{j,k}$ | The cost of executing a type-k task in a type-j service |

3.1. Cost of the framework

From an architectural point of view, the framework is a middleware layer shared by all customer requests that are being concurrently executed by the service. Therefore, its cost should be proportionally distributed among all these customers. Although some customer-specific costs can be determined with more or less effort, accurately monitoring the resources used by each customer might be expensive and lead to additional costs [23]. For simplicity, in our estimations, we have supposed the worst case from an economic perspective: a framework instance will be deployed for managing each service request. As described, the framework consists of three main types of components: the message bus, a set of management components and a set of mediators.
Equation (1) estimates the cost of executing the framework ($cost_{fw}$). It mainly depends on the number of computing instances needed to execute the different architectural components and on the Amazon services used by the message bus to coordinate these components ($cost_{inst}$ and $cost_{bus}$, respectively):

$$cost_{fw} = cost_{inst} + cost_{bus} \qquad (1)$$

$$cost_{inst} = (ct_{M50} \cdot n_{M50} + ct_{XL50} \cdot n_{XL50} + ct_{mc50} \cdot n_{mc50}) \cdot T_{max} \qquad (2)$$

$$cost_{bus} = (price_{SQS,task} + price_{ELB,task}) \cdot \#tasks + price_{ELB} \cdot T_{max} \qquad (3)$$

First, let us explain the $cost_{inst}$ factor. We use three different types of computing instances in order to execute the architectural components: m3.medium instances (the subscript M50), m3.xlarge instances (XL50) and t2.micro instances (mc50). In [20], we experimentally determined the most adequate types of instances for the different types of components, and also stated that the US West (Oregon) Amazon region was the best candidate. Therefore, all those instances will be hosted in that region and configured to have a local storage capacity of 50 GB. Considering these requirements, their prices per usage-hour have been calculated from the basic prices defined by the cloud provider: $ct_{M50}$ = 0.0642€ per hour, $ct_{XL50}$ = 0.2411€ per hour and $ct_{mc50}$ = 0.0156€ per hour.

The number of instances of each type needed to solve a request depends on the input workload; more specifically, on the number of mediators needed to execute the considered workload. In Section 4, we will determine how to calculate the number of mediators per request. Let us assume for now that this number is known. In particular, two m3.medium instances ($n_{M50}$) are needed for the execution of the management components (two components are currently being used: a fault-tolerance component and a data-movement component), and the number of m3.xlarge instances ($n_{XL50}$) to be provisioned is one instance per mediator. Besides, the bus uses one t2.micro instance for interacting with each mediator and another one to execute its interface ($n_{mc50}$).
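As an illustration of Equation (2), the following minimal sketch computes the cost of the framework instances from the hourly prices and the instance counts given above. The function name `framework_instance_cost` and the example of three mediators over a 30-day deadline are hypothetical; the number of mediators actually required is derived later in the paper.

```python
# Hourly prices (euro) reported in the text for instances with 50 GB of local storage.
CT_M50 = 0.0642   # m3.medium
CT_XL50 = 0.2411  # m3.xlarge
CT_MC50 = 0.0156  # t2.micro


def framework_instance_cost(n_mediators, t_max_hours):
    """Equation (2): cost of the instances that run the framework components."""
    n_m50 = 2                 # fault-tolerance and data-movement components
    n_xl50 = n_mediators      # one m3.xlarge per mediator
    n_mc50 = n_mediators + 1  # one t2.micro per mediator plus one for the bus interface
    hourly = CT_M50 * n_m50 + CT_XL50 * n_xl50 + CT_MC50 * n_mc50
    return hourly * t_max_hours


# ASSUMPTION: three mediators and a 30-day deadline, chosen only for illustration.
cost_30_days = framework_instance_cost(n_mediators=3, t_max_hours=30 * 24)
```

Under these assumed inputs the hourly rate is 0.9141€, so the instance cost over 720 hours is about 658.15€.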
Finally, the usage time in the cost equation is the maximum time defined by the customer, $T_{max}$, in hours. On the other hand, Equation (3) estimates the cost of the bus. The implementation of the message bus is based on the Amazon Simple Queue Service (SQS) and the Elastic Load Balancing (ELB) service. We have calculated the prices of these services per executable task. A task performs 12 SQS requests and requires the ELB to process six messages of 64 kB each. Considering these requirements, $price_{SQS,task}$ is 4.65·10^-6 € per task and $price_{ELB,task}$ is 2.27·10^-6 € per task. Besides, the usage time of the ELB service must be added to these costs per task ($price_{ELB}$ is 0.025€ per hour, with $T_{max}$ being the total time).

3.2. Data related cost

Equation (4) estimates the cost of the data requirements involved in a service request ($cost_{data}$). The Amazon Simple Storage Service (S3) is used to store the request input and output data (the workload and the resulting annotations). Currently, Amazon charges customers for the amount of data stored, the number of write and read requests and the amount of data transferred out of S3. Considering this pricing policy, the first term in the equation corresponds to the cost of storing input and output data. It depends on the data size and the time they are stored (we consider the worst case and, therefore, use the maximum time defined by the customer, $T_{max}$, as well as the total input and output data generated in the request). Next, we consider the cost of the GET requests used to read input data from the computing instances and the cost of the PUT requests used to store the workload in S3 and to save the output data. The total number of requests can be easily calculated, since one GET request is used to read and one PUT request to write every term of the input workload, while one PUT request per ten terms is used to save the output data.
Finally, the last term in Equation (4) refers to the movement of the input data to the computing instances (from S3 to the EC2 service) to process the terms (note that transferring data into S3 is free of charge). These costs will be estimated considering the following Amazon prices: $price_{S3}$ = 0.02327€ per GB-month, $price_{get}$ = 0.0077€ per 10,000 requests, $price_{put}$ = 0.0074€ per 1,000 requests, and $price_{S3 \to EC2}$ = 0.0156€ per GB, respectively:

$$cost_{data} = (size_{input} + size_{output}) \cdot T_{max} \cdot price_{S3} + N_{terms} \cdot price_{get} + \left(1 + \frac{1}{10}\right) \cdot N_{terms} \cdot price_{put} + size_{input} \cdot price_{S3 \to EC2} \qquad (4)$$

As stated before, data related costs have been estimated assuming a worst-case scenario where all data are stored during the whole request. Obviously, in a real scenario, the cost would be lower, since output data are generated progressively and input data can be deleted once they have been used. Furthermore, compression and other techniques could be used in order to reduce the data stored and transferred, as well as the number of requests performed.

3.3. Minimization of the cost of computing instances

In [20] we concluded that, for the considered problem, the cost of computing instances represented more than 90% of the total cost, whereas the infrastructure management and data related costs corresponded to the remaining 10%. Therefore, we aimed at minimizing it, considering the wide range of computing resources and purchasing models offered by the cloud provider. The total price will depend on the number of hired instances and the time they are used. To compute the cost of a given annotation request, we should consider its workload (the number of terms to be annotated) as well as the deadline imposed by the customer. The cost of the computing instances depends on two factors. First, the cost of the instance itself, which depends on the selected purchasing model. Second, the cost of the elastic block storage (EBS) attached to each instance.
The EBS cost depends on the size of the attached volume and the time the instance is running. In that respect, the computing instances require volumes of 70 GB [20]. The current price offered by Amazon for the EBS volumes is 0.088€ per GB-month. From the point of view of the computing instances, we are interested in obtaining the number of terms per time unit that each type of instance can process (which is related to how powerful the instance is). In order to measure the computational power of the instances, 10 000 terms were processed on different instance types, deploying as many jobs as the number of processors in the instance. The mean execution time was then used as an indicator of the computational power of each instance type considered. Table 2 summarizes, for each instance type, its number of processors, the mean time to annotate a term observed in the experiments (including data-movement and other management delays) and the resulting computational power, expressed as the number of terms that each instance can process in an hour.

Table 2. Evaluation of the time required to annotate a term using different Amazon EC2 instances (prices as of 1 November 2016).

| Instance type | m3.xlarge | m3.2xlarge | i2.xlarge | i2.2xlarge | m4.xlarge | m4.2xlarge | c4.xlarge | c4.2xlarge |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Number of cores | 4 | 8 | 4 | 8 | 4 | 8 | 4 | 8 |
| Execution time (sec/term) | 38.04 | 37.69 | 33.80 | 33.04 | 36.37 | 35.99 | 30.72 | 29.85 |
| Computational power (terms/hour) | 378.54 | 764.12 | 426.03 | 871.67 | 395.91 | 800.23 | 468.68 | 964.86 |

Obviously, the total cost of each instance will depend on the selected purchasing model [22] (on-demand or reserved). Intuitively, reserved instances would be a bad solution for short deadlines, while they would be preferable for long deadlines (the effective hourly price is lower in those cases). So, the deadline must be considered. In that respect, we have considered the on-demand purchasing model and the two models offered by Amazon for reserved instances, where instances can be hired for 1 or 3 years under three payment methods: All Upfront, Partial Upfront and No Upfront (the last one only for 1-year reserved instances). Therefore, these purchasing options give up to 8×2×2+8+8=48 logical instances (eight instance types under two reservation terms and two upfront payment methods, plus eight 1-year No Upfront options and eight on-demand options). The aim of considering both purchasing models is to show that the proposed model is flexible enough to support changes in purchasing models and can be used with different providers. Once the cost associated with the computing instances is obtained, we are interested in selecting the best combination of instances that allows us to meet the temporal requirements of the request. For that, the problem is defined as a quadratic function where there are two sets of variables.
One set corresponds to the number of logical machines (the term 'logical' has been defined as an instance of a specific type, located in a specific region and hired according to a purchasing model and a payment option) that will be used to solve the annotation problem. The second set of variables establishes, for each one of the machines, how long it will have to be computing. The objective will be to minimize the cost while ensuring that the whole set of terms will be annotated and the deadline will be met. The problem is shown in Equation (5). It belongs to the class of MINLP and includes the following elements:

As stated, we are considering that there are $LI = 48$ different types of logical instances.

Let $N = (n_1, n_2, \ldots, n_i, \ldots)$ be the integer variables indicating how many instances of each logical machine will be engaged in solving the problem and $T = (t_1, t_2, \ldots, t_i, \ldots)$ the time, in hours, the instances will be used.

Let $T_{max}$, in hours, be the deadline provided by the customer and $N_{terms}$ the number of terms to be annotated.

Let $CH = (ch_1, ch_2, \ldots, ch_i, \ldots)$ be the real vector of euro/hour costs and $CF = (cf_1, cf_2, \ldots, cf_i, \ldots)$ the real vector of fixed costs (euro). Fixed costs include the upfront cost and the monthly cost, if any.

Let $TPH = (tph_1, tph_2, \ldots, tph_i, \ldots)$ be the real vector of the throughput of each instance type, in terms/hour: $tph_i$ indicates how many terms instance type $i$ processes per hour.

Let $size$, in GB, be the size of the disk attached to each instance (in this problem this value is 70 GB) and $c_{EBS}$, in euro per GB per hour, the cost per hour of the local storage.

$$
\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^{LI} \big( (size \cdot c_{EBS} + ch_i) \cdot t_i + cf_i \big) \cdot n_i \\
\text{subject to} \quad & \sum_{i=1}^{LI} t_i \cdot n_i \cdot tph_i \ge N_{terms} \\
& 0 \le n_i, \quad i \in \{1..LI\} \\
& 0 \le t_i \le T_{max}, \quad i \in \{1..LI\} \\
& t_i \le 24 \cdot 365, \quad i \text{ corresponds to Res-1-year inst.} \\
& t_i \le 24 \cdot 3 \cdot 365, \quad i \text{ corresponds to Res-3-year inst.}
\end{aligned}
\qquad (5)
$$

Constraints $t_i \le 24 \cdot 365$ and $t_i \le 24 \cdot 3 \cdot 365$ impose that reserved instances cannot be used for longer than the period for which they were hired.
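To make the structure of Equation (5) concrete, the sketch below evaluates its objective and checks its constraints for a given candidate solution. It is only an evaluator, not the AMPL model or a solver. The class and function names are hypothetical, the hourly price of 0.5€ is a placeholder rather than an Amazon price, and the conversion of the EBS price from GB-month to GB-hour assumes a 30-day month.

```python
from dataclasses import dataclass


@dataclass
class LogicalInstance:
    """One logical instance type i together with its decision variables."""
    n: int                           # n_i: number of instances hired
    t: float                         # t_i: hours each instance runs
    ch: float                        # ch_i: euro/hour cost
    cf: float                        # cf_i: fixed cost in euro (upfront and monthly)
    tph: float                       # tph_i: throughput in terms/hour
    max_hours: float = float("inf")  # 24*365 or 24*3*365 for reserved instances


def total_cost(instances, size_gb, c_ebs):
    """Objective of Equation (5): sum of ((size*c_EBS + ch_i)*t_i + cf_i)*n_i."""
    return sum(((size_gb * c_ebs + x.ch) * x.t + x.cf) * x.n for x in instances)


def is_feasible(instances, n_terms, t_max):
    """Constraints of Equation (5): workload covered and time bounds respected."""
    capacity = sum(x.t * x.n * x.tph for x in instances)
    bounds_ok = all(
        x.n >= 0 and 0 <= x.t <= min(t_max, x.max_hours) for x in instances
    )
    return bounds_ok and capacity >= n_terms


# ASSUMPTION: 0.088 euro per GB-month converted with a 30-day month.
C_EBS = 0.088 / (30 * 24)
# ASSUMPTION: ch=0.5 is a placeholder price; tph is the c4.2xlarge value of Table 2.
candidate = [LogicalInstance(n=2, t=100.0, ch=0.5, cf=0.0, tph=964.86)]
cost = total_cost(candidate, size_gb=70, c_ebs=C_EBS)
```

A solver would search over the `n` and `t` fields of every logical instance; the evaluator only makes explicit which quantities enter the cost and which enter the constraints.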
As a result, the MINLP problem (Equation (5)) returns the combination of instances and the time each instance must be running to meet the problem constraints and requirements with a minimal cost. The problem has been programmed using the AMPL language and executed by the filterSQP solver and the NEOS server (Network-Enabled Optimization System) [24].

3.4. The computing cost of some annotation requests

Let us for instance consider the case of processing 10 million terms in 30 days. The minimization problem gives as a result that the terms can be annotated with a minimum computing instances cost of 3,447.27€. The combination of logical instances that achieves that cost is 43 c4.2xlarge on-demand instances running for 10.04 days (241 h) and three c4.xlarge on-demand instances running for 1 h. The total capacity hired will process up to 10,000,260 terms.

As a second example, let us consider the case of processing 150 million terms (the complete Universia dataset) in 540 days. The minimization problem gives as a result that the terms can be annotated with a minimum computing instances cost of 33,216.59€. The solution corresponds to a combination composed of one c4.xlarge 1-year reserved (All Upfront) instance running for 365 days (8760 h), 17 c4.2xlarge 1-year reserved (All Upfront) instances running for 365 days (8760 h) and eight c4.2xlarge on-demand instances running for 286 h each (2288 h in total). The total capacity hired will process up to 150,000,399 terms.

Finally, a third example could be processing the same dataset of 150 million terms but in 2 years (730 days). For that, the result of the minimization problem is to use nine c4.2xlarge 3-year reserved (All Upfront) instances running for 2 years. With that configuration, the computing instances cost would be 31,126.24€ and the total capacity hired would process up to 152,139,274 terms.
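The workload constraint of Equation (5) can be checked for the three reported solutions with a few lines of arithmetic. The sketch below uses the throughput values of Table 2; the helper name `capacity` and the variable names are hypothetical. Because the table values are rounded, the computed capacities are not expected to match the reported figures exactly.

```python
# Throughput in terms/hour, taken from Table 2.
TPH = {"c4.xlarge": 468.68, "c4.2xlarge": 964.86}


def capacity(solution):
    """Left-hand side of the workload constraint: sum of n_i * t_i * tph_i."""
    return sum(n * hours * TPH[itype] for itype, n, hours in solution)


# Each entry is (instance type, number of instances, hours per instance).
thirty_days = [("c4.2xlarge", 43, 241), ("c4.xlarge", 3, 1)]
universia_540_days = [
    ("c4.xlarge", 1, 8760),
    ("c4.2xlarge", 17, 8760),
    ("c4.2xlarge", 8, 286),
]
two_years = [("c4.2xlarge", 9, 2 * 8760)]

cap_30 = capacity(thirty_days)
cap_540 = capacity(universia_540_days)
cap_730 = capacity(two_years)
```

With these inputs each capacity exceeds the required number of terms (10 million, 150 million and 150 million, respectively) and comes out within a few hundred terms of the capacity reported in the text.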
Therefore, the best price for the given deadline could be reached with a combination of logical instances providing computing capacity for more than the required terms (in this example there is an excess of more than 2 million terms). This excess could be reused to serve other requests and, thus, it could have an effect on the final cost proposed to the customer. The solution proposed here is quite simple, yet effective. However, as stated, the performance of a given virtual resource for a given task is just an estimation with at least two varying aspects: the cost of executing a task has been estimated experimentally, and the performance of instances of a given virtual machine type can also vary. Therefore, at system execution time, one could adopt a more flexible approach, as proposed in [14, 15], monitoring the evolution of the execution so as to hire or free (on-demand) resources with the aim of meeting the deadline constraint.

4. SELF-CONFIGURATION OF THE FRAMEWORK

At the configuration phase, the resource management framework must be parametrized to provision, use and release the required instances. As a part of this work, we have integrated a new self-configuring mechanism able to manage the life-cycle of cloud instances. The scalability of the framework must be considered in order to take the appropriate configuration decisions.

4.1. Design of the configuration mechanisms

Figure 3 shows the components involved in the configuration of the execution environment. When a request is accepted, the service workflow puts, into the message bus, a set of task execution calls as well as a configuration message describing the computing requirements (the number, type and region of the computing instances as well as the purchasing model that should be used to provision them). These computing requirements are the result of the negotiation phase.

Figure 3. Components involved in the configuration phase.

The Self-configuring component takes configuration messages from the bus. Once a message is taken, the component configures the framework for the executable tasks involved in the request. Internally, a manager determines the most adequate configuration option considering a set of scalability rules. These rules have been experimentally established (they will be described in the next subsection). One of those rules determines the number of mediators that will be needed for managing the involved computing instances and submitting executable tasks to them. The manager interacts with the Amazon Web Services (AWS) to hire one computing instance per mediator. Each mediator is deployed from a specific Amazon Machine Image (AMI) that we have built for the mediators. Once the mediators are deployed, the manager sends each one of them a description of the computing resources it must hire and locally manage (the way of distributing the computing capacity among all the mediators is also determined by the scalability rules). Internally, the Controller of a mediator is responsible for provisioning the corresponding computing instances and storage resources via the AWS interface. All these resources are locally managed by the Job Manager in order to execute pending tasks. This component takes a pending task from the bus, schedules its execution and submits it to the selected computing instance; then, when the task execution has finished, the Job Manager receives its results and writes them into the bus. Besides, a Fault handling mechanism has been integrated into the mediator to recover from possible execution faults. Simultaneously, the availability of cloud instances, the state and cost of executing tasks and other operational parameters are monitored by the Monitor component.
This component has been implemented using the Amazon CloudWatch service, and integrates a data collector that is used to calculate the real cost of cloud resources.

4.2. Framework scalability

As previously stated, the scalability of the framework must be studied in order to find the most appropriate configuration options and to analyse its performance from the service perspective. On the basis of our experience, the two architectural components that must be studied in terms of scalability are mediators and the message bus. From the mediation point of view, we are interested in identifying the number of computing instances that a mediator is able to handle efficiently. Regarding the message bus, its storage capabilities and the performance of its input/output operations must be studied.

4.2.1. Configuring the computing capacity of a mediator

The goal and the requirements of this first experiment are:

Goal: To determine the time required by a mediator to handle an annotation task depending on the number of computing instances it manages.

Configuration of used resources: The experiments were performed in the AWS EC2 Oregon region. An m3.xlarge instance (4 vCPUs, 15 GB RAM) was hired for each mediator and the annotation algorithm was executed by a set of between 1 and 400 m3.large instances (2 vCPUs, 7.5 GB RAM). In order to provide persistent block storage, a local Amazon Elastic Block Store (EBS) disk of 70 GB was attached to each of these computing instances. These instances are launched from the custom AMI that was previously detailed.

Input workload: A bag of annotation tasks reused from previous executions was executed. The number of tasks is enough to have the computing instances continuously working. The mean execution time of each task is near 45 min (10 terms are annotated per task).

A mediator creates a dedicated SSH connection to interact with each provisioned cloud instance it is in charge of.
This connection is used to configure the execution environment of the instance, to submit the tasks to be executed to it and, then, to receive the corresponding results. To experimentally estimate the number of computing instances that a mediator can manage, we have hosted a mediator in an m3.xlarge instance and analyzed its behavior with respect to the number of instances.

Figure 4 shows the mean time required by a mediator to handle a task request depending on the number of computing instances it handles. This time includes looking for an available computing instance, sending a task to the selected instance and recovering the results. As expected, when the number of handled instances increases, the required time also increases because of the management overheads. Besides, these experiments have shown that when the number of concurrent SSH connections is higher than 150, the number of connection faults significantly increases. Taking that into consideration, Figure 5 shows the throughput of a mediator against the number of instances it manages. The throughput increases in a linear way until 100 instances, showing little improvement beyond that limit.

Figure 4. Mean time required for managing a task request versus the number of instances.

Figure 5. Throughput versus the number of instances.

Considering these experiments, we have established that the maximum number of instances per mediator will be 150. This decision seems to be contradictory: the behavior of a mediator with a lower number of instances is better. Nevertheless, we have also considered other relevant issues for this decision. First, we try to minimize the cost of executing the mediators (the usage cost of a mediator is 173.60€ per month).
If 150 computing instances are required during 6 months in order to respond to a request, it is six times cheaper to have one mediator with 150 instances than six mediators with 25 instances (1,041.60€ versus 6,249.60€ in mediator costs). Second, the execution time of a small-sized annotation task is near 45 min. Therefore, the time required by a mediator to handle its submission is insignificant and the management overheads have little impact on the total time (near 2 s when the mediator integrates 150 instances or near 0.8 s when it integrates 25 instances). Third, the current implementation of the mediator is able to handle that number of SSH connections. And, finally, from the perspective of our application domain, more than 150 instances are required only when the customer’s deadline is very short.

4.2.2. Performance of an SQS-based message bus

The goal and the requirements of the second experiment are:

Goal: To determine the time of reading/writing a task from/into the bus depending on the number of connected clients.

Configuration of used resources: The experiments were performed in the AWS EC2 Oregon region. The bus was deployed using the Amazon Simple Queue Service (SQS) and configured to have a simple Amazon Elastic Load Balancer and a maximum of five request managers. t2.micro (1 vCPU, 1 GB RAM) instances were hired to execute each one of these managers. On the other hand, clients were executed in m3.large (2 vCPUs, 7.5 GB RAM) computing instances. These instances are launched from the custom AMI that was previously detailed.

Input workload: A bag of simulated tasks was programmed. The mean execution time of these tasks is less than one second because the goal is to maximize the flow of tasks between the bus and its clients.

In [25], we presented an implementation of the message bus based on the Amazon Simple Queue Service (SQS).
This service guarantees the proposed cloud-based bus to be highly available, scalable and reliable, with an extensible storage capacity for messages, proving to be an appropriate solution for computing-intensive problems. Let us now evaluate the performance of the bus from the perspective of this work. We are interested in studying the performance of its input/output operations (submit/read a task into/from the bus) versus the number of requests that are being concurrently processed.

First, we study the mean time of reading a task from the bus in relation to the number of mediators that are being executed. In this experiment, each mediator is continuously taking tasks from the bus. Figure 6 shows that the time of reading a task remains more or less constant (between 30 and 35 ms) independently of the number of mediators that are concurrently accessing the bus.

Figure 6. Mean time of reading a job from the bus versus the number of deployed mediators.

On the other hand, for each customer request, the annotation service creates a workflow which submits a set of executable tasks to the framework. This submission consists of writing into the message bus the description of each task. In a first experiment, we have concluded that the best throughput is reached when the workflow consists of 25 threads submitting tasks to the framework (more specifically, the throughput is 76 tasks per second). In a subsequent experiment, we have studied the mean time of writing a task into the bus with respect to the number of workflows that are being executed at the service level. Figure 7 shows that the mean time that a thread needs to write a task into the bus remains more or less constant (between 245 and 260 ms) independently of the number of workflows.

Figure 7. Mean time that a thread needs to write a task into the bus depending on the number of workflows.

Finally, on the basis of the previous experiments, we would like to analyse whether our decisions are compatible. In other words, is a workflow composed of 25 threads able to provide enough executable tasks to keep all the computing instances managed by the mediators busy? The throughput of that workflow is near 100 task submissions per second. The throughput of a mediator that integrates 150 instances is 66 task readings per second in the best case (assuming that the execution time of a task is zero seconds). Therefore, a workflow could provide tasks to keep busy more than 225 cloud instances in the worst case (100/66 × 150 ≈ 227). Nevertheless, in a real scenario, the execution time of each annotation task is near 45 min and, therefore, this type of workflow would be able to submit tasks for a significantly higher number of computing instances.

As a result of the experiments, we can conclude that using workflows with 25 threads, the proposed cloud-based bus, and the creation of mediators composed of a maximum of 150 instances is a good parametrization to successfully provide the proposed service.

5. RELATED WORK

In this section, two different types of research proposals are discussed. First, we review economic models to estimate the cost of executing an application in the cloud. These models require that users decide the cloud resources to hire for deploying their applications. On the other hand, instead of estimating the execution cost, other proposals directly determine the most economical combination of cloud resources needed to execute an application.

5.1. Estimation of the execution cost of cloud-based applications

Experimentation-based techniques can be used to estimate the cost of executing an application on a given cloud (a use-case of these techniques was presented in [26], for instance). The main disadvantage of these techniques is the price one has to pay for the execution of the experiments (generally, the validity of the estimations depends on the money spent on the experiments).

The definition of cost models is an alternative approach. These models help customers in deciding what parts of their applications should be executed in the cloud, and when. In general, the proposed approaches [3, 4, 5] require that customers know or estimate the requirements of their applications, such as execution times, input and output data size, or storage requirements. Once known, those operational parameters are mapped onto the basic prices provided by cloud providers in order to estimate the execution costs. As a particular case, in [6], the authors discuss the need for defining different cost formulas for families of applications, and propose a set of formulas for sequential, multi-thread, parallel or MPI programs and workflows. Those formulas are used for the development of a service able to capture the operational information of an application and estimate its cost.

In some cases, it is not easy to know or determine the application requirements in order to estimate its execution costs. CloudTracker recommends to scientists the type of computing instance that must be hired to execute and replay their large-scale experiments [27]. When an experiment is executed for the first time, the system tracks it and stores the information needed to replay it in the future. CloudTracker uses this information to automatically run the experiment in different types of instances and to determine the cheapest and the fastest execution option.
These options will help scientists to decide the instances to hire each time the experiment is replayed. Besides, CloudTracker determines the cheapest time instant to store the experiment’s results, so that the same computations need not be repeated.

Finally, [28] presents a model to decide where to place a set of services on a federated hybrid cloud. The model estimates the costs of using an in-house cloud-enabled data center (private cloud) and different public clouds. In the first case, the cost factors are related to electricity consumption, software licenses, hardware maintenance and the equipment required by the data center; while in the case of public providers, the factors involve the hiring of computing instances and the data transfer between private and public resources. Then, this estimation model is integrated in an optimization algorithm to evaluate the different options of service placement and to determine the cheapest deployment.

5.2. Minimization of the cost of cloud resource provisioning

A different approach is adopted by those works that look for a function describing the cost in terms of the tasks to be executed, the resources proposed by the service providers and the time constraints, with the goal of optimizing some given parameters. These approaches usually focus on IaaS clouds (Infrastructure as a Service). Using different techniques, they try to find the best combination of cloud resources that should be provisioned for the execution of an application. This combination is computed considering the various types of resources offered by the cloud providers, the application requirements as well as the user requirements (mainly, the user’s budget and deadline). Most of those solutions are integrated into scheduling algorithms for IaaS clouds, and consider the possibility of scaling the provisioned resources to manage the uncertain behaviors of the executed applications.
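To make the kind of problem addressed by these works concrete, the following sketch enumerates combinations of instance counts for a toy catalog and keeps the cheapest one that processes the workload before the deadline. Everything in it is an assumption made for illustration: the catalog entries, their prices and throughputs are hypothetical, every hired instance is assumed to run for the whole deadline, and exhaustive search merely stands in for the optimization techniques (MINLP, ILP, heuristics) used in the surveyed papers.

```python
from itertools import product
from math import ceil

# Hypothetical catalog: name -> (price per hour, tasks processed per hour).
CATALOG = {
    "small": (0.10, 40),
    "large": (0.19, 80),
}


def cheapest_plan(workload, deadline_hours, catalog=CATALOG):
    """Return (cost, counts) of the cheapest mix that meets the deadline.

    Simplification: every hired instance runs for the whole deadline.
    """
    names = list(catalog)
    # Upper bound per type: enough instances of that type alone to finish.
    limits = [ceil(workload / (catalog[n][1] * deadline_hours)) for n in names]
    best = None
    for counts in product(*(range(limit + 1) for limit in limits)):
        rate = sum(c * catalog[n][1] for c, n in zip(counts, names))
        if rate * deadline_hours < workload:
            continue  # this mix cannot finish before the deadline
        cost = sum(c * catalog[n][0] for c, n in zip(counts, names))
        cost *= deadline_hours
        if best is None or cost < best[0]:
            best = (cost, dict(zip(names, counts)))
    return best


cost, counts = cheapest_plan(workload=10_000, deadline_hours=10)
print(counts, round(cost, 2))
```

With these made-up figures the search mixes both instance types rather than using only the type with the lowest cost per task, which is the reason why the surveyed methods treat the instance mix as a decision variable.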
Let us introduce in Table 3, a taxonomy of some important works related to the optimization of resource provisioning in cloud systems. For each method, we are going to consider the following classification criteria: the input constraints, the different costs and provider options considered, the uncertainty of the provided provisioning plans, the concrete technique used to compute the optimal resource combination, the type of application the method focuses on and, finally, whether the method is integrated into an application engine.

Table 3. Comparative analysis of the methods for minimizing the costs of provisioning. Columns [7] to [15] are the papers about the minimization of costs in cloud.

| Criteria of the taxonomy | Our approach | [7] | [11] | [8] | [9] | [10] | [12] | [13] | [14] | [15] |
|---|---|---|---|---|---|---|---|---|---|---|
| Input constraints | Workload. Deadline | Workload. Deadline. Max.Resources | Workload. Deadline. Budget | Workload. Deadline | Workload. Deadline | Workload. Deadline | Workload. Deadline | Deadline. Budget | Num. VM per VM class | Workload. Deadline |
| The cost of cloud services | | | | | | | | | | |
| Cost of computing | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Cost of local storage | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ |
| Cost of data services | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| The options offered by cloud providers | | | | | | | | | | |
| Different computing resources | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ |
| Different payment models | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ |
| Different service providers | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✓ | ✗ |
| The uncertainty of provisioning plans | | | | | | | | | | |
| Effects of virtualization | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | ✗ |
| Reconfigure the provisioning | ✗ | ✗ | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | ✓ |
| Type of technique | MINLP | BLP | Provisioning & Planning | Heuristic | Optimization model | Heuristic | Meta-heuristic | Provisioning & Planning | Stochastic ILP | ILP |
| Type of application | Bag-of-task | Bag-of-task | Workflow | MapReduce | Independent jobs | Workflow | Workflow | Independent jobs | Class of VM | Independent jobs |
| Applicability of the technique | | | | | | | | | | |
| Integrated into an app. engine | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Integrated into a scheduling alg. | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ |
| Integrated with scaling | ✓ | ✗ | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ |

All the considered approaches minimize execution costs from the application’s users point of view. The provider side has also been studied with the goal of reducing the operating costs of running their infrastructures [29, 30]. However, this perspective is out of the scope of this paper. Let us now discuss some important aspects of the classification.

5.2.1. Input constraints

Most proposals require the application workload and the execution deadline as input parameters. References [11, 13] also require the user’s budget because their strategies of cost reduction consist of hiring the maximum number of computing resources and, then, scaling down these resources according to how well they are used by the application.

5.2.2. Costs of cloud services

All the approaches are mainly concerned with reducing the costs of computing resources.
Nevertheless, these minimization methods also compute provisioning plans for the execution of applications handling high volumes of data (Bag-of-Task applications, scientific workflows or MapReduce systems, for instance). In data-intensive applications, the cost of data storage and transfer can be an essential part of the total cost and, therefore, it should not be ignored. As an exception, [7, 14] include the cost of local storage devices, while [7, 9, 12] estimate the cost of data transfers.

As stated, we consider the inclusion of data storage and movement an important aspect for the case of Bag-of-Task applications. In the optimization problem proposed here such costs are considered (although in a nonlinear way). Let us explain by means of a concrete example why it is important. We have estimated the cost of solving the 150 million terms problem in 540 days as 33,216.59€, including the 8760 h of storage required. If we solve the problem without considering data storage (by just removing the time variables in Problem 5, in which case the problem is similar to the one proposed in [15]), the cost will be 149.98€ less. It is important to remark that integrating storage costs into the optimization problem is necessary. Otherwise, one would have to pay for the storage during the complete (optimal) machine hiring time, which gives a worse solution, since a machine may be needed for only a part of its hiring time.

5.2.3. Options offered by providers

Cloud providers offer different types of virtual computing resources with different prices. Many approaches ignore this variety of resources and calculate their provisioning plans considering a unique type of resource [9, 10, 11, 13, 14]. On the other hand, providers also offer different payment options.
For instance, in Amazon EC2, there are three ways to pay for computing instances: On-demand, Reserved instances, and Spot instances (other providers, such as Google Cloud Platform, offer similar payment models). In many cases a suitable choice of the payment method can involve an important saving of execution costs. This possibility has been considered in the proposed solution and in the ones in [9, 14]. The classification also considers whether the optimization methods could be reused to compute the requirements of an application in different providers or in an execution environment integrating multiple providers.

5.2.4. Uncertainty of provisioning plans

Many approaches have assumed that the performance of computing resources of the same type is homogeneous. Nevertheless, [31–33] demonstrate that this assumption is not true due to the shared nature of the cloud infrastructure as well as the use of virtualization techniques and the heterogeneity of the underlying hardware. The case of the Amazon EC2 infrastructure was studied in [31], concluding that the performance of a virtual instance is relatively stable while the performance of multiple virtual machines of the same type is rather heterogeneous. Additionally, another issue that the models usually ignore is the start-up latency of cloud computing instances. This latency has been reduced in cloud environments by restoring previously created virtual-machine snapshots with fully initialized applications [34] or by reusing the purchased computing instances for the execution of new applications [35]. Table 3 shows the approaches that consider the performance uncertainty of resources and are able to reconfigure their provisioning decisions according to these changing conditions.

5.2.5. Optimization technique

Different techniques have been used to determine the cheapest combination of cloud resources that should be provisioned for the execution of a given application.
Nevertheless, Integer Programming Problems (IPP) [14, 15] (or Binary Linear Programming problems [7], as a particular case of IPP), heuristic techniques [8, 10, 12] and optimization models [9] are usually employed. In contrast with these solutions, [11, 13] propose two different algorithms to manage the cost-driven provisioning and plan the application’s execution.

5.2.6. Type of applications

These methods reduce the cost of executing highly parallel applications, such as Bag-of-Task (BoT) applications [7], scientific workflows [10, 11, 12], MapReduce applications [8] or independent jobs [9, 13, 15] (these could be considered a particular case of BoT application where the jobs’ requirements are heterogeneous). The interest in those classes of problems is due to the fact that the cloud has proved to be an extremely efficient environment for executing them. On the other hand, [14] minimizes the cost of provisioning different classes of virtual machines (VM). The authors assume that a VM class corresponds to a concrete type of application (a Web server, for example) and, therefore, the provisioning is planned for the execution of multiple instances of different (and simple) applications.

5.2.7. Applications and execution engines

Another relevant criterion is the real applicability of the optimization method. The method described in this paper and the one proposed by [8] have been integrated into an engine for the execution of applications. In our case the engine is a part of a service-oriented framework described in Section 2. Reference [8] integrates its solution into a service, called Cura, specialized in executing MapReduce applications. Both engines are able to manage the scaling features provided by the cloud system in order to change the combination of resources computed by their optimization method.
For the framework described in this paper, the adaptation of the hired resources to the application’s requirements is very important because we do not consider the uncertainty of plans as a part of the optimization method. The rest of the considered approaches focus on combining their provisioning methods with scheduling algorithms, but they are not subsequently integrated into execution tools.

The presented analysis emphasizes the relevant differences of the proposed method for cost-driven provisioning with respect to the existing approaches. As shown in the second column of Table 3, our approach is the only one considering the cost of data storage and transfer, as well as the different options for provisioning and payment offered by public cloud providers (see the cost of cloud services and the options offered by providers criteria, respectively). The first issue can have a significant impact on the total cost, mainly in the case of data-intensive applications. On the other hand, an adequate choice of the instance types and payment models can also lead to important savings in the execution costs. The fact that the proposed provisioning method can be applied in a natural way to different providers can help in obtaining lower costs since a wider set of resources can be considered. Another relevant contribution is that the method has been integrated into an execution framework. This is important since it demonstrates the applicability of the proposed solution. The hiring of resources is now driven by the cost and the user’s deadline.

As Table 3 shows, the main limitation of the proposed approach is the lack of a way of dealing with the uncertainty of provisioning plans, which comes from the uncertainty of the performances of different instances of the same instance-type.

6. CONCLUSIONS AND FUTURE WORK

We have presented the case study of the migration of a semantic-annotation service to the Amazon cloud. The aim was to offer the service as a pay-per-use application.
We estimated the costs of deploying the service and concluded that the resources hired for solving the computation part of the problem were the most expensive ones, much more so than the resources required for the necessary deployment infrastructure. We have then concentrated on defining a method for minimizing that cost considering the time constraints imposed by the client and the variety of resources and facilities offered by public cloud providers (the wide catalog of resources and services, the different ways to hire and pay for them, the possibility of hiring resources in different geographical areas, etc.). Unlike other existing approaches, the proposed method also considers other cost factors involved in the execution of a service request: the cost of data management (data storage as well as input/output operations) and the cost of the framework components that are needed to control the execution of each service request, for instance.

On the other hand, from an execution point of view, the service is also capable of dynamically managing the life-cycle of the involved computing instances (provision, use and release) and storage resources. For that, a new extended version of our previous management framework has been developed so as to include cloud-based management and mediation components (more specifically, a mediator for the integration of cloud environments and a component for self-configuring the framework in cloud environments).

The proposed methodology has also been applied to study the cost of executing the semantic annotation process in the Microsoft Azure cloud. This case study tried to demonstrate that the approach can be used on different cloud service providers. The performance of Azure computing instances has been experimentally estimated from the point of view of the annotation process, and a cost-driven provisioning plan has been computed for some of the cases of Amazon-based processing.
We have concluded that the method is a sufficiently general solution and, therefore, it could be applied to different public providers.

As future work, we are interested in extending and improving monitoring aspects, which could be useful for improving both cost estimations and the provisioning process [3]. We are also interested in integrating into the framework mechanisms for sharing computing resources between requests that are being concurrently executed, as well as auto-scaling mechanisms for monitoring and managing the changing workload of these requests [36, 37].

On the other hand, as stated in Section 5, the framework lacks a way to deal with the uncertainty generated by the different performances of different instances of the same instance-type. This is an important aspect to be considered in the near future. The wide variety of, and rapid changes in, the available resources make it more interesting to implement a monitoring system that compares, at certain points in time, the number of tasks actually executed with the number of tasks that should have been executed. This way some additional resources could be provisioned with the aim of reaching the established deadline (or, alternatively, some resources could be released).

Finally, we would like to address some open issues closely related to the Amazon instances: new types of computing instances are being continuously added to the Amazon instance catalog, whose performance/cost ratios and purchasing options should be dynamically analyzed and integrated (spot instances [38] or dedicated instances, for instance); in the same way, obsolete types should be removed from the problem formulation. Another line of research must point towards refining the business model of the annotation service from the provider and user perspective [39].
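The deadline-monitoring idea mentioned in this section (comparing the tasks actually executed with the tasks that should have been executed, and then provisioning or releasing resources) could be sketched as follows. This is a hypothetical illustration, not part of the framework: the function names, the assumption of linear progress towards the deadline and the per-instance throughput parameter are all introduced here for the example.

```python
from math import ceil


def progress_deviation(total_tasks, done_tasks, elapsed_hours, deadline_hours):
    """Tasks ahead (positive) or behind (negative) a linear schedule."""
    expected = total_tasks * elapsed_hours / deadline_hours
    return done_tasks - expected


def instance_adjustment(total_tasks, done_tasks, elapsed_hours,
                        deadline_hours, running_instances,
                        tasks_per_instance_hour):
    """Instances to add (positive) or release (negative) to meet the deadline.

    Assumes every instance sustains the same, known throughput.
    """
    remaining_hours = deadline_hours - elapsed_hours
    if remaining_hours <= 0:
        raise ValueError("the deadline has already been reached")
    remaining_tasks = total_tasks - done_tasks
    required = ceil(remaining_tasks /
                    (tasks_per_instance_hour * remaining_hours))
    return required - running_instances


# Halfway to a 10 h deadline with 400 of 1000 tasks done on 10 instances.
print(progress_deviation(1000, 400, 5, 10))          # behind schedule
print(instance_adjustment(1000, 400, 5, 10, 10, 10))  # instances to add
```

In practice the throughput per instance would itself have to be measured, since the text notes that instances of the same type can perform differently.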
FUNDING

This work has been supported by the research projects TIN2014-56633-C3-2-R, TIN2015-72241-EXP and TIN2017-84796-C2-2-R, granted by the Spanish Ministerio de Economía y Competitividad.

REFERENCES

1 Hajjat, M., Sun, X., Sung, Y.-W.E., Maltz, D., Rao, S., Sripanidkulchai, K. and Tawarmalani, M. (2010) Cloudward bound: planning for beneficial migration of enterprise applications to the cloud. SIGCOMM Comput. Commun. Rev., 40, 243–254.
2 Tak, B.C., Urgaonkar, B. and Sivasubramaniam, A. (2011) To Move or Not to Move: The Economics of Cloud Computing. Proc. 3rd USENIX Conf. Hot Topics in Cloud Computing, Portland, OR, USA, June 14–15, pp. 5–5. USENIX Association, Berkeley, CA, USA.
3 Truong, H.-L. and Dustdar, S. (2011) Cloud computing for small research groups in computational science and engineering: current status and outlook. Computing, 91, 75–91.
4 De Alfonso, C., Caballer, M., Alvarruiz, F. and Moltó, G. (2013) An economic and energy-aware analysis of the viability of outsourcing cluster computing to a cloud. Future Generation Comput. Syst., 29, 704–712.
5 Kashef, M.M. and Altmann, J. (2012) A Cost Model for Hybrid Clouds. Proc. 8th Int. Conf. Economics of Grids, Clouds, Systems, and Services, Paphos, Cyprus, 5 December, pp. 46–60. Springer, Berlin, Heidelberg.
6 Truong, H.-L. and Dustdar, S. (2010) Composable cost estimation and monitoring for computational applications in cloud computing environments. Procedia Comput. Sci., 1, 2175–2184.
7 Abdi, S., PourKarimi, L., Ahmadi, M. and Zargari, F. (2017) Cost minimization for deadline-constrained bag-of-tasks applications in federated hybrid clouds. Future Generation Comput. Syst., 71, 113–128.
8 Palanisamy, B., Singh, A. and Liu, L. (2015) Cost-effective resource provisioning for mapreduce in a cloud. IEEE Trans. Parallel Distributed Syst., 26, 1265–1279.
9 Li, S., Zhou, Y., Jiao, L., Yan, X., Wang, X. and Lyu, M.R.T. (2015) Towards operational cost minimization in hybrid clouds for dynamic resource provisioning with delay-aware optimization. IEEE Trans. Serv. Comput., 8, 398–409.
10 Pietri, I. and Sakellariou, R. (2015) Cost-efficient CPU provisioning for scientific workflows on clouds. In Altmann, J., Silaghi, G.C. and Rana, O.F. (eds.) Proc. 12th Int. Conf. Economics of Grids, Clouds, Systems, and Services, Cluj-Napoca, Romania, September 15–17, pp. 49–64. Springer, Berlin, Heidelberg.
11 Malawski, M., Juve, G., Deelman, E. and Nabrzyski, J. (2015) Algorithms for cost- and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds. Future Generation Comput. Syst., 48, 1–18.
12 Rodríguez, M.A. and Buyya, R. (2014) Deadline based resource provisioning and scheduling algorithm for scientific workflows on clouds. IEEE Trans. Cloud Comput., 2, 222–235.
13 Trivedi, N. and Chudasama, D. (2013) Dynamic resource provisioning for deadline and budget constrained application in cloud environment. Int. J. Comput. Technol. Appl., 4, 462–465.
14 Chaisiri, S., Lee, B.S. and Niyato, D. (2012) Optimization of resource provisioning cost in cloud computing. IEEE Trans. Serv. Comput., 5, 164–177.
15 Mao, M., Li, J. and Humphrey, M. (2010) Cloud Auto-Scaling with Deadline and Budget Constraints. In Proc. 11th IEEE/ACM Int. Conf. Grid Computing, pp. 41–48.
16 Fabra, J., Hernández, S., Otero, E., Vidal, J.C., Lama, M. and Álvarez, P. (2015) Integration of grid, cluster and cloud resources to semantically annotate a large-sized repository of learning objects. Concurrency Comput., 27, 4603–4629.
17 Fabra, J., Hernández, S., Ezpeleta, J. and Álvarez, P. (2013) Solving the interoperability problem by means of a bus. An experience on the integration of grid, cluster and cloud infrastructures. J. Grid Comput., 12, 41–65.
18 Lama, M., Vidal, J.C., Otero-García, E., Bugarín, A. and Barro, S. (2012) Semantic linking of learning object repositories to DBpedia. Educ. Technol. Soc., 15, 47–61.
19 DBpedia (2017). http://dbpedia.org/. Accessed 29 May 2017.
20 Hernández, S., Fabra, J., Álvarez, P. and Ezpeleta, J. (2013) Cost Evaluation of Migrating a Computation Intensive Problem from Clusters to Cloud. In Proc. 10th Int. Conf. Economics of Grids, Clouds, Systems and Services, Zaragoza, Spain, September 18–20, pp. 90–105. Springer.
21 Álvarez, P., Hernández, S., Fabra, J. and Ezpeleta, J. (2016) Cost Estimation for the Provisioning of Computing Resources to Execute Bag-of-Tasks Applications in the Amazon Cloud. In Altmann, J., Silaghi, G.C. and Rana, O.F. (eds.) Economics of Grids, Clouds, Systems, and Services: 12th International Conference, GECON 2015, Cluj-Napoca, Romania, September 15–17, 2015, Revised Selected Papers. Springer International Publishing, Cham.
22 Amazon Elastic Compute Cloud (Amazon EC2) (2017). http://aws.amazon.com/ec2/. Accessed 29 May 2017.
23 Schwanengel, A. and Hohenstein, U. (2013) Challenges with Tenant-Specific Cost Determination in Multi-tenant Applications. In Proc. Third Int. Conf. Cloud Computing, GRIDs, and Virtualization, Valencia, Spain, 27 May–1 June, pp. 36–42. IARIA, Red Hook, NY, USA.
24 Czyzyk, J., Mesnier, M.P. and Moré, J.J. (1998) The NEOS server. IEEE J. Comput. Sci. Eng., 5, 68–75.
25 Hernández, S., Fabra, J., Álvarez, P. and Ezpeleta, J. (2013) A Reliable and Scalable Service Bus Based on Amazon SQS. In Proc. 2nd Eur. Conf. Service-Oriented and Cloud Computing, Málaga, Spain, 11–13 September, pp. 196–211. Springer, Berlin, Heidelberg.
26 Juve, G., Deelman, E., Vahi, K., Mehta, G., Berriman, G.B., Berman, B.P. and Maechling, P. (2009) Scientific Workflow Applications on Amazon EC2. In Proc. 5th IEEE Int. Conf. e-Science, Oxford, United Kingdom, December 9–11, pp. 59–66. IEEE, Washington, DC, USA.
27 Douglas, G., Drawert, B., Krintz, C. and Wolski, R. (2014) CloudTracker: Using Execution Provenance to Optimize the Cost of Cloud Use. In Altmann, J., Vanmechelen, K. and Rana, O.F. (eds.) Economics of Grids, Clouds, Systems, and Services: 11th International Conference, GECON 2014, Cardiff, UK, September 16–18, 2014, Revised Selected Papers. Springer International Publishing, Cham.
28 Altmann, J. and Kashef, M.M. (2014) Cost model based service placement in federated hybrid clouds. Future Generation Comput. Syst., 41, 79–90.
29 Patel, K., Patel, H. and Patel, N. (2017) Achieving Energy Aware Mechanism in Cloud Computing Environment. In Modi, N., Verma, P. and Trivedi, B. (eds.) Proc. Int. Conf. Communication and Networks: ComNet 2016. Springer Singapore, Singapore.
30 Grygorenko, D., Farokhi, S. and Brandic, I. (2016) Cost-Aware VM Placement Across Distributed DCs Using Bayesian Networks. In Altmann, J., Silaghi, G.C. and Rana, O.F. (eds.) Economics of Grids, Clouds, Systems, and Services: 12th International Conference, GECON 2015, Cluj-Napoca, Romania, September 15–17, 2015, Revised Selected Papers. Springer International Publishing, Cham.
31 Dejun, J., Pierre, G. and Chi, C.-H. (2009) EC2 Performance Analysis for Resource Provisioning of Service-Oriented Applications. In Proc. 2009 Int. Conf. Service-oriented Computing, Stockholm, Sweden, November 23–27, pp. 197–207. Springer-Verlag, Berlin, Heidelberg.
32 Dejun, J., Pierre, G. and Chi, C.-H. (2011) Resource Provisioning of Web Applications in Heterogeneous Clouds. In Proc. 2nd USENIX Conf. Web Application Development, Portland, OR, USA, June 15–16, pp. 5–5. USENIX Association, Berkeley, CA, USA.
33 Schad, J., Dittrich, J. and Quiané-Ruiz, J.-A. (2010) Runtime measurements in the cloud: Observing, analyzing, and reducing variance. Proc. VLDB Endow., 3, 460–471.
34 Zhu, J., Jiang, Z. and Xiao, Z. (2011) Twinkle: A Fast Resource Provisioning Mechanism for Internet Services. In Proc. IEEE INFOCOM 2011, Shanghai, China, April 10–15, pp. 802–810. IEEE, Washington, DC, USA.
35 Wu, L., Garg, S.K. and Buyya, R. (2011) SLA-Based Resource Allocation for Software As a Service Provider (SaaS) in Cloud Computing Environments. In Proc. 2011 11th IEEE/ACM Int. Symp. Cluster, Cloud and Grid Computing, Newport Beach, CA, USA, May 23–26, pp. 195–204. IEEE Computer Society, Washington, DC, USA.
36 Kim, H., el Khamra, Y., Rodero, I., Jha, S. and Parashar, M. (2011) Autonomic management of application workflows on hybrid computing infrastructure. Sci. Programming, 19, 75–89.
37 Mao, M. and Humphrey, M. (2011) Auto-Scaling to Minimize Cost and Meet Application Deadlines in Cloud Workflows. In Proc. 2011 Int. Conf. for High Performance Computing, Networking, Storage and Analysis, Seattle, Washington, 12–18 November, pp. 49:1–49:12. ACM, New York, NY, USA.
38 Son, S. and Sim, K.M. (2012) A price- and-time-slot-negotiation mechanism for cloud service reservations. IEEE Trans. Syst. Man Cybern. Part B, 42, 713–728.
39 Joha, A. and Janssen, M. (2012) Design choices underlying the software as a Service (SaaS) business model from the user perspective: exploring the fourth wave of outsourcing. J. UCS, 18, 1501–1522.

Footnotes

1 http://ampl.com/

© The British Computer Society 2018. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

Cost-driven provisioning and execution of a computing-intensive service on the Amazon EC2

The Computer Journal, Volume 61 (9) – Sep 1, 2018

Publisher
Oxford University Press
Copyright
© The British Computer Society 2018. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
ISSN
0010-4620
eISSN
1460-2067
DOI
10.1093/comjnl/bxy006

Abstract

The decision to migrate a service to a cloud-based system must take many different aspects into consideration. Among them, economic cost is one of the most important. This paper describes how a computing-intensive service, based on a bag-of-tasks approach, has been migrated from a grid infrastructure to the Amazon Elastic Compute Cloud (EC2) infrastructure. A cost-based model for evaluating the economic cost of providing the service on the cloud is also proposed, which considers computing costs as well as storage and platform deployment costs. The model includes a wide range of instance types and purchasing policies provided by Amazon EC2, as well as the deadline and problem size provided by the service user. The paper also shows how the proposed cost-based model is integrated into the framework used for service deployment and execution, making it possible to interact with Amazon Web Services to hire the required cloud resources and use them efficiently for the execution of service requests.

1. INTRODUCTION

Cloud computing offers several advantages for the deployment and execution of applications: it reduces organizations’ spending on IT infrastructure, facilitates its management, improves and scales the accessibility of applications, reduces spending on salaries and energy bills, etc. Nevertheless, the decision to migrate an application to cloud-based hosting is complex [1–3]. Even once migration has been decided, the multiple alternatives for deploying an application using the facilities offered by cloud providers must be evaluated. Without a doubt, the economic cost is a decisive factor that influences these decisions from the customer’s point of view. The first stage when planning the execution of an application in a cloud environment is to determine the resources that will be provisioned.
Nowadays, Infrastructure-as-a-Service (IaaS) cloud providers offer a wide variety of computing and storage resources for executing applications. From an economic perspective, finding the best combination of resources for executing an application is a challenge in the field of cloud resource provisioning. The price and performance of resources, the application requirements and the customer constraints (mainly budget and time constraints) are critical for determining a well-suited provisioning policy. Some research proposals define models to estimate those costs when the cloud resources are selected according to a concrete provisioning policy [4–6]. Nevertheless, despite the interest of these solutions, the goal is to determine the cheapest provisioning of resources. References [7–15] define different methods for minimizing the cost of provisioning cloud resources. However, current proposals suffer from some relevant shortcomings: the cost of the data involved in the application execution is ignored or only partially considered; the variety of resources and payment models offered by public cloud providers is not considered; or the prototypes are used only to evaluate simple case studies, without being integrated into application engines. The first two shortcomings can make the computed total cost inaccurate (especially for data-intensive applications) or cause some provisioning options to be ignored (the cost of executing a long-term application in the Amazon Elastic Compute Cloud (EC2), for instance, can vary drastically depending on how instances are hired and paid for). The third shortcoming constrains the applicability of the proposals and the possibilities of validating the results in real scenarios. In [16], we presented a bag-of-tasks application to annotate a repository containing more than 15 million educational resources using cluster and grid computing infrastructures.
It was fully annotated in 178 days (more than 1.5 million CPU hours were required). The interest of those results motivated us to consider transforming the semantic-annotation application into a pay-per-use service executable in Amazon EC2. Each service customer pays for the annotation of an input workload under certain user constraints, so the final price depends on the type of computing and storage resources hired to complete the annotation process. Migrating the first version of the application to a cloud environment required addressing some of the previously discussed shortcomings of cost-aware resource provisioning. Our research focuses on two open research questions. First, how can the price of executing bag-of-tasks applications in IaaS cloud providers be reduced, considering all the cost factors involved and the different providers’ offers? Second, once a cost model has been defined, how can a cost-driven provisioning mechanism be designed and implemented for the efficient execution of computing-intensive and data-intensive applications? The contributions of this paper are aligned with these two questions. First, we propose a model for computing the execution costs of a bag-of-tasks service (or application) that considers all the factors affecting the final price (computing and data storage resources, data transfer costs, networking, etc.) as well as the variety of resources offered by public cloud providers (the wide catalog of resources and services, the different ways to hire and pay for them, the possibility of hiring resources in different geographical areas, etc.). Besides, the generality of the model facilitates its application to different cloud service providers, given the resource heterogeneity and the variety of payment options offered by today’s IaaS cloud platforms. Second, we have implemented a cost-driven engine for the execution of bag-of-tasks applications in cloud environments.
The cost model has been programmed as a Mixed Integer Nonlinear Programming (MINLP) problem and integrated into a framework for the execution of service-oriented applications [17]. A new cost-aware provisioning component manages the provisioning of resources as part of the applications’ life-cycle. In addition, a second component, responsible for adapting the framework configuration in order to optimize the management of hired cloud resources during the applications’ execution, has also been implemented and integrated into the final solution. Finally, in the paper, we apply the proposed solution to the development of a network-accessible service for the annotation of large sets of terms. The remainder of the paper is structured as follows. Section 2 describes the process by which the previous grid-based solution has been migrated to a cloud-based version. Section 3 introduces the method used to model the execution costs and provision resources at minimal cost. Once the set of required resources has been established, the system must be deployed; Section 4 describes this process. Section 5 reviews the main published contributions related to the problem addressed in the paper. Finally, Section 6 presents some conclusions and future work.

2. MOVING FROM A GRID-BASED SOLUTION TO A CLOUD-BASED ENVIRONMENT

This section introduces the motivation and the complete process for the migration of a previous grid-based solution to a cloud-based environment.

2.1. Motivation: migrating to Amazon AWS

In [16], a repository of more than 15 million educational resources was semantically annotated. The ADEGA algorithm [18] and the categories of DBpedia [19] were used to create the metadata of these resources. For each educational resource, a set of significant terms was extracted and semantically annotated by means of an RDF-based graph constructed from the DBpedia instances. The resulting graphs were used to classify the corresponding educational resources.
An application was programmed to annotate all the educational resources stored in the repository. This application consisted of a (large) set of tasks, each one in charge of analyzing a set of educational resources in order to identify a set of relevant terms and then associating an RDF graph with these terms. In that work, different and heterogeneous computing infrastructures (two grids and a cluster) were used to execute the annotation application for the whole set of learning units. Afterwards, we concentrated on estimating the cost of solving the same problem using on-demand instances in the Amazon EC2 cloud [20]; the result was that a cloud-based solution would be considerably more expensive. That estimation motivated us to propose a model for computing more accurate computing costs [21]. The resulting model was very limited because it considered neither the cost of data storage and transfer nor the cost of managing the application’s life-cycle when deploying it in an execution engine. These costs can have a significant impact on the total cost of executing complex applications, such as the annotation service considered in this paper. From a business point of view, most companies do not have access to grid environments. However, the pay-per-use cloud is a feasible way of obtaining computing resources as they are needed. This is the motivation for the present work: could the (massive) annotation process be offered as an affordable service using cloud resources? To answer this question, Amazon AWS [22] was chosen because of its maturity and stability as a commercial cloud and its rich catalog of facilities for the deployment of services and computing instances.

2.2. Life-cycle of a service request

Let us now describe the life-cycle of a service request for the annotation service. Figure 1 sketches the main phases of such a request. First, the customer negotiates the time and price constraints with the service in the Negotiation phase.
The service will reach an agreement with the customer, and accept a request, if it is possible to reserve a set of cloud resources to solve it according to the constraints established by the customer. As part of this decision, the service computes the cheapest combination of cloud resources to be provisioned.

Figure 1. Phases of a service request.

After that, the Configuration phase requests the computing, storage and network resources from the provider. Once they are available, the service deploys the annotation application in the Execution phase. A Watching process monitors and tracks the usage of the cloud resources and their derived costs. Finally, once the annotation process has been completed, the cloud resources are released and the results and the service invoice are sent to the client in the Completion phase.

2.3. Architectural design

Figure 2 depicts the high-level architecture of the proposed service. The service is offered via a Web services-based interface. A service request consists of two parameters: a URI pointing to the collection of data to be annotated and the maximum time that the client is willing to wait for the service response. These parameters are used by the service to compute the minimum price at which the request can be solved.

Figure 2. Cloud-based architecture.

Internally, the service consists of two layers: the execution layer and the computing infrastructure layer. The execution layer integrates a set of components to support the main phases of a service request: it is responsible for making the correct decisions to meet service agreements, provisioning and releasing the required cloud resources, splitting a service request into a set of executable tasks and, finally, managing and monitoring the execution of tasks on the reserved resources.
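As an illustration only, the two request parameters and the phases of Section 2.2 could be modelled as follows. This is a minimal sketch: the names `Phase`, `AnnotationRequest` and `next_phase` are hypothetical and do not come from the framework, and treating Watching as a process that runs alongside Execution (rather than as a separate phase) is an assumption drawn from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Phase(Enum):
    """Phases of a service request, in the order described in Section 2.2."""
    NEGOTIATION = "negotiation"
    CONFIGURATION = "configuration"
    EXECUTION = "execution"      # assumption: Watching runs alongside this phase
    COMPLETION = "completion"


@dataclass(frozen=True)
class AnnotationRequest:
    """The two parameters of a service request (hypothetical field names)."""
    data_uri: str       # URI of the collection of data to be annotated
    max_hours: float    # maximum time the client is willing to wait


def next_phase(phase: Phase) -> Optional[Phase]:
    """Return the phase that follows `phase`, or None after Completion."""
    order = list(Phase)  # Enum members iterate in definition order
    idx = order.index(phase)
    return order[idx + 1] if idx + 1 < len(order) else None
```

A request would start in `Phase.NEGOTIATION` and advance with `next_phase` only once the service has accepted the time and price constraints.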
More specifically, the execution layer consists of the service front-end and a resource management framework. The front-end integrates a Cost-driven decision maker component in charge of evaluating whether the service will be able to successfully respond to the user request (considering the customer constraints). The Workflow execution environment implements the service execution logic. It first notifies the framework of the required computing and storage resources. Then, it creates a set of executable tasks and submits them to the framework for execution. Finally, it waits for the results of the submitted tasks. The description of the required resources, tasks, and partial and final results are stored in the message bus component. Before starting the execution of tasks, the framework must configure itself. The Self-configuration component reads the description of the required resources from the bus and then interacts with the mediation layer to create a mediator, which is responsible for provisioning computing, network and storage resources and for submitting the pending annotation tasks (stored in the message bus). Internally, the mediator also schedules the mapping of these tasks to the provisioned cloud resources. Some additional, more general aspects of the life-cycle of these executable tasks are managed by other management components integrated in the framework (scheduling, fault handling or data-movement components, for instance). The computing infrastructure layer provides the infrastructure and resources used to execute the annotation tasks. This layer may be composed of different and heterogeneous computing infrastructures (grid, cluster or cloud environments, for instance). However, in this paper we concentrate on a cloud perspective.
All the components of the framework (the message bus, the management components and the set of mediators), as well as other complementary services that support dynamic balancing, DNS handling, storage, etc., have been deployed using cloud instances in Amazon EC2. Finally, we would like to highlight the new components that have been developed as part of this work. In the framework, the self-configuring component and the mediators for the interaction with the cloud provider (including the provisioning, scheduling and usage-monitoring facilities integrated into them) have been added. In the service’s front-end, the cost-driven decision maker and the workflow that implements the logic of the annotation process have also been implemented. The new method for minimizing the cost of provisioned cloud resources has been included as part of the decision maker component. These new components are described in the following sections.

3. ESTIMATING COSTS OF THE CLOUD-BASED SOLUTION

Let us now introduce the cost factors that must be considered in order to estimate the budget for a service request. The price depends on three factors: the cost of executing the framework itself ($cost_{fw}$), the cost of cloud storage services ($cost_{data}$) and the cost of hiring the set of cloud computing instances needed to compute the request ($cost_{instances}$). The sum of these costs defines the total cost of the approach. For clarity, the main variables used in this section are summarized in Table 1.

Table 1. Variables used in the cost estimation.

| Variable | Definition |
| --- | --- |
| $N_{terms}$ | Number of terms to be annotated per request |
| $T_{max}$ | Deadline of the annotation request (hours) |
| $size_{input}$ | Size of input data (GB) |
| $size_{output}$ | Size of results (GB) |
| $n_i$ | Number of type-i instances |
| $t_i$ | Computing hours of type-i instances |
| $tph_i$ | Throughput of type-i instances |
| $cf_i$ | The fixed cost of type-i instances |
| $ch_i$ | The euro/hour cost of type-i instances |
| $size$ | The storage disk attached to each instance (GB) |
| $ct_i$ | The total cost of type-i instances |
| $price_j$ | The euro/hour cost of a type-j service |
| $price_{j,k}$ | The cost of executing a type-k task in a type-j service |

3.1. Cost of the framework

From an architectural point of view, the framework is a middleware layer shared by all customer requests that are being concurrently executed by the service. Therefore, its cost should be proportionally distributed among all these customers. Although some customer-specific costs can be determined with more or less effort, to accurately monitor the resources used by each customer might be expensive and lead to additional costs [23]. For simplicity, in our estimations, we have supposed the worst case from an economic perspective: a framework instance will be deployed for managing each service request. As described, the framework consists of three main types of components: the message bus, a set of management components and a set of mediators.
Equation (1) estimates the cost of executing the framework ($cost_{fw}$). It mainly depends on the number of computing instances needed to execute the different architectural components and the Amazon services used by the message bus to coordinate these components ($cost_{inst}$ and $cost_{bus}$, respectively):

$$cost_{fw} = cost_{inst} + cost_{bus} \tag{1}$$

$$cost_{inst} = (ct_{M50} \cdot n_{M50} + ct_{XL50} \cdot n_{XL50} + ct_{mc50} \cdot n_{mc50}) \cdot T_{max} \tag{2}$$

$$cost_{bus} = (price_{SQStask} + price_{ELBtask}) \cdot \#tasks + price_{ELB} \cdot T_{max} \tag{3}$$

First, let us explain the $cost_{inst}$ factor. We use three different types of computing instances in order to execute the architectural components: m3.medium instances (the subscript M50), m3.xlarge instances (XL50) and t2.micro instances (mc50). In [20], we experimentally determined the most adequate types of instances for the different types of components, and also stated that the US West (Oregon) Amazon region was the best candidate. Therefore, all those instances will be hosted in that region and configured to have a local storage capacity of 50 GB. Considering these requirements, their prices per usage-hour have been calculated from the basic prices defined by the cloud provider: $ct_{M50}$ = 0.0642 € per hour, $ct_{XL50}$ = 0.2411 € per hour and $ct_{mc50}$ = 0.0156 € per hour. The number of instances of each type needed to solve a request depends on the input workload; more specifically, on the number of mediators needed to execute the considered workload. In Section 4, we will determine how to calculate the number of mediators per request. Let us assume for now that this number is known. In particular, two m3.medium instances ($n_{M50}$) are needed for the execution of the management components (two components are being currently used: a fault-tolerance component and a data-movement component), and the number of m3.xlarge instances ($n_{XL50}$) to be provisioned is an instance per mediator. Besides, the bus uses one t2.micro instance for interacting with each mediator and another one to execute its interface ($n_{mc50}$).
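To make Equations (1)–(3) concrete, the following Python sketch evaluates the framework cost for a given number of mediators. It is only an illustration under stated assumptions: the function and constant names are hypothetical, the instance counts follow our reading of the text (two m3.medium instances, one m3.xlarge per mediator, one t2.micro per mediator plus one for the bus interface), and the SQS and ELB prices are the per-task and per-hour values quoted in the discussion of Equation (3).

```python
# Hourly prices (EUR) of the framework instances with 50 GB of local storage,
# as quoted in the text for the US West (Oregon) region.
CT_M50 = 0.0642    # m3.medium
CT_XL50 = 0.2411   # m3.xlarge
CT_MC50 = 0.0156   # t2.micro

# Message-bus prices quoted in the text.
PRICE_SQS_TASK = 4.65e-6   # EUR per task (12 SQS requests)
PRICE_ELB_TASK = 2.27e-6   # EUR per task (six 64 kB messages)
PRICE_ELB_HOUR = 0.025     # EUR per hour of ELB usage


def framework_cost(n_mediators: int, n_tasks: int, t_max_hours: float) -> float:
    """Evaluate Equations (1)-(3) for one service request (hypothetical helper)."""
    n_m50 = 2                  # fault-tolerance and data-movement components
    n_xl50 = n_mediators       # one m3.xlarge instance per mediator
    n_mc50 = n_mediators + 1   # one t2.micro per mediator plus the bus interface

    # Equation (2): instances are paid for during the whole deadline T_max.
    cost_inst = (CT_M50 * n_m50 + CT_XL50 * n_xl50 + CT_MC50 * n_mc50) * t_max_hours
    # Equation (3): per-task SQS/ELB charges plus the ELB usage time.
    cost_bus = (PRICE_SQS_TASK + PRICE_ELB_TASK) * n_tasks + PRICE_ELB_HOUR * t_max_hours
    # Equation (1)
    return cost_inst + cost_bus


if __name__ == "__main__":
    # Illustrative values only: 3 mediators, 10,000 tasks, 24-hour deadline.
    print(f"{framework_cost(3, 10_000, 24.0):.4f} EUR")
```

Because every term except the per-task bus charge scales with $T_{max}$, the sketch makes it easy to see that, for a fixed number of mediators, the framework cost grows linearly with the deadline.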
Finally, the usage time used in the cost equation is the maximum time defined by the customer, $T_{max}$, in hours.

Equation (3), in turn, estimates the cost of the bus. The implementation of the message bus is based on the Amazon Simple Queue Service (SQS) and the Elastic Load Balancing (ELB) services. We have calculated the prices of these services per executable task: a task performs 12 SQS requests and requires the ELB to process six messages of 64 kB each. Given these requirements, $price_{SQStask}$ is $4.65 \cdot 10^{-6}$ € per task and $price_{ELBtask}$ is $2.27 \cdot 10^{-6}$ € per task. Besides, the usage time of the ELB service must be added to these per-task costs ($price_{ELB}$ is 0.025 € per hour, with $T_{max}$ being the total time).

3.2. Data related cost

Equation (4) estimates the cost of the data requirements involved in a service request ($cost_{data}$). The Amazon Simple Storage Service (S3) is used to store the request input and output data (the workload and the resulting annotations). Currently, Amazon charges customers for the amount of data stored, the number of write and read requests and the amount of data transferred out of S3. Considering this pricing policy, the first term in the equation corresponds to the cost of storing input and output data. It depends on the data size and the time the data are stored (we consider the worst case and, therefore, use the maximum time defined by the customer, $T_{max}$, as well as the total input and output data generated in the request). Next, we consider the cost of the GET requests used to read input data from the computing instances and the cost of the PUT requests used to store the workload in S3 and to save the output data. The total number of requests can be easily calculated, since one request is used to read and one to write every term of the input workload, while one request per ten terms is used to save the output data.
Finally, the last term in Equation (4) refers to the movement of the input data to the computing instances (from S3 to the EC2 service) to process the terms (note that transferring data into S3 is free of charge). These costs will be estimated considering the following Amazon prices: $\mathit{price}_{S3}$ = 0.02327€ per GB-month, $\mathit{price}_{get}$ = 0.0077€ per 10,000 requests, $\mathit{price}_{put}$ = 0.0074€ per 1,000 requests, and $\mathit{price}_{S3 \to EC2}$ = 0.0156€ per GB:

$$\mathit{cost}_{data} = (\mathit{size}_{input} + \mathit{size}_{output}) \cdot T_{max} \cdot \mathit{price}_{S3} + N_{terms} \cdot \mathit{price}_{get} + \left(1 + \frac{1}{10}\right) \cdot N_{terms} \cdot \mathit{price}_{put} + \mathit{size}_{input} \cdot \mathit{price}_{S3 \to EC2} \tag{4}$$

As stated before, data related costs have been estimated assuming a worst case scenario where all data are stored during the whole request. Obviously, in a real scenario, the cost would be lower, since output data are generated progressively and input data can be deleted once they have been used. Furthermore, compression and other techniques could be used in order to reduce the data stored and transferred, as well as the number of requests performed.

3.3. Minimization of the cost of computing instances

In [20] we concluded that, for the considered problem, the cost of computing instances represented more than 90% of the total cost, whereas the infrastructure management and data related costs corresponded to the remaining 10%. Therefore, we aimed at minimizing the former, considering the wide range of computing resources and purchasing models offered by the cloud provider. The total price will depend on the number of hired instances and the time they are used. To compute the cost of a given annotation request, we should consider its workload (the number of terms to be annotated) as well as the deadline imposed by the customer.

The cost of the computing instances depends on two factors. First, the cost of the instance itself, which depends on the selected purchasing model. Second, the cost of the elastic block storage (EBS) attached to each instance.
The EBS cost depends on the size of the attached volume and the time the instance is running. In that respect, the computing instances require volumes of 70 GB [20]. The current price offered by Amazon for the EBS volumes is 0.088€ per GB-month.

From the point of view of the computing instances, we are interested in obtaining the number of terms per time unit that each type of instance can process (which is related to how powerful the instance is). In order to measure the computational power of the instances, 10,000 terms were executed on different instance types, deploying as many jobs as the number of processors in each instance. The mean execution time was then used as an indicator of the computational power of each instance type considered. Table 2 summarizes, for each instance type, its number of processors, the mean time to annotate a term observed in the experiments (including data-movement and other management delays) and the resulting computational power, expressed as the number of terms that each instance can process in an hour.

Table 2. Evaluation of the time required to annotate a term using different Amazon EC2 instances (prices as of 1 November 2016).

| Instance type | m3.xlarge | m3.2xlarge | i2.xlarge | i2.2xlarge | m4.xlarge | m4.2xlarge | c4.xlarge | c4.2xlarge |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Number of cores | 4 | 8 | 4 | 8 | 4 | 8 | 4 | 8 |
| Execution time (sec/term) | 38.04 | 37.69 | 33.80 | 33.04 | 36.37 | 35.99 | 30.72 | 29.85 |
| Computational power (terms/hour) | 378.54 | 764.12 | 426.03 | 871.67 | 395.91 | 800.23 | 468.68 | 964.86 |

Obviously, the total cost of each instance will depend on the selected purchasing model [22] (on-demand or reserved). Intuitively, reserved instances are a bad solution for short deadlines, while they are preferable for long deadlines (the effective hourly price is lower in those cases). So, the deadline must be considered. In that respect, we have considered the on-demand purchasing model and the two models offered by Amazon for reserved instances, where instances can be hired for 1 or 3 years under three payment methods: All Upfront, Partial Upfront and No Upfront (the last one only for 1-year reserved instances). Therefore, these purchasing options give up to $8 \times 2 \times 2 + 8 + 8 = 48$ logical instances (eight instance types under two reservation periods and two upfront payment methods, plus eight 1-year No Upfront and eight on-demand options). The aim of considering both purchasing models is to show that the proposed model is flexible enough to support changes in purchasing models and can be used with different providers.

Once the cost associated with the computing instances is obtained, we are interested in selecting the best combination of instances allowing us to meet the temporal requirements of the request. For that, the problem is defined as a quadratic function where there are two sets of variables.
One set corresponds to the number of logical machines that will be used to solve the annotation problem (a 'logical' machine is an instance of a specific type, located in a specific region and hired according to a purchasing model and a payment option). The second set of variables establishes, for each one of the machines, how long it will have to be computing. The objective is to minimize the cost while ensuring that the whole set of terms will be annotated and the deadline will be met. The problem is shown in Equation (5). It belongs to the class of mixed-integer nonlinear programming (MINLP) problems and includes the following elements:

- As stated, we are considering that there are $LI = 48$ different types of logical instances.
- Let $N = (n_1, n_2, \ldots, n_i, \ldots)$ be the integer variables indicating how many instances of each logical machine will be engaged in solving the problem, and $T = (t_1, t_2, \ldots, t_i, \ldots)$ the time, in hours, the instances will be used.
- Let $T_{max}$, in hours, be the deadline provided by the customer and $N_{term}$ the number of terms to be annotated.
- Let $CH = (ch_1, ch_2, \ldots, ch_i, \ldots)$ be the real vector of euro/hour costs and $CF = (cf_1, cf_2, \ldots, cf_i, \ldots)$ the real vector of fixed costs (in euro). Fixed costs include the upfront cost and the monthly cost, if any.
- Let $TPH = (tph_1, tph_2, \ldots, tph_i, \ldots)$ be the real vector of the throughput of each instance type, in terms/hour: $tph_i$ indicates how many terms instance type $i$ processes per hour.
- Let $\mathit{size}$, in GB, be the size of the disk attached to each instance (in this problem this value is 70 GB) and $c_{EBS}$, in euro per GB-hour, the cost per hour of the local storage.

$$
\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^{LI} \big( (\mathit{size} \cdot c_{EBS} + ch_i) \cdot t_i + cf_i \big) \cdot n_i \\
\text{subject to} \quad & \sum_{i=1}^{LI} t_i \cdot n_i \cdot tph_i \ge N_{term} \\
& 0 \le n_i, \quad i \in \{1..LI\} \\
& 0 \le t_i \le T_{max}, \quad i \in \{1..LI\} \\
& t_i \le 24 \cdot 365, \quad i \text{ corresponds to a Res-1-year inst.} \\
& t_i \le 24 \cdot 3 \cdot 365, \quad i \text{ corresponds to a Res-3-year inst.}
\end{aligned} \tag{5}
$$

The constraints $t_i \le 24 \cdot 365$ and $t_i \le 24 \cdot 3 \cdot 365$ impose that reserved instances cannot be used for longer than the period for which they were hired.
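To make the structure of Equation (5) more concrete, the following Python sketch enumerates the 48 logical instances and evaluates the objective and the constraints for a candidate plan. It does not solve the MINLP problem; it only checks a given assignment of $n_i$ and $t_i$. All names (`LogicalInstance`, `plan_cost`, `plan_is_feasible`, the purchasing-option labels) and the prices in the usage example are hypothetical and are not taken from the paper or from Amazon's price list.

```python
from dataclasses import dataclass
from itertools import product

INSTANCE_TYPES = ["m3.xlarge", "m3.2xlarge", "i2.xlarge", "i2.2xlarge",
                  "m4.xlarge", "m4.2xlarge", "c4.xlarge", "c4.2xlarge"]


def logical_instance_labels():
    """Enumerate (type, purchasing option) pairs: 8*2*2 + 8 + 8 = 48."""
    labels = [(t, "on-demand") for t in INSTANCE_TYPES]
    labels += [(t, "reserved-1y-no-upfront") for t in INSTANCE_TYPES]
    labels += [(t, f"reserved-{years}y-{pay}")
               for t, years, pay in product(INSTANCE_TYPES, (1, 3),
                                            ("all-upfront", "partial-upfront"))]
    return labels


@dataclass
class LogicalInstance:
    ch: float         # hourly cost (euro/hour)
    cf: float         # fixed cost (euro), e.g. an upfront payment
    tph: float        # throughput (terms/hour)
    max_hours: float  # reservation length in hours (inf for on-demand)


def plan_cost(instances, n, t, size_gb, c_ebs):
    """Objective of Equation (5) for instance counts n and usage times t (hours)."""
    return sum(((size_gb * c_ebs + inst.ch) * ti + inst.cf) * ni
               for inst, ni, ti in zip(instances, n, t))


def plan_is_feasible(instances, n, t, n_term, t_max):
    """Check the constraints of Equation (5) for a candidate plan."""
    capacity = sum(ti * ni * inst.tph for inst, ni, ti in zip(instances, n, t))
    bounds_ok = all(ni >= 0 and float(ni).is_integer()
                    and 0 <= ti <= min(t_max, inst.max_hours)
                    for inst, ni, ti in zip(instances, n, t))
    return bounds_ok and capacity >= n_term


if __name__ == "__main__":
    # Two made-up logical instances: one on-demand, one 1-year reserved.
    catalog = [LogicalInstance(ch=0.5, cf=0.0, tph=1000.0, max_hours=float("inf")),
               LogicalInstance(ch=0.0, cf=2000.0, tph=1000.0, max_hours=24 * 365)]
    n, t = [2, 1], [10.0, 100.0]
    print(len(logical_instance_labels()))
    print(plan_cost(catalog, n, t, size_gb=70, c_ebs=0.0001))
    print(plan_is_feasible(catalog, n, t, n_term=120_000, t_max=200))
```

Because both $n_i$ and $t_i$ are decision variables, the product $t_i \cdot n_i$ makes the objective and the workload constraint nonlinear, which is why an evaluator like this one would have to be paired with a MINLP solver to search for the optimum.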
As a result, the MINLP problem (Equation (5)) returns the combination of instances and the time each instance must be running to meet the problem constraints and requirements with a minimal cost. The problem has been programmed using the AMPL language and executed by the filterSQP solver on the NEOS server (Network-Enabled Optimization System) [24].

3.4. The computing cost of some annotation requests

Let us for instance consider the case of processing 10 million terms in 30 days. The minimization problem gives as a result that the terms can be annotated with a minimum computing instances cost of 3,447.27€. The combination of logical instances that achieves that cost is 43 c4.2xlarge on-demand instances running for 10.04 days (241 h) and three c4.xlarge on-demand instances running for 1 h. The total capacity hired will process up to 10,000,260 terms.

As a second example, let us consider the case of processing 150 million terms (the complete Universia dataset) in 540 days. The minimization problem gives as a result that the terms can be annotated with a minimum computing instances cost of 33,216.59€. The solution corresponds to a combination composed of one c4.xlarge 1-year reserved (All Upfront) instance running for 365 days (8760 h), 17 c4.2xlarge 1-year reserved (All Upfront) instances running for 365 days (8760 h) and eight c4.2xlarge on-demand instances running for 286 h each (2288 instance-hours in total). The total capacity hired will process up to 150,000,399 terms.

Finally, a third example could be processing the same dataset composed of 150 million terms but in 2 years (730 days). For that, the result of the minimization problem is to use nine c4.2xlarge 3-year reserved (All Upfront) instances running for 2 years. With that configuration, the computing instances cost would be 31,126.24€ and the total capacity hired would process up to 152,139,274 terms.
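The hired capacity in these examples can be approximately checked from the throughputs in Table 2. The short Python sketch below recomputes the capacity of the first and third examples; the helper name `capacity` and the plan encoding are hypothetical. The results differ from the reported figures by at most a few hundred terms, presumably because the throughputs in Table 2 are rounded to two decimals.

```python
# Throughput (terms/hour) taken from Table 2.
TPH = {"c4.xlarge": 468.68, "c4.2xlarge": 964.86}


def capacity(plan):
    """Total terms a plan can process; items are (type, count, hours per instance)."""
    return sum(TPH[itype] * count * hours for itype, count, hours in plan)


# First example: 10 million terms in 30 days.
example_1 = [("c4.2xlarge", 43, 241), ("c4.xlarge", 3, 1)]
# Third example: 150 million terms in 2 years (730 days).
example_3 = [("c4.2xlarge", 9, 730 * 24)]

for name, plan, required in (("example 1", example_1, 10_000_000),
                             ("example 3", example_3, 150_000_000)):
    cap = capacity(plan)
    print(f"{name}: capacity {cap:,.0f} terms, excess {cap - required:,.0f}")
```

In both cases the capacity exceeds the required number of terms, and in the third example the excess is above 2 million terms.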
Therefore, the best price for the given deadline could be reached with a combination of logical instances providing computing capacity for more than the required terms (in this example there is an excess of more than 2 million terms). This excess could be reused to serve other requests and, thus, it could have an effect on the final cost proposed to the customer.

The solution proposed here is quite simple, yet effective. However, as stated, the performance of a given virtual resource for a given task is just an estimation with at least two varying aspects: the cost of executing a task has been estimated experimentally, and the performance of instances of a given virtual machine can also vary. Therefore, at the system execution time, one could adopt a more flexible approach, as proposed in [14, 15], monitoring the execution evolution so as to hire or free engaged (on-demand) resources with the aim of ensuring the deadline constraint.

4. SELF-CONFIGURATION OF THE FRAMEWORK

At the configuration phase, the resource management framework must be parametrized to provision, use and release the required instances. As a part of this work, we have integrated a new self-configuring mechanism able to manage the life-cycle of cloud instances. The scalability of the framework must be considered in order to take the appropriate configuration decisions.

4.1. Design of the configuration mechanisms

Figure 3 shows the components involved in the configuration of the execution environment. When a request is accepted, the service workflow puts, into the message bus, a set of task execution calls as well as a configuration message describing the computing requirements (the number, type and region of the computing instances as well as the purchasing model that should be used to provision them). These computing requirements are the result of the negotiation phase.

Figure 3. Components involved in the configuration phase.

The Self-configuring component takes configuration messages from the bus. Once a message is taken, the component configures the framework for the executable tasks involved in the request. Internally, a manager determines the most adequate configuration option considering a set of scalability rules. These rules have been experimentally established (they will be described in the next subsection). One of those rules determines the number of mediators that will be needed for managing the involved computing instances and submitting executable tasks to them. The manager interacts with the Amazon Web Services (AWS) to hire a computing instance per mediator. Each mediator is deployed in a specific Amazon Machine Image (AMI) we have programmed for the mediators. Once the mediators are deployed, the manager sends each one of them a description of the computing resources it must hire and locally manage (the way of distributing the computing capacity among all the mediators is also determined by the scalability rules).

Internally, the Controller of a mediator is responsible for provisioning the corresponding computing instances and storage resources via the AWS interface. All these resources are locally managed by the Job Manager in order to execute pending tasks. This component takes a pending task from the bus, schedules its execution and submits it to the selected computing instance; then, when the task execution has finished, the manager receives its results and writes them into the bus. Besides, a Fault handling mechanism has been integrated into the mediator to recover from possible execution faults. Simultaneously, the availability of cloud instances, the state and cost of executing tasks and other operational parameters are monitored by the Monitor component.
This component has been implemented using the Amazon CloudWatch service, and integrates a data collector that is used to calculate the real cost of cloud resources.

4.2. Framework scalability

As previously stated, the scalability of the framework must be studied in order to find the most appropriate configuration options and to analyse its performance from the service perspective. On the basis of our experience, the two architectural components that must be studied in terms of scalability are the mediators and the message bus. From the mediation point of view, we are interested in identifying the number of computing instances that a mediator is able to handle efficiently. Regarding the message bus, its storage capabilities and the performance of its input/output operations must be studied.

4.2.1. Configuring the computing capacity of a mediator

The goal and the requirements of this first experiment are:

- Goal: To determine the time required by a mediator to handle an annotation task depending on the number of computing instances it manages.
- Configuration of used resources: The experiments were performed in the AWS EC2 Oregon region. An m3.xlarge instance (4 vCPUs, 15 GB RAM) was hired for each mediator and the annotation algorithm was executed by a set of m3.large instances (2 vCPUs, 7.5 GB RAM) ranging between 1 and 400 instances. In order to provide persistent block storage, a local Amazon Elastic Block Store (EBS) disk of 70 GB was attached to each of these computing instances. These instances are launched from the custom AMI that was previously detailed.
- Input workload: A bag of annotation tasks reused from previous executions was executed. The number of tasks is enough to keep the computing instances continuously working. The mean execution time of each task is near 45 min (10 terms are annotated per task).

A mediator creates a dedicated SSH connection to interact with each provisioned cloud instance it is in charge of.
This connection is used to configure the execution environment of the instance, to submit to it the tasks to be executed and, then, to receive the corresponding results. To experimentally estimate the number of computing instances that a mediator can manage, we have hosted a mediator in an m3.xlarge instance and analyzed its behavior with respect to the number of instances.

Figure 4 shows the mean time required by a mediator to handle a task request depending on the number of computing instances it handles. This time includes looking for an available computing instance, sending a task to the selected instance and recovering the results. As expected, when the number of handled instances increases, the required time also increases because of the management overheads. Besides, these experiments have shown that when the number of concurrent SSH connections is higher than 150, the number of connection faults significantly increases. Taking that into consideration, Figure 5 shows the throughput of a mediator against the number of instances it manages. The throughput increases linearly up to 100 instances, showing little improvement beyond that limit.

Figure 4. Mean time required for managing a task request versus the number of instances.

Figure 5. Throughput versus the number of instances.

Considering these experiments, we have established that the maximum number of instances per mediator will be 150. This decision may seem contradictory, since the behavior of a mediator with a lower number of instances is better. Nevertheless, we have also considered other relevant issues for this decision. First, we try to minimize the cost of executing the mediators (the usage cost of a mediator is 173.60€ per month).
If 150 computing instances are required during 6 months in order to respond to a request, it is six times cheaper to have a mediator with 150 instances than six mediators with 25 instances. Second, the execution time of a small-sized annotation task is near 45 min. Therefore, the time required by a mediator to handle its submission is insignificant and the management overheads have little impact on the total time (near 2 s when the mediator integrates 150 instances or near 0.8 s when it integrates 25 instances). Third, the current implementation of the mediator is able to handle that number of SSH connections. And, finally, from the perspective of our application domain, more than 150 instances are required only when the customer's deadline is very short.

4.2.2. Performance of a SQS-based message bus

The goal and the requirements of the second experiment are:

- Goal: To determine the time of reading/writing a task from/into the bus depending on the number of connected clients.
- Configuration of used resources: The experiments were performed in the AWS EC2 Oregon region. The bus was deployed using the Amazon Simple Queue Service (SQS) and configured to have a simple Amazon Elastic Load Balancer and a maximum of five request managers. t2.micro (1 vCPU, 1 GB RAM) instances were hired to execute each one of these managers. On the other hand, clients were executed in m3.large (2 vCPUs, 7.5 GB RAM) computing instances. These instances are launched from the custom AMI that was previously detailed.
- Input workload: A bag of simulated tasks was programmed. The mean execution time of these tasks is less than one second because the goal is to maximize the flow of tasks between the bus and its clients.

In [25], we presented an implementation of the message bus based on the Amazon Simple Queue Service (SQS).
This service guarantees the proposed cloud-based bus to be highly available, scalable and reliable, with an extensible storage capacity for messages, proving to be an appropriate solution to solve computing-intensive problems. Let us now evaluate the performance of the bus from the perspective of this work. We are interested in studying the performance of its input/output operations (submit/read a task into/from the bus) versus the number of requests that are being concurrently processed.

First, we study the mean time of reading a task from the bus in relation to the number of mediators that are being executed. In this experiment, each mediator is continuously taking tasks from the bus. Figure 6 shows that the time of reading a task remains more or less constant (between 30 and 35 ms) independently of the number of mediators that are concurrently accessing the bus.

Figure 6. Mean time of reading a job from the bus versus the number of deployed mediators.

On the other hand, for each customer request, the annotation service creates a workflow which submits a set of executable tasks to the framework. This submission consists of writing into the message bus the description of each task. In a first experiment, we have concluded that the best throughput is reached when the workflow consists of 25 threads submitting tasks to the framework (more specifically, the throughput is 76 tasks per second). In a subsequent experiment, we have studied the mean time of writing a task into the bus with respect to the number of workflows that are being executed at the service level. Figure 7 shows that the mean time that a thread needs to write a task into the bus remains more or less constant (between 245 and 260 ms) independently of the number of workflows.

Figure 7. Mean time that a thread needs to write a task into the bus depending on the number of workflows.

Finally, on the basis of the previous experiments, we would like to analyse whether our decisions are compatible. In other words, is a workflow composed of 25 threads able to provide enough executable tasks to keep all the computing instances managed by the mediators busy? The throughput of that workflow is near 100 task submissions per second. The throughput of a mediator that integrates 150 instances is 66 task readings per second in the best case (assuming that the execution time of a task is zero seconds). Therefore, a workflow could provide tasks to keep busy more than 225 cloud instances in the worst case. Nevertheless, in a real scenario, the execution time of each annotation task is near 45 min and, therefore, this type of workflow would be able to submit tasks for a significantly higher number of computing instances.

As a result of the experiments, we can conclude that using workflows with 25 threads, the proposed cloud-based bus, and the creation of mediators composed of a maximum of 150 instances is a good parametrization to successfully provide the proposed service.

5. RELATED WORK

In this section, two different types of research proposals are discussed. First, we review economic models to estimate the cost of executing an application in the cloud. These models require that users decide the cloud resources to hire for deploying their applications. On the other hand, instead of estimating the execution cost, other proposals directly determine the most economical combination of cloud resources needed to execute an application.

5.1.
Estimation of the execution cost of cloud-based applications

Experimentation-based techniques can be used to estimate the cost of executing an application on a given cloud (a use-case of these techniques was presented in [26], for instance). The main disadvantage of these techniques is the price one has to pay for the execution of the experiments (generally, the validity of the estimations depends on the money spent on the experiments).

The definition of cost models is an alternative approach. These models help customers in deciding what parts of their applications should be executed in the cloud, and when. In general, the proposed approaches [3, 4, 5] require that customers know or estimate the requirements of their applications, such as execution times, input and output data size, or storage requirements. Once known, those operational parameters are mapped onto the basic prices provided by cloud providers in order to estimate the execution costs. As a particular case, in [6], the authors discuss the need to define different cost formulas for families of applications, and propose a set of formulas for sequential, multi-thread, parallel or MPI programs and workflows. Those formulas are used for the development of a service able to capture the operational information of an application and estimate its cost.

In some cases, it is not easy to know or determine the application requirements in order to estimate its execution costs. CloudTracker recommends to scientists the type of computing instance that must be hired to execute and replay their large scale experiments [27]. When an experiment is executed for the first time, the system tracks it and stores the information needed to replay it in the future. CloudTracker uses this information to automatically run the experiment in different types of instances and to determine the cheapest and the fastest execution option.
These options will help scientists to decide the instances to hire each time the experiment is replayed. Besides, CloudTracker determines the cheapest time instant to store the experiment's results, avoiding the same computations being repeated.

Finally, [28] presents a model to decide where to place a set of services on a federated hybrid cloud. The model estimates the costs of using an in-house cloud-enabled data center (private cloud) and different public clouds. In the first case, the cost factors are related to electricity consumption, software licenses, hardware maintenance and the equipment required by the data center; while in the case of public providers, the factors involve the hiring of computing instances and the data transfer between private and public resources. Then, this estimation model is integrated in an optimization algorithm to evaluate the different options of service placement and to determine the cheapest deployment.

5.2. Minimization of the cost of cloud resource provisioning

A different approach is adopted by those works that look for a function describing the cost in terms of the tasks to be executed, the resources proposed by the service providers and the time constraints, with the goal of optimizing some given parameters. These approaches usually focus on IaaS (Infrastructure as a Service) clouds. Using different techniques, they try to find the best combination of cloud resources that should be provisioned for the execution of an application. This combination is computed considering the various types of resources offered by the cloud providers, the application requirements as well as the user requirements (mainly, budget and deadline). Most of those solutions are integrated into scheduling algorithms for IaaS clouds, and consider the possibility of scaling the provisioned resources to manage the uncertain behaviors of the executed applications.
Let us introduce in Table 3 a taxonomy of some important works related to the optimization of resource provisioning in cloud systems. For each method, we are going to consider the following classification criteria: the input constraints, the different costs and provider options considered, the uncertainty of the provided provisioning plans, the concrete technique used to compute the optimal resource combination, the type of application the method focuses on and, finally, whether the method is integrated into an application engine.

Table 3. Comparative analysis of the methods for minimizing the costs of provisioning.

| Criteria of the taxonomy | Our approach | [7] | [11] | [8] | [9] | [10] | [12] | [13] | [14] | [15] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Input constraints | Workload, Deadline | Workload, Deadline, Max. resources | Workload, Deadline, Budget | Workload, Deadline | Workload, Deadline | Workload, Deadline | Workload, Deadline | Deadline, Budget | Num. VM per VM class | Workload, Deadline |
| The cost of cloud services | | | | | | | | | | |
| Cost of computing | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Cost of local storage | ✓ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ | ✗ |
| Cost of data services | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| The options offered by cloud providers | | | | | | | | | | |
| Different computing resources | ✓ | ✓ | ✗ | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✓ |
| Different payment models | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✓ | ✗ |
| Different service providers | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ | ✗ | ✓ | ✗ |
| The uncertainty of provisioning plans | | | | | | | | | | |
| Effects of virtualization | ✗ | ✗ | ✗ | ✓ | ✗ | ✓ | ✓ | ✗ | ✓ | ✗ |
| Reconfigure the provisioning | ✗ | ✗ | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ | ✓ | ✓ |
| Type of technique | MINLP | BLP | Provisioning & Planning | Heuristic | Optimization model | Heuristic | Meta-heuristic | Provisioning & Planning | Stochastic ILP | ILP |
| Type of application | Bag-of-task | Bag-of-task | Workflow | MapReduce | Independent jobs | Workflow | Workflow | Independent jobs | Class of VM | Independent jobs |
| Applicability of the technique | | | | | | | | | | |
| Integrated into an app. engine | ✓ | ✗ | ✗ | ✓ | ✗ | ✗ | ✗ | ✗ | ✗ | ✗ |
| Integrated into a scheduling alg. | ✗ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✗ | ✓ |
| Integrated with scaling | ✓ | ✗ | ✓ | ✓ | ✓ | ✗ | ✗ | ✗ | ✓ | ✓ |

All the considered approaches minimize execution costs from the application's users point of view. The provider side has also been studied with the goal of reducing the operating costs of running their infrastructures [29, 30]. However, this perspective is out of the scope of this paper. Let us now discuss some important aspects of the classification.

5.2.1. Input constraints

Most proposals require the application workload and the execution deadline as input parameters. References [11, 13] also require the user's budget because their strategies of cost reduction consist of hiring the maximum number of computing resources and, then, scaling down these resources according to how well they are used by the application.

5.2.2. Costs of cloud services

All the approaches are mainly concerned with reducing the costs of computing resources.
Nevertheless, these minimization methods also compute provisioning plans for the execution of applications that handle high volumes of data (Bag-of-Task applications, scientific workflows or MapReduce systems, for instance). In data-intensive applications, the cost of data storage and transfer can be an essential part of the total cost and, therefore, should not be ignored. As exceptions, [7, 14] include the cost of local storage devices, while [7, 9, 12] estimate the cost of data transfers. As stated, we consider the inclusion of data storage and movement an important aspect in the case of Bag-of-Task applications. The optimization problem proposed here considers such costs (although in a nonlinear way). Let us explain why this is important by means of a concrete example. We have estimated the cost of solving the 150 million terms problem in 540 days as 33,216.59€, including the 8760 h of storage required. If we solve the problem without considering data storage (just removing the time variables in Problem 5, in which case the problem is similar to the one proposed in [15]), the cost will be 149.98€ less. It is important to remark that integrating storage costs into the optimization problem is necessary. Otherwise, one would have to pay for the storage during the complete (optimal) machine hiring time, which gives a worse solution, since a machine may not be needed for its whole hiring time but just for a part of it.

5.2.3. Options offered by providers

Cloud providers offer different types of virtual computing resources at different prices. Many approaches ignore this variety and calculate their provisioning plans considering a single type of resource [9, 10, 11, 13, 14]. On the other hand, providers also offer different payment options.
For instance, in Amazon EC2 there are three ways to pay for computing instances: On-demand, Reserved and Spot instances (other providers, such as Google Cloud Platform, offer similar payment models). In many cases, a suitable choice of the payment method can yield important savings in execution costs. This possibility has been considered in the proposed solution and in those of [9, 14]. The classification also considers whether the optimization methods could be reused to compute the requirements of an application on different providers or in an execution environment integrating multiple providers.

5.2.4. Uncertainty of provisioning plans

Many approaches assume that the performance of computing resources of the same type is homogeneous. Nevertheless, [31–33] demonstrate that this assumption does not hold, due to the shared nature of the cloud infrastructure, the use of virtualization techniques and the heterogeneity of the underlying hardware. The case of the Amazon EC2 infrastructure was studied in [31], which concluded that the performance of a single virtual instance is relatively stable, while the performance of multiple virtual machines of the same type is rather heterogeneous. Another issue that models usually ignore is the start-up latency of cloud computing instances. This latency has been reduced in cloud environments by restoring previously created virtual-machine snapshots with fully initialized applications [34] or by reusing already purchased computing instances for the execution of new applications [35]. Table 3 shows the approaches that consider the performance uncertainty of resources and are able to reconfigure their provisioning decisions according to these changing conditions.

5.2.5. Optimization technique

Different techniques have been used to determine the cheapest combination of cloud resources that should be provisioned for the execution of a given application.
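As a toy illustration of what determining the cheapest combination of resources involves, the following Python sketch exhaustively searches how many instances of each type to hire so that a workload finishes before a deadline at minimum cost. It is not the model proposed in this paper nor any of the cited techniques: the catalogue, its prices (in euro cents per hour) and its throughputs are hypothetical values chosen only for the example, and the sketch assumes whole-hour billing and homogeneous performance within an instance type.

```python
from itertools import product

# Hypothetical catalogue: (name, price in euro cents per hour, tasks per hour).
# These figures are illustrative assumptions, not measurements from the paper.
CATALOGUE = [("small", 5, 100), ("medium", 10, 220), ("large", 20, 400)]


def cheapest_plan(workload, deadline_hours, max_per_type=5):
    """Return (cost_in_cents, counts_per_type, billed_hours), or None if infeasible."""
    best = None
    # Enumerate every combination of instance counts, one count per type.
    for counts in product(range(max_per_type + 1), repeat=len(CATALOGUE)):
        throughput = sum(n * rate for n, (_, _, rate) in zip(counts, CATALOGUE))
        if throughput == 0:
            continue  # no instances hired, nothing can be executed
        hours = -(-workload // throughput)  # ceiling division: whole billed hours
        if hours > deadline_hours:
            continue  # this combination misses the deadline
        hourly_price = sum(n * price for n, (_, price, _) in zip(counts, CATALOGUE))
        cost = hours * hourly_price
        if best is None or cost < best[0]:
            best = (cost, counts, hours)
    return best


print(cheapest_plan(220, 1))  # (10, (0, 1, 0), 1)
```

Exhaustive search only works for tiny catalogues. Integer programming formulations and heuristics address the same question when the number of instance types, payment options and time slots makes enumeration impractical.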
The most common choices are Integer Programming Problems (IPP) [14, 15] (or Binary Linear Programming problems [7], a particular case of IPP), heuristic techniques [8, 10, 12] and optimization models [9]. In contrast with these solutions, [11, 13] propose two different algorithms to manage the cost-driven provisioning and plan the application's execution.

5.2.6. Type of applications

These methods reduce the cost of executing highly parallel applications, such as Bag-of-Task (BoT) applications [7], scientific workflows [10, 11, 12], MapReduce applications [8] or independent jobs [9, 13, 15] (the latter could be considered a particular case of BoT application where the jobs' requirements are heterogeneous). The interest in these classes of problems is due to the fact that the cloud has proved to be an extremely efficient environment for executing them. On the other hand, [14] minimizes the cost of provisioning different classes of virtual machines (VM). Its authors assume that a VM class corresponds to a concrete type of application (a Web server, for example) and, therefore, the provisioning is planned for the execution of multiple instances of different (and simple) applications.

5.2.7. Applications and execution engines

Another relevant criterion is the real applicability of the optimization method. The method described in this paper and the one proposed by [8] have been integrated into an engine for the execution of applications. In our case, the engine is part of the service-oriented framework described in Section 2. Reference [8] integrates its solution into a service, called Cura, specialized in executing MapReduce applications. Both engines are able to manage the scaling features provided by the cloud system in order to change the combination of resources computed by their optimization method.
For the framework described in this paper, the adaptation of the hired resources to the application's requirements is very important because we do not consider the uncertainty of plans as a part of the optimization method. The rest of the considered approaches focus on combining their provisioning methods with scheduling algorithms, but they are not subsequently integrated into execution tools.

The presented analysis emphasizes the relevant differences between the proposed method for cost-driven provisioning and the existing approaches. As shown in the second column of Table 3, our approach is the only one that considers the cost of data storage and transfer, as well as the different options for provisioning and payment offered by public cloud providers (see the cost of cloud services and the options offered by providers criteria, respectively). The first issue can have a significant impact on the total cost, mainly in the case of data-intensive applications. On the other hand, an adequate choice of the instance types and payment models can also lead to important savings in execution costs. The fact that the proposed provisioning method can be applied in a natural way to different providers can help in obtaining lower costs, since a wider set of resources can be considered. Another relevant contribution is that the method has been integrated into an execution framework. This is important since it demonstrates the applicability of the proposed solution. The hiring of resources is now driven by the cost and the user's deadline. As Table 3 shows, the main limitation of the proposed approach is the lack of a way of dealing with the uncertainty of provisioning plans, which comes from the uncertain performance of different instances of the same instance type.

6. CONCLUSIONS AND FUTURE WORK

We have presented the case study of the migration of a semantic-annotation service to the Amazon cloud. The aim was to offer the service as a pay-per-use application.
We estimated the costs of deploying the service and concluded that the resources hired for solving the computation part of the problem were the most expensive item, much more so than the resources required for the necessary deployment infrastructure. We then concentrated on defining a method for minimizing that cost, considering the time constraints imposed by the client and the variety of resources and facilities offered by public cloud providers (the wide catalog of resources and services, the different ways to hire and pay for them, the possibility of hiring resources in different geographical areas, etc.). Unlike other existing approaches, the proposed method also considers other cost factors involved in the execution of a service request: for instance, the cost of data management (data storage as well as input/output operations) and the cost of the framework components that are needed to control the execution of each service request. On the other hand, from an execution point of view, the service is also able to dynamically manage the life-cycle of the involved computing instances (provision, use and release) and storage resources. To that end, a new extended version of our previous management framework has been developed so as to include cloud-based management and mediation components (more specifically, a mediator for the integration of cloud environments and a component for self-configuring the framework in cloud environments).

The proposed methodology has also been applied to study the cost of executing the semantic annotation process in the Microsoft Azure cloud. This case study aimed to demonstrate that the approach can be used with different cloud service providers. The performance of Azure computing instances has been experimentally estimated from the point of view of the annotation process, and a cost-driven provisioning plan has been computed for some of the cases of Amazon-based processing.
We have concluded that the method is a sufficiently general solution and, therefore, it could be applied to different public providers.

As future work, we are interested in extending and improving monitoring aspects, which could be useful for improving both cost estimations and the provisioning process [3]. We are also interested in integrating into the framework mechanisms for sharing computing resources between requests that are being concurrently executed, as well as auto-scaling mechanisms for monitoring and managing the changing workload of these requests [36, 37]. On the other hand, as stated in Section 5, the framework lacks a way to deal with the uncertainty generated by the different performance of different instances of the same instance type. This is an important aspect to be considered in the near future. The wide variety of usable resources, and how quickly they change, make it worthwhile to implement a monitoring system that compares, at given points in time, the number of tasks actually executed with the number of tasks that should have been executed by then. This way, some additional resources could be provisioned with the aim of meeting the established deadline (or, alternatively, some resources could be released). Finally, we would like to address some open issues closely related to the Amazon instances: new types of computing instances are continuously being added to the Amazon instance catalog, and their performance/cost ratios and purchasing options should be dynamically analyzed and integrated (spot instances [38] or dedicated instances, for instance), in the same way that obsolete types should be removed from the problem formulation. Another line of research must point towards refining the business model of the annotation service from the provider and user perspectives [39].
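The deviation check outlined above as future work can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not part of the implemented framework: the function names are hypothetical, the plan is assumed to progress linearly over time, and every instance is assumed to process the same number of tasks per hour (precisely the homogeneity assumption that real instances may violate, so in practice the rate would be re-estimated from monitoring data).

```python
from math import ceil


def expected_tasks(total_tasks, deadline_hours, elapsed_hours):
    """Tasks that should be finished by now under a linear (uniform) plan."""
    fraction = min(elapsed_hours / deadline_hours, 1.0)
    return total_tasks * fraction


def instances_adjustment(total_tasks, done_tasks, deadline_hours, elapsed_hours,
                         running_instances, tasks_per_instance_hour):
    """Return how many instances to add (> 0) or release (< 0) to meet the deadline."""
    remaining_tasks = total_tasks - done_tasks
    remaining_hours = deadline_hours - elapsed_hours
    if remaining_tasks <= 0:
        return -running_instances  # everything is done: release all instances
    if remaining_hours <= 0:
        raise ValueError("deadline reached with tasks still pending")
    # Instances needed to process the remaining tasks in the remaining time.
    needed = ceil(remaining_tasks / (remaining_hours * tasks_per_instance_hour))
    return needed - running_instances


# Halfway through a 10 h deadline, 300 of 1000 tasks are done on 2 instances.
deviation = 300 - expected_tasks(1000, 10, 5)       # negative: behind the plan
extra = instances_adjustment(1000, 300, 10, 5, 2, 50)
print(deviation, extra)  # -200.0 1
```

A positive result suggests provisioning additional instances, while a negative one suggests that some instances could be released. A real monitor would also need to account for billing granularity and instance start-up latency before acting on the result.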
FUNDING

This work has been supported by the research projects TIN2014-56633-C3-2-R, TIN2015-72241-EXP and TIN2017-84796-C2-2-R, granted by the Spanish Ministerio de Economía y Competitividad.

REFERENCES

1 Hajjat, M., Sun, X., Sung, Y.-W.E., Maltz, D., Rao, S., Sripanidkulchai, K. and Tawarmalani, M. (2010) Cloudward bound: planning for beneficial migration of enterprise applications to the cloud. SIGCOMM Comput. Commun. Rev., 40, 243–254.
2 Tak, B.C., Urgaonkar, B. and Sivasubramaniam, A. (2011) To Move or Not to Move: The Economics of Cloud Computing. Proc. 3rd USENIX Conf. Hot Topics in Cloud Computing, Portland, OR, USA, June 14–15, pp. 5–5. USENIX Association, Berkeley, CA, USA.
3 Truong, H.-L. and Dustdar, S. (2011) Cloud computing for small research groups in computational science and engineering: current status and outlook. Computing, 91, 75–91.
4 De Alfonso, C., Caballer, M., Alvarruiz, F. and Moltó, G. (2013) An economic and energy-aware analysis of the viability of outsourcing cluster computing to a cloud. Future Generation Comput. Syst., 29, 704–712.
5 Kashef, M.M. and Altmann, J. (2012) A Cost Model for Hybrid Clouds. Proc. 8th Int. Conf. Economics of Grids, Clouds, Systems, and Services, Paphos, Cyprus, 5 December, pp. 46–60. Springer, Berlin, Heidelberg.
6 Truong, H.-L. and Dustdar, S. (2010) Composable cost estimation and monitoring for computational applications in cloud computing environments. Procedia Comput. Sci., 1, 2175–2184.
7 Abdi, S., PourKarimi, L., Ahmadi, M. and Zargari, F. (2017) Cost minimization for deadline-constrained bag-of-tasks applications in federated hybrid clouds. Future Generation Comput. Syst., 71, 113–128.
8 Palanisamy, B., Singh, A. and Liu, L. (2015) Cost-effective resource provisioning for mapreduce in a cloud. IEEE Trans. Parallel Distributed Syst., 26, 1265–1279.
9 Li, S., Zhou, Y., Jiao, L., Yan, X., Wang, X. and Lyu, M.R.T. (2015) Towards operational cost minimization in hybrid clouds for dynamic resource provisioning with delay-aware optimization. IEEE Trans. Serv. Comput., 8, 398–409.
10 Pietri, I. and Sakellariou, R. (2015) Cost-efficient CPU provisioning for scientific workflows on clouds. In Altmann, J., Silaghi, G.C. and Rana, O.F. (eds.) Proc. 12th Int. Conf. Economics of Grids, Clouds, Systems, and Services, Cluj-Napoca, Romania, September 15–17, pp. 49–64. Springer, Berlin, Heidelberg.
11 Malawski, M., Juve, G., Deelman, E. and Nabrzyski, J. (2015) Algorithms for cost- and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds. Future Generation Comput. Syst., 48, 1–18.
12 Rodríguez, M.A. and Buyya, R. (2014) Deadline based resource provisioning and scheduling algorithm for scientific workflows on clouds. IEEE Trans. Cloud Comput., 2, 222–235.
13 Trivedi, N. and Chudasama, D. (2013) Dynamic resource provisioning for deadline and budget constrained application in cloud environment. Int. J. Comput. Technol. Appl., 4, 462–465.
14 Chaisiri, S., Lee, B.S. and Niyato, D. (2012) Optimization of resource provisioning cost in cloud computing. IEEE Trans. Serv. Comput., 5, 164–177.
15 Mao, M., Li, J. and Humphrey, M. (2010) Cloud Auto-Scaling with Deadline and Budget Constraints. In Proc. 11th IEEE/ACM Int. Conf. Grid Computing, pp. 41–48.
16 Fabra, J., Hernández, S., Otero, E., Vidal, J.C., Lama, M. and Álvarez, P. (2015) Integration of grid, cluster and cloud resources to semantically annotate a large-sized repository of learning objects. Concurrency Comput., 27, 4603–4629.
17 Fabra, J., Hernández, S., Ezpeleta, J. and Álvarez, P. (2013) Solving the interoperability problem by means of a bus. An experience on the integration of grid, cluster and cloud infrastructures. J. Grid Comput., 12, 41–65.
18 Lama, M., Vidal, J.C., Otero-García, E., Bugarín, A. and Barro, S. (2012) Semantic linking of learning object repositories to DBpedia. Educ. Technol. Soc., 15, 47–61.
19 DBpedia (2017). http://dbpedia.org/. Accessed 29 May 2017.
20 Hernández, S., Fabra, J., Álvarez, P. and Ezpeleta, J. (2013) Cost Evaluation of Migrating a Computation Intensive Problem from Clusters to Cloud. In Proc. 10th Int. Conf. Economics of Grids, Clouds, Systems and Services, Zaragoza, Spain, September 18–20, pp. 90–105. Springer.
21 Álvarez, P., Hernández, S., Fabra, J. and Ezpeleta, J. (2016) Cost Estimation for the Provisioning of Computing Resources to Execute Bag-of-Tasks Applications in the Amazon Cloud. In Altmann, J., Silaghi, G.C. and Rana, O.F. (eds.) Economics of Grids, Clouds, Systems, and Services: 12th International Conference, GECON 2015, Cluj-Napoca, Romania, September 15–17, 2015, Revised Selected Papers. Springer International Publishing, Cham.
22 Amazon Elastic Compute Cloud (Amazon EC2) (2017). http://aws.amazon.com/ec2/. Accessed 29 May 2017.
23 Schwanengel, A. and Hohenstein, U. (2013) Challenges with Tenant-Specific Cost Determination in Multi-tenant Applications. In Proc. Third Int. Conf. Cloud Computing, GRIDs, and Virtualization, Valencia, Spain, 27 May–1 June, pp. 36–42. IARIA, Red Hook, NY, USA.
24 Czyzyk, J., Mesnier, M.P. and Moré, J.J. (1998) The NEOS server. IEEE J. Comput. Sci. Eng., 5, 68–75.
25 Hernández, S., Fabra, J., Álvarez, P. and Ezpeleta, J. (2013) A Reliable and Scalable Service Bus Based on Amazon SQS. In Proc. 2nd Eur. Conf. Service-Oriented and Cloud Computing, Málaga, Spain, 11–13 September, pp. 196–211. Springer, Berlin, Heidelberg.
26 Juve, G., Deelman, E., Vahi, K., Mehta, G., Berriman, G.B., Berman, B.P. and Maechling, P. (2009) Scientific Workflow Applications on Amazon EC2. In Proc. 5th IEEE Int. Conf. e-Science, Oxford, United Kingdom, December 9–11, pp. 59–66. IEEE, Washington, DC, USA.
27 Douglas, G., Drawert, B., Krintz, C. and Wolski, R. (2014) CloudTracker: Using Execution Provenance to Optimize the Cost of Cloud Use. In Altmann, J., Vanmechelen, K. and Rana, O.F. (eds.) Economics of Grids, Clouds, Systems, and Services: 11th International Conference, GECON 2014, Cardiff, UK, September 16–18, 2014, Revised Selected Papers. Springer International Publishing, Cham.
28 Altmann, J. and Kashef, M.M. (2014) Cost model based service placement in federated hybrid clouds. Future Generation Comput. Syst., 41, 79–90.
29 Patel, K., Patel, H. and Patel, N. (2017) Achieving Energy Aware Mechanism in Cloud Computing Environment. In Modi, N., Verma, P. and Trivedi, B. (eds.) Proc. Int. Conf. Communication and Networks: ComNet 2016. Springer Singapore, Singapore.
30 Grygorenko, D., Farokhi, S. and Brandic, I. (2016) Cost-Aware VM Placement Across Distributed DCs Using Bayesian Networks. In Altmann, J., Silaghi, G.C. and Rana, O.F. (eds.) Economics of Grids, Clouds, Systems, and Services: 12th International Conference, GECON 2015, Cluj-Napoca, Romania, September 15–17, 2015, Revised Selected Papers. Springer International Publishing, Cham.
31 Dejun, J., Pierre, G. and Chi, C.-H. (2009) EC2 Performance Analysis for Resource Provisioning of Service-Oriented Applications. In Proc. 2009 Int. Conf. Service-oriented Computing, Stockholm, Sweden, November 23–27, pp. 197–207. Springer-Verlag, Berlin, Heidelberg.
32 Dejun, J., Pierre, G. and Chi, C.-H. (2011) Resource Provisioning of Web Applications in Heterogeneous Clouds. In Proc. 2nd USENIX Conf. Web Application Development, Portland, OR, USA, June 15–16, pp. 5–5. USENIX Association, Berkeley, CA, USA.
33 Schad, J., Dittrich, J. and Quiané-Ruiz, J.-A. (2010) Runtime measurements in the cloud: Observing, analyzing, and reducing variance. Proc. VLDB Endow., 3, 460–471.
34 Zhu, J., Jiang, Z. and Xiao, Z. (2011) Twinkle: A Fast Resource Provisioning Mechanism for Internet Services. In Proc. IEEE INFOCOM 2011, Shanghai, China, April 10–15, pp. 802–810. IEEE, Washington, DC, USA.
35 Wu, L., Garg, S.K. and Buyya, R. (2011) SLA-Based Resource Allocation for Software As a Service Provider (SaaS) in Cloud Computing Environments. In Proc. 2011 11th IEEE/ACM Int. Symp. Cluster, Cloud and Grid Computing, Newport Beach, CA, USA, May 23–26, pp. 195–204. IEEE Computer Society, Washington, DC, USA.
36 Kim, H., el Khamra, Y., Rodero, I., Jha, S. and Parashar, M. (2011) Autonomic management of application workflows on hybrid computing infrastructure. Sci. Programming, 19, 75–89.
37 Mao, M. and Humphrey, M. (2011) Auto-Scaling to Minimize Cost and Meet Application Deadlines in Cloud Workflows. In Proc. 2011 Int. Conf. for High Performance Computing, Networking, Storage and Analysis, Seattle, Washington, 12–18 November, pp. 49:1–49:12. ACM, New York, NY, USA.
38 Son, S. and Sim, K.M. (2012) A price- and-time-slot-negotiation mechanism for cloud service reservations. IEEE Trans. Syst. Man Cybern. Part B, 42, 713–728.
39 Joha, A. and Janssen, M. (2012) Design choices underlying the software as a Service (SaaS) business model from the user perspective: exploring the fourth wave of outsourcing. J. UCS, 18, 1501–1522.

Footnotes

1 http://ampl.com/

© The British Computer Society 2018. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

Journal

The Computer Journal, Oxford University Press

Published: Sep 1, 2018

References