Cyber-Physical Systems (CPSs) are increasingly composed of services and applications deployed across a range of communication topologies, computing platforms, and sensing and actuation devices. These services and applications often form parts of multiple end-to-end cyber-physical flows (i.e., end-to-end task chains) that operate in resource-constrained environments. These systems can be classified as dynamic/open (i.e., comprising varying system workloads) and thus experience fluctuating resource availabilities (e.g., due to resource failures or overloads). In such operating conditions, each service within the end-to-end cyber-physical flows must process events belonging to other services or applications, while providing quality of service (QoS) assurance (e.g., timeliness, reliability, and trustworthiness) within the constraints of limited resources, or with the ability to fail over to providers of last resort (e.g., a public utility in the case of a smart grid).
CPSs have traditionally been designed and implemented using resources procured and maintained in-house. Significant budget constraints are driving researchers and practitioners to consider cost-effective alternatives that still ensure mission- and safety-critical properties. Cloud computing enables the consideration of new factors in the design and operation of CPSs, including offering economic incentives, aggregating and disaggregating behaviors dynamically to reduce risk, consolidating and sharing physical hardware among different applications to reduce power consumption and heat generation, and auto-scaling computing, communication, and even sensing and actuation resources on demand, so that CPSs can use the optimum number of resources without incurring costs when resources are idle.
Despite the promise held by cloud computing, however, supporting the real-time, safety, stability, and reliability requirements of CPSs is hard. First, cloud providers require CPSs to make the right decisions a priori based on a configuration model that best suits their requirements. This decision is hard, however, due to the plethora of choices available; the problem is exacerbated since CPS workloads rarely exhibit consistent load and arrival patterns, implying that no single configuration suffices, or that if it does it must describe system characteristics in excruciating detail. Second, the injection of individual participants' decisions into the overall system behavior requires an unprecedentedly nuanced treatment of both utility to the participants and assurance of the reliability of the system as a whole. Consequently, CPSs require the elasticity and auto-scaling capabilities of cloud platforms as workloads change, but with precise control over cyber-physical properties even as those changes are enacted and evolve.
A number of technical challenges emerge in this context, including:
- Precise auto-scaling of resources with a system-wide focus. Auto-scaling requires CPSs to indicate resource needs a priori so cloud providers can effectively scale the resources up or down, e.g., when load increases, services are provisioned with higher demand on existing resources and potentially given access to new ones, while resources can also be de-provisioned or loaded less heavily when load decreases. Due to the lack of effective mechanisms to predict workload patterns, however, it is hard to inform cloud providers of resource requirements, which means auto-scaling may be complicated or even infeasible in practice. As a result, extra resources could be allocated (which is wasteful) or insufficient resources could be allocated (which will adversely affect system deadlines and response times). Moreover, current state-of-the-art auto-scaling algorithms typically manage one service at a time in isolation. CPSs are often composed of interacting services, so they require auto-scaling algorithms that operate at the level of service groups working together in end-to-end task chains, while ensuring that end-to-end (cyber-physical) QoS requirements are met. Finally, properties such as physical stability and safety may require exceedingly complex analyses (such as reachability of hybrid cyber-physical states) to be evaluated and enforced in practice, which in many cases exceeds the current state-of-the-art.
- Flexible optimization algorithms to balance real-time constraints with cost and other goals. Since CPSs are realized as end-to-end real-time task chains, their deployment on cloud resources must be schedulable on all resources acquired from cloud providers to ensure real-time response times while optimizing desired objective functions, such as minimizing operational costs. These requirements must be met in the context of the enacted auto-scaling algorithms. Due to the different criticalities of task chains that could be deployed on the resources, principled means for co-scheduling or performing admission control and/or eviction of mixed-criticality task sets are also needed. Cloud computing for CPSs offers new ways to think about these issues, from the perspective of trading off the flexibility to commission additional resources at least temporarily with the need to maintain that flexibility by not over-relying on that approach to the point where flexibility is exhausted.
- Improved fault-tolerance fail-over to support real-time requirements. Although some cloud platforms support fault tolerance for provisioned resources, this level of reliability may be insufficient for CPSs where the real-time and fault-tolerance requirements of the end-to-end task chains must be met simultaneously while minimizing costs. Since different cloud providers have diverse models with respect to the flexibility offered to cloud users, a one-size-fits-all solution is impractical. In addition, the complex, large-scale, and potentially stochastic nature of some CPSs means that even reasoning about the consequences of faults and the trajectories of system behaviors resulting from them is an important open area of research.
- Data provisioning and load balancing algorithms that rely on physical properties of computations. CPSs generate load on a cloud computing environment due to physical stimuli, such as traffic, power grid fluctuations, human movement, and changing weather patterns. In order to build the most scalable and high-performance systems, algorithms and techniques are needed to exploit physical characteristics of data and computation to improve the distribution of work in a cloud environment. For example, data may need to be clustered onto nodes based on geographic associations, social network linkages, or other physical world aspects. Understanding the relationship between physical world aspects and cyber optimizations to improve scalability and response time of cloud systems is critical to support CPSs.
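As a rough illustration of the last challenge, clustering data onto nodes by geographic association could be sketched as follows. This is a minimal sketch under stated assumptions, not a proposed solution: the function names, the 0.5-degree grid cell size, the sensor readings, and the least-loaded placement rule are all hypothetical choices made for the example.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical reading format: (sensor_id, latitude, longitude).
Reading = Tuple[str, float, float]


def grid_cell(lat: float, lon: float, cell_deg: float = 0.5) -> Tuple[int, int]:
    """Map a coordinate to a coarse grid cell (cell size is an assumption)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def assign_to_nodes(
    readings: List[Reading], nodes: List[str], cell_deg: float = 0.5
) -> Dict[str, List[str]]:
    """Keep readings from the same grid cell together on one node.

    Each newly seen cell is pinned to the node that currently holds the
    fewest readings, so geographically related data stays co-located
    while load is spread across nodes.
    """
    placement: Dict[str, List[str]] = defaultdict(list)
    cell_to_node: Dict[Tuple[int, int], str] = {}
    for sensor_id, lat, lon in readings:
        cell = grid_cell(lat, lon, cell_deg)
        if cell not in cell_to_node:
            # Ties are broken by the order of `nodes`.
            cell_to_node[cell] = min(nodes, key=lambda n: len(placement[n]))
        placement[cell_to_node[cell]].append(sensor_id)
    return dict(placement)


if __name__ == "__main__":
    sample = [
        ("a", 38.88, -77.11),
        ("b", 38.90, -77.05),
        ("c", 36.14, -86.80),
        ("d", 38.63, -90.20),
    ]
    print(assign_to_nodes(sample, ["n1", "n2"]))
```

In this sketch, readings "a" and "b" fall in the same grid cell and so land on the same node. A realistic system would also need to rebalance cells as physical stimuli shift the load, which this example does not attempt.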
The invitation-only workshop will be held at the Virginia Tech Research Center, 900 N. Glebe Rd., Arlington, VA 22203. The workshop will have four sessions on the first day (Thursday, March 14th, 2013) covering the following topics:
The workshop will also have two sessions on the second day (Friday, March 15th, 2013) that will cover the following topics:
Stability, Safety, Reliability, Security, and Privacy Requirements for Computing Clouds that Support CPS
Programming Models and Paradigms for Computing Clouds that Support CPS
Submission: November 30th, 2012
Notification: December 15th, 2012
Workshop: March 14th-15th, 2013
ISIS, Vanderbilt University
Submit to CFP:
Douglas C. Schmidt, Vanderbilt University
Chris Gill, Washington University
Jules White, Virginia Tech