The resiliency and reliability of critical cyber physical systems like electrical power grids are of paramount importance. These systems are often equipped with specialized protection devices to detect anomalies and isolate faults in order to arrest failure propagation and protect the healthy parts of the system. However, due to the limited situational awareness and hidden failures the protection devices themselves, through their operation (or mis-operation) may cause overloading and the disconnection of parts of an otherwise healthy system. This can result in cascading failures that lead to a blackout. Diagnosis of failures in such systems is extremely challenging because of the need to account for faults in both the physical systems as well as the protection devices, and the failure-effect propagation across the system. Our approach for diagnosing such cyber-physical systems is based on the concept of Temporal Causal Diagrams (TCD-s) that capture the timed discrete models of protection devices and their interactions with a system failure propagation graph. In this paper we present a refinement of the TCD language with a layer of independent local observers that aid in diagnosis. We describe a hierarchical two-tier failure diagnosis approach and showcase the results for 4 different scenarios involving both cyber and physical faults in a standard Western System Coordinating Council (WSCC) 9 bus system.
Despite the known benefits of simulations in the study of mixed energy systems in the context of smart grid, the lack of collaboration facilities between multiple domain experts prevents a holistic analysis of smart grid operations. Current solutions do not provide a unified tool-chain that supports a secure and collaborative platform for not only the modeling and simulation of mixed electrical energy systems, but also the elastic execution of co-simulation experiments. To address above limitations, this paper proposes a design studio that provides an online collaborative platform for modeling and simulation of smart grids with mixed energy resources.
Owing to an immense growth of internet-connected and learning-enabled cyber-physical systems (CPSs) , several new types of attack vectors have emerged. Analyzing security and resilience of these complex CPSs is difficult as it requires evaluating many subsystems and factors in an integrated manner. Integrated simulation of physical systems and communication network can provide an underlying framework for creating a reusable and configurable testbed for such analyses. Using a model-based integration approach and the IEEE High-Level Architecture (HLA)  based distributed simulation software; we have created a testbed for integrated evaluation of large-scale CPS systems. Our tested supports web-based collaborative metamodeling and modeling of CPS system and experiments and a cloud computing environment for executing integrated networked co-simulations. A modular and extensible cyber-attack library enables validating the CPS under a variety of configurable cyber-attacks, such as DDoS and integrity attacks. Hardware-in-the-loop simulation is also supported along with several hardware attacks. Further, a scenario modeling language allows modeling of alternative paths for what-if scenarios. These capabilities make our testbed well suited for analyzing security and resilience of CPS. In addition, the web-based modeling and cloud-hosted execution infrastructure enables one to exercise the entire testbed using simply a web-browser, with integrated live experimental results display.
Unpredictability is one of the top reasons that prevent people from using public transportation. To improve the on-time performance of transit systems, prior work focuses on updating schedule periodically in the long-term and providing arrival delay prediction in real-time. But when no real-time transit and traffic feed is available (e.g., one day ahead), there is a lack of effective contextual prediction mechanism that can give alerts of possible delay to commuters. In this paper, we propose a generic tool-chain that takes standard General Transit Feed Specification (GTFS) transit feeds and contextual information (recurring delay patterns before and after big events in the city and the contextual information such as scheduled events and forecasted weather conditions) as inputs and provides service alerts as output. Particularly, we utilize shared route segment networks and multi-task deep neural networks to solve the data sparsity and generalization issues. Experimental evaluation shows that the proposed toolchain is effective at predicting severe delay with a relatively high recall of 76% and F1 score of 55%.
Transportation management platforms provide communities the ability to integrate the available mobility options and localized transportation demand management policies. A central component of a transportation management platform is the mobility planning application. Given the societal relevance of these platforms, it is necessary to ensure that they operate resiliently. Modularity and extensibility are also critical properties that are required for manageability. Modularity allows to isolate faults easily. Extensibility enables update of policies and integration of new mobility modes or new routing algorithms. However, state of the art mobility planning applications like open trip planner, are monolithic applications, which makes it difficult to scale and modify them dynamically. This paper describes a microservices based modular multi-modal mobility platform Mobilytics, that integrates mobility providers, commuters, and community stakeholders. We describe our requirements, architecture, and discuss the resilience challenges, and how our platform functions properly in presence of failure. Conceivably, the patterns and principles manifested in our system can serve as guidelines for current and future practitioners in this field.
Modeling of HVAC components and energy flows for energy prediction purposes can be computationally expensive in large commercial buildings. More recently, the increased availability of building operational data has made it possible to develop data-driven methods for predicting and reducing energy use for these buildings. In this paper, we present such an approach, where we combine unsupervised and supervised learning algorithms to develop a robust method for energy reduction for large buildings operating under different environmental conditions. We compare our method against other energy prediction models that have been discussed in the literature using (1) a benchmark data set and (2) a real data set obtained from a building on the Vanderbilt University campus. A Stochastic Gradient Descent method is then applied to tune the controlled variable i.e., the AHU discharge temperature set point so that energy consumption is "minimized".
Internet of Things and data sciences are fueling the development of innovative solutions for various applications in Smart and Connected Communities (SCC). These applications provide participants with the capability to exchange not only data but also resources, which raises the concerns of integrity, trust, and above all the need for fair and optimal solutions to the problem of resource allocation. This exchange of information and resources leads to a problem where the stakeholders of the system may have limited trust in each other. Thus, collaboratively reaching consensus on when, how, and who should access certain resources becomes problematic. This paper presents SolidWorx, a blockchain-based platform that provides key mechanisms required for arbitrating resource consumption across different SCC applications in a domain agnostic manner. For example, it introduces and implements a hybrid-solver pattern, where complex optimization computation is handled off-blockchain while solution validation is performed by a smart contract. To ensure correctness, the smart contract of SolidWorx is generated and veriﬁed using a model-based approach.
In the past couple of years, railway infrastructure has been growing more connected, resembling more of a traditional Cyber-Physical System model. Due to the tightly coupled nature between the cyber and physical domains, new attack vectors are emerging that create an avenue for remote hijacking of system components not designed to withstand such attacks. As such, best practice cybersecurity techniques need to be put in place to ensure the safety and resiliency of future railway designs, as well as infrastructure already in the field. However, traditional large-scale experimental evaluation that involves evaluating a large set of variables by running a design of experiments (DOE) may not always be practical and might not provide conclusive results. In addition, to achieve scalable experimentation, the modeling abstractions, simulation configurations, and experiment scenarios must be designed according to the analysis goals of the evaluations. Thus, it is useful to target a set of key operational metrics for evaluation and configure and extend the traditional DOE methods using these metrics. In this work, we present a metrics-driven evaluation approach for evaluating the security and resilience of railway critical infrastructure using a distributed simulation framework. A case study with experiment results is provided that demonstrates the capabilities of our testbed.
In the next coming years, the International Space Station (ISS) plans to launch several small-sat missions powered by lithium-ion battery packs. An extended version of such mission requires dependable, energy dense, and durable power sources as well as system health monitoring. Hence a good health estimation framework to increase mission success is absolutely necessary as the devices are subjected to high demand operating conditions. This paper describes a hierarchical architecture which combines data-driven anomaly detection methods with a fine-grained model-based diagnosis and prognostics architecture. At the core of the architecture is a distributed stack of deep neural network that detects and classifies the data traces from nearby satellites based on prior observations. Any identified anomaly is transmitted to the ground, which then uses model-based diagnosis and prognosis framework to make health state estimation. In parallel, periodically the data traces from the satellites are transported to the ground and analyzed using model-based techniques. This data is then used to train the neural networks, which are run from ground systems and periodically updated. The collaborative architecture enables quick data-driven inference on the satellite and more intensive analysis on the ground where often time and power consumption are not constrained. The current work demonstrates implementation of this architecture through an initial battery data set. In the future we propose to apply this framework to other electric and electronic components on-board the small satellites.
If last decade viewed computational services as a utilitythen surely this decade has transformed computation into a commodity. Computation is now progressively integrated into the physical networks in a seamless way that enables cyber-physical systems (CPS) and the Internet of Things (IoT) meet their latency requirements. Similar to the concept of “platform as a service” or “software as a service”, both cloudlets and fog computing have found their own use cases. Edge devices (that we call end or user devices for disambiguation) play the role of personal computers, dedicated to a user and to a set of correlated applications. In this new scenario, the boundaries between the network node, the sensor, and the actuator are blurring, driven primarily by the computation power of IoT nodes like single board computers and the smartphones. The bigger data generated in this type of networks needs clever, scalable, and possibly decentralized computing solutions that can scale independently as required. Any node can be seen as part of a graph, with the capacity to serve as a computing or network router node, or both. Complex applications can possibly be distributed over this graph or network of nodes to improve the overall performance like the amount of data processed over time. In this paper, we identify this new computing paradigm that we call Social Dispersed Computing, analyzing key themes in it that includes a new outlook on its relation to agent based applications. We architect this new paradigm by providing supportive application examples that include next generation electrical energy distribution networks, next generation mobility services for transportation, and applications for distributed analysis and identification of non-recurring traffic congestion in cities. The paper analyzes the existing computing paradigms (e.g., cloud, fog, edge, mobile edge, social, etc.), solving the ambiguity of their definitions; and analyzes and discusses the relevant foundational software technologies, the remaining challenges, and research opportunities.
Power grids are undergoing major changes due to rapid growth in renewable energy and improvements in battery technology. Prompted by the increasing complexity of power systems, decentralized IoT solutions are emerging, which arrange local communities into transactive microgrids. The core functionality of these solutions is to provide mechanisms for matching producers with consumers while ensuring system safety. However, there are multiple challenges that these solutions still face: privacy, trust, and resilience. The privacy challenge arises because the time series of production and consumption data for each participant is sensitive and may be used to infer personal information. Trust is an issue because a producer or consumer can renege on the promised energy transfer. Providing resilience is challenging due to the possibility of failures in the infrastructure that is required to support these market based solutions. In this paper, we develop a rigorous solution for transactive microgrids that addresses all three challenges by providing an innovative combination of MILP solvers, smart contracts, and publish-subscribe middleware within a framework of a novel distributed application platform, called Resilient Information Architecture Platform for Smart Grid. Towards this purpose, we describe the key architectural concepts, including fault tolerance, and show the trade-off between market efficiency and resource requirements.
Accurately analyzing the sources of performance anomalies in cloud-based applications is a hard problem due both to the multi tenant nature of cloud deployment and changing application workloads. To that end many different resource instrumentation and application performance modeling frameworks have been developed in recent years to help in the effective deployment and resource management decisions. Yet, the significant differences among these frameworks in terms of their APIs, their ability to instrument resources at different levels of granularity, and making sense of the collected information make it extremely hard to effectively use these frameworks. Not addressing these complexities can result in operators providing incompatible and incorrect configurations leading to inaccurate diagnosis of performance issues and hence incorrect resource management. To address these challenges, we present UPSARA, a model-driven generative framework that provides an extensible, lightweight and scalable performance monitoring, analysis and testing framework for cloud-hosted applications. UPSARA helps alleviate the accidental complexities in configuring the right resource monitoring and performance testing strategies for the underlying instrumentation frameworks used. We evaluate the effectiveness of UPSARA in the context of representative use cases highlighting its features and benefits.
Although many provisioning tools are available for deployment and management of composite cloud services to overcome the manual efforts that are tedious and error-prone, users are often required to specify Infrastructure-as-Code (IAC) solutions via low-level scripting. IAC demands domain knowledge for provisioning the services across heterogeneous cloud platforms and incurs a steep learning curve. To address these challenges, we present a technology-and platform-agnostic self-service framework called CloudCAMP. It incorporates domain-specific modeling so that the specifications and dependencies imposed by the cloud platform and application architecture can be specified at an intuitive, higher level of abstraction without the need for domain expertise. CloudCAMP transforms the partial specifications into deployable Infrastructure-as-Code (IAC) using the Transformational-Generative paradigm and by leveraging an extensible and reusable knowledge base. The auto-generated IAC can be handled by existing tools to provision the services components automatically. We validate our approach quantitatively by showing a comparative study of savings in manual and scripting efforts versus using CloudCAMP.
Users of cloud platforms often must expend significant manual efforts in the deployment and orchestration of their services on cloud platforms due primarily to having to deal with the high variabilities in the configuration options for virtualized environment setup and meeting the software dependencies for each service. Despite the emergence of many DevOps cloud automation and orchestration tools, users must still rely on specifying low-level scripting details for service deployment and management. Using these tools required domain expertise along with a steep learning curve. To address these challenges in a tool-and-technology agnostic manner, which helps promote interoperability and portability of services hosted across cloud platforms, we present initial ideas on a GUI based cloud automation and orchestration framework called CloudCAMP. CloudCAMP uses model-driven engineering techniques to provide users with intuitive and higher-level modeling abstractions that preclude the need to specify all the low-level details. CloudCAMP's generative capabilities leverage a built-in knowledge-base to automate the synthesis of Infrastructure-as-Code (IAC) solution that subsequently can be used to deploy and orchestrate services in the cloud. Preliminary results from a small user study are presented in the paper.
Reliable operation of power systems is a primary challenge for the system operators. With the advancement in technology and grid automation, power systems are becoming more vulnerable to cyber-attacks. The main goal of adversaries is to take advantage of these vulnerabilities and destabilize the system. This paper describes a game-theoretic approach to attacker / defender modeling in power systems. In our models, the attacker can strategically identify the subset of substations that maximize damage when compromised. However, the defender can identify the critical subset of substations to protect in order to minimize the damage when an attacker launches a cyber-attack. The algorithms for these models are applied to the standard IEEE-14, 39, and 57 bus examples to identify the critical set of substations given an attacker and a defender budget.
This article presents an overview of the collaborative Transit Hub project between Vanderbilt University, the Nashville Metropolitan Transit Authority (MTA) and Siemens, Corporate Technology. This project commenced as part of the NIST Global Cities Team Challenge (GCTC) . The goal of this project is to leverage technology effectively to improve public engagement with transit operations and increase the overall efficiency of the system. In the process we want to identify key technical challenges that will require new research to advance the state of the art.
As the number of low cost computing devices at the edge of network increases, there are greater opportunities to enable novel, innovative capabilities, especially in decentralized cyber-physical systems. For example, a set of networked, collaborating processors at the edge can be used to dynamically detect traffic densities via image processing and then use those densities to control the traffic flow by coordinating traffic light sequences; in a decentralized architecture. In this paper we describe a testbed and an application framework for such applications. Furthermore, we describe a queuing theory-based model for analyzing and optimizing workload placement across the fog nodes and available cloud resources.