My research is in Big Data Analytics and Cloud Computing Systems (CCS), with a particular focus on Performance Analysis of CCS and Data Security in CCS.
Big Data Analytics
Cloud Computing is a technology aimed at processing and storing very large amounts of data, also referred to as Big Data. One of the most important challenges in Cloud Computing is how to process Big Data. By the end of 2012, the total data generated was 2.8 Zettabytes (ZB) (2.8 trillion Gigabytes). One of the areas that contributes to the analysis of Big Data is Data Science. This new study area, called Big Data Science (BDS), has recently become a very important topic in organizations because of the value it can generate, both for themselves and for their customers. One of the challenges in implementing BDS is the current lack of information to help in understanding this new study area. In response, we developed the DIPAR framework, which proposes a means to implement BDS in organizations and defines its requirements and elements. The framework consists of five stages: Define, Ingest, Preprocess, Analyze, and Report. It is based on the ISO 15939 standard (Systems and software engineering – Measurement process), the purpose of which is to collect, analyze, and report data relating to products to be developed.
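As a rough illustration of how the five stages chain together, the following minimal Python sketch runs a toy data set through Define, Ingest, Preprocess, Analyze, and Report. It is not the DIPAR framework itself: the function names, the sample records, and the `response_ms` field are all hypothetical and chosen only to make the stage order concrete.

```python
from statistics import mean

# Hypothetical stage functions. The names mirror the five DIPAR stages,
# but every body below is a placeholder invented for illustration.

def define():
    # Define: state the question and the measure of interest.
    return {"question": "average response time", "field": "response_ms"}

def ingest():
    # Ingest: collect raw records (hard-coded here instead of a real source).
    return [
        {"response_ms": "120"},
        {"response_ms": "95"},
        {"response_ms": None},  # a missing value to be cleaned later
        {"response_ms": "145"},
    ]

def preprocess(records, field):
    # Preprocess: drop missing values and convert strings to numbers.
    return [float(r[field]) for r in records if r.get(field) is not None]

def analyze(values):
    # Analyze: compute a simple summary statistic.
    return {"count": len(values), "mean": mean(values)}

def report(spec, result):
    # Report: present the result in terms of the original question.
    return f"{spec['question']}: {result['mean']:.1f} over {result['count']} records"

def run_pipeline():
    spec = define()
    raw = ingest()
    clean = preprocess(raw, spec["field"])
    result = analyze(clean)
    return report(spec, result)

if __name__ == "__main__":
    print(run_pipeline())
```

Each stage only consumes the output of the previous one, so in a real setting any stage could be replaced (for example, a different ingestion source) without touching the others.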
Performance Analysis of Cloud Computing Systems
Cloud Computing is defined by ISO and IEC as the paradigm for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable cloud resources accessed through services which can be rapidly provisioned and released with minimal management effort or service provider interaction. Cloud services are categorized into three service models: 1) Infrastructure as a Service (IaaS), 2) Platform as a Service (PaaS), and 3) Software as a Service (SaaS). Each of these service models includes all the technical resources that clouds need to process information, including software, hardware, and network elements. For example, the service model that most relates to the software engineering community is the SaaS model, while the IaaS model is most related to hardware architectures and virtualization. Software engineers focus on software components, and customers use IT provider applications running on a cloud infrastructure to process information according to their processing and storage requirements.
To improve the quality of Cloud Computing services, major providers must be able to offer services with an assured Quality of Service (QoS). QoS can be assured through a Service Level Agreement (SLA) between the cloud service provider (CSP) and the customer. SLAs vary from one CSP to the next, and in some cases different customers can negotiate different contract terms with the same CSP for the same service. One of the main problems in measuring the performance of cloud services is the lack of models that integrate the perspectives of provider, maintainer, and customer. For example, in a typical IaaS such as Amazon EC2, customers can monitor different parameters that show the state of their virtual machine (VM) and provide system performance information such as memory usage, data transmitted, and system load. In this same IaaS, the provider needs to know the performance measurements of all VM instances in order to continually meet the SLA requirements.
In response, we propose the P2M2C-3D model, a three-dimensional Performance Measurement Model for Cloud Computing, which consolidates performance measurement from the perspectives of provider, maintainer and customer for the different types of cloud services.
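The difference between the customer and provider perspectives in the IaaS example above can be sketched in a few lines of Python. This is only an illustration and not the P2M2C-3D model: the `VmSample` type, the threshold values in `SLA_LIMITS`, and the sample data are hypothetical, and the maintainer perspective is left out because the text does not say which measurements it uses.

```python
from dataclasses import dataclass

@dataclass
class VmSample:
    """One hypothetical measurement of a virtual machine."""
    vm_id: str
    memory_used_pct: float
    system_load: float

# Hypothetical SLA thresholds, invented for this example.
SLA_LIMITS = {"memory_used_pct": 90.0, "system_load": 4.0}

# Hypothetical samples covering three VM instances.
SAMPLES = [
    VmSample("vm-1", memory_used_pct=62.0, system_load=1.2),
    VmSample("vm-2", memory_used_pct=93.5, system_load=2.0),
    VmSample("vm-3", memory_used_pct=40.0, system_load=0.7),
]

def customer_view(samples, vm_id):
    """Customer perspective: the state of a single VM."""
    for s in samples:
        if s.vm_id == vm_id:
            return {
                "memory_used_pct": s.memory_used_pct,
                "system_load": s.system_load,
            }
    raise KeyError(vm_id)

def provider_view(samples):
    """Provider perspective: every VM that exceeds any SLA threshold."""
    return [
        s.vm_id
        for s in samples
        if s.memory_used_pct > SLA_LIMITS["memory_used_pct"]
        or s.system_load > SLA_LIMITS["system_load"]
    ]

if __name__ == "__main__":
    print(customer_view(SAMPLES, "vm-1"))
    print(provider_view(SAMPLES))
```

The same measurements feed both functions; only the scope changes, from one VM for the customer to all instances for the provider.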
Data Security in Cloud Computing Systems
In November 2014, an especially chilling cyberattack shook the corporate world. Hackers, having explored the internal servers of Sony Pictures Entertainment, captured internal financial reports, top executives’ embarrassing e-mails, private employee health data, and even unreleased movies and scripts, and dumped them on the open Web. Since companies and other organizations cannot stop every attack and are often reliant on fundamentally insecure networks and technologies, the big question is how to respond to attacks, limit the damage, and adopt smarter defensive strategies in the future. New approaches and new ways of thinking about cybersecurity are beginning to take hold. Organizations are getting better at detecting fraud and other attacks by using big data algorithms to mine information in real time. This allows them to respond far more quickly, using platforms that alert security staff to what is happening and help them take action. New Cloud-based tools are emerging from a blossoming cybersecurity ecosystem, to be used in new technological paradigms such as the Internet of Things (IoT).
Because of the complexity of assuring Cloud Computing Systems (CCS), we have focused our efforts on the assurance of data rather than of the cloud services themselves, since data is ultimately one of the most important assets of users and organizations and, as a consequence, of cloud service providers.
Moreover, in order to set up a Cloud framework that specifically addresses organizations’ information security, it is necessary to adapt and incorporate current data protection, trust, and privacy policies into a comprehensive set of Cloud Computing guidelines that include governance and audit practices. We are therefore developing a data security framework for CCS that defines data security requirements, as well as measurements for defining data security levels in the different cloud service categories, from a software engineering point of view.
Cloud Measure Platform
One of the main issues in Cloud Computing is the lack of information that helps in understanding and defining the concepts of assurance of availability, reliability, and liability in Cloud Computing Systems (CCS). Concepts such as price, performance, time to completion (availability), probability of failure, and liability are key to producing a comparison service, whether to establish Service Level Agreements (SLAs) or to design better mechanisms for improving performance in CCS. This research presents the design of a repository of performance attributes that provides information and tools to facilitate the design, validation, and comparison of performance models and algorithms for CCS. The purpose of this repository is to help establish attribute–performance relationships for specific applications with relatively well-known demands on systems, in order to determine how comparison services may be formulated. The design of the CloudMeasure repository is based on the Performance Measurement Framework for Cloud Computing, which defines the basis for measuring Cloud Computing concepts that are directly related to performance and have been identified from international standards such as ISO 25010.