DIFFERENTIAL@UPO – DATAi – Intelligent Data Analysis

Reducing carbon emissions is now a priority for both the public and private sectors because of their effect on climate change and global warming. One strategy for reducing such emissions is to avoid energy waste and improve the efficiency of energy consumption. According to the International Energy Agency, the energy required by buildings to maintain temperature and lighting conditions represents one third of global energy consumption. The DIFFERENTIAL project aims to reduce this share of energy consumption by applying Big Data analysis to the data generated by sensors located in public buildings.

In particular, the sub-project DIFFERENTIAL@UPO: Massive Data Management, Filtering and Exploratory Analysis focuses on providing techniques and tools for the automatic filtering and exploratory analysis of data generated by electricity consumption sensors, as a necessary first step towards further analysis and exploitation of sensor data. Given the frequency of measurement and the large number of sensors considered, the collected data sets are massive and must be handled within the Big Data paradigm. Moreover, such data often contain (partially) incorrect and unreliable measurements caused by faulty sensor operation.

Data within the Big Data paradigm are characterized by massive volumes, high speed of generation, variety of data and veracity of data (i.e., data affected by biases, noise and abnormalities). Because of these features, managing and analysing such data with traditional methods becomes almost impossible. New techniques are therefore needed to deal with this kind of data, as DIFFERENTIAL@UPO proposes.

Given the nature of these data, knowledge extraction techniques can only be applied once adequate strategies for managing the data efficiently are in place. Tools are also needed for storing, accessing and automatically filtering the data, reducing their dimensionality and discarding faulty measurements, together with strategies for exploratory analysis. DIFFERENTIAL@UPO will address these aspects by applying Soft Computing techniques, in particular Fuzzy Logic and Evolutionary Computation, fields in which the research team has extensive experience. Specifically, we propose to approach data storage and access with new technologies for fuzzy massively parallel databases, in order to deal with the volume, speed of generation and veracity of the data. For data analysis, we plan to use Evolutionary Algorithms (EAs) combined with Fuzzy Logic. EAs are particularly suited to this problem because they have great exploration power and are easy to parallelize. Moreover, the combination with Fuzzy Logic will allow the developed techniques to handle the veracity of the data.
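To make the idea of fuzzy filtering more concrete, the following minimal sketch grades each sensor reading with a degree of plausibility instead of simply accepting or rejecting it. It is an illustration only and not the project's actual method: the function names, the trapezoid breakpoints (in hypothetical kWh units) and the acceptance threshold are all assumptions introduced here.

```python
from typing import List, Tuple


def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear in between.

    Assumes a < b <= c < d.
    """
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)  # rising edge
    return (d - x) / (d - c)      # falling edge


def filter_readings(
    readings: List[float],
    threshold: float = 0.5,
    # Hypothetical breakpoints for a "plausible consumption" fuzzy set.
    a: float = 0.0, b: float = 0.5, c: float = 50.0, d: float = 80.0,
) -> List[Tuple[float, float]]:
    """Keep readings whose plausibility degree is at least `threshold`.

    Returns (reading, degree) pairs so later analysis can still weight
    each measurement by how reliable it is considered to be.
    """
    graded = [(r, trapezoid(r, a, b, c, d)) for r in readings]
    return [(r, mu) for r, mu in graded if mu >= threshold]


if __name__ == "__main__":
    sample = [-3.0, 0.25, 10.0, 65.0, 100.0]
    for reading, degree in filter_readings(sample):
        print(f"{reading:7.2f} kWh  plausibility={degree:.2f}")
```

Keeping the membership degree alongside each retained reading, rather than discarding it after filtering, is one way a later analysis step could account for the veracity of the data.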

Finally, on the technological side, we plan to use High Performance Computing technologies, such as Map-Reduce computing techniques (on a Cloud infrastructure) and GPU computing, so that the techniques developed scale with the volume of data.

DIFFERENTIAL@UPO: Massive Data Management, Filtering and Exploratory Analysis is a research project funded by the Ministerio de Economía, Industria y Competitividad and FEDER under grant TIN2015-64776-C3-2-R.

Publications

J.M. Medina, C.D. Barranco, O. Pons. Indexing techniques to improve the performance of necessity-based fuzzy queries using classical indexing of RDBMS. Fuzzy Sets and Systems.
F. Divina, A. Gilson, F.A. Gomez-Vela, M. García-Torres, J.F. Torres. Stacking Ensemble Learning for Short-Term Electricity Consumption Forecasting. Energies, 11(4), 949, 2018.
J.M. Medina, C.D. Barranco, O. Pons. Indexes for Necessity Queries. Implementation and Performance Evaluation on a Fuzzy Object-Relational Database Management System. Fuzzy-IEEE 2018.
J.M. Medina, C.D. Barranco, O. Pons, D. Sanchez. Building and evaluation of indexes for possibilistic queries on a fuzzy object-relational database management system. Fuzzy-IEEE 2017.
R. Arias-Mitchel, M. García-Torres, C. Schaerer, F. Divina. Feature Selection Using Approximate Multivariate Markov Blankets. HAIS 2016.
F. Gomez-Vela, A. Lopez, J.A. Lagares, D.S. Baena, C.D. Barranco, M. García-Torres, F. Divina. Bioinformatics from a Big Data Perspective: Meeting the Challenge. IWBBIO 2017.
