Handling heterogeneous big data using machine learning. The aim is to propose and design new machine learning algorithms for data ingestion, data preprocessing, big data analysis and analytics, and reporting/visualization, in order to support decision making across different application areas.
Big data is generated daily from various sources: people, organizations and sensors. These huge amounts of data may need to be processed offline (static) or online, i.e. as non-stationary data in real time or near real time. Conventional machine learning algorithms were not designed for these conditions. There is a need to adapt existing statistical and machine learning techniques, or to design novel ones for big data processing (data science), so that insights can be drawn from these huge datasets and data environments in a more efficient and scalable manner. The key objectives are: 1) to design new machine learning algorithms for big data processing; 2) to evaluate the efficiency and scalability of the proposed data analysis techniques against existing ones through data experimentation, simulation and data modelling; 3) to discuss the findings on the proposed big data processing techniques in terms of handling data noise, consistency, missing values and data reliability.
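As an illustration of the offline versus online distinction drawn above, the following is a minimal sketch of a single-pass (streaming) estimator that also tolerates missing values. It is not one of the proposed algorithms: it uses Welford's well-known running mean and variance method as a stand-in, and the names `RunningStats` and `summarize_stream` are hypothetical, chosen only for this example.

```python
import math
from typing import Iterable, Optional


class RunningStats:
    """Welford's online algorithm: mean and variance in one pass, O(1) memory."""

    def __init__(self) -> None:
        self.count = 0      # valid observations seen so far
        self.missing = 0    # observations skipped as missing
        self.mean = 0.0
        self._m2 = 0.0      # running sum of squared deviations from the mean

    def update(self, x: Optional[float]) -> None:
        # Assumption: None and NaN both denote a missing value.
        # Missing values are counted but do not affect the estimates.
        if x is None or (isinstance(x, float) and math.isnan(x)):
            self.missing += 1
            return
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance; undefined for fewer than two valid observations.
        if self.count < 2:
            return float("nan")
        return self._m2 / (self.count - 1)


def summarize_stream(stream: Iterable[Optional[float]]) -> RunningStats:
    """Consume a stream one record at a time, never holding it all in memory."""
    stats = RunningStats()
    for value in stream:
        stats.update(value)
    return stats
```

Because each record is processed once and then discarded, the same code can serve a static file read line by line or a live sensor feed. A non-stationary stream would additionally need some form of forgetting (for example a sliding window or exponential decay), which this sketch deliberately leaves out.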