Data Validation and Visualisation
High-quality data is the precondition for analysing and using big data and for guaranteeing the value of the data. This module introduces the data quality challenges faced by big data. It presents tools and techniques used to ensure data quality, from data collection through to computational procedures that facilitate automatic or semi-automatic identification and elimination of errors in large datasets. The module also introduces the topic of understanding and interpreting data through descriptive statistical methods. This will be achieved through a range of techniques such as statistical metrics, univariate analysis and multivariate analysis. Students will develop the knowledge to assess the quality of the data and the skills necessary to perform appropriate data cleaning operations. In addition, students will gain an understanding of processing data and of interpreting and visualising results.
Machine Learning and Data Modelling
This module covers Machine Learning both conceptually and practically. Students will be introduced to a variety of unsupervised and supervised Machine Learning techniques. Once the core concepts have been introduced, students will be given practical experience of their use, application and evaluation through laboratory exercises and a project. Students will develop an in-depth understanding of the potential and scope of applying and evaluating the different forms of Machine Learning. This will allow them to develop a range of applications, from simple practical implementations to large-scale implementations.
Data Science Foundations
The focus of this module is to present an understanding of key data science concepts, tools and programming techniques. Within the arena of data science, the theory behind the approaches of statistics, modelling and machine learning will be introduced emphasising their importance and application to data analysis. The notion of investigative and research skills will also be introduced through a number of problem solving exercises. Material covered will be contextualised by providing examples of the latest research within the area. Students will also be introduced to programming with Python / R. They will learn the basics of syntax, and how to configure their development environment for implementation and testing of algorithms related to data science.
Business Intelligence
This module aims to contextualise the role of Business Intelligence (BI) and why we need BI systems. A particular focus will be on how to turn already stored data into valuable information and why this is important. Vast amounts of data regarding a company's customers and operations are routinely collected and stored in large corporate data warehouses. This data can be of immense value if properly analysed. Students will explore techniques and tools for data analysis, and for presenting the results to non-technical and managerial staff, in alignment with business strategies. Big Data technologies do support BI; however, they are open to certain ethical and consent issues, along with risks. These will be analysed, reviewed and evaluated.
Big Data and Infrastructure
Within this module a variety of database and data storage paradigms will be explored, ranging from more traditional relational systems to NoSQL and object stores, time series databases, semantic stores and graph stores.
Consideration will be given to big data and the problem of storing and querying high volumes of highly variable data that is stored and processed at high speed. The cloud computing paradigm will also be introduced, along with how to avail of its power and resources.
The core concepts of distributed computing will be examined in the context of Hadoop. Students will be taught, practically and theoretically, about the components of Hadoop, workflows, functional programming concepts, use of MapReduce, Spark, Pig, Hive and Sqoop.
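As an illustration of the MapReduce idea mentioned above, the following is a minimal single-process sketch in plain Python. It does not use Hadoop or any Hadoop API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for this example. It shows the three conceptual stages of a word count: mapping each line to key/value pairs, grouping the pairs by key, and reducing each group to a single value.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three stages in sequence on an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big compute"]))
```

In a real cluster the map and reduce steps would run in parallel on different machines and the shuffle would move data across the network; here everything runs in one process purely to show the shape of the computation.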
Statistical Modelling and Data Mining
This module first provides a systematic understanding of probability and statistics. It then provides an in-depth analysis of the statistical modelling process and how to answer hypothesised questions. Next, the module provides a synthesis of the concepts of data mining and methods of exploring data. The content will be delivered and experienced through lectures, seminars and practical exercises using tools such as Python, R and Weka. Online tools such as Blackboard will be used to facilitate a blended learning approach. On completing this module, students will be able to compute conditional probabilities, use null hypothesis significance testing to test the significance of results, and understand and compute statistical measures such as the p-value for these tests. Students will apply, evaluate and critically appraise this knowledge in a range of complex real-world contexts.
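The two calculations named in the learning outcomes above can be sketched briefly in Python. This is an illustrative example only, not course material: the function names are hypothetical, the die and coin scenarios are assumptions made for the example, and the significance test shown is a one-sided exact binomial test chosen for simplicity. Only the standard library is used.

```python
from fractions import Fraction
from math import comb


def conditional_probability(p_a_and_b, p_b):
    """Return P(A | B) = P(A and B) / P(B)."""
    if p_b == 0:
        raise ValueError("P(B) must be non-zero")
    return p_a_and_b / p_b


def binomial_p_value(successes, trials, p_null=0.5):
    """One-sided exact binomial p-value.

    Probability of observing at least `successes` successes in `trials`
    independent trials if the null hypothesis success rate is `p_null`.
    """
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


if __name__ == "__main__":
    # Fair die: A = "roll is a six", B = "roll is even".
    # P(A and B) = 1/6 and P(B) = 1/2.
    print(conditional_probability(Fraction(1, 6), Fraction(1, 2)))

    # Coin assumed fair under the null hypothesis; 9 heads seen in 10 flips.
    print(binomial_p_value(9, 10))
```

A small p-value in the second calculation indicates that a result at least as extreme as the one observed would be unlikely if the null hypothesis were true; whether it counts as significant depends on the threshold chosen before the test.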
Masters Project (Research)
The aim of the project is to allow the student to demonstrate their ability in undertaking an independent research project for developing theoretical perspectives, addressing research questions using data, or analysing and developing real world solutions. They will be expected to utilise appropriate methodologies and demonstrate the skills gained earlier in the course when implementing the project.
As part of the project development activity, students will be required to extract and demonstrate knowledge from the literature in an analytic manner and to develop ideas and appropriate analytical models. This may involve the collection of primary or secondary data and the qualitative or quantitative analysis of the data. This will typically be followed by a structured analysis of needs for a realistic application or actual organisation, and the identification and application of the tools and techniques required to deliver a well-formed solution. Through the project the student will develop capabilities to analyse case studies related to data science and/or business intelligence, and to create improvement plans and recommendations for their implementation based upon the tools and technologies experienced during the programme of study.
In summary, the masters project represents a piece of work performed by the student under suitable staff supervision, which draws both from the practical and creative nature of a problem-solving project and from the traditional, scholarly exposition of an area of study. The content of the work must be original and contain a critical appraisal of the subject area.