Study-Unit Description


TITLE Statistics for Data Scientists

LEVEL 05 - Postgraduate Modular Diploma or Degree Course


DEPARTMENT Artificial Intelligence

DESCRIPTION The student will be exposed to various techniques for running statistical tests and deriving conclusions about hypotheses made on collected data.

The unit is divided into two parts and covers the following topics:

Theoretical Part

- Introduction to Statistics
- Introduction to Probability
- Descriptive Analysis
- Statistical Inference
- Hypothesis Testing
- Statistical Modelling (Regression)
- Correlation
- Principal Component Analysis
- Visualisation
- Cross Validation
- Markov Chains

Application Part

- R programming
- Obtaining data from different data sources, data cleansing and manipulation
- Data Visualisation
- Publishing data using Shiny Web Applications

Study-unit Aims:

The aims of this study-unit are to:
- help students understand, appreciate and apply the relevant techniques in statistics, the reasoning behind them and when to use them;
- explain how to use a range of modelling and data analytical techniques;
- introduce the data science pipeline and how technologies play a crucial role in this;
- highlight and appreciate the advantages and limitations of different technologies related to data science.

Learning Outcomes:

1. Knowledge & Understanding:
By the end of the study-unit the student will be able to:

- Demonstrate knowledge of basic statistical terms such as population, sample, dependent/independent variables, etc.;
- Analyse a dataset and distinguish between qualitative and quantitative variables;
- Apply a suitable sampling method and choose an appropriate sample size for the experiment;
- Annotate the data appropriately to ensure that it can be shared with other researchers;
- Construct a correct statistical model for the data in order to apply statistical tests;
- Produce informative visualisations based on the data to explain the data or to summarise the results.

2. Skills:
By the end of the study-unit the student will be able to:

- Collect and analyse data, and apply statistical techniques intelligently to infer conclusions;
- Apply data classification techniques to identify key traits;
- Judge the probability of an event occurring based on certain conditions;
- Use data visualisation techniques to create compelling and informative graphics to present the data to a non-technical audience;
- Use the statistical tool R to gather, clean and analyse data.

Main Text/s and any supplementary readings:

- Chambers, J.M. (2008) Software for Data Analysis: Programming with R (Statistics and Computing), New York, Springer-Verlag
- James, G., Witten, D., Hastie, T. and Tibshirani, R. (2013) An Introduction to Statistical Learning: With Applications in R, New York, Springer-Verlag
- Field, A., Miles, J. and Field, Z. (2012) Discovering Statistics Using R, London, SAGE Publications Ltd.

STUDY-UNIT TYPE Lecture and Tutorial

Assessment Component/s    Assessment Due    Resit Availability    Weighting
Project                   SEM2              Yes                   100%


The University makes every effort to ensure that the published Courses Plans, Programmes of Study and Study-Unit information are complete and up-to-date at the time of publication. The University reserves the right to make changes in case errors are detected after publication.
The availability of optional units may be subject to timetabling constraints.
Units not attracting a sufficient number of registrations may be withdrawn without notice.
It should be noted that all the information in the study-unit description above applies to the academic year 2018/9, if the study-unit is available during this academic year, and may be subject to change in subsequent years.