University of Malta
 

Study-Unit Description


CODE CPS3235

 
TITLE Data Science: From Data to Knowledge

 
LEVEL 03 - Years 2, 3, 4 in Modular Undergraduate Course

 
ECTS CREDITS 5

 
DEPARTMENT Computer Science

 
DESCRIPTION This study-unit introduces all the phases of a data science project and aims to make the student appreciate a formal and rigorous approach to the handling of data.

The first phase is data collection: how to gather the raw material for any data project. Practical examples of incomplete and noisy data will be provided. Next, we will clean and store the data. An overview of ubiquitous data formats and different types of storage technologies (e.g. relational, graph and key-value databases) will be given. We will then explore different ways to visualize the collected data and to communicate results.
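
As an illustration of the cleaning step only (not taken from the unit's materials), the sketch below drops incomplete and implausible records from a small CSV extract. The column names, the sample readings and the plausible range of -50 to 60 degrees are assumptions made up for this example.

```python
import csv
import io

# Hypothetical raw extract: one missing value, one non-numeric value,
# and one implausible sensor reading (999).
RAW = """id,temperature_c
1,21.5
2,
3,n/a
4,22.1
5,999
"""


def clean_rows(text, low=-50.0, high=60.0):
    """Keep rows whose temperature parses as a float inside [low, high]."""
    kept = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            value = float(row["temperature_c"])
        except (TypeError, ValueError):
            continue  # missing or non-numeric entry: treat as incomplete
        if low <= value <= high:
            kept.append({"id": int(row["id"]), "temperature_c": value})
    return kept


cleaned = clean_rows(RAW)
```

Silently dropping rows is only one possible policy; depending on the project, imputing or flagging the affected records may be more appropriate.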

The next part of the study-unit focuses on building models with the data at hand. In data modelling we will progress from simple but powerful statistical techniques (e.g. linear regression) to more complex machine learning methods. In the last part we will discuss the challenges of big data (e.g. in a bioinformatics setting) and techniques for coping with the scale of the data.
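
To make the linear regression example concrete, here is a minimal sketch of an ordinary least-squares fit of a straight line, written with the Python standard library only. The function name `fit_line` and the data points are hypothetical; in practice a library such as scikit-learn would normally be used instead.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)

# Use the fitted model to predict the outcome for an unseen input.
prediction = slope * 5.0 + intercept
```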

Throughout the unit the student will be guided using practical, real-world examples.

Study-unit Aims:

"The role of Data Scientist is the sexiest job of the 21st century." [1] The aim of this study-unit is for students to be able to dissect such claims and to determine whether they are supported by data. This unit aims to teach the student how to use data to build predictive models for use in different industries and scientific areas. The student will be trained to meet the growing need for data science and business intelligence roles in industry.

[1] Harv. Bus. Rev. 2012 Oct;90(10):70-6, 128.

Learning Outcomes:

1. Knowledge & Understanding:
By the end of the study-unit the student will be able to:

- Complete a data analysis project from start to finish (all phases: processing, storing, visualization, analysis and modelling);
- Formulate a hypothesis and prove/disprove it based on the evidence (i.e. data);
- Build statistical and machine learning models to predict outcomes using real-life and artificial datasets;
- Appreciate the interdisciplinary nature of the field, which involves statistics, cognitive science, and computer science.

2. Skills:
By the end of the study-unit the student will be able to:

- Use Python and data science libraries (e.g. scikit-learn, matplotlib, pandas);
- Communicate results from a data science project using appropriate visualization techniques;
- Apply statistical tests to determine whether datasets are significantly different from each other;
- Use Hadoop for a Big Data project.
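
As a hedged illustration of testing whether two datasets differ significantly, the sketch below runs a two-sided permutation test on the difference of means. This is one possible test, chosen here because it needs only the standard library; the function name, the sample values and the number of permutations are assumptions made for the example.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test(a, b, n_permutations=5000, seed=0):
    """Estimate a two-sided p-value for the difference in means of a and b."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign the group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up measurements from two hypothetical groups.
group_a = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8]
group_b = [6.1, 6.3, 5.9, 6.2, 6.0, 6.4]

p_different = permutation_test(group_a, group_b)
p_same = permutation_test(group_a, group_a)
```

A small p-value suggests that a difference as large as the observed one would rarely arise if the group labels were interchangeable; comparing a dataset with itself gives a p-value of 1.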

Main Text/s and any supplementary readings:

Main Texts:

- Doing Data Science: Straight Talk from the Frontline (2013). Cathy O'Neil, Rachel Schutt
O'Reilly's take on data science, based on a set of lectures

- Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython (2012), Wes McKinney
Covers some of the Python libraries we will be using in this study unit

Supplementary Material:

- Naked Statistics: Stripping the Dread from the Data (2014) Charles Wheelan
Gives you a solid grasp of statistics

- Data Scientists at Work (2014), Sebastian Gutierrez
Contains a set of interviews with luminaries in the data science field. Useful to learn which technologies are used in industry

- Statistics in Plain English (2010), Timothy C. Urdan
Excellent first textbook for people who want to gain a working knowledge in statistics

- The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2011), Trevor Hastie, Robert Tibshirani, Jerome Friedman
Very popular advanced text. Together with Tom Mitchell's "Machine Learning", it is considered the bible of the field

- The Signal and the Noise: Why So Many Predictions Fail - But Some Don't (2012), Nate Silver
Great read by Nate Silver (famous for his correct US election predictions)

 
STUDY-UNIT TYPE Lecture, Independent Study, Practicum & Tutorial

 
METHOD OF ASSESSMENT
Assessment Component/s            Resit Availability  Weighting
Project (including Presentation)  Yes                 100%

 
LECTURER/S Jean Paul Ebejer

 
The University makes every effort to ensure that the published Courses Plans, Programmes of Study and Study-Unit information are complete and up-to-date at the time of publication. The University reserves the right to make changes in case errors are detected after publication.
The availability of optional units may be subject to timetabling constraints.
Units not attracting a sufficient number of registrations may be withdrawn without notice.
It should be noted that all the information in the study-unit description above applies to the academic year 2017/8, if study-unit is available during this academic year, and may be subject to change in subsequent years.