Study-Unit Description


CODE CIS3107

 
TITLE Advanced Databases: Data Mining and Warehousing

 
UM LEVEL 03 - Years 2, 3, 4 in Modular Undergraduate Course

 
MQF LEVEL 6

 
ECTS CREDITS 6

 
DEPARTMENT Computer Information Systems

 
DESCRIPTION This unit focuses on current research topics in databases and data modelling for the consolidation and presentation of an organisation's data infrastructure.
An organisation that has access to a number of databases is in a position to consolidate these sources so as to provide "a subject oriented, nonvolatile, integrated, time variant collection of data in support of management's decisions" (B. Inmon).
This unit provides the candidate with the knowledge and know-how to build repositories on which data warehousing and data mining exercises can be executed.
Design and implementation techniques in SQL and procedural extensions to SQL are presented.
A substantial part is devoted to query design and optimisation for these massive data repositories.

Study-unit Aims:

The principal aim is to present candidates with techniques for identifying and understanding the underlying databases (and the processes executed over them), moving data from a source to a destination, and then integrating it into a centralised repository. This centralised database needs to adhere to its own set of integrity constraints and must allow data to be traced back to its source.
After consolidation, the physical design has to be tackled too; it typically involves hardware and design techniques (e.g. what, how, when and where to index) that are very different from those of on-line systems.
Both data warehousing and data mining impose a heavy computational load when executed over massive datasets, and therefore each query (or algorithm) needs careful study in its design and optimisation for execution. It has become customary to apply a number of specific techniques to known problems.

Learning Outcomes:

1. Knowledge & Understanding:
By the end of the study-unit the student will be able to:

Recognise the need for, and know how to build, a cross-organisation data infrastructure for a warehouse and data mining exercise;
Evaluate data sources and determine how to extract and move data into a staging area;
Build an organisation-wide data repository for data warehousing and data mining (at the logical and physical levels);
Write complex queries in SQL and SQL procedural extensions;
Explain the difference between building the infrastructure and querying it in terms of computational load;
Explain query processing and optimisation in massive datasets.

2. Skills:
By the end of the study-unit the student will be able to:

Write and implement complex database designs for an enterprise infrastructure in a high-level database language;
Write and implement non-trivial extract, load and transform methods to consolidate the source databases into the infrastructure;
Write SQL commands for roll-up (and cube), top-n, group by, partitions and common table expressions (CTEs);
Write procedures with embedded queries for basic algorithms that extract patterns;
Write code for specific data intensive problems: association rules, rules, clustering;
Write code to implement data mining in time series datasets;
Select, use, and deploy specialised tools for data warehousing and data mining.
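
As an illustration of the kind of query named in the skills above (top-n, group by, partitions and CTE), the following is a minimal sketch only and is not taken from the unit's material. It assumes a hypothetical sales table (region, product, amount), uses Python's built-in sqlite3 module as a stand-in for whichever DBMS is used in the labs, and assumes an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Hypothetical sample data: (region, product, amount). Not from the study-unit.
SAMPLE = [
    ("North", "A", 100.0), ("North", "B", 250.0),
    ("North", "A", 50.0), ("North", "C", 10.0),
    ("South", "A", 75.0), ("South", "C", 300.0), ("South", "B", 20.0),
]


def top_n_per_region(rows, n=2):
    """Return the n best-selling products in each region."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    query = """
        WITH totals AS (              -- CTE 1: group by region and product
            SELECT region, product, SUM(amount) AS total
            FROM sales
            GROUP BY region, product
        ),
        ranked AS (                   -- CTE 2: rank within each region partition
            SELECT region, product, total,
                   ROW_NUMBER() OVER (
                       PARTITION BY region
                       ORDER BY total DESC, product
                   ) AS rn
            FROM totals
        )
        SELECT region, product, total
        FROM ranked
        WHERE rn <= ?                 -- top-n filter
        ORDER BY region, rn
    """
    result = conn.execute(query, (n,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in top_n_per_region(SAMPLE, 2):
        print(row)
```

The two CTEs keep the aggregation and the ranking as separate, readable steps. Roll-up and cube are not shown because SQLite does not provide them; a DBMS that supports GROUP BY ROLLUP would be needed for that part.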

Main Text/s and any supplementary readings:

• Fundamentals of Database Systems, Ramez Elmasri, Shamkant B. Navathe, 6th Edition, 2010, Addison Wesley, ISBN-13: 978-0136086208
• Data Mining: Concepts and Techniques, Jiawei Han, Micheline Kamber, Jian Pei, 3rd Edition, 2011, The Morgan Kaufmann Series in Data Management Systems, ISBN-13: 978-0123814791
• Data Warehouse Design: Modern Principles and Methodologies, Matteo Golfarelli, Stefano Rizzi, 2011, McGraw-Hill Osborne, ISBN-13: 978-0071610391
• A number of research papers are made available.
• System manuals as needed (available in the department's labs).

Note: The Inmon and Kimball books remain good reading for data warehousing.

 
RULES/CONDITIONS Before taking this unit you must take CIS1042

 
STUDY-UNIT TYPE Independent Study, Lecture, Practicum & Seminar

 
METHOD OF ASSESSMENT
Assessment Component/s    Assessment Due    Sept. Asst Session    Weighting
Practical                 SEM2              Yes                   15%
Examination (3 Hours)     SEM2              Yes                   85%

 
LECTURER/S Anthony Spiteri Staines
Joseph Vella (Co-ord.)

 

 
The University makes every effort to ensure that the published Courses Plans, Programmes of Study and Study-Unit information are complete and up-to-date at the time of publication. The University reserves the right to make changes in case errors are detected after publication.
The availability of optional units may be subject to timetabling constraints.
Units not attracting a sufficient number of registrations may be withdrawn without notice.
It should be noted that all the information in the description above applies to study-units available during the academic year 2023/4. It may be subject to change in subsequent years.

https://www.um.edu.mt/course/studyunit