CSPC-515 Data Warehouse and Data Mining

| Teaching Scheme (L) | Teaching Scheme (T) | Teaching Scheme (P) | Credit | Marks: Internal Assessment | Marks: End Semester Examination | Marks: Total | Duration of End Semester Examination |
|---|---|---|---|---|---|---|---|
| 3 | 1 | 0 | 4 | Maximum Marks: 40 | Maximum Marks: 60 | 100 | 3 Hours |
| | | | | Minimum Marks: 16 | Minimum Marks: 24 | 40 | |
Unit-I
Data Warehouse: Introduction to Data Warehouse, Difference between Operational Database Systems and Data Warehouses, Data Warehouse Characteristics, Data Warehouse Architecture and its Components, Extraction-Transformation-Loading, Logical (Multi-Dimensional) Data Modeling, Schema Design, Star and Snowflake Schema, Fact Constellation, Fact Table, Fully Additive, Semi-Additive and Non-Additive Measures; Factless Facts, Dimension Table Characteristics; OLAP Cube, OLAP Operations, OLAP Server Architecture: ROLAP, MOLAP and HOLAP.
Unit-II
Data Mining: Fundamentals of data mining, Data Mining Functionalities, KDD, Data Mining process, Integration of a Data Mining System with a Database or Data Warehouse System, Major issues in Data Mining.
Data Pre-processing: Need for Data Pre-processing, Steps in data pre-processing: Data Cleaning, Data Integration and Transformation, Data Reduction Techniques, Data Discretization and Concept Hierarchy Generation.
Unit-III
Association Rules: Problem Definition, Frequent Item Set Generation, Association Rule Generation, Apriori Algorithm, The Partition Algorithm.
Classification: Problem Definition, General Approaches to Solving a Classification Problem, Evaluation of Classifiers, Classification Techniques: Decision Trees, Naive Bayes Classifier, Bayesian Belief Networks, K-Nearest Neighbor Classification, Algorithm Evaluation Metrics.
Unit-IV
Clustering: Overview of Clustering, Categorization of Major Clustering Methods, Partitioning Methods, Hierarchical Methods.
Advanced Topics and Applications: Web Mining: Types of Web Mining, Web Mining Software. Text Mining: Definition and Importance, Applications (Search Engines, Sentiment Analysis, Spam Filtering).
Real-world Applications: Business Intelligence, Healthcare Analytics, Cybersecurity Threat Detection.