Data Science Process: Roles and stages in a data science project, Working with files and databases, Exploring and managing data; Big Data: Types, Characteristics, Tools and Applications; Data Analytics: Types, Tools and Applications; Data and Relations: Data Sets - Data Scales - Set and Matrix Representations - Relations - Similarity Measures - Dissimilarity Measures - Sequence Relations - Sampling and Quantization.
Data Preprocessing: Error Types - Error Handling - Filtering - Data Transformation - Data Merging; Data Visualization: Diagrams - Principal Component Analysis - Multidimensional Scaling - Auto-Associator - Histograms - Spectral Analysis.
Correlation: Linear Correlation - Causality - Chi-Square Tests; Regression: Linear Regression - Robust Regression - RBF Networks - Cross-Validation - Feature Selection.
Classification: Classification Criteria - Naive Bayes Classifier - Rule-Based Classification - Classification by Backpropagation - Support Vector Machines - Decision Trees - Lazy Learners - Model Evaluation and Selection - Techniques to Improve Classification Accuracy.
Clustering: Cluster Partitions - Sequential - Prototype-Based - Fuzzy - Relational - Cluster Tendency Assessment - Cluster Validity - Self-Organizing Maps; Case Study: Advertising on the Web.
Reference Books:
1 Dean J, “Big Data, Data Mining and Machine Learning”, Wiley, 2014.
2 Provost F and Fawcett T, “Data Science for Business”, O’Reilly Media Inc, 2013.
3 Janert PK, “Data Analysis with Open Source Tools”, O’Reilly Media Inc, 2011.
Textbook:
Runkler TA, “Data Analytics: Models and Algorithms for Intelligent Data Analysis”, Springer, Third Edition, 2020.