Course Details

Course Information
Semester | Course Unit Code | Course Unit Title | T+P+L | Credit | Number of ECTS Credits | Last Updated Date
1 | ECE544 | TEXT MINING | 3+0+0 | 3 | 7.5 | 14.05.2025

 
Course Details
Language of Instruction English
Level of Course Unit Master's Degree
Department / Program ELECTRICAL AND COMPUTER ENGINEERING
Type of Program Formal Education
Type of Course Unit Elective
Course Delivery Method Face To Face
Objectives of the Course (1) Recall the basics of probability concepts
(2) Learn the fundamental notions of data & text mining
(3) Discuss essential NLP approaches for free-text data structures
(4) Apply machine learning algorithms for text mining problems
Course Content Intro. to Data & Text Mining, Why Text Mining, Issues and difficulties in Text Mining
Basic text processing commands in Unix-like operating systems & regular expressions
Recap basic probabilities, n-gram language models, perplexity, and smoothing techniques
n-gram model interpolation and backoff, Naïve Bayes algorithm for text classification
Introduction to named entity recognition, information extraction
Conditional vs generative models, maximum entropy models for named entity recognition
Part-of-speech tagging using maxent models, relation extraction (supervised, distant supervision)
Intro. to parsing, PCFGs, CNF, CKY algorithm & issues with PCFGs, Lexicalized PCFG
Dependency parsing, arc-eager parser, Malt parser, relation extraction through dependency structure
Lexical semantics, synonymy/homonymy/polysemy, word sense disambiguation
Word similarity, term-document matrices, tf-idf weighting, vector space model
Intro. to open-source text mining libraries (NLTK and spaCy, in Python), building a prediction model
Course Methods and Techniques - In-person course instruction
- Additional asynchronous resource sharing
- Development of a course project
Prerequisites and co-requisites None
Course Coordinator Asst. Prof. Dr. Mehmet Gökhan Bakal gokhan.bakal@agu.edu.tr
Name of Lecturers Asst. Prof. Dr. Mehmet Gökhan Bakal gokhan.bakal@agu.edu.tr
Assistants None
Work Placement(s) No

Recommended or Required Reading
Resources Relevant Video Resources
Course Notes - Shared topic-related slides and other articles.
- Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Second Edition, by Daniel Jurafsky and James H. Martin.
Documents https://web.stanford.edu/~jurafsky/slp3/ed3book.pdf
Assignments https://drive.google.com/drive/u/0/folders/1wMx0x2M5I18jelH3UtCRpUmbvlLTigXQ

Course Category
Mathematics and Basic Sciences 30%
Engineering 60%
Engineering Design 0%
Social Sciences 0%
Education 0%
Science 10%
Health 0%
Field 0%

Planned Learning Activities and Teaching Methods
Activities are given in detail in the section of "Assessment Methods and Criteria" and "Workload Calculation"

Assessment Methods and Criteria
In-Term Studies Quantity Percentage
Quiz 3 15%
Homework 3 40%
Project/Drawing 1 40%
Other (Internship, etc.) 1 5%
Total 8 100%

 
ECTS Allocated Based on Student Workload
Activities Quantity Duration Total Work Load
Debate 5 1 5
F2F Lecture 14 3 42
Homework 3 2 6
In-Class Activity 3 1 3
Preparation for Presentation 1 10 10
Preparation for Submission 1 5 5
Presentation 1 3 3
Project 1 40 40
Quiz 3 1 3
Reading 1 3 3
Report 1 20 20
Research 1 5 5
Self-Study 1 10 10
Software Experience 3 2 6
Out-of-Class Study 2 2 4
Face-to-Face Lecture 14 3 42
Asynchronous Lecture 7 2 14
Total Work Load 221
Number of ECTS Credits 7.5

 
Course Learning Outcomes: Upon the successful completion of this course, students will be able to:
No Learning Outcomes
1 Express foundations of data mining and text mining concepts
2 Discuss computational text mining tasks including document classification & clustering, sentiment analysis, document summarization, and information extraction
3 Apply text processing experiments such as tokenization, named entity recognition, part-of-speech tagging for text classification
4 Perform sentence level mining including parsing, different parser approaches, lexical semantics and other methodologies
5 Use open-source NLP libraries (NLTK, spaCy, etc.) for building an NLP system to solve a real-world problem and to present the designed model

 
Weekly Detailed Course Contents
Week Topics Study Materials Materials
1 Intro. to Data & Text Mining, Why Text Mining, Issues and difficulties in Text Mining
2 Basic text processing commands in Unix-like operating systems & regular expressions
3 Introduction to probabilities, n-gram language models, perplexity, smoothing techniques
4 n-gram model interpolation and backoff, Naïve Bayes algorithm for text classification
5 Introduction to named entity recognition, relation extraction, and other forms of information extraction
6 Conditional vs generative models, maximum entropy models for named entity recognition
7 Part-of-speech tagging using maxent models
8 Introduction to parsing, PCFGs, CNF, CKY algorithm
9 Dependency parsing, arc-eager parser, Malt parser, relation extraction through dependency structure
10 Lexical semantics, synonymy/homonymy/polysemy, word sense disambiguation
11 Word similarity, term-document matrices, tf-idf weighting, vector space model
12 Intro. to open-source text mining libraries (NLTK and spaCy, in Python) for text mining tasks
13 Overall Semester Recap
14 Final Project Presentations

 
Contribution of Learning Outcomes to Programme Outcomes
P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11
All 3 4 4 5 5 4 4 1 2 2 3
C1 3 4 5 5 5 3 3 1 2 2 3
C2 3 5 5 5 5 5 4 1 2 2 4
C3 2 3 3 3 4 3 4 1 2 2 2
C4 3 4 4 5 5 4 3 1 2 2 2
C5 5 4 5 5 5 4 5 1 4 2 3

Contribution: 1: Very Slight 2: Slight 3: Moderate 4: Significant 5: Very Significant

  
  https://sis.agu.edu.tr/oibs/bologna/progCourseDetails.aspx?curCourse=77735&lang=en