Turkish University Department Data (2019-2024)
Details and Documentation
Comprehensive Documentation
The detailed usage guide, API reference, and development notes for this project are available in the README file on GitHub.
About the Project
A cleaned, standardized dataset collected from YOK Atlas and OSYM.
Goal (Short)
This dataset combines Turkish university department data from 2019-2024 into a single source and reconciles naming and spelling differences so that every year follows one consistent structure. That makes research, application development, and quick table-based analysis much easier. For 2025, only the program lists are available so far; score and other statistical fields are intentionally left empty. The dataset is published on both GitHub and Kaggle, and the Kaggle page includes notebooks that walk through a detailed analysis.
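To make the idea of unifying naming and spelling differences concrete, here is a minimal sketch of one plausible cleanup step: collapsing whitespace and uppercasing with Turkish dotted/dotless "i" handling, which Python's plain `str.upper()` does not do. The helper name `normalize_name` and the sample strings are hypothetical; this is not the project's actual cleaning logic, which is documented in the README.

```python
import unicodedata

# Turkish casing: "i" uppercases to "İ" and "ı" to "I".
# str.upper() alone would turn "i" into "I", so map these two first.
_TR_UPPER = str.maketrans({"i": "İ", "ı": "I"})


def normalize_name(name: str) -> str:
    """Return a canonical uppercase form of a department name (illustrative only)."""
    text = unicodedata.normalize("NFC", name)  # compose accented characters consistently
    text = " ".join(text.split())              # trim and collapse repeated whitespace
    return text.translate(_TR_UPPER).upper()


# Two spellings of the same (made-up) raw value map to one key.
variants = ["  bilgisayar   mühendisliği ", "Bilgisayar Mühendisliği"]
canonical = {normalize_name(v) for v in variants}
print(canonical)
```

A canonical key like this could then be mapped to a single display name through a lookup table, so the original spellings never need to be edited in place.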
Summary Stats
- 128,352 rows / 32,505 programs (program_code)
- 235 universities / 733 department names
- Data is used on sinavizcisi.com
Kaggle Performance
The dataset received strong feedback on Kaggle and was used as a reference dataset in other projects, which was very meaningful for me. In total, it reached 3,500+ views and 600+ downloads.
Kaggle Notebooks
- [EDA] Exploring Turkish University Admissions (2019-2024): Yearly quota and enrollment trends, most competitive departments, field-based popularity shifts, and public vs private university comparisons.
- YOK Atlas Dataset - Consistency and Data Quality Analysis: Quota-enrollment consistency, gender anomalies, missing data patterns, and detailed analysis of calculation differences.
Technical Details
Data Model (Brief)
- Normalized core: departments_normalized.csv, department_stats.csv
- Lookup/bridge: department_names, faculty_names, score_types, universities_normalized, department_tags, etc.
- Fast EDA: data/all_in_one_denormalized.csv
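As a rough illustration of how the normalized core and a lookup table could be recombined, the sketch below joins tiny in-memory stand-ins for departments_normalized.csv, department_names, and department_stats.csv. The key column `department_name_id` and all values are assumptions made for this example; the real schema may differ and is described in the README.

```python
import pandas as pd

# In-memory stand-ins for the CSV files; every value here is made up.
departments = pd.DataFrame({
    "program_code": [101, 102],
    "department_name_id": [1, 2],  # hypothetical foreign key into the lookup
})
department_names = pd.DataFrame({
    "department_name_id": [1, 2],
    "department_name": ["Computer Engineering", "Law"],
})
stats = pd.DataFrame({
    "program_code": [101, 101, 102],
    "year": [2023, 2024, 2024],
    "total_quota": [60, 70, 100],
    "total_enrolled": [60, 68, 100],
})

# Resolve the lookup key to a readable name, then attach the per-year stats.
# `validate` makes pandas raise if the key relationships are not as expected.
wide = (
    departments
    .merge(department_names, on="department_name_id", how="left", validate="many_to_one")
    .merge(stats, on="program_code", how="left", validate="one_to_many")
    .drop(columns="department_name_id")
)
print(wide)
```

For quick exploration the denormalized file avoids these joins entirely; the normalized tables are the better fit when storage size or update consistency matters.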
ETL Steps
- remove_2025_from_departments.py → filters out the 2025 rows
- process_raw_data.py → builds the normalized tables
- build_all_in_one_denormalized.py → builds the single denormalized table
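The first two steps above can be sketched on a toy table as follows. The function names `drop_year` and `split_normalized`, the column layout, and the sample rows are all hypothetical; the actual scripts in the repository may work quite differently.

```python
import pandas as pd


def drop_year(raw: pd.DataFrame, year: int = 2025) -> pd.DataFrame:
    """Keep every row except those belonging to the given year."""
    return raw.loc[raw["year"] != year].reset_index(drop=True)


def split_normalized(raw: pd.DataFrame) -> tuple[pd.DataFrame, pd.DataFrame]:
    """Split a flat table into one row per program and one row per program-year."""
    departments = (
        raw[["program_code", "department_name"]]
        .drop_duplicates()
        .reset_index(drop=True)
    )
    stats = raw[["program_code", "year", "total_quota", "total_enrolled"]].reset_index(drop=True)
    return departments, stats


# Made-up raw rows; the 2025 row has no statistics yet.
raw = pd.DataFrame({
    "program_code": [101, 101, 101],
    "department_name": ["Computer Engineering"] * 3,
    "year": [2023, 2024, 2025],
    "total_quota": [60, 70, None],
    "total_enrolled": [60, 68, None],
})

filtered = drop_year(raw)
departments, stats = split_normalized(filtered)
print(departments)
print(stats)
```

The third step would then join these tables back into one wide table for analysis.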
Example Usage
import pandas as pd

# Quick filtering with the denormalized file
eda = pd.read_csv('data/all_in_one_denormalized.csv')

# 2024 Computer Engineering programs at foundation ('vakif') universities in Istanbul
q = (
    (eda['year'] == 2024) &
    (eda['city'] == 'ISTANBUL') &
    (eda['university_type'] == 'vakif') &
    (eda['department_name'] == 'Computer Engineering')
)
print(eda.loc[q, ['university_name', 'scholarship_type', 'total_quota', 'total_enrolled']])
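Beyond filtering, the same columns lend themselves to simple aggregates. The sketch below computes a fill rate (enrolled divided by quota) per year and university type on made-up rows, so it runs without the CSV. The `'devlet'` label for state universities is an assumption; only `'vakif'` appears in the example above.

```python
import pandas as pd

# Made-up rows that reuse the column names from the filtering example.
eda = pd.DataFrame({
    "year": [2023, 2023, 2024, 2024],
    "university_type": ["devlet", "vakif", "devlet", "vakif"],  # 'devlet' is assumed
    "total_quota": [100, 50, 120, 80],
    "total_enrolled": [100, 40, 114, 60],
})

# Sum quota and enrollment per group, then derive the share of seats filled.
totals = (
    eda.groupby(["year", "university_type"], as_index=False)[["total_quota", "total_enrolled"]]
    .sum()
)
totals["fill_rate"] = totals["total_enrolled"] / totals["total_quota"]
print(totals)
```

Summing before dividing weights each program by its quota, which is usually preferable to averaging per-program ratios when program sizes vary widely.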
Other Projects
Take a look at the other projects I built
Sinavizcisi
A web platform where I analyze YKS placement data and university reviews with AI and present them to students.
YokAPI
A data layer that normalizes YOK Atlas data and serves it through a single API. It standardizes YOK Atlas's scattered, …
EBA Score Bot
A desktop bot with a GUI that automates earning points on EBA.