Current Bachelor Thesis Topics
Bachelor Topics SS 2026
1. Text Mining and Machine Learning
Supervisors: Daniil Dobriy, Axel Polleres
LLMs have made impressive progress not only in answering questions directly, but also in formulating corresponding queries to access structured databases or knowledge graphs via structured query languages like SQL and SPARQL.
In this thesis you shall explore the boundaries: which questions are TOO HARD for LLMs to answer and to formulate as a query, even if the answer can be found in a structured database or knowledge graph?
Sources:
As an entry point, with a summary of older benchmarks on this problem, called KGQA (knowledge-graph-based question answering), I recommend the following MSc thesis polleres.net/supervised_theses/Gerhard_Klager_MSc_2024.pdf and the corresponding paper: Gerhard Georg Klager and Axel Polleres. Is GPT fit for KGQA? -- Preliminary results. In Proceedings of the International Workshop on Knowledge Graph Generation from Text (Text2KG2023), co-located with the Extended Semantic Web Conference 2023 (ESWC 2023), May 2023. [ .pdf ]
The recent progress on this topic, also leveraging LLMs with agentic capabilities, is illustrated in our recent technical report: https://research.wu.ac.at/de/publications/agentic-sparql-evaluating-sparql-mcp-poweredintelligent-agents-o/
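Since such a thesis will have to score LLM answers against gold answers from KGQA benchmarks, some evaluation harness is needed. Below is a minimal sketch of one; the lenient normalization and the toy data are illustrative assumptions, not part of any existing benchmark tooling.

```python
# Hypothetical sketch: scoring an LLM's KGQA answers against gold answers.

def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace for lenient exact matching."""
    return " ".join(answer.lower().strip().split())

def exact_match_accuracy(predictions: list[str], gold: list[set[str]]) -> float:
    """Fraction of questions where the prediction matches any gold answer."""
    hits = 0
    for pred, answers in zip(predictions, gold):
        if normalize(pred) in {normalize(a) for a in answers}:
            hits += 1
    return hits / len(predictions) if predictions else 0.0

# Toy benchmark: two questions, the second answered wrongly.
preds = ["Vienna", "1989"]
gold = [{"Vienna", "Wien"}, {"1991"}]
print(exact_match_accuracy(preds, gold))  # → 0.5
```

Real benchmarks use more elaborate metrics (e.g., set-based F1 over multi-valued answers), but the comparison loop has this shape.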
2. Automating topic trend analyses for a particular research field
Supervisors: Daniil Dobriy, Axel Polleres
This topic is about automating a reproducibility study:
In 2020 we published a study about Semantic Web research progress in the past decade:
Sabrina Kirrane, Marta Sabou, Javier D. Fernández, Francesco Osborne, Cécile Robin, Paul Buitelaar, Enrico Motta, and Axel Polleres. A decade of semantic web research through the lenses of a mixed methods approach. Semantic Web -- Interoperability, Usability, Applicability (SWJ), 11(6):979--1005, October 2020. [ DOI | http ]
One of the challenges in this study was to obtain fulltexts from Semantic Web conferences and venues and to analyse them at scale with (back then) state-of-the-art methods for topic analysis, clustering, and other mixed methods and tools.
The goal of this thesis is to redo this study 5 years later and to devise a pipeline for such analyses across different fields, conferences, and time frames. AI use is *allowed* in this exercise (vibe coding, text summarization via prompting, etc.), but the main goal is to compare the results with the original paper and approach, document your tool usage, and come up with a reusable pipeline.
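One building block of such a pipeline can be sketched as follows: counting keyword frequencies per publication year to trace topic trends over time. The corpus and keyword list below are invented for illustration; a real reimplementation of the study would use proper topic modeling and clustering.

```python
# Illustrative sketch: per-year keyword frequencies as a crude topic-trend signal.
from collections import Counter, defaultdict

def topic_trends(papers, keywords):
    """papers: iterable of (year, text); returns {keyword: {year: count}}."""
    trends = defaultdict(Counter)
    for year, text in papers:
        tokens = text.lower().split()
        for kw in keywords:
            trends[kw][year] += tokens.count(kw)
    return trends

# Invented two-paper corpus, one per time frame.
papers = [
    (2020, "ontology matching and ontology alignment"),
    (2025, "llm agents query the ontology via sparql"),
]
t = topic_trends(papers, ["ontology", "llm"])
print(t["ontology"][2020], t["llm"][2025])  # → 2 1
```

A reusable pipeline would wrap steps like this (harvesting, preprocessing, trend extraction, visualization) so they can be re-run for other venues and time frames.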
3. Do LLMs produce factual statements that violate knowledge-graph integrity constraints?
Supervisors: Miguel Vazquez, Axel Polleres
Background
Large Language Models (LLMs) often generate plausible factual statements, but these statements may be inconsistent with curated, structured knowledge. Knowledge graphs provide an explicit representation of facts together with integrity constraints (e.g., type restrictions, single-value requirements) that help maintain data quality and consistency. This thesis investigates the mismatch between LLM-generated factual claims and explicit knowledge-graph constraints, using Wikidata as the reference knowledge base.
The work is well-aligned with research on trustworthy AI and data quality, and fits the broader Semantic Web / Knowledge Graph ecosystem at WU Wien.
Research question: When an LLM produces factual statements about entities, to what extent do these statements violate explicit integrity constraints defined in Wikidata?
Three initial references
Aidan Hogan et al. Knowledge Graphs. ACM Computing Surveys, 54(4), 2021. (Preprint: arXiv:2003.02320)
Denny Vrandečić and Markus Krötzsch. Wikidata: a free collaborative knowledgebase. Communications of the ACM, 2014. (DOI: 10.1145/2629489)
Nicolas Ferranti, Jairo Francisco de Souza, Shqiponja Ahmetaj, and Axel Polleres. Formalizing and validating Wikidata’s property constraints using SHACL and SPARQL. Semantic Web Journal, 2024 (in press).
Keywords
LLMs, factuality, hallucination, knowledge graphs, integrity constraints, Wikidata, SHACL, SPARQL, evaluation pipeline, benchmarking
Expected prior knowledge (courses / skills)
Python programming and data handling
Basic understanding of knowledge graphs
Introductory familiarity with NLP/LLMs (prompting, using an API or local inference)
Basic experimental methodology (metrics, ablations, reproducibility)
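To make the research question concrete, the sketch below checks LLM-extracted triples against a single-value constraint (e.g., at most one date of birth per person). The property names and constraint set are hypothetical placeholders; actual Wikidata property constraints would be expressed and validated in SHACL/SPARQL, as in the Ferranti et al. reference above.

```python
# Illustrative constraint check, not Wikidata's actual tooling.
from collections import defaultdict

SINGLE_VALUE_PROPS = {"date_of_birth"}  # assumed single-value constraint set

def single_value_violations(triples):
    """triples: (subject, property, value); returns (subject, property)
    pairs that carry more than one distinct value for a single-value property."""
    seen = defaultdict(set)
    for s, p, v in triples:
        if p in SINGLE_VALUE_PROPS:
            seen[(s, p)].add(v)
    return {sp for sp, vals in seen.items() if len(vals) > 1}

llm_triples = [
    ("Ada_Lovelace", "date_of_birth", "1815-12-10"),
    ("Ada_Lovelace", "date_of_birth", "1816-12-10"),  # hallucinated duplicate
    ("Ada_Lovelace", "occupation", "mathematician"),
]
print(single_value_violations(llm_triples))
# → {('Ada_Lovelace', 'date_of_birth')}
```

An evaluation pipeline would extract such triples from LLM outputs, then count violations per constraint type as the main metric.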
4. Benchmarks and Knowledge Graph Embeddings
Supervisor: Diego Rincon-Yanez
Background
Knowledge graph embeddings (KGEs) aim to represent entities and relations of a knowledge graph as low-dimensional continuous vectors that preserve the graph’s relational semantics and topology. Formally, given triples (h,r,t), embedding models learn parameterized scoring functions that assign higher plausibility to observed triples than to corrupted ones, thereby enabling tasks such as link prediction and knowledge base completion. Classical approaches rely on geometric or factorization-based formulations that model relations as transformations in latent space, while more recent methods leverage graph neural networks to aggregate multi-hop neighborhood information and capture higher-order dependencies. Despite substantial progress in expressiveness and scalability, open research challenges remain in modeling complex relation patterns, handling temporal and multimodal knowledge, and ensuring robustness and interpretability of learned representations across large, evolving knowledge graphs.
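The scoring-function idea described above can be sketched in the TransE style, where a triple (h, r, t) is plausible if h + r lies close to t in the embedding space. The vectors below are toy values, not trained embeddings.

```python
# Minimal TransE-style scoring function on toy vectors.
import math

def transe_score(h, r, t):
    """Negative L2 distance ||h + r - t||; higher means more plausible."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

# Toy embeddings in which the observed triple fits better than a corrupted one.
vienna, capital_of, austria = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
germany = [3.0, 2.0]

observed = transe_score(vienna, capital_of, austria)
corrupted = transe_score(vienna, capital_of, germany)
print(observed > corrupted)  # → True
```

Training would adjust the vectors so that observed triples outscore corrupted ones, which is exactly the ranking behavior that link-prediction benchmarks then measure.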
Research Problem
Standard benchmarks and datasets used for evaluation are largely derived from cleaned, static subsets of large knowledge bases, which poorly reflect the structural heterogeneity, schema richness, and noise characteristics of real-world RDF knowledge graphs. These benchmarks typically emphasize link prediction on simplified triple distributions and fixed train–test splits, encouraging models to exploit dataset-specific regularities rather than learn representations that generalize to the actual knowledge, its evolution, or schema-constrained data. Consequently, high performance on current benchmarks may not translate directly into performance on downstream tasks over practical RDF knowledge graphs. This highlights the need for improved evaluation datasets and protocols that capture ontology semantics, realistic incompleteness patterns, or the streaming updates characteristic of operational RDF knowledge graphs.
Initial References
Hogan, A., Blomqvist, E., Cochez, M., D’Amato, C., Melo, G. De, Gutierrez, C., Kirrane, S., Gayo, J. E. L., Navigli, R., Neumaier, S., Ngomo, A. C. N., Polleres, A., Rashid, S. M., Rula, A., Schmelzeisen, L., Sequeda, J., Staab, S., & Zimmermann, A. (2022). Knowledge graphs. ACM Computing Surveys, 54(4), 1–37. doi.org/10.1145/3447772
Rossi, A., Barbosa, D., Firmani, D., Matinata, A., & Merialdo, P. (2021). Knowledge graph embedding for link prediction: A comparative analysis. ACM Transactions on Knowledge Discovery from Data, 15(2). doi.org/10.1145/3424672
Ji, S., Pan, S., Cambria, E., Marttinen, P., & Yu, P. S. (2021). A Survey on Knowledge Graphs: Representation, Acquisition, and Applications. IEEE Transactions on Neural Networks and Learning Systems, 1, 1–21. https://doi.org/10.1109/TNNLS.2021.3070843
Keywords
Knowledge Graph, Vector Representation, Embeddings, Knowledge Representation
Prior Knowledge
Python Language Skills
Graph Analytics
Concepts of Model Training
5. Identifying Heat Days in Austria: A Comparison of Percentile and Threshold Approaches and Their Impact on Mortality
Supervisors: Hannah Schuster, Elmar Kiesling
Keywords: Heat, Mortality, Climate Change
Background and Motivation
Periods of extreme heat are becoming increasingly relevant in Austria and are associated with negative impacts on human health. However, identifying heat days is not straightforward, particularly in a country with diverse topography such as Austria, where alpine regions and lowland areas differ substantially in their climatic conditions.
Different methodological approaches are commonly used to define heat days. Some studies rely on fixed temperature thresholds (e.g., days exceeding 30°C), while others use percentile-based methods that define heat relative to local climate conditions. The choice of definition may influence how many heat days are identified and how strongly heat exposure appears to be associated with mortality.
This thesis aims to compare these two approaches and examine how methodological choices affect the analysis of heat-related mortality in Austria.
Research Question
How do percentile-based and fixed-threshold definitions of heat days differ in Austria, and how does the choice of definition influence the estimated relationship between heat exposure and weekly mortality rates?
Objectives
Method Implementation & Data Preparation
Preprocess meteorological data for the selected study region.
Implement a fixed temperature threshold method to identify heat days.
Implement a percentile-based method to identify heat days.
Compare the number and temporal distribution of heat days identified by each method.
Heat–Mortality Analysis
Preprocess weekly mortality data.
Analyze the relationship between identified heat days and weekly mortality rates.
Compare how the two definitions influence statistical estimates of mortality.
Explore whether combining both methods provides additional explanatory insights.
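The two heat-day definitions to be compared can be sketched as follows. The 30 °C threshold comes from the example above; the 90th percentile and the nearest-rank percentile computation are assumptions for illustration.

```python
# Sketch of the two heat-day definitions on toy daily maximum temperatures (°C).

def fixed_threshold_heat_days(tmax, threshold=30.0):
    """Indices of days whose maximum temperature exceeds a fixed threshold."""
    return [i for i, t in enumerate(tmax) if t > threshold]

def percentile_heat_days(tmax, q=0.9):
    """Indices of days above the q-th empirical (nearest-rank) percentile."""
    cutoff = sorted(tmax)[int(q * (len(tmax) - 1))]
    return [i for i, t in enumerate(tmax) if t > cutoff]

# An alpine series never crosses 30 °C, yet still has relatively hot days.
alpine = [18.0, 21.0, 19.0, 25.0, 22.0, 20.0, 24.0, 26.0, 23.0, 19.0]
print(fixed_threshold_heat_days(alpine))   # → []
print(percentile_heat_days(alpine))        # → [7]
```

This toy case already shows the core methodological tension: the fixed threshold finds no heat days in alpine regions, while the percentile method flags locally extreme days.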
References:
Moshammer, H., Jury, M., Hutter, H.-P., & Wallner, P. (2026). Determinants of Spatial Variation in Vulnerability to Extreme Temperatures in Austria from 1970 to 2020. Climate, 14(1), 16. https://doi.org/10.3390/cli14010016
Ebi, K. L., Capon, A., Berry, P., Broderick, C., de Dear, R., Havenith, G., et al. (2021). Hot weather and heat extremes: Health risks. The Lancet, 398(10301), 698–708. https://doi.org/10.1016/S0140-6736(21)01208-3
Schuster, H., Polleres, A., Anjomshoaa, A. et al. Heat, health, and habitats: analyzing the intersecting risks of climate and demographic shifts in Austrian districts. Sci Rep 15, 22812 (2025). https://doi.org/10.1038/s41598-025-05676-9
Robinson, P. J., 2001: On the Definition of a Heat Wave. J. Appl. Meteor. Climatol., 40, 762–775, https://doi.org/10.1175/1520-0450(2001)040<0762:OTDOAH>2.0.CO;2.
Kent, S. T., McClure, L. A., Zaitchik, B. F., Smith, T. T., & Gohlke, J. M. (2014). Heat waves and health outcomes in Alabama (USA): the importance of heat wave definition. Environmental health perspectives, 122(2), 151–158. https://pmc.ncbi.nlm.nih.gov/articles/PMC3914868/
6. Empirical Evidence on Policy Recommendations with an exemplary Use-Case of the Swiss city Geneva
Supervisors: Jennifer-Marieclaire Sturlese, Marta Sabou
Abstract:
European cities face similar issues when it comes to globalization: space is limited and jobs are centralized in city hubs. One outcome is strained city transportation and congestion, with peak-load overcrowding and network inefficiencies. E-government initiatives provide open data on several aspects of city development; one of these covers public transport in the city of Geneva (tpg.ch). With this, data-based analyses can be conducted and the resulting empirical evidence can be used to propose policy recommendations.
In this bachelor thesis, you will adopt an analytical perspective similar to a policy consultant: By conducting exploratory data analysis (EDA) and applying machine learning techniques to the provided dataset, your thesis aims to generate evidence-based findings that may serve as a basis for data-driven policy recommendations to the city.
Your thesis is structured into two parts: a literature review on open data initiatives of e-government in Europe (= theoretical part); and an explanation of the research design using appropriate tools (EDA with Python), together with the application of this methodology involving data mining, preprocessing, cluster normalization, visualization, and result analysis, including two policy recommendations (= empirical part).
In order to receive this topic, you are required to demonstrate knowledge in exploratory data analysis with Python and ML packages (please state your experience in the motivation letter) and to have completed K-5 of your specialization (Knowledge Management and Data Science are both welcome; the latter is preferred).
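As a small illustration of the preprocessing expected in the empirical part, the snippet below min-max normalizes stop-level boarding counts before clustering. The stop names and numbers are invented stand-ins for the tpg.ch open data.

```python
# Illustrative preprocessing step: min-max normalization before clustering.

def min_max_normalize(values):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Invented daily boarding counts per stop.
boardings = {"Bel-Air": 1200, "Cornavin": 3000, "Plainpalais": 1800}
norm = dict(zip(boardings, min_max_normalize(list(boardings.values()))))
print(norm["Cornavin"], norm["Bel-Air"])  # → 1.0 0.0
```

Normalization like this keeps high-volume hubs from dominating distance-based clustering of stops or lines.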
Keywords:
open data, machine learning, city public transport
Initial References:
tpg.ch Open Data on Public Transport of the City of Geneva.
Keng Siau and Yuan Long. Factors impacting e-government development. Journal of Computer Information Systems, 50(1): 98–107, 2009.
Rachel Silcock. What is e-government? Parliamentary Affairs, 54(1):88–101, 2001.
Christian Von Haldenwang. Electronic government (e-government) and development. The European Journal of Development Research, 16:417–432, 2004.
7. Machine Learning Applications on Police Research Data
Supervisors: Jennifer-Marieclaire Sturlese, Marta Sabou
Abstract:
Most police data is confidential due to the sensitive nature of the information it contains, such as personal identities, criminal investigations, and national security matters. As a result, researchers face significant challenges in accessing police data for analysis, making it difficult to fully explore machine learning applications in this field. This creates a gap between the potential benefits of data-driven approaches and the restrictions imposed by privacy (Koops, Hoepman & Leenes, 2013).
This bachelor thesis addresses this problem by exploring the diverse data sources used in police applications of machine learning, offering insights into the emerging trends driving this field, which becomes ever more relevant amid the current upsurge of cybersecurity matters. To achieve this, a bibliometric analysis up to the year 2025 is conducted to explain patterns in data sources, investigating the different means to access viable datasets related to cybersecurity threats and corresponding measures. The data for this study stems from open-access databases; no confidential police sources are used.
The paper is structured into two parts: a review of prior research on data science applications in the policing domain (= theoretical part); an explanation of the research design using appropriate tools; the application of this methodology involving data mining, preprocessing, cluster normalization, visualization, and result analysis to explore machine learning within the policing context (= empirical part).
In order to receive this topic, you are required to demonstrate knowledge in exploratory data analysis with Python and ML packages (please state your experience in the motivation letter) and to have completed K-5 of your specialization (Knowledge Management and Data Science are both welcome; the latter is preferred).
Keywords:
police data, machine learning, bibliometric analysis
Initial References:
Palumbo, R., Fakhar Manesh, M., & Petrolo, D. (2024). What makes work smart in the public sector? Insights from a bibliometric analysis and interpretive literature review. Public Management Review, 26(6), 1449-1474.
Wu, J., Liu, T., Mu, K. et al. Identification and causal analysis of predatory open access journals based on interpretable machine learning. Scientometrics 129, 2131–2158 (2024).
Pastor-Galindo, J., Nespoli, P., Mármol, F. G., & Pérez, G. M. (2020). The not yet exploited goldmine of OSINT: Opportunities, open challenges and future trends. IEEE access, 8, 10282-10304.
Koops, B. J., Hoepman, J. H., & Leenes, R. (2013). Open-source intelligence and privacy by design. Computer Law & Security Review, 29(6), 676-688.
8. Agentic AI in Cybersecurity Analytics – Opportunities and Risks
Supervisor: Elmar Kiesling
Background
Large Language Models are seeing widespread adoption in cybersecurity applications for both offensive [3] and defensive [5] purposes [8]. Whereas their potential for malicious use has raised widespread concern, their vast potential to improve security is also widely recognized. This has sparked research interest into how AI can automate, support and/or scale tasks such as log analysis [1], vulnerability assessment and mitigation [4, 7], threat detection [2], or attack graph construction [9].
More recently, cybersecurity has also emerged as a particularly promising application domain for agentic AI, where teams of autonomous AI agents are envisioned to solve complex tasks in distributed workflows. Although research is at a very early stage, initial results suggest that multi-agent workflows where agents use tools, perform multi-step reasoning, and ultimately support rapid decision making under pressure have strong potential to improve cybersecurity [6].
Research Problem
In this Bachelor thesis, you will critically assess the risks and potential benefits of automating and/or supporting cybersecurity risk analysis by means of agentic workflows. To this end, you will define an assessment framework (set of criteria, risk and benefit categories etc.), collect cybersecurity analysis workflows (e.g., vulnerability assessment, risk assessment, penetration testing, intrusion detection, threat hunting, impact assessment, root cause analysis, incident response etc.) and assess the potential benefits and risks in each of these scenarios.
Your thesis may address research questions such as:
Which cybersecurity analysis tasks can be automated and/or supported by means of agentic workflows?
What specific architectures, coordination mechanisms etc. have been proposed in this domain?
Which patterns of agentic workflows are emerging in the cybersecurity domain?
What risks does the use of agentic AI in cybersecurity entail?
What characteristics of agentic AI (autonomy, adaptability, complex problem solving, specialization, distribution etc.) are relevant in different cybersecurity applications? What are their implications?
This topic will benefit from strong interest and/or experience in the cybersecurity domain. It does not require in-depth technical expertise in implementing agentic AI approaches, but is aimed at investigating their potential and implications for cybersecurity analysis applications on a conceptual level.
Initial References
Matteo Boffa, Idilio Drago, Marco Mellia, Luca Vassio, Danilo Giordano, Rodolfo Valentim, and Zied Ben Houidi. LogPrécis: Unleashing language models for automated malicious log analysis. Précis: a concise summary of essential points, statements, or facts. Computers & Security, 141:103805, 2024.
Yiren Chen, Mengjiao Cui, Ding Wang, Yiyang Cao, Peian Yang, Bo Jiang, Zhigang Lu, and Baoxu Liu. A survey of large language models for cyber threat detection. Computers & Security, 145:104016, 2024.
Eider Iturbe, Oscar Llorente-Vazquez, Angel Rego, Erkuden Rios, and Nerea Toledo. Unleashing offensive artificial intelligence: Automated attack technique code generation. Computers & Security, 147:104077, 2024.
Abdechakour Mechri, Mohamed Amine Ferrag, and Merouane Debbah. SecureQwen: Leveraging LLMs for vulnerability detection in Python codebases. Computers & Security, 148:104151, 2025.
Shuang Tian, Tao Zhang, Jiqiang Liu, Jiacheng Wang, Xuangou Wu, Xiaoqiang Zhu, Ruichen Zhang, Weiting Zhang, Zhenhui Yuan, Shiwen Mao, et al. Exploring the role of large language models in cybersecurity: A systematic survey. arXiv preprint arXiv:2504.15622, 2025.
Vaishali Vinay. The evolution of agentic AI in cybersecurity: From single LLM reasoners to multi-agent systems and autonomous pipelines. arXiv preprint arXiv:2512.06659, 2025.
Xiaoqing Wang, Yuanjing Tian, Keman Huang, and Bin Liang. Practically implementing an LLM-supported collaborative vulnerability remediation process: A team-based approach. Computers & Security, 148:104113, 2025.
Jie Zhang, Haoyu Bu, Hui Wen, Yongji Liu, Haiqiang Fei, Rongrong Xi, Lun Li, Yun Yang, Hongsong Zhu, and Dan Meng. When LLMs meet cybersecurity: A systematic literature review. Cybersecurity, 8(1):55, 2025.
Yongheng Zhang, Tingwen Du, Yunshan Ma, Xiang Wang, Yi Xie, Guozheng Yang, Yuliang Lu, and Ee-Chien Chang. AttacKG+: Boosting attack graph construction with large language models. Computers & Security, 150:104220, 2025.
9. Implementing an Agentic Workflow for Cybersecurity Analytics
Supervisor: Elmar Kiesling
Background
Large Language Models are seeing widespread adoption in cybersecurity applications for both offensive [3] and defensive [5] purposes [8]. Whereas their potential for malicious use has raised widespread concern, their vast potential to improve security is also widely recognized. This has sparked research interest into how AI can automate, support and/or scale tasks such as log analysis [1], vulnerability assessment and mitigation [4, 7], threat detection [2], or attack graph construction [9].
More recently, cybersecurity has also emerged as a particularly promising application domain for agentic AI, where teams of autonomous AI agents are envisioned to solve complex tasks in distributed workflows. Although research is at a very early stage, initial results suggest that multi-agent workflows where agents use tools, perform multi-step reasoning, and ultimately support rapid decision making under pressure have strong potential to improve cybersecurity [6].
Research Problem
In this Bachelor thesis, you will conduct an initial survey of promising opportunities for agentic AI to support analytic tasks in cybersecurity (e.g., vulnerability assessment, risk assessment, penetration testing, intrusion detection, threat hunting, impact assessment, root cause analysis, incident response, etc.). Based on this initial survey, the thesis will focus on a selected task (or a small subset of tasks) and a scenario for experimentation. It will document the implementation of an agentic workflow and evaluate and compare how agentic AI can support the selected analytic task.
Your thesis may address research questions such as:
What are key criteria when selecting an agentic framework for cybersecurity analysis?
Along which dimensions can available frameworks for cybersecurity analysis be compared?
What are key challenges in the implementation of agentic workflows for cybersecurity analysis?
How does the implemented agentic workflow compare to a manual process? What are specific benefits, risks, and challenges?
This topic is aimed at experimenting with agentic frameworks and implementing agentic workflow(s) in a selected cybersecurity scenario, which requires an interest in developing the necessary skills to work with such frameworks. Interest and experience in the cybersecurity domain is beneficial.
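To give a flavour of what implementing an agentic workflow means at the smallest possible scale, the toy sketch below replaces the LLM planner with a rule-based stub that dispatches tools over log lines and collects findings. Every name here is an invented placeholder; a real implementation would use an agentic framework and actual model calls.

```python
# Deliberately minimal, non-LLM stand-in for an agentic analysis workflow:
# a "planner" picks tools per observation, the loop aggregates findings.
import re

def tool_extract_ips(log_line):
    """Tool: pull IPv4-like strings out of a log line."""
    return re.findall(r"\b(?:\d{1,3}\.){3}\d{1,3}\b", log_line)

def tool_flag_failed_login(log_line):
    """Tool: flag SSH authentication failures."""
    return ["failed-login"] if "Failed password" in log_line else []

TOOLS = {"ips": tool_extract_ips, "auth": tool_flag_failed_login}

def planner(log_line):
    """Stub for the agent's reasoning step: choose tools for this observation."""
    return ["ips", "auth"] if "sshd" in log_line else ["ips"]

def run_workflow(log_lines):
    findings = []
    for line in log_lines:
        for tool_name in planner(line):
            findings.extend(TOOLS[tool_name](line))
    return findings

logs = ["sshd[231]: Failed password for root from 203.0.113.7"]
print(run_workflow(logs))  # → ['203.0.113.7', 'failed-login']
```

In an actual agentic framework, the planner is an LLM choosing among tool descriptions, and the loop includes memory, multi-step reasoning, and agent-to-agent handoffs; the thesis would compare such a workflow against this kind of manual/rule-based baseline.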
Initial References
Matteo Boffa, Idilio Drago, Marco Mellia, Luca Vassio, Danilo Giordano, Rodolfo Valentim, and Zied Ben Houidi. LogPrécis: Unleashing language models for automated malicious log analysis. Précis: a concise summary of essential points, statements, or facts. Computers & Security, 141:103805, 2024.
Yiren Chen, Mengjiao Cui, Ding Wang, Yiyang Cao, Peian Yang, Bo Jiang, Zhigang Lu, and Baoxu Liu. A survey of large language models for cyber threat detection. Computers & Security, 145:104016, 2024.
Eider Iturbe, Oscar Llorente-Vazquez, Angel Rego, Erkuden Rios, and Nerea Toledo. Unleashing offensive artificial intelligence: Automated attack technique code generation. Computers & Security, 147:104077, 2024.
Abdechakour Mechri, Mohamed Amine Ferrag, and Merouane Debbah. SecureQwen: Leveraging LLMs for vulnerability detection in Python codebases. Computers & Security, 148:104151, 2025.
Shuang Tian, Tao Zhang, Jiqiang Liu, Jiacheng Wang, Xuangou Wu, Xiaoqiang Zhu, Ruichen Zhang, Weiting Zhang, Zhenhui Yuan, Shiwen Mao, et al. Exploring the role of large language models in cybersecurity: A systematic survey. arXiv preprint arXiv:2504.15622, 2025.
Vaishali Vinay. The evolution of agentic AI in cybersecurity: From single LLM reasoners to multi-agent systems and autonomous pipelines. arXiv preprint arXiv:2512.06659, 2025.
Xiaoqing Wang, Yuanjing Tian, Keman Huang, and Bin Liang. Practically implementing an LLM-supported collaborative vulnerability remediation process: A team-based approach. Computers & Security, 148:104113, 2025.
Jie Zhang, Haoyu Bu, Hui Wen, Yongji Liu, Haiqiang Fei, Rongrong Xi, Lun Li, Yun Yang, Hongsong Zhu, and Dan Meng. When LLMs meet cybersecurity: A systematic literature review. Cybersecurity, 8(1):55, 2025.
Yongheng Zhang, Tingwen Du, Yunshan Ma, Xiang Wang, Yi Xie, Guozheng Yang, Yuliang Lu, and Ee-Chien Chang. AttacKG+: Boosting attack graph construction with large language models. Computers & Security, 150:104220, 2025.
10. Text Mining and Machine Learning
Supervisor: Johann Mitlöhner
Text mining aims to turn written natural language into structured data that allow for various types of analysis which are hard or impossible on the text itself; machine learning aims to automate the process using a variety of adaptive methods, such as artificial neural nets which learn from training data. Typical goals of text mining are classification, sentiment detection, and other types of information extraction, e.g. named entity recognition (identifying people, places, organizations) and relation extraction (e.g. locations of organizations).
Connectionist methods, and deep learning in particular, have attracted much attention and success recently; these methods tend to work well on large training datasets, which in turn require ample computing power. Our institute can provide access to high-performance GPU units for student use in thesis projects. It is recommended to use a framework such as PyTorch or TensorFlow/Keras for developing your deep learning application; the changes required to go from CPU to GPU computing will be minimal. This means that you can start developing on your PC with a small subset of the training data; when you later transition to the GPU server, the added performance will make larger datasets feasible.
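To illustrate the learn-from-training-data idea at the smallest possible scale, here is a toy perceptron over bag-of-words features. An actual thesis would use PyTorch or TensorFlow/Keras as recommended above; the training data here is invented.

```python
# Toy text classifier: a perceptron over bag-of-words features.
from collections import defaultdict

def featurize(text):
    """Bag-of-words features: the set of lowercased tokens."""
    return set(text.lower().split())

def train_perceptron(examples, epochs=10):
    """examples: (text, label) pairs with label +1/-1; returns word weights."""
    w = defaultdict(float)
    for _ in range(epochs):
        for text, label in examples:
            score = sum(w[tok] for tok in featurize(text))
            if score * label <= 0:                # misclassified: update weights
                for tok in featurize(text):
                    w[tok] += label
    return w

train = [("great product love it", 1), ("terrible waste of money", -1)]
w = train_perceptron(train)
pred = sum(w[t] for t in featurize("love this great phone"))
print(pred > 0)  # → True
```

Deep learning replaces these sparse word weights with learned dense representations and nonlinear layers, but the train/predict loop has the same shape.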
On text mining e.g.: Minqing Hu, Bing Liu: Mining and summarizing customer reviews. KDD '04: Proceedings of the tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 168-177, ACM, 2004
For a more recent work and overview e.g.: Percha B. Modern Clinical Text Mining: A Guide and Review. Annu Rev Biomed Data Sci. 2021 Jul 20;4:165-187. doi: 10.1146/annurev-biodatasci-030421-030931. Epub 2021 May 26. PMID: 34465177.
Datasets can be found e.g. at huggingface and kaggle.
keywords: artificial neural networks, machine learning, text mining
11. Visualizing Data in Virtual and Augmented Reality
Supervisor: Johann Mitlöhner
How can AR and VR be used to improve the exploration of data? Developing new methods for exploring and analyzing data in virtual and augmented reality presents many opportunities and challenges, both in terms of software development and design inspiration. There are various hardware options, from Google Cardboard to more sophisticated and expensive devices such as the Quest, among many others. Taking part in this challenge demands programming skills as well as creativity. The student will develop a basic VR or AR application for exploring a specific type of (open) data. The use of a platform-independent kit such as A-Frame is essential, as the application will be compared in a small user study to a non-VR version in order to identify advantages and disadvantages of the implemented visualization method. Details will be discussed with the supervisor.
Some References:
Butcher, Peter WS, and Panagiotis D. Ritsos. "Building Immersive Data Visualizations for the Web." Proceedings of International Conference on Cyberworlds (CW'17), Chester, UK. 2017.
Teo, Theophilus, et al. "Data fragment: Virtual reality for viewing and querying large image sets." Virtual Reality (VR), 2017 IEEE. IEEE, 2017.
Millais, Patrick, Simon L. Jones, and Ryan Kelly. "Exploring Data in Virtual Reality: Comparisons with 2D Data Visualizations." Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems. ACM, 2018.
Yu Shu, Yen-Zhang Huang, Shu-Hsuan Chang, and Mu-Yen Chen (2019). Do virtual reality head-mounted displays make a difference? a comparison of presence and self-efficacy between head-mounted displays and desktop computer-facilitated virtual environments. Virtual Reality, 23(4):437-446.
Korkut, E. H., and Surer, E. (2023). Visualization in virtual reality: a systematic review. Virtual Reality, 27(2), 1447-1480.
keywords: virtual reality, augmented reality, data visualization, data exploration
12. Testing Algorithms for Digital Democracy in Simulations
Supervisors: Hanna Kern, Jan Maly
Keywords: Computational Social Choice, Fairness, Democracy, Voting, Simulations in Python
Context:
A growing number of novel, digital participation processes allow citizens to express their opinions and directly influence policy decisions on a wide range of topics, from the design of a new park in Vienna [1] to the shape of the new constitution in Chile [2] and Iceland [3]. One major challenge in such digital democracy processes is the fair representation of minority opinions. Computer scientists have, in recent years, developed novel tools and algorithms that can be used to make the different forms of digital participation fairer and more representative.
In this thesis, we will focus on a setting that has not received much attention so far, namely elections where voters can express which candidates or opinions they approve of and which they disapprove of - a model that captures, in particular, many large-scale deliberation processes hosted on platforms like Pol.is. Kraiczy et al. (2025) [4] recently explored two settings with disapprovals and suggested fairness notions and voting rules. In this thesis, we will reflect more on the settings in which voters express their disapprovals and further investigate, using simulations on real-world and synthetic data, our intuition regarding fairness in practice.
Problem:
Research into voting with approvals and disapprovals is very new and so far purely theoretical; there is therefore a lack of practical studies and simulations.
Goal/expected results of the thesis:
This thesis will experimentally investigate how fair the outcomes of the different voting rules in this setting are and how much they are affected by small changes in the voting instance.
Research Questions:
How fair are the outcomes produced by the voting rules proposed in [4]?
How much does our intuition regarding fairness change when there are small changes to a voting instance?
How does the inclusion of disapprovals change a voter’s satisfaction with the outcome, and how does this affect our intuition regarding fairness?
Methodology:
Get familiar with the setting and the intuition behind the proposed voting rules in [4]
Reflect on where you agree and disagree with the modelling done in [4]
Develop Python scripts to run the simulations on different real-world and synthetic data-sets.
Evaluate the results of the simulations.
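One possible evaluation step in these simulations could look like the following sketch, where a voter gains one point per approved winner and loses one per disapproved winner. This satisfaction measure is an assumption for illustration, not the fairness notion from [4].

```python
# Illustrative satisfaction measure for approval/disapproval voting.

def satisfaction(approved, disapproved, winners):
    """+1 per approved winner, -1 per disapproved winner."""
    return len(approved & winners) - len(disapproved & winners)

def mean_satisfaction(voters, winners):
    """voters: list of (approved_set, disapproved_set) pairs."""
    scores = [satisfaction(a, d, winners) for a, d in voters]
    return sum(scores) / len(scores)

# Invented instance: three voters, outcome {a, b}.
voters = [
    ({"a", "b"}, {"c"}),
    ({"a"}, {"b"}),
    ({"c"}, {"a"}),
]
print(mean_satisfaction(voters, winners={"a", "b"}))
```

A simulation script would compute such measures for the outcomes of each voting rule, over many real-world and synthetic instances, and then compare distributions across rules and across small perturbations of the instance.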
Required Skills:
Good understanding of data analysis, ideally with Python.
A willingness to learn about mathematical measures of fairness.
References:
[1] https://mitgestalten.wien.gv.at/de-DE/projects/miep-gies-park
[2] https://europeandemocracyhub.epd.eu/wp-content/uploads/2023/12/Case-Study-Chile-FINAL-v2.pdf
[3] Hélène Landemore, When public participation matters: The 2010–2013 Icelandic constitutional process, International Journal of Constitutional Law, Volume 18, Issue 1, January 2020, Pages 179–205, https://doi.org/10.1093/icon/moaa004
[4] Sections 1,2,3 of:
Kraiczy, Sonja, Georgios Papasotiropoulos, and Piotr Skowron. "Proportionality in Thumbs Up and Down Voting." arXiv preprint arXiv:2503.01985 (2025).
Boehmer, Niclas, et al. "Guide to numerical experiments on elections in computational social choice." arXiv preprint arXiv:2402.11765 (2024).
13. Using LLMs to Unveil the Hidden Structure of Online Discussions
Supervisor: Jan Maly
Keywords: LLM, Online Discussions, NeuroSymbolic AI
Context:
Today, we have unlimited possibilities to talk to anyone about any topic through the magic of the internet. At the same time, as a society, we are more polarized and divided than ever before. Many solutions have been proposed to overcome this divide and to make our online discourse more consensus-driven and de-polarizing, from AI moderation to automated intervention and consensus generation. These approaches often require a deep understanding of the underlying structure of the discussion, modeled through frameworks like abstract argumentation. Recently, research has shown that LLMs generally outperform purpose-built software in modeling discussions in these frameworks. However, a systematic comparison of LLMs' performance across different frameworks is still lacking, making it hard to choose the best framework for modeling online discussions.
Problem:
LLMs can model online discussions in different semantic frameworks reasonably well, but more expressive and complex frameworks often lead to worse modelling performance. Without a systematic understanding of this trade-off, it is hard to choose the right framework for a given application.
Goal/expected results of the thesis:
This thesis will use different LLMs to model real-world online discussions in different semantic frameworks and compare the accuracy of the models across different LLMs and frameworks. As a result, we will gain an understanding of the best trade-offs between expressiveness and ease of modelling of these frameworks.
Research Questions:
How well can different LLMs model discussions in different frameworks?
Does increased complexity/expressiveness always lead to lower accuracy?
Are there significant differences in modelling performance between different LLMs?
Methodology:
Get familiar with the different modelling frameworks
Use human labeling to generate a baseline
Set up the experimental framework to measure modelling accuracy
Statistically evaluate the results of the experiments
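The accuracy-measurement step above could, for instance, compare LLM-produced relation labels (e.g., whether one comment attacks or supports another) against the human-labeled baseline, using raw accuracy and a chance-corrected agreement score. The labels below are invented toy data, not actual experimental results:

```python
def accuracy(gold, predicted):
    """Fraction of items where the LLM label matches the human label."""
    assert len(gold) == len(predicted)
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)

def cohens_kappa(gold, predicted):
    """Chance-corrected agreement between two label sequences."""
    labels = set(gold) | set(predicted)
    n = len(gold)
    p_o = accuracy(gold, predicted)
    # Expected agreement if both annotators labeled at random
    # with their observed label frequencies.
    p_e = sum((gold.count(l) / n) * (predicted.count(l) / n) for l in labels)
    return (p_o - p_e) / (1 - p_e) if p_e < 1 else 1.0

# Hypothetical labels: does comment j attack (A) or support (S) comment i?
human = ["A", "S", "A", "A", "S", "S"]
llm   = ["A", "S", "S", "A", "S", "A"]
```

Running the same comparison per framework and per LLM yields the accuracy matrix that the statistical evaluation would then analyse.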
Required Skills:
Good understanding of data analysis, ideally with Python
Ability to statistically evaluate experimental results
A willingness to learn about formal models of argumentation
References:
Can Large Language Models perform Relation-based Argument Mining?, Deniz Gorur and Antonio Rago and Francesca Toni, ACL 2025, https://aclanthology.org/2025.coling-main.569.pdf
Exploring the Potential of Large Language Models in Computational Argumentation, Guizhen Chen, Liying Cheng, Anh Tuan Luu, and Lidong Bing, ACL 2024, https://aclanthology.org/2024.acl-long.126/
Evaluation and Facilitation of Online Discussions in the LLM Era: A Survey, Katerina Korre, Dimitris Tsirmpas, Nikos Gkoumas, Emma Cabalé, Danai Myrtzani, Theodoros Evgeniou, Ion Androutsopoulos, John Pavlopoulos, ACL 2025
14. LLM-Based Extraction of AI Risk Patterns from Incident Reports
Supervisors: Muhammad Ikhsan, Elmar Kiesling
Background
As Artificial Intelligence (AI) systems are increasingly deployed in safety-critical and business-critical contexts, systematically assessing their risks becomes essential. Failures in AI systems may result not only in technical malfunction, but also in legal, ethical, societal, and financial consequences. Regulatory initiatives such as the EU AI Act, together with standards such as ISO/IEC 23894 and the NIST AI Risk Management Framework, also emphasize the importance of systematic AI risk management.
While existing frameworks provide high-level guidance for identifying and documenting risks, they often operate at a governance and organizational level. In practice, organizations face the challenge of translating abstract risk categories into structured, system-level representations that support analysis, documentation, and mitigation.
At the same time, large collections of publicly available AI incident reports document real-world failures across domains. These reports contain valuable empirical insights into recurring risk situations, but they are typically written in unstructured narrative form.
Recent advances in Large Language Models (LLMs) raise the question whether such models can support the transformation of unstructured textual incident descriptions into structured representations of AI-related risks.
Research Problem
In this Bachelor thesis, you will investigate whether and how Large Language Models (LLMs) can support the structured extraction and representation of AI-related risks from unstructured incident reports.
The thesis explores the potential of LLMs to identify recurring risk elements such as underlying vulnerabilities, technical failures, broader impacts, and mitigation measures, and to transform narrative descriptions into more structured formats suitable for analysis.
Specifically, your thesis will address research questions such as:
To what extent can LLMs extract structured risk information from AI incident reports?
How does prompt design influence the quality and consistency of the extracted information?
Can iterative or reflective prompting strategies improve completeness and/or reliability?
What are the limitations, uncertainties, and potential biases of LLM-supported extraction?
How suitable are automatically extracted representations for supporting systematic AI risk analysis?
The goal is to evaluate the feasibility and limitations of LLM-supported risk pattern extraction in the context of AI governance and risk management.
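One way the extraction and evaluation pipeline could be set up is sketched below, under the assumption of a hypothetical four-field schema (the actual schema is part of the thesis work) and with the LLM call stubbed out by a fixed response string:

```python
import json

# Hypothetical target schema for one extracted risk pattern;
# the real schema would be derived from frameworks such as AIRO.
REQUIRED_FIELDS = {"vulnerability", "failure", "impact", "mitigation"}

def parse_extraction(llm_output: str) -> dict:
    """Parse and validate the JSON an LLM returns for an incident report.
    Missing fields are recorded explicitly, so completeness can be
    compared across prompt designs and prompting strategies."""
    record = json.loads(llm_output)
    missing = REQUIRED_FIELDS - record.keys()
    record["_complete"] = not missing
    record["_missing"] = sorted(missing)
    return record

# Stubbed model response; a real pipeline would call an LLM API here.
response = ('{"vulnerability": "unvalidated training data", '
            '"failure": "misclassification", '
            '"impact": "wrongful denial of service"}')
record = parse_extraction(response)
```

Aggregating the `_complete` and `_missing` fields over many incident reports gives a simple quantitative handle on the completeness and reliability questions above.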
Requirements
Familiarity with Large Language Models
Some understanding of Natural Language Processing (NLP)
Interest in AI risk, AI governance, or socio-technical system analysis
Basic programming skills to interact with LLM APIs or NLP toolkits are sufficient
Initial References
ISO/IEC 23894:2023. Artificial Intelligence — Guidance on Risk Management. International Organization for Standardization, 2023.
National Institute of Standards and Technology (NIST). AI Risk Management Framework (AI RMF 1.0). U.S. Department of Commerce, 2023.
Golpayegani, D., Pandit, H. J., Lewis, D. (2022). “AIRO: An Ontology for Representing AI Risks Based on the Proposed EU AI Act and ISO Risk Management Standards.” In Towards a Knowledge-Aware AI. IOS Press, pp. 51–65.
Slattery, P., Saeri, A. K., Grundy, E. A. C., et al. (2024). “The AI Risk Repository: A Comprehensive Meta-Review, Database, and Taxonomy of Risks from Artificial Intelligence.” arXiv preprint arXiv:2408.12622.
Ikhsan, M., Kiesling, E., Mahmoud, S., Prock, A., Revenko, A., Ekaputra, F. J. (2025). “Pattern-based AI Risk Assessment: A Taxonomy Expansion Use Case.” Workshop Proceedings.
ISO/IEC TR 24028:2020. Artificial Intelligence — Overview of Trustworthiness in Artificial Intelligence.
ISO/IEC TR 24027:2021. Bias in AI Systems and AI Aided Decision Making.
Wei, J., Wang, X., Schuurmans, D., et al. (2022). “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.” Advances in Neural Information Processing Systems (NeurIPS).
Madaan, A., et al. (2023). “Self-Refine: Iterative Refinement with Self-Feedback.” Advances in Neural Information Processing Systems (NeurIPS).
Paeth, K., Atherton, D., Pittaras, N., Frase, H., McGregor, S. (2025). “Lessons for editors of AI incidents from the AI Incident Database.” Proceedings of the AAAI Conference on Artificial Intelligence, 39(28), 28946–28953.
15. Abstraction-Level Comparison of AI Risk Frameworks: A Mechanism-Oriented Analysis
Supervisors: Muhammad Ikhsan, Elmar Kiesling
Background
As organizations increasingly deploy AI systems, they must navigate a growing "jungle" of AI risk frameworks, standards, and governance guidelines. Examples include the MIT AI Risk Repository, the OECD AI system classification framework, ISO/IEC 23894, the NIST AI Risk Management Framework, and security-oriented initiatives such as the OWASP AI Security Top 10.
While these frameworks aim to structure and manage AI-related risks, they differ significantly in terminology, abstraction level, and conceptual focus. Some emphasize societal harms (e.g., discrimination or misinformation), others focus on technical vulnerabilities or security threats, while governance-oriented standards address compliance and oversight requirements.
For organizations implementing AI governance processes, these differences create practical challenges. Risk concepts across frameworks are defined at different conceptual levels: some describe underlying system vulnerabilities, others observable technical failures, societal impacts, or institutional consequences. When these distinctions are not made explicit, structured comparison and integration of multiple frameworks becomes difficult.
Research Problem
This thesis investigates how selected AI risk frameworks conceptualize and structure AI-related risk, with a particular focus on abstraction level.
The central objective is to examine whether distinguishing between different types of risk descriptions, such as:
• Structural system characteristics
• Mechanism-level vulnerabilities
• Technical consequences
• Social and ethical impacts
• Governance or regulatory implications
can support clearer comparison and interpretation across heterogeneous AI risk frameworks.
The thesis will address research questions such as:
• How do selected AI risk frameworks define and structure “risk”?
• At which abstraction levels do they define risk concepts?
• Do frameworks clearly distinguish between underlying vulnerabilities and downstream impacts?
• Where do conceptual overlaps or abstraction-level conflations occur?
• Does analyzing risks from a mechanism-oriented perspective clarify similarities and differences between frameworks?
The goal is not to develop a new taxonomy or framework. Instead, the thesis provides a structured abstraction-level analysis and evaluates whether a mechanism-oriented analytical perspective supports clearer cross-framework understanding in the context of AI governance.
Required / Recommended Skills
• Strong analytical and conceptual reasoning skills
• Interest in AI governance, IT risk management, or digital compliance
• Basic familiarity with AI systems and digital risk concepts
Initial References
• OECD (2022), “OECD Framework for the Classification of AI systems”, OECD Digital Economy Papers, No. 323, OECD Publishing, Paris
• Golpayegani, D., Pandit, H. J., & Lewis, D. (2022). AIRO: An ontology for representing AI risks based on the proposed EU AI Act and ISO risk management standards. IOS Press.
• Golpayegani, D., Pandit, H. J., & Lewis, D. (2023). To Be High-Risk, or Not To Be—Semantic Specifications and Implications of the AI Act’s High-Risk AI Applications and Harmonised Standards. Proceedings of the ACM Conference on Fairness, Accountability, and Transparency.
• Slattery, P., Saeri, A. K., Grundy, E. A. C., et al. (2024). The AI Risk Repository: A comprehensive meta-review, database, and taxonomy of risks from artificial intelligence. arXiv preprint.
• OWASP. (2023). OWASP AI Security and Risk Initiatives. owasp.org
• ISO/IEC 23894:2023. Artificial Intelligence — Guidance on Risk Management. International Organization for Standardization.
• National Institute of Standards and Technology (2023). AI Risk Management Framework (AI RMF 1.0).
• Ikhsan, M., Kiesling, E., Mahmoud, S., Prock, A., Revenko, A., & Ekaputra, F. J. (2025). Pattern-based AI Risk Assessment: A Taxonomy Expansion Use Case. Workshop Proceedings.
16. Case Study of Neurosymbolic Approaches for Knowledge Engineering (Suggested Use Case: Football Common Data Format)
Supervisors: Alexander Prock, Fajar J. Ekaputra
Keywords: knowledge engineering, knowledge graphs, neurosymbolic artificial intelligence, case study
Context: Knowledge Graphs explicitly represent knowledge in a machine-readable format to enable the integration, management and utilization of knowledge at scale [1]. However, their manual construction is tedious and expensive.
Neurosymbolic AI [2] combines the predictive powers of artificial neural networks (e.g. classical machine learning, deep neural networks or large language models) with interpretable knowledge representation and reasoning (e.g. semantic web resources: ontologies and knowledge graphs, or formal logic).
Neurosymbolic approaches for knowledge engineering can aid various knowledge engineering tasks, e.g. ontology construction, knowledge extraction or knowledge graph construction (cf. [3] for an overview of knowledge engineering tasks).
One example of a neurosymbolic approach for knowledge engineering is Text2AMR2FRED [4], which extracts knowledge from natural language text to construct a knowledge graph. First, a neural network model is used to parse textual input into an intermediate abstract representation, which is then mapped to knowledge graph triples using predefined rules. The resulting knowledge graph is then further refined and enriched, e.g. by aligning the extracted knowledge to established publicly available knowledge bases.
Problem: The use of AI applications in the sports domain has grown significantly in recent years. For example, in the football domain, AI systems are used for player recruitment, performance monitoring, and selection. We are currently developing the Football Common Data Format (FCDF) ontology [5] to provide a shared conceptualisation and formal basis for using semantic resources in AI applications in the football domain. One possible use case is the population of the FCDF ontology from various data sources, such as existing external knowledge bases, e.g. Wikidata, or natural language text, e.g. reports and newspaper articles.
The thesis topic can be shifted according to your interests; other use cases can be proposed, and other knowledge engineering tasks can be the focus.
Goal/expected results of the thesis:
Identification of suitable neurosymbolic approaches for a chosen knowledge engineering task (e.g. knowledge graph construction/population) in a chosen use case (e.g. football ontology)
Application of selected approaches for the chosen task and use case
Collection of practical insights and assessment of performance
Research Questions:
What are suitable neurosymbolic approaches for knowledge graph construction in the football domain? (to be adapted depending on chosen knowledge engineering task and use case)
How do these approaches perform when applied to the use case in practice?
What are the benefits and challenges of choosing these approaches?
Methodology:
Familiarize yourself with knowledge engineering tasks, neurosymbolic AI and the chosen use case
Literature study to identify suitable neurosymbolic approaches
Case study of selected approaches for chosen use case (including practical evaluation)
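As a rough illustration of the knowledge graph population task, tabular match records could be mapped to RDF-style (subject, predicate, object) triples as in the following sketch. The namespace and the class/property names are hypothetical placeholders, not the actual FCDF vocabulary [5], and a real implementation would use an RDF library such as rdflib:

```python
# Hypothetical namespace; the actual FCDF ontology defines its own IRIs.
FCDF = "https://example.org/fcdf#"

def populate(records):
    """Map tabular player records to triples of a (hypothetical)
    football ontology, deduplicating via a set."""
    triples = set()
    for r in records:
        player = FCDF + "player/" + r["id"]
        triples.add((player, "rdf:type", FCDF + "Player"))
        triples.add((player, FCDF + "name", r["name"]))
        triples.add((player, FCDF + "playsFor", FCDF + "team/" + r["team"]))
    return triples

records = [{"id": "123", "name": "Jane Doe", "team": "42"}]
kg = populate(records)
```

In a neurosymbolic pipeline such as Text2AMR2FRED [4], the `records` would instead come from a neural extraction step over text, with rule-based mapping and alignment applied afterwards.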
Prior Knowledge & Skills:
Interest in knowledge engineering and/or the technical aspects of knowledge management
Understanding of semantic web technologies (e.g., RDF/OWL) or the willingness to learn about them
Ideally, you have completed the SBWL Knowledge Management, have taken the technical courses (“technical KM”; K2 & K3) already, or have acquired similar skills elsewhere
Hands-on mentality, i.e. the willingness to try out novel approaches in practice
References
[1] Hogan, Aidan, Eva Blomqvist, Michael Cochez, Claudia d’Amato, Gerard De Melo, Claudio Gutierrez, Sabrina Kirrane et al. "Knowledge graphs." ACM Computing Surveys (CSUR) 54, no. 4 (2021): 1-37. https://doi.org/10.1145/3447772
[2] Hitzler, Pascal, and Md Kamruzzaman Sarker, eds. "Neuro-symbolic artificial intelligence: The state of the art." (2022). https://doi.org/10.48550/arXiv.2105.05330
[3] Tamašauskaitė, G., & Groth, P. (2023). Defining a knowledge graph development process through a systematic review. ACM Transactions on Software Engineering and Methodology, 32(1), 1-40. https://doi.org/10.1145/3522586
[4] Gangemi, A., Graciotti, A., Meloni, A., Nuzzolese, A. G., Presutti, V., Reforgiato Recupero, D., & Russo, A. (2026). Text2AMR2FRED, converting text into RDF/OWL knowledge graphs via abstract meaning representation. Knowledge and Information Systems, 68(1), 47. https://doi.org/10.1007/s10115-025-02631-y
[5] Ekaputra, F. J., Käfer, G., & Kempe, M. (2025). An Ontology for the Common Data Format on Football Match Data. In Joint Proceedings of Industry, Doctoral Consortium, Posters and Demos of the 24th International Semantic Web Conference (ISWC-C 2025): SWC 2025 Companion Volume, November 2–6, 2025, Nara, Japan CEUR Workshop Proceedings. https://ceur-ws.org/Vol-4085/paper56.pdf
17. Automated evaluation of AI-generated explanations
Supervisors: Stefani Tsaneva, Marta Sabou
Keywords: ontology engineering, human-centric explanations, large language models
Context: Knowledge Engineering (KE) encompasses a variety of activities, including the acquisition of knowledge and its representation through semantic models such as ontologies. Traditionally, KE requires substantial manual effort to define, implement, and validate domain-specific requirements. Moreover, tool support for many KE tasks remains limited, increasing the likelihood of modeling errors, especially when ontology engineers lack advanced KE training or are working with complex logical constraints. Recently, to support ontology engineers, the potential of Large Language Models (LLMs) has been explored in the context of ontology verification, specifically for defect detection, classification, explanation, and correction. While initial studies demonstrate that LLMs can assist with these tasks, further experimentation is necessary to generalize and extend these findings.
Problem: Currently, there is a lack of tools and methods for the automated evaluation of AI-generated explanations within the context of ontology validation.
Goal/expected results of the thesis: This thesis will investigate how LLMs can be utilised to annotate AI-generated explanations according to value-based requirements.
Research Questions: To what extent can LLMs annotate AI-generated explanations according to value-based requirements?
How accurately do LLMs evaluate ontology defect explanations?
Do different LLMs vary in their evaluation performance?
How consistent are LLM-generated annotations across repeated evaluations of the same input?
Methodology:
Get familiar with prior experiments on LLMs for ontology defect explanation and the produced explanations dataset.
Design and implement scripts (e.g., in Python or other suitable languages) to prompt various LLMs for explanation evaluation tasks.
Perform experiments across different LLMs and analyse the results.
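The consistency question (repeated evaluations of the same input) could, for example, be quantified by the average majority agreement across repeated annotation runs. The labels below are invented toy data, not actual LLM output:

```python
from collections import Counter

def consistency(runs):
    """Average fraction of repeated annotation runs that agree with the
    per-item majority label; 1.0 means perfectly stable annotations."""
    n_items = len(runs[0])
    total = 0.0
    for i in range(n_items):
        labels = [run[i] for run in runs]
        majority_count = Counter(labels).most_common(1)[0][1]
        total += majority_count / len(runs)
    return total / n_items

# Hypothetical labels from three repeated runs over four explanations
# ("good" / "bad" with respect to some value-based requirement).
runs = [
    ["good", "bad", "good", "good"],
    ["good", "bad", "bad",  "good"],
    ["good", "bad", "good", "good"],
]
```

Comparing this score across different LLMs and prompt variants would feed directly into the research questions on evaluation performance and annotation stability.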
Required Skills:
Understanding of ontologies, ontology constraints and reasoning (SBWL K2 completed).
Experience with Python (or other languages that support API access to LLMs).
Data analysis skills for processing generated outputs and evaluating performance.
References
Tsaneva, S., Herwanto, G. B., Llugiqi, M., & Sabou, M. Knowledge Engineering with Large Language Models: A Capability Assessment in Ontology Evaluation. https://www.semantic-web-journal.net/system/files/swj3852.pdf
C.-H. Chiang, H.-y. Lee, Can large language models be an alternative to human evaluations?, in: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, 2023. https://doi.org/10.18653/v1/2023.acl-long.870
18. The Hidden Cost of Unattractive Jobs: Knowledge Loss through Staff Turnover in Healthcare
Supervisor: Florian Kragulj
Unattractive working conditions in the healthcare sector lead to above-average staff turnover and resignation rates, and these destroy knowledge. Nursing work is highly knowledge-intensive: experiential know-how, practical wisdom, team routines, and relationships with clients are difficult to codify and often irretrievably lost when employees leave. This makes workplace and job attractiveness a strategic concern. Failing to improve workplace quality carries real and lasting costs.
This thesis shall explore the causal chain of ‘unattractive jobs → negative implications that cause knowledge loss → costs’, drawing on management, knowledge management, and health economics literature. Key questions include:
What types of knowledge are most at risk in care settings?
What are the measurable costs when knowledge is lost through turnover, intention to leave, etc.?
How does the literature link job attractiveness, turnover, and knowledge loss in these contexts?
Your task is to conduct a systematic literature review on management, knowledge management, and health economics literature. Findings are to be synthesized into a structured argumentation framework that is accessible to both academic and non-academic audiences. Theoretical anchors for the review could include the resource-based view of the firm, knowledge-based view of the firm, organizational learning, and knowledge management frameworks.
Keywords: Knowledge loss, Job attractiveness, Healthcare, Personal purpose, Structured literature review
Initial References:
Galan N (2023), "Knowledge loss induced by organizational member turnover: a review of empirical literature, synthesis and future research directions (Part II)". The Learning Organization: An International Journal, Vol. 30 No. 2 pp. 137–161
Parise, S., Cross, R., & Davenport, T. H. (2006). Strategies for preventing a knowledge-loss crisis. MIT Sloan Management Review, 47(4), 31.
Droege, S. B., & Hoobler, J. M. (2003). Employee turnover and tacit knowledge diffusion: A network perspective. Journal of Managerial Issues, 50-64.
19. Phronesis in Healthcare - A Knowledge Perspective
Supervisors: Florian Kragulj, Susanne Ahmad
The concept of phronesis was originally coined by Aristotle and is often translated as practical wisdom. In Greek philosophy, phronesis refers to the practical judgment and prudence that enables us to make ethically sound decisions in concrete situations, going beyond theoretical knowledge (episteme) and technical skill (techne) by integrating experience, context, and moral reflection. These capabilities are of particular importance in healthcare, a domain characterized by complexity, time pressure, and ethical ambiguity. Physicians, nurses, and other healthcare professionals regularly face situations in which rule-based action alone is insufficient. Effective practice requires a reflective form of judgment that combines medical expertise, experiential knowledge, and ethical sensitivity.
In this bachelor thesis, you will explore the concept of phronesis in the context of healthcare from a knowledge perspective, focusing on nurses and other healthcare professionals. You will investigate how practical wisdom can be defined and operationalized within health organizations and identify relevant fields of application, such as clinical decision-making, professional nursing practice, and ethical judgment. After conducting a structured literature review and researching relevant best practices, you will conclude with recommendations for how healthcare organizations can systematically foster and transfer practical wisdom to enhance both care quality and organizational learning.
For further questions, please contact Susanne Ahmad (susanne.ahmad@wu.ac.at).
Keywords: Phronesis, Practical Wisdom, Healthcare, Nursing, structured literature review
Initial References:
Conroy, M., Malik, A.Y., Hale, C. et al. Using practical wisdom to facilitate ethical decision-making: a major empirical study of phronesis in the decision narratives of doctors. BMC Med Ethics 22, 16 (2021). doi.org/10.1186/s12910-021-00581-y
Cosgrove L, Shaughnessy AF. Becoming a Phronimos: Evidence-Based Medicine, Clinical Decision Making, and the Role of Practical Wisdom in Primary Care. J Am Board Fam Med. 2023;36(4):531-536. doi:10.3122/jabfm.2023.230034R1
Fugelli P. Clinical practice: between Aristotle and Cochrane. Schweiz Med Wochenschr. 1998;128(6):184-188.
Kinsella, Elizabeth Anne, and Allan Pitman, eds. Phronesis as professional knowledge: Practical wisdom in the professions. Vol. 1. Springer Science & Business Media, 2012.
20. AI-based data completion for fair representation in online discussions
Supervisors: Jan Maly, Felicia Schmidt
Keywords: computational social choice, recommender systems, digital democracy
Context: Online discussions are a crucial part of modern democratic deliberation. To reap the benefits of such debates, it needs to be possible to summarize them with statements that best represent the different discussion points. So far, most algorithms simply show majority opinions, which tend to neglect the full spectrum of beliefs. The research field of computational social choice offers algorithms for summaries with better representation guarantees. However, the sheer multitude of comments in online discussions makes it impossible for any single user to express their opinion on all of them. In this thesis project, we will explore whether we can use modern AI technologies to bridge this gap in information to further fair representation also in a digital democracy setting.
Problem: Currently, existing methods for selecting representative statements in online discussions struggle with the highly incomplete information on user opinions.
Goal/expected results of the thesis: This thesis will investigate how machine learning techniques can best be used to predict user opinions on statements in online discussions.
Research Questions: To what extent can matrix completion methods accurately predict users’ approval of online discussion statements? Can these methods be combined with state-of-the-art algorithms to accurately represent users’ opinions?
Methodology:
Get familiar with the field of computational social choice and machine learning-based matrix completion methods.
Design and implement scripts (in Python) to use these completion methods on data from real-world online discussions.
Perform experiments across different completion methods and analyse the results.
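A minimal sketch of what such a matrix-completion step might look like, assuming an approval matrix with missing entries encoded as NaN. This simple iterative SVD imputation is an illustrative baseline only, not a recommendation for the final method:

```python
import numpy as np

def complete(matrix, rank=1, iters=50):
    """Simple iterative SVD imputation: fill missing entries (NaN) with
    the column mean, then repeatedly replace them with the values of a
    low-rank approximation, keeping observed entries fixed."""
    M = matrix.copy()
    mask = np.isnan(M)
    col_mean = np.nanmean(matrix, axis=0)
    M[mask] = np.take(col_mean, np.where(mask)[1])
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        M[mask] = low_rank[mask]  # only update the missing entries
    return M

# Hypothetical approval matrix: rows = users, columns = statements,
# 1 = approve, 0 = disapprove, NaN = statement not seen by the user.
votes = np.array([
    [1.0, 1.0, 0.0],
    [1.0, np.nan, 0.0],
    [0.0, 0.0, 1.0],
])
filled = complete(votes)
```

Here user 1 votes like user 0 on the observed statements, so the imputation pushes the missing entry towards approval; the completed matrix can then be handed to a representation-aware selection rule.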
Required Skills:
Good understanding of data analysis, ideally with Python.
Willingness to learn about mathematical measures of fairness.
References
Piotr Faliszewski, Piotr Skowron, Arkadii Slinko, and Nimrod Talmon. Multiwinner Voting: A New Challenge for Social Choice Theory. In Ulle Endriss (editor), Trends in Computational Social Choice, chapter 2, pages 27–47. AI Access, 2017. https://archive.illc.uva.nl/COST-IC1205/BookDocs/Chapters/TrendsCOMSOC-02.pdf
Zhaoliang Chen, Shiping Wang. A review on matrix completion for recommender systems. https://doi.org/10.1007/s10115-021-01629-6
21. Measuring the impact of Science in different fields using Scientific Knowledge Graphs
Supervisors: Axel Polleres, Diego Rincon-Yanez
Scientific Knowledge Graphs such as OpenAlex [1], Google Scholar, or also Wikidata [2] contain a lot of detailed information about researchers, their relationships, e.g. supervision relationships [3], affiliations, and outputs such as publications and their citations.
Enabled also by standardized identifiers for researchers [4] and publications [5], scientometrics is concerned with measuring scientific impact and success, e.g. with metrics like the h-index [6] that are typically computed on a personal level. While in a previous thesis project a student has already investigated how such metrics could be computed on an organizational level, in this thesis we want to go one step further in order to define, quantify, and analyse scientometric indexes on a per-topic level.
To this end, the project shall investigate and prototypically implement
how to associate scientific papers with a hierarchy of research topics,
how to associate authors and organisations with the degree of impact they have in a particular research field, and
how to assess and discuss existing knowledge graphs and identifier systems, e.g. [1-5], in terms of coverage, based on a practical case study.
The result should be a prototype/tool that, given a group of researchers, assesses their individual and aggregated research impact and topical connections. More details will be provided by the supervisors.
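As a starting point, per-topic impact metrics could be computed as in the following sketch. The paper records are invented placeholders for what might be retrieved from OpenAlex [1], and the assignment of papers to topics is assumed to be given:

```python
from collections import defaultdict

def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    cits = sorted(citations, reverse=True)
    h = 0
    while h < len(cits) and cits[h] >= h + 1:
        h += 1
    return h

def topic_h_indexes(papers):
    """Group per-paper citation counts by topic, then compute one
    h-index per topic (a paper may count toward several topics)."""
    by_topic = defaultdict(list)
    for p in papers:
        for t in p["topics"]:
            by_topic[t].append(p["citations"])
    return {t: h_index(c) for t, c in by_topic.items()}

# Hypothetical records, as they might be retrieved from OpenAlex.
papers = [
    {"citations": 10, "topics": ["Knowledge Graphs"]},
    {"citations": 3,  "topics": ["Knowledge Graphs", "Machine Learning"]},
    {"citations": 1,  "topics": ["Machine Learning"]},
]
```

Restricting `papers` to one author or organisation gives the corresponding per-topic impact profile; aggregating over a group of researchers yields the combined view the prototype should support.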
22. Train Travel Made Easy: What’s The European From/To Price
Supervisors: Shahrom Sohi, Axel Polleres
Train Travel Made Easy: Mapping Cross-Border Rail Prices in Europe
European rail is central to the EU’s decarbonisation strategy. Cross-border train travel produces substantially lower CO₂ emissions per passenger kilometre than air or car travel and is therefore a key pillar of the European Green Deal and Fit for 55 agenda. Yet despite strong policy support, international rail continues to suffer from low modal share. One of the main barriers is not infrastructure, but information: cross-border ticket prices remain fragmented, difficult to compare, and often opaque for passengers.
Railway undertakings such as ÖBB, Deutsche Bahn, SNCF, Trenitalia, Renfe, and Eurostar operate separate booking platforms, pricing logics, and technical interfaces. For multi-country journeys, travellers frequently need to consult several websites, manually combine tickets, or rely on third-party aggregators such as Trainline or Omio. This creates price asymmetries, reduces transparency, and limits informed decision-making.
This thesis investigates the question: What is the actual European “from–to” rail price?
The project will analyse how cross-border fares are structured, displayed, and potentially distorted across booking systems. The objective is to systematically collect, compare, and evaluate pricing data in order to assess transparency gaps and structural fragmentation.
Methodologically, the student will work with web scraping approaches and structured data extraction. Depending on technical interest and feasibility, the project may explore LLM-based agents to automate itinerary reconstruction and price comparison across booking platforms.
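As a minimal illustration of the structured-extraction step, prices could be pulled from fetched result pages as sketched below. The markup pattern is hypothetical, and real booking platforms typically require session handling, JavaScript rendering, or official APIs rather than plain HTML parsing:

```python
import re

def extract_prices(html):
    """Pull offer prices (assumed to be EUR) out of a result page.
    The data-price attribute is a hypothetical markup pattern;
    each platform needs its own extraction rules in practice."""
    return [float(m) for m in re.findall(r'data-price="(\d+(?:\.\d+)?)"', html)]

# Static sample standing in for a fetched booking-platform page.
sample = ('<div class="offer" data-price="39.90"></div>'
          '<div class="offer" data-price="59.90"></div>')
prices = extract_prices(sample)
cheapest = min(prices)
```

Collecting such extracted fares for the same origin-destination pair across several platforms is what makes the price asymmetries and transparency gaps measurable.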
The thesis is supervised by Shahrom Sohi (ÖBB / WU Vienna) and Axel Polleres and offers the opportunity to collaborate with mobility stakeholders, contributing to research on seamless European rail travel and digital interoperability.
References:
Sohi, Shahrom, et al. "Inter-railways data sharing for seamless travelling applications in Europe." (2025). https://semantic-transportation.github.io/sem4tra-kg-website/papers/Sem4Tra25_paper_4.pdf
23. Bridging Linguistic Barriers in Cross-Border Rail Operations: Speech-to-speech translation preserving standardized terminology
Supervisors: Shahrom Sohi, Axel Polleres
The European railway domain increasingly depends on cross-border operations, where train drivers and infrastructure controllers often speak different national languages. The Translate4Rail (https://translate4rail.eu/) project has shown that providing a standardized set of predefined messages and a translation tool can mitigate miscommunication in normal and emergency situations.
This is a relatively open topic for people who like to play around with technology and AI models and push the boundaries: LLMs have enabled incredible advances over the past years, not only as generative question-answering systems but also by replacing most traditional techniques for multi-lingual translation. On the other hand, speech-to-text and text-to-speech technologies have likewise evolved, not only understanding spoken text and reading out text, but also capturing vocal features, up to the level of creating (synchronous) voice clones.
In the course of this thesis, we will explore technical boundaries and different approaches for solving speech-to-speech translation in a safety-critical context, where messages should be translated preserving standardized terminology in a particular domain, namely railway/mobility services.
This thesis will:
Analyze the limitations of the current Translate4Rail prototype and associated language-tool approaches, focusing on weak points such as message coverage, ambiguity, error handling, latency, and safety assurance.
Propose practical, hands-on improvements or extensions of Translate4Rail using RAG, model fine-tuning, and agent frameworks.
Implement a pilot to validate the enhancements, measuring metrics such as misunderstanding rate, response delay, coverage of scenario space, and safety margins.
Evaluate interoperability and safety aspects, possibly in collaboration with rail actors or infrastructure managers, to assess feasibility in real corridor settings.
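One technical building block such a pilot could prototype is terminology preservation around an otherwise generic translation step. The following Python sketch illustrates one possible approach; the glossary entries are invented examples and the `translate` callable is a stand-in for a real MT/LLM call (identity by default), not part of Translate4Rail.

```python
# Sketch: preserve standardized terminology around a generic translation
# step. GLOSSARY_DE_EN and the default identity `translate` are invented
# stand-ins, not Translate4Rail internals.

# Bilingual glossary of standardized terms (source term -> mandated target term).
GLOSSARY_DE_EN = {
    "Fahrtverbot": "movement prohibition",
    "Notruf": "emergency call",
}

def translate_with_glossary(text: str, glossary: dict[str, str],
                            translate=lambda s: s) -> str:
    """Shield glossary terms with placeholders, translate the remaining
    text, then restore the mandated standardized target terms.
    (Naive: assumes glossary terms are not substrings of one another.)"""
    placeholders = {}
    for i, (src, tgt) in enumerate(glossary.items()):
        token = f"__TERM{i}__"
        if src in text:
            text = text.replace(src, token)
            placeholders[token] = tgt
    translated = translate(text)  # stand-in for a real MT/LLM call
    for token, tgt in placeholders.items():
        translated = translated.replace(token, tgt)
    return translated
```

With the identity stand-in, `translate_with_glossary("Achtung: Notruf", GLOSSARY_DE_EN)` yields "Achtung: emergency call": the standardized term survives whatever the translation step does to the surrounding text, which is one way to approach the safety-assurance concern above.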
References:
Atanasov, I., Pencheva, E., & Vatakov, V. (2023). An Approach to Designing Critical Railway Voice Communication. Electronics, 12(6), 1406.
https://doi.org/10.3390/electronics12061406
Rosberg, T., Thorslund, B. Radio communication-based method for analysis of train driving in an ERTMS signaling environment. Eur. Transp. Res. Rev. 14, 18 (2022). https://doi.org/10.1186/s12544-022-00542-5
24. Linking Railway Accident Reports with Infrastructure Knowledge Graphs: Towards a European Railway Safety Knowledge Space
Supervisors: Shahrom Sohi, Axel Polleres
The European Union Agency for Railways (ERA) is building Knowledge Graphs (KGs) such as the Railway Infrastructure Register (RINF KG), which describe the European rail network’s assets, topology, and compliance attributes. In parallel, structured accident and incident data (e.g., ERAIL reports, NIB reports) are increasingly being digitized. Yet, these two domains, infrastructure and safety events, remain largely disconnected.
This thesis will:
Map accident data models to infrastructure entities (e.g., linking derailments to track sections, collisions to operational points, or accidents at level crossings to specific infrastructure features).
Build a semantic layer that connects unstructured accident reports (using NLP/NER extraction pipelines) to the ERA infrastructure KG, using ontologies and vocabularies from ERA and the Semantic Web community.
Develop a prototype Knowledge Graph integration, aligning accident entities (time, location, type, cause) with RINF infrastructure elements, enabling cross-querying (e.g., “Which track segments have had repeated accidents of type X in the past 10 years?”).
Evaluate potential applications in safety monitoring, predictive risk analysis, and regulatory reporting, in collaboration with ERA datasets.
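As a minimal illustration of the cross-querying goal above, the following Python sketch links toy accident records to track-section identifiers and answers a query of the form "which track segments have had repeated accidents of type X since a given date". All identifiers and records are invented; a real implementation would query the RINF KG and ERAIL data via SPARQL.

```python
from datetime import date

# Hypothetical, simplified records; real data would come from ERAIL
# reports and the RINF KG (all identifiers here are invented).
TRACK_SECTIONS = {"TS-101": "Wien Hbf - Meidling", "TS-202": "Linz - Wels"}

ACCIDENTS = [
    {"type": "derailment", "date": date(2019, 5, 1), "section": "TS-101"},
    {"type": "derailment", "date": date(2023, 8, 12), "section": "TS-101"},
    {"type": "collision",  "date": date(2021, 3, 3), "section": "TS-202"},
]

def repeated_accidents(accidents, acc_type, since, min_count=2):
    """Track sections with at least `min_count` accidents of `acc_type`
    on or after `since` -- the cross-query sketched in the text."""
    counts = {}
    for a in accidents:
        if a["type"] == acc_type and a["date"] >= since:
            counts[a["section"]] = counts.get(a["section"], 0) + 1
    return [s for s, c in counts.items() if c >= min_count]
```

Here `repeated_accidents(ACCIDENTS, "derailment", date(2015, 1, 1))` returns `["TS-101"]`; the point of the thesis is that such questions become answerable only once accident entities and infrastructure elements share identifiers.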
References:
Rail Accident Investigations
https://www.era.europa.eu/domains/accident-incident/rail-accident-investigation_en
RINF register of infrastructure
25. Thesis project: AI agents to generate B2B sales leads from media observation
Supervisors: Shahrom Sohi (Rail Cargo Group – ÖBB), Axel Polleres
Keywords:
Large Language Models (LLMs)
Web scraping
Pattern recognition
B2B Sales
Lead Generation
Abstract:
This thesis topic is a highly relevant and practically applicable exercise in LLM tuning: training an LLM to generate sales leads from media news.
Through media monitoring, it is possible to generate sales leads. This works very well manually but is time-consuming and does not scale. We can provide a set of media articles and, from this, the manually identified subset of sales leads. An LLM should be trained to recognize a pattern from this data so that it can then independently filter the sales leads relevant to RCG from the media articles.
The challenge is to teach the LLM the specific context through pattern recognition, so that it can distinguish news articles that represent a sales lead from those that do not.
Example: Company A is a prospective target customer. On a given day, X media articles are published with Company A’s name in them. One of them states that Company A has won a new contract and will therefore soon start exports to a new country. These transport streams provide a business opportunity for Rail Cargo Group; hence we classify the article as a sales lead. The trained LLM should be equipped with such a contextual understanding that it can (pre-)select this one article from the entire set of X articles.
A functioning model would be an additional source to continuously feed the B2B sales funnel with new sales leads.
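Before any LLM fine-tuning, a simple keyword baseline can make the task concrete. The Python sketch below pre-selects candidate sales-lead articles by trigger phrases; the phrases are invented placeholders for the keyword list Rail Cargo Group would supply, and an LLM classifier would then confirm or re-rank the pre-selected articles.

```python
# Baseline sketch: keyword pre-filter for candidate sales-lead articles.
# The trigger phrases below are invented examples; the real list would
# come from Rail Cargo Group's subject matter experts. A tuned LLM would
# then classify or re-rank the pre-selected articles.

LEAD_TRIGGERS = ["won a new contract", "expands exports", "new production plant"]

def preselect_leads(articles: list[dict]) -> list[dict]:
    """Keep articles whose text contains at least one trigger phrase."""
    return [a for a in articles
            if any(t in a["text"].lower() for t in LEAD_TRIGGERS)]
```

Such a baseline also gives the thesis an evaluation floor: the trained LLM should beat its precision/recall on the historical, manually labeled lead set.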
Rail Cargo Group will provide you with historical data on news articles, the sales leads selected from them, and, for a limited subset, the conversion rates of these sales leads. In addition, an elaborated list of keywords is available, and you will receive close coaching and input from Rail Cargo Group subject matter experts.
References:
V. Kumar et al. (2024). “AI-powered marketing: What, where, and how?”
Journal: Industrial Marketing Management (Elsevier)
https://www.sciencedirect.com/science/article/pii/S0268401224000318
D. Herhausen et al. (2025). “From words to insights: Text analysis in business research”
Journal: Journal of Business Research
https://www.sciencedirect.com/science/article/pii/S0148296325003145
26. Facilitating Interdisciplinary Collaboration through Knowledge Graph-based Expertise Discovery and Recommendation: A Use Case of an Austrian University
Supervisors: Daniil Dobriy, Axel Polleres
Advisor: Daniil Dobriy
The thesis/topic can be written by more than one student
Other important information:
If your thesis involves software development, Git version control will be used. Prior Git experience is not mandatory - a training course will be provided if needed. Artifacts produced during your thesis should meet the following requirements whenever permissible:
| | Documentation | Publication | License |
| Software | Standard README[1] | Institute’s GitLab[2] | MIT license[3] |
| Datasets | DCAT[4] | WU data portal[5] | CC BY 4.0[6] |
| Other artifacts | Standard README | Institute’s GitLab | CC BY 4.0 |
For computationally intensive applications, students will be provided SSH access to the institute’s server infrastructure. To facilitate deployment, it is recommended that complex applications be containerized using Docker (training available if needed).
Thesis description:
Despite being transformative, integrative and an accelerator of innovation [3], interdisciplinary research is associated with a number of institutional, professional and organizational challenges [1]. Inspired by the theory of weak ties [4], this research topic aims at developing a comprehensive framework for researcher discovery and expert recommendation [2] that leverages semantic web technologies [5] to enable interdisciplinary collaboration at academic institutions. Positioned at WU Vienna and using the research information management systems available at the university (e.g., Pure) as a practical case study, the thesis combines the challenges of interdisciplinary collaboration, approached as a literature and survey study, with technical aspects such as ETL over WU research platforms, knowledge graph construction, ontology re-use and recommendation frameworks, in order to solve a genuine institutional challenge.
This thesis topic envisions collaboration with the WU Research Service Center. The final scope and focus of the topic will be decided with the students based on the relevant expertise and research interests. The topic is multi-faceted and will accommodate multiple students - i.e., we encourage multiple students to apply and work in parallel on the evaluation of different aspects of the topic.
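To make the weak-ties idea concrete, the following Python sketch recommends researchers who are two hops away in a toy co-authorship graph but are not yet direct collaborators. The names and the graph are invented; a real system would derive the graph from WU's research information systems and use richer recommendation models [2].

```python
from itertools import chain

# Toy co-authorship graph as adjacency sets; all names are invented.
# A real graph would be extracted from Pure / WU research platforms.
COAUTHORS = {
    "alice": {"bob"},
    "bob": {"alice", "carol"},
    "carol": {"bob", "dave"},
    "dave": {"carol"},
}

def weak_tie_candidates(graph: dict, researcher: str) -> set:
    """Researchers exactly two hops away and not directly connected --
    a deliberately naive reading of Granovetter's weak-tie idea [4]
    as a recommendation heuristic."""
    direct = graph.get(researcher, set())
    two_hop = set(chain.from_iterable(graph.get(n, set()) for n in direct))
    return two_hop - direct - {researcher}
```

For "alice" this recommends "carol": a collaborator of a collaborator, i.e. exactly the kind of bridging tie the weak-ties theory suggests is most valuable for interdisciplinary contact.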
Prerequisites:
Familiarity with ETL (Extract, Transform, Load) pipelines and knowledge representation/knowledge graphs is advantageous.
Interest in developing thesis findings into an academic publication.
Strong time management skills and ability to meet project milestones. A structured workflow with clear milestones will guide you through the thesis development process.
References:
[1] Daniel, K. L., McConnell, M., Schuchardt, A., & Peffer, M. E. (2022). Challenges facing interdisciplinary researchers: Findings from a professional development workshop. Plos one, 17(4), e0267234.
[2] Nikzad–Khasmakhi, Narjes, M. A. Balafar, and M. Reza Feizi–Derakhshi. "The state-of-the-art in expert recommendation systems." Engineering Applications of Artificial Intelligence 82 (2019): 126-147.
[3] Fagan, Jesse, et al. "Assessing research collaboration through co-authorship network analysis." The journal of research administration 49.1 (2018): 76.
[4] Granovetter, Mark. "The strength of weak ties: A network theory revisited." Sociological theory (1983): 201-233.
[5] Hogan, Aidan, et al. "Knowledge graphs." ACM Computing Surveys (Csur) 54.4 (2021): 1-37.
Keywords: Expert Recommendation Systems, Interdisciplinary Research, Scientific Linked Data, Knowledge Graphs, Ontologies, Research Information Management Systems, Scientific Collaboration Networks
[1] See github.com/RichardLitt/standard-readme
[2] I.e., git.ai.wu.ac.at
[3] See opensource.org/license/mit
[4] See www.w3.org/TR/vocab-dcat-3/
[5] I.e., data.wu.ac.at/portal
[6] See creativecommons.org/licenses/by/4.0/deed.en
27. From Discovery Aid to Bias Amplifier: Analysing the Role of Language Model-based Retrievers in Scientific Research: Risks, Metrics, and Mitigation Strategies
Supervisors: Daniil Dobriy, Axel Polleres
Advisor: Daniil Dobriy
Other important information:
If your thesis involves software development, Git version control will be used. Prior Git experience is not mandatory - a training course will be provided if needed. Artifacts produced during your thesis should meet the following requirements whenever permissible:
| | Documentation | Publication | License |
| Software | Standard README[1] | Institute’s GitLab[2] | MIT license[3] |
| Datasets | DCAT[4] | WU data portal[5] | CC BY 4.0[6] |
| Other artifacts | Standard README | Institute’s GitLab | CC BY 4.0 |
For computationally intensive applications, students will be provided SSH access to the institute’s server infrastructure. To facilitate deployment, it is recommended that complex applications be containerized using Docker (training available if needed).
Thesis description:
Platforms like Perplexity,[7] ChatGPT’s Deep Research feature[8] as well as Google Scholar’s newly introduced Scholar Labs[9] aim to facilitate research discovery via Retrieval-Augmented Generation-enhanced chat relying on extensive paper repositories, including non-peer-reviewed pre-publication resources like arXiv,[10] bioRxiv[11] and Zenodo.[12] However, extensive use of such language model (LM)-based retrievers could lead to source bias [1], whenever retrievers prefer LM-generated or unrepresentative content, and to self-fulfilling prophecies [2], whenever researchers prompt the retrievers in a way that retrieves sources supporting their prior assumptions. In this context, attempts have been made to optimise LM training on multi-turn conversations to better elicit user intent [3], to design RAG systems focused on counterfactual retrieval [4], and to propose multi-agent debate systems to improve the variety of results [5]. The aim of this topic is to (a) evaluate risks associated with the use of LM-based retrievers in research, (b) propose metrics and approaches for analysing the extent and effects of LM-based retriever use in research, and (c) propose workflows and architectures addressing the risks identified in (a).
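As one concrete candidate for goal (b), the Python sketch below computes a simple source-bias ratio: the share of LM-generated documents among retrieved results divided by their share in the whole corpus. The corpus and retrieval flags are invented illustrations; a real experiment would need provenance labels or a detector for LM-generated content.

```python
# Sketch of one possible source-bias metric: how strongly a retriever
# over-represents LM-generated documents relative to their share in the
# corpus. Values > 1 indicate bias toward such content. All inputs here
# are invented illustrations; assumes non-empty lists and a non-zero
# corpus share.

def source_bias_ratio(retrieved_flags, corpus_flags):
    """Ratio of the LM-generated share among retrieved results to the
    LM-generated share in the whole corpus (flags are booleans)."""
    retrieved_share = sum(retrieved_flags) / len(retrieved_flags)
    corpus_share = sum(corpus_flags) / len(corpus_flags)
    return retrieved_share / corpus_share
```

For instance, if 20% of a corpus is LM-generated but 40% of a retriever's top-10 results are, the ratio is 2.0, quantifying the over-representation that [1] describes qualitatively.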
Prerequisites:
Interest in developing thesis findings into an academic publication.
Strong time management skills and ability to meet project milestones.
References:
[1] Wang, Haoyu, et al. "Perplexity Trap: PLM-Based Retrievers Overrate Low Perplexity Documents." arXiv preprint arXiv:2503.08684 (2025).
[2] Bauer, Kevin, and Andrej Gill. "Mirror, mirror on the wall: Algorithmic assessments, transparency, and self-fulfilling prophecies." Information Systems Research 35.1 (2024): 226-248.
[3] Wu, Shirley, et al. "Collabllm: From passive responders to active collaborators." arXiv preprint arXiv:2502.00640 (2025).
[4] Yue, Zhenrui, et al. "Retrieval augmented fact verification by synthesizing contrastive arguments." arXiv preprint arXiv:2406.09815 (2024).
[5] Learning to break: Knowledge-enhanced reasoning in multi-agent debate system
Keywords: AI in Research, Algorithmic bias, Retrieval-Augmented Generation, LM-based Retrievers, Self-fulfilling Prophecies, Science of Science
[1] See github.com/RichardLitt/standard-readme
[2] I.e., git.ai.wu.ac.at
[3] See opensource.org/license/mit
[4] See www.w3.org/TR/vocab-dcat-3/
[5] I.e., data.wu.ac.at/portal
[6] See creativecommons.org/licenses/by/4.0/deed.en
[7] See https://www.perplexity.ai
[8] See https://openai.com/index/introducing-deep-research/
[9] See https://scholar.google.com/scholar_labs
[10] See https://arxiv.org
[11] See https://www.biorxiv.org
[12] See https://zenodo.org
28. Developing and Evaluating Components of the Search Engine for the Web of Data (search.ai.wu.ac.at)
Supervisors: Daniil Dobriy, Axel Polleres
Advisor: Daniil Dobriy
The thesis/topic can be written by more than one student
Other important information:
If your thesis involves software development, Git version control will be used. Prior Git experience is not mandatory - a training course will be provided if needed. Artifacts produced during your thesis must meet the following requirements whenever permissible:
| | Documentation | Publication | License |
| Software | Standard README[1] | Institute’s GitLab[2] | MIT license[3] |
| Datasets | DCAT[4] | WU’s data portal[5] | CC BY 4.0[6] |
| Other artifacts | Standard README | Institute’s GitLab | CC BY 4.0 |
For computationally intensive applications, students will be provided SSH access to the institute’s server infrastructure. To facilitate deployment, it is recommended that complex applications be containerized using Docker (training available if needed).
Thesis description:
Standard protocols such as the Model Context Protocol (MCP) [2], which allow LLMs to connect to tools, have recently boosted the development of "agentic" AI applications, which, powered by LLMs' planning capabilities, promise to solve complex tasks with access to external tools and databases. Use cases explored so far in the literature have mostly focused on interactions with single (e.g., relational) databases, with tasks such as schema exploration, query formulation from natural language (text-to-SQL) and translation of results into desired output formats being successfully delegated to such agents. In contrast, SPARQL as a standard query language offers even more flexibility to combine various data sources through (a) endpoints readily implementing a standard protocol, (b) standardized metadata formats for endpoints to self-describe their schema and capabilities, potentially enabling dynamic discovery, and (c) SPARQL's native capability to federate queries across multiple such endpoints.
Previous pipelines have been proposed for text-to-SPARQL generation and federated querying [3]. One of the proposed agentic SPARQL federation architectures (see Figure 1) relies on a centralized “Catalogue” that facilitates endpoint discovery and schema exploration. SPARQLES [1] has previously been proposed as such an endpoint monitoring service and is a precursor to the Catalogue. In this thesis, your goal will be to explore the types of metadata and strategies that could support (a) endpoint discovery (i.e., “searching for an applicable endpoint/knowledge graph that could deliver an answer to a specific natural language question or SPARQL query”) and (b) schema exploration (i.e., “retrieving the part of the schema that is relevant to a particular natural language question or SPARQL query”). A more detailed goal definition will be formulated with the student in the early stages of the thesis planning process.
The topic will accommodate multiple students – i.e., we encourage multiple students to apply and work in parallel on the evaluation of different strategies with the evaluation testbed we provide.
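As a deliberately naive sketch of one possible discovery strategy (a): rank catalogue entries by keyword overlap between the question and each endpoint's self-description. The endpoints and descriptions below are invented; a real Catalogue would build on standardized endpoint metadata (e.g., VoID or SPARQL Service Description) and more robust retrieval than word overlap.

```python
# Naive sketch of catalogue-based endpoint discovery: rank SPARQL
# endpoints by keyword overlap between a natural language question and
# the catalogue's self-description metadata. Endpoints and descriptions
# are invented placeholders.

CATALOGUE = {
    "https://example.org/rail/sparql": "railway infrastructure stations tracks",
    "https://example.org/bio/sparql": "genes proteins pathways",
}

def discover_endpoints(question: str, catalogue: dict) -> list:
    """Endpoints ranked by word overlap with their metadata description;
    endpoints with no overlap are dropped."""
    q_words = set(question.lower().split())
    scored = []
    for endpoint, description in catalogue.items():
        overlap = len(q_words & set(description.split()))
        if overlap:
            scored.append((overlap, endpoint))
    return [e for _, e in sorted(scored, reverse=True)]
```

Even this toy baseline makes the evaluation question concrete: the thesis would compare such strategies on the provided testbed, measuring whether the endpoint that can actually answer the question is ranked first.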
Prerequisites:
Familiarity with SPARQL and MCP is advantageous.
Interest in developing thesis findings into an academic publication.
Strong time management skills and ability to meet project milestones. A structured workflow with clear milestones will guide you through the thesis development process.
References:
[1] Pierre-Yves Vandenbussche, Jürgen Umbrich, Luca Matteis, Aidan Hogan, and Carlos Buil-Aranda. 2017. SPARQLES: Monitoring public SPARQL endpoints. Semantic Web 8, 6 (2017), 1049–1065.
[2] Anthropic. 2025. Model Context Protocol Specification, Version 2025-06-18. MCP Specification. modelcontextprotocol.io/specification/2025-06-18 Accessed: 2025-09-14.
[3] Emonet, V., Bolleman, J., Duvaud, S., de Farias, T. M., & Sima, A. C. (2024). LLM-based SPARQL query generation from natural language over federated knowledge graphs. arXiv preprint arXiv:2410.06062. https://arxiv.org/html/2410.06062v2
Keywords: Query Federation, Retrieval-Augmented Generation, Large Language Models, Named Entity Recognition, Knowledge Graphs, Knowledge Graph Question Answering (KGQA)
[1] See github.com/RichardLitt/standard-readme
[2] I.e., git.ai.wu.ac.at
[3] See opensource.org/license/mit
[4] See www.w3.org/TR/vocab-dcat-3/
[5] I.e., data.wu.ac.at/portal
[6] See creativecommons.org/licenses/by/4.0/deed.en