Current Bachelor Thesis Topics
Bachelor Thesis Topics WS 2022/2023
Bachelor thesis topics will be assigned primarily to students of our SBWLs and, within this group, to those who have passed most SBWL courses, i.e., who bring the prior subject knowledge needed to write the bachelor thesis successfully (ideally, they have already passed the SBWL Research Seminar).
1. Querying Very Large Graph databases
Supervisor: Axel Polleres
Wikidata, one of the largest collaboratively created Knowledge Graphs, seems to struggle with scalability [1,2]: its backend query service runs on an RDF graph database called BlazeGraph, but its public endpoint (query.wikidata.org) imposes quite restrictive limits on user queries and interrupts many complex queries with a timeout.
In this thesis you should investigate alternatives to evaluate such queries, either by
operating with script-based, manually implemented "query plans" on Wikidata dumps (e.g., using Linux command-line tools, or Python and big data frameworks)
or by evaluating alternative (graph) database engines that can cope with databases of the size of Wikidata, which currently consists of over 14 billion RDF triple statements (cf. w.wiki/5fBD).
This topic lends itself to several theses, all of which should contain a literature review AND a practical implementation part dealing with complex queries on large-scale Knowledge Graphs like Wikidata:
command line/scripting alternatives (including the lightweight RDF compression library/framework RDF-HDT [3], which has been co-developed at the institute)
Open source and commercial graph database engines
Cloud based offerings
pipelines consisting of combinations of the above three.
As a starting point, you may also consider a recent survey in VLDBJ on RDF Databases [4]
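To make the idea of a script-based, manually implemented "query plan" more concrete, the following minimal Python sketch streams N-Triples lines once, selects the instances of a class, and joins them with their country. It assumes a dump restricted to direct ("truthy") statements in N-Triples format; the sample lines and the identifiers Q100, Q200, A, B, C and D are placeholders for illustration, not meaningful Wikidata items, and the function names are hypothetical.

```python
import re
from collections import Counter

# Matches only the simplest N-Triples shape: <s> <p> <o> .  (IRI objects only)
TRIPLE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$')

WD = "http://www.wikidata.org/entity/"
WDT = "http://www.wikidata.org/prop/direct/"


def iri_triples(lines):
    """Stream (s, p, o) tuples; literal-valued and malformed lines are skipped."""
    for line in lines:
        match = TRIPLE.match(line)
        if match:
            yield match.groups()


def count_by_country(lines, cls):
    """Hand-written plan: select instances of cls (P31), join with country (P17)."""
    instances = set()
    countries = {}
    for s, p, o in iri_triples(lines):
        if p == WDT + "P31" and o == cls:
            instances.add(s)
        elif p == WDT + "P17":
            countries.setdefault(s, set()).add(o)
    counts = Counter()
    for s in instances:
        for country in countries.get(s, ()):
            counts[country] += 1
    return counts


# Placeholder data standing in for a dump file.
SAMPLE = [
    f"<{WD}A> <{WDT}P31> <{WD}Q100> .",
    f"<{WD}A> <{WDT}P17> <{WD}Q40> .",
    f"<{WD}B> <{WDT}P31> <{WD}Q100> .",
    f"<{WD}B> <{WDT}P17> <{WD}Q40> .",
    f"<{WD}C> <{WDT}P31> <{WD}Q100> .",
    f"<{WD}C> <{WDT}P17> <{WD}Q183> .",
    f"<{WD}D> <{WDT}P31> <{WD}Q200> .",
    f"<{WD}D> <{WDT}P17> <{WD}Q40> .",
    f'<{WD}A> <http://www.w3.org/2000/01/rdf-schema#label> "a label"@en .',
]

counts = count_by_country(SAMPLE, WD + "Q100")
print(counts.most_common())
```

Because the function accepts any iterable of lines, the same code could read from a decompressed dump stream instead of the sample list. Note that this single-pass plan keeps all P17 statements in memory, which is exactly the kind of trade-off a thesis would need to evaluate at Wikidata scale.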
1. www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_update/August_2021_scaling_update
2. phabricator.wikimedia.org/T20656
3. rdfhdt.org
4. Waqas Ali, Muhammad Saleem, Bin Yao, Aidan Hogan, Axel-Cyrille Ngonga Ngomo: A survey of RDF stores & SPARQL engines for querying knowledge graphs. VLDB J. 31(3): 1-26 (2022)
2. Visualizing Open Data in Virtual and Augmented Reality - How can AR and VR be used to improve exploration of data?
Supervisor: Johann Mitlöhner
Background: Developing new methods for exploring and analyzing data in virtual and augmented reality presents many opportunities and challenges, both in terms of software development and design inspiration. There are various hardware options, from Google Cardboard to Oculus Rift.
Taking part in this challenge demands programming skills as well as creativity.
A basic VR or AR application for exploring a specific type of open data will be developed by the student. The use of a platform-independent kit such as A-Frame is essential, as the application will be compared in a small user study to its non-VR version in order to identify advantages and disadvantages of the visualization method implemented. Details will be discussed with the supervisor.
Millais, Patrick, Simon L. Jones, and Ryan Kelly. "Exploring Data in Virtual Reality: Comparisons with 2D Data Visualizations." Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems. ACM, 2018.
More info at http://mitloehner.com/lehre/thesis-en.html
3. Text Mining using Deep Learning with Tensorflow and GPU
Supervisor: Johann Mitlöhner
Explore the capabilities of state-of-the-art GPUs, which provide large amounts of memory and processing power, for developing deep learning applications in Python and Keras. New hardware is now available at our institute and accessible via a remote interface. Details are to be discussed with the supervisor. Programming experience in Python and an understanding of connectionist machine learning approaches are required.
Minqing Hu, Bing Liu, "Mining and summarizing customer reviews", KDD '04, pp. 168-177.
More info at http://mitloehner.com/lehre/thesis-en.html
4. Platform Outages and Online Activity
Supervisor: Johannes Wachs
Online collaboration, especially in the software industry, increasingly takes place on a few centralized platforms. GitHub and Slack are two prominent examples. Occasionally these services experience disruptions and are unavailable for hours or even days. These shocks present an opportunity to study what developers do when their primary tools for collaboration and communication are offline. Do they focus on alternative tasks? Do they explore new ideas? Or is the time during outages written off as lost? In this thesis, the student will gather data on outages of GitHub and Slack and compare activity before, during, and after these events. The student can study changes in activity and behavior on GitHub and Stack Overflow, and in Google searches for relevant terms, using econometric methods.
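As a first descriptive step for the before/during/after comparison, hourly activity counts can be split into three windows around an outage and averaged. The sketch below is a minimal illustration under invented numbers; the function name, the outage times and the counts are all hypothetical, and a plain comparison of means is only a starting point, not an econometric estimate.

```python
from datetime import datetime, timedelta
from statistics import mean


def window_means(hourly_counts, outage_start, outage_end):
    """Average hourly activity before, during, and after [outage_start, outage_end)."""
    groups = {"before": [], "during": [], "after": []}
    for ts, count in hourly_counts:
        if ts < outage_start:
            groups["before"].append(count)
        elif ts < outage_end:
            groups["during"].append(count)
        else:
            groups["after"].append(count)
    # An empty window yields None instead of raising an error.
    return {name: (mean(vals) if vals else None) for name, vals in groups.items()}


# Invented numbers for illustration only: six hours of activity on an
# alternative platform, with a hypothetical outage covering hours 2 and 3.
t0 = datetime(2022, 1, 1, 0)
counts = [10, 12, 20, 22, 11, 9]
series = [(t0 + timedelta(hours=i), c) for i, c in enumerate(counts)]
result = window_means(series, t0 + timedelta(hours=2), t0 + timedelta(hours=4))
print(result)
```

A thesis would replace the invented series with collected data and would need to account for time-of-day and day-of-week patterns before interpreting any difference between the windows.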
References:
Moldon, Lukas, Markus Strohmaier, and Johannes Wachs. "How gamification affects software developers: Cautionary evidence from a natural experiment on github." 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). IEEE, 2021.
Malik, Momin M., and Jürgen Pfeffer. "Identifying platform effects in social media data." Tenth International AAAI Conference on Web and Social Media. 2016.
Elmezouar, Mariam, et al. "Exploring the Use of Chatrooms by Developers: An Empirical Study on Slack and Gitter." IEEE Transactions on Software Engineering (2021).
5. Geographic Dynamics of Software Use
Supervisor: Johannes Wachs
Even though software can be written almost anywhere on Earth, much of it is written or maintained in a few places: Silicon Valley, Berlin, London, Stockholm. Of course there are important exceptions: Ubuntu was first created in South Africa, Linux in Finland, Ruby in Japan, Lua in Rio. We do not yet have much large-scale evidence of where software is created versus where it is used, and of how usage spreads geographically. One excellent source of data on software use is PyPI, the Python programming language's primary package repository. This data, hosted on BigQuery, contains several years of download records for individual libraries, including the downloader's country. The goal of this thesis will be to collect temporal data on where specific libraries were downloaded, seeking to infer where a library was first created, and to map its spread. The student will need to access a very large dataset (35+ TB) on BigQuery and extract aggregated data for analysis.
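The extraction and mapping steps could look roughly as follows. The SQL template assumes the public table `bigquery-public-data.pypi.file_downloads` with the columns `timestamp`, `country_code` and `file.project`; these names are an assumption and should be checked against the current schema before running anything, since unfiltered queries on a table of this size are expensive. The Python part runs on a few invented rows and simply orders countries by the first month in which a download was observed; `spread` is a hypothetical helper name.

```python
# Assumed query shape (table and column names are NOT taken from the topic
# description and must be verified). Aggregating on the server keeps the
# extracted result small.
QUERY_TEMPLATE = """
SELECT country_code,
       FORMAT_DATE('%Y-%m', DATE(timestamp)) AS month,
       COUNT(*) AS downloads
FROM `bigquery-public-data.pypi.file_downloads`
WHERE file.project = @project
  AND DATE(timestamp) BETWEEN @start_date AND @end_date
GROUP BY country_code, month
"""


def spread(rows):
    """Order countries by the first month in which a download was observed.

    rows: iterable of (country_code, 'YYYY-MM', downloads) tuples.
    """
    first_seen = {}
    for country, month, downloads in rows:
        if downloads > 0 and (country not in first_seen or month < first_seen[country]):
            first_seen[country] = month
    return sorted(first_seen.items(), key=lambda item: (item[1], item[0]))


# Invented rows standing in for the aggregated query result.
rows = [
    ("AT", "2019-03", 5),
    ("US", "2019-01", 10),
    ("AT", "2019-02", 1),
    ("JP", "2019-02", 3),
]
timeline = spread(rows)
print(timeline)
```

The earliest observed download country is only a heuristic for where a library was created; mirrors, continuous integration services and cloud providers can distort the picture, which is something the thesis would need to discuss.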
References:
Wachs, J., Nitecki, M., Schueller, W., & Polleres, A. (2022). The geography of open source software: Evidence from github. Technological Forecasting and Social Change, 176, 121478.
Takhteyev, Y., & Hilts, A. (2010). Investigating the geography of open source software through GitHub. Manuscript submitted for publication.
Fackler, T., & Laurentsyeva, N. (2020). Gravity in online collaborations: Evidence from github. In CESifo Forum (Vol. 21, No. 03, pp. 15-20). München: ifo Institut-Leibniz-Institut für Wirtschaftsforschung an der Universität München.
6. Identifying Open Source Software Dependencies in Smartphone Apps
Supervisor: Johannes Wachs
Open source software (OSS) is used in all applications on smartphones. When apps use an OSS library, they are supposed to acknowledge this somewhere. Unfortunately, they do not do this in a consistent way. It would be valuable to better understand which apps use which OSS libraries, because apps depend on the continued functionality of such upstream libraries, and hence on their maintainers. These dependencies are potential sources of vulnerability. The goal of this thesis will be to evaluate potential methods to automatically extract the OSS dependencies of apps, for iPhone apps, Android apps, or both, and to link these OSS libraries to GitHub or GitLab projects. In case an automated approach is not possible, the student will manually curate a list of dependencies for the most commonly used apps in Austria, and calculate statistics on the level of maintenance of the app.
References:
Avelino, G., Passos, L., Hora, A., & Valente, M. T. (2016, May). A novel approach for estimating truck factors. In 2016 IEEE 24th International Conference on Program Comprehension (ICPC) (pp. 1-10). IEEE.
Almeida, D. A., Murphy, G. C., Wilson, G., & Hoye, M. (2017, May). Do software developers understand open source licenses?. In 2017 IEEE/ACM 25th International Conference on Program Comprehension (ICPC) (pp. 1-11). IEEE.
Decan, A., Mens, T., & Claes, M. (2017, February). An empirical comparison of dependency issues in OSS packaging ecosystems. In 2017 IEEE 24th international conference on software analysis, evolution and reengineering (SANER) (pp. 2-12). IEEE.
7. The impact of digital technologies on business processes – insights from published case studies
Supervisor: Monika Malinova Mandelburger
Processes deliver value to customers through the repetitive execution of their activities. However, every good process eventually becomes a bad process due to numerous factors, such as a changing environment (e.g., climate change, pandemic), increased competition and rising customer expectations, which influence how a company operates. In order to keep up with all these changes, organizations have to continuously change their business processes. One of the biggest enablers of process change nowadays is the emergence of digital technologies.
This thesis should explore the different ways digital technologies such as the Internet of Things affect the business processes of organizations. This should be done by means of a systematic literature review of published scientific papers and/or industry case studies, or by collecting and analysing empirical data on the role of digital technologies on the business processes in organizations.
This thesis topic could be divided into multiple thesis topics, each focusing on a specific digital technology (e.g. AI, VR, AR, Robotics, Automation, Cloud Computing, Data Analytics, etc.).
References:
Mendling, J., Pentland, B. T., & Recker, J. (2020). Building a complementary agenda for business process management and digital innovation. European journal of information systems, 29(3), 208-219.
Bilgeri, D., Gebauer, H., Fleisch, E., & Wortmann, F. (2019). Driving process innovation with IoT field data. MIS Quarterly Executive, 18, 191-207.
Kamalaldin, A., Sjödin, D., Hullova, D., & Parida, V. (2021). Configuring ecosystem strategies for digitally enabled process innovation: A framework for equipment suppliers in the process industries. Technovation, 105, 102250.
Sjödin, D. R., Parida, V., Leksell, M., & Petrovic, A. (2018). Smart Factory Implementation and Process Innovation: A Preliminary Maturity Model for Leveraging Digitalization in Manufacturing Moving to smart factories presents specific challenges that can be addressed through a structured approach focused on people, processes, and technologies. Research-Technology Management, 61(5), 22-31.
Shi, Zhan, et al. "Smart factory in Industry 4.0." Systems Research and Behavioral Science 37.4 (2020): 607-617.
8. How do digital technologies affect the customer’s experience and convenience?
Supervisor: Monika Malinova Mandelburger
Processes deliver value to customers through the repetitive execution of their activities. One company can outperform another company that sells the same products and/or services by executing its processes better. Nowadays, companies take advantage of digital technologies such as AI and Data Analytics in order to serve their customers better. For example, companies use technologies such as AR to enable customers to try products virtually, which in turn increases the customer’s convenience as well as experience. Digital technologies also facilitate better patient care by making it possible to monitor people's health without hospitalization, which likewise affects customer convenience.
This thesis should explore the different ways digital technologies can be used to enhance the customer convenience and experience.
References:
Petersen, J. A., Paulich, B. J., Khodakarami, F., Spyropoulou, S., & Kumar, V. (2022). Customer-based execution strategy in a global digital economy. International Journal of Research in Marketing, 39(2), 566-582.
Hoyer, W. D., Kroschke, M., Schmitt, B., Kraume, K., & Shankar, V. (2020). Transforming the customer experience through new technologies. Journal of Interactive Marketing, 51(1), 57-71.
Lee, S. M., & Lee, D. (2020). “Untact”: a new customer service strategy in the digital age. Service Business, 14(1), 1-22.
Jamkhaneh, H. B., Tortorella, G. L., Parkouhi, S. V., & Shahin, R. (2022). A comprehensive framework for classification and selection of H4.0 digital technologies affecting healthcare processes in the grey environment. The TQM Journal.
9. Executive decision-making with process models
Supervisor: Monika Malinova Mandelburger
Processes are continuously changed in organizations. BPM experts often present process changes to senior executives in order to get approval for their implementation throughout the organization. However, executives often lack the modelling knowledge needed to understand process changes when these are presented in the form of process models (processes modelled using BPMN). As a result, BPM experts use tools such as PPT to present process changes.
For this thesis the student should investigate the best way to present process changes to executives using process models.
References:
Pijpers, G. G., Bemelmans, T. M., Heemstra, F. J., & van Montfort, K. A. (2001). Senior executives' use of information technology. Information and software Technology, 43(15), 959-971.
Turetken, O., Dikici, A., Vanderfeesten, I., Rompen, T., & Demirors, O. (2020). The influence of using collapsed sub-processes and groups on the understandability of business process models. Business & Information Systems Engineering, 62(2), 121-141.
Mendling, J., Strembeck, M., & Recker, J. (2012). Factors of process model comprehension—findings from a series of experiments. Decision support systems, 53(1), 195-206.
10. A Survey of AI Audit Tools
Supervisor: Marta Sabou
Context. Artificial Intelligence (AI) Audits are used to investigate AI systems in many contexts, e.g., security, robustness or fairness. Initial methodologies [1] and tools [2, 3] have been proposed to enable improved AI Auditing processes. The basic audit process consists of the following four steps:
Definition of audit goal and scope: The goal of the scoping stage is to clarify the objective of the audit (e.g. fairness, security, robustness, identifying ML errors,...) of the investigated system.
Preparation of audit: The goal of the mapping stage is to identify the system components and stakeholders that are relevant to the audit goal and scope.
Conduction of audit: “This stage is where the majority of the auditing team’s testing activity is done—when the auditors execute a series of tests to gauge the compliance of the system with the prioritized ethical values of the organization.” [1] The collection of audit trace data is also part of this step.
Postprocessing of audit: This phase of the audit is the more reflective stage, when the results of the tests at the execution stage are analyzed in regard to the expectations raised in Step 1.
Problem. Despite the availability of such methodologies and tools, AI audits are not yet mature and remain largely unstructured and time-consuming. Thus, we are interested in understanding the capabilities of existing tools that support or enable the auditability of AI systems.
Goal. The thesis will focus on conducting a survey of the existing tools (e.g., using the list in [4] as a starting point) in order to extract and analyze information on which kinds of AI audits they support and which auditing steps they automate or support. We are mainly interested in Step 2 (Preparation of audits) and parts of Step 3 (Conduction of audits) above.
Expected outcome. The thesis should primarily focus on an analysis of the extracted data (e.g., analyzing/discussing main trends that emerge for the topic).
Thesis type: the topic is suitable both for a Bachelor and a Master thesis.
Initial Reading List.
[1] Inioluwa Deborah Raji, Andrew Smart, Rebecca N. White, Margaret Mitchell, Timnit Gebru, Ben Hutchinson, Jamila Smith-Loud, Daniel Theron, and Parker Barnes. 2020. Closing the AI accountability gap: defining an end-to-end framework for internal algorithmic auditing. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (FAT* '20). Association for Computing Machinery, New York, NY, USA, 33–44. doi.org/10.1145/3351095.3372873
[2] I. Naja, M. Markovic, P. Edwards, W. Pang, C. Cottrill and R. Williams, "Using Knowledge Graphs to Unlock Practical Collection, Integration, and Audit of AI Accountability Information," in IEEE Access, vol. 10, pp. 74383-74411, 2022, doi: 10.1109/ACCESS.2022.3188967.
[3] Aequitas: A Bias and Fairness Audit Toolkit; arxiv.org/pdf/1811.05577.pdf
11. Benchmarking human-centric ontology defects
Supervisor: Marta Sabou
Context. Ontologies are conceptual domain models capturing the main terms and relations in a domain. They are at the basis of knowledge graphs as well as several modern intelligent applications (e.g., search, decision support). As such, ensuring the quality of ontologies is important. While several methods exist for automatic ontology verification, some ontology defects can only be identified by human inspection [1, 2].
Problem. Despite the importance of human-centric ontology evaluation, there is currently no benchmark of ontologies with defects on which methods for human-centric evaluation could be developed.
Goal and expected outcome. The goal of this thesis is to create a benchmark of ontologies with such defects. Input material will be a collection of ontologies created by students as part of assignments. These ontologies should be anonymised in a first step. Second, the ontologies will be inspected with tools that can pinpoint possible defects which still need to be verified by a user. In particular the OOPS! Tool [3] will be used. Finally, the ontologies and their defects will be documented in a way that they can be reused for benchmarking activities.
Thesis type: the topic is suitable both for a Bachelor and a Master thesis.
Initial Reading List.
[1] Alan Rector et al. “OWL pizzas: Practical experience of teaching OWL-DL: Common errors & common patterns”. In: vol. 3257. Oct. 2004, pp. 63–81. isbn: 978-3-540-23340-4. doi: 10.1007/978-3-540-30202-5_5.
[2] Paul Warren et al. “Improving comprehension of knowledge representation languages: A case study with Description Logics”. In: International Journal of Human-Computer Studies 122 (Feb. 2019), pp. 145–167. doi:10.1016/j.ijhcs.2018.08.009.
[3] María Poveda-Villalón, Asunción Gómez-Pérez, and Mari Carmen Suárez-Figueroa. “OOPS!” In: Innovations, Developments, and Applications of Semantic Web and Information Systems. IGI Global, 2018, pp. 120–148. doi: 10.4018/978-1-5225-5042-6.ch005.
12. Analyzing Constraint Languages for Knowledge Graphs
Supervisor: Nicolas Ferranti
Background
Knowledge graphs (KGs) are nowadays the main structured data representation model on the web, representing interconnected knowledge of different domains. There are several methods to model a KG. For instance, they can be extracted from semi-structured web data, like DBpedia, or edited collaboratively by a community, like Wikidata. Since there is no perfect method and knowledge about the world is constantly changing, regular updates in the KGs are required. In this context, constraints are implemented as rules to test data compliance and find possible inconsistencies.
Goal of the thesis
The main goal of this thesis is to compare different constraint languages and to reduce the gap between the Wikidata property constraint language and the Shapes Constraint Language (SHACL), the language currently recommended by the World Wide Web Consortium (W3C). The specific goal is to use mappings from Wikidata constraints to SHACL so that Wikidata constraints can be validated with SHACL validators.
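To illustrate what such a mapping could produce, the sketch below turns one simple kind of constraint, "a property should have at most one value per item", into a SHACL node shape serialized as Turtle. This is a simplified illustration under stated assumptions: it only considers direct (`wdt:`) statements, ignores any exceptions or other qualifiers a Wikidata constraint may carry, and the `ex:` namespace and the function name are hypothetical. P569 (date of birth) is used as the example property.

```python
PREFIXES = (
    "@prefix sh: <http://www.w3.org/ns/shacl#> .\n"
    "@prefix wdt: <http://www.wikidata.org/prop/direct/> .\n"
    "@prefix ex: <http://example.org/shapes/> .\n"
)


def single_value_shape(pid):
    """Turtle for a shape: every subject using wdt:<pid> has at most one value."""
    return (
        f"ex:{pid}_SingleValueShape a sh:NodeShape ;\n"
        f"    sh:targetSubjectsOf wdt:{pid} ;\n"
        f"    sh:property [\n"
        f"        sh:path wdt:{pid} ;\n"
        f"        sh:maxCount 1\n"
        f"    ] .\n"
    )


shape_ttl = PREFIXES + "\n" + single_value_shape("P569")
print(shape_ttl)
```

The resulting Turtle document could be passed, together with a data graph, to any SHACL validator. Other constraint types would need their own translation rules, and some may not be expressible in SHACL Core at all, which is part of what the comparison in this thesis would examine.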
Requirements
Ideally, the student has some prior knowledge of and interest in Python, databases, and knowledge graphs. Proactivity and self-organization are further desirable qualities.
Initial references
Pareti, Paolo, and George Konstantinidis. "A Review of SHACL: From Data Validation to Schema Reasoning for RDF Graphs." Reasoning Web International Summer School (2021): 115-144.
Shenoy, K., Ilievski, F., Garijo, D., Schwabe, D., & Szekely, P. (2021). A Study of the Quality of Wikidata. arXiv preprint arXiv:2107.00156.
Vrandečić, D. (2012, April). Wikidata: A new platform for collaborative data collection. In Proceedings of the 21st international conference on world wide web (pp. 1063-1064).
13. Modeling emerging patterns of Human-AI Collaboration in Hybrid Intelligence Systems
Supervisor: Elmar Kiesling
Background
In recent years, a large stream of AI research has emerged that is not aimed at developing systems that replicate or surpass human intelligence, but rather focused on a synergistic combination of human and machine intelligence, giving rise to the concept of hybrid intelligence systems [2, 1, 4, 5, 3]. This growing research interest reflects a broader human-centric paradigm that is also evident in industrial policies, such as the European Union Ethics Guidelines for Trustworthy AI, which call for human agency and oversight. The question of how such hybrid intelligence systems, and the roles and interactions between human and artificial agents, can be modeled has, however, so far attracted limited attention.
Research Problem
In this Bachelor thesis, you will investigate questions such as
What are appropriate ways to model processes that are performed collaboratively by humans and artificial agents?
How can the process models support the planning and design of hybrid intelligence systems?
How can the process models be used to enact hybrid intelligence workflows?
As a starting point for your investigation, you will collect early reported examples of successful human-machine collaboration (e.g., in business applications such as decision support systems). To this end, you will review the scientific and (optional) gray literature (e.g., publications about funded projects, demonstrators and applications published on portals such as AI4eu etc.) for reported hybrid intelligence applications. Based on the collected examples, you will then identify emerging patterns of human-AI collaboration and investigate to what extent and how these patterns can be modeled using a process modeling language (e.g., imperative languages such as BPMN, declarative languages such as Declare). The thesis will result in a set of modeled examples and modeling insights; these may include successful modeling patterns and/or identified challenges that stem from the tension between the flexible and adaptive nature of hybrid intelligence systems and the constraints imposed by process modeling languages.
References
[1] Zeynep Akata et al. “A research agenda for hybrid intelligence: augmenting human intellect with collaborative, adaptive, responsible, and explainable artificial intelligence”. In: Computer 53.08 (2020), pp. 18–28.
[2] Dominik Dellermann et al. “Hybrid intelligence”. In: Business & Information Systems Engineering 61.5 (2019), pp. 637–643.
[3] Patrick Hemmer et al. “Human-AI Complementarity in Hybrid Intelligence Systems: A Structured Literature Review.” In: PACIS (2021), p. 78.
[4] Matthew Johnson and Alonso Vera. “No AI is an island: the case for teaming intelligence”. In: AI magazine 40.1 (2019), pp. 16–28.
[5] Marieke MM Peeters et al. “Hybrid collective intelligence in a human–AI society”. In: AI & society 36.1 (2021), pp. 217–238.
14. Building Knowledge Graphs from text data
Supervisor: Dawa Chang
Background:
A knowledge graph (KG) can be defined as a graph of data intended to accumulate and convey knowledge of the real world, whose nodes represent entities of interest and whose edges represent potentially different relations between these entities [1]. This representation works well for both humans and machines to consume knowledge. Nowadays, KGs are regarded as a key enabler for a number of technologies, including artificial intelligence, question answering, and personal assistants, across all sectors and at large companies (incl. Microsoft, Google, Facebook, Amazon, Samsung, eBay and IBM) [2]. Despite their wide usage and importance, however, KGs are still unfamiliar to many people as a technique for managing data.
Research problem:
The main purpose of this topic is to understand what KGs are and how to build one, and to become familiar with text analysis. To this end, the thesis aims to (1) describe what KGs are and what their special characteristics are, (2) analyze text data for the purpose of building an RDF Knowledge Graph, (3) build an RDF Knowledge Graph from the corpora resulting from the text analysis, and (4) identify what is needed to build an informative and useful Knowledge Graph. The text data will be provided, but the student is free to choose other text data or software in discussion with the supervisor of this topic.
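As a toy illustration of steps (2) and (3), the sketch below extracts statements of the form "X is a Y" from sentences with a regular expression and writes them as RDF triples in N-Triples syntax. It is deliberately naive: real text analysis would rely on proper NLP tooling, and the `http://example.org/` namespace, the pattern and the function names are assumptions made only for this example.

```python
import re
from urllib.parse import quote

EX = "http://example.org/"  # hypothetical namespace for the example
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Very naive pattern: "<Capitalized subject> is a/an <class>."
PATTERN = re.compile(r"^(?P<subj>[A-Z][\w ]*?) is an? (?P<cls>[\w ]+?)\.?$")


def iri(name):
    """Turn a phrase into an IRI in the example namespace."""
    return "<" + EX + quote(name.strip().replace(" ", "_")) + ">"


def sentences_to_ntriples(sentences):
    """Return one rdf:type triple per sentence that matches the pattern."""
    triples = []
    for sentence in sentences:
        match = PATTERN.match(sentence.strip())
        if match:
            subject = iri(match["subj"])
            cls = iri(match["cls"].title())
            triples.append(f"{subject} <{RDF_TYPE}> {cls} .")
    return triples


triples = sentences_to_ntriples(["Vienna is a city.", "It rained all day."])
for line in triples:
    print(line)
```

Only the first sentence yields a triple; the second is ignored because it does not match the pattern. Looking at what such a simple extractor misses or gets wrong is a useful way into aim (4), i.e., what is needed to make the resulting graph informative and useful.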
Initial references:
[1] Hogan, A., Blomqvist, E., Cochez, M., d’Amato, C., Melo, G. D., Gutierrez, C., ... & Zimmermann, A. (2021). Knowledge graphs. ACM Computing Surveys (CSUR), 54(4), 1-37. https://dl.acm.org/doi/pdf/10.1145/3447772
[2] Noy, N., Gao, Y., Jain, A., Narayanan, A., Patterson, A., & Taylor, J. (2019). Industry-scale Knowledge Graphs: Lessons and Challenges: Five diverse technology companies show how it’s done. Queue, 17(2), 48-75. https://dl.acm.org/doi/pdf/10.1145/3329781.3332266
[3] Claudio Gutierrez and Juan F. Sequeda (2019) "A Brief History of Knowledge Graph's Main Ideas: A tutorial" http://knowledgegraph.today/paper.html
15. Deep Learning in Business Process Management: A reproducibility study
Supervisor: Stefan Bachhofner
Deep Learning has had tremendous success in business and science (LeCun, 2015; Schmidhuber, 2015). The paper by Krizhevsky (2012) is widely accepted to be the key paper for this success, as the authors were able to substantially decrease the error on the ImageNet Large Scale Visual Recognition Challenge 2012 (ILSVRC-2012), an image classification task, by using a convolutional neural network. As a consequence, deep learning has been applied to predictive problems in business process management as well, for example in (Nguyen, 2020), (Taymouri, 2021), (Park and Song, 2019), (Tax, 2017), (Weinzierl, 2020) and (Obodoekwe, 2022).
The objective of this thesis is to reproduce at least one paper that applies deep learning to a predictive problem in business process management. You can choose one of the papers listed above. The scope will be adapted according to whether it will be a master or bachelor thesis.
Resources
STAT 157, Introduction to Deep Learning, UC Berkeley, courses.d2l.ai/berkeley-stat-157/index.html
6.S191, Introduction to Deep Learning, Massachusetts Institute of Technology, introtodeeplearning.com
Dive into Deep Learning, d2l.ai
CS231n, Convolutional Neural Networks for Visual Recognition, Stanford University, cs231n.stanford.edu
A collection of lectures on deep learning, Massachusetts Institute of Technology, deeplearning.mit.edu
References
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural networks, 61, 85-117.
Neu, D. A., Lahann, J., & Fettke, P. (2021). A systematic literature review on state-of-the-art deep learning methods for process prediction. Artificial Intelligence Review, 1-27.
Nguyen, A., Chatterjee, S., Weinzierl, S., Schwinn, L., Matzner, M., & Eskofier, B. (2020, October). Time matters: time-aware LSTMs for predictive business process monitoring. In International Conference on Process Mining (pp. 112-123). Springer, Cham.
Taymouri, F., La Rosa, M., & Erfani, S. M. (2021). A deep adversarial model for suffix and remaining time prediction of event sequences. In Proceedings of the 2021 SIAM International Conference on Data Mining (SDM) (pp. 522-530). Society for Industrial and Applied Mathematics.
G. Park and M. Song, "Prediction-based Resource Allocation using LSTM and Minimum Cost and Maximum Flow Algorithm," 2019 International Conference on Process Mining (ICPM), 2019, pp. 121-128, doi: 10.1109/ICPM.2019.00027.
Park, G., & Song, M. (2020). Predicting performances in business processes using deep neural networks. Decision Support Systems, 129, 113191.
Tax, N., Verenich, I., La Rosa, M., Dumas, M. (2017). Predictive Business Process Monitoring with LSTM Neural Networks. In: Dubois, E., Pohl, K. (eds) Advanced Information Systems Engineering. CAiSE 2017. Lecture Notes in Computer Science(), vol 10253. Springer, Cham. doi.org/10.1007/978-3-319-59536-8_30
Weinzierl, S., Zilker, S., Brunk, J., Revoredo, K., Matzner, M., & Becker, J. (2020, September). XNAP: making LSTM-based next activity predictions explainable by using LRP. In International Conference on Business Process Management (pp. 129-141). Springer, Cham.
Obodoekwe E, Fang X, Lu K. Convolutional Neural Networks in Process Mining and Data Analytics for Prediction Accuracy. Electronics. 2022; 11(14):2128. doi.org/10.3390/electronics11142128
16. Performance Evolution of In-Knowledge Graph Tasks: A Structured Literature Review
Supervisor: Stefan Bachhofner
Knowledge graphs (KGs) are a means to model a domain of interest via relationships between objects, where the objects are the nodes and the relationships are the edges of a graph (Hogan, 2020). See (Rotmensch, 2017) for an example from medicine. After construction, a KG can be used for downstream tasks, which are grouped into In-KG and Out-of-KG tasks (Wang, 2017); these are sometimes also called applications, but we stick with tasks here. The In-KG tasks are link prediction, triple classification, entity classification, and entity resolution. In the thesis, we are interested in these In-KG tasks and in which methods of the past 10 to 12 years have led to performance gains on them. Within this, we are particularly interested in comparing methods that use knowledge graph embeddings with those that do not. Knowledge graph embeddings are graph embeddings: the nodes and edges of the knowledge graph are mapped into a continuous vector space, which is referred to as embedding a knowledge graph (Wang, 2017). Embeddings are, however, not limited to nodes and edges but are also computed for substructures (subsets of nodes and/or edges) and even for the whole graph (Cai, 2018).
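To give a feeling for how embeddings are used in an In-KG task such as link prediction, the sketch below scores candidate triples with a translation-based scoring function in the style of TransE, where a triple (head, relation, tail) is considered plausible if head + relation lies close to tail in the vector space. TransE is chosen here only as a well-known illustration and is not prescribed by the topic; the two-dimensional vectors are hand-set for the example rather than learned, and all names are hypothetical.

```python
import math


def transe_score(h, r, t):
    """Negative Euclidean distance between h + r and t (higher is more plausible)."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation, entities, relations):
    """Rank all other entities as candidate tails for (head, relation, ?)."""
    h, r = entities[head], relations[relation]
    scored = [
        (name, transe_score(h, r, vec))
        for name, vec in entities.items()
        if name != head
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hand-set toy embeddings; a real system would learn these from the graph.
entities = {
    "vienna": [0.0, 0.0],
    "austria": [1.0, 0.0],
    "germany": [1.0, 2.0],
}
relations = {"capital_of": [1.0, 0.0]}

ranking = rank_tails("vienna", "capital_of", entities, relations)
print(ranking)
```

In published evaluations, such rankings are typically summarized by rank-based metrics over a held-out set of triples, which is the kind of performance figure the literature review would track over time.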
The objective of this thesis is to conduct a comprehensive structured literature review (SLR) on the performance evolution of In-KG tasks. Within this SLR, we are particularly interested in knowledge graph embeddings. We want to understand whether, and to what extent, knowledge graph embeddings lead to performance gains for In-KG tasks. In other words, the comparison between methods that use embeddings and those that do not is of particular interest. In the thesis, the student will systematically characterise the data sets used for In-KG tasks, the methods used for In-KG tasks, how the performance on In-KG tasks has changed over time, and to what extent embeddings have or have not improved the performance. The student might also be asked to give her/his opinion on the current state of the art. The scope will be adapted according to whether it will be a master or bachelor thesis.
Resources
CS 520, Knowledge Graphs, Stanford University, https://web.stanford.edu/class/cs520/, web.stanford.edu/~vinayc/kg/notes/Table_Of_Contents.html
References
Hogan, A., Blomqvist, E., Cochez, M., d'Amato, C., de Melo, G., Gutierrez, C., ... & Zimmermann, A. (2020). Knowledge graphs. arXiv preprint arXiv:2003.02320.
Rotmensch, M., Halpern, Y., Tlimat, A., Horng, S., & Sontag, D. (2017). Learning a health knowledge graph from electronic medical records. Scientific reports, 7(1), 1-11.
Wang, Q., Mao, Z., Wang, B., & Guo, L. (2017). Knowledge graph embedding: A survey of approaches and applications. IEEE Transactions on Knowledge and Data Engineering, 29(12), 2724-2743.
Cai, H., Zheng, V. W., & Chang, K. C. C. (2018). A comprehensive survey of graph embedding: Problems, techniques, and applications. IEEE Transactions on Knowledge and Data Engineering, 30(9), 1616-1637.
17. Performance Evolution of Out-of-Knowledge Graph Tasks: A Structured Literature Review
Supervisor: Stefan Bachhofner
Knowledge graphs (KGs) are a means of modelling a domain of interest via relationships between objects, where the objects are the nodes and the relationships are the edges of a graph (Hogan, 2020). See (Rotmensch, 2017) for an example from medicine. After construction, a KG can be used for downstream tasks, which are grouped into In-KG and Out-of-KG tasks (Wang, 2017); these are sometimes also called applications, but we use the term tasks here. The Out-of-KG tasks are relation extraction, question answering, and recommender systems. In this thesis, we are interested in these Out-of-KG tasks and in which methods of the past 10 to 12 years have led to performance gains on them. Within this, we are particularly interested in comparing methods that use knowledge graph embeddings with those that do not. Knowledge graph embeddings are graph embeddings: the nodes and edges of the knowledge graph are mapped into a continuous vector space, which is referred to as embedding a knowledge graph (Wang, 2017). Embeddings are, however, not limited to nodes and edges; they are also computed for substructures (subsets of nodes and/or edges) and even for the whole graph (Cai, 2018).
The objective of this thesis is to conduct a comprehensive structured literature review (SLR) on the performance evolution of Out-of-KG tasks. Within this SLR, we are particularly interested in knowledge graph embeddings: we want to understand whether, and to what extent, they lead to performance gains for Out-of-KG tasks, so the comparison between methods that use embeddings and those that do not is of particular interest. In the thesis, the student will systematically characterise the data sets used for Out-of-KG tasks, the methods used for Out-of-KG tasks, how the performance on Out-of-KG tasks has changed over time, and to what extent embeddings have or have not improved the performance. The student might also be asked to give her/his opinion on the current state of the art. The scope will be adapted according to whether it will be a master or bachelor thesis.
Resources
CS 520, Knowledge Graphs, Stanford University, https://web.stanford.edu/class/cs520/, web.stanford.edu/~vinayc/kg/notes/Table_Of_Contents.html
References
Hogan, A., Blomqvist, E., Cochez, M., d'Amato, C., de Melo, G., Gutierrez, C., ... & Zimmermann, A. (2020). Knowledge graphs. arXiv preprint arXiv:2003.02320.
Rotmensch, M., Halpern, Y., Tlimat, A., Horng, S., & Sontag, D. (2017). Learning a health knowledge graph from electronic medical records. Scientific reports, 7(1), 1-11.
Wang, Q., Mao, Z., Wang, B., & Guo, L. (2017). Knowledge graph embedding: A survey of approaches and applications. IEEE Transactions on Knowledge and Data Engineering, 29(12), 2724-2743.
Cai, H., Zheng, V. W., & Chang, K. C. C. (2018). A comprehensive survey of graph embedding: Problems, techniques, and applications. IEEE Transactions on Knowledge and Data Engineering, 30(9), 1616-1637.
18. Examining Cognitive Effectiveness of Process Mining Representations
Supervisor: Djordje Djurica
So far, research on process mining has been largely concerned with devising new algorithms for automatic discovery and conformance checking. These algorithms are commonly evaluated for their effectiveness, which is often measured using precision and recall against a gold standard. What is surprising is that hardly any user studies have been conducted on the use of process mining tools and of the visual representations generated from event logs. That is a problem, because precision and recall evaluations ignore the format in which process mining outputs are presented to a user and its relationship to the user’s tasks at hand. Insights from process modeling research cannot be readily applied, for two reasons. First, visual representations generated by process mining algorithms extend beyond the set of languages studied in modeling research and enhance existing languages with additional information. Second, the tasks considered in modeling research focus on general understanding, while process mining users have more specific analytical tasks with a selective focus on parts of the process.
Therefore, the aim of this thesis is to conduct an experiment into the effectiveness of different process mining representations and tasks.
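For orientation, the kind of algorithm-centred evaluation described above can be sketched as follows. This is a deliberately simplified, set-based reading of precision and recall in which the discovered directly-follows relations are compared with a gold standard; process mining research also uses behaviour-based definitions, and the activity names and relations below are invented for illustration.

```python
def precision_recall(discovered, gold):
    """Set-based precision and recall of discovered items against a gold standard."""
    discovered, gold = set(discovered), set(gold)
    true_positives = len(discovered & gold)
    precision = true_positives / len(discovered) if discovered else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical directly-follows relations (activity A is directly followed by B).
gold_relations = {
    ("register", "check"),
    ("check", "approve"),
    ("check", "reject"),
    ("approve", "pay"),
}
discovered_relations = {
    ("register", "check"),
    ("check", "approve"),
    ("approve", "pay"),
    ("pay", "register"),
    ("reject", "pay"),
}

if __name__ == "__main__":
    precision, recall = precision_recall(discovered_relations, gold_relations)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

Such numbers say nothing about how well a user can read the resulting diagram or solve an analytical task with it, which is the gap the experiment in this thesis addresses.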
References:
Moody, D. (2009). The “physics” of notations: toward a scientific basis for constructing visual notations in software engineering. IEEE Transactions on software engineering, 35(6), 756-779.
Mendling, J., Djurica, D., & Malinova, M. (2021, September). Cognitive effectiveness of representations for process mining. In International Conference on Business Process Management (pp. 17-22). Springer, Cham.
Wohlin, C., Runeson, P., Höst, M., Ohlsson, M. C., Regnell, B., & Wesslén, A. (2012). Experimentation in software engineering. Springer Science & Business Media.
19. Measuring the availability of Open Datasets in the web, a consolidation work in monitoring Open Data portals
Supervisors: Shahrom H. Sohi, Axel Polleres
Background
Open Data has increased in popularity, and many private and public stakeholders want to promote transparency and enable new business models; access to the data is provided by different means. The purpose is to provide direct machine-readable access to the information and to foster democracy and innovative reuse of publicly available data [1].
The movement has been under way for many years, so organisations already have a history of how they handle access to their data. The work of Neumair and Polleres, 2018 [2], analysed Open Data portals, measuring their quality and proposing metrics that assess “the goodness” of these Open Data sources. It provides a “generic formal model to represent data and metadata in web portals” [2]. This work has been implemented in Portal Watch: data.wu.ac.at/portalwatch.
The project was interrupted after the work of Thomas Weber in 2020 [3] (aic.ai.wu.ac.at/~polleres/supervised_theses/Thomas_Weber_BSc_2020.pdf), and the portals need to be monitored again. This thesis aims to continue and extend that work by analysing new sources of data and integrating them into a new version of the dashboard on the data.wu.ac.at website.
Research problem
In this Bachelor thesis, you will investigate questions such as:
RQ1: What are the open data sources that are present on these archives? What is the best way to visualise the quality of data and its availability?
RQ2: What is the most continuous source of open data and why?
The goal is to consolidate the previous work and to provide ways to communicate the state of open data portals to different stakeholders. This project suits people who are interested in and willing to deepen their knowledge of data analysis and data management techniques, in particular SQL/Python and visualisation techniques.
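As a rough idea of the kind of analysis involved, the sketch below computes a simple availability ratio per portal from already collected monitoring records. It is only an illustration: the record layout, the portal names and the rule that an HTTP status in the 2xx range counts as "available" are assumptions made for this example and are not taken from Portal Watch or from the earlier thesis [3].

```python
from collections import defaultdict

# Hypothetical monitoring records: (portal, dataset URL, HTTP status or None on timeout).
checks = [
    ("portal-a", "https://example.org/a/1.csv", 200),
    ("portal-a", "https://example.org/a/2.csv", 404),
    ("portal-a", "https://example.org/a/3.json", 200),
    ("portal-a", "https://example.org/a/4.csv", 200),
    ("portal-b", "https://example.org/b/1.csv", 200),
    ("portal-b", "https://example.org/b/2.csv", None),
]


def availability_by_portal(records):
    """Return, per portal, the share of checked datasets that answered with a 2xx status."""
    reachable = defaultdict(int)
    total = defaultdict(int)
    for portal, _url, status in records:
        total[portal] += 1
        # Assumption: only 2xx responses count as available; timeouts are stored as None.
        if status is not None and 200 <= status < 300:
            reachable[portal] += 1
    return {portal: reachable[portal] / total[portal] for portal in total}


if __name__ == "__main__":
    for portal, ratio in sorted(availability_by_portal(checks).items()):
        print(f"{portal}: {ratio:.0%} of checked datasets available")
```

Repeating such a computation over several points in time would give one possible basis for comparing how continuously different portals keep their data available (RQ2) and for feeding a dashboard.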
Meet the supervisor
Feel free to book a meeting via Calendly if you would like to discuss the topic:
calendly.com/shahrom-sohi/30min
References
[1] Gurstein, 2011 “Open data: Empowering the empowered or effective data use for everyone?”
firstmonday.org/article/view/3316/2764
[2] Neumair and Polleres, 2018 “Enabling Spatio-Temporal Search in Open Data”
research.wu.ac.at/en/publications/enabling-spatio-temporal-search-in-open-data-15
[3] Weber, 2020 “Open Dataset Archive”, Bachelor Thesis
aic.ai.wu.ac.at/~polleres/supervised_theses/Thomas_Weber_BSc_2020.pdf
20. Accessing open transport information for service oriented mobility
Supervisor: Shahrom H. Sohi
Background
The growing population in urban areas challenges transportation practitioners to design more efficient mobility solutions. At the same time, new forms of mobility are emerging that are accessible digitally [1]. Transport mode choice analysis deals with the core factors that affect users when they plan a trip, such as comfort, time and cost [2]. The growing access to transport via digital interfaces reflects a transport need, and the information provided needs to take into account the factors that affect the choice of transport mode.
At the same time, Open Data has increased in popularity, and many private and public stakeholders want to promote transparency and enable new business models; access to the data is provided by different means. The purpose is to provide direct machine-readable access to the information and to foster democracy and innovative reuse of publicly available data [3]. The accessibility of such information can be assessed with modern data quality assessments [4].
Research problem
In this Bachelor thesis, you will investigate questions such as:
RQ: What transport-planning-related information is accessible as open data?
RQ: Can this information be categorised according to the factors affecting transport mode choice, and how?
RQ: What are the benefits for transport stakeholders?
At the beginning, you will familiarise yourself with the basics of the mobility domain: why do people and goods move the way they do? After gathering the essential information about transport planning and understanding what is useful as transport information, you will collect information about available datasets and analyse them. Additionally, you can propose a new use case of your choice in the mobility domain, selecting one of the Austrian mobility providers, or choose a past example. The thesis is ideal for people who are curious about the mobility world and would like to dig into Data Management Specialisations. You will work with Shahrom Sohi, who is looking for transportation challenges that combine data technologies and international best practices.
Meet the supervisor
Feel free to book a meeting via Calendly if you would like to discuss the topic:
calendly.com/shahrom-sohi/30min
References
[1] Shaheen et al 2020 “Mobility on Demand Planning and Implementation: Current Practices, Innovations, and Emerging Mobility Futures”
rosap.ntl.bts.gov/view/dot/50553
[2] Geurs and van Wee 2004 “Accessibility evaluation of land-use and transport strategies: review and research directions”
www.sciencedirect.com/science/article/pii/S0966692303000607
[3] Gurstein, 2011 “Open data: Empowering the empowered or effective data use for everyone?”
firstmonday.org/article/view/3316/2764
[4] Neumair and Polleres, 2018 “Enabling Spatio-Temporal Search in Open Data”
research.wu.ac.at/en/publications/enabling-spatio-temporal-search-in-open-data-15
21. Analysing the evolution of community-driven (sub-)schemas within Wikidata
Supervisors: Axel Polleres and Nicolas Ferranti
Wikidata is a collaborative knowledge graph that is not structured according to predefined ontologies. Its schema evolves bottom-up, defined by its users. A recent position paper [1] proposes a methodology to investigate how semantics develop in sub-schemas used by particular, domain-specific communities within the Wikidata knowledge graph. In this thesis, some of the tasks suggested in that paper for analysing communities should be prototypically implemented, namely: (1) an approach to identify the domain sub-schema and its related communities from a set of given (domain-specific) classes; (2) an analysis of the identified sub-schemas and communities, including their evolution over time. The overall goal is to gain better insights into the communities contributing to Wikidata, raising the potential for improving Wikidata and for its internal re-use by domain experts.
[1] Sofia Baroncini, Margherita Martorana, Mario Scrocca, Zuzanna Smiech, and Axel Polleres. Analysing the evolution of community-driven (sub-)schemas within Wikidata. In Proceedings of the 3rd Wikidata Workshop (co-located with ISWC2022), October 2022. to appear. [ http://polleres.net/publications/baro-etal-2022WD.pdf ]
22. The self concept in Spiritual Knowledge Management
Supervisors: Alexander Kaiser, Birgit Fordinal
The field of knowledge management (KM) has undergone a fundamental transformation, and new ideas to re-specify the future role of KM research and practice in a changing and increasingly dynamic world are currently being discussed. One such new idea is the concept of Spiritual Knowledge Management. The core process of Spiritual Knowledge Management is to enable, manage and organize the deep learning process in order to develop the best version of oneself (individual) or itself (organization). One fundamental dimension and key element of spirituality, which can be found in almost all definitions of and approaches to spirituality, even if they are sometimes quite different, is the self. Spirituality is inseparably connected with a continuous evolution of the self towards a fully developed and fully unfolded self.
This understanding of spirituality is in line with a number of authors - from very different fields and backgrounds - who all distinguish between different forms of the self over time.
Richard Boyatzis, who comes from the field of organizational behaviour and coaching, differentiates between the real self, that is, the self a person currently is, and the ideal self, that is, the self a person could be [Boyatzis and Akrivou, 2006], [Boyatzis and Dhar, 2021]. Otto Scharmer, whose background is in leadership and change management as well as action research, distinguishes between one’s current “self” and the emerging future “Self” that represents one’s greatest potential [Scharmer and Kaufer, 2013]. Richard Rohr comes from Catholic spirituality and differentiates between the true self and the false self [Rohr, 2013]. He argues that the true self is that part of a person that knows who you are, and whose you are. Matthew Kelly, whose background is consulting as well as spirituality, proposes the concept of a “best-version-of-myself” in comparison to the current self [Kelly, 2004] and suggests that by finding legitimate needs, deepest desires and talents, one may be able to find this “best-version-of-myself” [Kelly, 2017]. From a completely different background comes the knowledge management theorist Ikujiro Nonaka. He describes knowledge creation as a continuous self-transcendental process through which one transcends the boundary of the old self (individually or as an organization) into a new self by acquiring and creating a new context, a new world view and novel knowledge [Nonaka and Konno, 1998]. The psychiatrist and psychotherapist Viktor E. Frankl also focuses on the self. He was the first to introduce the term self-transcendence in detail and argued that human existence is always directed to something, or someone, other than itself; he termed this constitutive characteristic of human existence “self-transcendence” [Frankl, 1966]. Stam et al. [Stam et al., 2014] argue that the self-concept is how we perceive ourselves and what we know about ourselves. At the same time, they differentiate between the self as we currently experience it and the possible self.
In this bachelor thesis, the different self concepts and the different notions of best version of self, ideal self, true self, self, etc. will not only be outlined and described, but compared with each other. The main interest here is in the underlying assumptions and concepts of the different self concepts and how they fit together and/or also differ fundamentally.
This bachelor thesis can only be assigned to students who have a strong interdisciplinary orientation and interest and who are highly open-minded.
A preliminary meeting with one of the two supervisors is mandatory!
References:
[Boyatzis and Dhar, 2021] Boyatzis, R. and Dhar, U. (2021). Dynamics of the ideal self. Journal of Management Development, 41(1):1–9.
[Boyatzis and Akrivou, 2006] Boyatzis, R. E. and Akrivou, K. (2006). The ideal self as the driver of intentional change. Journal of management development.
[Bratianu, 2015] Bratianu, C. (2015). Spiritual knowledge. Organizational knowledge dynamics: Managing knowledge creation, acquisition, sharing, and transformation, pages 72–102.
[Bratianu, 2017] Bratianu, C. (2017). Emotional and spiritual knowledge. In Knowledge and Project Management, pages 69–91. Springer.
[Frankl, 1966] Frankl, V. E. (1966). Self-transcendence as a human phenomenon. Journal of Humanistic Psychology, 6(2):97–106.
[Grisold and Kaiser, 2017] Grisold, T. and Kaiser, A. (2017). Leaving behind what we are not: Applying a systems thinking perspective to present unlearning as an enabler for finding the best version of the self. Journal of Organisational Transformation & Social Change, 14(1):39–55.
[Kelly, 2004] Kelly, M. (2004). The rhythm of life: Living every day with passion and purpose. Simon and Schuster.
[Kelly, 2017] Kelly, M. (2017). Perfectly Yourself: Discovering God’s Dream for You. Beacon Publishing.
[Nonaka and Toyama, 2007] Nonaka, I. and Toyama, R. (2007). Strategic management as distributed practical wisdom (phronesis). Industrial and corporate change, 16(3):371–394.
[Rocha and Pinheiro, 2019] Rocha, R. and Pinheiro, P. (2019). Spirituality in knowledge management: Systematic literature review and future studies suggestions. In European Conference on Knowledge Management, volume 1, pages 892–XXVI. Academic Conferences and Publishing International.
[Rohr, 2011] Rohr, R. (2011). Falling upward: A spirituality for the two halves of life. John Wiley & Sons.
[Rohr, 2013] Rohr, R. (2013). Immortal diamond: The search for our true self. John Wiley & Sons.
[Scharmer, 2001] Scharmer, C. (2001). Self-transcending knowledge. sensing and organizing around emerging opportunities. Journal of Knowledge Management, 5(2):137–150.
[Scharmer, 2009] Scharmer, C. O. (2009). Theory U: Learning from the future as it emerges. Berrett-Koehler Publishers.
[Stam et al., 2014] Stam, D., Lord, R. G., Knippenberg, D. v., and Wisse, B. (2014). An image of who we might become: Vision communication, possible selves, and vision pursuit. Organization Science, 25(4):1172–1194.
[Wong, 2016] Wong, P. T. (2016). Meaning-seeking, self-transcendence, and well-being. In Logotherapy and existential analysis, pages 311–321. Springer.
23. Exploring the Relationship between Practical Wisdom and Performance: Employing the Organizational Phronesis Scale
Supervisor: Florian Kragulj
The concept of phronesis (i.e., practical wisdom), dating back to Aristotle, has recently been “rediscovered” and has entered the stage of knowledge management (Nonaka & Takeuchi, 2019, 2021). In essence, it is about doing the right thing in a particular context to promote the common good. However, the concept remains theoretically elusive and empirically difficult to test. In a recent attempt to address this shortcoming, Rocha et al. (2021a, 2021b, in prep.) propose the Organizational Phronesis Scale (OPS).
Phronesis is considered to have a positive influence on (organizational) performance. The newly proposed OPS allows this relationship to be investigated empirically. In this bachelor thesis, you will theoretically relate organizational phronesis to one or more concepts of organizational performance, and you will be among the first to use the OPS in combination with one or more other scales that you identify in the literature. You will perform a basic statistical analysis (correlation) of empirical data, which you will need to obtain, and discuss your findings.
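The basic correlation analysis mentioned above could, for instance, look like the following sketch, which computes the Pearson correlation coefficient between two lists of scores. The numbers are invented placeholders standing in for per-organization OPS scores and scores on a performance scale; they are not real data, and whether Pearson's r or another coefficient is appropriate depends on the scales you end up using.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return covariance / math.sqrt(var_x * var_y)


# Invented placeholder scores, one pair per surveyed organization.
ops_scores = [3.2, 4.1, 2.8, 4.5, 3.9]
performance_scores = [3.0, 4.0, 2.5, 4.8, 3.6]

if __name__ == "__main__":
    print(f"r = {pearson_r(ops_scores, performance_scores):.2f}")
```

In practice you would typically rely on a statistics package rather than a hand-written function (Python 3.10 and later, for example, ship `statistics.correlation`), and you would report the sample size and a significance test alongside the coefficient.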
Nonaka, I., & Takeuchi, H. (2021). Humanizing strategy. Long Range Planning, 102070.
Nonaka, I., & Takeuchi, H. (2019). The wise company: How companies create continuous innovation. Oxford University Press.
Rocha G. R., Pinheiro, P., D‘Angelo, M., & Kragulj, F. (2021) Organizational Phronesis Scale Development. 22nd European Conference on Knowledge Management - ECKM 2021
Rocha G. R., Pinheiro, P., Kragulj, F., & Nunes C. (2021) There remains much to learn about organizational phronesis. Theory and Applications in the Knowledge Economy - TAKE 2021
Rocha G. R., Pinheiro, P., Kragulj, F., & Nunes C. (in prep.) ONE STEP TOWARDS RECOGNIZING THE PRACTICALLY WISE COMPANY: MEASUREMENT AND VALIDITY