Keynote Speakers
Prof. Evrim Acar Ataman
Simula Metropolitan Center for Digital Engineering
Extracting Insights from Complex Data: (Coupled) Tensor Factorizations & Applications
There is an emerging need to jointly analyze data sets collected from different sources in order to extract insights about complex systems such as the human brain or human metabolome. For instance, joint analysis of omics data (e.g., metabolomics, microbiome, genomics) holds the promise to improve our understanding of the human metabolism and facilitate precision health. Such data sets are heterogeneous – they are a collection of static and dynamic data sets. Dynamic data can often be arranged as a higher-order tensor (e.g., subjects by metabolites by time) while static data can be represented as a matrix. Tensor factorizations have been successfully used to reveal the underlying patterns in higher-order tensors, and extended to joint analysis of multimodal data through coupled matrix/tensor factorizations (CMTF). However, joint analysis of heterogeneous data sets still has many challenges, especially when the goal is to capture the underlying patterns. In this talk, we discuss CMTF models for temporal and multimodal data mining. We focus on a flexible, accurate and computationally efficient modelling and algorithmic framework that facilitates the use of a variety of constraints, loss functions and couplings with linear transformations when fitting CMTF models. Through various applications, we discuss the advantages and limitations of available CMTF methods.
Prof. Bettina Grün
Vienna University of Economics and Business
Clustering Data Using Bayesian Mixture Models
Cluster analysis aims at grouping objects and is a main task in exploratory data analysis, statistical data analysis and machine learning. Model-based approaches have the advantage that model specification and selection are performed within a principled statistical framework, facilitating interpretation, improving validation and including uncertainty quantification. Pursuing a Bayesian approach allows the specification of suitable priors, which can include a priori knowledge about the cluster structure to be detected as well as regularize the likelihood.
We will give an overview of recent advances in Bayesian model-based clustering, including prior specifications as well as computational inference tools. Suitable prior specifications need to enable the selection of the number of clusters in the data set as well as appropriate cluster distribution approximation and potentially also variable selection.
Inference methods need to cover approximation methods for the posterior such as MCMC schemes, but also post-processing methods for model selection and model identification.
Prof. Paweł Lula
Krakow University of Economics
Topic identification in analysis of scientific productivity – models, methods, and tools
The identification, modelling and evaluation of research trends should be considered as a very important part of the analysis of scientific achievements. The results of the analysis of research topics are indispensable in: evaluating the course of scientific development, analyzing leading scientific centers, designing and monitoring the implementation of scientific development policies, predicting the most promising directions of scientific developments, managing research units, and assessing interdisciplinarity in research.
In identifying research trends, the most important source of data is textual information in the form of scientific publications or their abstracts. During the initial stage of analysis, it is necessary to choose an appropriate method of representing textual information (frequency matrix, word sequences, embeddings). The next step is to choose the right approach to model building (supervised or unsupervised). The model is then built and its quality evaluated. A positive evaluation of the model justifies its implementation.
The objectives of this presentation are: to present the essential methods of identifying research topics, to conduct an evaluation of the discussed algorithms, and to present software tools to implement the topic identification process.
The presentation will include the results of research on the development of research trends in Poland in the area of social sciences.
Prof. Line Clemmensen
Technical University of Denmark
Machine learning in psychiatry with distribution shifts, fairness and explainability in mind
Machine learning (ML) finds many applications within psychiatry using multiple modalities like speech, video, biosensors, and medical health records. ML models developed on large open-source datasets can be challenged by distribution shifts and a lack of explainability or fairness for minority groups. We will dive into some of the challenges and look at ways of addressing these concerns, ranging from data representativity and fairness to explainability of models.