Training courses

Several pre-congress seminars are planned for this edition.

These seminars are aimed mainly at students and early-career researchers (PhD students, postdocs, etc.), but also at experienced researchers who wish to discover, deepen or consolidate their knowledge of chemometrics.

The following topics will be offered as 3-hour seminars on 26 February. Each course will be either purely theoretical or a mix of theory and practice (no computers are provided). Places are limited to 25 per session.

Fees: €100 per person per course (€50 for students).

 

 

Pre-congress seminars 

 

Philippe BASTIEN (L'Oréal Recherche)

A little journey through causality

 

The goal of this course is to present the basic concepts of causality using the approaches of Pearl and Rubin. After a historical overview of causality, we will focus on resolving paradoxes, such as Simpson's paradox, that classical statistics cannot explain. We will present the graphical language of DAGs (directed acyclic graphs), proposed by Pearl to express and visualize our causal view of the world, relying on the notions of v-structures and d-separation to construct the DAG. We will show how to identify causal effects using tools such as the back-door and front-door criteria, instrumental variables, and do-calculus. We will approach counterfactuals mainly through Rubin's work on potential outcomes, and show how to eliminate confounding bias by weighting observations with the propensity score. Finally, we will introduce relevant R and Python packages.
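As an illustration of the propensity-score weighting mentioned above, here is a minimal sketch on simulated data. It is not course material: the function names, the simulated effect sizes and the stratum-frequency estimate of the propensity score (used here because the confounder is binary) are all assumptions made for this example.

```python
import numpy as np

def simulate(n=200_000, seed=0):
    """Simulated data (assumed for illustration): z confounds treatment t and outcome y."""
    rng = np.random.default_rng(seed)
    z = rng.binomial(1, 0.5, size=n)              # binary confounder
    p_treat = np.where(z == 1, 0.8, 0.2)          # treatment is more likely when z == 1
    t = rng.binomial(1, p_treat)
    y = 1.0 * t + 2.0 * z + rng.normal(0.0, 1.0, size=n)  # true causal effect of t is 1.0
    return z, t, y

def naive_difference(t, y):
    """Difference in mean outcome between treated and untreated, ignoring z."""
    return y[t == 1].mean() - y[t == 0].mean()

def ipw_effect(z, t, y):
    """Inverse-probability-weighted effect with normalized weights."""
    # Propensity score e(z) = P(t = 1 | z), estimated as the treated fraction per stratum.
    e = np.empty(len(z), dtype=float)
    for value in np.unique(z):
        mask = z == value
        e[mask] = t[mask].mean()
    w = t / e + (1 - t) / (1 - e)
    treated = np.sum(w * t * y) / np.sum(w * t)
    control = np.sum(w * (1 - t) * y) / np.sum(w * (1 - t))
    return treated - control

z, t, y = simulate()
print("naive difference:", naive_difference(t, y))
print("IPW estimate:    ", ipw_effect(z, t, y))
```

In this simulation the naive difference is inflated by the confounder, while the weighted estimate should land close to the simulated effect of 1.0. With continuous covariates, the propensity score would typically be estimated with a model such as logistic regression instead.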

 

Ludovic DUPONCHEL

Processing data from hyperspectral imaging

 

The goal of this course is to introduce the tools needed to explore hyperspectral and multispectral data from spectroscopic imaging experiments such as Raman, mid-infrared and near-infrared imaging, among others. We will look at the particular structure of the data cube, which combines spectral and spatial information. After an introduction to the instrumental principles specific to these experiments, we will first cover univariate exploration, which is the basis of imaging, emphasizing its advantages and disadvantages. We will then discuss multivariate tools such as Principal Component Analysis (PCA), unsupervised classification methods (clustering, K-means, etc.), supervised classification methods (k-Nearest Neighbors, SIMCA, PLS-DA, etc.), and Multivariate Curve Resolution (MCR-ALS). These are of course classical chemometric tools, but we will approach them in the specific context of spectroscopic imaging.

 

Jean-Michel ROGER

INRAE Montpellier

Principal component analysis applied to spectral data and preprocessing of near infrared spectra

 

Principal component analysis (PCA) is the cornerstone of linear methods for processing multivariate data. The course will present how the method works, both intuitively and formally, along with guidelines for its use. A first, simple case study on epidemiological data will illustrate the classical use of PCA. A second example, on visible and near-infrared spectra, will show a very different use of the method, opening the way to understanding many other chemometric methods, such as PLS or MCR-ALS. Spectral data, and near-infrared spectra in particular, are affected by a number of distortions that interfere with their analysis. Several preprocessing methods are available to reduce or even eliminate these effects. This course offers a review of the main preprocessing methods and a strategy for choosing which ones to apply, based on an examination of the spectral data.
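To make the two ideas above concrete, the following sketch shows PCA computed from the singular value decomposition of mean-centered data, together with one widely used spectral preprocessing, standard normal variate (SNV). It is an illustration only: the function names are chosen for this example, and SNV is picked as a representative preprocessing, not necessarily one covered in the course.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) individually."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def pca(X, n_components):
    """PCA via SVD of the column-centered matrix X (rows = samples, columns = variables)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]    # sample coordinates
    loadings = Vt[:n_components].T                     # orthonormal directions
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, loadings, explained

# Usage on random data standing in for a small set of spectra.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
scores, loadings, explained = pca(snv(X), n_components=2)
print(scores.shape, loadings.shape, explained)
```

SNV removes a per-spectrum additive offset and multiplicative scaling, which is why `snv(a * x + b)` equals `snv(x)` for any positive `a`.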

 

Raffaele VITALE

Spectral unmixing and multivariate curve resolution: principles and application

 

This course aims to provide a global perspective on the problems of spectral unmixing and multivariate curve resolution. We will begin with a general introduction to the nature, properties, and geometry of mixed spectroscopic data. We will then focus on one of the most commonly used chemometric approaches for their bilinear decomposition: MCR-ALS.

We will describe the methodological principles and the algorithmic implementation of this approach. We will present results obtained on real mixture data and discuss their interpretation not only from a physicochemical point of view, but also from a mathematical and geometric one. In this context, we will refer to recent work on the properties of the MCR-ALS algorithm and the implications of its use.
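The bilinear decomposition D ≈ C Sᵀ at the heart of MCR-ALS can be sketched as follows. This is a simplified illustration, not the implementation discussed in the course: it applies only a non-negativity constraint (by clipping), and the greedy initializer `initial_spectra` is a hypothetical helper written for this example. Full MCR-ALS implementations usually support further constraints and more careful initial estimates.

```python
import numpy as np

def initial_spectra(D, n_components):
    """Hypothetical initializer: greedily pick the most 'extreme' rows of D."""
    D = np.asarray(D, dtype=float)
    R = D.copy()
    picked = []
    for _ in range(n_components):
        i = int(np.argmax(np.linalg.norm(R, axis=1)))
        picked.append(i)
        u = R[i] / np.linalg.norm(R[i])
        R = R - np.outer(R @ u, u)          # remove the direction just selected
    return D[picked].T                      # shape (n_channels, n_components)

def mcr_als(D, n_components, n_iter=100, tol=1e-10):
    """Alternating least squares for D ≈ C @ S.T with non-negative C and S."""
    D = np.asarray(D, dtype=float)
    S = initial_spectra(D, n_components)
    previous = np.inf
    for _ in range(n_iter):
        # Update concentrations with spectra fixed, then clip negatives to zero.
        C = np.clip(np.linalg.lstsq(S, D.T, rcond=None)[0].T, 0.0, None)
        # Update spectra with concentrations fixed, then clip negatives to zero.
        S = np.clip(np.linalg.lstsq(C, D, rcond=None)[0].T, 0.0, None)
        residual = np.linalg.norm(D - C @ S.T) / np.linalg.norm(D)
        if abs(previous - residual) < tol:
            break
        previous = residual
    return C, S, residual

# Usage on a noise-free synthetic two-component mixture (assumed data).
x = np.linspace(0.0, 1.0, 80)
S_true = np.column_stack([np.exp(-((x - 0.3) / 0.1) ** 2),
                          np.exp(-((x - 0.7) / 0.1) ** 2)])
f = np.linspace(0.0, 1.0, 25)
C_true = np.column_stack([1.0 - f, f])
C, S, residual = mcr_als(C_true @ S_true.T, n_components=2)
print("relative residual:", residual)
```

The synthetic data contain the pure spectra as their first and last rows, which is the favorable case for this initializer. On real data the solution is in general not unique, which is one reason the mathematical and geometric interpretation of the results matters.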

 

Dr. Sylvie ROUSSEL

CEO, Ondalys

Review of the main Machine Learning (ML) methods

 

This course aims to provide an overview of machine learning methods applicable to laboratory instrumental data (smart data and small data, as opposed to big data and deep learning). It will start with an introduction defining Machine Learning (ML) in relation to chemometrics, followed by an overview of the main machine learning methods. Specific algorithms will then be explored in depth: shallow artificial neural networks (ANN), support vector machines (SVM), and classification and regression trees (CART) / random forests (RF). The course will conclude with an application example.

 

Marion BRANDOLINI-BUNLON, Benoît JAILLAIS, Mohamed HANAFI

Multi-block spectroscopy and metabolomics data analysis

 

The joint analysis of multiple data tables from vibrational spectroscopy or metabolomics measurements that share the same observations or variables is a real scientific asset. Classical chemometric methods such as Principal Component Analysis (PCA) and Partial Least Squares (PLS) regression have been extended to handle these so-called "multiblock" data more efficiently. The goal of this course is to introduce the concept of multiblock data and the principles of their analysis, and to make multiblock methods more accessible and easier for users to implement. In particular, we will discuss multiblock data structures and the notion of canonical factorization of multiblock data, and we will reformulate existing methods based on this notion. Given the variety of approaches that have been proposed, and of the resulting methods aimed at different goals, non-specialist users may find it hard to choose among them. This course will therefore be primarily a didactic introduction to multiblock data analysis applied to vibrational spectroscopy and metabolomics data. The analysis process, with tasks ranging from visualization of multiblock databases to innovative applications, will be presented through several case studies. The advantages and disadvantages of the different methods will also be discussed. Finally, tools with standardized and enriched graphical output will be presented for all methods. A summary of software resources available for multiblock data analysis will be provided, with a special focus on the ChemFlow tool.

 

Véronique CARIOU, Jean-Michel GALHARRET

Structural equations and their applications

 

Structural equation modeling is, in a sense, a generalization of linear regression models to complex systems. It examines the relationships between multiple blocks of data measured on the same individuals, each block consisting of its own set of observed variables. In these models, an unobserved variable is associated with each block, and we are interested in the full set of regression equations that link these variables together. The coefficients of these models can be estimated using covariance structure analysis (Jöreskog, 1970; LISREL) or the PLS approach (Wold, 1982). Covariance structure analysis is the most widely used approach in the human and social sciences. It is also the one with the most advanced foundations in terms of statistical validation. The PLS approach, also called PLS-PM or, more recently, PLS-SEM, is popular because of its ability to assign a component to each block (as in PCA), thus materializing the unobserved variables. After a brief mathematical introduction to this modeling in the context of latent variable models, we will illustrate the covariance structure analysis approach with an example from psychology. We will then present the PLS-SEM alternative, illustrated with examples from sensometrics and chemometrics. Finally, we will discuss the contribution of PLS-SEM to composite models (Dijkstra, 2013; Henseler et al., 2014).

 