BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Institut de Science des Données - ECPv6.16.2//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:Institut de Science des Données
X-ORIGINAL-URL:https://isdm.umontpellier.fr
X-WR-CALDESC:Events for Institut de Science des Données
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:Europe/Paris
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:20230326T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:20231029T030000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:20240331T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:20241027T030000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:20250330T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:20251026T030000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:20260329T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:20261025T030000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=Europe/Paris:20250310T153000
DTEND;TZID=Europe/Paris:20250310T153000
DTSTAMP:20260624T171209
CREATED:20250603T090108Z
LAST-MODIFIED:20250603T090108Z
UID:10000420-1741620600-1741620600@isdm.umontpellier.fr
SUMMARY:Distributional Matrix Completion via Nearest Neighbors in the Wasserstein Space
DESCRIPTION:Campus St Priest (860 Rue Saint Priest 34095 Montpellier Cedex 5)\, bat. 5\, Room: 02.124\nMachine Learning in Montpellier\, Theory & Practice\nJacob Feitelberg \nWe study the problem of distributional matrix completion: Given a sparsely observed matrix of empirical distributions\, we seek to impute the true distributions associated with both observed and unobserved matrix entries. This is a generalization of traditional matrix completion where the observations per matrix entry are scalar-valued. To do so\, we utilize tools from optimal transport to generalize the nearest neighbors method to the distributional setting. Under a suitable latent factor model on probability distributions\, we establish that our method recovers the distributions in the Wasserstein metric. We demonstrate through simulations that our method (i) provides better distributional estimates for an entry compared to using observed samples for that entry alone\, (ii) yields accurate estimates of distributional quantities such as standard deviation and value-at-risk\, and (iii) inherently supports heteroscedastic distributions. In addition\, we demonstrate our method on a real-world quarterly earnings predictions dataset. We also prove novel asymptotic results for Wasserstein barycenters over one-dimensional distributions. \n            Visio
URL:https://isdm.umontpellier.fr/event/distributional-matrix-completion-via-nearest-neighbors-in-the-wasserstein-space-2/
CATEGORIES:Séminaire
ATTACH;FMTTYPE=image/jpeg:https://isdm.umontpellier.fr/wp-content/uploads/2025/02/ml-mpt-1.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Paris:20250306T153000
DTEND;TZID=Europe/Paris:20250306T153000
DTSTAMP:20260624T171209
CREATED:20250603T090111Z
LAST-MODIFIED:20250603T090111Z
UID:10000423-1741275000-1741275000@isdm.umontpellier.fr
SUMMARY:How Should We Construct Prediction Sets? Insights from Conformal Prediction
DESCRIPTION:Campus St Priest (860 Rue Saint Priest 34095 Montpellier Cedex 5)\, bat. 5\, Room: 02.124\nMachine Learning in Montpellier\, Theory & Practice\nTiffany Ding (UC Berkeley) \nIn the first part of the talk\, I will present some reflections on the purpose of prediction sets and the role that statistics can play in forming useful prediction sets. In particular\, I will discuss how prediction sets fit into a decision-making pipeline and the different kinds of decisions one may make using a prediction set. In the second part of the talk\, I will describe a particular statistically motivated set-generating procedure for the classification setting called clustered conformal prediction\, which gives all classes an equal chance of being correctly included in the prediction set (“class-conditional coverage”). This procedure can be useful in situations where it is important to identify instances of all classes\, even the rare ones. We demonstrate the performance of this method on ImageNet and other image classification datasets. \n            Visio
URL:https://isdm.umontpellier.fr/event/how-should-we-construct-prediction-sets-insights-from-conformal-prediction-2/
CATEGORIES:Séminaire
ATTACH;FMTTYPE=image/jpeg:https://isdm.umontpellier.fr/wp-content/uploads/2025/02/ml-mpt-1.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Paris:20250303T000000
DTEND;TZID=Europe/Paris:20250303T000000
DTSTAMP:20260624T171209
CREATED:20250603T090113Z
LAST-MODIFIED:20250603T090113Z
UID:10000424-1740960000-1740960000@isdm.umontpellier.fr
SUMMARY:Predicting benefit from adjuvant therapy with corticosteroids in community-acquired pneumonia: a data-driven analysis of randomised trials
DESCRIPTION:Campus St Priest\nMachine Learning in Montpellier\, Theory & Practice\nJim Smit \nDespite several randomised controlled trials (RCTs) on the use of adjuvant treatment with corticosteroids in patients with community-acquired pneumonia (CAP)\, the effect of this intervention on mortality remains controversial. We aimed to evaluate heterogeneity of treatment effect (HTE) of adjuvant treatment with corticosteroids on 30-day mortality in patients with CAP. […] \n            Visio
URL:https://isdm.umontpellier.fr/event/predicting-benefit-from-adjuvant-therapy-with-corticosteroids-in-community-acquired-pneumonia-a-data-driven-analysis-of-randomised-trials-2/
CATEGORIES:Séminaire
ATTACH;FMTTYPE=image/jpeg:https://isdm.umontpellier.fr/wp-content/uploads/2025/02/ml-mpt-1.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Paris:20250224T160000
DTEND;TZID=Europe/Paris:20250224T160000
DTSTAMP:20260624T171209
CREATED:20250603T090114Z
LAST-MODIFIED:20250603T090114Z
UID:10000425-1740412800-1740412800@isdm.umontpellier.fr
SUMMARY:Doubly Robust and Efficient Calibration of Prediction Sets for Censored Time-to-Event Outcomes
DESCRIPTION:Campus St Priest\nRebecca Farina \nOur objective is to construct well-calibrated prediction sets for a time-to-event outcome subject to right-censoring with guaranteed coverage. Our approach is inspired by modern conformal inference literature\, in that\, unlike classical frameworks\, we obviate the need for a well-specified parametric or semi-parametric survival model to accomplish our goal. In contrast to existing conformal prediction methods for survival data\, which restrict censoring to be of Type I\, whereby potential censoring times are assumed to be fully observed on all units in both training and validation samples\, we consider the more common right-censoring setting in which either only the censoring time or only the event time of primary interest is directly observed\, whichever comes first. Under a standard conditional independence assumption between the potential survival and censoring times given covariates\, we propose and analyze two methods to construct valid and efficient lower predictive bounds for the survival time of a future observation. The proposed methods build upon modern semiparametric efficiency theory for censored data\, in that the first approach incorporates inverse-probability-of-censoring weighting (IPCW)\, while the second approach is based on augmented-inverse-probability-of-censoring weighting (AIPCW). For both methods\, we formally establish asymptotic coverage guarantees\, and demonstrate both via theory and empirical experiments that AIPCW substantially improves efficiency over IPCW in the sense that its coverage error bound is of second-order mixed bias type\, that is doubly robust\, and therefore guaranteed to be asymptotically negligible relative to the coverage error of IPCW. \n            Visio
URL:https://isdm.umontpellier.fr/event/doubly-robust-and-efficient-calibration-of-prediction-sets-for-censored-time-to-event-outcomes-2/
CATEGORIES:Séminaire
ATTACH;FMTTYPE=image/jpeg:https://isdm.umontpellier.fr/wp-content/uploads/2025/02/ml-mpt-1.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Paris:20250217T160000
DTEND;TZID=Europe/Paris:20250217T160000
DTSTAMP:20260624T171209
CREATED:20250603T090116Z
LAST-MODIFIED:20250603T090116Z
UID:10000428-1739808000-1739808000@isdm.umontpellier.fr
SUMMARY:Rethinking Early Stopping: Refine\, Then Calibrate
DESCRIPTION:Inria Montpellier\, St-Priest Campus\, Building 5\, Room 02/022\nMachine Learning in Montpellier\, Theory & Practice \nMachine learning classifiers often produce probabilistic predictions that are critical for accurate and interpretable decision-making in various domains. The quality of these predictions is generally evaluated with proper losses like cross-entropy\, which decompose into two components: calibration error assesses general under/overconfidence\, while refinement error measures the ability to distinguish different classes.\nIn this paper\, we provide theoretical and empirical evidence that these two errors are not minimized simultaneously during training. Selecting the best training epoch based on validation loss thus leads to a compromise point that is suboptimal for both calibration error and\, most importantly\, refinement error. To address this\, we introduce a new metric for early stopping and hyperparameter tuning that makes it possible to minimize refinement error during training. The calibration error is minimized after training\, using standard techniques. Our method integrates seamlessly with any architecture and consistently improves performance across diverse classification tasks. \n            Visio\n                        Article
URL:https://isdm.umontpellier.fr/event/rethinking-early-stopping-refine-then-calibrate-2/
CATEGORIES:Séminaire
ATTACH;FMTTYPE=image/jpeg:https://isdm.umontpellier.fr/wp-content/uploads/2025/02/ml-mpt-1.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Paris:20250217T100000
DTEND;TZID=Europe/Paris:20250217T120000
DTSTAMP:20260624T171209
CREATED:20250603T090115Z
LAST-MODIFIED:20250603T090115Z
UID:10000427-1739786400-1739793600@isdm.umontpellier.fr
SUMMARY:The EU Rail Knowledge Graph
DESCRIPTION:Room 104\, building 11 (known as “le château”)\, 2 Place Pierre Viala\, Campus La Gaillarde\, 34000 Montpellier\nSESAME – SEmantic web SeminAr MontpEllier \nThe European Union Agency for Railways (ERA) is the agency in charge of facilitating the implementation of efficient\, safe\, and interoperable rail transport across member states in Europe. To this end\, ERA maintains different registers with a legal mandate covering different railway domains (infrastructure\, rolling stock\, signalling\, safety\, humans\, etc.). Since 2020\, ERA has adopted a data-centric organisation strategy\, which includes applying semantic web technology to the different registers. \nThis talk will dive deeper into the current implementation\, as a knowledge graph\, of two registers: the Register of Infrastructure (RINF) and the European Register of Authorised Types of Vehicles (ERATV). I will focus on the ontology development and the process management of the 50+ classes and 461 properties\, as well as the SHACL rules (around 100). The talk will also showcase two applications consuming the knowledge graph: one for data retrieval by non-SPARQL experts\, and a route compatibility check that answers whether a given railway vehicle can travel the route between two operational points. The talk will also highlight some challenges faced by a public authority when adopting a data-centric approach. \nThe presentation will be in French and will begin with coffee offered by the Halles de l’IA of the Université de Montpellier outside room 104 at 10:00.
URL:https://isdm.umontpellier.fr/event/the-eu-rail-knowledge-graph-2/
CATEGORIES:Séminaire
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Paris:20250213T160000
DTEND;TZID=Europe/Paris:20250213T160000
DTSTAMP:20260624T171209
CREATED:20250603T090135Z
LAST-MODIFIED:20250603T090135Z
UID:10000431-1739462400-1739462400@isdm.umontpellier.fr
SUMMARY:Using and contributing to the data.table package for efficient big data analysis
DESCRIPTION:Inria Montpellier\, St-Priest Campus\, Building 5\, Room 01/124\nMachine Learning in Montpellier\, Theory & Practice \nData.table is one of the most efficient open-source in-memory data manipulation packages available today. First released to CRAN by Matt Dowle in 2006\, it continues to grow in popularity\, and now over 1500 other CRAN packages depend on data.table. This talk will start with data reading from CSV\, discuss basic and advanced data manipulation topics\, and finally will end with a discussion about how you can contribute to data.table.\nhttps://github.com/tdhock/2023-10-LatinR-data.table?tab=readme-ov-file#source-files-for-latinr-datatable-tutorial-slides \n            Visio
URL:https://isdm.umontpellier.fr/event/using-and-contributing-to-the-data-table-package-for-efficient-big-data-analysis-2-2/
CATEGORIES:Séminaire
ATTACH;FMTTYPE=image/jpeg:https://isdm.umontpellier.fr/wp-content/uploads/2025/02/ml-mpt-1.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Paris:20250203T170000
DTEND;TZID=Europe/Paris:20250203T170000
DTSTAMP:20260624T171209
CREATED:20250603T094005Z
LAST-MODIFIED:20250603T094005Z
UID:10000437-1738602000-1738602000@isdm.umontpellier.fr
SUMMARY:Personalizing Treatment with Causal Inference and Scalably Evaluating LLMs in Medicine
DESCRIPTION:François Grolleau \nThis talk examines two critical aspects of data-driven medicine: personalized treatment strategies using causal inference and the robust evaluation of large language models (LLMs) for clinical applications. \nIn the first part\, we present a novel approach to personalized medicine\, applying causal statistical learning to observational data to develop individualized treatment rules. We focus on optimizing the timing of renal replacement therapy initiation in acute kidney injury\, demonstrating: (i) the estimation and validation of an optimal dynamic strategy\, and (ii) a comprehensive framework for evaluating individualized rules using observational data. \nIn the second part\, we tackle the challenge of evaluating LLMs in medicine\, focusing on the generation of hospital course summaries. Current evaluation methods are often either unscalable (physician-led) or untrustworthy for clinical settings (LLM-as-a-judge). We propose a rubric-based approach to LLM evaluation that combines the scalability of automated methods with the trustworthiness demanded by medical applications\, paving the way for responsible deployment of LLMs in healthcare. \n            Visio
URL:https://isdm.umontpellier.fr/event/personalizing-treatment-with-causal-inference-and-scalably-evaluating-llms-in-medicine-2/
CATEGORIES:Séminaire
ATTACH;FMTTYPE=image/jpeg:https://isdm.umontpellier.fr/wp-content/uploads/2025/02/ml-mpt2028129-TM2f7V.tmp_.jpg
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Paris:20241118T150000
DTEND;TZID=Europe/Paris:20241118T150000
DTSTAMP:20260624T171209
CREATED:20241031T132136Z
LAST-MODIFIED:20241031T132136Z
UID:10000305-1731942000-1731942000@isdm.umontpellier.fr
SUMMARY:SESAME Seminar by Professor Preslav Nakov
DESCRIPTION:Factuality Challenges in the Era of Large Language Models: Can we Keep LLMs Safe and Factual?\nAs part of the cross-cutting AI and Data Science theme\, LIRMM will host Professor Preslav Nakov on Monday 18 November at 3 pm\; he will give a seminar in the Amphi St Priest (JJ Moreau – Bât 2). \nAttendees are welcome from 2:30 pm for coffee. The seminar is free and requires no registration. \n  \nAbstract\nWe will discuss the risks\, the challenges\, and the opportunities that Large Language Models (LLMs) bring regarding factuality. We will then delve into our recent work on using LLMs for fact-checking\, on detecting machine-generated text\, and on fighting the ongoing misinformation pollution with LLMs. We will also discuss work on safeguarding LLMs\, and the safety mechanisms we incorporated in Jais-chat\, the world’s best open Arabic-centric foundation and instruction-tuned LLM\, based on our Do-Not-Answer dataset. Finally\, we will present a number of LLM fact-checking tools recently developed at MBZUAI: (i) LM-Polygraph\, a tool to predict an LLM’s uncertainty in its output using cheap and fast uncertainty quantification techniques\, (ii) Factcheck-Bench\, a fine-grained evaluation benchmark and framework for fact-checking the output of LLMs\, (iii) OpenFactVerification (Loki)\, an open-source tool for fact-checking the output of LLMs\, developed based on Factcheck-Bench and optimized for speed and quality\, and (iv) OpenFactCheck\, a framework for building customized fact-checking systems and for benchmarking entire LLMs. \n  \nPreslav Nakov\nProfessor and Department Chair for NLP\nNatural Language Processing\nMohamed bin Zayed University of Artificial Intelligence \nDr. Preslav Nakov is a Professor and Chair of the Natural Language Processing department at Mohamed bin Zayed University of Artificial Intelligence (MBZUAI). His research focuses on computational linguistics\, large language models\, fact-checking\, disinformation\, propaganda\, and detecting machine-generated text. He helped develop Jais\, the leading open-source Arabic-centric LLM\, and is part of MBZUAI’s LLM360 team. Nakov holds a PhD from UC Berkeley and has held roles at Qatar Computing Research Institute\, the National University of Singapore\, and Sofia University. He has authored multiple books and over 300 research papers\, receiving numerous awards for his work on fake news detection\, propaganda\, and machine-generated content. He is Chair-Elect of the European Chapter of the Association for Computational Linguistics (EACL) and serves on the editorial boards of several prestigious journals. His research has been featured in 100+ media outlets\, including MIT Technology Review\, Forbes\, and CNN.
URL:https://isdm.umontpellier.fr/event/seminaire-sesame-du-professeur-preslav-nakov/
CATEGORIES:Séminaire
ATTACH;FMTTYPE=image/png:https://isdm.umontpellier.fr/wp-content/uploads/2024/10/SESAME.png
END:VEVENT
BEGIN:VEVENT
DTSTART;TZID=Europe/Paris:20241021T133000
DTEND;TZID=Europe/Paris:20241021T133000
DTSTAMP:20260624T171209
CREATED:20240926T113805Z
LAST-MODIFIED:20240926T113805Z
UID:10000366-1729517400-1729517400@isdm.umontpellier.fr
SUMMARY:SESAME Seminar
DESCRIPTION:Representation Learning for Entity Alignment: from benchmarks to real-world data and backwards\nby Konstantin TODOROV of LIRMM \nAbstract: Entity Alignment (EA) through representation learning aims to map entities from two different Knowledge Graphs (KGs) referring to the same real-world object. The mapping is performed by embedding the entities in a shared space\, where the distance between the embeddings serves as a proxy for the similarity between the entities. While many embedding-based models perform well on synthetic benchmark datasets\, they often struggle in real-world scenarios due to the heterogeneous\, incomplete and domain-specific nature of real-world data. Despite efforts to create realistic benchmarks\, there has been little comparison of model performance across datasets of different natures. In this talk\, we analyse the semantic similarity and the KG profiles to explain the performance drop on real-world data. We take models that excel on synthetic datasets and test them on real-world data\, showing that many current methods fail to find correct entity mappings in such scenarios. Furthermore\, most models do not consider entities from outside the validation data\, limiting their ability to discover new alignments in large-scale KGs. We demonstrate how restricting the search space negatively impacts generalization\, underlining the need for more robust solutions for real-world scenarios of EA.
URL:https://isdm.umontpellier.fr/event/seminaire-sesame-octobre-2024/
CATEGORIES:Séminaire
END:VEVENT
END:VCALENDAR