PyData Paris

I've just got back from two days in Paris, attending PyData. I have lots of free time at the moment, so I decided to pick an interesting conference in a cool location that I'd never visited before, and this ticked both boxes.

I didn't see too much of Paris, but what I did see was lovely. I'm actively trying to persuade my wife that we should take a family trip there for a day or two, mainly so she can sample the pastries. I chose my hotel poorly, though. The location was great (10 minutes from the Louvre), but my room directly overlooked a really busy street lined with cafes. That was great in the early evening, but the noise went on until 1am, which meant I got no sleep.

The conference itself was held at Cité des Sciences, a lovely building in equally lovely surroundings. The main lecture hall (Gaston Berger) was great for the larger talks, but the two smaller rooms had the most uncomfortable seats I've ever sat in.

The talks themselves were 30 minutes each, with a 5-minute break in between, across 3 tracks. 30 minutes seems to be in the Goldilocks zone: long enough to be informative without being so long as to fatigue the listener across a whole day.

My personal highlights were:

Applying Causal Inference in Industry: A Case Study from Glasswool Production

I like talks that have a practical element, and I found this one really informative. The speakers used causal inference to determine how much the age of a spinner used in glasswool production affected the thermal properties of the final product.

It was incredibly interesting to go beyond finding correlations to quantifying the actual impact of a confounding variable on an objective. The reading list given by the presenters has made its way onto my own reading list.

Probabilistic regression models

This talk presented a dataset with heteroskedastic properties, then walked through several modelling strategies for it, along with several evaluation methods to quantify performance. The presentation reminded me of the Classical Machine Learning module I did on my Masters, and the things I learned there turned out to be incredibly useful, so hopefully this will add a few things to my skillset that I can use in future.
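To make "heteroskedastic" concrete for myself, here is a minimal sketch of the idea. It is not the speakers' dataset or their methods: the synthetic data, the two toy noise models and the choice of Gaussian negative log-likelihood as the score are all my own assumptions, and everything is evaluated in-sample for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic heteroskedastic data (assumption): noise spread grows with x.
x = rng.uniform(0.0, 10.0, size=2000)
true_sd = 0.2 + 0.3 * x
y = 2.0 * x + rng.normal(0.0, true_sd)


def gaussian_nll(y_obs, mean, sd):
    """Mean negative log-likelihood of y_obs under N(mean, sd^2)."""
    return float(np.mean(0.5 * np.log(2 * np.pi * sd**2)
                         + (y_obs - mean) ** 2 / (2 * sd**2)))


# Shared mean model: ordinary least squares line.
slope, intercept = np.polyfit(x, y, 1)
mean_pred = slope * x + intercept
residuals = y - mean_pred

# Model A: one constant standard deviation for every point.
sd_const = np.full_like(x, residuals.std())

# Model B: standard deviation linear in x, fitted on |residuals|.
# For Gaussian noise E|e| = sd * sqrt(2 / pi), hence the rescaling.
a, b = np.polyfit(x, np.abs(residuals), 1)
sd_linear = np.clip((a * x + b) * np.sqrt(np.pi / 2), 1e-6, None)

nll_const = gaussian_nll(y, mean_pred, sd_const)
nll_linear = gaussian_nll(y, mean_pred, sd_linear)
print(f"constant-sd NLL: {nll_const:.3f}")
print(f"x-dependent-sd NLL: {nll_linear:.3f}")
```

Both models predict the same mean, so a point metric like mean squared error cannot tell them apart. A probabilistic score such as the negative log-likelihood can, and lower is better.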

Tackling Domain Shift with SKADA

Another great talk, and I'm itching to give the SKADA library a go as a result. My Masters project involves model training and this talk has made me consider writing a section on domain shift as it applies to my use-case of scheduling EV charging.

Optimal Transport in Python

I attended this one because it sounded interesting, and ended up coming away with a really useful bit of information that applies directly to my Masters project, which was entirely unexpected. Specifically, a small part of my project discusses sampling methods and compares RNG to Sobol over high dimensions. I have a few metrics to show the difference, but this talk discussed Wasserstein distances, which could be an incredibly useful metric to quantify how far my samples deviate from the optimal distribution.

For that nugget alone this talk was worth it, but I found the whole thing incredibly interesting.
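As a rough sketch of how I might try this, the snippet below compares pseudo-random and Sobol samples against a uniform reference using SciPy. Several things here are assumptions of mine rather than anything from the talk: that the "optimal distribution" is uniform on the unit hypercube, the dimension and sample counts, and the simplification of averaging 1D Wasserstein distances over each dimension instead of computing a full multivariate distance.

```python
import numpy as np
from scipy.stats import qmc, wasserstein_distance


def mean_marginal_wasserstein(samples, reference):
    """Average the 1D Wasserstein distance over each dimension's marginal."""
    return float(np.mean([
        wasserstein_distance(samples[:, d], reference[:, d])
        for d in range(samples.shape[1])
    ]))


dims = 8   # illustrative dimension count (assumption)
m = 8      # Sobol works best with 2**m points; here 256

# Pseudo-random samples on the unit hypercube.
rng = np.random.default_rng(0)
pseudo_random = rng.random((2**m, dims))

# Scrambled Sobol samples of the same size.
sobol = qmc.Sobol(d=dims, scramble=True, seed=0).random_base2(m=m)

# Reference: a fine grid of midpoints approximating U(0, 1) in each dimension.
grid = (np.arange(4096) + 0.5) / 4096
reference = np.tile(grid[:, None], (1, dims))

w_random = mean_marginal_wasserstein(pseudo_random, reference)
w_sobol = mean_marginal_wasserstein(sobol, reference)
print(f"pseudo-random: {w_random:.5f}")
print(f"Sobol:         {w_sobol:.5f}")
```

Looking only at marginals ignores how the dimensions relate to each other, so this is a first check rather than a full answer. A proper multivariate optimal transport distance would be the next step.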

Takeaways

I really enjoyed this conference. I always come away from these things feeling refreshed, in no small part thanks to spending time with people who are genuinely enthusiastic about what they do. Using libraries day to day and then hearing the people who actually built them talk about them will never not be inspiring.

© 2025 Lee Morris