A Not-So Primordial Soup

Inactive Learning?

Wed, 25 Sep 2024 12:00:00 -0700

I totally stole the title from a paper (Attenberg & Provost, 2011).

In theory, Active Learning (AL) is a tremendous idea. You need labeled data, but labeling comes at a cost, e.g., the labels have to be obtained from a domain expert. Now, let’s say your goal is to use this labeled data to train a classifier that reaches a held-out accuracy of \(90\%\). If you randomly sampled points to label, you might require \(1000\) points. Active Learning lets you strategically pick just \(500\) points for labeling and still reach the same accuracy. Half the labeling cost for the same outcome. This is great!
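To make the "strategically pick" part concrete, here is a minimal sketch of one common query strategy, uncertainty sampling, next to random sampling. Everything in it is an assumption made for illustration: the 1-D threshold "classifier", the pool of points, the true boundary at 0.37, and the budget of 10 labels are toy choices, not taken from the post or the paper. It is also a setting where AL is known to look its best, so it shows the promise rather than the real-world behavior.

```python
import random

def fit_threshold(labeled):
    """Toy classifier: midpoint between the largest known negative
    and the smallest known positive (defaults to 0.5 with no labels)."""
    negatives = [x for x, y in labeled if y == 0]
    positives = [x for x, y in labeled if y == 1]
    lo = max(negatives) if negatives else 0.0
    hi = min(positives) if positives else 1.0
    return (lo + hi) / 2.0

def accuracy(threshold, data):
    """Fraction of (x, y) pairs that the threshold classifies correctly."""
    return sum((x >= threshold) == (y == 1) for x, y in data) / len(data)

def label_and_fit(strategy, pool, oracle, budget, rng):
    """Spend `budget` label queries on the pool, then return the fitted threshold."""
    unlabeled = list(pool)
    labeled = []
    for _ in range(budget):
        boundary = fit_threshold(labeled)
        if strategy == "uncertainty":
            # Query the point the current model is least sure about:
            # the one closest to its decision boundary.
            x = min(unlabeled, key=lambda u: abs(u - boundary))
        else:
            # Baseline: query a point uniformly at random.
            x = rng.choice(unlabeled)
        unlabeled.remove(x)
        labeled.append((x, oracle(x)))  # the "domain expert" supplies the label
    return fit_threshold(labeled)

rng = random.Random(0)
TRUE_BOUNDARY = 0.37  # hypothetical ground truth, unknown to the learner

def oracle(x):
    return int(x >= TRUE_BOUNDARY)

pool = [rng.random() for _ in range(1000)]
held_out = [(x, oracle(x)) for x in (rng.random() for _ in range(2000))]

results = {}
for strategy in ("random", "uncertainty"):
    model = label_and_fit(strategy, pool, oracle, budget=10, rng=rng)
    results[strategy] = accuracy(model, held_out)
    print(strategy, round(results[strategy], 3))
```

In this toy problem uncertainty sampling behaves like a binary search for the boundary, which is why it needs so few labels. Text classifiers do not have this clean structure, so this sketch should not be read as evidence for or against AL in practice.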

Except that in a lot of real-world cases this is not how it plays out. I suspected this from my personal experiments, and then from some work we did at [24]7.ai. So we decided to thoroughly test multiple scenarios in text classification where you would believe (or current literature would lead you to believe) that Active Learning should work … but it just doesn’t. We summarized our observations in the paper “On the Fragility of Active Learners for Text Classification” (Ghose & Nguyen, 2024) [PDF], and th...

Jensen's Inequality - A Visual Intuition

Sun, 15 Sep 2024 23:44:00 -0700

Jensen’s inequality finds widespread application in mathematical proofs. I am fond of a particular intuitive explanation of it, which doesn’t seem to be very popular. I will try to present it in brief here.

I am not sure when this argument originated, but Google does turn up a paper (Needham, 1993). Even if this is not the source, it is a good reference. On a related note, the author of the paper, Tristan Needham, has the very well-reviewed book “Visual Complex Analysis” to his credit.

For the record, Wikipedia states one version in this way:

Jensen's Inequality, as on Wikipedia.

But don’t read too much into this yet; we’ll discover this form along the way. This is going to sound surprising but let’s start with the idea of the center of mass (CM).
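As a quick numerical check of where the center-of-mass idea leads, the sketch below places a few point masses on the graph of a convex function and compares the height of their CM with the height of the curve at the same horizontal position. This is my reading of the standard finite form of the inequality (for convex \(f\), the weighted average of \(f(x_i)\) is at least \(f\) of the weighted average of \(x_i\)), and it may not match the exact version shown on Wikipedia. The function, positions, and masses are arbitrary choices for illustration.

```python
import math

def center_of_mass(points, masses):
    """Center of mass of 2-D point masses: the mass-weighted mean of each coordinate."""
    total = sum(masses)
    cm_x = sum(m * px for (px, _), m in zip(points, masses)) / total
    cm_y = sum(m * py for (_, py), m in zip(points, masses)) / total
    return cm_x, cm_y

f = math.exp                      # a convex function (assumed for the demo)
xs = [-1.0, 0.0, 0.5, 2.0]        # arbitrary positions on the x-axis
masses = [1.0, 2.0, 3.0, 4.0]     # arbitrary positive masses (unnormalized weights)

# Put each mass on the curve, at (x, f(x)).
points = [(x, f(x)) for x in xs]
cm_x, cm_y = center_of_mass(points, masses)

# Jensen's inequality says the CM cannot sit below the curve:
#   cm_y = sum(w_i * f(x_i))  >=  f(sum(w_i * x_i)) = f(cm_x)
gap = cm_y - f(cm_x)
print("CM:", (cm_x, cm_y), "curve height at cm_x:", f(cm_x), "gap:", gap)
```

Swapping in a concave function such as `math.sqrt` (with non-negative positions) flips the sign of the gap, which is the concave form of the same statement.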

Center of Mass

For our purposes, we can think of th...

Bayesian Optimization, Part 2: Acquisition Functions

Sat, 18 Nov 2023 11:00:00 -0800

This post continues our discussion on BayesOpt. This is part 2 of a two-part series. Now we take a look at the other pillar BayesOpt rests on: acquisition functions. My goal is to provide a flavor by looking at a few of them. I’ll go into depth for a couple; this will help us appreciate the role of GPs in conveniently calculating acquisition values. For the rest I’ll provide an overview.

Acquisition Functions

Bayesian Optimization, Part 1: Key Ideas, Gaussian Processes

Sat, 18 Nov 2023 11:00:00 -0800

The real reason I like Bayesian Optimization: lots of pretty pictures!

If I wanted to sell you on the idea of Bayesian Optimization (BayesOpt), I’d just list some of its applications:

Hyperparameter Optimization (HPO) (Turner et al., 2021).
Neural Architecture Search (NAS) (White et al., 2021).
Molecule discovery (Gómez-Bombarelli et al., 2018).
Liquid chromatography (Boelrijk et al., 2023).
Creating low-carbon concrete (Ament et al., 2023).
Plasma control in nuclear fusion (Mehta et al., 2022).
Parameter tuning for lasers

Generative Models have been all the rage in AI lately, be it image generators like Stable Diffusion or text generators like ChatGPT. These are examples of fairly sophisticated generative systems. But whittled down to basics, they are a means to:
- (a) concisely represent patterns in data, in a way that …
- (b) they can generate later what they have “seen”.
A bit like an artist who witnesses a scene and later recreates it on canvas from memory; her memory acts as the generative model here.

In this post, I will try to illustrate this mechanism using a specific generative model: the Gaussian Mixture Model (GMM). We will use it to capture patterns in images. Pixels will be our data, and patterns are how they are “lumped” together. Of course, this lumping is what humans perceive as the image itself. Effectively then, much like our artist, we will use a generative model to “see” an image and then have it reproduce it later. Think of this as a rudimentary, mostly visual, tutorial on GMMs, where we focus on their representational capability. Or an article where I mostly ramble but touch upon GMMs, use of probabilities, all the w...
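The "see, then reproduce" loop can be sketched in a few lines. The example below is only a rough illustration under my own assumptions: the data are synthetic 1-D grayscale intensities (a dark lump and a bright lump) rather than a real image, there are two mixture components, and the fit uses a hand-written EM loop so that nothing beyond the standard library is needed. The post itself may set up the pixel data differently.

```python
import math
import random

def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_gmm_1d(data, init_means, n_iter=50):
    """Fit a 1-D Gaussian mixture with plain EM; returns (weights, means, variances)."""
    k = len(init_means)
    means = list(init_means)
    overall_mean = sum(data) / len(data)
    overall_var = sum((x - overall_mean) ** 2 for x in data) / len(data)
    variances = [overall_var] * k   # start every component as wide as the data
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: re-estimate each component from its responsibility-weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # floor to avoid a collapsed component
    return weights, means, variances

def sample_gmm(weights, means, variances, n, rng):
    """Generate n new values: pick a component by weight, then draw from its Gaussian."""
    samples = []
    for _ in range(n):
        j = rng.choices(range(len(weights)), weights=weights)[0]
        samples.append(rng.gauss(means[j], math.sqrt(variances[j])))
    return samples

rng = random.Random(0)
# Synthetic "image": 300 dark pixels around intensity 50, 700 bright ones around 200.
pixels = [rng.gauss(50, 10) for _ in range(300)] + [rng.gauss(200, 15) for _ in range(700)]

# "See" the image: compress 1000 pixels into 2 weights, 2 means, 2 variances.
weights, means, variances = fit_gmm_1d(pixels, init_means=[0.0, 255.0])
# "Reproduce" it: draw fresh pixels from the fitted model.
generated = sample_gmm(weights, means, variances, n=1000, rng=rng)
print("weights:", weights, "means:", means, "variances:", variances)
```

The point of the sketch is the compression: six numbers stand in for a thousand pixels, and sampling from them gives new pixels that are lumped in the same way as the original ones.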

Hello New Blog!

Wed, 26 Apr 2017 08:01:36 -0700

Moving to a new place can be hectic and tiresome. I am moving my blog, from here, and it’s none of those.¹ /s

I tend towards writing technical posts when I tend towards writing at all these days, and Blogger doesn’t give me the presentation options I need. So, for now, it’s GitHub Pages, but with my own domain. That way, if I decide to move again, my (almost non-existent) readers won’t be sent scrambling to find my (almost non-existent) content.

The old blog was titled “Random Thoughts”. I wanted something different and a bit more original this time, so I Googled “Not So Random Thoughts”. Obviously.

So many hits it isn’t even funny. So many that you couldn’t squint and ignore them. And that is exactly why you are stuck with “A Not So Primordial Soup”, which, by the way, does a good job of telling you that this is going to be a mixed bag of the deep and the frivolous.

Just for the record, “The Psionic Poodle” was on the list. Since that isn’t the title, joy to us, things could have been worse.

I have been asked about the domain name “quipu strands” (by the 3 and a 1/2 readers I have). These were a device used by the Incas to...