This is a live blog of Lecture 2 of my graduate class “Convex Optimization.” A Table of Contents is here.
This first technical lecture is about the Logo of convex sets, a “programming language” for generating all possible convex sets. I’ll tell you three rules that preserve convexity, meaning that if you apply the rules, you get convex sets back. From these three rules, we can construct all convex sets from exactly two convex sets: a line and a ray. What I appreciate most about Boyd and Vandenberghe’s book is this algorithmic perspective of geometry.
So let’s first start with the definition of what it means to be convex. A set of vectors is convex if it contains all line segments between all pairs of vectors. That’s it. Here is an example of a convex set and a nonconvex set from BV’s book.
Convex polygons are a great example to keep in mind as a fundamental convex set. Another important example is the ellipsoid.
We’ll generate many more as we go on.
Convexity has such a deceptively simple definition. It’s always wild in math how you can take a simple concept and turn it into an expansive research area. In the next lecture, I’ll tell you why it’s the right one as the primitive for algorithmic optimization. You will have to bear with me for some delayed gratification.
Now let me give you some simple rules for building convex sets.
RULE 1: Concatenation. (also called Cartesian product). Take two convex sets C1 and C2 in any two spaces. The concatenation is the set of all pairs from C1 and C2 stacked on top of each other. That is, you take an x from C1 and a z from C2 and return the vector [x,z]. I’ll denote the set of all such concatenations by C1xC2. If C1 and C2 are both convex, then C1xC2 is convex. This is because a line segment in C1xC2 is just a concatenation of line segments in C1 and C2.
RULE 2: Intersection. Now take C1 and C2 to be two convex sets in the same space. Then their intersection C1∩C2 is convex. The intersection is just the set of all points that are both in C1 and C2. If you have two points that are both in C1 and C2, the line segment between them is in C1, and it’s also in C2. Easy peasy so far.
RULE 3: Affine images and preimages. This last rule is the trickiest to verify. You’ll have to break out a sheet of paper to check that I’m telling the truth here. Take a convex set C, a matrix A, and a vector b. Then I can apply A and add b to every vector in C. The results set of vectors is an affine image of C. Any affine image is a convex set. I can also look at the set of all points x such that Ax + b is in C. This is the affine preimage of C. Both the affine image and preimage are convex sets.
I think the first two rules are a bit easier to understand than the last rule, so let me explain a couple of primitive affine transformations. The first one is scaling. If you multiply every element of a convex set by a positive number, you are either enlarging or shrinking the set. A second affine transformation is translation. If you add the same vector to every element in a convex set, you are sliding the set around in space. Affine transformations are a modestly more complex mixture of these two primitives.
Now, let me construct all convex sets using these three rules: a ray and a line. I’ll do this in stages. The first step is to introduce a primitive half space. Let H0 be the set of all vectors that are nonnegative in the first entry and arbitrary everywhere else. This space is the concatenation of a ray (first coordinate) and a bunch of lines (all the other coordinates). H0 is called a half space because it divides my vectors into two sets of “equal size.” Half of my vectors are positive in the first coordinate, half are negative in the first coordinate, and we’re taking the first half.
Now, I can make other half spaces. If I rotate H0, I can cut space in half along any direction. Rotation is an affine transformation, so all of these are convex sets. And I can translate a half space around. These are still half spaces. I’m going to assume my space is infinite, so the origin was an arbitrary center of its universe. This funny generative process lets us see what a half space is: take any line in space, pick a point on the line to be the origin, and then take all the vectors on one side of that origin.
Alright, with two of the three rules, we have generated all possible half-spaces. And with the final rule, intersection, we get everything. In two dimensions, if I intersect a bunch of half-spaces, I’ll get convex polygons like the one pictured above. In higher dimensions, we call these objects polyhedra. We let polyhedra have infinite extent like in this picture:
Polyhedra are the most important convex sets, and we’ll spend a lot of time on them. We can get even crazier shapes if we let ourselves take an infinite number of intersections. For instance, the circle is the intersection of all half spaces translated a unit distance from the origin:
I’ll come back to this in the next lecture, but it’s more or less the case that a set is convex if and only if it’s an intersection of half spaces (though you might need an infinite intersection).
The important takeaway from this first lecture is once we had the rules of composition, we never had to write proofs arguing about line segments. Modeling convex sets was then just a matter of seeing what you could create with the primitive operations. And the answer is much more vast than you might first imagine.
Wow, did you program in LOGO? One of my jobs, many years ago, was to be the official maintainer for the PDP-10. I still have a printout listing.
I think there's supposed to be a figure where you have "Xxx picture" (just before the sentence "Affine transformations are a modestly more complex mixture of these two primitives."). Or perhaps the "Xxx" version of affine transforms was too racy for Substack...