> Proving that something is impossible is a powerful act of mathematics. It shifts our perspective from that of rule follower to that of rule enforcer. And to enforce the rules, you must first understand them. You need to know not just how to apply them, but when they don’t apply. And you also need to be on the lookout for situations where rules might conflict with one another. Our octagon exploration exposes the interplay between polygons, convexity, right angles and angle sums. And it highlights how S = (n – 2) × 180º isn’t just a formula: It’s one condition in a world of competing conditions.

This paragraph puts to words something I’ve been mulling over for decades. Looking back to math class, or just looking at any discussion of math education, there’s always a discussion about students who just memorize and plug in the formulas, mentioned as the kinds of students who don’t understand what they’re doing. But that’s an incomplete explanation of what those students are missing that I believe the quoted paragraph completes.

It’s not about knowing how to apply a formula. It’s about understanding a formula as a property of the system you’re working with. It’s a great insight and one I’m probably going to mull over for a long while.

There's a story about when Gauss was a kid - a teacher asked the class to add up all the numbers between 1 and 100. Most kids licked their pencils and started working out the sum the hard way. Gauss immediately intuited that 0 could be paired with 100, 1 could be paired with 99, 2 could be paired with 98... and the total was obvious.

That kind of Gauss-level insight doesn't seem to be about properties so much as about understanding the relationships that define a domain - and being able to view problems in a kind of conceptual relationship space which can generate insights that massively simplify simple problems, and make harder problems tractable at all.

I remember school math was mostly rote learning, but some problems required that creative insight. Academic math seems to lean more heavily on creative insights.

Software dev is similar. You can always do things the hard way, but it takes a certain kind of insight to see time-saving shortcuts and simplifications that cut right to the heart of a problem. I suspect some people have a natural talent for that insight, while others are more likely to take the long way to a solution.

Using an algorithm discovered in 2019 (cf. Harvey and van der Hoeven), it's possible to multiply in O(log n * log log n), if you have a lot of time. This is presumably optimal.

Sure, or you just define the input (the n) to be the number of digits in (the logarithm of) the number you’re summing up to. That’s generally how people talk about big O notation when it comes to basic arithmetic operations, mostly because we’re usually talking about computers with constant-time instructions for basic arithmetic operations (granted, only up to the maximum integer or float supported by the software).

One level of mathematical understanding further would be to understand which properties (or even concepts) are the necessary consequences of other properties.

To given an example: the Pythagorean theorem, the parallelogram law, angles, transposes etc. are all properties/concepts a space has because it has an inner product.

> To given an example: the Pythagorean theorem, the parallelogram law, angles, transposes etc. are all properties/concepts a space has because it has an inner product.

Could you elaborate a bit? This sounds really interesting.

Well in short those properties all have (generalized) definitions that rely on the inner product one way or another. Denoting the inner product as <x,y> you get:

Angles: cos(angle from x to y) = <x,y> / sqrt(<x,x><y,y>).

Transpose: x^T is the dual vector defined by: x^T(y) = <x,y>.

the last one is especially subtle because most people think you can get the transpose by just flipping the vector, which is technically true but means you're basing your results on a possibly arbitrary choice of basis.

As an example of why knowing these relations is useful, imagine you want to measure angles on a map, then you might notice that there's no easily identifiable inner product, and indeed just measuring angles in an arbitrary projection will give different results depending on which projection you happen to be using. The trick is to go back to 3D space and measure the angles there (relying on the canonical inner product of 3D space) or to use a projection that preserves those angles like Mercator.

As another example conjugate gradient descent works better by using directions that are orthogonal through a more natural inner product rather than ones that just happen to be orthogonal because of your choice of basis.

The illustration at the top where sticks are used immediately made me want to provide the "outside the box" solution of arranging the sticks in 3D. But I guess an octagon is pretty clearly defined as a planar object.

Something the article didn't touch on but that I think is fundamental when trying to construct proofs: Often, you can get further in a proof by trying to prove the opposite. Assume that what you're trying to prove is impossible and try to construct a counterexample to the original proof. You will reach a point where the construction can't go further - examine closely what invariant prevents you from completing the counterexample and you will happen across something that will advance your actual proof. Note that this need not result in a proof by contradiction - it's usually more about illuminating the way forward.

In a sense, the octagon problem is teaching exactly this step. That is, the "real" problem here (and the proof that the students should arrive at) is "prove that you cannot construct a convex octagon with 4 right angles". How do you do that? You assume that it's wrong and try to find a counterexample to the original proposition - a convex octagon with 4 right angles. The ensuing "productive struggle" leads exactly to the insights you need to prove the original proposition, as detailed in the article.

I spent most of the article thinking about non-Euclidean solutions to the octagon problem; however, non-Euclidian geometry is clearly not something one would introduce to a group of students who did not already know four right angles cannot be in a convex Euclidean octagon.

In a non-Euclidean space with sufficient negative (hyperbolic) curvature it should indeed be possible to have a convex octagon with four 90° internal angles. Just like in a space with positive curvature, such as the surface of a sphere, it is possible for a triangle to have three 90° internal angles.

If you are willing to start stretching space then we can throw all the polygon definitions out the window. With a strong enough curve in space a sphere can start having corners.

No, polygons in non-Euclidean space makes sense. Polygons and more generally, geometric simplicial complexes in other spaces is a branch of mathematics all by itself (geometric group theory), so you certainly shouldn't "throw it out the window".

Proof by contradiction was a thought that went thru my mind as well when reading this article. Not sure if high school teaches that stream of thought, can't recall using such methodology in school.

Proof by contradiction is how we were shown that sqrt(2) cannot be a rational number. I'm not from the US though (and I don't even know how this generalizes within my country).

to be fair, that would have been valid. The picture is a 2D projection of the sticks, construct it in 3D and look straight down: you have understood something much deeper than just following what seem to be the rules without understanding. Which is exactly what the article is about still, so that's pretty on point.

In the theme of this excellent article: solving proof-based problems takes skill, but designing good problems in the first place is even harder. I wonder to what extent solving such problems automatically makes you better at formulating similar problems. Reminds me of how being able to understand a spoken language doesn't help with speaking as much as I expected it would (for me, anyway).

It does make you much better at formulating problems. In fact, from my experience as a mathematician and watching others do math, the vast majority of good problems comes after someone has solved a problem and then asked, "what if we change this instead?"

It's the same whether you draw it on the inside or the outside. Imagine the sphere is see-through, draw it on the inside then go out and look on the outside. The definition of "convex", "octagon", "right angle" or "straight line" doesn't change when you move from looking at the interior surface to the exterior one.

Thanks both. I still can’t visualize that octagon, but I now think you need saddle points to draw it, so a sphere wouldn’t do. If that’s true, a torus would do.

Flexing the sides requires negative curvature or non-convexity. Triangles are easy to draw because they have fewer sides than a square, thus positive curvature makes it possible to create a 3-right triangle. Octagons are the opposite.

The coin toss example given here seems tautological. Why would my unfair coin have to abide by some norm of probability? If I were designing such a coin, it would obviously include a mechanism for tracking state such that if the last toss was heads the next would always be tails. For example, it could literally redraw the upwards-facing side to the desired outcome an instant before striking the table.

I've never had a great head for mathematics because it feels like most impossibilities arise from arbitrary constraints and definitions. Reading such proofs feels like:
1) I've created an arbitrary universe with made up rules
2) This condition does/doesn't break one of those rules I made up
3) QED

For example, why on earth can't a straight line in a rectangle have a 180 degree angle at its center making that rectangle an octagon too? It's not clear why one system is epistemologically valid and the other is not. The proof of impossibility feels like a "no true scotsman"/"no true octagon" fallacy disguised by a bit of added notation.

> The coin toss example given here seems tautological. Why would my unfair coin have to abide by some norm of probability? If I were designing such a coin, it would obviously include a mechanism for tracking state such that if the last toss was heads the next would always be tails.

If a stateful coin is permitted, the problem is indeed trivial; but if we start with the assumption that the problem isn't trivial, we can assume a "passive" coin more in line with the intuitions most people would have about coins.

The unfair coin that can't be more likely to give two different results than two the same assumes that each flip has the same probabilities (so k doesn't depend on how many times you flipped it or what you got). Which is fair, but maybe a coin could be built that tends to toggle between outcomes. If it was half full of honey, and you flipped it over once in your hand after catching to read it, it might be more likely to give the other result the next time.

Anyway, this is a bit outside the box; the point was that within a set of rules there was a problem that was impossible to solve.

I’ll share one I’ve been pondering for 30 years since my geometry teacher told me it’s impossible: It’s not possible to trisect an arbitrary angle using only a straightedge and compass.

The reason basically comes down to the fact that every time you perform a compass and straight-edge construction, you are constructing quadratic extensions of number fields, whereas trisection (in general) needs a cubic extension.

The 'in general' part is important: of course, you can trisect specific angles like a 90 degree angle.

(The construction of number field extensions is happening because let's say you construct a 90 degree angle, well that allows you to construct an isosceles triangle with equal sides length 1, and therefore you construct the square root of 2, so the extension Q(sqrt(2))

Absolutely true, there is a lot to prove there. It's not an elementary explanation by any means, rather it is an outline that someone who knows Galois theory well could probably reconstruct.

The reason is because you want to be able to usefully talk about 'true vertices'. Another way to define a polygon is any convex hull of finitely many points in Euclidean 2-space with nonzero area. Then the 'real' vertices are those points on the boundary curve that are nondifferentiable ('pointy' or 'sharp').

Those are the special points you are interested in when you want to distinguish between vertices and other points on the boundary.

Conventions are not set in stone. In a math paper, you could say you allow for this case, if it makes sense. You could even allow for two of the vertices to be the same.

It takes less effort to define it without that restriction, then an octagon is just eight vertices with lines between them.

You can add various restrictions such as not allowing it to intersect itself or requiring that it is convex, but not allowing 3 successive vertices to be colinear doesn't really give you any useful properties so I'm not sure why you'd want that.

Because it has no net surface area? It is a 1-dimensional line/curve rather than a 2d surface? I thinks such definitions are a little more substantive than mere language conventions.

## 50 Comments:

> Proving that something is impossible is a powerful act of mathematics. It shifts our perspective from that of rule follower to that of rule enforcer. And to enforce the rules, you must first understand them. You need to know not just how to apply them, but when they don’t apply. And you also need to be on the lookout for situations where rules might conflict with one another. Our octagon exploration exposes the interplay between polygons, convexity, right angles and angle sums. And it highlights how S = (n – 2) × 180º isn’t just a formula: It’s one condition in a world of competing conditions.

This paragraph puts to words something I’ve been mulling over for decades. Looking back to math class, or just looking at any discussion of math education, there’s always a discussion about students who just memorize and plug in the formulas, mentioned as the kinds of students who don’t understand what they’re doing. But that’s an incomplete explanation of what those students are missing that I believe the quoted paragraph completes.

It’s not about knowing how to apply a formula. It’s about understanding a formula as a

propertyof the system you’re working with. It’s a great insight and one I’m probably going to mull over for a long while.There's a story about when Gauss was a kid - a teacher asked the class to add up all the numbers between 1 and 100. Most kids licked their pencils and started working out the sum the hard way. Gauss immediately intuited that 0 could be paired with 100, 1 could be paired with 99, 2 could be paired with 98... and the total was obvious.

That kind of Gauss-level insight doesn't seem to be about properties so much as about

understanding the relationshipsthat define a domain - and being able to view problems in a kind of conceptual relationship space which can generate insights that massively simplify simple problems, and make harder problems tractable at all.I remember school math was mostly rote learning, but some problems required that creative insight. Academic math seems to lean more heavily on creative insights.

Software dev is similar. You can always do things the hard way, but it takes a certain kind of insight to see time-saving shortcuts and simplifications that cut right to the heart of a problem. I suspect some people have a natural talent for that insight, while others are more likely to take the long way to a solution.

Gauss turned a O(n^2) problem into O(1)

The naive version is just O(n), right?

Addition is O(log n), so the naive version is O(n log n).

Gauss would have used a pre-[1960](https://en.wikipedia.org/wiki/Karatsuba_algorithm) multiplication algorithm, with complexity at best O((log n) ^ 2).

Using an algorithm discovered in 2019 (cf. Harvey and van der Hoeven), it's possible to multiply in O(log n * log log n), if you have a lot of time. This is presumably optimal.

Sure, or you just define the input (the n) to be the number of digits in (the logarithm of) the number you’re summing up to. That’s generally how people talk about big O notation when it comes to basic arithmetic operations, mostly because we’re usually talking about computers with constant-time instructions for basic arithmetic operations (granted, only up to the maximum integer or float supported by the software).

"Young man, in mathematics you don't understand things. You just get used to them." John Von Neumann to Felix Smith

One level of mathematical understanding further would be to understand which properties (or even concepts) are the necessary consequences of other properties.

To given an example: the Pythagorean theorem, the parallelogram law, angles, transposes etc. are all properties/concepts a space has

becauseit has an inner product.> To given an example: the Pythagorean theorem, the parallelogram law, angles, transposes etc. are all properties/concepts a space has because it has an inner product.

Could you elaborate a bit? This sounds really interesting.

Well in short those properties all have (generalized) definitions that rely on the inner product one way or another. Denoting the inner product as <x,y> you get:

Pythagoras: <x,y> = 0 implies <x+y, x+y> = <x,x> + <y,y>.

Paralellogram: 2(<x,x> + <y,y>) = <x+y, x+y> - <x-y, x-y>

Angles: cos(angle from x to y) = <x,y> / sqrt(<x,x><y,y>).

Transpose: x^T is the dual vector defined by: x^T(y) = <x,y>.

the last one is especially subtle because most people think you can get the transpose by just flipping the vector, which is technically true but means you're basing your results on a possibly arbitrary choice of basis.

As an example of why knowing these relations is useful, imagine you want to measure angles on a map, then you might notice that there's no easily identifiable inner product, and indeed just measuring angles in an arbitrary projection will give different results depending on which projection you happen to be using. The trick is to go back to 3D space and measure the angles there (relying on the canonical inner product of 3D space) or to use a projection that preserves those angles like Mercator.

As another example conjugate gradient descent works better by using directions that are orthogonal through a more natural inner product rather than ones that just happen to be orthogonal because of your choice of basis.

The axioms for a vector space and an inner product are exactly what you need to derive a full formal proof of all those.

The illustration at the top where sticks are used immediately made me want to provide the "outside the box" solution of arranging the sticks in 3D. But I guess an octagon is pretty clearly defined as a planar object.

Something the article didn't touch on but that I think is fundamental when trying to construct proofs: Often, you can get further in a proof by trying to prove the opposite. Assume that what you're trying to prove is impossible and try to construct a counterexample to the original proof. You will reach a point where the construction can't go further - examine closely what invariant prevents you from completing the counterexample and you will happen across something that will advance your actual proof. Note that this need not result in a proof by contradiction - it's usually more about illuminating the way forward.

In a sense, the octagon problem is teaching exactly this step. That is, the "real" problem here (and the proof that the students should arrive at) is "prove that you cannot construct a convex octagon with 4 right angles". How do you do that? You assume that it's wrong and try to find a counterexample to the original proposition - a convex octagon with 4 right angles. The ensuing "productive struggle" leads exactly to the insights you need to prove the original proposition, as detailed in the article.

I spent most of the article thinking about non-Euclidean solutions to the octagon problem; however, non-Euclidian geometry is clearly not something one would introduce to a group of students who did not

alreadyknow four right angles cannot be in a convex Euclidean octagon.In a non-Euclidean space with sufficient negative (hyperbolic) curvature it should indeed be possible to have a convex octagon with four 90° internal angles. Just like in a space with positive curvature, such as the surface of a sphere, it is possible for a triangle to have three 90° internal angles.

If you are willing to start stretching space then we can throw all the polygon definitions out the window. With a strong enough curve in space a sphere can start having corners.

No, polygons in non-Euclidean space makes sense. Polygons and more generally, geometric simplicial complexes in other spaces is a branch of mathematics all by itself (geometric group theory), so you certainly shouldn't "throw it out the window".

That can get arbitrarily close to true, but I think it is false

Can it ?

Proof by contradiction was a thought that went thru my mind as well when reading this article. Not sure if high school teaches that stream of thought, can't recall using such methodology in school.

Proof by contradiction? I remember doing some in high school in higher level math (Canada, of course high schools vary)

I learned this in the US, in high school geometry (Sophomore year).

It was also called indirect proofs by textbook, and it’s when I first fell in love with math.

Note: Linear algebra killed said love of math for me ever so slightly, when the instructor chose a terrible textbook.

Proof by contradiction is how we were shown that sqrt(2) cannot be a rational number. I'm not from the US though (and I don't even know how this generalizes within my country).

Am in US and can confirm that sqrt(2) is irrational here also.

We definitely cover it in the Indian (CBSE) syllabus.

to be fair, that would have been valid. The picture is a 2D projection of the sticks, construct it in 3D and look straight down: you have understood something much deeper than just following what seem to be the rules without

understanding. Which is exactly what the article is about still, so that's pretty on point.In the theme of this excellent article: solving proof-based problems takes skill, but designing good problems in the first place is even harder. I wonder to what extent solving such problems automatically makes you better at formulating similar problems. Reminds me of how being able to understand a spoken language doesn't help with speaking as much as I expected it would (for me, anyway).

It does make you much better at formulating problems. In fact, from my experience as a mathematician and watching others do math, the vast majority of good problems comes after someone has solved a problem and then asked, "what if we change this instead?"

One needs only a tiny bit of space curvature to get a space where a convex octagon with four right angles exists.

One could draw one on the surface of a sphere, for example, just as one can draw a triangle with three right angles (https://www.quora.com/How-can-I-draw-a-triangle-with-three-9...)

Wouldn't you need to draw the convex octagon on the inside surface of the sphere?

It's the same whether you draw it on the inside or the outside. Imagine the sphere is see-through, draw it on the inside then go out and look on the outside. The definition of "convex", "octagon", "right angle" or "straight line" doesn't change when you move from looking at the interior surface to the exterior one.

Thanks both. I still can’t visualize that octagon, but I now think you need saddle points to draw it, so a sphere wouldn’t do. If that’s true, a torus would do.

I'm pretty certain it's possible on the sphere, but I'm currently failing to demonstrate that with a pencil and a table tennis ball.

Seems it should be possible to draw the octagon that's shown in the article: draw a rectangle and flex the sides out a little bit.

I'm not convinced.

Flexing the sides requires negative curvature or non-convexity. Triangles are easy to draw because they have fewer sides than a square, thus positive curvature makes it possible to create a 3-right triangle. Octagons are the opposite.

The coin toss example given here seems tautological. Why would my unfair coin have to abide by some norm of probability? If I were designing such a coin, it would obviously include a mechanism for tracking state such that if the last toss was heads the next would always be tails. For example, it could literally redraw the upwards-facing side to the desired outcome an instant before striking the table.

I've never had a great head for mathematics because it feels like most impossibilities arise from arbitrary constraints and definitions. Reading such proofs feels like: 1) I've created an arbitrary universe with made up rules 2) This condition does/doesn't break one of those rules I made up 3) QED

For example, why on earth can't a straight line in a rectangle have a 180 degree angle at its center making that rectangle an octagon too? It's not clear why one system is epistemologically valid and the other is not. The proof of impossibility feels like a "no true scotsman"/"no true octagon" fallacy disguised by a bit of added notation.

> The coin toss example given here seems tautological. Why would my unfair coin have to abide by some norm of probability? If I were designing such a coin, it would obviously include a mechanism for tracking state such that if the last toss was heads the next would always be tails.

If a stateful coin is permitted, the problem is indeed trivial; but if we start with the assumption that the problem isn't trivial, we can assume a "passive" coin more in line with the intuitions most people would have about coins.

The unfair coin that can't be more likely to give two different results than two the same assumes that each flip has the same probabilities (so k doesn't depend on how many times you flipped it or what you got). Which is fair, but maybe a coin could be built that tends to toggle between outcomes. If it was half full of honey, and you flipped it over once in your hand after catching to read it, it might be more likely to give the other result the next time.

Anyway, this is a bit outside the box; the point was that within a set of rules there was a problem that was impossible to solve.

I’ll share one I’ve been pondering for 30 years since my geometry teacher told me it’s impossible: It’s not possible to trisect an arbitrary angle using only a straightedge and compass.

The reason basically comes down to the fact that every time you perform a compass and straight-edge construction, you are constructing quadratic extensions of number fields, whereas trisection (in general) needs a cubic extension.

The 'in general' part is important: of course, you can trisect specific angles like a 90 degree angle.

(The construction of number field extensions is happening because let's say you construct a 90 degree angle, well that allows you to construct an isosceles triangle with equal sides length 1, and therefore you construct the square root of 2, so the extension Q(sqrt(2))

You need to be careful how you phrase it though - it's possible with a _marked_ edge https://www.geogebra.org/m/cWfHr7pk

That's a long way from being a convincing argument that compass and straight-edge can

onlyproduce quadratic extensions to number fields, though.(I know it's true, just saying it looks like you've offered an explanation in elementary terms, but I don't think one exists).

Absolutely true, there is a lot to prove there. It's not an elementary explanation by any means, rather it is an outline that someone who knows Galois theory well could probably reconstruct.

My problem with the example is that it relies on (and illustrates) a convention that seems arbitrary.

Why is a polygon with three consecutive collinear vertices not a polygon? Convention. Does anyone here have insight into why it's a good convention?

To me it sounds something like saying "9/12 is not a valid fraction because it can be simplified".

The reason is because you want to be able to usefully talk about 'true vertices'. Another way to define a polygon is any convex hull of finitely many points in Euclidean 2-space with nonzero area. Then the 'real' vertices are those points on the boundary curve that are nondifferentiable ('pointy' or 'sharp').

Those are the special points you are interested in when you want to distinguish between vertices and other points on the boundary.

Conventions are not set in stone. In a math paper, you could say you allow for this case, if it makes sense. You could even allow for two of the vertices to be the same.

How do you define vertex in a way such that this would make sense? A square is a square, not an octagon with invisible vertices.

It takes less effort to define it without that restriction, then an octagon is just eight vertices with lines between them.

You can add various restrictions such as not allowing it to intersect itself or requiring that it is convex, but not allowing 3 successive vertices to be colinear doesn't really give you any useful properties so I'm not sure why you'd want that.

According to wikipedia, a polygon with three adjacent collinear points

can beconvex.https://en.wikipedia.org/wiki/Convex_polygon

Because it has no net surface area? It is a 1-dimensional line/curve rather than a 2d surface? I thinks such definitions are a little more substantive than mere language conventions.

The parent comment meant three consecutive collinear vertices

among other vertices, not a zero-area “flat triangle”.