I’ve been noticing for a while that certain kinds of mathematical operations are consistently harder to grasp than others, and I’ve been wondering what’s going on. Obviously, some kinds of operations are just more complicated and abstract than others, which is why we learn addition and subtraction before we learn derivatives and integrals, but that’s not really what I mean. The operations I’m thinking of all have something in common, namely, they’re the functions we usually label as “inverse functions.” In other words, they are the operations we usually think of as “undoing” other operations. For example, subtraction “undoes” addition, division “undoes” multiplication, square roots “undo” squares, and so on. So what I’m really wondering is, why is it so much harder for students to grasp the idea of taking square roots than the idea of squaring?
I don’t have an answer to any of this, so I’m just going to throw out some ideas. They pretty much all boil down to “inverse functions are cognitively more difficult to deal with than ‘regular’ functions,” but that’s essentially true by definition, since all mathematical difficulties are, by definition, cognitive, so I want to be more precise. What exactly causes the cognitive difficulties? Is the problem mathematical? Is it linguistic? Is it some sort of near-universal failure in pedagogy?
[There’s a lot of math in this post, but you should be able to get the point if you’re enough of a “math person” to understand addition and subtraction. You can ignore anything that doesn’t make sense.]
There’s a little bit of linguistic sleight of hand in what I wrote above, because inverses really come in pairs — addition “undoes” subtraction every bit as much as subtraction “undoes” addition. 10-4+4 gets you right back to 10 (where you started) just as well as 10+4-4 does. So really, we should say that addition and subtraction are inverses of each other. But for the most part, we don’t. We say that subtraction is the inverse of addition, square roots are the inverse of squares, arcsine (or inverse sine) is the inverse of sine, and so on. All of which suggests that there is something funky going on here.
Even though undoing functions is a symmetric operation, mathematically, there is an asymmetry between the operations we label as “normal” (or “forward” or “basic”) and the operations we label as “inverse.” It’s the inverse operations that are troublemakers. It’s the inverse operations that, almost inevitably, require us to either restrict the kinds of numbers (or functions) we can put into the operation or else expand the kinds of answers we’ll accept.
I can add any two natural numbers (=regular counting numbers) or zero and get another natural number out, but if I want to do subtraction, I either have to make sure that I’m always subtracting smaller numbers from bigger numbers, or else invent negative numbers. I can multiply any two whole numbers, but to do division, I either have to restrict the divisor to perfect factors or else introduce rational numbers (=fractions). Even then, I can’t divide by zero without doing some really major contortions to make infinity a sort of honorary number, contortions that most people only ever see if they study enough math to take complex analysis in college or grad school. I can square (or cube or…) any number, but I can only take the square root (or cube root or…) of a perfect square (or…) unless I invent irrational numbers. I still have to restrict my answers to positive numbers (because there are two possible answers), and I can’t take the square root of a negative number without inventing complex numbers. By the same token, I can’t take logarithms of negative numbers without using complex numbers, and I have to restrict the answers I get from inverse trig functions so I don’t get more than one answer.
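Here is a rough Python sketch of a few of these restrictions, using only the standard library. The helper name `real_sqrt_or_none` is made up for illustration, and the particular numbers are arbitrary.

```python
import cmath
import math
from fractions import Fraction


def real_sqrt_or_none(x):
    """Return the real square root of x, or None when x has no real root."""
    try:
        return math.sqrt(x)
    except ValueError:
        # math.sqrt refuses negative inputs: the "restrict the inputs" option.
        return None


# Subtraction: two natural numbers can give a negative answer.
print(3 - 5)

# Division: two whole numbers can give a fraction.
print(Fraction(3, 5))

# Square roots of negatives: no real answer, but a complex one exists
# (the "expand the kinds of answers" option).
print(real_sqrt_or_none(-4))
print(cmath.sqrt(-4))

# Inverse trig: asin hands back one chosen answer, so sin followed by
# asin does not always land back on the starting value.
print(math.asin(math.sin(3.0)))
```

The last line prints roughly 0.1416 (that is, pi minus 3) rather than 3, because `math.asin` only returns answers between -pi/2 and pi/2.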
By the way, the situation isn’t quite so clear when it comes to integrals. In theory, there are actually more functions that can be integrated than differentiated. To be differentiable, a function needs to be smooth so that it has a unique slope at each point. On the other hand, to find the area under a curve (the original, and still most basic, purpose of integration), you only need a continuous function, and actually, you can relax that requirement quite a lot if you work at it (by integrating piecewise, or by doing something more complex like a Lebesgue integral). In practice, though, the only kinds of integrals that are actually realistic to compute (at least from the perspective of a high school or college calc student) are the ones that can be done as antiderivatives, i.e. exactly the ones that can be thought of as inverses of derivatives. From that perspective, they really are more troublesome than derivatives. You can take the derivative of pretty much any cooperative function and the computation is more or less rote, but there are some really nice functions you can’t integrate easily at all (e^(x^2) comes to mind), and integrating even some fairly simple functions requires a lot of ingenuity.
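A small Python sketch of that contrast, reading the example function as e raised to the power x squared. The derivative falls out of the chain rule, while the area under the curve has to be approximated numerically, here with a plain midpoint Riemann sum. The function names and the interval from 0 to 1 are choices made for illustration.

```python
import math


def f(x):
    """The example function e^(x^2)."""
    return math.exp(x * x)


def f_prime(x):
    """Derivative of f, a rote application of the chain rule."""
    return 2 * x * math.exp(x * x)


def midpoint_integral(func, a, b, n=10_000):
    """Approximate the area under func from a to b with n midpoint rectangles."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width


print(f_prime(1.0))                     # slope at x = 1
print(midpoint_integral(f, 0.0, 1.0))   # area from 0 to 1, roughly 1.4627
```

The Riemann sum treats the integral as an area in its own right and never looks for an antiderivative, which is exactly why it still works here.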
So there is a genuine mathematical difference between forward and inverse functions. It’s not really fair to say that the math is any harder. Math is just math and really doesn’t care. That said, it’s possible that all these domain and range complications create extra cognitive confusion. There are more things to think about. When you multiply numbers, you just multiply. When you divide, you either have to think about whether your choice of numbers will ‘work’ or else worry about more complicated kinds of numbers.
I hinted at another mathematical difference above, and I suspect it’s the more crucial one. Inverse functions are nearly always more complicated to compute. With the exception of subtraction, which really is just about as rote as addition, inverse operations take a certain amount of ingenuity, or guessing and checking, to carry out. Even division requires guessing how many times the divisor goes into the dividend and then multiplying to make sure you’re right. Squaring a number is easy, but taking square roots is a pain in the neck. (I once learned how to do this by hand, but couldn’t for the life of me tell you how.) I also have no idea how to take a logarithm by hand, unless I can break it down into logarithms I already know. Same goes for inverse trig functions. And as I said above, integration requires a lot of creativity.
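The guess-and-check flavor is easy to see in code. Below is a minimal Python sketch that finds a square root by bisection: it keeps guessing, squares each guess (the easy forward operation), and narrows the range. This is just one possible method, not the by-hand algorithm mentioned above, and the names `square` and `sqrt_by_guessing` are invented for the example.

```python
def square(x):
    """The forward operation: one multiplication, no guessing."""
    return x * x


def sqrt_by_guessing(target, tolerance=1e-10):
    """Approximate the square root of a non-negative number by bisection."""
    if target < 0:
        raise ValueError("no real square root for a negative number")
    # The root lies between 0 and the larger of 1 and the target.
    low, high = 0.0, max(1.0, float(target))
    for _ in range(200):  # each pass halves the interval; 200 is a safety cap
        if high - low <= tolerance:
            break
        guess = (low + high) / 2
        if square(guess) > target:
            high = guess  # guess was too big
        else:
            low = guess   # guess was too small (or exactly right)
    return (low + high) / 2


print(square(1.5))
print(sqrt_by_guessing(2))
```

Squaring takes a single step, while the inverse needs a few dozen rounds of guessing and checking to get ten decimal places.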
A third difference between forward and inverse functions is linguistic. We often speak of inverse functions as undoing forward functions, and this can create some really nasty linguistic contortions. I can explain the concept of a square root by drawing a square with, say, an area of four and asking how long each edge is, which isn’t so bad. But the only way to actually talk about the computational aspect (possibly because of the complexities pointed out above) is to ask “how long would each edge have to be to make a square of area four?” which I suspect is harder to process than “what’s the area of this square whose edges are each length 2?” The problems get much more complicated when you get to higher powers where you can’t draw pictures easily. It’s pretty easy to break 2^4=16 down into easier operations. I can ask, “what’s 2 times 2 times 2 times 2?” Or maybe, “what’s 2 times itself four times?” On the other hand, the only way I have to ask about the fourth root of 16 is to say “what number would you have to multiply by itself four times to get 16?” That is NOT an easy question to parse.
Just from looking at the surface, it’s impossible to say whether the unfortunate linguistic problems are themselves the source of the cognitive processing difficulties (keep in mind that any linguistic processing problem is cognitive) or whether our brains simply have no other way to think about these mathematical concepts, and so we’re stuck with linguistic contortions that reflect how we think about inverse functions. I suspect it’s the latter, but I can’t be sure. One way to tell would be to do some cross-linguistic studies to see if talking about inverse functions is so contorted in every language, but even this would only be useful if we found positive evidence (languages in which inverse functions are easier to talk about). Since most mathematical concepts get invented only once or twice and then disseminated to other people/cultures, it’s entirely possible that the awkward language just got borrowed wholesale from the language of the original inventor.
The last possible explanation I can think of is pedagogical: we just don’t do as good a job teaching inverse functions as we do normal functions. Given the range of different pedagogies, the consistency with which people have trouble with inverse functions, and the fact that these various inverse functions don’t have much in common except that they are all inverses, I find it hard to believe that the trouble with inverse functions is only a matter of some systematic pedagogical problem, though I suppose it’s possible. I also can’t imagine what those problems could be. Some students have little trouble with inverse functions, but they seem to be the same people who get everything else just fine, either because math comes naturally to them or because they had good teachers. Most of my Montessori classmates seemed to get the idea of square roots without much trouble, but then, they got squaring just fine, too.
That said, I’ve noticed some places where the typical pedagogy for inverse functions probably exacerbates the problems with learning them. At the very least, it would be nice if we didn’t teach inverse functions as an afterthought, which I’ve noticed happens a lot. This seems especially common when it comes to teaching inverse trig functions, perhaps because there aren’t very many interesting reasons for a trig student to use an inverse trig function except to undo a trig function. Even then, we could be focusing on why you might want to undo a trig function. I suspect this oversight comes from the bias in our traditional math curriculum towards computation. You can spend months and months and months learning various ways of computing trig functions, moving them around using identities, etc, etc. On the other hand, there isn’t much to be said about computing inverse trig functions except, “plug it into your calculator and see what comes out.” The idea of inverting trig functions is still important, but the constant bias towards calculation means that students (my students at least) come away with the idea that inverse trig functions are either pointless or difficult and mysterious.
One thing that seems to be lacking for most students is any overarching awareness of the entire idea of inverse functions. I don’t think I knew what was meant by “inverse function” until sometime in college. Even then it came with all this mumbo jumbo about injective and surjective functions that obscured the basic idea that functions (usually) come in pairs and by doing one and then doing the other, you get back to where you started. (Now, I appreciate the importance of all the injective and surjective stuff that takes care of all those problems with what kinds of numbers you can put into or get out of an operation, but I don’t think it’s that important the first time around. *Ducks in preparation for mathematicians to lob insults at her*) Having that overarching picture of how functions pair up gives students a conceptual slot in which to put inverse functions, and that makes them a lot easier to think about. This is usually treated as an advanced concept, but I don’t think the idea is conceptually all that difficult. It’s an underlying organizing principle of mathematics, and I think it’s one that 7 or 8 year olds could grasp. So on the one hand, we should be paying more attention to inverse functions as inverses.
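The “do one, then the other, and you’re back where you started” picture can be sketched in a few lines of Python. The pairings and the `round_trip` helper below are illustrative choices, and the comparison uses `math.isclose` because floating-point arithmetic is only approximate.

```python
import math

# Each entry pairs a forward function with the function that undoes it.
PAIRS = {
    "add 4, then subtract 4": (lambda x: x + 4, lambda x: x - 4),
    "times 3, then divide by 3": (lambda x: x * 3, lambda x: x / 3),
    "exp, then log": (math.exp, math.log),
    "square, then square root": (lambda x: x * x, math.sqrt),
}


def round_trip(forward, inverse, start):
    """Apply the forward function, then the inverse, and return the result."""
    return inverse(forward(start))


for name, (forward, inverse) in PAIRS.items():
    result = round_trip(forward, inverse, 10.0)
    print(name, "->", math.isclose(result, 10.0))
```

Starting from 10, every pair returns to 10. Starting the last pair from -3 gives 3 instead, which is where the injective and surjective fine print starts to matter.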
At the same time, I think it’s important to let these inverse functions stand on their own. We seem to do this really well with the most basic functions: subtraction and division, but totally lose track of it with more complicated functions. Logarithms and integrals are useful and interesting in their own right, not just as ways to undo exponentials and derivatives. Logarithms model all sorts of interesting behavior in the real world. Integrals (which, by the way, were invented before derivatives) let you do some pretty amazing geometry, and have some really intuitive visual interpretations. That said, I rarely see my students learn about integrals as integrals. Even when they do learn about integrals (as Riemann sums), this pretty much always comes after antiderivatives, which biases the whole discussion towards thinking of integrals as just undoing derivatives, and is usually presented as an afterthought, before immediately returning to the nitty gritty details of computing antiderivatives. I suspect this tendency also comes from the bias towards computation. There aren’t easy ways to compute logarithms or integrals except by undoing exponentials and derivatives, so if you’re concerned about teaching computation rather than concepts, you pretty much have to focus on the “undoing” aspect.
I’m not an expert on any of this stuff. I’ve studied math and linguistics, but everything I know about cognitive science I’ve picked up through linguistics or reading about child psychology, so I’m just brainstorming here. If you have any thoughts, or any references, please share!