I’ve been noticing for a while that certain kinds of mathematical operations are consistently harder to grasp than others, and I’ve been wondering what’s going on. Obviously, some kinds of operations are just more complicated and abstract than others, which is why we learn addition and subtraction before we learn derivatives and integrals, but that’s not really what I mean. The operations I’m thinking of all have something in common, namely, they’re the functions we usually label as “inverse functions.” In other words, they are the operations we usually think of as “undoing” other operations. For example, subtraction “undoes” addition, division “undoes” multiplication, square roots “undo” squares, and so on. So what I’m really wondering is, why is it so much harder for students to grasp the idea of taking square roots than the idea of squaring?

I don’t have an answer to any of this, so I’m just going to throw out some ideas. They pretty much all boil down to “inverse functions are cognitively more difficult to deal with than ‘regular’ functions,” but that’s essentially a tautology, since all mathematical difficulties are, by definition, cognitive, so I want to be more precise. What exactly causes the cognitive difficulties? Is the problem mathematical? Is it linguistic? Is it some sort of near-universal failure in pedagogy?

[There’s a lot of math in this post, but you should be able to get the point if you’re enough of a “math person” to understand addition and subtraction. You can ignore anything that doesn’t make sense.]

There’s a little bit of linguistic sleight of hand in what I wrote above, because inverses really come in pairs — addition “undoes” subtraction every bit as much as subtraction “undoes” addition. 10-4+4 gets you right back to 10 (where you started) just as well as 10+4-4 does. So really, we should say that addition and subtraction are inverses of each other. But for the most part, we don’t. We say that subtraction is the inverse of addition, square roots are the inverse of squares, arcsine (or *inverse* sine) is the inverse of sine, and so on. All of which suggests that there is something funky going on here.

Even though undoing functions is a symmetric operation, mathematically, there is an asymmetry between the operations we label as “normal” (or “forward” or “basic”) and the operations we label as “inverse.” It’s the inverse operations that are troublemakers. It’s the inverse operations that, almost inevitably, require us to either restrict the kinds of numbers (or functions) we can put into the operation or else expand the kinds of answers we’ll accept.

I can add any two natural numbers (=regular counting numbers) or zero and get another natural number out, but if I want to do subtraction, I either have to make sure that I’m always subtracting smaller numbers from bigger numbers, or else invent negative numbers. I can multiply any two whole numbers, but to do division, I either have to restrict the divisor to perfect factors or else introduce rational numbers (=fractions). Even then, I can’t divide by zero without doing some really major contortions to make infinity a sort of honorary number, contortions that most people only ever see if they study enough math to take complex analysis in college or grad school. I can square (or cube or…) any number, but I can only take the square root (or cube root or…) of a perfect square (or…) unless I invent irrational numbers. I still have to restrict my *answers* to positive numbers (because there are two possible answers), and I can’t *take* the square root of a negative number without inventing complex numbers. By the same token, I can’t take logarithms of negative numbers without using complex numbers, and I have to restrict the answers I get from inverse trig functions so I don’t get more than one answer.
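
To make that restrict-or-expand trade-off concrete, here is a minimal Python sketch. The helper name `real_sqrt` is made up for illustration; `math` and `cmath` are Python's standard real and complex math modules. Squaring happily accepts any real number, while the real square root refuses negative inputs, and the complex version only copes by enlarging the set of answers it is allowed to give.

```python
import math
import cmath

def real_sqrt(x):
    """Return the non-negative real square root of x, or None if there isn't one."""
    try:
        return math.sqrt(x)
    except ValueError:
        # math.sqrt rejects negative inputs: the "restrict the inputs" option
        return None

# The forward operation works on everything, and two inputs share one output.
print((-3) ** 2, 3 ** 2)   # 9 9

# The inverse has to pick one answer and refuses some inputs outright.
print(real_sqrt(9))        # 3.0
print(real_sqrt(-9))       # None

# The "expand the answers" option: complex numbers.
print(cmath.sqrt(-9))      # 3j
```

Notice that the forward direction needed no special cases at all, while the inverse needed both an error handler and a second number system.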

By the way, the situation isn’t quite so clear when it comes to integrals. In theory, there are actually *more* functions that can be integrated than differentiated. For a function to be differentiable, it needs to be smooth so that it has a unique slope at each point. To find the area under a curve (the original, and still most basic, purpose of integration), you only need a continuous function, and actually, you can relax that requirement quite a lot if you work at it (by integrating piecewise, or by doing something more complex like a Lebesgue integral). In practice, though, the only kinds of integrals that are actually realistic to compute (at least from the perspective of a high school or college calc student) are the ones that can be done as antiderivatives, i.e. exactly the ones that can be thought of as inverses of derivatives. From that perspective, they really are more troublesome than derivatives. You can take the derivative of pretty much any cooperative function and the computation is more or less rote, but there are some really nice functions you can’t integrate easily at all (e^(x^2) comes to mind), and integrating even some fairly simple functions requires a lot of ingenuity.

So there *is* a genuine mathematical difference between forward and inverse functions. It’s not really fair to say that the *math* is any harder. Math is just math and really doesn’t care. That said, it’s possible that all these domain and range complications create extra cognitive confusion. There are more things to think about. When you multiply numbers, you just multiply. When you divide, you either have to think about whether your choice of numbers will ‘work’ or else worry about more complicated kinds of numbers.

I hinted at another mathematical difference above, and I suspect it’s the more crucial one. Inverse functions are nearly always more complicated to compute. With the exception of subtraction, which really is just about as rote as addition, inverse operations take a certain amount of ingenuity, or guessing and checking, to carry out. Even division requires guessing how many times the divisor goes into the dividend and then multiplying to make sure you’re right. Squaring two numbers is easy, but taking square roots is a pain in the neck. (I once learned how to do this by hand, but couldn’t for the life of me tell you how.) I also have no idea how to take a logarithm by hand, unless I can break it down into logarithms I already know. Same goes for inverse trig functions. And as I said above, integration requires a *lot* of creativity.

A third difference between forward and inverse functions is linguistic. We often speak of inverse functions as undoing forward functions, and this can create some really nasty linguistic contortions. I can explain the concept of a square root by drawing a square with, say, an area of four and asking how long each edge is, which isn’t so bad. But the only way to actually talk about the computational aspect (possibly because of the complexities pointed out above) is to ask “how long would each edge have to be to make a square of area four?” which I suspect is harder to process than “what’s the area of this square whose edges are each length 2?” The problems get much more complicated when you get to higher powers where you can’t draw pictures easily. It’s pretty easy to break 2^4=16 down into easier operations. I can ask, “what’s 2 times 2 times 2 times 2?” Or maybe, “what’s 2 times itself four times?” On the other hand, the only way I have to ask about the fourth root of 16 is to say “what number would you have to multiply by itself four times to get 16?” That is NOT an easy question to parse.

Just from looking at the surface, it’s impossible to say whether unfortunate linguistic problems *are* the cognitive processing difficulties (keep in mind that any linguistic processing problem *is* cognitive) or whether our brains simply have no other way to think about these mathematical concepts, and so we’re stuck with linguistic contortions that reflect how we *think* about inverse functions. I suspect it’s the latter, but I can’t be sure. One way to tell would be to do some cross-linguistic studies to see if talking about inverse functions is so contorted in every language, but even this would only be useful if we found positive evidence (languages in which inverse functions are easier to talk about). Since most mathematical concepts get invented only once or twice and then disseminated to other people/cultures, it’s entirely possible that the awkward language just got borrowed wholesale from the language of the original inventor.

The last possible explanation I can think of is pedagogical: we just don’t do as good a job teaching inverse functions as we do normal functions. Given the range of different pedagogies, the consistency with which people have trouble with inverse functions, and the fact that these various inverse functions don’t have much in common *except* that they are all inverses, I find it hard to believe that the trouble with inverse functions is only a matter of some systematic pedagogical problem, though I suppose it’s possible. I also can’t imagine what those problems could be. Some students have little trouble with inverse functions, but they seem to be the same people who get everything else just fine, either because math comes naturally to them or because they had good teachers. Most of my Montessori classmates seemed to get the idea of square roots without much trouble, but then, they got squaring just fine, too.

That said, I’ve noticed some places where the typical pedagogy for inverse functions probably exacerbates the problems with learning them. At the very least, it would be nice if we didn’t teach inverse functions as an afterthought, which I’ve noticed happens a lot. This seems especially common when it comes to teaching inverse trig functions, perhaps because there *aren’t* very many interesting reasons for a trig student to use an inverse trig function *except* to undo a trig function. Even then, we could be focusing on why you might *want* to undo a trig function. I suspect this oversight comes from the bias in our traditional math curriculum towards computation. You can spend months and months and months learning various ways of computing trig functions, moving them around using identities, etc, etc. On the other hand, there isn’t much to be said about *computing* inverse trig functions except, “plug it into your calculator and see what comes out.” The *idea* of inverting trig functions is still important, but the constant bias towards calculation means that students (my students at least) come away with the idea that inverse trig functions are either pointless or difficult and mysterious.

One thing that seems to be lacking for most students is any overarching awareness of the entire *idea* of inverse functions. I don’t think I knew what was meant by “inverse function” until sometime in college. Even then it came with all this mumbo jumbo about injective and surjective functions that obscured the basic idea that functions (usually) come in pairs and by doing one and then doing the other, you get back to where you started. (Now, I appreciate the importance of all the injective and surjective stuff that takes care of all those problems with what kinds of numbers you can put into or get out of an operation, but I don’t think it’s that important the first time around. *Ducks in preparation for mathematicians to lob insults at her*) Having that overarching picture of how functions pair up gives students a conceptual slot in which to put inverse functions, and that makes them a lot easier to think about. This is usually treated as an advanced concept, but I don’t think the idea is conceptually all that difficult. It’s an underlying organizing principle of mathematics, and I think it’s one that 7 or 8 year olds could grasp. So on the one hand, we should be paying more attention to inverse functions *as* inverses.

At the same time, I think it’s important to let these inverse functions stand on their own. We seem to do this really well with the most basic functions, subtraction and division, but totally lose track of it with more complicated functions. Logarithms and integrals are useful and interesting *in their own right*, not just as ways to undo exponentials and derivatives. Logarithms model all sorts of interesting behavior in the real world. Integrals (which, by the way, were invented *before* derivatives) let you do some pretty amazing geometry, and have some really intuitive visual interpretations. That said, I rarely see my students learn about integrals *as* integrals. Even when they do learn about integrals (as Riemann sums), this pretty much always comes *after* antiderivatives, which biases the whole discussion towards thinking of integrals as just undoing derivatives, and it is usually presented as an afterthought, before immediately returning to the nitty gritty details of computing antiderivatives. I suspect this tendency also comes from the bias towards computation. There aren’t easy ways to compute logarithms or integrals except by undoing exponentials and derivatives, so if you’re concerned about teaching computation rather than concepts, you pretty much have to focus on the “undoing” aspect.

I’m not an expert on any of this stuff. I’ve studied math and linguistics, but everything I know about cognitive science I’ve picked up through linguistics or reading about child psychology, so I’m just brainstorming here. If you have any thoughts, or any references, please share!

I have been pondering this topic recently as well, particularly while tutoring. It keeps me busy updating the written version of my tutoring mini-lecture: http://mathmaine.wordpress.com/2010/01/05/arithmetic-operation-pairs/

Unfortunately, my description does not fill any of the gaping cognitive holes you so eloquently described. So, I guess we’ll all just have to keep trying!

http://mathmaine.wordpress.com

(sorry to have posted this in the wrong spot originally – feel free to delete my earlier comment)

I only had time to skim this, Alexa, because I’m due at the NAMTA Montessori conference in an hour. But I wanted to say, a child who has “experienced” the square of a binomial sensorially from about the age of 4 and has an impression of the 2 squares and 2 rectangles it’s made of, is delighted to learn how to calculate the binomial square root in the elementary class. If memory serves, you were!

I did find, however, that even when the on-paper extraction of square root was finally mastered, step by step, most of the children I worked with reserved that method for only roots with 3 or more digits. For 2 digit square root, most preferred to make a quick sketch that reproduced the material they’d worked with to learn the process.

What’s up with you for next year? Any sponsorships in the offing?

You’re absolutely right about square roots being easy when you’ve absorbed the experience over the course of most of a (rather short) lifetime. The way I see it, that approach is even more important for concepts like roots than for relatively simple ideas like addition, simply because the ideas are, I suspect, cognitively more difficult.

How unusual the Montessori approach to math is was driven home to me a few days ago when I read this description of 7th grade math on the website of a Waldorf school: “For the first time, the teacher introduces mathematical concepts with no relationship to physical perception: negative numbers, square and cube roots and ratios, which make real demands on students’ imaginative powers.” *What*?!?! I know what a cube root looks like. I know what a cube root feels like! Also, 7th grade? (Admittedly, Waldorf isn’t mainstream either, but that sounds much more like the mainstream approach than starting with a sensory experience of binomial squares and cubes at age 4 or so.)

I wasn’t actually thinking about extracting square roots on paper, I was thinking about grasping the concept. To be honest, that’s one of the only math lessons from my Montessori days that I have more or less never had any use for, and I barely remember it at all. (I vaguely remember that it was a little like long division and involved working with the number two digits at a time.) I suppose it’s always useful to be able to do calculations without needing to resort to a calculator or a table, but learning to extract square roots by hand still seems like a bit of an anachronism to me. On the other hand, that’s next on my list of things to learn to do on my abacus, just as soon as I get comfortable with division…

Dear All,

The pedagogic method of extracting roots taught in schools is really horrible to say the least.

The easy way of extracting roots (square roots, cube roots or any roots) is by Heron’s method.

Heron’s method of finding square root:

————————————————-

If n is the square root of N, obviously, dividing N by n gives n.

So when the divisor is the exact square root, the divisor, the quotient, and their average are all the same number.

If you divide N by a number x which is not the square root, you will get a quotient different from x, and the square root lies between the two.

For any reasonably close guess, the average of the divisor and the quotient is closer to the actual root than the starting number x.

This is the principle of Heron’s method of finding the square root of a number.

Ex: To find the square root of 500:

——————————————–

Let us guess that the square root is 20.

Divide 500 by 20 to get the quotient 25.

Take the average of the divisor 20 and the quotient 25, which is 22.5.

This 22.5 is closer to the actual root of 500 than the initial estimate of 20.

Repeating the above process:

500/22.5 = 22.2222

Average of 22.5 and 22.2222 is 22.3611.

For more accuracy, we can repeat the step once again to get the next estimate as 22.36068.

The actual square root of 500 is 22.36068.
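
For anyone who would rather check the steps above by machine, here is a minimal Python sketch of the same divide-and-average loop. The function name `heron_sqrt` and the `steps` parameter are my own labels for illustration, and the sketch assumes N and the guess are both positive.

```python
def heron_sqrt(N, guess, steps=3):
    """Approximate sqrt(N) by repeatedly averaging a guess with N / guess.

    Returns every estimate, starting with the initial guess.
    Assumes N > 0 and guess > 0.
    """
    x = float(guess)
    estimates = [x]
    for _ in range(steps):
        quotient = N / x          # divide N by the current estimate
        x = (x + quotient) / 2    # average the divisor and the quotient
        estimates.append(x)
    return estimates

# Reproduce the worked example: square root of 500, starting from 20.
for estimate in heron_sqrt(500, 20):
    print(round(estimate, 5))
```

Run on the example, this should trace the same path as the hand calculation: 20, then 22.5, then about 22.3611, then about 22.36068.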

I like to extend Heron’s method to find any root of any number.

For finding cube root, divide twice and take the average of the two divisors and the final quotient.

For finding fourth root, divide thrice and take the average of the three divisors and the final quotient.
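
The general rule (divide n - 1 times by the guess, then average the n - 1 divisors with the final quotient) can be sketched in Python as follows. The name `heron_root` and its parameters are made up for illustration, and positive inputs are assumed. As far as I can tell, this update is the same as the one Newton's method gives for solving x^n = N, which would explain why it converges so quickly.

```python
def heron_root(N, n, guess, steps=5):
    """Approximate the n-th root of N with the extended divide-and-average rule.

    Each pass divides N by the current estimate (n - 1) times, then averages
    the (n - 1) equal divisors with the final quotient.
    Assumes N > 0, guess > 0 and n >= 2.
    """
    x = float(guess)
    for _step in range(steps):
        quotient = float(N)
        for _division in range(n - 1):
            quotient /= x                      # one of the n - 1 divisions
        x = ((n - 1) * x + quotient) / n       # average of divisors and quotient
    return x

# One pass from a guess of 40 for the cube root of 78654.
print(round(heron_root(78654, 3, 40, steps=1), 5))

# More passes, compared with the value from ordinary exponentiation.
print(heron_root(78654, 3, 40, steps=6), 78654 ** (1 / 3))
```

Unlike a hand calculation, the sketch keeps every decimal place at each step, so its intermediate values will differ slightly from working that rounds the quotients.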

Ex: To find the cube root of say 78654

————————————————

Let the initial guess be 40.

Step 1: 78654 / 40 = 1966.35

Step 2: 1966.35 / 40 = 49.15875

The average of 40, 40 and 49.15875 is 43.05292.

You can repeat the above process with the starting number as 43 (No need to start with 43.05292).

Actual cube root of 78654 is 42.84567.

Even if you start with a very wild initial guess, you will only need a few more iterations to reach the answer.

Ex: To find the fourth root of say 78654.

Let the initial guess be 20.

Step 1: 78654 / 20 ≈ 3932

Step 2: 3932 / 20 ≈ 196

Step 3: 196 / 20 ≈ 10

The average of 20, 20, 20 and 10 is 17.5.

Repeat the above process with the starting number as say 17.

Step 1: 78654 / 17 ≈ 4627

Step 2: 4627 / 17 ≈ 272

Step 3: 272 / 17 = 16

The average of 17, 17, 17 and 16 is 16.75.

Actual 4th root of 78654 is 16.74674.

Ex: To find the fifth root of say 78654.

Let the initial guess be 10.

Step 1: 78654 / 10 = 7865.4

Step 2: 7865.4 / 10 = 786.54

Step 3: 786.54 / 10 = 78.654

Step 4: 78.654 / 10 = 7.8654

The average of 10, 10, 10, 10 and 7.8654 is 9.5731.

Actual 5th root of 78654 is 9.531125.

Our first iteration itself is quite close to the actual root. Is it not great?

Tips: Start with a convenient round figure as the initial guess to make divisions easier. The next starting number can again be rounded for easing future divisions.