Principled memorization

All else being equal, people tend to underestimate the value of drilling and memorization in learning.[1] It’s cheap and effective, and we should expect it to be undervalued due to the cultural forces acting on intellectuals across many domains. Below, I focus largely on the latter aspect, which is probably the most interesting and least important.

I. First principles

Physicists in particular seem to suffer here, and to the extent that physics is an archetypal intellectual endeavor, we can draw more general lessons. It’s part of the culture that physicists don’t memorize. To the undergrad,[2] that’s what separates the physicist from the engineer and the pre-med, who (one assumes) spend their evenings burning tables of numbers into their retinas. Physicists learn by solving problems, not by doing exercises; by gaining deep understanding, not by cramming with flashcards.[3] Wasn’t it Einstein himself who wouldn’t memorize something he could look up?[4] To walk the path of virtue is to derive from first principles.

This culture has roots in common with the shared habits of bright slackers everywhere. For those who coasted through high school absent study habits, yet took pride in academic achievement, physics is the next best thing. We know now to be wary of raising intellect over conscientiousness, but the culture of first principles only validates our old habits. We slackers find a warm welcome, and if we find ourselves working a little harder, it’s certainly no grind.

We’re not completely fooling ourselves. An education in physics comprises many pieces of mathematical and physical content that build on and reinforce each other in complicated ways. Things we learn early on are later revealed to be special cases of more fundamental ideas, and this top-down approach is necessary to understand the deeper principles at all. Students will see a given subject as a freshman, as an upperclassman, then as a graduate student, each time drawing on additional mathematical tools and physical analogies, and each time necessarily leaving out some of the build-up to simpler facts. This repetition and progression produces sufficient fluency in most students.

The ideal student is thus drawn into a (physically, if not mathematically) “post-rigorous stage” where conceptual fluency isn’t impeded by a need to build things up from the beginning every time. Perhaps another student is forced to become comfortable working with black boxes in order to keep up (or quasi-black boxes for things learned and not fully remembered). Then the former student is the moral superior, who will besides more quickly (for example) recognize and know what to do with near-analogies or disanalogies, and is more likely to be able to rebuild her knowledge from first principles should she suffer a momentary lapse in working memory. Sounds pretty good.

II. Second principles

If we’re not completely fooling ourselves, though, we should at least acknowledge that any individual student of physics has plenty of gaps in his knowledge and takes a good deal on faith, whether or not he’s aware and whether or not it matters in practice. After enough repetition, the mention of solving a partial differential equation by separation of variables will receive mere nods from either student above, in the end without evoking any thoughts on existence or uniqueness or orthogonality or completeness (let alone proofs of the relevant theorems). If, in practice, “first principles” ends up meaning “a few steps back, since that’s what I remember having convinced myself of,” do we sacrifice our moral high ground? Do we do so in that much of the math we use, in particular, we never learned or justified rigorously, relying on the fact that most physical quantities happen to be mathematically well-behaved in the important ways? Do we do so in failing to drop everything the moment we encounter an unfamiliar idea and proceed only when we thoroughly understand it? Do we do so in learning physics from teachers and textbooks in the first place, rather than deriving everything ourselves? Shall we not cease until we hold in our minds the simultaneous workings of the Universe in their entirety? What I’m trying to indicate is that the moral/aesthetic component of deep understanding is, like many such things, in the end somewhat a matter of tiny flags and campaign pins.

There’s more of a continuum between rote memorization and deep understanding than the culture tends to acknowledge, and a student falls at a different place along it every time she encounters or discovers an idea, even the same idea at different times. In first-semester calculus, the lecturer proves each ‘rule’ for taking derivatives exactly once, forever after assuming it to be in our toolbox. One student lets her eyes glaze over for some proofs, but memorizes the rules, does a bunch of practice problems, and later returns for a thorough look at the proofs once she’s become more comfortable with the idea of differentiation. Another follows the proofs, but feels it’s beneath him to remember the rules and just re-derives them every time. A physicist later in her career only ever remembers the derivative of tan(x), and works backward to the quotient rule from that, and another just treats the quotient as a product; neither has used epsilons or deltas in years. How shall they be judged? Of whom is differentiation truly part? Who will work the fastest, with the fewest mistakes, with the most original ideas?
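
For concreteness, here is a sketch of the two shortcuts those physicists might be using. The notation is my own choice: f and g stand for any differentiable functions, with g nonzero where we divide by it. Treating the quotient as a product needs only the product and chain rules, and the remembered derivative of tan(x) serves as a check on the result:

```latex
\begin{align*}
\left(\frac{f}{g}\right)'
  &= \left(f \cdot g^{-1}\right)'
   = f'\,g^{-1} + f \cdot \left(-g^{-2}\,g'\right)
   = \frac{f'g - f g'}{g^{2}}, \\
(\tan x)'
  &= \left(\frac{\sin x}{\cos x}\right)'
   = \frac{\cos x \cdot \cos x - \sin x \cdot (-\sin x)}{\cos^{2} x}
   = \frac{1}{\cos^{2} x}
   = \sec^{2} x.
\end{align*}
```

Going the other way, someone who remembers only that the derivative of tan(x) is sec²(x) can use the second line to pin down the sign and the squared denominator in the first.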

III. Nobody say “light is waves”

Folks learning a foreign language have to learn quite a bit of vocabulary. You can build up a big stack of flashcards, but that’s not a great strategy on its own. You can get pretty good conscious knowledge of lexical meaning without real fluency, still falling short of where you process and draw on foreign vocab automatically. You want to learn words in context and put them to use. And when you’re memorizing vocab you still want to fit things into some low-level linguistic framework for how words in the foreign language are formed, how they’re related to each other, and so on. Matching symbols to sounds is a pretty long road. The language-acquisition literature’s pretty lengthy; you can find for yourself plenty of surveys that find commitment to rote rehearsal of vocabulary to be negatively correlated with both vocabulary size and overall proficiency.[5] But non-babies don’t learn languages through in-context cues alone.

There’s an analogy to physics here. I wouldn’t dream of suggesting a student memorize formulae symbol by symbol. But depending on the domain, there’s a lot to be gained by memorizing through drilling early in the process of learning, rather than through the repetition of natural use. Perhaps “drilling” is the wrong word–review here isn’t so much rote rehearsal as the reiteration of short paths of reasoning. Yes, answers should come naturally in response to questions, but so should the logic and reasoning behind the answers, the web of connections and connotations.[6] Recalling a definition should be largely a matter of recalling why it is the way it is, as the student understands it at the time. Not just definitions, either, but also approximate values of natural constants, common combinations of those constants, equations, functional forms, integrals and derivatives, common expressions evaluated at particular frequencies or temperatures, proof techniques, proof steps, proof hints, intuitions, analogies, historical facts. Depending on your goals, you might practice any of these as the elements of a language to be learned to fluency. One can build stronger connections faster–both rigorous and intuitive–by putting pieces in place early on, keeping them there, and reviewing the connections as they accumulate. I’d expect that you can do similar things in other domains, but I don’t know anything besides physics; the generalization is left as an exercise for the reader.

Think of it as a Sapir-Whorf hypothesis that abandoned linguistics in its freefall toward tautology: with a vocabulary comprising complex ideas and a grammar of abstract problem solving, one can form deeper and more original sentences.[7] One thus also lowers a certain kind of activation energy: if you have the right numbers and equations and ideas at your fingertips, you can do a calculation offhand that you’d never sit down for ten minutes to work out. That opens a surprising number of doors. On the other hand, one can also waste time and fool oneself regarding how well something has been learned. I suspect many people reading this, and especially those in physics, have more to gain than to lose by memorizing more, largely as a result of cultural bias towards “deep understanding.”[8] Even absent that consideration, when one can effectively permanently learn something with spaced repetition software at such a low cost,[9] it seems silly to spend time re-deriving and looking things up.

IV. Still worried about being misunderstood, so:

Ravi Vakil acknowledges something like the continuum between rote and deep learning in his advice to potential students: “[M]athematics is so rich and infinite that it is impossible to learn it systematically, and if you wait to master one topic before moving on to the next, you’ll never get anywhere. Instead, you’ll have tendrils of knowledge extending far from your comfort zone. Then you can later backfill from these tendrils, and extend your comfort zone; this is much easier to do than learning ‘forwards’.”

But he immediately warns the reader: “(Caution: this backfilling is necessary. There can be a temptation to learn lots of fancy words and to use them in fancy sentences without being able to say precisely what you mean. You should feel free to do that, but you should always feel a pang of guilt when you do.)”

That’s all, then. Don’t take a principled stand against learning your field thoroughly. Remember that repetition isn’t always rote.

  1. Equality here indicating that I’m not recommending real pedagogical practice, where motivation and engagement are also at stake. Also, my observation is directed largely towards otherwise proficient learners. There’s plenty to master before one gets to the point of worrying about things like this. Perhaps in another post, or several.
  2. It’s come to my attention that the following sentences are not sufficiently self-parodying. On pain of pedantry I now feel obliged to explain that this was intended to be a self-aware dig at physicists’ arrogance, spoken in the voice of this hypothetical undergrad with an inflated sense of their chosen field and ignorance of all others. Of course the article is all about how this perspective is self-serving and narrow and unjustifiable but I reluctantly cede that people might not read my mind two paragraphs in.
  3. That one can ace the Physics GRE with judicious use of flashcards is only an indication of its uselessness as a metric of physical acumen.
  4. Supposedly the speed of sound. Or Rutherford: “All science is either physics or stamp collecting”?
  5. Well, you can also find studies that argue for the utility of rote rehearsal over more complicated/celebrated “keyword” or “semantic mapping” strategies. Note for the present author: maybe worth a closer look.
  6. I’ve met a few mathematicians who are basically wizards, who treat mystifying(-for-me) problems as routine. For all I can tell, and for all they can tell me, they get their ideas beamed into their heads directly from the Source of Magic: “It seemed like the natural thing to try” is the most I can get out of them. I suspect, from my own experiences in mystifying others, that their almost-reflexive wizardry comes down to rather sophisticated grooves worn in their mind from lifelong immersion in the subject, winding paths down which they start without even noticing. That’s all a topic for another post. But my further suspicion, which partially motivates this essay, is that sophisticated grooves can be worn more deliberately. The unsatisfying extended metaphor I’m working with here suggests “pacing back and forth.”
  7. The book Photonic Crystals by Joannopoulos teaches and uses such vocabulary to striking effect–much more so than any physics textbook I can recall. I might write a full review or a problem set from this perspective, for the sake of specificity.
  8. Judging, for example, by the comments on the relevant Sequences post.
  9. Gwern provides an estimate toward a rule of thumb: “don’t use spaced repetition if you need it sooner than 5 days or it’s worth less than 5 minutes.”
