Your Servant

Every mathematician develops his own preferences for notation. This is necessary because there are often (I’m tempted to say “usually”) many notations for the same concept.

Consider, for instance, something as fundamental as a vector. A statistician will probably indicate that some quantity is a vector by writing its symbol in bold-face type

{\bf x}.

A physicist is more likely to indicate its vector nature by placing a small arrow (or half-arrow) above the symbol

\vec x.

If you’re working in quantum mechanics, it’s not uncommon to use the “bra-ket” notation, in which a vector is indicated by a vertical bar, followed by the symbol, followed by a right angle bracket

| x \rangle.

Alternatively, one can denote a vector by placing a subscript on the symbol

x_j .

The subscript is usually meant to denote which component of a vector is referred to. An n-dimensional vector has n components in some reference frame, e.g., a position vector may have x-, y-, and z-components which are denoted x_1, x_2, x_3.

More savvy readers may object that x_j is therefore not a vector — it’s one of the components of a vector, depending on the value of the index j. You’d be right … unless …

There’s yet another notation for a vector itself, not for its components, which happens to be my favorite. It’s to write the vector with a subscript

x_j,

but in this case the index marker “j” is not an ordinary index. It doesn’t represent some number specifying which component of the vector is referred to. Instead it’s an “abstract index,” which is merely a marker to indicate that the given symbol x refers to a vector. It’s really no different from putting a small arrow above the symbol to indicate its vector nature, but the “vector indicator symbol” is a subscript below rather than an arrow above. I learned this notation from Penrose & Rindler’s Spinors and space-time, which is possibly my favorite book.

One can even indicate vectors with a superscript rather than a subscript

x^\alpha.

Note that I’ve used a Greek letter for the index. This is common notation in tensor analysis, which is used in advanced physics (especially general relativity). If the superscript is an abstract index then this indicates a genuine vector, but if it’s a “specific index” then it indicates one of the components of the vector (depending on the value of the index). When I first learned tensor analysis (I kinda cut my teeth on it), it took some getting used to that the superscript was not an exponent, i.e., that x^2 referred to the 2nd component of the vector x rather than the square of some number x.

Just as there are many notations for vectors, so too there are multiple notations for simple vector operations like taking the inner product, or dot product, of two vectors. A statistician would indicate this simply by writing the vectors (in bold-face type) next to each other

{\bf xy}.

The physicist, however, would more likely write the symbols with little arrows and a dot between them

\vec x \cdot \vec y,

unless he was doing quantum mechanics, in which case one vector would be a mirror-image “bra” vector, written next to the other “ket” vector

\langle x | y \rangle.

With index notation we can indicate the dot product as a simple sum

\sum_{j=1}^n x_j y_j.

However, there’s a beautiful shortcut notation known as the Einstein summation convention, in which if any index is repeated then it is to be summed over all possible values. The dot product is then

x_j y_j,

which is really just a shorthand way of writing the sum indicated in the preceding equation.
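
As a minimal sketch of what the convention abbreviates, the repeated index j below is summed over every component. The function name `dot` and the sample vectors are purely illustrative and don't come from any particular library.

```python
def dot(x, y):
    """Dot product x_j y_j: the repeated index j is summed over all components."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    # Explicit form of sum_{j=1}^{n} x_j y_j (Python counts from 0 rather than 1).
    return sum(x[j] * y[j] for j in range(len(x)))


x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
print(dot(x, y))  # 1*4 + 2*5 + 3*6 = 32.0
```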

When the index is an abstract index, the inner product is written in the same way

x_\alpha y_\alpha.

This does not, however, indicate a sum — unless we know what the components are because we’ve chosen some basis for our vector space. Instead, in abstract-index notation a repeated index indicates the operation of transvection, which for vectors is the inner product. If all that seems unnecessarily confusing, rest assured there are situations in which the distinction is valuable.

When superscripts are used, there are generally two “versions” of each vector. One has a superscript and represents an “ordinary” vector, sometimes called a contravariant vector. The other has a subscript and is usually called a covariant vector, but might more rigorously be referred to as a dual vector. Strictly speaking, one is only allowed to take the inner product when one vector is contravariant and the other covariant (when one is an ordinary vector and the other is a dual vector), and it’s written as

x_\alpha y^\alpha.

This can represent a sum of components if the index \alpha is specific, or the abstract operation of transvection if the index is abstract.
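
When the index is specific, the sum of components can be sketched roughly as follows. The metric used to turn superscripted components into subscripted ones is an assumption made for illustration (flat spacetime with signature (+, -, -, -)), and the names `MINKOWSKI`, `lower` and `contract` are hypothetical.

```python
# Assumed metric for illustration: flat spacetime, signature (+, -, -, -).
MINKOWSKI = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]


def lower(metric, v_up):
    """Covariant components v_a = g_{ab} v^b, summing over the repeated index b."""
    n = len(v_up)
    return [sum(metric[a][b] * v_up[b] for b in range(n)) for a in range(n)]


def contract(x_down, y_up):
    """Inner product x_a y^a: one index down, one index up, summed over a."""
    return sum(xa * ya for xa, ya in zip(x_down, y_up))


x_up = [2.0, 1.0, 1.0, 1.0]
y_up = [3.0, 1.0, 2.0, 1.0]
x_down = lower(MINKOWSKI, x_up)
print(x_down)                  # [2.0, -1.0, -1.0, -1.0]
print(contract(x_down, y_up))  # 2*3 - 1*1 - 1*2 - 1*1 = 2.0
```

With this assumed metric the subscripted and superscripted components of the same vector differ in the sign of the spatial entries, which is one concrete way the two “versions” can have different component values.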

The “bra-ket” and subscript-superscript notations may seem unnecessarily complicated because there are two “forms” for each vector: “bra” and “ket”, or “covariant” (dual) and “contravariant” (ordinary). It’s usually a good idea to forgo this complication except in circumstances where the distinction is important. In general relativity, for instance, the contravariant and covariant vectors will have different values for their components even with the same coordinate reference frame. In quantum mechanics the “components” of a bra vector with respect to some reference frame will be the complex conjugates of the components of its associated ket vector. This reflects the fact that in such circumstances the vector spaces themselves have added complications.

For instance, in quantum mechanics the inner product is no longer a symmetric operation, so the inner product of x and y is not the same as the inner product of y and x; they’re complex conjugates of each other. This ensures that the inner product of any vector with itself is always real-valued and non-negative (positive for any nonzero vector), so its square root makes a suitable norm for the vector space.
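
Here is a small sketch of that behavior for vectors with complex components. It assumes the common physics convention that the bra (the first argument) is the one whose components get conjugated. The name `inner` and the sample vectors are illustrative only.

```python
def inner(x, y):
    """Bra-ket inner product <x|y>: conjugate the components of x, then sum."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return sum(a.conjugate() * b for a, b in zip(x, y))


x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]

print(inner(x, y))  # <x|y>
print(inner(y, x))  # <y|x>, the complex conjugate of <x|y>
print(inner(x, x))  # <x|x>, which has zero imaginary part and is non-negative
```

For these sample vectors, swapping the arguments flips the sign of the imaginary part, while the inner product of x with itself is just the sum of the squared magnitudes of its components.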

In general relativity, all vectors are usually real-valued so the complex-number thing doesn’t enter, but the vector space itself has a structure such that the inner product of a vector with itself can actually be negative. In the usual (or at least, my preferred) convention, vectors with positive “norm” are time-like vectors while vectors with negative norm are space-like. There are even vectors which are not themselves zero, but give zero “norm”, called null vectors.
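
To make the three cases concrete, here is a rough sketch that classifies a four-component vector by the sign of its “norm”. It assumes flat spacetime with the time component first and signature (+, -, -, -), which matches the convention above where time-like vectors have positive norm. The function names are hypothetical.

```python
def norm_squared(v):
    """Inner product of v with itself, assuming a flat (+, -, -, -) metric."""
    t, x, y, z = v
    return t * t - x * x - y * y - z * z


def classify(v):
    """Label a 4-vector by the sign of its 'norm'."""
    s = norm_squared(v)
    if s > 0:
        return "time-like"
    if s < 0:
        return "space-like"
    # Note: this simple sketch also labels the zero vector as "null".
    return "null"


print(classify([2.0, 1.0, 0.0, 0.0]))  # time-like: 4 - 1 = 3 > 0
print(classify([1.0, 2.0, 0.0, 0.0]))  # space-like: 1 - 4 = -3 < 0
print(classify([1.0, 1.0, 0.0, 0.0]))  # null: 1 - 1 = 0, yet the vector is not zero
```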

I guess the point of all this is that there’s a reason for each of these notations. Each has its own usefulness, and each has its own field in which it tends to be used most often. At least, that’s the case when you publish — you should use the notation common in the field so as to communicate most clearly with your intended readers. But in your own notebook, it’s the custom to use whatever notation you damn well please. At least, that’s what I do. As I said, my favorite is the abstract-index notation but the fact is that at various times and under various circumstances I use them all.

Students often tend to adhere slavishly to the notation introduced in their textbooks or by their teachers. When I was young I did that myself, until I received the best mathematical advice I ever heard:

notation should be your servant, not your master.


12 responses to “Your Servant”

  1. Gavin's Pussycat

    > called null vectors.

    or ‘light-like’

    One way I found useful for looking at co/contravariant or dual vectors is that an ordinary vector made up of components is written contravariantly, but the gradient vector of a scalar field in the same space is to be written covariantly.

  2. “Penrose & Rindler’s Spinors and space-time, which is possibly my favorite book”

    You are a dangerous individual.

  3. In most of the computational geometry literature, uv is a typo, you have to say u^T v to denote the inner product (u v^T for the less common outer product).

    [Response: It’s that way for statisticians too — I goofed. I use that notation so rarely, it slipped my mind.]

    In computer graphics, the default is often row vectors rather than column vectors, so you transpose everything. It’s “fun” to figure out which notation is being used in any given paper. Notation is the servant of the writer, and the master of the reader.

    But I’m curious what triggered this post.

    [Response: I was reading about some different methods for identifying empirical modes in data sets, which uses a lot of matrix algebra, and I have some quirky notation conventions which I like. Then there was the discussion in a previous post about vectors (“velocity”) vs scalars (“speed”). So …]

  4. Good points Tamino.

  5. Oh, boy. I’m studying atmospheric sciences and the notation issue is a mess.
    In Algebra, inner product was notated like this: <a,b> (with arrows, but I don’t know how to make them). But it’s very confusing, since it is the exact same notation we used for linear combination (so <a,b> is the vector space generated by a and b). Now in Maths 3 (a mismatch of different topics regarding integrals and differential equations) linear combination was introduced with brackets, so <a,b> = [a,b].

    One example of useful notation is dF/dx for derivatives. I hated that notation since it was so much more cumbersome than just using F’x. I only found out the power of dF/dx in physics, when we realized we can use them as separate variables and when using the chain rule.

  6. That structure of space you refer to is defined by the metric… Euclidean space has basically the Kronecker delta as its metric, Minkowski space has an opposite sign time component… g mu nu baby…

  7. And this is what makes attempting self teaching from books and papers virtually impossible.

  8. A related issue is that words can have such different meanings in different subjects. Take “positive feedback”, for example. As used in climate science you can have positive feedback and a stable system, but as the term is defined in other areas positive feedback would imply an unstable system.

    • David B. Benson

      Not so. The question of stability is, for a linear system with feedback, entirely in the gain. If the gain is small enough the system stabilizes and indeed feedback controlled audio amplifiers do just that.

      In general for stability the Nyquist criterion must always be satisfied and simply small gain may not be enough.

    • Thomas,

      You’re right about terminology, but this is a trivial matter of becoming initiated into the lingo of the field. It doesn’t change the physics any more than changing the mathematical notation used in a problem changes the physics. But in both cases, how one formulates the problem can end up lending insight into the physical phenomena that would not be apparent if one chose an alternative setup.

      Even in climate science the net system feedback is negative for most circumstances. The problem just boils down to what someone will or will not call a feedback, which is entirely arbitrary depending on how I set up my reference system.

    • David B. Benson

      This action of feeding back of the signal from output to input gave rise to the use of the term “feedback” as a distinct word by 1920. from
      and so all other subsequent meanings are derivative.

      However, an abstraction was soon found to be highly desirable for analysis: According to Ashby, mathematicians and theorists interested in the principles of feedback mechanisms prefer the definition of “circularity of action”, which keeps the theory simple and consistent. For those with more practical aims, feedback should be a deliberate effect via some more tangible connexion. “[Practical experimenters] object to the mathematician’s definition, pointing out that this would force them to say that feedback was present in the ordinary pendulum … between its position and its momentum – a ‘feedback’ that, from the practical point of view, is somewhat mystical. To this the mathematician retorts that if feedback is to be considered present only when there is an actual wire or nerve to represent it, then the theory becomes chaotic and riddled with irrelevancies.” from ibid

      This general theory, largely worked out in the 1930s, is of great and general applicability, so much so that the term has crept into popular usage and so diluted the precise meaning used in engineering and science. This precise meaning is studied in control theory texts but I first learned it over 50 years ago from David K. Cheng’s “Analysis of Linear Systems”, intended for ee majors. It was a memorable course.

  9. Billy Joel had the right idea:

    If you search for tensorness
    it isn’t hard to find.
    You can have the script you need to live.
    But if you look for usefulness
    You might just as well be blind.
    It always seems to be so hard to give.

    Complexity is such a nerdy word.
    Everynum is so unreal.
    Complexity is simply quite absurd.
    And mostly what math will reveal