Numbers Matter


Explorations in Introductory Representation Theory


Introduction

In this post, we aim to glean as much as we can about the characters of the symmetric groups (today we’ll be focusing on S_3 and S_4) using simple properties of characters.

In particular, here are the main tools we’ll be utilizing today and in upcoming posts where we plan to tackle finite matrix groups like \mathrm{GL}_{n}(\mathbb{F}_p) for a positive integer n and prime p. In what follows, let G be a finite group and h the number of conjugacy classes of G.

  • The irreducible characters \chi_{1}, \cdots, \chi_{h} form an orthonormal basis for the vector space of class functions \mathcal{C}(G). Recall that a class function is one that is constant when restricted to any single conjugacy class of G.
    In particular, this means \langle \chi_{i}, \chi_j\rangle_{G} = 1 if i=j and \langle \chi_i, \chi_j\rangle_{G}=0 if i\neq j for irreducible characters of G. (A short code sketch of this inner product appears just after this list.)
  • Let (\rho_i, V_i) for 1\leq i\leq h be the irreducible representations of G. Then we have the sum of squares formula: |G| = \sum_{i=1}^{h}\dim (V_i)^2.
  • Lastly, given two representations (\rho_{V}, V) and (\rho_{W}, W) of G, we can use the tensor product of vector spaces to create another representation – the tensor product representation – of G. We let the vector space of the new representation be V\otimes W. Next, a g\in G acts on V\otimes W by the diagonal action: g\cdot (v\otimes w) = (g\cdot v) \otimes (g\cdot w) for all elementary tensors v\otimes w, and we extend linearly to all tensors. Later, we’ll see the effect of tensoring two representations on their characters.
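
To make the first bullet concrete, here is a minimal Python sketch of the inner product of class functions, \langle \chi, \psi\rangle_{G}=\frac{1}{|G|}\sum_{g\in G}\chi(g)\overline{\psi(g)}, computed from values on conjugacy classes and the class sizes (the helper name inner_product and the example values are our own illustration, not from any library):

```python
def inner_product(chi, psi, class_sizes):
    """<chi, psi>_G = (1/|G|) * sum_{g in G} chi(g) * conj(psi(g)),
    with the sum grouped by conjugacy class."""
    order = sum(class_sizes)  # |G| is the sum of the conjugacy class sizes
    total = sum(size * x * complex(y).conjugate()
                for size, x, y in zip(class_sizes, chi, psi))
    return total / order

# Classes of S_3: identity (1 element), transpositions (3), 3-cycles (2).
sizes = [1, 3, 2]
chi_trivial = [1, 1, 1]
chi_perm = [3, 1, 0]   # character of the 3-dimensional permutation representation we meet below
print(inner_product(chi_trivial, chi_trivial, sizes))  # (1+0j): an irreducible character has norm 1
print(inner_product(chi_perm, chi_perm, sizes))        # (2+0j): not 1, so chi_perm is reducible
print(inner_product(chi_perm, chi_trivial, sizes))     # (1+0j): the trivial character appears once in it
```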

First up, S_3!

The most natural representation of G=S_3 (in fact, this hardly seems like a representation at all!) would be to let \rho: S_3\rightarrow \mathrm{GL}(V) send g\in S_3 to its 3 by 3 permutation matrix, with V = \mathbb{C}^3, three-dimensional complex space. For instance,

\rho(123)=\begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \in \textrm{GL}(\mathbb{C}^3),

and

\rho(23) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}\in\textrm{GL}(\mathbb{C}^3).

Unfortunately, this is not an irreducible one: all the \rho(g)’s fix (a,a,a)\in \mathbb{C}^3 for every complex number a. In other words, W:=\textrm{span}\{(1,1,1)\} is an invariant subspace of V (an invariant line, to be sure). The corresponding projection operator p:V\rightarrow W onto W is

p(x_1,x_2,x_3)=\frac{x_1+x_2+x_3}{3}(1,1,1),

for all (x_1,x_2,x_3)\in \mathbb{C}^3, and

W':=\ker(p)=\textrm{span}\{(1,0,-1),(0,1,-1)\} = \{(x,y,-x-y):x,y\in \mathbb{C}\}.

That is, W' is the orthogonal complement of W under the standard inner product, and so V=W\oplus W'.

According to Maschke’s theorem, W' is also an invariant subspace. We can do a quick spot check: \rho(g)((x,y,-x-y))=(-x-y,x,y) for g=(123). Setting (-x-y,x,y)=\alpha(1,0,-1)+\beta(0,1,-1), we see that \alpha=-x-y and \beta=x works, and so \rho(g)((x,y,-x-y)) \in W', as we expect.
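
As a numerical companion to this spot check, here is a small numpy sketch (the matrix-building code is our own) verifying that every 3 by 3 permutation matrix maps the sum-zero plane W' back into itself:

```python
import numpy as np
from itertools import permutations

# Columns are the basis vectors (1,0,-1) and (0,1,-1) of W' = {(x,y,z) : x+y+z = 0}.
B = np.array([[1, 0], [0, 1], [-1, -1]], dtype=float)

for perm in permutations(range(3)):            # the six elements of S_3, in one-line notation
    P = np.zeros((3, 3))
    P[list(perm), list(range(3))] = 1          # column j is e_{perm[j]}, i.e. e_j maps to e_{perm[j]}
    image = P @ B                              # images of the basis vectors of W'
    assert np.allclose(image.sum(axis=0), 0)   # coordinates still sum to 0, so the image lies in W'
print("W' is invariant under all of S_3")
```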

The sub-representation (\rho|_{W},W) is thus just the trivial representation: there just isn’t much freedom offered by a good ol’ line. However, the degree 2 representation (\rho|_{W'},W') is more interesting. Notice that \textrm{dim}W'=2, so \textrm{GL}(W') can be identified with the invertible 2 by 2 matrices. Fixing \mathcal{B}=\{(1,0,-1),(0,1,-1)\} as a basis for W', and then writing each \rho(g) for g\in G as a matrix with respect to \mathcal{B}, gives us:

\rho(12)=\begin{bmatrix}0&1\\1&0\\\end{bmatrix},\quad \rho(13)=\begin{bmatrix}-1&-1\\0&1\\\end{bmatrix},\quad \rho(23)=\begin{bmatrix}1&0\\-1&-1\\\end{bmatrix},

and

\rho(123)=\begin{bmatrix}-1 & -1 \\ 1 & 0 \end{bmatrix} \quad\textrm{and}\quad \rho(132)=\begin{bmatrix}0 & 1 \\ -1 & -1 \end{bmatrix}.

Of course, the identity goes to I_2 as usual. This representation is in fact irreducible: any nonzero vector in W' has two distinct coordinates, and applying suitable transpositions from S_3 moves it off any single line, so W' contains no invariant line. (The character computation below confirms this.)
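
The change of basis behind these 2 by 2 matrices can also be automated. Here is a sketch (the helper perm_matrix and the least-squares solve are our own choices) that recovers each \rho(g)|_{W'} in the basis \mathcal{B} by solving B\,M = \rho(g)\,B, and prints the matrix and its trace:

```python
import numpy as np

# Columns of B are the basis vectors (1,0,-1) and (0,1,-1) of W'.
B = np.array([[1, 0], [0, 1], [-1, -1]], dtype=float)

def perm_matrix(perm):
    """3x3 permutation matrix sending e_j to e_{perm[j]} (one-line notation on {0,1,2})."""
    P = np.zeros((3, 3))
    for j, i in enumerate(perm):
        P[i, j] = 1
    return P

elements = {"(12)": (1, 0, 2), "(123)": (1, 2, 0), "(132)": (2, 0, 1)}

for name, perm in elements.items():
    P = perm_matrix(perm)
    # Solve B @ M = P @ B: M is the matrix of rho(g)|_{W'} with respect to the basis B.
    M = np.linalg.lstsq(B, P @ B, rcond=None)[0]
    print(name, M.round().astype(int).tolist(), "trace:", int(round(np.trace(M))))

# Expected output (matching the matrices above):
#   (12)  [[0, 1], [1, 0]]   trace: 0
#   (123) [[-1, -1], [1, 0]] trace: -1
#   (132) [[0, 1], [-1, -1]] trace: -1
```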

The corresponding character \chi_{\rho|_{W'}} (recall that the trace does not depend on the choice of basis of W') works out to 2 at the identity, 0 at the transpositions, and -1 at the 3-cycles. Put more compactly, \chi_{\rho|_{W'}}(\sigma) is the number of fixed points of \sigma minus 1: the full permutation character counts fixed points, and we have split off one copy of the trivial representation.

We call this character \chi_{\textrm{standard}} and the corresponding representation the standard representation, (\rho_{\textrm{standard}},W'). Check that the inner product of the standard character with itself is 1, confirming that this representation is indeed irreducible.
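
Explicitly, weighting each conjugacy class by its size (1 identity, 3 transpositions, 2 three-cycles):

\langle \chi_{\text{standard}}, \chi_{\text{standard}}\rangle_{S_3}=\frac{1}{6}\left(1\cdot 2^2+3\cdot 0^2+2\cdot(-1)^2\right)=\frac{4+0+2}{6}=1.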

Now, let that unknown third and final (recall that S_3 has 3 conjugacy classes, one for each partition of 3, so we’re looking at three irreducible characters) irreducible character be \chi. By the sum of squares formula, 1^2+2^2+\chi(e)^2=|S_3|=6, so \chi(e)=1. Let \chi(\sigma)=\alpha for transpositions \sigma and \chi(\tau)=\beta for 3-cycles \tau. To find \alpha and \beta, we’re going to leverage the orthogonality of irreducible characters. Indeed,

\langle\chi, \chi_{\text{trivial}}\rangle=\frac{1}{6}\sum_{\sigma \in S_{3}}\chi(\sigma)\overline{\chi_{\text{trivial}}(\sigma)}=0\implies 3\alpha+2\beta=-1,

and utilizing the other character we have

\langle \chi, \chi_{\text{standard}}\rangle=0\implies \beta =1.
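
Spelling that second inner product out with the values of \chi_{\text{standard}} found above (a worked step for clarity):

\langle \chi, \chi_{\text{standard}}\rangle=\frac{1}{6}\left(1\cdot 2+3\alpha\cdot 0+2\beta\cdot(-1)\right)=\frac{2-2\beta}{6}=0\implies \beta=1,

and then 3\alpha+2\beta=-1 gives \alpha=-1.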

Putting these numbers together, we get

\chi(e)=\chi(123)=\chi(132)=1

and

\chi(12)=\chi(13)=\chi(23)=-1.

Thus, \chi(\sigma) is simply the sign of \sigma! We call it the sign character: \chi_{\text{sign}}.

All in all, we now have the character table of S_3!

\begin{array}{c|ccc} & \text{identity} & 2\text{-cycles} & 3\text{-cycles}\\ \hline \chi_{\text{trivial}} & 1 & 1 & 1\\ \chi_{\text{standard}} & 2 & 0 & -1\\ \chi_{\text{sign}} & 1 & -1 & 1 \end{array}
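
As a quick sanity check of the completed table (a numpy sketch; the weighting is the same class-size-weighted inner product as before), the rows are orthonormal:

```python
import numpy as np

# Character table of S_3: rows are (trivial, standard, sign),
# columns are the classes (identity, 2-cycles, 3-cycles).
table = np.array([[1,  1,  1],
                  [2,  0, -1],
                  [1, -1,  1]], dtype=float)
class_sizes = np.array([1, 3, 2], dtype=float)

# Weighted Gram matrix of the rows: should be the 3x3 identity.
gram = (table * class_sizes) @ table.T / class_sizes.sum()
assert np.allclose(gram, np.eye(3))
print("the rows of the S_3 character table are orthonormal")
```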

While to an outsider this table might seem to contain 9 independent pieces of information, the restrictions imposed by the deeper symmetries of the group mean that, in reality, the entire table can be recovered from just a couple of entries (say \chi_{\text{standard}}(12) and \chi_{\text{sign}}(12)), with orthogonality and the sum of squares formula filling in the rest.

Throughout mathematics, the notions we tend to study are the ones that strike the right balance between generality and structure. A specific object, while it may have much structure to analyze, would be incapable of describing the bigger, more general picture. On the other hand, too abstract an object would lose all restrictions, making it difficult to conjecture patterns about them in the first place.


Onto S_4!

Let’s continue our analysis of symmetric groups with the next one: S_4. As there are 5 integer partitions of 4, we’ll have a total of 5 irreducible characters (a conjugacy class of S_4 corresponds to a partition of 4: for instance, 2+2 corresponds to products of two 2-cycles like (12)(34), and 3+1 corresponds to 3-cycles such as (123)).

As usual, we have the trivial character \chi_{\text{trivial}}, which returns 1 for all g\in S_4, and the sign character \chi_{\text{sign}}, which returns the sign of the permutation. In much the same way as last time, we can construct a natural representation of S_4 that assigns to g\in S_4 the corresponding 4 by 4 permutation matrix, viewed as an element of \textrm{GL}(\mathbb{C}^4). This won’t be irreducible, however: again, the vectors with all coordinates equal in \mathbb{C}^4 are invariant under the action of the \rho(g)’s. The 3 dimensional complement of this invariant line will also be invariant, and that is our standard representation, whose character is \chi_{\textrm{standard}}. Doing the computations, we get \chi_{\text{standard}}(e)=3, \{(12)\} \rightarrow 1, \{(12)(34)\} \rightarrow -1, \{(123)\} \rightarrow 0 and \{(1234)\} \rightarrow -1.
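
These values are easiest to see as (number of fixed points) minus 1, since the permutation character counts fixed points and we have split off one copy of the trivial representation. A quick sketch (representatives chosen by hand, in one-line notation on \{0,1,2,3\}):

```python
# chi_standard(sigma) = (# of fixed points of sigma) - 1 for S_4.
reps = {
    "()":       (0, 1, 2, 3),
    "(12)":     (1, 0, 2, 3),
    "(12)(34)": (1, 0, 3, 2),
    "(123)":    (1, 2, 0, 3),
    "(1234)":   (1, 2, 3, 0),
}
for name, perm in reps.items():
    fixed = sum(1 for i, j in enumerate(perm) if i == j)
    print(name, fixed - 1)   # prints 3, 1, -1, 0, -1 in turn
```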

It’s time to invoke orthogonality! We still have two unknown characters: \chi_1 and \chi_2. Using the sum of squares formula, we have 1^2+1^2+3^2+x_1^2+x_2^2=|S_4|=24, which implies x_1^2+x_2^2=13, and since 13=3^2+2^2 is the only way to write 13 as a sum of two squares, we may take x_1=\chi_1(e)=3 and x_2=\chi_2(e)=2. Letting \chi_1 take on the values a_1, a_2, a_3 and a_4 on the 2-cycles, the products of two 2-cycles, the 3-cycles and the 4-cycles respectively, and using the three equations:

\langle \chi_1, \chi_{\text{trivial}}\rangle=\langle \chi_1, \chi_{\text{sign}}\rangle=\langle \chi_1, \chi_{\text{standard}}\rangle=0,

we get (weighting by the class sizes 1, 6, 3, 8 and 6 of the identity, the 2-cycles, the products of two 2-cycles, the 3-cycles and the 4-cycles):

\begin{matrix}6a_1+3a_2+8a_3+6a_4&=-3\\-6a_1+3a_2+8a_3-6a_4&=-3\\2a_1-a_2-2a_4&=-3\end{matrix}

Adding the first two equations gives 3a_2+8a_3=-3. Notice that the a‘s must be integers (character values of symmetric groups are always integers), so this is a linear Diophantine equation. Solving, we get a_2=-8n-1 and a_3=3n, for n\in \mathbb{Z}. Next, adding the second equation to three times the third gives 8a_3-12a_4=-12, that is, 2a_3-3a_4=-3, and so a_4=2n+1; substituting these expressions into the first equation yields a_1=-2n-1.

We do the same drill with \chi_2 (which takes on the values b_1,\cdots,b_4) to get that b_1=-2m, b_2=-8m+2, b_3=3m-1 and b_4=2m for m\in \mathbb{Z}.

Lastly, we have an equation involving both n and m, since \langle \chi_{1},\chi_2\rangle=0; it works out to 312mn+48m-72n=0. Dividing by 24 gives 13mn+2m-3n=0, and multiplying by 13 (Simon’s Favorite Factoring Trick) lets us factor this as (13m-3)(13n+2)=-6. Since 13m-3\equiv -3 \pmod{13} and the divisors of -6 are pairwise distinct modulo 13, we must have 13m-3=-3, so m=0, and then 13n+2=2 gives n=0.

This completes our character table for S_4 – just using orthogonality. (Taking n=m=0, \chi_1 takes the values 3,-1,-1,0,1 and \chi_2 the values 2,0,2,-1,0 on the classes below.)

\begin{array}{c|ccccc} & () & (12) & (12)(34) & (123) & (1234)\\ \hline \chi_{\text{trivial}} & 1 & 1 & 1 & 1 & 1\\ \chi_{\text{standard}} & 3 & 1 & -1 & 0 & -1\\ \chi_{\text{sign}} & 1 & -1 & 1 & 1 & -1\\ \chi_1 & 3 & -1 & -1 & 0 & 1\\ \chi_2 & 2 & 0 & 2 & -1 & 0 \end{array}
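
Here is the analogous sanity check for the completed S_4 table (the same weighted-inner-product sketch as for S_3), together with the sum of squares identity for the first column:

```python
import numpy as np

# Character table of S_4: rows (trivial, standard, sign, chi_1, chi_2),
# columns for the classes (), (12), (12)(34), (123), (1234).
table = np.array([[1,  1,  1,  1,  1],
                  [3,  1, -1,  0, -1],
                  [1, -1,  1,  1, -1],
                  [3, -1, -1,  0,  1],
                  [2,  0,  2, -1,  0]], dtype=float)
class_sizes = np.array([1, 6, 3, 8, 6], dtype=float)

gram = (table * class_sizes) @ table.T / class_sizes.sum()
assert np.allclose(gram, np.eye(5))     # rows are orthonormal
assert (table[:, 0] ** 2).sum() == 24   # 1 + 9 + 1 + 9 + 4 = |S_4|
print("the S_4 character table checks out")
```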

But dealing with that many variables all at once was not pleasant, to say the least. The trouble is that we were taking the inner product of two rows with each other, and one of the rows consisted entirely of unknowns, giving us simultaneous equations in several unknowns – bound to be computationally heavy.

Instead, notice what happens when we take the inner product of two columns, viewed as ordinary vectors in Euclidean space. For instance, take the 1st and 3rd columns of the table above. We get

1\times 1+3\times (-1)+1\times 1+3\times (-1) +2\times 2=0.

The two columns are orthogonal!

As you might’ve guessed, in general, any two columns are orthogonal! Mathematically, we would write this as

\sum_{i=1}^{h}\chi_{i}(g)\overline{\chi_i(g')}=0,

given that g and g' come from different columns – that is, g and g' are not conjugate to each other. (When g and g' are conjugate, the same sum works out to |G|/|\mathcal{C}(g)|, the order of the centralizer of g; we’ll need this stronger version shortly.) This can be proved by starting with the function f_s:G\rightarrow \mathbb{C} for an s\in G, defined by f_s(r)=1 if r is conjugate to s and f_s(r)=0 otherwise. The result follows as soon as you write this function as a linear combination of the irreducible characters, which you can do since f_s is clearly a class function.

In general, this is extremely useful if we have determined all the characters except for one: we can just take the inner product of each column with the column corresponding to the identity element, which we already know, allowing us to deal with one unknown at a time. As a bonus, we no longer need to know the size of each conjugacy class, which we did need when dealing with row orthogonality.
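
Here is column orthogonality in code for the S_4 table above (again just a sketch): distinct columns are orthogonal, and the squared length of each column is |G| divided by the size of its class, i.e. the order of the centralizer.

```python
import numpy as np

# Same S_4 character table and class sizes as before.
table = np.array([[1,  1,  1,  1,  1],
                  [3,  1, -1,  0, -1],
                  [1, -1,  1,  1, -1],
                  [3, -1, -1,  0,  1],
                  [2,  0,  2, -1,  0]], dtype=float)
class_sizes = np.array([1, 6, 3, 8, 6], dtype=float)

# Gram matrix of the columns: off-diagonal entries vanish (column orthogonality),
# and the diagonal entry for a class C is |G| / |C|.
col_gram = table.T @ table
assert np.allclose(col_gram, np.diag(24 / class_sizes))
print(np.diag(col_gram))   # [24.  4.  8.  3.  4.]
```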


Tensor Products

Recall that we have the notion of the tensor product of two representations – a tool that we can use to possibly build \chi_1 and \chi_2 from \chi_{\text{trivial}}, \chi_{\textrm{sign}} and \chi_{\textrm{standard}}. In that direction, we will derive an expression for the character of a tensor product. But before that, we look at a slightly different expression for a character of a representation.

Let’s fix a basis \mathcal{B}_V=\{v_1, \cdots, v_n\} for V. Recall that we have a corresponding basis, \{v_1^*, \cdots, v_n^*\} for V^*, the dual space of V: Given a v\in V, v_i^*(v) is the coefficient of v_i in the expansion of v in terms of the \mathcal{B}_V basis. Thus, the matrix representation (with respect to \mathcal{B}_V) for \rho_V(g)\in \textrm{End}_{\mathbb{C}}(V) has (i,j)-entry v_i^*(\rho(g)(v_j)). Taking the sum of the diagonal entries to get the trace, we have

\chi_{V}(g)=\sum_{i=1}^{n}v_i^*(\rho(g)(v_i)).

It turns out that this expression for the trace of a linear operator, and thus for the character, is quite versatile, primarily because it does not rely on concrete matrices all that much.
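
Numerically, the dual-basis description of the trace says: pick any basis (the columns of an invertible matrix B), take the dual functionals (the rows of B^{-1}), and sum v_i^*(A v_i). A short numpy sketch of this basis-independence (the random matrices are our own illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))   # an arbitrary linear operator on a 4-dimensional space
B = rng.standard_normal((4, 4))   # columns form a (generically invertible) basis
dual = np.linalg.inv(B)           # row i is the dual functional v_i^*

# sum_i v_i^*(A v_i), i.e. the sum of the diagonal entries of B^{-1} A B
trace_via_dual = sum(dual[i] @ A @ B[:, i] for i in range(4))
assert np.isclose(trace_via_dual, np.trace(A))   # basis-independent, as claimed
print(trace_via_dual, np.trace(A))
```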

Now we can work with the character of a tensor product better: Let W be a vector space with basis \mathcal{B}_W=\{w_1, \cdots, w_m\}. Then a basis for V\otimes W is T=\{v_i\otimes w_j:1\leq i\leq n, 1\leq j \leq m\}. A corresponding dual basis for (V \otimes W)^* would be \{(v_i\otimes w_j)^*:1\leq i\leq n, 1\leq j\leq m\}, where we define (v_i\otimes w_j)^*(v_k\otimes w_l)=\delta_{ik}\delta_{jl} and extend linearly. That is, (v_i\otimes w_j)^* extracts the coefficient of v_i\otimes w_j in the expansion of the input in the basis T. In particular, the value of (v_i\otimes w_j)^* on an elementary tensor

z=v\otimes w=\left(\sum_{k=1}^{n}a_kv_k\right)\otimes \left(\sum_{l=1}^{m}b_lw_l\right)=\sum_{k=1}^{n}\sum_{l=1}^{m}a_kb_l(v_k\otimes w_l)

is a_ib_j, which is also just v_i^*(v)w_j^*(w).

All in all, armed with this new formula for the trace, we have

\begin{array}{rcl}\chi_{V\otimes W}(g)&=&\sum_{(i,j)\in [n]\times [m]}(v_i\otimes w_j)^*(\rho_{V\otimes W}(g)(v_i\otimes w_j))\\&=&\sum_{(i,j)\in [n]\times[m]}(v_i\otimes w_j)^*(\rho_V(g)(v_i)\otimes \rho_W(g)(w_j))\\&=&\sum_{(i,j)\in [n]\times[m]}v_i^*(\rho_V(g)(v_i))\,w_j^*(\rho_W(g)(w_j))\\&=&\left(\sum_{i=1}^{n}v_i^*(\rho_V(g)(v_i))\right)\left(\sum_{j=1}^{m}w_j^*(\rho_W(g)(w_j))\right)\\&=&\chi_{V}(g)\,\chi_{W}(g).\end{array}

So the tensor product of representations just has the effect of multiplying the corresponding characters. In fact, going back to our character table for S_4, we can see that \chi_1=\chi_{\text{sign}}\chi_{\text{standard}} – a much more hassle free way of constructing new characters out of old ones, which we’ll explore and use heavily in upcoming posts!
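
Both halves of this can be checked numerically (a sketch): the trace of a Kronecker product is the product of the traces, and multiplying the sign and standard rows of the S_4 table entrywise reproduces \chi_1.

```python
import numpy as np

# trace(A kron B) = trace(A) * trace(B): the matrix counterpart of chi_{V tensor W} = chi_V * chi_W.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((2, 2))
assert np.isclose(np.trace(np.kron(A, B)), np.trace(A) * np.trace(B))

# In the S_4 table: chi_1 = chi_sign * chi_standard, pointwise on conjugacy classes.
chi_standard = np.array([3, 1, -1, 0, -1])
chi_sign     = np.array([1, -1, 1, 1, -1])
chi_1        = np.array([3, -1, -1, 0, 1])
assert np.array_equal(chi_sign * chi_standard, chi_1)
print("chi_1 is the character of the tensor product of the sign and standard representations")
```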


Before You Go…

Here’s another fact to illustrate the intimate connection between the irreducible characters of a group and its conjugacy classes. Here, \mathcal{C}_1, \cdots, \mathcal{C}_h denote the conjugacy classes of G, and \mathcal{C}(g) for a g\in G denotes the conjugacy class of g.

Consider \alpha : G\rightarrow G, an automorphism of G. There is a natural way in which \alpha acts on S =  \{\mathcal{C}_1, \cdots, \mathcal{C}_h\}: we define \alpha \cdot \mathcal{C}_i = \mathcal{C}(\alpha (g_i)), where g_i is an arbitrary representative of \mathcal{C}_i (this is well-defined because an automorphism sends conjugate elements to conjugate elements).

Next, we let \alpha act on the set \textrm{Irr}(G) of irreducible characters of G by setting \alpha \cdot \chi = \chi \circ\alpha^{-1}. Put together, this means that we’ve defined \textrm{Aut}(G)-actions on the sets S and \text{Irr}(G).

Given this setup, the proposition is that

\#\{i : \alpha \cdot \mathcal{C}_i = \mathcal{C}_i\} = \#\{\chi \in \text{Irr}(G) : \alpha \cdot \chi = \chi\}.

In other words, the number of fixed points corresponding to either of the two group actions is the same.

To prove this, we’ll translate it into inner products. First, note that the operation \cdot: \text{Aut}(G)\times \text{Irr}(G)\rightarrow \text{Irr}(G) really does land in \textrm{Irr}(G), i.e. \alpha \cdot\chi \in \textrm{Irr}(G) for all \chi \in \text{Irr}(G): indeed, \alpha\cdot\chi=\chi\circ\alpha^{-1} is the character of the representation \rho\circ\alpha^{-1}, and it stays irreducible because its norm is unchanged – the same kind of averaging-over-the-group computation that appears in a proof of Maschke’s theorem:

\langle \alpha \cdot \chi, \alpha\cdot \chi\rangle_{G}=\frac{1}{|G|}\sum_{g\in G}\left|\chi(\alpha^{-1}(g)) \right|^2.

At this point, note that \alpha^{-1} is a bijection from G into itself, so \alpha^{-1}(g) will still run through G as g varies over G. Hence, we have \langle\alpha\cdot \chi, \alpha \cdot \chi\rangle_{G}=\langle \chi, \chi\rangle_{G}=1, where the last equality follows as \chi is irreducible.

Now we’re free to use the orthogonality relations! We get \langle \chi, \alpha \cdot \chi \rangle_{G}=1 if \alpha fixes \chi and 0 otherwise. Adding all these inner products up, we get:

N = \sum_{\chi \in \text{Irr}(G)} \langle \chi, \alpha \cdot \chi \rangle_G =\frac{1}{|G|} \sum_{\chi \in \text{Irr}(G)} \sum_{g \in G} \chi(g)\, \overline{\chi(\alpha^{-1}(g))},

where N=\#\{\chi \in \text{Irr}(G) : \alpha \cdot \chi = \chi\}. Switching the order of the summations (cause they’re always in the wrong order), we get

N=\frac{1}{|G|}\sum_{g\in G}\sum_{\chi\in \text{Irr}(G)}\chi(g)\overline{\chi(\alpha^{-1}(g))}.

Aha, look at the inner sum now – we can invoke the column orthogonality that saved us earlier, in the stronger form noted there: the sum over \text{Irr}(G) equals |G|/|\mathcal{C}(g)| when g and \alpha^{-1}(g) are conjugate and 0 otherwise. Doing that and collecting terms, we get

N = \sum_{g \in G} \frac{1}{|\mathcal{C}(g)|} \, \delta_{\mathcal{C}(g), \mathcal{C}(\alpha^{-1}(g))} = \sum_{i=1}^{h} \sum_{g \in \mathcal{C}_i} \frac{1}{|\mathcal{C}(g)|} \, \delta_{\mathcal{C}(g), \mathcal{C}(\alpha^{-1}(g))},

where \delta_{\mathcal{C}(g), \mathcal{C}(\alpha^{-1}(g))} is 1 if \mathcal{C}(g)=\mathcal{C}(\alpha^{-1}(g)) and 0 otherwise. Within a single class \mathcal{C}_i the summand is constant: |\mathcal{C}(g)|=|\mathcal{C}_i|, and the condition \mathcal{C}(g)=\mathcal{C}(\alpha^{-1}(g)) holds for one g\in\mathcal{C}_i exactly when it holds for all of them – it says precisely that \alpha\cdot \mathcal{C}_i=\mathcal{C}_i. Hence

\sum_{g \in \mathcal{C}_i} \frac{1}{|\mathcal{C}(g)|}  \delta_{\mathcal{C}(g), \mathcal{C}(\alpha^{-1}(g))} = \delta_{\mathcal{C}_i,\alpha\cdot\mathcal{C}_i}

and so

 N = \sum_{i=1}^{h}\delta_{\mathcal{C}_i, \alpha\cdot\mathcal{C}_i}

which is exactly the cardinality of the first set – the number of conjugacy classes fixed by \alpha – completing the proof.
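
To close, here is a quick numerical check of the proposition on a group that doesn’t appear in this post (the cyclic group of order 5, chosen simply because it has an automorphism, x \mapsto 2x, that genuinely moves both conjugacy classes and characters around):

```python
import numpy as np

n = 5
alpha = lambda x: (2 * x) % n       # the automorphism x -> 2x of Z/5
alpha_inv = lambda x: (3 * x) % n   # its inverse, since 2 * 3 = 6 = 1 (mod 5)

# Z/5 is abelian, so the conjugacy classes are the singletons {x};
# alpha fixes the class of x exactly when alpha(x) == x.
fixed_classes = sum(1 for x in range(n) if alpha(x) == x)

# The irreducible characters are chi_k(x) = exp(2*pi*i*k*x/n), and (alpha . chi_k)(x) = chi_k(alpha_inv(x)).
def chi(k, x):
    return np.exp(2j * np.pi * k * x / n)

fixed_chars = sum(
    1 for k in range(n)
    if all(np.isclose(chi(k, alpha_inv(x)), chi(k, x)) for x in range(n))
)

print(fixed_classes, fixed_chars)   # both are 1, as the proposition predicts
```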