Why is symmetry so important in particle physics? (opens in new tab)

(mfaizan.github.io)

121 pointsmfn3y ago60 comments

60 comments

I find it incredibly frustrating that again and again, the Lagrangian gets introduced and then said it should be minimized without ever explaining the motivation behind doing so. What is the Lagrangian and why should it be minimized?

I totally get how L gets defined mathematically, how it is derived from Newton's laws (this part is typically well explained by textbooks), and why in the case of point particles, a curve that violates Newton's laws does not minimize L. But there is no understanding at all, just saying "okay, it checks out" on a math level.

It doesn't help at all that on a math level, L isn't actually minimized but its derivative set to 0, which isn't even equivalent to "minimized or maximized". Why doesn't a single textbook explain why maximizing L is also okay when they first stated "minimized"? Or why derivative=0 is sufficient? As a reader, I always get the impression that "well, of course they cannot explain that, because they don't even know why L should be minimized in the first place". It's just all formulas that are easy to verify but don't convey a single bit of understanding.

Just for comparison, I found quantum mechanics based on the Schrödinger equation and the Hamiltonian rather easy to grasp, because every piece of it has an easy-to-understand meaning, that also gets explained really well. Why is this seemingly impossible for the Lagrangian?

pa7x13y ago

In science why questions are not answered within the framework that raises them, you need a larger encompassing theory that explains the phenomena within a larger scope to answer them. In this larger encompassing theory you can sometimes answer some previous why questions but possibly new why questions are uncovered.

In classical mechanics there is no explanation for the stationary action principle, it just is the procedure you follow to derive the equations of motion (it answers a how, not a why). You need Quantum Mechanics, in particular the path integral formulation of Quantum Mechanics, to answer why classical solutions correspond with action extrema.

Long-story short, in quantum mechanics all possible paths contribute equally to the probability of a particle going from A to B but they interfere with each other. It can be shown that most paths interfere destructively between themselves except when the action gets to a extreme (minima, maxima or saddle point) where constructive interference occurs.

Further reading:

https://en.wikipedia.org/wiki/Path_integral_formulation#Stat...

Maro3y ago

I don't know if this will help you, but I'll try to give you a possible way out. It's in line with Feynman's famous "shut up and calculate" advice on QM.

Field theory and QFT in themselves are mathematical frameworks, not physics; we can think of it as a piece of applied mathematics. It becomes physics when we plug in a specific Lagrangian, and then apply the framework, eg. we draw Feynman diagrams, regularize, renormalize, all that jazz, and we get eg. a cross-section out of it. So, when you ask "why should L be minimized?", the answer is, because this whole construction works. If you follow the complicated (and convoluted) playbook of QFT, you will be able to calculate physical quantities (like cross-sections, or the fine structure constant), and then when you do an experiment, you find that the numbers match.

This doesn't work for all Lagrangians. Most L(x, p, t) or L(phi, phi', ..) functions we come up with don't correspond to physics, and the numbers the framework emits do not line up with experiments.

You may be dissatisfied by this, this is a black-box picture. Over time physicists have developed a lot of intuition for parts of the theory, and come up with heuristic explanations what the parts mean. This is also how the whole thing was constructed, by analogy from Lagrangians in classical mechanics, where things can be reduced to Newton's equation of motion, which we know works. But in the end, the reason eg. L should be minimized, is because that's how nature is and that's what works, with specific Ls.

yccs273y ago

Starting from quantum mechanics, the Lagrangian describes the phase change per unit of time. For most evolutions of a system, similar evolutions have very different phases and cancel out. But when the Lagrangian is stationary (derivative 0), they interfere constructively. The stationary point is often a minimum, but it could also be a maximum or saddle point.

prof-dr-ir3y ago

Meh :) in classical mechanics one cares little about maximization versus minimization; it's really just (stationarity of L) <---> (equations of motion) that matters. Practically speaking it's just a cute gimmick to quickly find the equations of motion, especially for constrained systems.

In quantum mechanics everything is different, but for that you should study path integrals. To re-establish the connection with classical mechanics you have to learn about the saddle point approximation; it might also help to read Feynman's book called QED.

gpsx3y ago

I'm not exactly sure from what perspective you ask this question. As far as I know people misspeak when they say to minimize the lagrangian rather than find the stationary points (min max or maybe saddle). Applied to classical physics in the absence of quantum mechanics I don't think there is a good answer. It is just a rule. But, when combined with qunatum mechanics, it explains how classical physics is an approximation to quantum mechanics.

In the path integral formulation of quantum machanics, the evolution of a wave function from state 1 to a later state 2 can be thought of as occuring as a superposition of all possible paths between the two states, with each path contributing a factor of exp(iS) where S is the lagrangian. This means all paths either obeying classical physics or not.

In situations where classical physics is valid these contributions from different paths changes very quickly. The contributions from neighboring paths cancel each other out except at stationary points in the lagrangian, where there is a zero change between neighboring paths. Hence, we see classical physics are the trajectories that are the stationary points of the lagrangian.

nh23423fefe3y ago

I found this playlist informative when I was having the same thoughts

https://www.youtube.com/playlist?list=PL2ym2L69yzkamORF9DGWR...

It isn't L that is differentiated and set to zero. Instead the variation of the action is zero. So L isn't minimized or maximized, the variation of the action is zero. This implies that the solution path y(x) when varied by dy is a stationary point of the action. So for this path all nearby paths have the same action.

ok but then

> What is the Lagrangian and why should it be minimized

the form of the lagrangian is derivable from the d'alembert principle, principle of minimum potential, and then hamilton's principle.

It seems to me the principle is that real system behave in such a way that is characterized by hamiton's principle (the variation of the action is zero for real paths), and then we operationalize that principle by the calculus of variations to get real paths which have the properties established by the principles (use the Euler-lagrange equations to find paths for a particular system)

elashri3y ago

It is not minimum or maximum when you take derivative L = 0. You get sudden points which could be minimum or maximum or not. In cases of classical Lagrangian with V is the gravitional Field. It would be minimum. If you want to derive optics laws (snell/reflection) you will find that the path it not a minimum path. Actually you will find that the light will take the path that minimize the time it takes from point a to b.

For more complicated theorie. You always tty to start from one point (usually symmetric or equilibrium) and try to build your theory's Lagrangian. Usually this involve some inputs from experiments (i.e. Standard Model Lagrangian). If we are Lucky enough then the true theory wouldn't be too far.

zinclozenge3y ago

Unfortunately, it's not an intuitive idea and it's deeply rooted in the history and development of classical mechanics.

One way of looking at the Lagrangian is via the Legendre transform from the Hamiltonian, which in my opinion is highly intuitive since it's total energy, H = T + V. I'm not going to go through the whole explanation of the Legendre transform. Instead, I want to point out that in the Hamiltonian formulation, you can think of the independent variable as momentum, mv. When you transform to the Lagrangian, you're making, among other things, a change of variable to the velocity, v. If you have a velocity independent potential, then it only really matters in the kinetic energy, and the Lagrangian picks up a sign change due to the transform.

So in the end, it's recasting an intuitive idea into a more mathematically tractable form. The action starts making more sense in the path integral formulation of quantum mechanics.

lokedhs3y ago

In Susskinds book The Theoretical Minimum, he explains this and suggests that the word stationarising would be more accurate, as the goal is to make the lagrangian stationary and not minimal.

I haven't read any other textbooks on the topic so I don't know how common this is. I do suspect you're right though, as he made a point of explaining this

psychphysic3y ago

I believe he also goes over the legendre transform which lets you move between Hamiltonian and Lagrangian mechanics.

Just occured to me that the above poster may believe that QM is Hamiltonian only and classical is Lagrangian only. That'd be a forgivable but fatal mistake to understanding the subject!

Hamilton Jacobi equation may also help them remember QM is distinctly less intrusive than classical mechanics.

psychphysic3y ago

Physics does not answer why questions it's about modeling and predicting.

QM and esp. Schrödinger equation has way too many shortcomings for someone to claim it's easy to understand.

You're question about maximise, minimise and deritive set to 0 is a bit bizarre. Is it genuine? If so it's that it's about stationary points.

Koshkin3y ago

> Physics does not answer why questions

Well it does, but often it takes a more fundamental theory than where the question was posed.

On the other hand, an answer may involve logic and/or a mathematical derivation which is not so different from calculation, so where intuition ends and “shut up an calculate” begins is not all that clear and may depend on the level of expertise.

psychphysic3y ago

Since we don't (yet) have a fundamental theory it stands.

Not only that but the shut up and calculate mentality is still the most dominant one within physics.

Should string theory turn out to be the fundamental theory then it's possible the 'Why' will be that's the geometry of our universe. Which is not more for filing than tweaked constants.

Unfortunately no physics won't tell us Why. That's a question for philosophy.

Aardwolf3y ago

I would love to get intuition about this, but everytime I try to read about it I get lost in the math behind Lagrangians etc... and never get the intuition.

Specifically, I'd love the intuition behind why it must be so that if the laws of physics are time/position invariant, it must be impossible to create or destroy energy/momentum.

This because I can perfectly imagine a universe where you can create/destroy energy or momentum at will, yet have this work with the same physics at any time or position (e.g. a battlemage in some RPG able to cast fireball spells with heat and momentum out of nothing, where it works the same no matter at what position or point in time the battlemage is). So what about Lagrangians is preventing this, in an intuitively imaginable way?

pa7x13y ago

> Specifically, I'd love the intuition behind why it must be so that if the laws of physics are time/position invariant, it must be impossible to create or destroy energy/momentum.

The geometrical intuition is that the momentum operators are the generator of spatial translations (i.e. acting with the momentum operator P_x on a system is the same as applying an infinitesimal displacement in the direction x). https://en.wikipedia.org/wiki/Translation_operator_(quantum_...

And the Hamiltonian is the generator of time evolution (i.e. acting with the Hamiltonian on a system shifts it in time an infinitesimal amount). This is quite literally what the Schrödinger equation says, btw. H |Psi> ~ d/dt |Psi>

If the physics of a system (which are given by its Hamiltonian) are invariant under translations then it must be the case that a shift in time (Hamiltonian generates shifts in time) of the momentum (generates shifts in space) of a system is 0.

As the Hamiltonian gives us the energy of the system, if the system is invariant under time translations then its energy is conserved. Using the previous argument.

Rinse and repeat for any symmetry of the system. For instance, angular momentum operators are the generators of rotations. If your system is invariant under rotations, then it conserves angular momentum. Invariance under relative movement (relativistic invariance) gives conservation of center of mass. Etc... https://en.wikipedia.org/wiki/Symmetry_(physics)#Conservatio...

Cleonis3y ago

Specifically about the Lagrangian of Classical Mechanics (Hamilton's Action) I have discussed that on physics.stackexchange https://physics.stackexchange.com/a/670705/ The ideas are expressed in diagrams. What is expressed in the diagrams is repeated/corroborated in mathematical expressions (stackexchange supports Mathjax). The visualizations leave no room to get lost.

3170703y ago

I think there is the underlying notion that among any set of possible paths, it is always possible to find for each path a quantity for which a particular path is optimal.

That seems a bit trivial to state, but reasoning on the space of those quantities is more flexible. It doesn't mind the non-linearity between parameters and resulting paths as much.

So when you are trying to find the laws of physics, reasoning on that higher level space of quantities-that-will-get-minimized seems to simplify things.

Koshkin3y ago

Read The Lazy Universe by Coopersmith.

(Also, the Fermat’s Principle of least time is perhaps more intuitive and has a firm grounding in the wave description of how light propagates.)

dhoe3y ago

Imagine you find _the_ fundamental law of the universe, say "the gralb must be flombed". It's the fundamental law, so there's no other law explaining why it exists. All other laws follow from it and therefore feel understandable.

(This is mostly theoretical, as the explanation with phases cancelling in the sibling comment is satisfactory to me at least. But nevertheless there will be some law without a "more fundamental reason").

Cleonis3y ago

It is in fact possible to explain Hamilton's action within the context of classical mechanics.

On physics.stackexchange I have discussed that, in an answer posted in oktober 2021. That discussion is illustrated with animated GIF's. The animated GIF's are composed of successive screenshots of interactive diagrams that are on my own website.

https://physics.stackexchange.com/a/670705/

Stackexchange has mathjax support, and support for uploading images, that is why I refer to my post on physics.stackexchange

The following is to give you an idea of what I discuss.

We have that if F=ma is granted as axiom then the Work-Energy theorem follows as theorem.

(As we know: the derivation of the Work-Energy theorem is subject to the following condition: it is only applicable if it is possible to define an unambiguous expression for potential energy. In order to have a well-defined expression for potential energy the force that is involved needs to be a conservative force.)

The Work-Energy theorem implies the following: In the process of interconversion of potential energy and kinetic energy: the rate of change of kinetic energy always matches the rate of change of potential energy. (If the potential energy is decreasing then the kinetic energy is increasing at the same rate)

In terms of exploring a variation space of trial trajectories:

The true trajectory has the following properties: A property of derivative with respect to time: - At every point along the trajectory the derivative of the kinetic energy with respect to time matches the derivative of the potential energy with respect to time.

A property of derivative with respect to _position_: - At every point along the trajectory the derivative of the kinetic energy with respect to _position_ matches the derivative of the potential energy with respect to _position_.

I want to highlight this: In mathematical models that describe changes taking place we are accustomed to taking the derivative with respect to _time_.

But: When we are representing the physics taking place in terms of _Energy_ it is powerful to take the derivative with respect to _position_.

In classical mechanics: When you insert the Lagrangian in the Euler-Lagrange equation then the operation that the Euler-Lagrange equation performs is that it takes the derivative of the Lagrangian with respect to position.

You are looking for the point where the _derivative_ of the Lagrangian with respect to _position_ is zero.

when that derivative is zero the derivative-of-the-kinetic-energy-with-respect-to-position _matches_ the derivitive-of-the-potential-energy-with-respect-to-position.

For the concept of stationary action minimum or maximum is immaterial. Stationary action is about identifying the point in variation space such that at every point along the trajectory the derivative-of-the-kinetic-energy-with-respect-to-position matches the derivitive-of-the-potential-energy-with-respect-to-position

Finally: There is a concept that I will refer to as Jacob's lemma. (This concept was introduced by Jacob Bernoulli in the course of presenting his solution to the Brachistochrone problem. The Brachistochrone problem had been presented by Johann Bernoulli, as a challenge.)

Jacob's Lemma was stated decades before Euler started development of calculus of variations. Jacob's Lemma is crucial to understanding calculus of variations.

Take the Brachistochrone curve. Divide it in subsections. Then each subsection is in and of itself an instance of the Brachistochrone problem. This process of subdivision can be repeated indefinitely. In the end you have a concatenation of infinitissimally short subsections, and we know that each of those subsections is an instance of the Brachistochrone problem.

This tells us that a differential equation must exist that solves the Brachistochrone problem. (It does not narrow down what that differential equation is, but at least you have logical proof that it _must_ exist.)

In fact, Jacob Bernoulli succeeded in solving the Brachistochrone problem using a differential calculus approach.

The Euler-Lagrange equation takes the variational formulation, and converts it to differential equation form.

blueprint3y ago

The Lagrangian is a simple statement of the energy of a system. Have you heard of the relationship between energy and time? Have you heard of the principle of least time for photons? Have you considered that a "photon" is not a wave or particle of energy before it is detected, but rather the probability that one will be detected?

teleforce3y ago

This is an excellent book on the subject by Jakob Schwichtenberg [1].

[1]Physics From Symmetry:

http://physicsfromsymmetry.com/

mfnOP3y ago

Yes, that's an excellent book, along with his book on QFT.

I also can't recommend this course enough, Susskind has done a remarkably good job at making advanced physics concepts accessible: https://theoreticalminimum.com/

Also, "Symmetry and the Standard Model: Mathematics and Particle Physics" by Matthew Robinson does a great job of developing the group theory needed before diving into the physics.

adamnemecek3y ago

This idea shows up in essentially all scientific fields. It’s the idea of adjointness. Together with norm, they give you the idea of fixed points, (invariants, spectra, embeddings, braids etc).I'm

Lawvere's fixed point theorem is I think the best formulation of the idea

https://ncatlab.org/nlab/show/Lawvere%27s+fixed+point+theore...

I've been putting together a brain dump on the topic

https://github.com/adamnemecek/adjoint/

Join the discord https://discord.gg/mr9TAhpyBW

VirusNewbie3y ago

Wouldn’t adjointness be distinctly not “symmetry”, given it relies on a forgetful functor?

adamnemecek3y ago

The beauty is that under certain circumstances you can infer the lost things lost due to the forgetfulness of the forgetful function.

xeonmc3y ago

When being "here" and "there" is exactly the same, you move spontaneously.

When being "here" and "there" is almost-but-not-quite the same, you move easily.

The degree of "not-the-same-ness" is called the Lagrangian.

In other words, symmetry is "fungibility of states".

Things happen because the before and after is not very different, the transaction of different-ness is the energy involved.

In a classical system it's nigh impossible to encounter an "exactly the same" situation, because there are just too damn many participants to rule out every possible interaction.

In a quantum system you encounter "exactly the same" situations frequently because there is only a tiny number of participants interacting.

misja1113y ago

After reading to half way into the article, there is this disappointing text:

> It turns out that there are a range of particles in nature that exhibit this kind of symmetry. There isn’t any easy way to argue why this kind of symmetry must exist, but it does.

And then the article continues to show some implications and predictions that follow when one assume that this symmetry must exist. But I opened the article expecting to find the why ..

bigbacaloa3y ago

When we don't have symmetry we don't know what to do and can't compute anything. Fortunately many situations can be modeled as near a symmetric one. We solve the symmetric one and study it's asymmetric perturbations.

The prevalence of symmetry reflects our inability to do anything in its absence.

psychphysic3y ago

That's one way to consider it.

The truth is there are many ways to skin a cat and we've found quite effective ways to do it.

The real cause of the unreasonable effectiveness of mathematics is humans ability to constantly rephrase tasks within the wheelhouse of our mathematical tools.

Eventually... There's long periods of humans being stuck and then getting unstuck and history compresses that to appear like constant smooth progression.

But you're right, many people have a bizarre view of the world painted by the Schrödinger equation. That the world is made up for snap shots of fixed particle number defined energy states. Really peculiar if you think about it and quite clearly incorrect (we know it doesn't apply to 'collpase' which is really the way the world is experienced by us). And compared to QFT.

ranger2073y ago

I don't like to be critical, but I've been wanting to understand symmetry in physics for a while, so here's a few points of confusion I have

> The principle, then, is that the particles and fields that were used to build up the theory will move in a way that minimizes or maximizes the sum of L over the path taken by the system

The next couple of examples only minimize the Lagrangian; are there any systems in this article that maximize it?

> a collection of objects and a recipe to build a Lagrangian from those objects, with the movement of those objects determined by a path that minimizes the Lagrangian

What in this case are the objects? Just particles? Is mass in the first example (of classical motion) an object? I'm trying to figure out what kind of objects to use to build an equation, or basically, what type (in the programming sense) an object is

> This is a surprising fact

Why? Are there other theories, maybe from earlier in the development of physics, that used a different approach?

> We could use particles as the building blocks, and represent each particle by its position and velocity. However, fields turn out to be a much more useful way of representing the way particles behave. A field ϕ(x,t) is a function that takes a point in spacetime and spits something out for each point.

I assume that in this passage "building blocks" is equivalent to "objects" in the last passage? Why are fields more useful? Is there an example of what using particles as objects would look like? In particular, a field looks to me like a function; if you used particles as an object, would you represent a particle as a function using its position and velocity? Would that function have time as a parameter like fields do? Typing that out I can kind of see why you'd use fields

> For example, a field could take a position (in spacetime) and spit out a number (which could be real or complex)

What does the output represent? Anything in particular? If not, then it seems like you could define the field function to be anything since the output doesn't represent anything, then when you feed the field function into the Lagrangian eventually you'd get massively different results

> The simplest Lagrangian we can write for this field is: L=δtϕ⋆δtϕ

How is the Lagrangian constructed? This "simplest" Lagrangian is the derivative of the field with respect to time, along with the derivative of the field's complex conjugate with respect to time, but how'd you know to do that? What makes this the simplest possible Lagrangian? Calling this the "simplest Lagrangian" hints that there are other equally valid ways to create a Lagrangian; is that correct? What are the rules for that? Why would you make a more complex Lagrangian?

> One interesting observation about Lagrangians is that any term of the form V(ϕ) represents potential energy.

What is V(ϕ)? My initial assumption would be velocity, but how do you take velocity of a field? Actually, I can see what they're doing: velocity of a particle is the derivative of it's position with respect to time, so I guess V(ϕ) aka velocity of the field is the derivative of the field with respect to time. That could've stood to have been spelled out

> Since there are no time or space derivatives involved

Ok I guess V(ϕ) doesn't represent the velocity of the field. I've got no clue what it is

> Now let’s take this a step further. In the previous example, we rotated the field at all points by some angle. But why do we need to rotate the field the same way everywhere? What if we measure things at one point with one coordinate system, but measure them at a different point using a different coordinate system that’s rotated. Although it’s hard to imagine why anyone would want to do this, one would still expect that this shouldn’t affect the actual physics predicted by the theory... Multiplying these together, we see that the Lagrangian is different. This is not what we wanted - rotating the complex plane in has affected the results of our theory.

I have no idea what's going on here. Why would you measure different points of the field with different coordinate systems and expect sensical results? I'm imagining a surveyor walking in a line starting from the origin: he takes a measurement at the origin, then at (1,0), then at (2,0), then at (3,0), etc. (Imagine that the underlying field is frozen in time so we aren't dealing with the Lagrangian yet.) Since we know the field equations we can predict what he'll measure at each of those points in the line.

But if the coordinate system changes with every step, he's still moving in a straight line as seen from a bird flying overhead, but at his first step he's at (1,0), then at the next step (2,0) turns into (5,1), then at the next step (6,1) (aka (3,0)) turns into (12,-3), etc, because the coordinate system changes each step. It's still (1,0),(2,0),(3,0) if you measure in the original coordinate system. But the underlying field wouldn't change in that case. Sure, if you put (5,1) into the field equation you'll get a different result than if you put in (2,0), but if you're only changing the coordinate system then that has to be compensated for in the field equation itself and you're not going to get different results for the same physical point. I mean, you should get the same result if you do f((2,0), coordinate system a) as if you did f((5,1), coordinate system b)

Edit: I think the core of my confusion is that in order for f((2,0), coordinate system a) to equal f((5,1), coordinate system b) then you need knowledge of how the coordinate system changes, and I don't see how that gets incorporated into the function

> Note that the issue here is that when we take the derivative of the field, we get an extra iϕδθδt term proportional to the derivative

Proportional to the derivative of what?

> This property - that a theory is not affected by changing some symmetry parameter throughout spacetime - is called gauge invariance

Why is it called that? I assume someone chose that name because it made sense to them for good reasons

> Also note that this new term, iAϕ, looks like a potential from the perspective of our field, with V(ϕ)=iAϕ

There's V(ϕ) again. I still don't know what it represents

> our theory now predicts some type of force involved in the interaction between our field and this new field

Wait, "our field" and "the new field"? What fields are those? We were talking about a field defined by the function ϕ(x,t) and thinking about its Lagrangian. We added a term A to the Lagrangian and that was it. What's the "new field"? Why does ϕ(x,t) have an interaction with it? Is A the new field?

> This mechanism of introducing an additional field to make an existing theory gauge invariant is exactly what gives rise to photons in the Standard Model! ‘Rotations’ of the electron field correspond to an additional field, called a gauge field, that behaves exactly the way photons do.

Ok, I guess A is a new field. I can see how it arises, but I'm not sure how you actually get to it

> This derivative doesn’t really make sense if we aren’t using the same coordinate system everywhere

you don't say

> The way we measure our field at x is different than the way we measure it at x+δx, so to get the actual difference, we need to make the field comparable by fixing it up before subtracting it

What is "the way we're measuring it"? I think it's, basically, the coordinate system of the surveyor changes each step, so that's a different "way" of measuring it? I still don't see why changing the coordinate system makes new stuff pop out of the equation

> As a first step, we can expand it: W(x,x+δx)=1−iδxA(x)+O(δx2)

How do you expand it? Are you giving the definition of W(x, x+δx)? How'd you get that? What's O(δx^2)?

> A group is a set of elements associated with some operation... What’s important here is that these two sets - the set of rotations, with the operation being composition, and the set of 1×1 complex matrices with the operation being multiplication - have the exact same behavior

Where'd the second set come from? Wait, I see, it's just saying that you can say that "multiplying a number by a 1x1 matrix" is the same thing as saying "you can compose a number with a rotation". It's literally the same thing, just said in a less clear manner. Does the new terminology get us anything useful?

> U(1) invariance of the electron field gives rise to the photon field. > SU(2) invariance across lepton fields (such as the electron and electron neutrino) leads to W+, W−, and Z bosons. SU(2) has two generators, so there are three gauge bosons. > SU(3) invariance across quark fields leads to eight gluons, since SU(3) has eight generators.

What? How do you know how many generators there are? Why does SU(2) have two generators but three gauge bosons?

> It’s remarkable how the observation that an equation doesn’t change under some operation, which seems quite trivial, can have deep consequences, dictating the nature of forces and interactions in the theory.

Yeah, I think the part I'm not getting is how changing coordinate systems affects the equation. I think I can see that if you insist on doing something ridiculous like this you'd need some math to correct for it and if the new correction functions are fields then it looks like new particles popping out, but I don't see how that doesn't result in an infinite number of new particles. Like, I can add a function f(x) = x^2 to the Lagrangian, then a g(x) = -x^2 to compensate for it, but those don't represent new particles do they? Why do those cancel out but A doesn't? I just don't see how changing coordinate systems results in different results

Despite my questions, I think I have a better idea of what's going on. You have a function; it should spit out the same numbers when you rotate it; you need a function to correct for the rotation; in physics the new function looks like a particle. I can kinda sorta see how it works now. Thanks for the article!

Edit: Ok, I think I've narrowed down my confusion to the θ(t). I can see that if you want to measure the same (x,y) over time as the coordinate systems change even though that (x,y) represents a different physical point every t, then you'd need to take the change in coordinate system over time into account in the derivative. But I'm not sure how that would be useful, nor how that would result in new physics over the case of a fixed θ

mfnOP3y ago

(1/2) Hey thanks for the in depth review!

> The next couple of examples only minimize the Lagrangian; are there any systems in this article that maximize it?

So it's not really relevant whether we minimize or maximize it - AFAIK, the action principle states that any path that maximizes OR minimizes the Lagrangian would be a path that the system could take. I'm actually not sure why it so happens that there's always a unique path in the theories that physicists use - I'm guessing that the argument could be that a Lagrangian that gives multiple paths would be unphysical? Not sure here.

> What in this case are the objects? Just particles? Is mass in the first example (of classical motion) an object? I'm trying to figure out what kind of objects to use to build an equation, or basically, what type (in the programming sense) an object is

I'm using objects as a superclass/base class (in the programming sense) of both fields and particles. In classical mechanics you have two things - particles, where particles are described by their position (along with derivatives). You also have fields (electric, magnetic) that are described by the amplitude of the field at all points in space (along with derivatives).

> Why? Are there other theories, maybe from earlier in the development of physics, that used a different approach?

It's surprising at first glance since it's hard to imagine an intuitive reason for why every theory we've developed, classical mechanics, quantum mechanics, quantum field theory - can be reformulated in a way where some function L is being minimized or maximized. I suppose there could be theories that can't be framed in such a way, but AFAIK all theories 'in use' can.

> I assume that in this passage "building blocks" is equivalent to "objects" in the last passage? Why are fields more useful? Is there an example of what using particles as objects would look like? In particular, a field looks to me like a function; if you used particles as an object, would you represent a particle as a function using its position and velocity? Would that function have time as a parameter like fields do? Typing that out I can kind of see why you'd use fields

This is a really good point and something I definitely should clarify. So this jump from particles to "everything is a field" happens when we jump from classical mechanics to quantum mechanics (and quantum field theory). In quantum mechanics you never really know the 'position' of a particle anymore, it's not defined - a particle is represented by a probability distribution over all space, which is where its field nature comes in - you're assigning a probability density to each point in space.

This is definitely something that I hand waved away for simplicity, but can definitely see how this is confusing.

> What does the output represent? Anything in particular? If not, then it seems like you could define the field function to be anything since the output doesn't represent anything, then when you feed the field function into the Lagrangian eventually you'd get massively different results

So what the output is is completely up to you, as the 'author' of the theory. If the field is a complex number representing the probability density of the particle, then you'll end up with a theory of scalar particles, such as the Higgs boson. If the field represents spinors (vector like objects that transform differently), then you get electrons. If the field spits out vectors, then you get a theory of the electromagnetic field.

You bring up an interesting point - we can have fields with arbitrarily exotic objects, so which ones do we use? I believe this just comes down to experiment.

> How is the Lagrangian constructed? This "simplest" Lagrangian is the derivative of the field with respect to time, along with the derivative of the field's complex conjugate with respect to time, but how'd you know to do that? What makes this the simplest possible Lagrangian? Calling this the "simplest Lagrangian" hints that there are other equally valid ways to create a Lagrangian; is that correct? What are the rules for that? Why would you make a more complex Lagrangian?

Another really good point. Schwichtenberg's books flesh out the argument in more detail, but you're right that there are many ways to create a Lagrangian, in principle, any theory you come up with that follows the action principle can be (by definition) expressed by a Lagrangian. L can be whatever you want.

Now, why this particular choice of L? One argument is - let's assume that L can be expanded out in a series. In that series we'll have terms with time derivatives of the field, and terms that involve just the field. This is where the 'simple' part comes in - we're chopping off all time derivatives except the first derivative, multiplied by nothing else. The only other consideration is that the term with the time derivative needs to be of even order, because otherwise you will not have a stable theory. So this leaves the 'simplest' time derivative terms as (d/dt)(phi) squared, or multiplied by the complex conjugate.

The key thing to note is that there aren't really precise rules the conclusively dictate exactly what the Lagrangian must be. The Standard Model uses a set of terms, and there are some heuristic reasons for why those terms are used (Lorentz invariance being an imporant one), but at the end of the day - you can build any Lagrangian you want for your theory. The one used in this piece is just a simple Lagrangian has enough 'interestingness' that it somewhat reproduces the effects of how symmetry influences the actual, full scale Standard Model Lagrangian.

> What is V(ϕ)? My initial assumption would be velocity, but how do you take velocity of a field? Actually, I can see what they're doing: velocity of a particle is the derivative of it's position with respect to time, so I guess V(ϕ) aka velocity of the field is the derivative of the field with respect to time. That could've stood to have been spelled out

Apologies for the confusion here - this just represents any function that doesn't depend on the time derivative of phi. It's called potential energy since it depends on the value of the field, so where it 'is', not how it's changing.

andreareina3y ago

Geodesics ("shortest path" trajectories) in spacetime maximize proper time.

mfnOP3y ago

(2/2) > I have no idea what's going on here. Why would you measure different points of the field with different coordinate systems and expect sensical results? I'm imagining a surveyor walking in a line starting from the origin: he takes a measurement at the origin, then at (1,0), then at (2,0), then at (3,0), etc. (Imagine that the underlying field is frozen in time so we aren't dealing with the Lagrangian yet.) Since we know the field equations we can predict what he'll measure at each of those points in the line.

> But if the coordinate system changes with every step, he's still moving in a straight line as seen from a bird flying overhead, but at his first step he's at (1,0), then at the next step (2,0) turns into (5,1), then at the next step (6,1) (aka (3,0)) turns into (12,-3), etc, because the coordinate system changes each step. It's still (1,0),(2,0),(3,0) if you measure in the original coordinate system. But the underlying field wouldn't change in that case. Sure, if you put (5,1) into the field equation you'll get a different result than if you put in (2,0), but if you're only changing the coordinate system then that has to be compensated for in the field equation itself and you're not going to get different results for the same physical point. I mean, you should get the same result if you do f((2,0), coordinate system a) as if you did f((5,1), coordinate system b)

So I guess a simpler way to see this is to note that changing coordinates should never affect the results of a physical theory. For example, if I measure things with the origin at x = 0, but you use x = 5 - all our measurements of position will differ by 5 units. But when we apply the equations of motion to some object we're both looking at - F = ma, our predictions of how the object will move will agree. My predictions will be the same as yours, but the positions will differ by 5 units. This is because the equation of motion, F = ma, does not care about translations in the coordinate system, the acceleration is the second derivative. If we had an equation like F = ma + x, then we would no longer be coordinate invariant, and the equation would be unphysical. It wouldn't make sense, you could look at the system in a different way (i.e. using different coordinates) and get completely different results.

> Why is it called that? I assume someone chose that name because it made sense to them for good reasons

This I'm not sure about - I haven't actually seen the reason mentioned anywhere, other than just an indication that it's for some historical reason.

> Wait, "our field" and "the new field"? What fields are those? We were talking about a field defined by the function ϕ(x,t) and thinking about its Lagrangian. We added a term A to the Lagrangian and that was it. What's the "new field"? Why does ϕ(x,t) have an interaction with it? Is A the new field?

Yup! A is the 'new field' - the terminology could be clearer here. So by requiring that our toy Lagrangian with phi be gauge invariant, we now have a need to introduce another field, A, and way this field appears in the Lagrangian (by multiplying by phi) is something that will end up acting like a force.

> What is "the way we're measuring it"? I think it's, basically, the coordinate system of the surveyor changes each step, so that's a different "way" of measuring it? I still don't see why changing the coordinate system makes new stuff pop out of the equation

Yes - by way of measuring it I mean that we are using a different coordinate system at each point, just to see what the effects of doing so are. And this is the magic bit - if we don't have this other field A, then the equation will produce different results depending on which coordinate system you use. So any theory without A will not consistently give the same results regardless of coordinate system.

> How do you expand it? Are you giving the definition of W(x, x+δx)? How'd you get that? What's O(δx^2)?

Power series expansion. So we are defining W(x, y) as a function that allows us to compare the field at x and y. We don't know what this function is, we just assume it exists. We then expand it out in powers of delta x. Since we are eventually going to take the limit, any higher order term - that involves delta x squared or higher - will be too small to matter, so we just care about the constant term and the linear term.

> Where'd the second set come from? Wait, I see, it's just saying that you can say that "multiplying a number by a 1x1 matrix" is the same thing as saying "you can compose a number with a rotation". It's literally the same thing, just said in a less clear manner. Does the new terminology get us anything useful?

Once we make the connection between a symmetry of our Lagrangian and some abstract group like SU(3), we can immediately bring in group theoretic results about that group. For example, since we know (from group theory) that SU(3) has eight generators, we can now use that result and infer that we need eight gauge bosons to make the theory gauge invariant.

> What? How do you know how many generators there are? Why does SU(2) have two generators but three gauge bosons?

Typo - will fix! Should be three generators.

> Yeah, I think the part I'm not getting is how changing coordinate systems affects the equation. I think I can see that if you insist on doing something ridiculous like this you'd need some math to correct for it and if the new correction functions are fields then it looks like new particles popping out, but I don't see how that doesn't result in an infinite number of new particles. Like, I can add a function f(x) = x^2 to the Lagrangian, then a g(x) = -x^2 to compensate for it, but those don't represent new particles do they? Why do those cancel out but A doesn't? I just don't see how changing coordinate systems results in different results

So anytime something gets added to the Lagrangian, you effectively have a new theory - it's a proposal. The point here is that you don't really have infinite flexibility in adding things to the Lagrangian - if you add complex scalar fields, then you must also add gauge bosons. So symmetry doesn't constrain everything - you still need to figure out what the Lagrangian should be, but it'll force you to add other fields as well to make things gauge invariant.

> Despite my questions, I think I have a better idea of what's going on. You have a function; it should spit out the same numbers when you rotate it; you need a function to correct for the rotation; in physics the new function looks like a particle. I can kinda sorta see how it works now. Thanks for the article!

Appreciate the thorough review! Lots of things that I should have been more thorough about - I will fix :) Thanks again.

ranger2073y ago

Thanks again for the article! I did learn a lot from it, and I really appreciate your time answering my questions

nyc1113y ago

“Let’s say we have some complex field [. . .] Just to keep things as simple as possible, let’s say that the field is only a function of time.”

This is not simplifying things. First they assume that spacetime is not space and time separately but a different entity then reduce this entity to time only. Then we are not dealing with spacetime. This is like assuming a cube then reducing the cube to one dimension but calling the line a cube. It’s not.

mfnOP3y ago

Yeah I wasn't sure how to set things up there - if I kept a single space dimension + a time dimension, then I'd have to explain the negative sign on one of the terms, and probably also talk about the Einstein summation convention to keep things clean. Whereas with a single time dimension, it's not really 'spacetime' as you pointed out.

What motivated this post was that I wanted to give a concrete example of what it really means for some symmetry to 'dictate' the structure of a physical theory, but do so in the simplest way possible - i.e. not deal with spinors, gamma matrices, quantum fields - and the rest of the actual machinery of the standard model. The core idea is so profound that I felt like there has to be a way to get a taste of it across in a way that's accessible.

Turned out to be a lot harder than I thought - I had to skip quite a few steps in the post to keep it from becoming too long, but I'm hoping the model still conveys the essence of how a symmetry + action principle can 'predict' particles.

nyc1113y ago

I think what you did is standard. But what I question is, if we can solve our problem without assuming spacetime, why do we need the abstraction called spacetime? Spacetime looks like a historical quirk that physicists feel obligated to carry.

For instance, bending of the light experiment is not done in spacetime but in space and time.

kryptiskt3y ago

But you can't rip them apart in our physical theories, in relativity different observers will have different ideas about space and time, but will agree about certain invariants if you take both space and time into account. Then a few years later, Minkowski formulated special relativity very elegantly using a four dimensional space-time. That view is basic to general relativity, where the foundation is the spacetime metric and the energy-momentum tensor.

drran3y ago

Spacetime is not a physical thing. It is a 4D array of measurements: `[x,y,z;t]`, a collection of frames in a simulation. It's important to simplify calculations, to predict something, but it connected to reality in same way as height field (a 2D array of height measurements) connected to Earth. Height field is just numbers, while Earth is a planet full of rock, water, sand, air, etc. You cannot study geology by studying height fields.

nyc1113y ago

"Spacetime is not a physical thing."

But physicists shamlessly reify spacetime. How can something that is said to have a "fabric" not physical? How can something that expands not be physical.

At this point physicists enter into semantics and ask, "it all depends on how you define 'physical'". And indeed, physicists heavily use casuistry, and they have a definition for every case.

To me, if spacetime is not physical, then, it does not exist. Similarly, I don't think such mathematical abtractions as "point particles" exist. Physicists do not worry about existence. They only care to get a prediction.

r0uv3n3y ago

Spacetime has a notion of distance between its different points attached to it. These distances are important for physics. When people say that spacetime "stretches" or "expand", they mean that these distances change.

psychoslave3y ago

Let me guess, because symmetry is far more easier to handle from a human cognitive point of view?

Koshkin3y ago

Symmetry is “handled” by Group Theory which is rather involved.

behnamoh3y ago

I wonder how much of this can be ELI5'ed with diagrams and intuitions instead of algebra.

jiggawatts3y ago

Most of it.

Or conversely, a lot of it can be explained without having to resort to category theory or similarly dense terminology.

I've found that Mathematicians like to come up with the most general, tersest definitions possible, where every symbol is overloaded with layer upon layer of meaning.

You end up with language that looks like:

    a = b c

For some absurd percentage of the statements. Sometimes the equal sign is an arrow, and the various symbols have superscripts and subscripts on them, but all meaning is lost unless you read the text, which then becomes an exponentially expanding set of hyperlinks to definitions that you need to unpack the definitions.

This is never useful as a method of pedagogy, yet this is about the only type of content you will ever find online in places like Wikipedia or nLab.

Any attempt to clarify with an example is resisted, because it's not "general", or "not the definition" of the concept. In the end, all practicality is erased, leaving definitions so pure that they could be referring to almost anything.

gizmo6863y ago

At the extreme end of this you get category theory, with Mathematicians affectionately refer to as "abstract nonsense".

I checked the homepage of nLab and found this description [0]:

> This is a wiki for collaborative work on Mathematics, Physics, and Philosophy — especially, but far from exclusively, from the n-point of view: with a sympathy towards the tools and perspective of higher algebra, homotopy theory, type theory, category theory and higher category theory.

And following to the description of n-point of view:

> In particular, the nLab has a particular point of view, which we may call the nPOV, the higher algebraic, homotopical, or n- categorical point of view.

So, nLab seems to have made a very deliberate choice to cater to the style you are talking about.

I think your complaint is more valid for Wikipedia, which seems to want to be more general purpose. However it is not specific to math. Most articles on Wikipedia have a tendency to assume a high level of subject knowledge on the part of the reader.

The math community is well aware of the pedagogical concerns. You don't see category theory until grad school, where it is taught with the assumption that you are well versed in many concrete instances of categories.

You don't even see abstract algebra until the later mart of a undergrad math major. Again, it is taught with the assumption that you are already well versed in several concrete instances of the algebraic structures.

If you get a math textbook, you will find that they almost never take the most generic approach; and instead tend to take the most concrete approach that would cover the subject matter.

[0] https://ncatlab.org/nlab/show/HomePage

galaxyLogic3y ago

I would like to see math and physics presentations always having source-code which would do the calculations. Then the explanation for theory could refer directly to explaining that computer-program. You could understand it all simply by understanding the programming language used. You could debug through it and see how the calculations happen step by step.

drran3y ago

Equality is very important concept in mathematics, because it allows to manipulate formulas by rules. When formulas are manipulated, it's important to keep them as small as possible, because it's hard to manipulate a wall of text. Instead of short sequence of formulas, which you can write and verify quickly, you will have lengthy books, like written by ancient mathematics.

For example, command line tools, which are used often, also have cryptic one-symbol options, like `tar xzf`, which looks like formulas, for same reason: to be able write, manipulate, and check them quickly. However, most command tools are distributed with a built-in help or manual for these cryptic options, while most formulas are not. IMHO, we need something like «formulapedia», with manuals for most popular formulas.

paulpauper3y ago

Because to expand the whole expression, such as the Einstein field equations, would fills many lines or pages.

teaearlgraycold3y ago

As someone that also doesn't like the status quo - I would love for there to be a better mathematical notation. If we could get some kind of functional programming-like syntax it would be nice and consistent.

Also what's up with the acceptance of single letter variables in math?

4 more replies

thereIsazrzn3y ago

When math “started” paper and writing utensils were harder to acquire. One had to transmit idea’s succinctly.

Still it would take many more books to capture it all in English than a simpler glyph set. One will still go further in math learning the meaning of the glyphs than reading it all explained in “human language”.

It’s not so hard when one accepts math is the four common operations (add, sub, div, mult) on a variety of data structures though. “Square root of” is a shorter coded language for “use numeric result of this sequence of operations against this given number” (the data structure).

Math isn’t hard. Government mandate we teach students in speaking traditions instead of educate them to measure for themselves has resulted in piss poor education in math.

photochemsyn3y ago

Feynman's Caltech lectures (Symmetry in Physical Law plus a few others) tackle the problem in this manner (at first) before going into vector analysis and vector algebra. For example with respect to rotation:

> "Another example in which the laws are not symmetrical, that we know quite well, is this: a system in rotation at a uniform angular velocity does not give the same apparent laws as one that is not rotating. If we make an experiment and then put everything in a space ship and have the space ship spinning in empty space, all alone at a constant angular velocity, the apparatus will not work the same way because, as we know, things inside the equipment will be thrown to the outside, and so on, by the centrifugal or Coriolis forces, etc. In fact, we can tell that the earth is rotating by using a so-called Foucault pendulum, without looking outside."

https://www.feynmanlectures.caltech.edu/I_52.html

irjustin3y ago

PBS Spacetime has a very good video[0] that helps visualize what's going on in the equation, recent too!

https://www.youtube.com/watch?v=PHiyQID7SBs

nyc1113y ago

Great video, thanks. But I don’t understand why they call this thing “beautiful”. To me it looks as ugly as tax law.

j / k navigate · click thread line to collapse

60 comments

moring3y ago