Moreover, dot product throughput is limited by the memory read throughput.
A matrix-matrix product is best implemented with tensor products of vectors, because each such product is composed of independent operations, so their latencies can be hidden. Moreover, a tensor product requires a number of multiplications equal to the product of the sizes of the operands, but a number of loads from memory equal to only the sum of the sizes of the operands.
With enough registers to store the matrix result, it is easy to ensure that the product of the operand sizes is greater than the sum of the operand sizes, so that the throughput of the memory reads does not limit the attainable performance.
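To make the counting argument concrete, here is a minimal sketch in plain Python. The function name `matmul_rank1` and the load/multiply counters are my own illustration, not from any library, and the counters assume one load per operand element per step while the result tile stays in registers.

```python
def matmul_rank1(A, B):
    """Compute C = A * B as a sum of tensor (rank-1) products.

    A is m x k and B is k x n, both given as lists of lists.
    Returns (C, loads, mults). The counters are a simplified model:
    one load per operand element per step, result kept in "registers".
    """
    m, k, n = len(A), len(B), len(B[0])
    C = [[0.0] * n for _ in range(m)]  # stands in for the register tile
    loads = 0
    mults = 0
    for p in range(k):
        col = [A[i][p] for i in range(m)]  # m loads: column p of A
        row = B[p]                         # n loads: row p of B
        loads += m + n
        for i in range(m):
            for j in range(n):
                # m*n multiply-adds, none of which depends on another
                C[i][j] += col[i] * row[j]
        mults += m * n
    return C, loads, mults
```

With a 4 x 4 result tile, each step in this model does 16 multiplications for 8 loads, and the ratio keeps improving as the tile grows.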
Where a matrix-matrix product is done by a single instruction, it normally also uses tensor products. Only when both the input operands and the result are held in registers could the instruction also be implemented with AXPY operations (whose fused multiply-add operations are also independent), but not with dot products (whose dependent FMAs prevent pipelining).
"AXPY" is a name that comes from the BLAS library and it refers to an operation fundamental in linear algebra, "scalar A times vector X Plus vector Y". There are many cases when it is possible to choose between AXPY and scalar products. AXPY is normally the right choice, because it is composed of independent FMAs, which can be interleaved and pipelined.
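A small sketch of the difference, in plain Python (the function names `axpy` and `dot` are just illustrative, not the BLAS interface). Python itself will not pipeline anything; the point is only the shape of the data dependencies.

```python
def axpy(a, x, y):
    """Return a*x + y elementwise.

    Each output element depends only on its own inputs, so the
    multiply-adds could be issued in any order or overlapped.
    """
    return [a * xi + yi for xi, yi in zip(x, y)]


def dot(x, y):
    """Return the scalar product of x and y.

    Every step reads the accumulator written by the previous step,
    so the multiply-adds form one serial dependency chain.
    """
    acc = 0.0
    for xi, yi in zip(x, y):
        acc = acc + xi * yi  # needs the previous value of acc
    return acc
```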
Anyway, this is basic stuff, but it's nice to see it formulated!
For a matrix-vector product, where there are only 2 nested loops, only 2 of these choices are applicable (AXPY operation of 2 vectors and scalar product of 2 vectors).
For a vector-vector product, where there is a single loop, only 1 of these choices is applicable (scalar product of 2 vectors).
For numeric computations, where possible, AXPY is preferable to scalar product, and where possible, tensor product is preferable to AXPY. Therefore matrix-vector products should be done with AXPY and matrix-matrix products should be done with tensor products of vectors.
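For the matrix-vector case, the two loop orders can be sketched like this in plain Python (the names `matvec_dot` and `matvec_axpy` are hypothetical, and A is assumed to be a list of rows). Both compute the same result; only the inner loop differs.

```python
def matvec_dot(A, x):
    """y = A x, with a scalar product of row i and x in the inner loop."""
    y = []
    for row in A:
        acc = 0.0
        for a, xj in zip(row, x):
            acc += a * xj  # each step depends on the previous acc
        y.append(acc)
    return y


def matvec_axpy(A, x):
    """y = A x, with an AXPY (y += x[j] * column j) in the inner loop."""
    m = len(A)
    y = [0.0] * m
    for j, xj in enumerate(x):
        for i in range(m):
            y[i] += xj * A[i][j]  # independent across i
    return y
```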
I strongly dislike the misuse of the term "outer product" for the tensor product of 2 vectors, like in the parent article, because this is ambiguous. The original definition of the outer product of 2 vectors (due to Grassmann) is related to the so-called vector product of 2 vectors and "outer" refers to the fact that one vector is multiplied by the component of the other vector that is orthogonal to it, so it points outwards. The tensor product, whose value is a matrix, is completely different. For 2 vector operands, there are 3 distinct products, the first is inner a.k.a. scalar, the second is outer product or (only in 3 dimensions) its variant "vector product" and the third is the tensor product.
The name "tensor product" is also historically and semantically incorrect, but at least it is not ambiguous.
The "tensor product" should have been named "Zehfuss product", after the mathematician who defined it in 1858 (about 60 years before the word "tensor" had any relationship with it). "Tensor" was originally a name for symmetric matrices, which correspond to affine transformations where a body is stretched in the directions of certain axes. The name was introduced by Hamilton, together with scalar (similarity affine transformation), vector (translation affine transformation) and versor (rotation affine transformation). For unknown reasons Einstein chose to use "tensor" with the meaning of "multi-dimensional array", breaking with the tradition, and then everybody imitated him (due to the huge popularity of the Theory of Relativity among non-mathematicians and non-physicists after the end of WWI, which prompted some book editors to insert the word "tensor" everywhere in several mathematics books and advertise them as being useful for understanding the theory of Einstein).
For example there is no linear map which maps a line (or segment) through the origin to a parallel line (or segment of the same length) that doesn't pass through the origin, even though these are clearly just the same object shifted around a bit.
A much more natural set of operations is (IMO) the affine transformations since then I can move things around as you expect. I find dealing with linear maps of lines or circles or polygons a bit unintuitive.
That's why god (well, projective geometers) made homogeneous coordinates. Without them geometry isn't much fun when using linear algebra (as you have way too many special cases).
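A minimal sketch of the idea for 2D points, in plain Python (the helper names `translation` and `apply` are made up for illustration). A point (x, y) is written as (x, y, 1), and a translation then becomes an ordinary 3 x 3 matrix product.

```python
def translation(tx, ty):
    """3x3 matrix that shifts a 2D point by (tx, ty) in homogeneous form."""
    return [
        [1.0, 0.0, tx],
        [0.0, 1.0, ty],
        [0.0, 0.0, 1.0],
    ]


def apply(M, p):
    """Apply a 3x3 matrix M to the 2D point p = (x, y)."""
    h = [p[0], p[1], 1.0]  # lift to homogeneous coordinates
    out = [sum(M[i][j] * h[j] for j in range(3)) for i in range(3)]
    # divide by the last coordinate to return to ordinary 2D
    return (out[0] / out[2], out[1] / out[2])
```

Here `apply(translation(2, 3), (0, 0))` moves the origin to (2, 3), which no 2 x 2 linear map can do.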
If you already know how to work with matrices you can probably understand it, but otherwise I'm quite sure that you would not.
https://github.com/kenjihiranabe/The-Art-of-Linear-Algebra/b...
- If you understand it, it's a nice visualization but you don't learn anything new
- If you don't, you won't understand it.
Going one click in brings up a paywall, with no pricing. You need to give up your email to get a price.
This feels like a not-very-good business model. This would make a lot more sense as either:
- A fully-baked business model, competitive with other paid resources
- An open-source project on github.
To be a fully-baked business model, it would need:
- Enough teaser content to get people hooked and for people to be able to reshare content
- Things to do (e.g. writing Python code), a place to do it (e.g. an online repl, like most other similar systems), and ways to evaluate it for correctness.
- Clear marketing / branding copy (who did it? what's the privacy policy? what's it cost? etc.)
As an open-source thing, it could slot into a community of similar projects which fill those gaps. It has very nice interactives, but it takes a "tell" rather than a "do" approach, which is helpful in context, but isn't adequate for learning by itself.