> Each point on the Market Map represents a distinct company traded on the NYSE or NASDAQ and positioned according to a series of market metrics such as the Market Capitalization, the Price to Earnings ratio, EBITDA, and others.
(As a practical tool, not so much…)
- For the tour, I notice that pressing Esc exits it. It would be nice if there were a close button somewhere as well, or at least some way to let the user know that they can press Esc. Currently, someone using the mouse can only exit using the left (Previous) button of the first slide or the right (Start Exploring) button of the last slide.
- Clicking into some nodes and then Reset View seems to spin the view many more times than necessary. Not sure if that's a bug or by design, but I would prefer it to use the least camera movement necessary.
Overall this was really cool to see. I have been wanting to do something similar but based on price action correlation instead of fundamentals. Actually I have just launched a related feature this week called Similar Charts.
If you are interested you can see that here: https://base.report/ticker/ACLS/similar-charts. Please note that the free version only shows you tickers that start with the letter A. The paid version shows you 50 matches.
Er, actually it is possible if you click one of the (unlabeled) neighbors, then zoom back in and hover around to find the original company again. But whether this works seems to depend on which color scheme is active.
In general, most interactions force you to zoom out and lose your place, and sometimes it just keeps rotating for no apparent reason.
NuScale Power (SMR) is categorized as "Life Sciences / Agricultural Production-crops", which makes me question the rest of the data.
The color dimension is configurable, so it makes sense.
But I don't get what the 3 dimensions of spatial coordinates are, and how to change them.
--- Edit
Alright, my bad, it's actually explained in the "tour": https://pair-code.github.io/understanding-umap/
It seems to be some kind of multivariate PCA.
No, UMAP is nonlinear. The general idea is that you generate a neighborhood graph of your data points, do a spectral embedding on that to get your initial result, and then do gradient descent to make its neighborhood graph closer to the high-dimensional one.
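To make "neighborhood graph closer to the high-dimensional one" concrete, here is a rough sketch. It is not UMAP itself (the real objective is a fuzzy cross-entropy, and there is no spectral init or gradient descent here); it only measures how many of each point's k nearest neighbors survive in a low-dimensional embedding, which is a crude proxy for what the optimization is after. The function names and the toy data are made up for illustration.

```python
import math


def knn(points, k):
    """For each point, return the set of indices of its k nearest neighbors."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        result.append({j for _, j in dists[:k]})
    return result


def neighborhood_overlap(high, low, k):
    """Average fraction of high-dimensional neighbors kept in the embedding."""
    hi_nn, lo_nn = knn(high, k), knn(low, k)
    kept = sum(len(a & b) for a, b in zip(hi_nn, lo_nn))
    return kept / (k * len(high))


# Toy data: two tight clusters in 4 dimensions (hypothetical values).
high = [
    (0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0),
    (10, 10, 10, 10), (11, 10, 10, 10), (10, 11, 10, 10),
]
# An embedding that keeps the clusters together...
low_good = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
# ...and one that assigns the same 2d positions to the wrong points.
low_bad = [(0, 0), (10, 10), (1, 0), (11, 10), (0, 1), (10, 11)]

good_score = neighborhood_overlap(high, low_good, k=2)
bad_score = neighborhood_overlap(high, low_bad, k=2)
print(good_score, bad_score)
```

A layout step that pushes this kind of score up is, very loosely, what the gradient descent phase is doing.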
I've experimented with zillions of 3d graphing layouts, usually in the context of PDM/ERP for manufacturing/logistics. A couple of roadblocks I've encountered are also obstacles here (although he does a MUCH better job than I did in overcoming them, including dynamic distance between nodes, which I can't get away with, sad to say).
First is perspective scaling (often loosely called parallax): things appear larger when they are closer to the observer. What this means is that node size CAN'T be significant in a 3d network unless the projection is set to orthographic / isometric, because perspective will screw with it. How can you tell if the node is actually larger, or if it's just closer?
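The size ambiguity is easy to show with a toy pinhole camera. This is a minimal sketch, assuming a simple model where on-screen size is radius times focal length divided by depth; `projected_radius` and its parameters are my own names, not anything from the Market Map code.

```python
def projected_radius(radius, depth, focal_length=1.0, orthographic=False):
    """On-screen radius of a node at `depth` in front of the camera."""
    if orthographic:
        # Orthographic projection: depth has no effect on apparent size.
        return radius
    # Pinhole perspective: apparent size shrinks in proportion to depth.
    return radius * focal_length / depth


# A small node up close and a node twice as big, twice as far away.
small_near = projected_radius(radius=1.0, depth=2.0)
large_far = projected_radius(radius=2.0, depth=4.0)

# The same two nodes through an orthographic camera.
small_ortho = projected_radius(radius=1.0, depth=2.0, orthographic=True)
large_ortho = projected_radius(radius=2.0, depth=4.0, orthographic=True)

print(small_near, large_far)      # identical on screen under perspective
print(small_ortho, large_ortho)   # true sizes preserved
```

Under perspective the two nodes render at the same size, so size alone can't carry a data value; under orthographic projection it can.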
Second interesting thing about 3d networks is how the (Levenshtein or whatever param) distance resolves in 3d space, and how that's going to be legible given that we don't have a fourth dimension to stick a camera in. On a 2d surface, the distance-driven force resolves in a 2d vector, so that looking down on it from above, no matter where the force vectors go, all the nodes will be theoretically visible. If you just plot plain distance as a force into 3 dimensions, just using geodesic or straight-line distance, the most tightly gathered nodes will disappear, i.e., be completely occluded. You won't see them!
One possible resolution for this problem, I've found, is classification of distance and assigning this class / category to a specific axis. For example, X axis can be time, Y axis can be a single vector (like, say, military / civilian adoption of a particular dual use part number, expressed as n), and Z axis can represent actual "real" distance (based on tokens, references, "where used", and whatever other factors, either all of them or some of them). This gives you structure where dimensions in the data viz are immediately significant, and simple isometric distance doesn't pile all the nodes in front of each other because they share the same space as the audience.
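Roughly what that axis assignment could look like, as a sketch under stated assumptions: the `Part` record, its fields, and the sample parts are hypothetical, and Jaccard distance over token sets is just a stand-in for whatever "real" distance you actually compute.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    name: str
    year: int          # X axis: time
    adoption: float    # Y axis: the single metric n
    tokens: frozenset  # input to the Z axis "real" distance


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; a stand-in for the real distance measure."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def layout(parts, reference):
    """Map each part to (x, y, z) with one meaning per axis."""
    return {
        p.name: (
            float(p.year),
            p.adoption,
            jaccard_distance(p.tokens, reference.tokens),
        )
        for p in parts
    }


# Hypothetical sample data.
bolt_a = Part("bolt-A", 2019, 0.2, frozenset({"steel", "m8", "hex"}))
bolt_b = Part("bolt-B", 2021, 0.7, frozenset({"steel", "m8", "torx"}))
gasket = Part("gasket", 2021, 0.7, frozenset({"rubber"}))

coords = layout([bolt_a, bolt_b, gasket], reference=bolt_a)
for name, xyz in coords.items():
    print(name, xyz)
```

Note that bolt-B and gasket share the same X and Y but still separate along Z, so they don't stack on top of each other.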
The takeaway here is that a 3d graph can't just use the same parameters as a 2d graph. The data has to be summarized differently so that the graph remains meaningful. Nothing WRONG with just dumping distance into straight 3d distance, but from the perspective of visual storytelling, it's not optimal.
Also, use isometric cameras. Sure, it's very pretty to have a camera swoop and dive, but it's not going to tell the data's story as well as an isometric camera. (Yes I know I am misusing "isometric" here, but it's the word most people recognize).
Actually the camera here is isometric! It doesn't look that way but it is.
Or for 3 dimensional data I like bubble charts like this: https://mezziapp.com/dashboard/?id=8qFZ3RPiLKTCK3uk4gmG
This actually seems like a potentially great application for LLMs – generating a semantic description of an n-dimensional correlation.
Would you consider sharing anything from the stack, data and tech side?
And on the backend it's a simple Node.js Express server. Data-wise, I can't share the exact APIs I'm using but they're easily searchable.
what it does is move points with a small cosine distance close to each other (in a stochastic, globally-on-average sort of way). a lot of the clusters and other formations are artifacts of the graph layout method more so than anything else. this has been extensively studied.
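For anyone unfamiliar with cosine distance, a quick sketch of why it matters here, assuming the metric really is cosine as stated above. The three vectors are made-up "fundamentals", not real company data.

```python
import math


def cosine_distance(u, v):
    """1 - cos(angle between u and v); ignores vector magnitude."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical metric vectors: same proportions at 10x the scale,
# and one pointing in an orthogonal direction.
small_co = (1.0, 2.0, 3.0)
big_co = (10.0, 20.0, 30.0)
other_co = (3.0, 0.0, -1.0)

same_shape = cosine_distance(small_co, big_co)
orthogonal = cosine_distance(small_co, other_co)
print(same_shape, orthogonal)
```

Two companies with the same proportions between metrics land at (nearly) zero distance regardless of absolute size, which is one reason to be careful about reading meaning into the clusters.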
Analyzing the Direct Correlation between Federal Reserve's Reverse Repo Operations and S&P 500 Stock Prices
https://github.com/adam-s/fred-reverse-repo-analysis/blob/ma...