The graph in the article can be seen as a factor graph. VSLAM systems usually maintain a (roughly bipartite) factor graph whose variables are either keyframes or 'good' features, with factors connecting features to the frames that see them, and connecting adjacent frames. This structure results in very large but sparse graphs, and factor graph optimization libraries such as g2o or GTSAM take advantage of this sparsity. These libraries also use specialized optimization techniques for some of the nonlinear manifolds (e.g. SO(3)) that arise in SLAM problems.
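A minimal illustration of that structure, using a linear 1-D toy problem instead of real SE(3) poses (all factors and values here are invented for the sketch; real systems use g2o/GTSAM, not raw numpy):

```python
import numpy as np

# Toy 1-D "SLAM": variables = [x0, x1, l] (two poses, one landmark).
# Each row of J is one factor's Jacobian; note the sparsity — every
# factor touches at most two variables, which is what the libraries exploit.
J = np.array([
    [ 1.0,  0.0, 0.0],   # prior factor anchoring x0
    [-1.0,  1.0, 0.0],   # odometry factor between x0 and x1
    [-1.0,  0.0, 1.0],   # observation of landmark l from x0
    [ 0.0, -1.0, 1.0],   # observation of landmark l from x1
])
z = np.array([0.0, 1.0, 2.0, 1.0])  # measurements: x0=0, x1-x0=1, l-x0=2, l-x1=1

# Least-squares solution via the (sparse) normal equations J^T J x = J^T z.
x = np.linalg.solve(J.T @ J, J.T @ z)
print(x)  # ≈ [0, 1, 2]: poses and landmark consistent with all factors
```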
It only reinforces that I really need to learn my matrix math.
Reminded me of the video tracking work these folks do: https://forensic-architecture.org/
30+ years ago, I had friends reconstructing crime scenes for court proceedings. Architectural drawings and 3D scenes. They used AutoCAD and AutoSolid (?). Showing stuff like blood and ballistics.
Super effective. They turned my stomach.
I don't have words for these Forensic Architecture recreations. I almost feel like I'm there (present).
I can only imagine their future VR recreations will be overpowering.
For example for MSL (Curiosity, the previous rover) the EDL CK and SPK provides orientation and position data if I'm interpreting this description right: https://naif.jpl.nasa.gov/pub/naif/pds/data/msl-m-spice-6-v1...
The downside being that it'll probably take 6-12 months until the data is made public
(EDL = entry, descent, landing; IMU = inertial measurement unit; PDS = planetary data system)
On Earth we're used to being able to use GPS for route planning. If you could run this process in reverse to constantly determine one's position in 3D space above the surface, using stored satellite imagery with a downward-facing camera, cross-referenced with whatever gyro/accelerometer based positioning they're using, I wonder if there'd be any benefit. Maybe what they've got already is sufficient for anything you'd want to do in the near future.
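A toy sketch of that kind of cross-referencing: an absolute (but noisy) camera/map fix keeps a drifting dead-reckoned estimate bounded. The complementary-filter blend and every number here are invented for illustration, not anything Mars 2020 actually uses:

```python
import numpy as np

rng = np.random.default_rng(0)

true_pos = 0.0
est = 0.0
alpha = 0.9  # made-up weight: trust in dead reckoning vs. the camera fix

for step in range(100):
    true_pos += 1.0                                # vehicle moves 1 unit/step
    imu_delta = 1.0 + rng.normal(0, 0.05) + 0.02   # noisy, biased increment (drifts)
    camera_fix = true_pos + rng.normal(0, 0.5)     # absolute but noisy map-matched fix
    est = alpha * (est + imu_delta) + (1 - alpha) * camera_fix

print(abs(est - true_pos))  # error stays bounded instead of growing with time
```

Integrating `imu_delta` alone would accumulate the 0.02 bias every step; the occasional absolute fix is what keeps the error from growing without bound.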
As a benefit, the transformation could use the images after projecting them onto a deformable mesh that models the hills etc.
A next step could be to leave the already-projected images where they are and only draw over them, marking the latest frame with a border. Eventually, sections covered by multiple frames could be used for multi-frame superresolution.
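The superresolution idea can be sketched in 1-D: frames offset by a subpixel amount sample detail that any single frame misses. This toy skips registration and deblurring, which real multi-frame superresolution also needs:

```python
import numpy as np

# Two low-res frames of the same "scene", offset by one high-res pixel,
# interleaved onto a grid with twice the resolution (shift-and-add idea).
hi = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])  # "true" scene
frame_a = hi[0::2]   # low-res frame captured at offset 0
frame_b = hi[1::2]   # same scene, shifted by half a low-res pixel

recon = np.empty_like(hi)
recon[0::2] = frame_a
recon[1::2] = frame_b
print(np.array_equal(recon, hi))  # True: the offsets recover the lost detail
```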
Extra kudos to the author for not calling the work done in Torch "learning".
https://docs.opencv.org/3.4/dc/d6b/group__video__track.html#...
Thanks
The landing ellipse for Perseverance was 7.7km by 6.6km. The goal is to land at a safe spot within the ellipse rather than land at a specific location.
The new Terrain Relative Navigation capability determines the rover's position relative to the surface during descent by comparing camera images to onboard satellite imagery. On Earth you'd use GPS. No GPS on Mars.
Once the rover knows its position, it can determine the safest spot to land using an onboard hazard map. The spot it chose to land at and the spot it actually landed at were 5 meters apart.
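A toy version of that map-matching step: slide the "descent camera" patch over a stored "satellite" map and pick the offset with the highest normalized cross-correlation. Real TRN matches landmark features against the onboard map rather than doing an exhaustive NCC search; everything below is invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

map_img = rng.random((40, 40))          # stored "satellite" map
true_r, true_c = 12, 23                 # where the camera view really is
patch = map_img[true_r:true_r+8, true_c:true_c+8] + rng.normal(0, 0.05, (8, 8))

def ncc(a, b):
    """Normalized cross-correlation of two equal-size patches."""
    a = a - a.mean()
    b = b - b.mean()
    return (a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

scores = np.array([[ncc(map_img[r:r+8, c:c+8], patch)
                    for c in range(33)] for r in range(33)])
best = np.unravel_index(scores.argmax(), scores.shape)
print(best)  # should recover (12, 23), the patch's true location in the map
```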
To add a bit more info, poorly remembered from this excellent We Martians episode[0] interviewing Swati Mohan, who is the Mars 2020 Guidance, Navigation and Controls Operations Lead and was the voice of the landing. Go listen to it!
On the way down an image is taken. Using data about how the atmospheric entry is going, and with a lot of constraints that include the hazard map and what kinds of manoeuvres are possible with the descent system (in particular it does a divert, and there are minimum and maximum distances the divert must lie between), a single pixel is chosen from that image to aim for. That pixel represents a 10m x 10m square, and the rover landed within 5m of that square.
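That selection step can be sketched as a constrained argmin over a hazard map: keep only pixels whose divert distance lies in the allowed range, then take the safest one. All values here are invented; the real constraint set is much richer than a distance ring:

```python
import numpy as np

rng = np.random.default_rng(2)

hazard = rng.random((50, 50))    # higher = more dangerous (made-up map)
current = np.array([25, 25])     # current projected landing pixel
d_min, d_max = 5.0, 15.0         # allowed divert range, in pixels (invented)

rows, cols = np.indices(hazard.shape)
dist = np.hypot(rows - current[0], cols - current[1])
feasible = (dist >= d_min) & (dist <= d_max)

# Mask out unreachable pixels, then pick the minimum-hazard cell.
masked = np.where(feasible, hazard, np.inf)
target = np.unravel_index(masked.argmin(), hazard.shape)
print(target, hazard[target])  # safest cell the divert can actually reach
```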
The hazard map is created from images with a 1m x 1m resolution, from one of the orbiters (Mars Reconnaissance Orbiter I think). Those images are scaled down for the hazard map, as the on-board image processing had very tight bounds on how long it could search for a valid landing site. The podcast goes into some cool detail about that whole system and its technical design.
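A sketch of one way to scale down a fine hazard map conservatively: taking the block maximum means a coarse cell is only as safe as its worst pixel. Whether the actual system uses max, mean, or something else isn't stated above; the 4x block factor is arbitrary:

```python
import numpy as np

fine = np.random.default_rng(3).random((16, 16))   # 1 m per pixel, higher = riskier
block = 4                                          # invented downscale factor

# Reshape into (row blocks, rows in block, col blocks, cols in block),
# then take the max over each block.
coarse = fine.reshape(16 // block, block, 16 // block, block).max(axis=(1, 3))
print(coarse.shape)  # (4, 4): each coarse cell now covers a 4 m x 4 m area
```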
0: https://wemartians.com/podcasts/94-guiding-perseverance-to-t...
Pershing II missiles had radar correlation guidance back in the '80s.
An obvious consequence of Google maps imagery and open source is that a capable college student can make an optical terminal guidance unit out of a mobile phone.
The yellow oval is the target landing zone, though it looks like it's a bit too tall on this map compared to other sources.
You can see its landing targets within the oval here: https://www.jpl.nasa.gov/images/jezeros-hazard-map
So it looks like it landed a little over 1km from the center of the oval, if that's your question.
When precisely talking about space travel, things tend to be discussed as "nominal" instead of being on target or correct. This is because some variance is expected, and systems are designed to work successfully within that variance. In that sense, Perseverance landed within the landing oval and on a safe landing spot, so it was 0 meters away from target.
An analogy would be that it hit the bullseye and got the points, even if it wasn't exactly in the middle of the dart board.
https://arstechnica.com/science/2019/10/heres-an-example-of-...