RoboSat: feature extraction from aerial and satellite imagery

(github.com)

80 points · danieljh · 7y ago · 15 comments

danieljh (OP) 7y ago

Daniel from Mapbox here. Happy to answer questions or talk through design decisions. Interested to hear your feedback.

We are mostly focusing on making the process accessible to a broader audience in the geo space, building a solid production-ready end-to-end project.

There are more resources and a step-by-step guide for running on openly available drone imagery in Tanzania:

https://www.openstreetmap.org/user/daniel-j-h/diary/44145
https://www.openstreetmap.org/user/daniel-j-h/diary/44321

throwawaymath 7y ago

Would you be open to chatting over email? I'm working on a noncommercial software project with a lot of geospatial data and I think you (and more generally folks working at Mapbox) could provide useful technical insight. My email is in my profile if you're up for it :)

smallhands 7y ago

Congratulations to you and the team. I have just one question: in the world of TensorFlow.js, how do I run this project in a browser? I was hoping to use this project to introduce high school students to data science.

danieljh (OP) 7y ago

I have no experience with TensorFlow.js. That said, using the RoboSat ONNX model exporter (rs export) you should be able to go from a trained PyTorch model to a portable ONNX protobuf, then from there to a TensorFlow model, and eventually to TensorFlow.js. At least that's how I would approach it. Keep me posted if you look into it and get it working; it's an interesting use case for sure.


nl 7y ago

Have you released pre-trained models?

It would be pretty useful if you did, even just as a basis for transfer learning.

Also, could you describe the model that is used? I assume this is the code [1], which references https://arxiv.org/abs/1806.00844, but the code doesn't seem to use WideResnet (although I really know Keras much better than PyTorch, so I'm probably missing something).

[1] https://github.com/mapbox/robosat/blob/master/robosat/unet.p...

danieljh (OP) 7y ago

We haven't released pre-trained models yet, mostly for two reasons.

1/ The PyTorch checkpoints depend on the specific Python model class. Even if you only refactor, e.g., a MaxPool layer into a direct functional.max_pool function call, loading old checkpoints will no longer work. We now have an ONNX model exporter (rs export) which produces self-contained, portable protobuf files holding both the model and its weights. This workflow needs some more time and careful evaluation, though.

2/ I can certainly open up the models for Tanzania that I worked on in my spare time. If there is community interest, maybe we can come up with a publicly available model catalogue hosting ONNX models and metadata, where folks can easily upload and download models. For our internal models and the data we extract, we are thinking through a broader strategy, since a lot of time and resources go into creating and cleaning datasets, doing hard-negative mining, running multiple training iterations, and so on. They're also bound to the Mapbox aerial imagery at specific zoom levels.

The model architecture is kept simple on purpose. It used to be an encoder-decoder U-Net'ish architecture which we trained from scratch. Recently (https://github.com/mapbox/robosat/pull/46) I switched out the encoder to a pre-trained ResNet, as proposed by Alexander Buslaev. It's a mix of the papers listed in the docstring at the top with a focus on simplicity and maintainability:

https://github.com/mapbox/robosat/blob/1e687552fe9b254a14d55...

Internally we were also exploring a multi-class PSPNet but decided not to move forward with it right now: the RoboSat model is currently a binary model (feature vs. background) which makes a few things easier in practice, such as efficiently storing results which is needed when scaling it up e.g. to all of North America.

nl 7y ago

Personally, I always find pre-trained models very, very useful.

If I'm working in a new domain (which this is to me) then I prefer to get the workflow right (files in the right directories etc) before changing the NN architecture. It's a pretty big time investment to train a NN just to try it.

Atanahel 7y ago

Interestingly, we've taken the same approach to processing historical documents (like 18th-century Venetian manuscripts).

We even use a U-Net architecture with a pretrained ResNet-50 encoder, and some post-processing to go from probability maps to polygons, like this project does. Of course, we are much more limited than what you propose, but it is reassuring that our side project took the same course as what bigger entities do.

https://dhlab-epfl.github.io/dhSegment/
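For readers unfamiliar with the "probability maps to polygons" step mentioned above, here is a minimal sketch of the first half of that idea: threshold a per-pixel probability map into a binary mask, then group foreground pixels into connected regions. It is a simplified stand-in, not the post-processing used by RoboSat or dhSegment; the function name `extract_regions`, the 0.5 threshold, and the use of bounding boxes instead of true polygons are all assumptions made for illustration.

```python
from collections import deque


def extract_regions(prob_map, threshold=0.5):
    """Threshold a probability map and return one bounding box per region.

    Regions are 4-connected groups of pixels with probability >= threshold.
    Boxes are (min_row, min_col, max_row, max_col), inclusive.
    """
    rows = len(prob_map)
    cols = len(prob_map[0]) if rows else 0
    mask = [[p >= threshold for p in row] for row in prob_map]
    seen = [[False] * cols for _ in range(rows)]
    boxes = []

    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over the region containing (r, c).
            queue = deque([(r, c)])
            seen[r][c] = True
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((r0, c0, r1, c1))
    return boxes


# Hypothetical 3x4 probability map with two separate foreground blobs.
prob_map = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.2, 0.1, 0.6],
    [0.0, 0.1, 0.2, 0.9],
]
boxes = extract_regions(prob_map)
print(boxes)
```

A real pipeline would typically trace each region's contour and simplify it into a polygon rather than stopping at a bounding box.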

rishabhj_says 7y ago

Hi Daniel, in your experience, what features is this model best suited for, and at what imagery resolution? For example, buildings/roads with Landsat (30 m)? Cars with 30 cm resolution imagery?

danieljh (OP) 7y ago

We are running it internally on our aerial imagery from the Mapbox Maps API. The zoom level even there depends on the feature you want to extract, for example z18 seems to work well for parking lots.

There is not a single feature this model is most suited for: you can add arbitrary features (e.g. tennis courts, swimming pools) in pre-processing and train your model. Then the imagery quality depends on your feature, for example it will be hard to impossible to spot swimming pools in Landsat imagery.
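To put zoom levels and sensor resolutions on the same scale, the standard Web Mercator ground-resolution formula can be used. The sketch below assumes 256-pixel tiles (512-pixel tiles halve the result) and a spherical Web Mercator projection; the function name `ground_resolution` is made up for this example and is not part of RoboSat. Under those assumptions z18 works out to roughly 0.6 m per pixel at the equator, compared with 30 m per pixel for Landsat.

```python
import math

# Equatorial circumference of the WGS84 ellipsoid in metres.
EARTH_CIRCUMFERENCE_M = 40075016.686


def ground_resolution(zoom, latitude_deg=0.0, tile_size=256):
    """Approximate metres per pixel of a Web Mercator tile at a zoom level."""
    metres_at_latitude = EARTH_CIRCUMFERENCE_M * math.cos(math.radians(latitude_deg))
    return metres_at_latitude / (tile_size * 2 ** zoom)


for zoom in (14, 16, 18):
    print(f"z{zoom}: {ground_resolution(zoom):.2f} m/px at the equator")
```

Resolution improves (fewer metres per pixel) away from the equator, so the same zoom level shows slightly more detail at higher latitudes.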

alexcnwy 7y ago

Very cool! Does anyone know any good global free satellite imagery datasets?

danieljh (OP) 7y ago

Sentinel-2 data could be interesting for you. With its high revisit rate it can be used for change detection. The multi-band data is especially useful for adding more input channels to the model, e.g. for water detection based on NDWI: https://apps.sentinel-hub.com/eo-browser/#lat=52.51747&lng=1...
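As a rough illustration of the NDWI idea mentioned above: the index is commonly defined as (green - NIR) / (green + NIR), which for Sentinel-2 usually means bands B03 and B08. The sketch below works on plain nested lists so it has no dependencies; the reflectance values are made up, and the "greater than zero means water" threshold is a common convention rather than anything RoboSat prescribes.

```python
def ndwi(green, nir):
    """Per-pixel NDWI = (green - nir) / (green + nir).

    Both inputs are 2D lists of equal shape. Pixels where both bands are
    zero get an index of 0.0 to avoid dividing by zero.
    """
    result = []
    for green_row, nir_row in zip(green, nir):
        row = []
        for g, n in zip(green_row, nir_row):
            total = g + n
            row.append((g - n) / total if total else 0.0)
        result.append(row)
    return result


# Hypothetical 2x2 reflectance values: top row water-like, bottom row vegetation-like.
green = [[0.10, 0.08], [0.06, 0.05]]
nir = [[0.02, 0.03], [0.30, 0.25]]

index = ndwi(green, nir)
water_mask = [[value > 0.0 for value in row] for row in index]
print(water_mask)
```

The resulting index (or mask) could then be stacked as an additional input channel next to the RGB bands.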

I can also recommend http://openaerialmap.org for playing around with openly available drone imagery. I did this last week as an evening side project, using drone imagery provided by http://www.zanzibarmapping.com. Here is a step-by-step guide for giving it a try on the Tanzania drone imagery: https://www.openstreetmap.org/user/daniel-j-h/diary/44321

throwawaymath 7y ago

Sure, the NOAA has extensive and freely available satellite imagery data: https://www.ncdc.noaa.gov/data-access/satellite-data/satelli...

NASA also releases satellite imagery for the public.

alexnewman 7y ago

Planet.com
