We are mostly focusing on making the process accessible to a broader audience in the geo space, building a solid production-ready end-to-end project.
There are more resources and a step-by-step guide for running on openly available drone imagery in Tanzania:
https://www.openstreetmap.org/user/daniel-j-h/diary/44145 https://www.openstreetmap.org/user/daniel-j-h/diary/44321
It would be pretty useful if you did, even just as a basis for transfer learning.
Also, is there a description of the model that is used? I assume this is the code[1], which references https://arxiv.org/abs/1806.00844, but the code doesn't seem to use WideResnet (although I know Keras much better than PyTorch, so I'm probably missing something).
[1] https://github.com/mapbox/robosat/blob/master/robosat/unet.p...
The model architecture is kept simple on purpose. It used to be an encoder-decoder U-Net'ish architecture which we trained from scratch. Recently (https://github.com/mapbox/robosat/pull/46) I switched out the encoder to a pre-trained ResNet, as proposed by Alexander Buslaev. It's a mix of the papers listed in the docstring at the top with a focus on simplicity and maintainability:
https://github.com/mapbox/robosat/blob/1e687552fe9b254a14d55...
Internally we were also exploring a multi-class PSPNet but decided not to move forward with it right now. The RoboSat model is currently a binary model (feature vs. background), which makes a few things easier in practice, such as storing results efficiently. That matters when scaling up, e.g. to all of North America.
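To illustrate why a binary output is cheap to store, here is a minimal sketch, not RoboSat's actual storage format: it thresholds per-pixel probabilities and packs the resulting mask at one bit per pixel. The function names `pack_mask` and `unpack_mask` and the 0.5 threshold are assumptions made up for this example.

```python
def pack_mask(probs, threshold=0.5):
    """Threshold a flat, row-major list of probabilities and pack 8 pixels per byte.

    Hypothetical helper for illustration; the threshold is an assumed default.
    """
    packed = bytearray((len(probs) + 7) // 8)
    for i, p in enumerate(probs):
        if p >= threshold:
            # Most significant bit first within each byte.
            packed[i // 8] |= 1 << (7 - i % 8)
    return bytes(packed)


def unpack_mask(packed, n):
    """Recover the first n mask values (0 or 1) from the packed bytes."""
    return [(packed[i // 8] >> (7 - i % 8)) & 1 for i in range(n)]
```

A 512x512 tile packed this way takes 32 KiB before any compression, versus 1 MiB for the same tile stored as 32-bit floats. A multi-class output would need more bits per pixel.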
If I'm working in a new domain (which this is to me) then I prefer to get the workflow right (files in the right directories etc) before changing the NN architecture. It's a pretty big time investment to train a NN just to try it.
We even use a Unet architecture with a pretrained resnet50 encoder, and some postprocessing to go from prob maps to polygons, like this project does. Of course, we are much more limited than what you propose, but it is reassuring that our side project took the same course as what bigger entities do.
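For readers new to this step, here is a rough sketch of going from a probability map to per-feature geometry. It is not the code of either project: it assumes a plain threshold followed by 4-connected component grouping, and it emits only a bounding box per blob, where a real pipeline would trace and simplify actual polygon outlines. The name `extract_boxes` and the 0.5 threshold are made up for the example.

```python
from collections import deque


def extract_boxes(probs, threshold=0.5):
    """Return one (min_row, min_col, max_row, max_col) box per 4-connected blob.

    probs is a 2D list of per-pixel probabilities. Hypothetical helper;
    bounding boxes stand in for real polygon extraction.
    """
    rows, cols = len(probs), len(probs[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or probs[r][c] < threshold:
                continue
            # Breadth-first flood fill over neighbouring foreground pixels.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and probs[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((r0, c0, r1, c1))
    return boxes
```

Boxes come back in row-major order of each blob's first pixel. Mapping pixel coordinates to longitude/latitude depends on the tile scheme and is left out here.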
There is not a single feature this model is most suited for: you can add arbitrary features (e.g. tennis courts, swimming pools) in pre-processing and train your model. The imagery quality you need then depends on your feature; for example, it will be hard or even impossible to spot swimming pools in Landsat imagery.
I can also recommend http://openaerialmap.org for playing around with openly available drone imagery. I did this last week as a side-project on my evenings for drone imagery provided by http://www.zanzibarmapping.com. Here is a step-by-step guide for giving it a try on the Tanzania drone imagery: https://www.openstreetmap.org/user/daniel-j-h/diary/44321
NASA also releases satellite imagery for the public.