AllConv networks perform better and are an even simpler modification of the typical CNN architecture: just replace the max-pooling layers with strided convolutions. Why did the authors not benchmark against the state of the art (see also fractional max-pooling)?
https://arxiv.org/abs/1412.6806
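
To make the modification concrete, here is a minimal NumPy sketch (single channel, no padding, not the AllConv authors' code) showing that a stride-2 convolution downsamples a feature map to the same shape as 2x2 max pooling. The function names and the averaging kernel are illustrative assumptions; in an actual network the kernel weights would be learned.

```python
import numpy as np


def max_pool2d(x, size=2, stride=2):
    """Max pooling over a single-channel 2D array, no padding."""
    h, w = x.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            window = x[i * stride:i * stride + size, j * stride:j * stride + size]
            out[i, j] = window.max()
    return out


def strided_conv2d(x, kernel, stride=2):
    """'Valid' cross-correlation of a 2D array with a 2D kernel at the given stride."""
    h, w = x.shape
    kh, kw = kernel.shape
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    out = np.empty((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            window = x[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(window * kernel)
    return out


x = np.arange(16, dtype=np.float64).reshape(4, 4)

# Hypothetical fixed weights (a plain average); a trained layer would learn these.
kernel = np.full((2, 2), 0.25)

pooled = max_pool2d(x)             # fixed, parameter-free downsampling
conved = strided_conv2d(x, kernel)  # downsampling with (potentially learnable) weights

print(pooled.shape, conved.shape)
```

Both operations halve each spatial dimension; the difference is that the strided convolution's downsampling is parameterized rather than fixed.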