Short of large, completely government-funded development projects, I think it would be a struggle to get data. Few businesses would be willing to offer up their development processes and the surrounding data due to potential IP loss. Any organization that has good processes has an interest in stifling others from discovering how to improve theirs or learning from the success of others.
Part of the issue is that academia largely just doesn't pay for software development that it could leverage as an accessible, cheap data source; it's done as a completely privatized exercise. Any research that requires highly protected commercial processes is pretty difficult to get any traction on unless you're inadvertently rediscovering the same processes (from my experience).
With that said, I feel like the amount of empirical data you'd need is going to be incredibly high, much of it not even currently being collected.