I work at a physical commodity operation where we move around 8-10 million metric tons of goods CFR. As part of this, we are looking to digitize and automate many things in the company, which I will not bore you with. However, we are also looking to step up our S&D (supply and demand) analysis by incorporating various types of big data analysis, e.g. satellite data. There are some SaaS providers that each offer about 25% of what we are looking for via API, so we would need to work with multiple vendors. I am therefore faced with a choice:
1. Either we buy from four different suppliers and solve our data needs quite fast, or…
2. We buy more raw data and invest in building the algorithms ourselves.
There is a data science team of two people whose calendar is already 80-90% full. Therefore I am trying to review the pros and cons of building in-house:
Pros:
- Full flexibility
- I own the algorithms
- New knowledge created in the organization, which could lead to more innovation
- No risk of vendor lock-in
Cons:
- Higher costs
- A lot of man-hours spent
- Unknown time frame for implementation
- We need extra people to maintain the algorithms
I estimate that the absolute minimum cost of data for building will be around USD 100,000 (without extra labor costs), versus USD 150,000 for buying off the shelf.
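To make those two numbers easier to compare, here is a minimal Python sketch of the break-even arithmetic. Only the two cost figures come from my estimate above; the hourly rates are hypothetical placeholders, the function names are made up for illustration, and the sketch ignores whether the costs are one-off or recurring, as well as ongoing maintenance.

```python
# Figures from the estimate above (USD); everything else is a placeholder assumption.
BUILD_DATA_COST = 100_000  # minimum raw-data cost when building in-house
BUY_COST = 150_000         # off-the-shelf cost across the vendors


def breakeven_labor_budget(build_data_cost: float, buy_cost: float) -> float:
    """Labor spend at which building costs exactly as much as buying."""
    return buy_cost - build_data_cost


def breakeven_hours(labor_budget: float, hourly_rate: float) -> float:
    """Hours of work the labor budget covers at a given loaded hourly rate."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return labor_budget / hourly_rate


budget = breakeven_labor_budget(BUILD_DATA_COST, BUY_COST)
print(f"Labor budget before building costs more than buying: USD {budget:,.0f}")

# Hypothetical loaded hourly rates (USD), for illustration only.
for rate in (50, 100, 150):
    hours = breakeven_hours(budget, rate)
    print(f"  at USD {rate}/h: about {hours:,.0f} hours")
```

In other words, on data cost alone, building only comes out cheaper if the extra labor stays under the difference between the two figures, which is a tight budget for a team that is already almost fully booked.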
I would be happy to hear if anyone here is struggling with the same hurdles, and to discuss how they are approaching it.