With command-line tools, you have to edit and explore in a SQL editor, paste the result into a code editor, run the command line tool, and then open a web browser to view your data catalog. You end up going back and forth between all these tools, over and over, for the hundreds of models in your DAG.
Instead, we’ve built an open source editor and command line utility that bring all of this into a single experience. We feel that better tools lead to better data analysis, which helps organizations make better data-driven decisions.
Here’s a video that shows how intuitive the Structure editor is: https://www.youtube.com/watch?v=hskhBTyg258
Come check us out at www.structure.rest and join our Slack (https://join.slack.com/t/structuresupport/shared_invite/zt-ddx04ho4-_q43i5o3zQ9jv00qx~dx8A). Both the editor and the command line utility are open source, and the editor downloads as an app for Windows, Linux, and Mac. Our command line tool makes it easy to run your DAG as part of CI/CD.
We currently support Snowflake (https://www.snowflake.com/), and we are looking forward to supporting other platforms. Let us know if there is a platform you would like us to support next.
We’ve been doing customer interviews for the past couple of weeks, and the one feature that every data engineer we interviewed called “table stakes”, a “must-have”, or a “basic need” was version control. I made this video, https://youtu.be/gVx4JhugCUc, showing how we implemented it: I built a simple version control menu that connects to the GitHub REST API (v3). At first I thought this would be enough, but the more people I talk to, the clearer it becomes that this is not a simple problem. If any of you have similar problems, please reach out. We’d like to learn about them so we can offer better solutions in the future.
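For anyone curious what committing a model through the GitHub REST API (v3) can look like, here is a minimal sketch in Python. It is not our actual code: the function name `build_commit_request` and the example repository and file names are made up for illustration. It only builds the request for GitHub's "create or update file contents" endpoint and does not send it.

```python
import base64
import json
import urllib.request


def build_commit_request(owner, repo, path, sql_text, message, token, sha=None):
    """Build (but do not send) a PUT request that commits one model file.

    Hypothetical helper for illustration. `sha` is the blob SHA of the
    existing file and is only needed when updating a file that already exists.
    """
    payload = {
        "message": message,
        # The contents endpoint expects the file body base64-encoded.
        "content": base64.b64encode(sql_text.encode("utf-8")).decode("ascii"),
    }
    if sha is not None:
        payload["sha"] = sha

    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}/contents/{path}",
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={
            "Accept": "application/vnd.github.v3+json",
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )


# Example with placeholder values; pass the result to urllib.request.urlopen to send it.
request = build_commit_request(
    "acme", "analytics", "models/orders.sql",
    "select * from raw.orders", "Update orders model", "YOUR_TOKEN",
)
```

A single-file commit like this is the easy part. It says nothing about which downstream tables need to be rebuilt afterwards, which is where the problem stops being simple.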
In data engineering, version control is useful when data sources change, ETL automation services change, schemas change, or business goals change. The big problem is that when something changes upstream of the models you are working on, you don’t want to start over or refresh all of your tables from scratch. I think semantic versioning is an excellent solution to this problem. The idea behind semantic versioning is that each model has its own history: its own changes as well as all of the changes to the models that it uses.
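To make the idea concrete, here is a minimal sketch in Python. It is an illustration under my own assumptions, not how Structure implements versioning: the function names `model_versions` and `stale_models`, the example models, and the choice of a SHA-256 hash as the version are all hypothetical. Each model's version is derived from its own SQL plus the versions of the models it uses, so an upstream change shows up in every downstream model and nowhere else.

```python
import hashlib


def model_versions(models):
    """Return {model name: version} for a DAG of models.

    `models` maps a name to (sql_text, [names of upstream models]).
    Assumes the dependencies form a DAG (no cycles).
    """
    versions = {}

    def version_of(name):
        if name in versions:
            return versions[name]
        sql_text, upstream = models[name]
        digest = hashlib.sha256(sql_text.encode("utf-8"))
        # Fold in upstream versions so their changes propagate downstream.
        for dep in sorted(upstream):
            digest.update(version_of(dep).encode("utf-8"))
        versions[name] = digest.hexdigest()[:12]
        return versions[name]

    for name in models:
        version_of(name)
    return versions


def stale_models(old_versions, new_versions):
    """Names whose version changed (or are new) and so need a refresh."""
    return sorted(
        name for name, version in new_versions.items()
        if old_versions.get(name) != version
    )


before = model_versions({
    "raw_orders": ("select * from source.orders", []),
    "orders_clean": ("select id, total from raw_orders", ["raw_orders"]),
    "customers": ("select * from source.customers", []),
})
after = model_versions({
    "raw_orders": ("select * from source.orders_v2", []),  # upstream change
    "orders_clean": ("select id, total from raw_orders", ["raw_orders"]),
    "customers": ("select * from source.customers", []),
})
to_refresh = stale_models(before, after)
```

Here `to_refresh` contains `raw_orders` and `orders_clean` but not `customers`, so only the tables affected by the upstream change get rebuilt.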
Here's a blog article that goes deeper into the problem: https://www.structure.rest/blog/semantic-versioning-of-data-models
If this kind of stuff excites you, please feel free to check us out at https://structure.rest or visit our Slack: https://join.slack.com/t/structuresupport/shared_invite/zt-ddx04ho4-_q43i5o3zQ9jv00qx~dx8A
Here's a link to our website: https://www.structure.rest
And here's a blog article I published today in this space: https://www.structure.rest/blog/using-a-data-analytics-stack-to-gain-business-insights
Please check out the video demo first to get the idea.
To test out the product, use the URL http://unwyre.io/home
Registration is two steps: email plus verification, then subscription. I don't have any customers right now, so if you just want to sign up with your email, I'll contact you and give you a live demo.