But I am not going to pay $1000/month as a bootstrapped startup. What open source alternatives exist that can be run on basic hardware?
It's not like running Postgres which "just works". When you self-host Airbyte, you're still building a good bit.
I felt the same way about the cost of data tools. Paying $1,000 for Fivetran, $2,000 for Snowflake, $2,000 for Looker seemed crazy. We bundle all three for $500 / month at https://www.definite.app
One of their pitfalls is charging by the row. If you're cost-conscious, you really need to watch what data you're syncing and you need to pare it down quite a bit during the 2-week period they give you when setting up a new connector. If you do all that though, you can get a lot of mileage out of the free plan for some use cases.
As I said, I totally understand this market and why these companies are valuable. I respect the work they do. But while I am a tiny, tiny startup I don't want to lock in to anything and I know I can handle the amount of data myself with little effort if I have a basic open source alternative I can manage myself.
To be honest, I hadn't really given much thought to which event streaming layer I would use anyway. So I imagine Redpanda along with Redpanda Connect could be that layer (I was considering just using Redis streams or even PostgreSQL), and then there is just another Redpanda connector for the db to add into that mix. If someone is starting from scratch, that might be a good path. But I agree the MIT license of WarpStream is a bit nicer if all you need is the connectors.
It's like logging. Yeah, there is Sentry, Papertrail, Splunk, Datadog and the like. But something better than grepping syslogs is nice, and it's totally reasonable for a startup to stand up Kibana/Elastic on a tiny instance. That can provide significantly higher value.
There is a middle ground between stone tools and jet aircraft. I was asking: what are the middle-ground tools in this space?
My best bet for now would be dlt if you have a dedicated DE team, but sling will get you a long way for moving data around your warehouse.
I built a company[0], SeekWell, in this space (launched before Census), but was mostly focused on Sheets and Slack as destinations. SeekWell was acquired a few years ago too.
Once you have customers and a good network of integrations with a large number of tools, I suspect it's easier to just buy that company than build it all yourself?
- Census last raised $60M Series B at a $630M valuation (upper bound)
- Census’s estimated annual revenue is $31.6 million with ~200 employees.
- Median private-SaaS EV/ARR multiple is 7× (7 × $31M ≈ $217M, lower bound)
- Hightouch raised $80M at a $1.2B valuation (at ~60× ARR)
- Twilio completes $3.2B acquisition of Segment at ~21× ARR (upper multiple bound)
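For anyone who wants to poke at the numbers above, here is a rough sketch of the arithmetic in Python. The inputs are just the estimates quoted in the list (not verified figures), and the variable and function names are my own. It uses the unrounded $31.6M revenue estimate, so the lower bound comes out slightly above the $217M in the list, which rounds revenue down to $31M.

```python
# Back-of-envelope valuation bounds, using only the estimates quoted above.
# All dollar amounts are in USD millions; none of these inputs are verified.

CENSUS_ARR_M = 31.6              # estimated annual revenue
CENSUS_LAST_VALUATION_M = 630.0  # Series B valuation (one upper bound)
MEDIAN_SAAS_MULTIPLE = 7.0       # median private-SaaS EV/ARR multiple
SEGMENT_MULTIPLE = 21.0          # Twilio/Segment multiple (upper multiple bound)
HIGHTOUCH_VALUATION_M = 1200.0   # Hightouch valuation
HIGHTOUCH_MULTIPLE = 60.0        # Hightouch multiple as quoted (~60x ARR)


def implied_value(arr_m: float, multiple: float) -> float:
    """Enterprise value implied by applying an EV/ARR multiple to ARR."""
    return arr_m * multiple


# Lower bound: median multiple applied to estimated revenue.
lower_bound_m = implied_value(CENSUS_ARR_M, MEDIAN_SAAS_MULTIPLE)

# Upper bound by multiple: Segment-style multiple applied to the same revenue.
upper_by_multiple_m = implied_value(CENSUS_ARR_M, SEGMENT_MULTIPLE)

# Multiple the last round would imply at today's estimated revenue.
multiple_at_last_round = CENSUS_LAST_VALUATION_M / CENSUS_ARR_M

# ARR implied for Hightouch by the quoted valuation and multiple.
hightouch_implied_arr_m = HIGHTOUCH_VALUATION_M / HIGHTOUCH_MULTIPLE

print(f"Lower bound (7x):            ${lower_bound_m:,.1f}M")
print(f"Upper bound (21x):           ${upper_by_multiple_m:,.1f}M")
print(f"Upper bound (last round):    ${CENSUS_LAST_VALUATION_M:,.1f}M")
print(f"Multiple at last valuation:  {multiple_at_last_round:.1f}x")
print(f"Hightouch implied ARR:       ${hightouch_implied_arr_m:,.1f}M")
```

On these inputs the two upper bounds land close together: the last-round valuation works out to roughly 20× the estimated revenue, about the same as the Segment multiple.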
I can’t be the only one
If you want a data platform that's built to work as one cohesive unit, we've got you: https://www.definite.app/
Definite has a data lake, ETL, and BI in one app.