A MySQL compatible database engine written in pure Go

(github.com)

405 points · mliezun · 2y ago · 80 comments

zachmu 2y ago

Hi, this is my project :)

For us this package is most important as the query engine that powers Dolt:

https://github.com/dolthub/dolt

We aren't the original authors but have contributed the vast majority of its code at this point. Here's the origin story if you're interested:

https://www.dolthub.com/blog/2020-05-04-adopting-go-mysql-se...

webprofusion 2y ago

This is very cool! Couple of suggestions:

- Don't use "mysql" in the name, this is a trademark of Oracle corporation and they can very easily sue you personally if they want to, especially since you're using it to develop a competing database product. Other products getting away with it doesn't mean they won't set their sights on you. This is just my suggestion and you can ignore it if you want to.

- Postgres wire/sql compatibility. Postgres is for some reason becoming the relational king so implementing some support sooner rather than later increases your project's relevance.

pbronez 2y ago

PostgreSQL support here

https://github.com/dolthub/doltgresql

Background and architecture discussion here

https://dolthub.com/blog/2023-11-01-announcing-doltgresql/

geenat 2y ago

What's the replication story currently like?

zachmu 2y ago

The vanilla package can replicate to or from MySQL via binlog replication. But since it's memory only, that's probably not what you want. You probably want to supply the library a backend with persistence, not the built-in memory-only one.

Dolt can do the same two directions of MySQL binlog replication, and also has its own native replication options:

https://docs.dolthub.com/sql-reference/server/replication

sa-code 2y ago

Missed an opportunity to call this uSql!

jddj 2y ago

I always found the idea behind dolt to be very enticing.

Not enticing enough to build a business around, due to it being that bit too different and the persistence layer being that bit too important. But the sort of thing that I'd love it if the mainstream DBs would adopt.

I didn't realise the engine was written in Go, and honestly the first place my mind wanders is to performance.

jchanimal 2y ago

If you like the idea of the Dolt prolly trees[1], I'm building a database[2] that uses them for indexing, (eventually) allowing for shared index updates across actors. Our core uses open-source JavaScript[3], but there are a few other implementations including RhizomeDB in Rust[4]. I'm excited about the research in this area.

[1] https://docs.dolthub.com/architecture/storage-engine/prolly-...

[2] https://fireproof.storage

[3] https://github.com/mikeal/prolly-trees

[4] https://jzhao.xyz/thoughts/Prolly-Trees

zachmu 2y ago

We haven't benchmarked the in-memory database implementation bundled in go-mysql-server in a while, but I would be surprised if it's any slower than MySQL, considering that Dolt runs on the same engine and is ~2x slower than MySQL including disk-access.

https://docs.dolthub.com/sql-reference/benchmarks/latency

timsehn 2y ago

Version Control is not the type of thing "mainstream DBs would adopt".

We needed to build a custom storage engine to make querying and diffing work at scale:

https://docs.dolthub.com/architecture/storage-engine

It is based on the work of Noms, including the data structure they invented, Prolly Trees.

https://docs.dolthub.com/architecture/storage-engine/prolly-...

jbverschoor 2y ago

This seems to be a wire-protocol proxy for mysql -> SQL.

The default proxied database is dolt. I'm guessing this is extracted from dolt itself as that claims to be wire-compatible with mysql. Which all makes total sense.

zachmu 2y ago

Not a proxy in the traditional sense, no. go-mysql-server is a set of libraries that implement a SQL query engine and server in the abstract. When provided with a compatible database implementation using the provided interfaces, it becomes a MySQL compatible database server. Dolt [1] is the most complete implementation, but the in-memory database implementation the package ships with is suitable for testing.

We didn't extract go-mysql-server from Dolt. We found it sitting around as abandonware, adopted it, and used it to build Dolt's SQL engine on top of the existing storage engine and command line [2]. We decided to keep it a separate package, and implementation agnostic, in the hopes of getting contributions from other people building their own database implementations on top of it.

[1] https://github.com/dolthub/dolt

[2] https://www.dolthub.com/blog/2020-05-04-adopting-go-mysql-se...
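
To make the engine/backend split described in this comment a bit more concrete, here is a minimal sketch in Go. The `Table` interface, `memTable`, and `Engine` are invented for illustration; they are hypothetical names and are not go-mysql-server's actual interfaces, which are considerably richer. The only point is that the engine talks to storage through an interface, so any backend that satisfies it can be plugged in.

```go
package main

import (
	"fmt"
	"strings"
)

// Row is one record; values are positional and match the table's columns.
type Row []interface{}

// Table is a hypothetical read-only storage interface (an assumption for
// this sketch, NOT the real go-mysql-server API).
type Table interface {
	Name() string
	Columns() []string
	Rows() []Row
}

// memTable is a trivial in-memory backend.
type memTable struct {
	name string
	cols []string
	rows []Row
}

func (t *memTable) Name() string      { return t.name }
func (t *memTable) Columns() []string { return t.cols }
func (t *memTable) Rows() []Row       { return t.rows }

// Engine knows nothing about storage; it only uses the Table interface.
type Engine struct {
	tables map[string]Table
}

// NewEngine registers the given tables under their lower-cased names.
func NewEngine(tables ...Table) *Engine {
	e := &Engine{tables: make(map[string]Table)}
	for _, t := range tables {
		e.tables[strings.ToLower(t.Name())] = t
	}
	return e
}

// SelectWhereEq stands in for "SELECT * FROM table WHERE col = val".
// A real engine would parse and plan SQL; this one just scans every row.
func (e *Engine) SelectWhereEq(table, col string, val interface{}) ([]Row, error) {
	t, ok := e.tables[strings.ToLower(table)]
	if !ok {
		return nil, fmt.Errorf("table not found: %s", table)
	}
	idx := -1
	for i, c := range t.Columns() {
		if strings.EqualFold(c, col) {
			idx = i
			break
		}
	}
	if idx < 0 {
		return nil, fmt.Errorf("column not found: %s", col)
	}
	var out []Row
	for _, r := range t.Rows() {
		if r[idx] == val {
			out = append(out, r)
		}
	}
	return out, nil
}

func main() {
	users := &memTable{
		name: "users",
		cols: []string{"id", "name"},
		rows: []Row{{1, "ada"}, {2, "grace"}, {3, "ada"}},
	}
	e := NewEngine(users)
	rows, err := e.SelectWhereEq("users", "name", "ada")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(rows), "matching rows")
}
```

Swapping `memTable` for a type backed by files, a key-value store, or a remote service would not require touching `Engine`, which is the property the comment is describing.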

jsteenb2 2y ago

Really excellent work! For the curious, would you all be creating an in-memory database implementation that is postgres compatible for the doltgres project?

pizza234 2y ago

The compatibility (and functionality in general) is severely limited, not usable in production:

> No transaction support. Statements like START TRANSACTION, ROLLBACK, and COMMIT are no-ops.

> Non-performant index implementation. Indexed lookups and joins perform full table scans on the underlying tables.

I actually wonder if they support triggers, stored procedures etc.

zachmu 2y ago

Yes, triggers and stored procedures are supported. Concurrency is the only real limitation in terms of functionality.

The bundled in-memory database implementation is mostly for use in testing, for people who run against mysql in prod and want a fast compatible go library to test against.

For a production-ready database that uses this engine, see Dolt:

https://github.com/dolthub/dolt

zgk7iqea 2y ago

Only for the in-memory implementation. It is also specifically stated that you shouldn’t use the in-memory stub in production.

osigurdson 2y ago

I suspect Go is probably better, but as a long time C# developer I cringe at the idea of implementing a DB with a GC language. It seems that you would be fighting the GC all the time and have to write a lot of non-obvious low allocation code, using unmanaged structures, unsafe, etc. All doable of course, but seems like it would be starting on the wrong foot. Maybe fine for a very small team, but onboarding new devs with the right skill set would be hard.

jillesvangurp 2y ago

There are quite a few database products and other data intensive systems written in Go, Java, and many other languages. Generally this is much less of an issue than you think. And it's offset by several benefits that come with having some nice primitives to do e.g. concurrency and a nice language to work with.

On the JVM you have things like Cassandra, Elasticsearch, Kafka, etc. each of which offer performance and scale. There are lots more examples. As far as I know they don't do any of the things you mention; at least not a lot. And you can use memory mapped files on the JVM, which helps as well. Elasticsearch uses this a lot. And I imagine Kafka and Cassandra do similar things.

As for skillset, you definitely need to know what you are doing if you are going to write a database. But that would be true regardless of the language.

serial_dev 2y ago

While it is true that Cassandra and Kafka are great software that countless developers rely on to handle massive scale...

It is also true that the JVM and the GC are a bottleneck in what they are able to offer. Scylla and Redpanda's pitch is "we are like this essential piece of software, but without the JVM and GC".

Of course, having a database written in Go still has its pros and cons, so each to their own.

klabb3 2y ago

I think this depends on the level of optimization you go for. At the extreme end, you’re not gonna use “vanilla” anything, even in C or Rust. So I doubt that you’ll get that smooth onboarding experience.

In Go, I’ve found that with a little bit of awareness, and a small bag of tricks, you can get very low allocations on hot paths (where they matter). This comes down to using sync.Pool and being clever with slices to avoid copying. That’s a footgun-performance tradeoff that’s well worth it, and can get you really far quickly.

winrid 2y ago

Well, with a manually managed language you have to do those things pretty much all the time, but with a GC you can pick which parts are manually managed.

Also I suspect this project isn't for holding hundreds of GB of stuff in memory all the time, but I could be wrong.

neonsunset 2y ago

You would be surprised by performance of modern .NET :)

Writing no-alloc is oftentimes done by reducing complexity and not doing "stupid" tricks that work against JIT and CoreLib features.

For databases specifically, .NET might actually be positioned very well with its low-level features (intrinsics incl. SIMD, FFI, struct generics though not entirely low-level) and high-throughput GC.

Interesting example of this applied in practice is Garnet[0]/FASTER[1]. Keep in mind that its codebase still has many instances of un-idiomatic C# and you can do way better by further simplification, but it already does the job well enough.

[0] https://github.com/microsoft/garnet

[1] https://github.com/microsoft/FASTER

osigurdson 2y ago

Using net6. I agree, performance is generally great / just as fast as its peers (i.e. Java and Go). However, if you need to think about memory a lot, GCed runtimes are an odd choice.

pjmlp 2y ago

That is the ideal thing to do: profit from the productivity of automatic resource management, and only write low-allocation code on the paths that actually matter, as the ASP.NET team has been doing since .NET Core was introduced, with great results in performance.

hnlmorg 2y ago

There are already mysql-like databases written in non-GC’ed languages. Such as MySQL itself.

Odds are if you need one written in Go then your requirements are somewhat different. For example the need to stub for testing.

raggi 2y ago

yup, it's bad, and even if you "do everything right" minimization wise, if you're still using the heap then eventually fragmentation will come for you too

neonsunset 2y ago

Languages using moving garbage collectors, like C# and Java, are particularly good at not having to deal with fragmentation at all, or only marginally at most.

TechTechTech 2y ago

It would be great if this evolves to support mysql to postgresql and mysql to sqlite.

Then we can finally have multiple database engine support for WordPress and others.

aargh_aargh 2y ago

It is always the edge cases that will kill you. In the case of WP on PostgreSQL, the reason you want WP in the first place is the plugins and those will be hit or miss on PostgreSQL. Just give up on the combination of those two.

kreetx 2y ago

Isn't there an adapter from mysql-to-postgres which would essentially mimic all the quirks in mysql onto an actual postgres?

didip 2y ago

tidb has been around for a while, it is distributed, written in Go and Rust, and MySQL compatible. https://github.com/pingcap/tidb

Somewhat relatedly, StarRocks is also MySQL compatible, written in Java and C++, but it's tackling OLAP use-cases. https://github.com/StarRocks/starrocks

But maybe this project is tackling a different angle. The Vitess MySQL library is kind of hard to use. Maybe this can be used to build an ORM-like abstraction layer?

verdverm 2y ago

Dolt supports git like semantics, so you can commit, pull, merge, etc...

fedxc 2y ago

I always look at these implementations and go wow! But then I think, is there any real use for this?

zachmu 2y ago

If your program integrates with mysql in production, you can use this for much faster local tests. It doesn't have to be a go program, although that makes it easier.

zbuttram 2y ago

The readme mentions at least one interesting use which presumably is the impetus for its creation: https://github.com/dolthub/dolt

kitd 2y ago

If you want to run arbitrary queries on structured data then SQL is a good language to do it in. This library gives you the opportunity to build such a SQL layer on top of your own custom structured data sources, whatever they may be.

west0n 2y ago

Interesting, another project implemented in Go that is compatible with MySQL server, alongside others like Vitess and TiDB.

malkia 2y ago

Is this for integration/smoke testing?

timsehn 2y ago

Most direct users of go-mysql-server use it to test Golang <> MySQL interactions without needing a running server.

We here at DoltHub use it to provide SQL to Dolt.

speleding 2y ago

How hard would it be to use this as an in-memory replacement for MySQL for testing, let's say, a Rails project?

Given how important the DB layer is I would be careful to use something like this in production, but if it allows speeding up the test suite it could be really interesting.

maxloh 2y ago

I know it is a matter of choice, but why was MySQL chosen instead of PostgreSQL? The latter seems to be more popular on Hacker News.

tobinfekkes 2y ago

Typically, things that are more popular on Hacker News are not most popular with the rest of the world.

timsehn 2y ago

> Why is Dolt MySQL flavored?

TLDR; Because go-mysql-server existed.

https://www.dolthub.com/blog/2022-03-28-have-postgres-want-d...

We have a Postgres version of Dolt in the works called Doltgres.

https://github.com/dolthub/doltgresql

We might have a go-postgres-server package factored out eventually.

taf2 2y ago

Could this be used as a kind of connection proxy to allow for more clients to a single pool of database servers?

neximo64 2y ago

Is there anything like this for postgres?

perplexa 2y ago

cockroachdb might be close: https://github.com/cockroachdb/cockroach

hwbunny 2y ago

What's the purpose of this idea? Snapshotted mysql server? Who uses that and for what purpose?

davgoldin 2y ago

Congrats, looks like a lot of hard work!

Could I swap the storage engine with my own key-value storage, e.g. rocksdb or similar?

zachmu 2y ago

Yes, that's the idea. Writing a simple read only database back end is not too tough.

davgoldin 2y ago

Why read only? What's stopping this engine from using (for example) FoundationDB as storage?

karmakaze 2y ago

Compatible has many aspects. I'd be interested in the replication protocols.

geenat 2y ago

With Vitess likely consolidating its runtimes (vtgate, vtctl, vttablet, etc) into a single unified binary: https://github.com/vitessio/vitess/issues/7471#issuecomment-...

... it would be a wild future if Vitess replaced the underlying MySQL engine with this (assuming the performance is good enough for Vitess).

zachmu 2y ago

I don't think this is in the cards for vitess, their whole architecture is built around managing sharded mysql instances.

ceving 2y ago

Why not standard conforming SQL instead of MySQL?

kamikaz1k 2y ago

shouldn't these projects have a perf comparison table? there was a post a couple days ago about an in-memory Postgres, but same problem on the perf.

if someone is considering running it, they're probably considering it against the actual thing. and I would think the main decision criteria is: _how much faster tho?_

zachmu 2y ago

This is a reasonable point, we'll run some benchmarks and publish them.

We expect that it's faster than MySQL for small scale. Dolt is only 2x slower than MySQL and that includes disk access.

https://docs.dolthub.com/sql-reference/benchmarks/latency

kamikaz1k 2y ago

Thanks! Appreciate your response.

In dynamic language land, we tend to use real DBs for test runs. So having a faster DB wouldn't hurt!

sgammon 2y ago

Isn't that........ Vitess?

zachmu 2y ago

Vitess (a fork of it anyway) provides the parser and the server. The query engine is all custom go code.

sgammon 2y ago

Ah, cool, that makes sense. Thanks for clarifying

cvalka 2y ago

TiDB!

amelius 2y ago

Performance comparison against the original?
