Really!? Are you building applications, by chance, or something else? Are you doing raw SQL mostly, or an ORM/ORM-like library? This surprises me because my experience dabbling in JSON fields for CRUD apps has been mostly trouble stemming from the lack of type checks. SQLite's fluid type system has been a nice middle ground for me personally. For reference, my application layer is kysely/typescript.
Well, you effectively move the type checks from the database to the app. Not a new idea by any means (and a bad idea in many cases), but with JSON it can actually work out nicely-ish, as long as there are no significant relationships between tables.
Practical example: I recently wrote my own SMTP server (bad idea!), mostly to be able to control spam (even worse idea! don't listen to me!). Initially, I thought I would be really interested in remote IPs, reverse DNS domains, and whatever was claimed in the (E)HLO.
So, I designed my initial database around those concepts. Turns out, after like half a million session records: I'm much more interested in things like the Azure tenant ID, the Google 'groups' ID, the HTML body tag fingerprint, and other data points.
Fortunately, my session database is just 'JSON(B) in a single table', so I was able to add those additional fields without the need for any migrations. And SQLite's `json_extract` makes adding indexes after-the-fact super-easy.
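The expression-index trick is plain SQL regardless of the application language; a minimal Python/SQLite sketch (table, column, and field names invented, not the actual schema) of indexing a JSON path after the fact:

```python
import sqlite3

# Hypothetical single-table session store: one JSON blob per row.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sessions (id INTEGER PRIMARY KEY, data TEXT NOT NULL)")
db.execute(
    "INSERT INTO sessions (data) VALUES (?)",
    ('{"remote_ip": "203.0.113.7", "azure_tenant_id": "abc-123"}',),
)

# A field added long after the original design can still be indexed directly:
db.execute(
    "CREATE INDEX idx_sessions_tenant "
    "ON sessions (json_extract(data, '$.azure_tenant_id'))"
)

# Queries on that JSON path can now use the expression index.
row = db.execute(
    "SELECT id FROM sessions WHERE json_extract(data, '$.azure_tenant_id') = ?",
    ("abc-123",),
).fetchone()
print(row)  # (1,)
```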
Of course, these additional fields need to be explicitly nullable, and I need to skip processing based on them if they're absent, but fortunately modern C# makes that easy as well.
And, no, no need for an ORM, except `JsonSerializer.Deserialize<T>`... (And yeah, all of this is just a horrible hack, but one that seems surprisingly resilient so far. YMMV.)
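A rough Python analogue of that `Deserialize<T>`-plus-nullable-fields pattern (the `Session` type and its fields are made up for illustration; the real code is C#):

```python
import json
from dataclasses import dataclass
from typing import Optional

@dataclass
class Session:
    remote_ip: str
    # Fields added after the original design must tolerate absence.
    azure_tenant_id: Optional[str] = None
    html_body_fingerprint: Optional[str] = None

def load_session(raw: str) -> Session:
    d = json.loads(raw)
    return Session(
        remote_ip=d["remote_ip"],
        azure_tenant_id=d.get("azure_tenant_id"),
        html_body_fingerprint=d.get("html_body_fingerprint"),
    )

# A pre-migration record simply comes back with the new fields as None.
old = load_session('{"remote_ip": "203.0.113.7"}')
if old.azure_tenant_id is None:
    pass  # skip tenant-based processing for this record
print(old.azure_tenant_id)  # None
```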
Our solution for a similar situation involving semi-structured data (in postgres) was to double it up: put all the json we send/receive with a vendor into a json field, then anything we actually need to work on gets extracted into regular table/columns. We get all the safety/performance guarantees the database would normally give us, plus historical data for debugging or to extract into a new column if we now need it. The one thing we had to monitor in code reviews was to never use the json field directly for functionality.
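A minimal sketch of that double-up pattern, in Python/SQLite for brevity even though the comment describes Postgres (all table, column, and payload names invented):

```python
import sqlite3, json

db = sqlite3.connect(":memory:")
# Raw vendor payload kept verbatim; working fields promoted to real columns.
db.execute("""
    CREATE TABLE vendor_orders (
        id INTEGER PRIMARY KEY,
        raw_json TEXT NOT NULL,    -- full payload, for debugging/backfill only
        order_ref TEXT NOT NULL,   -- extracted, typed
        price_cents INTEGER NOT NULL
    )
""")

payload = {"order_ref": "A-1001", "price": "12.50", "extra": {"carrier": "acme"}}
db.execute(
    "INSERT INTO vendor_orders (raw_json, order_ref, price_cents) VALUES (?, ?, ?)",
    (json.dumps(payload), payload["order_ref"], int(float(payload["price"]) * 100)),
)

# Application code reads only the typed columns, never raw_json.
ref, cents = db.execute(
    "SELECT order_ref, price_cents FROM vendor_orders"
).fetchone()
print(ref, cents)  # A-1001 1250
```

If a new field is needed later, `raw_json` is the historical record to backfill the new column from.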
That way you're throwing away 50% of the reason you use a relational database in the first place. Has it occurred to you that MongoDB exists?
Also I don't understand why you're afraid of migrations, especially since you're the only developer on your own SMTP server.
My original comment started with "but it feels "prior to the MongoDB-is-webscale memes""
So, care to take another guess? And, while we're here, does MongoDB run fully in-process these days? And/or allow easy pagination by ROWID?
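(In SQLite, that last bit is a one-line keyset query; a quick Python sketch with an invented table:)

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sessions (data TEXT)")
db.executemany("INSERT INTO sessions (data) VALUES (?)",
               [(f"session {i}",) for i in range(10)])

def page(db, after_rowid, limit):
    # Keyset pagination on rowid: stable and cheap, no OFFSET scan.
    return db.execute(
        "SELECT rowid, data FROM sessions WHERE rowid > ? ORDER BY rowid LIMIT ?",
        (after_rowid, limit),
    ).fetchall()

first = page(db, 0, 3)
second = page(db, first[-1][0], 3)
print([r[0] for r in second])  # [4, 5, 6]
```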
What gain would MongoDB offer here?
You certainly would lose a lot of things, like a well-supported path to linking with the database engine, and a straightforward way to introduce relational tables as the project matures. Nothing completely insurmountable, of course, but it carries a lot of extra effort for what benefit?
Did you miss that he’s using sqlite? The dev experience with a sqlitedb is way better than running yet another service, especially for personal projects.
Sqlite is used just as much as an application file format as it is a relational database.
> And SQLite's `json_extract` makes adding indexes after-the-fact super-easy.
That's a migration.
> Of course, these additional fields need to be explicitly nullable, and I need to skip processing based on them if they're absent
That's an effect of not migrating: having to handle both null and absent fields instead of just null fields. After doing more of these, you'll run into the same thing that made people stop using NoSQL databases: with no schema, your code has to parse all previous versions of the data format, and they probably aren't even well documented. An RDBMS can just set the new column to null in existing rows.
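To make the contrast concrete, a small Python/SQLite sketch of the RDBMS side (names invented): after the migration, every pre-existing row uniformly reads back NULL, so code never has to distinguish "key absent" from "key null":

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sessions (id INTEGER PRIMARY KEY, remote_ip TEXT)")
db.execute("INSERT INTO sessions (remote_ip) VALUES ('203.0.113.7')")

# The migration: existing rows get NULL in the new column, uniformly.
db.execute("ALTER TABLE sessions ADD COLUMN azure_tenant_id TEXT")

row = db.execute("SELECT azure_tenant_id FROM sessions WHERE id = 1").fetchone()
print(row)  # (None,)
```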
I do the same thing with serde_json in Rust for a desktop app sqlitedb and it works great so +1 on that technique.
In Rust you can also tell serde to ignore unknown fields and use individual view structs to deserialize part of the JSON instead of the whole thing and use string references to make it zero copy.
Recently I’ve been tempted to make an SMTP server that translates emails into a web-hook. Please tell me more horror stories so that I might be convinced not to do it.
For the data schema, we're using Protobufs with buf validate. This works surprisingly well, you can use the same types in the backend API and on the frontend. We even have a cron job that reads all the data periodically and verifies that the JSON blobs conform to the schema. Our next intern is going to write a PostgreSQL extension to do that on commit/update :)
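The real check runs against Protobuf schemas via buf validate; this hand-rolled Python stand-in (schema, table, and rows all invented) just shows the shape of the periodic sweep over stored blobs:

```python
import sqlite3, json

# Stand-in for the real Protobuf/buf-validate schema: field -> expected type.
SCHEMA = {"order_ref": str, "price_cents": int}

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE blobs (id INTEGER PRIMARY KEY, data TEXT)")
db.executemany("INSERT INTO blobs (data) VALUES (?)", [
    ('{"order_ref": "A-1", "price_cents": 1250}',),
    ('{"order_ref": "A-2", "price_cents": "12.50"}',),  # wrong type sneaks in
])

def bad_rows(db):
    # The cron-style sweep: load every blob, flag rows violating the schema.
    bad = []
    for row_id, raw in db.execute("SELECT id, data FROM blobs"):
        d = json.loads(raw)
        if not all(isinstance(d.get(k), t) for k, t in SCHEMA.items()):
            bad.append(row_id)
    return bad

print(bad_rows(db))  # [2]
```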
One real advantage of this approach is that it's easy to do stuff like "search everywhere".
Classic fine-grained schemas are not that much different. A lot of high-level ORM frameworks simply save all the objects' fields on update, without doing fine-grained diffs.
In addition, our frontend apps also support offline mode. They can get all the relevant objects, and then operate on them locally. So our API was designed from the start to deal with conflicts.
The actual "data" is a mess: different data types for the same JSON field. Imagine storing "price" as both a float and a string, depending on whatever buggy code happened to do the inserts.
It worked enough for a prototype and that was enough for management to believe the project was a success. I can't wait until we actually try and deploy it.