Django models create a well-defined self-documenting structure for your schema, are easy to evolve using migrations, and there's a wealth of tooling built on top. IMHO, these far outweigh the perceived convenience of simply storing some stuff in a JSON field.
If you find yourself implementing your own validation and schema code for JSON fields, I'd say it's a sign that you should probably stop and migrate the data to Django models instead.
There are some cases when storing JSON is fine, of course, but in my experience they are few and far between.
Don't agree. Where JSON fields with validators are useful is when you would otherwise have inherited models for different variants of the same type of thing. It's a lot easier to understand and query one model with 50 Cerberus (or whatever) schema validators than it is to understand and query 50 Django models. With the JSON approach it leaves one part of the codebase that's complicated to understand, but you mostly shouldn't need to understand it. Whereas when you have tons of different models and each one needs to be used in different places and is queried in different ways and has its own serializers, the mess tends to spread throughout the entire codebase.
I mostly agree. The only JSON I store in the database is UI settings, because it's easy enough to parse with JavaScript (and fall back to a default if the JSON doesn't contain that prop). It also decouples your database schema from the UI, the latter of which can evolve much more rapidly.
For everything else, having a schema is better.
As it well should (this isn't a complaint), just a reminder that keeping denormalised, first-party data that can be well indexed is invaluable when you're talking about huge datasets.
It'd also be nice to layer JSON Schema directly into the modelling, but there is a draft-7 compliant Python project you can use for this in the model's clean() method.
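For example, with the jsonschema package (the schema and field names here are made up), a draft-7 check you could call from clean() might look something like this:

```python
import jsonschema

# Hypothetical draft-7 schema for a JSON settings blob
SETTINGS_SCHEMA = {
    "type": "object",
    "properties": {
        "theme": {"type": "string", "enum": ["light", "dark"]},
        "page_size": {"type": "integer", "minimum": 1},
    },
    "required": ["theme"],
}

_validator = jsonschema.Draft7Validator(SETTINGS_SCHEMA)

def settings_errors(data):
    """Return a list of human-readable validation errors (empty if valid)."""
    return [e.message for e in _validator.iter_errors(data)]
```

Inside a model's clean() you'd raise django.core.exceptions.ValidationError if that list is non-empty.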
[0]: https://www.postgresql.org/docs/current/datatype-json.html#J...
that's some syntax right there :)
My intuition says this is one of those tradeoffs that is fast and easy in the short run, but over time becomes technical debt. I haven't used these over the long term, though.
For example, suppose there are 1000 possible attributes but only 5% are populated for any given row. With 1000 columns you are going to have a ridiculously wide and mostly empty table.
There are a number of ways to deal with the issue, e.g. a model per category or an EAV database pattern (which is what platforms like Magento do). None is really ideal, but storing the product attributes as JSON works pretty well, even more so when your database supports querying the JSON blob.
Relational schemas are flexible and allow an optimizer to figure out how best to fetch your data, based on the data you are querying and filtering for and stored statistics about your data set.
Document oriented storage is great when you don't need the optimizer because you already know how the data is written and read. This means you can bundle it up into small, single fetch documents. No statistics or optimizer necessary. This is great if you understand your use cases really well, and they never change (good luck with that) or you have a large distributed data set that would be tough on an analyzer.
More importantly, JSONFields can be used as a substitute for EAV pattern.
Then there are a bunch of optional fields. Traditionally, a number of nullable columns would be created, but they're ultimately messy (you need to null-check every access) and unnecessary, since you can replace them with a single JSON column. Keys are always optional, so you always need to check for presence or absence, and modern databases support JSON natively in many clauses.
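The application-side pattern is simple enough; a toy sketch (attribute names invented) of what the presence checks look like against a single JSON blob:

```python
# Hypothetical JSON blob loaded from one JSONField, standing in for
# many nullable columns; every key is optional.
row_attrs = {"color": "red", "weight_kg": 1.2}

# Presence check plus a default in one step:
warranty = row_attrs.get("warranty_months", 0)

# Explicit absence check, for when "missing" and "null" must differ:
has_weight = "weight_kg" in row_attrs
```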
Async views have dropped, and that is genuinely exciting. It's taken a lot of work, and there's still a ways to go before Django is truly async native, but this is one of the big steps, and being able to do some concurrent IO on a few routes could be a huge saving for projects that call out to external services.
Otherwise a lot of stuff has to be done in event queues to avoid blocking the main thread, and sometimes that means a bad UX: users take actions and aren't offered any guarantee that they are complete, even in cases where handling them inline would be the better option if you weren't risking blocking the thread.
Other times, I wished Django had something analogous to a component, where everything is just encapsulated in a single file. I don't want to separate javascript/html/css view/template. I want a single file I need to update and an easy way to call to render.
The template system is also difficult to use if you get into complicated scenarios.
I needed to show event details in a modal if that was the user preference, but the page could also be embedded. This led to rendering differently depending on the device, whether the page was embedded, and whether it was a popup, an explosion of possibilities: 2x2x2 = 8 different views for the same data in the same template.
The most practical way was with if/then statements, but that still led to repeating HTML code, and it was difficult to reason about and test.
I also got into a situation where the template wasn't parsed and updated, probably because of too many includes or too much inheritance in the template. For example, I wanted to write a custom CSS file that would use the Google Fonts the user selected in the UI. The only way I found to make it work was to bypass the template and write a function in the view that would render_to_string the CSS file.
This was always possible.
Any good (code/github) examples of this??
ASGI has kinda passed me by. For some small projects I use Gunicorn.
What's the setup for ASGI that's popular?
I think in the Python world Uvicorn is the more popular choice, though I might be wrong.
We do take some care to (mostly) only use documented interfaces, so maybe that helps?
It's a lot like taking care of a house or car.
The world does not stand still, much less programming frameworks.
The question is what it is that frequently needs to change, and why no appropriate abstraction has been found to reach stability.
That said, it takes a while for the Django folk to end support for a release so, at least, there is not much of a hurry to update a running app.
This seems like exactly the way that deprecations should be handled. Are you suggesting they should maintain backwards compatibility for much longer? They already have a pretty long deprecation policy.
I find Django has a pretty solid deprecation policy, with the goal being that if you have no deprecation warnings then upgrading is simple. Third party dependencies do get in the way of that sometimes though.
A lot of the imports seem to move around (which is easy to fix, once you know where they now are), and some things have just vanished.
I know documentation is a nightmare to produce, but there must be a way to automatically produce it for either import relocations or deprecations so you can find things like this easily?
In the web world it's quite frequent to have major versions every couple years that break everything and require substantial dev cycles to keep up.
That said, 1.x applications are quite old by now.
I feel like the last Django release that was at all complicated to upgrade to was maybe 1.9, in terms of getting the test cases to run properly in parallel if they hadn't been properly isolated. And even that was more of a Python3-like situation, where it really only exposed things that people had done incorrectly previously.
How does HN feel about the recent craze to make web python asynchronous?
To me, the performance gains are dubious in many cases and the complexity overhead of handling cooperative multitasking just seems like a step back for a language like python.
Why do I have to decide if my function is synchronous or not when I write it? I don't want to do that, I want to write only one function that can be called synchronously or asynchronously. In Go or Elixir, the decision is made by the caller, not by the API designer.
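The "colored functions" problem is easy to see in a toy asyncio example (function names invented): the async caller awaits naturally, while the sync caller has to stand up an event loop.

```python
import asyncio

async def fetch_name():
    # stand-in for awaiting a real network call
    await asyncio.sleep(0)
    return "Ada"

async def async_caller():
    # async callers can just await
    return await fetch_name()

def sync_caller():
    # a sync caller cannot simply call fetch_name(); it has to start
    # an event loop, which tends to "color" the whole call chain
    return asyncio.run(fetch_name())
```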
Which leads me to a parallel universe: Go-like asynchronicity should have been introduced with Python 3, when backward compatibility was explicitly broken. The gain of switching to Python 3 would then also have been a much easier sell than "just" fixing strings.
Of course, there are probably a thousand things that I'm overlooking, but this is my feeling...
[1] https://journal.stuffwithstuff.com/2015/02/01/what-color-is-...
I feel that sync and async functions are fundamentally different. In Python a coroutine is really just a function you can pause, and while it might seem like the same thing as a normal function, it's actually very different algorithmically speaking, since you can include a lot of low-level optimizations, which is really what async code is all about: getting rid of that IO waiting.
I love async Python, but after working with it for the better half of a year now it is often a bit tiring, as you've pointed out. It feels a bit like a patch to the language rather than a feature, even with the newest 3.9 version.
Btw you might like https://trio.readthedocs.io which makes asyncio much more bearable!
The 1% where this is needed does exist, but I suspect that there are far more people using the new async features than actually have need for them. And if you don't need them, you're introducing a lot of complexity, without mature tooling around it to reduce that complexity.
Probably 5 years from now there will be mature tooling around this stuff that lowers the complexity so that it is a good tradeoff for average websites. But for now, I don't need to be an early adopter.
Otherwise your application might be humming along smoothly at one point and then come to a sudden, complete standstill, or performance plummets, when a random external API endpoint starts to time out. Yes, I have been bitten by this :-)
To fix this while running sync, I have dedicated separate application processes for the views that do external calls, but this makes the routing complex. Alternatively you can juggle timeouts on the external API calls, but this is hard to get right, and you need to constantly check that calls aren't timing out just because the external endpoint is a bit slower at some point.
So I think this solves a very real-world challenge.
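For what it's worth, the timeout-juggling from sync code can be sketched with the stdlib's futures (the function and numbers here are stand-ins, and as noted, the worker thread keeps running after you give up, so this caps response time, not resource use):

```python
import concurrent.futures

_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)

def call_external_with_deadline(fn, timeout, fallback):
    # Run the external call in a worker thread and give up after
    # `timeout` seconds, returning a fallback value instead.
    future = _pool.submit(fn)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        return fallback
```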
You should add something like https://pypi.org/project/circuitbreaker/
Continuously failing external requests should not make each one of your responses slow.
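The idea behind a circuit breaker (this is a minimal sketch of the pattern, not the linked package's actual API): after N consecutive failures, stop calling the endpoint and fail fast until a reset timeout passes.

```python
import time

class CircuitOpen(Exception):
    """Raised when the breaker is failing fast."""

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise CircuitOpen("failing fast, endpoint assumed down")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result
```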
From what I can tell it is not intended for that purpose, and outright will not work.
10% of the time I'm calling an API that takes 3s and tying up a worker for 3s _might_ be a problem. Being able to not do that would be really handy sometimes.
Not web servers, but I also do a lot of web scraping and Python is definitely the best tool I've used for that job (lxml is fast with great XPath support, Python is very expressive for data manipulation), using async for that could dramatically improve our throughput as it's essentially network bound, and we don't really care about latency that's in the 10s of seconds.
Source: I work on a large production Django site.
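For network-bound scraping, the win is roughly this shape (fetch() here is a sleep standing in for a real async HTTP call): fire off all the requests and bound how many are in flight at once.

```python
import asyncio

async def fetch(url):
    # stand-in for a real async HTTP fetch
    await asyncio.sleep(0.01)
    return "<html>%s</html>" % url

async def scrape_all(urls, max_in_flight=10):
    # bound concurrency so we don't open thousands of sockets at once
    sem = asyncio.Semaphore(max_in_flight)

    async def bounded(url):
        async with sem:
            return await fetch(url)

    # gather preserves the order of its arguments in the results
    return await asyncio.gather(*(bounded(u) for u in urls))
```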
Django provides `sync_to_async` and `async_to_sync`, but it's trivial to do this yourself without Django:
https://docs.djangoproject.com/en/3.0/topics/async/#async-ad...
You can write sync code and use async calls only when needed.
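A rough stdlib equivalent of the sync-to-async direction (Python 3.9+; blocking_lookup is a made-up stand-in for blocking ORM or file IO):

```python
import asyncio

def blocking_lookup(key):
    # stand-in for a blocking call you don't control
    return {"answer": 42}[key]

async def view():
    # run the blocking call in a worker thread so the event loop
    # stays free; roughly what asgiref's sync_to_async does
    return await asyncio.to_thread(blocking_lookup, "answer")
```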
Also, async python is awesome. Things were messy 2-3 years ago, but everything is so much better now.
This should have no (big) performance impact, but these will most probably allow better concurrency, which can be quite critical for a web framework.
Making views async won't do much to make an individual request faster (besides keeping it from being blocked by other slower requests).
https://news.ycombinator.com/item?id=23218782
Five years later, there are some new frameworks, but much of the ecosystem is still sync-only. This is actually one of the things pushing me towards Go lately. Python just doesn't seem to mature fast enough, and tools heavily disagree on conventions.
HN submission: https://news.ycombinator.com/item?id=24048208
Additionally, (this is my pet use case) if you implement a GraphQL server on top of Django (using one of the many libraries), you tend to get subpar performance because GraphQL lends itself really well to parallelised data fetching, which is hard to take advantage of at the moment.
I’d never used it before and it was fantastic, but I had to drop down to raw SQL to do it. SQLAlchemy has had support for it for well over a year.
I’ve used Django since 2008 and I love it with all its warts but I’ve really grown to prefer SQLAlchemy.
On top of that, you’d need to come up with a decent frontend syntax that aligned with the existing methods.
I think Django made a mistake when first defining the language of aggregates by overloading the values() method and not using group(). To support rollup, values() would need to support it but only when used after an annotate. Not nice.
I often think about what it’d take to use alchemy for the sql building beneath the Django frontend. That would open up so many more possibilities and features.
I also ended up writing a very tiny transformer function and using that directly, because core only supports a couple of casts and I needed Postgres timestamp types so I could extract and roll up on the year/month/day. That gave me some insight into how some of the patterns in use in "lower level" Django differ from the expressiveness and composability of SQLAlchemy.
If so, why?
I think Django is just a good package. It’s really productive, and the ORM is better and easier to use than Entity Framework. The real winner for me is its stability, though. In the same time we’ve gone from classic ASP to WebForms to MVC to modern MVC to .NET Core and now, soon, the new .NET version. Django has remained relatively stable and relatively easy to upgrade.
And make no mistake, we absolutely still operate web-forms applications that will likely never get rewritten in something else.
At the end of the day, tech stacks aren’t really that important though.
Additionally, it has amalgamated into a hybrid framework, where adding a high performance API to an existing MVC app has become trivial.
Not to mention the excellent language, and support for tech like gRPC. On the whole, ASP.NET Core looks poised to evolve and adapt to the changing tech landscape.
I do agree that the stability of Django has made it extremely easy to get an MVP off the ground, especially for a seasoned developer.
I have used both and somehow, the strong typing in C# puts enough constraints on me to reason about my web app as a proper app.
In Django and Flask, I would often settle into thinking of everything in terms of pieces of data, more so because of the dynamic nature of Python.
For other use cases, there are better frameworks out there.
Lots of people love ORMs, although I’ve found complex queries slow with a hefty object model system, to the point where I’d rather write parameterised SQL queries than wrestle with an ORM optimization strategy.
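The parameterised-query style is the same regardless of driver; a self-contained sketch with sqlite3 (table and data invented), where the driver handles quoting and you keep full control of the SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("widget", 9.99), ("gadget", 24.50)],
)

# Placeholders (?) keep user input out of the SQL text entirely,
# while the query itself stays exactly what you wrote.
rows = conn.execute(
    "SELECT name FROM product WHERE price < ? ORDER BY name", (20,)
).fetchall()
```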
The templating system is nice although these days I mostly use javascript talking to json endpoints, with nearly zero need for a templating system.
Honestly when your need is: “javascript talks to endpoint and endpoint talks to database” I don’t see a greater need than python, nodejs, golang or whatever language you prefer plus a couple of libraries. Most server frameworks add more stuff you probably don’t need, unless you can’t work without an integrated ORM.
We dropped PHP due to the lack of any really good frameworks, but now we have Laravel. PHP is still a solid choice and it’s fast.
In the end we went with Django because we liked Python and Django is really well documented and easy to learn.
For simpler sites Laravel vs Django probably just comes down to which ecosystem you are most familiar with.
We are a .NET shop, we'd love to use Django, Drupal, etc. But does that also mean we need a dedicated Python resource to support these CMS'?
Or could we use these CMS' out of the box?
Or would we just be better off with a .NET CMS like Umbraco or Orchard?
It would be far better to be able to define a function that you can call for bits of html code that might repeat in the same template.
I stopped using Django at 1.6; does the new version let you define functions in the templates?
Also, let's say you define a block for rendering a button with a different color, then use 20 different buttons on one page.
Does that mean the block code would need to be loaded from a file 20 times and parsed each time? That seems like a huge performance hit.
Is there a technical reason why you can't have a function in the same file?
Am I misunderstanding how it works?
It's called "dashml" on PyPI if anyone is interested. I mostly use it in Django and haven't worked on it lately cause of $DAYJOB though.
import datetime

from django.http import HttpResponse

async def current_datetime(request):
    now = datetime.datetime.now()
    html = '<html><body>It is now %s.</body></html>' % now
    return HttpResponse(html)
Seems like the example I was thinking of. Talking to a DB or doing HTTP requests are IO operations, so instead of blocking the process, Django can now start processing another request. When the IO operation is done, it continues where it stopped.
I don't know much about Python, but in a typical PHP/Apache setup, Apache simply starts as many processes as needed to answer all requests.
Roughly, with asyncio, I would think it would look like this:

async def my_view(request):
    # start both IO operations and wait for them concurrently
    name, city = await asyncio.gather(get_name(), get_city())
    return HttpResponse('Hello ' + name + ' from ' + city)