Maybe I'm not objective because I'm Dutch myself, but from both a user-facing and technical perspective I think the Dutch dashboard is by far the best corona dashboard in the world. It's very fast, has a lot of detailed visualizations, provides a lot of context, and has a fair amount of accessibility features.
The Dutch website seems to spend a lot of its load time running the Next.js framework, which the Gov.uk variant does not. It might work quickly on fast computers, but even on modern phones it seems to visibly pause.
https://dvhn.nl/groningen/Meer-ziekenhuispati%C3%ABnten-blij...
When the hospitals feel like it, they test patients who are already in the hospital for something else to see if they have COVID. And when they don't feel like it, they don't. Any patient found to have COVID is added to the graph. So these numbers, and also derived numbers such as the R value, are statistically useless and vulnerable to manipulation.
Some example queries issued by the dashboard: https://github.com/publichealthengland/coronavirus-dashboard... https://github.com/publichealthengland/coronavirus-dashboard... https://github.com/publichealthengland/coronavirus-dashboard...
“At the time of writing, the Citus distributed database cluster adopted by the team on Azure is HA-enabled for high availability and has 12 worker nodes with a combined total of 192 vCores, ~1.5 TB of memory, and 24 TB of storage. (The Citus coordinator node has 64 vCores, 256 GB of memory, and 1 TB of storage.)”
That’s beyond overkill for something that as you say could be generated statically a couple of times a day.
E.g. 12 worker nodes and 192 vCores means they've picked 16 core nodes. 1.5TB of memory across 12 nodes means 128GB per node. 24TB of storage is just 2TB per node.
So it's 12 relatively mid sized servers/VMs.
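That back-of-the-envelope division can be sketched directly (cluster figures taken from the quoted paragraph):

```python
# Per-node breakdown of the quoted Citus worker cluster.
WORKERS = 12
TOTAL_VCORES = 192
TOTAL_MEMORY_GB = 1536    # ~1.5 TB across the cluster
TOTAL_STORAGE_TB = 24

vcores_per_node = TOTAL_VCORES // WORKERS        # 16 vCores each
memory_per_node_gb = TOTAL_MEMORY_GB // WORKERS  # 128 GB each
storage_per_node_tb = TOTAL_STORAGE_TB // WORKERS  # 2 TB each

print(vcores_per_node, memory_per_node_gb, storage_per_node_tb)
```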
They could certainly do it with much less, and I have no interest in looking up what 12 nodes of that spec would cost on Azure, but at Hetzner it'd cost less than 1500 GBP/month including substantial egress. At most cloud providers the bandwidth bill for this likely swamps the instance cost, and the developer cost to develop this is likely many times the lifetime projected hosting cost even with that much overkill.
If they happen to have someone familiar with query caching and CDNs, I'm sure they could cut it significantly very quickly, and even an entirely average developer could figure out how to trim that significantly over time. But even at (low) UK government contract rates it's not worth much time to try to trim a bill like that much vs. just picking whatever the developers who worked on it preferred.
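A first pass at that sort of caching is trivial to sketch. Below is a minimal in-process TTL cache in Python (all names and the 300-second TTL are made up for illustration); a CDN does the same thing one layer up, keyed on the URL, and for a dashboard updated a few times a day even a short TTL collapses almost all database reads:

```python
import time
from functools import wraps

def ttl_cache(seconds):
    """Cache a function's results for a fixed time window."""
    def decorator(fn):
        cache = {}
        @wraps(fn)
        def wrapper(*args):
            now = time.monotonic()
            hit = cache.get(args)
            if hit is not None and now - hit[0] < seconds:
                return hit[1]  # still fresh: skip the expensive call
            result = fn(*args)
            cache[args] = (now, result)
            return result
        return wrapper
    return decorator

calls = 0

@ttl_cache(seconds=300)
def expensive_query(area):
    """Stand-in for a database round trip."""
    global calls
    calls += 1
    return {"area": area, "cases": 123}

expensive_query("london")
expensive_query("london")  # served from cache; no second "DB" hit
```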
That would require actual work instead of selling an overpriced generic solution.
As for using the setup for other things, that seems less likely given this expensive setup.
Hell, let's do some partial evaluation: just bake the computed HTML into the source code and recompile that a few times a day. No need to even read from a file when you can fetch it from rodata.
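In Python terms (the dashboard's backend language, per its GitHub repo), that "bake it in and recompile" step could be as simple as generating a module at build time; the file name and markup below are made up for illustration:

```python
import tempfile
from pathlib import Path

def bake(html: str, out: Path) -> None:
    """Write the rendered dashboard HTML into a Python module, so
    serving it is a constant in-memory lookup with no file or DB I/O."""
    out.write_text("HTML = %r\n" % html)

# Build step, rerun a few times a day when the data updates:
module = Path(tempfile.mkdtemp()) / "baked_page.py"
bake("<h1>Daily cases: 42</h1>", module)

# Simulate what `import baked_page` would give the serving process:
namespace = {}
exec(module.read_text(), namespace)
page = namespace["HTML"]
```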
As for the reason why they did it this way, I assume it's a combination of CV-driven development along with the hackernoon-reading-junior-engineer-meets-cunning-salesperson effect which others have noted.
Alternatively, we're building https://www.polyscale.ai/ that is a good fit for this type of use case. It's a global database cache and integrates with Postgres/MySQL etc. We host PoP's globally so the database reads are offset and local to users.
Agree with the other comments in that this feels like a shiny use case to quote to other prospects, but all good :)
My guess is that this was web people who were contracted to build a read-only, daily-updated dashboard rather than an interactive web app, so they treated it as just another web app, scaled up.
I built a one-pager vanilla JS site that polls the official Johns Hopkins aggregated data daily, displays dynamically generated smoothed moving-average charts, performs curve-similarity analysis to identify similar patterns in different countries, and performs logarithmic regression to depict current doubling/halving times.
This happens entirely on the client side, with no server side component whatsoever (other than the http server to deliver the static HTML&JS that does all the work). See https://covid-19-charts.net/
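The two core calculations that comment describes, a smoothed moving average and a doubling time from exponential growth, are each a few lines; here is a rough Python equivalent of the client-side JS (the case numbers are illustrative, not real data):

```python
import math

def moving_average(values, window=7):
    """Trailing 7-day moving average, the usual smoothing for
    noisy daily case counts."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]

def doubling_time(daily_growth_rate):
    """Days for cases to double at a constant daily growth rate;
    log-linear regression reduces to estimating this rate."""
    return math.log(2) / math.log(1 + daily_growth_rate)

cases = [100, 110, 121, 133, 146, 161, 177, 195]  # made-up 10%/day growth
smoothed = moving_average(cases)
```

At 10% daily growth the doubling time works out to about 7.3 days, which is why smoothed charts and doubling times together give a quick read on trajectory.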
On the other side, I have the feeling that this thing is clearly over-engineered. Just look at their data diagram... If I'm not wrong, there is one writer and multiple readers for the data, or at least multiple writers on one side and multiple readers on the other, without a need for "real time" consistency.
So this thing could probably have been split up better to avoid the need for "scaled" databases.
The article states it was written by Claire Giordano from San Francisco. Not sure where you got the UK Government official from.
To me it read like a b2b marketing piece and showcase. Kind of: We can power this, so we can power your BI dashboard as well.
Taking this into account it was a nice write up and from a data analyst's and consultant's pov interesting to read.
<<As a result, the GOV.UK Coronavirus dashboard became one of the most visited public service websites in the United Kingdom.>>
You don't expect the gov UK dashboard to be done by US consultants...
Maybe I'm naive, or not cynical enough, but I just read this as a case study of a customer using Azure to provide the general public with information in a robust fashion.
In fact, if anything, the whole article is remarkably light on pushing Azure, and quite heavy on architecture details.
The open source code (on GitHub) uses Postgres (not MSSQL) and Python (not C# or PowerShell), and in fact has a screenshot of JetBrains' PyCharm, not VS Code.
In fact it's probably quite an MS agnostic article.
Even though gov.uk is actually a really good IT company, I'm quite pleased that they're using "the cloud" rather than trying to create their own.
For anyone who's wondering, the relevant team here is GDS[0]. We hired a bunch of engineers from there at one of my previous companies - which was doing some quite gnarly technical work - and they were superb. I believe the US equivalent is 18F.
[0] The Government Digital Service in full, but no definite article for the initialism.
Also - I’ve been really impressed by the openness of the team actually doing the work - eg threads like https://twitter.com/pouriaaa/status/1476892793729654787
and in particular this analysis of debugging a problem that the dashboard encountered - which also gives a lot more background context: https://dev.to/xenatisch/cascade-of-doom-jit-and-how-a-postg...
I think the page looks inoffensive but is clearly focussed on being informative. I wish more data repositories took care and attention towards how data is represented.
https://insidegovuk.blog.gov.uk
The only downside is that they often send you to sites run by other, significantly less competent bodies (looking at you, student loans company).
All the fancy tech to get in your pockets. For everything else, go f*k yourself.
Most recently I found the DVLA license renewal was one of those ugly backwaters (albeit still fully online), but their license check code generator is great.
For real terrible stuff, check out local council websites.
I do think the UK and some other countries do a better job of presenting data compared to the CDC.
It's pretty much agreed that the rate at which unvaccinated people wind up in hospital beds is several times higher than for vaccinated people; however, all the CDC data presented is only rates. I want tallies or counts, and I cannot find them. For instance, on Ontario, Canada's site[1], the vaccinated are 74% vs. the unvaccinated's 26% of COVID hospitalizations. Most non-technical people think the unvaccinated share of COVID hospitalizations is over 90%. Because more and more people are vaccinated, even with a lower rate of hospitalization, their absolute numbers are higher. Also, it's interesting to see on the Ontario site that COVID hospitalizations consist of 56% admitted directly for COVID and 44% admitted for other reasons who then tested positive for COVID once hospitalized. The case is more telling for ICU, with 81% admitted for COVID and 19% for other reasons.
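That base-rate effect is easy to demonstrate with made-up round numbers (the 85% coverage and 4x rate ratio below are illustrative assumptions, not Ontario's actual figures):

```python
def hospitalization_split(pop, vax_coverage, rate_unvax, rate_ratio):
    """Split hospital admissions between vaccinated and unvaccinated.
    rate_ratio = how many times higher the unvaccinated rate is."""
    unvax = pop * (1 - vax_coverage) * rate_unvax
    vax = pop * vax_coverage * (rate_unvax / rate_ratio)
    total = unvax + vax
    return vax / total, unvax / total

# Illustrative only: 85% coverage, unvaccinated hospitalised 4x as often.
vax_share, unvax_share = hospitalization_split(
    pop=1_000_000, vax_coverage=0.85, rate_unvax=0.004, rate_ratio=4)
```

Even with a 4x higher per-person rate for the unvaccinated, the vaccinated end up as roughly 59% of admissions here, simply because there are so many more of them.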
I am trying to play with raw data more to refresh my munging skills than to make a point or add fodder to the COVID noise. I have been coding since 1978, played with neural nets, GAs, and GP in the late 1980s, but I don't code or do data analysis for a living right now (other than business strategy reports that require some basic analysis). There's a lot of data out there, and it can get very confusing. I am back to using R/RStudio after a brief stint using Julia/Pluto notebooks and previously Python/Jupyter notebooks. I even did a toy DSEIR model in J back in April 2020 based on previous work by a couple of people, which I plan on updating to April[2]. I am going to try to do some Lisp work, and I think I will settle on RStudio and Lisp for more genomic/bioinformatic stuff (yes, I know BioLisp has been supplanted by Python; however, Lisp is having a renaissance in symbolic-related areas of ML again, like NLP). BTW, in what language was GPT implemented? Not the API languages, but what PL(s) were used to create the code: C++, Java, Go?
I may be bad at navigating the CDC website, but I can't seem to get the dataset of numbers of hospitalizations by vaccination status, only rates or pre-filtered data. I do remember downloading raw data that seemed to have it (over 1.8gb, I think), but I can't seem to find it. I'd appreciate a link if anyone has it.
[1] https://covid-19.ontario.ca/data/hospitalizations#hospitaliz...
Joking aside, I liked the description of the dashboard, and generally speaking the UK government's Web sites are better quality, support open data more, and are easier to read and navigate than those of other European countries, from what I have seen. This includes this dashboard, which looks clean, simple and functional.
I was waiting for the big SQL Server advertising language and was positively surprised that the article is very tech agnostic. It did all seem to be rather over-engineered, but Microsoft needs to make some money and government agencies don't generally have wizards from HN working for them, so I can live with an occasionally over-engineered system as long as important systems are working and remain up.
The most mysterious part for me was why one would put JSON inside relational tables?
Cheap and easy way to permit a flexible schema for some part of the data. Performance tests probably showed that for their specific query workload, any slow down from parsing/lack of index was fine.
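A minimal illustration of that pattern (using SQLite here rather than Postgres, purely so it runs standalone; Postgres's `jsonb` type plus GIN indexes is the production-grade equivalent, and the table and column names are invented): fixed relational keys with a free-form JSON payload column.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE time_series (
        area_code TEXT,
        date      TEXT,
        payload   TEXT   -- JSON blob: metrics can vary row to row
    )
""")
# The set of metrics differs by area and date, so the JSON column
# avoids schema migrations every time a new metric is published.
conn.execute(
    "INSERT INTO time_series VALUES (?, ?, ?)",
    ("E92000001", "2021-12-01",
     json.dumps({"newCases": 42, "newDeaths": 1})),
)

row = conn.execute(
    "SELECT payload FROM time_series WHERE area_code = ?",
    ("E92000001",),
).fetchone()
metrics = json.loads(row[0])
```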
So while there's nothing wrong here with calling an MIT project open source, it's also not inconsistent with their own definition, and usable as propaganda.
[0] https://azure.microsoft.com/en-gb/services/developer-tools/d...
>Is Azure Data Studio open source?
>Yes, the source code for Azure Data Studio and its data providers is open source and available on GitHub. The source code for the front-end Azure Data Studio, which is based on Microsoft Visual Studio Code, is available under an end-user license agreement that provides rights to modify and use the software, but not to redistribute it or host it in a cloud service. The source code for the data providers is available under the MIT license.
This has been an argument for at least 25 years that I've been around this stuff.
Microsoft is defining their product, which you can't redistribute[0], as "Open Source".
[0] https://github.com/microsoft/azuredatastudio/blob/4012f26976...
According to Dominic Cummings (ex-adviser to the PM), this isn't true at all - one of their biggest failings early on was to not have the data and not see the priority in getting it.[1]
[1] https://news.sky.com/story/dominic-cummings-hearing-the-insi... : He added later that there was no data system at that point, and he needed to use his iPhone as a calculator to make predictions about the extent to which infections would spread, which he then wrote down on a whiteboard.
Tracking key health metrics and sharing those metrics with the public doesn't mean that there is modelling about the extent to which infections would spread - although we also know that the Imperial modelling was released a day later, so while he may have been using his iPhone to make predictions there were also academic teams modelling this that were collaborating with the government at the time (see https://www.imperial.ac.uk/news/196234/covid-19-imperial-res...).
It's also not clear what a 'data system' is in this context - there was clearly an effort to very quickly put something in place to capture data (because it couldn't wait a few weeks/months), but a more robust analytics system will inevitably take more than a few weeks to put in place if not already in place pre-pandemic (a lot of this is about how NHS trusts are structured in the UK, which operate fairly independently). It's not clear to me how quickly is realistic to implement what Dom thought was suitable in terms of a 'data system', particularly as I'm not particularly clear on his requirements (he seems to want an element of forecasting built into this system for instance?), so without knowing what the requirements are can we be confident that what he wanted to build was possible to build, test and implement in his expected timeline?
So I don't think there is a clear contradiction here (and in fact, I think the evidence points to the fact that the statement in the article is probably correct).
> "and to share those metrics with the public" Given stats on how many patients were in ICU didn't officially exist so you couldnt request them with an FOI request I'll let you work out how true this is. They wanted to control the data to craft a narrative to justify the report9 claims we'd have plague bodies in the streets because this is the end of days...
In early March 2020, they briefly announced that the daily COVID figures would move to a weekly cadence.
When locking down, they were still flying blind, and after that (during Hancock's 100k tests a day moonshot) there were leaks that "figures were being compiled in a notebook by calling round different labs".
On the data side, they have ~7.5 billion total records and they add ~55 million new ones a day. On the web side, they have ~1 million daily unique users and 100k concurrent users at peak ("concurrent" means "in one minute", it seems).
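For scale, those headline numbers work out to a sustained write rate in the hundreds of rows per second, and roughly four and a half months' worth of ingest in the current total:

```python
# Figures from the comment above.
TOTAL_RECORDS = 7_500_000_000
NEW_PER_DAY = 55_000_000

writes_per_second = NEW_PER_DAY / 86_400      # ~636 rows/s sustained
days_of_growth = TOTAL_RECORDS / NEW_PER_DAY  # ~136 days to reach the total
```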
I'm no expert on the web part, but I'm kind of curious why they went with the design they did for the data part. The design and the chosen technologies make me think they treated it more like a normal web app, not like a dashboard. I would expect an OLAP database, not sharded Postgres, and the data model feels very OLTP to me as well. Or maybe that's because it's mostly time series and not a traditional data model?
I'll have to go through the article in more detail.
OLTP stores are relatively bad at aggregating across a lot of data.
Analytics dashboards with many users, a lot of ever-changing data, and many different views exist in a gray area between OLAP and OLTP often referred to as real-time analytics or operational analytics. The queries are usually somewhat lighter / less ad-hoc / more indexed than in OLAP, but there can be hundreds or thousands of them per second with different filters and aggregations.
There are some specialized real-time analytics databases like Druid. Citus (used in the article) allows you to run such workloads at scale on PostgreSQL.
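The workload shape being described, many concurrent, narrowly filtered aggregations over time-series data, looks roughly like this sketch (SQLite standing in for a distributed Postgres/Citus cluster; table and column names are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE cases (
        area TEXT,
        day  TEXT,
        n    INTEGER
    )
""")
# The index is what makes this "real-time": each dashboard request
# touches one area's slice instead of scanning the whole table,
# which is the key difference from an ad-hoc OLAP scan.
conn.execute("CREATE INDEX idx_area_day ON cases (area, day)")
conn.executemany(
    "INSERT INTO cases VALUES (?, ?, ?)",
    [("london", "2021-12-01", 100),
     ("london", "2021-12-02", 150),
     ("leeds",  "2021-12-01", 30)],
)

# Typical per-request query: filtered + aggregated, lightly indexed.
(total,) = conn.execute(
    "SELECT SUM(n) FROM cases WHERE area = ? AND day >= ?",
    ("london", "2021-12-01"),
).fetchone()
```

Citus runs this same pattern, but with the table sharded by a distribution column (e.g. area) across worker nodes, so thousands of such queries per second fan out in parallel.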
I love it when websites have a simple text version.
[1] https://coronavirus.data.gov.uk/easy_read [2] https://coronavirus.data.gov.uk/
https://github.com/publichealthengland/coronavirus-dashboard...
It surprises me how much more popular F# is in Europe compared to the US. I finally got a professional F# gig in the states (\o/), but there were very few options. It makes me wonder, are universities in Europe providing a more functional-first approach to CS education, or is something else going on?
I'm all for Citus, but cmon. Overkill.
The information is presented clearly and it's easy to see what's going on, although in my case the main reason is the breakdown for Argyll & Bute, which isn't a focus area for the national ones!
Typical: this turned into a pro-cloud puff piece that frankly shows a serious amount of over-design for what should be a data filtering/processing step to any reasonable "data scientist". And if I'm having to say a data scientist could do it better, you know you got it wrong...
monsters?
As a programmer, I want to make demands about UX too, for a change ...
But then this would let you perform statistical analyses on _their_ data, I'm not sure they're such a big fan of that...
> awarded to Microsoft
Hey Europe, want to stop being several decades behind in IT compared to US/China?
One simple trick:
Ban FAANG from public procurement in Europe!
It‘s a no-brainer really.
Buy locally, ideally giving small companies and startups a chance.
You will have to do it anyway very soon if you want your privacy laws to be taken seriously.
There might be a couple of months of friction while bureaucrats have to find new procurement partners, but that's it.
And then the European tech scene will rise.
I don’t disagree with you, but if I were the CTO of a 10000+ seat organization, and a Microsoft/Google/etc told me they could provide email, storage, sharing/collaboration, office apps, security etc for a few bucks per user / month… that’s a pretty compelling deal.
Is it any better if SAP / Telekom get similar contracts (what usually happens in Germany)?
> Ban FAANG
I know exactly what you mean, but is that what we're doing now? Including Microsoft in FAANG but not changing the acronym?
It's built in-house by the government.
Those small companies and startups are going to end up using Microsoft/Amazon/Google for their hosting/cloud-services anyway so FAANG still win in the end.
We'll no doubt award yet more govt projects to the tech oligarchs of the west and praise students for using their toys...