How strong an advantage that will be in the long run is uncertain. I would rather see a web that ships pages with actual content in them than one that ships empty containers, for a variety of reasons (most of which have to do with accessibility and the fact that not all clients are browsers, or even capable of running JavaScript).
This 'new web' is going off in a direction that is harmful; coupled with the mobile app walled gardens, it is turning back the clock in a hurry.
I'm fairly sure this is not the web that Tim Berners-Lee envisioned.
It may be that I encounter some of these "modern" pages without knowing it because some dev has put in the time to make them work well, but it seems that the vast majority are absolutely terrible. It's a small decrease in developer effort for a huge decrease in user satisfaction. I remember the terrible, terrible #! days of Twitter with sadness.
Some developers have a tendency to go for the internally sophisticated/beautiful in preference to the best experience for the user. I hope that blog posts like this one don't let loose these developers' worst tendencies.
Google doesn't really have competitors in search, they just don't. I mean look at how we even define 'searching the internet' in our spoken language: 'google it'. Google has become the identity of search on the web in most people's minds. And to top it off, they are really, really good at it. Google getting better is going to widen the gap between them and everyone else, but the gap is already pretty damn wide. Was there really any chance of someone catching them in any foreseeable future?
But the wider that gap gets, the more motivation there is not to attack the gap directly, but to go in a different direction altogether. Nobody talks about DuckDuckGo because their search results are better than Google's; they talk about it because DuckDuckGo is all about your privacy. They found a different way to make a search engine that people want to use. The wider that gap gets, the more motivated some will be to try something truly novel to compete with Google.
'Nobody' is such a strong word.
http://devblog.avdi.org/2014/02/16/why-duckduckgo-is-better-...
So? None of these are convincing arguments for application developers. Having to rewrite an application to perform all its UI logic on the server side in addition to client side is a lot of work, for almost no benefit to the people paying to make the application.
As a user, as long as URLs work so I can send locations to other people, most of my accessibility scenarios are solved.
The rest is simply a lack of technology in other clients.
When you receive a GET request for a URL, and the browser tells you it accepts text/html, it is expected that you answer with the content stored at that URL in the requested format. It is not expected that you answer with an application that, when run, will eventually produce the content.
The correct way to do what this post is saying is to create a new mime type for this content delivery method. Then, if the browser actively tells you it accepts that content type, deliver it.
What the OP proposes is not text/html. It's something else.
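As a rough sketch of what that would look like on the server (my own illustration, not from the article - using Node's built-in http module, with 'application/x-spa-bootstrap' as a made-up name for such a mime type):

    var http = require('http');

    http.createServer(function (req, res) {
      var accept = req.headers['accept'] || '';
      if (accept.indexOf('application/x-spa-bootstrap') !== -1) {
        // The client explicitly said it accepts the app-shell format,
        // so it is fine to answer with an application that renders the content.
        res.writeHead(200, {'Content-Type': 'application/x-spa-bootstrap'});
        res.end('<script src="/app.js"></script>');
      } else {
        // Default case: the client asked for text/html, so answer with the content itself.
        res.writeHead(200, {'Content-Type': 'text/html'});
        res.end('<h1>Article title</h1><p>The actual article body...</p>');
      }
    }).listen(8080);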
Several well-known web search engines are now defunct or have switched to a meta-search business (like Yahoo, which uses Bing data).
There are only a few international/world-wide search engines with a crawler:
Google, Bing, Yandex, Baidu, Gigablast, (Archive.org/Wayback Machine)
The gap this will really widen is the one between sites that do the necessary work themselves, and those who don't.
There are free and open source tools available that would help search engines parse pages containing JS (PhantomJS comes to mind).
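For example, a crawler could shell out to something like the following minimal PhantomJS script (my own sketch; the one-second wait for client-side rendering is an arbitrary simplification):

    // render.js - run as: phantomjs render.js http://example.com/some-page
    var system = require('system');
    var page = require('webpage').create();
    var url = system.args[1];

    page.open(url, function (status) {
      if (status !== 'success') {
        console.log('Failed to load ' + url);
        phantom.exit(1);
      } else {
        // Give the page's JavaScript a moment to populate the DOM,
        // then dump the rendered HTML for the indexer to parse.
        window.setTimeout(function () {
          console.log(page.content);
          phantom.exit(0);
        }, 1000);
      }
    });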
1. Google still needs a URL-addressable "PAGE" to which it can send Users.
2. This "PAGE" needs to be find-able via LINKS (javascript or HTML) and it needs to exist within a sensible hierarchy of a SITE.
3. This "PAGE" needs to have unique and significant content visible immediately to the user, and on a single topic, and it needs to be sufficiently different from other pages on the site so as not to be discarded as duplicate content.
URLs for single-page applications are a serialization of application state. The fact that we now have an application platform (JavaScript/HTTP) providing sharable, mostly-human-readable state sharing (URLs) and is also indexed and searchable is nothing short of incredible.
Yes, the basic abstractions we use are the same. We will have URLs that address content in our applications. But now these are applications running on Google's own servers. Google is running my application (and hundreds of thousands more), and trying to understand what they mean to humans. This is a pretty amazing step forward.
Imagine Apple announcing it would run all iOS applications, interacting like a user to build a search index. IMO, this parallel shows what makes Google's commitment to running JavaScript apps exciting.
With every new capability from Googlebot comes new opportunities for us to screw it up as developers.
If we were to replace PAGE with URL, and URL is simply a serialization of application STATE, we could easily end up with infinite URLs that lead to STATES that are not really that different, unique or appealing as answers to queries users type into Google.
When deciding how to build Search-accessible Web Apps, and specifically what to expose to Google, we need to keep in mind that Google likes PAGES that follow the requirements I detailed above.
This is also very beneficial for Google, as they'll likely be the only company doing it for a while, and the one able to do it for the most sites for a long time to come, maintaining Google's search index lead.
It might be good for deep-linking to change the URL every time any type of state change is made (for example, sorting a list by date instead of name) - but exposing that many URLs to Google would be bad.
(This is the modern equivalent of the age-old "infinite calendar" problem that Googlebot had to deal with when dynamic calendar apps let you navigate to dates 2 millennia in the future.)
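One way to square those two goals, sketched below with made-up element and parameter names: push UI state such as sort order into the URL so users can deep-link to it, while pointing crawlers at a single canonical version of the page. (Serving the canonical link in the initial HTML is safer than injecting it with JS, which Google may or may not honour.)

    // Serialize the sort state into the URL so users can share or bookmark it.
    document.querySelector('#sort-by-date').addEventListener('click', function () {
      renderList('date');                        // hypothetical client-side re-render
      history.pushState({sort: 'date'}, '', '?sort=date');
    });

    // Tell crawlers that every sort variant is the same logical page.
    var canonical = document.createElement('link');
    canonical.rel = 'canonical';
    canonical.href = location.origin + location.pathname;
    document.head.appendChild(canonical);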
Create web pages, rank as a web page.
This is a band-aid by Google. Developers created inaccessible websites (JS-only, no HTML fallback) and Google still wanted to give those sites a chance to be in the index. Like when Google made it possible to index text inside .swf movies. This did not mean that flash sites suddenly ranked alongside accessible websites. No, it only meant that you could now find content with a very targeted search query.
Don't think you are gaining any SEO-benefit from one-page JS-only applications, just because Google made it possible for you to start ranking.
And don't forget your responsibility as a web developer to create accessible content. Forgetting progressive enhancement, fallbacks, a noscript explanation of why you need JS, and ARIA is devolution. If Google can index your site, but a blind user has a problem with your bouncy Ajax widget, then you have failed to cater to all your users. If you lazily let Google repair your mistakes, then soon you will be a Google-only website.
There is no evidence that Google is going to punish my website for being rendered with JavaScript, as you imply with your first two comments.
Google is indexing the HTML generated by JavaScript, and the links in that HTML. Not some non-web custom format like SWF.
JavaScript-driven sites work just fine with modern screen readers: https://developer.mozilla.org/en-US/docs/Web/Accessibility/A... and in 2014, 97.6% of screen-reader users had JavaScript enabled: http://webaim.org/projects/screenreadersurvey5/#javascript
In 2013, 92 out of 93 visitors to a UK government webpage supported JavaScript: https://gds.blog.gov.uk/2013/10/21/how-many-people-are-missi... And mixed into that 1.1% were users getting broken JS, behind firewalls, disabling JS, etc.
Google making this change does not force you to build a JavaScript-driven website, but it does make it more attractive.
Accessibility is not a numbers game. In many countries it is a legal requirement. And adhering to the WCAG means providing non-JS fallbacks or progressive enhancement. RMS not being able to access your content is an accessibility issue too; it does not have to involve a disability. It can be technical in nature, like disabled JS, a corporate firewall, or a browser that does not support pushState.
If you want to look at stats, take a look at the stats and surveys on the accessibility of dynamic web applications. Just because your screen reader supports JavaScript does not mean you have no accessibility issues due to JavaScript. Rich internet applications should use WAI-ARIA. I don't think people who create websites without a fallback (avoiding this issue entirely) will worry about creating websites with ARIA support. And if they do care about such accessibility, they should also provide a non-ARIA, non-JS fallback.
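To make that concrete, here is the kind of minimal ARIA plumbing a dynamic widget needs on top of merely working JavaScript (a sketch with hypothetical element IDs): announce injected content through a live region instead of assuming the screen reader noticed the DOM change.

    var results = document.getElementById('search-results');
    results.setAttribute('role', 'region');
    results.setAttribute('aria-live', 'polite'); // screen readers announce updates here
    results.setAttribute('tabindex', '-1');      // allow the region to receive focus

    function showResults(html) {
      results.innerHTML = html;  // the visual update
      results.focus();           // move keyboard and screen-reader users to the new content
    }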
Google making this change makes it possible to have your non-fallback JS-only application be indexed. It does not make it more attractive from an SEO or accessibility viewpoint.
No one is expecting to get any SEO benefits that "normal" pages don't have. We are expecting to get the same chance of ranking as normal pages.
You mentioned that single page apps might rank differently or worse than normal pages. Do you have any source for that? (A source that is current, since Googlebot's improvements are quite new).
Then you should probably adjust this expectation. You say in your article:
>While having this sort of HTML fallback was technically possible, it added a lot of extra work to public-facing single page apps, to the point where many developers dropped the idea...
A JS-driven site with an HTML fallback is a normal page. Then you don't need any tricks, or to force Google to run your application and hope it makes pages out of it. Start with the fallback and enhance.
This is a serious mistake with consequences. The Tor bundle and Firefox ship with JavaScript enabled because disabling JS broke too much of the current web. JS-only sites cause accessibility issues (remember when Twitter changed to hash-bang URLs?), if not for Googlebot, then for regular users. From the Webmaster Guidelines:
>Following these guidelines will help Google find, index, and rank your site.
>Use a text browser such as Lynx to examine your site, because most search engine spiders see your site much as Lynx would. If fancy features such as JavaScript, cookies, session IDs, frames, DHTML, or Flash keep you from seeing all of your site in a text browser, then search engine spiders may have trouble crawling your site.
>Make pages primarily for users, not for search engines.
I am still going on the assumption that you created a one-page application without a time-consuming fallback, and that you rely on Google to make rankable pages from it. That leaves some users out in the cold, so why should it deserve to rank equally with a user-friendly, accessible web page?
> ... single page apps might rank differently or worse than normal pages ...
From the original article, the most current source on this:
> Sometimes things don't go perfectly during rendering, which may negatively impact search results for your site.
> It's always a good idea to have your site degrade gracefully. This will help users enjoy your content even if their browser doesn't have compatible JavaScript implementations. It will also help visitors with JavaScript disabled or off, as well as search engines that can't execute JavaScript yet.
> Sometimes the JavaScript may be too complex or arcane for us to execute, in which case we can’t render the page fully and accurately.
> Some JavaScript removes content from the page rather than adding, which prevents us from indexing the content.
In the SEO community Googlebot's improvements were noted for a while now. See for example: http://ipullrank.com/googlebot-is-chrome/
Single-page websites, or application-as-content websites, are not popular among SEOs. One reason is that they don't allow fine-grained control over keyword targeting or keeping the site canonical, and they can waste domain authority when you have fewer targeted pages in the index than you could rank for. Experiment and find out for yourself.
If you love your users, give them HTML and let the Javascript enhance it.
Projects like Facebook's React ( http://facebook.github.io/react/docs/top-level-api.html#reac... ) and Rendr (https://github.com/rendrjs/rendr/) let you use server rendering as well as the single page technologies on the client side.
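As a rough illustration of the React side of that (the exact API names have moved around between React versions, and the Article component is assumed): render the same component to an HTML string on the server, then let the client-side bundle take over the already-rendered markup.

    // server.js (Node) - shares the Article component with the client bundle
    var React = require('react');
    var ReactDOMServer = require('react-dom/server'); // older releases exposed React.renderToString
    var Article = require('./article');

    function renderPage(props) {
      var html = ReactDOMServer.renderToString(React.createElement(Article, props));
      return '<!doctype html><html><body>' +
             '<div id="root">' + html + '</div>' +   // real content in the initial response
             '<script src="/bundle.js"></script>' +  // client-side code enhances it afterwards
             '</body></html>';
    }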
Sure, that requires jQuery, but most "normal" sites require jQuery, too.
And now your poor little mobile user has to wait for the page to download, then to execute javascript, then to wait for the API response, then wait for the DOM to update.
Or you could put it in the HTML, and then they just have to wait for the page to download.
You have two choices when taking an SPA approach:
1) Give your users a webpage that is usable, readable, and navigable as soon as they get it.
2) Give your users the skeleton of a webpage that they then have to accept the latency of javascript parsing and execution, and then the latency of data fetches.
We have the ability to do #1.
I just launched a website. It's a weekly periodical with political analysis, word-count on articles 1500-6000. It needs to carve up the content in a few different ways (categories, issue numbers etc), decorate an article with links to other relevant content, and provide a nice CMS for non-tech people to use. So it's on Django, with the regulation sprinkling of JQuery. (If it were only techies updating it, you could probably do it with a static site generator...)
To me, the idea that you'd try and force something that is plainly a big collection of pages into a 'single page' is just philosophically bizarre, like printing Moby Dick on a square mile of paper, using some amazing origami skills to present it to the reader, all in order to save a bit of effort at the paper mill.
The googlebot business is one aspect of a bigger issue, which is that a website needs to be consumable by a host of different clients. I don't see how you can do the SPA thing without making major assumptions about those clients.
Sometimes, of course, those assumptions can be justified - it depends on the job. And Angular etc are enormously fun to play with, and handled well can enable a great UX for certain jobs. But I don't think it's 'the future'. It's another tool in the box.
Relevant here, a nice talk by John Alsopp from Full Frontal 2012:
https://www.youtube.com/watch?v=KTqIGKmCqd0
EDIT: clarification
A simple guide can be found here: https://developers.google.com/webmasters/ajax-crawling/. Although I suspect it needs to be updated, since it's from 2012.
If you create an application, make sure it alters the URL when applicable. For simple apps, the following repos will be useful:
The old way, that still works: https://github.com/asual/jquery-address
The better way, preferred: https://github.com/browserstate/history.js or https://github.com/defunkt/jquery-pjax - not sure which is better, to be honest. Feel free to chime in.
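If you would rather not pull in a library at all, the underlying History API is small enough to use directly. A bare-bones sketch (loadContent is a placeholder for your own fetch-and-render function):

    // Navigate without a full reload, keeping the URL in sync with the view.
    function navigate(url) {
      loadContent(url);                       // placeholder: fetch and render the new view
      history.pushState({url: url}, '', url); // make the state shareable and bookmarkable
    }

    // Handle the back/forward buttons the same way.
    window.addEventListener('popstate', function (e) {
      if (e.state && e.state.url) loadContent(e.state.url);
    });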
If your users (I'm talking humans, not bots) have to download a mountain of JavaScript and execute it before seeing any content, your site is slower than it could be for everyone. We should stop saying that "single page apps", i.e. sites rendered in JavaScript in a browser, are bad because they can't be scraped by a bot. They are bad for EVERYONE who wants to view the site because of the network and CPU time it takes to download the assets and render the site in the browser.
There is some strange logic there.
And I'm not sure where you're getting that it's "easier to program" single-page apps than it is to simply render HTML on the server. The fact that it's not easy is the very reason we have so many competing front-end frameworks trying to solve the problem elegantly.
As for speed: Sure there are exceptions when developers go crazy with tons of heavy JS, but that doesn't need to be the case at all.
I find the cheerleading for single page websites disconcerting and the proposed benefits unconvincing. Why should this be the default way to build websites? A few desultory upsides are presented without a full consideration of the multiple downsides to client-side development.
The biggest advantages of thick-client architecture are sending less data to the client and, if you like using javascript, writing everything in JS. But there are multiple downsides compared to more traditional thin-client websites: load times that depend more on client capabilities (hugely variable and out of your control) than on servers, dependence on JS on the client, pages that load before your content is placed in the DOM by JS, forcing everyone to write JS instead of switching language on the server whenever they like, and ignoring the simple document model of HTML served at predictable URIs, which has served the web so well and means you can use dynamic or static documents, cache full documents for very quick serving and by intermediaries, etc., etc. Of course some of these can be overcome, but there are serious obstacles, and the advantages are meagre to non-existent unless you enjoy javascript and feel it's the only language you'll ever need.
For someone who doesn't like working in JS, and/or doesn't have a huge amount of logic already in JS (many websites work just fine with some limited Ajax), trying to force every website into the Procrustean bed of client-side development is not an appealing prospect. I can see why it appeals to those who have already invested in JS frameworks, but predictions of its future dominance on the web, like predictions that eWorld, ActiveX or mobile would replace the web, are overblown.
I suspect the birth and death of Javascript will be a footnote in the history of the web, rather than taking it over as this article suggests. If anything we should be looking to replace our dependence on js, not making it mandatory.
I really, really dislike this approach to building websites; it makes the code and the design hard to understand, and it causes a lot of problems when it comes time to debug or make major changes.
There is no magic when using javascript: it will slow down the client, and manipulating the DOM is very slow. Doing things server-side costs nothing compared to javascript. Remember that loading a webpage is very fast when you have only CSS and HTML, because it is very easy to cache and to apply some pretty nice optimizations to.
With frameworks, making a "webapp" becomes a huge nightmare: things become overly complex and bloated, the request has to pass through a lot of layers before ending up somewhere, and development is not faster than writing custom code. Good frameworks don't make good programmers.
When the article says "put the CSS inside a <style> tag on the page - and the JS inside a <script> tag", it's just horrible, fuck it.
Who says the page will become bloated with JS just because you use clientside loading? The mechanism I'm talking about can be done with something like 10 lines of JS or less. No one's saying you have to use AngularJS with every web page you make.
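Something in this ballpark is all I mean - a bare-bones client-side loader, give or take error handling (the container id and JSON endpoint are invented for the example):

    function loadPage(slug) {
      var xhr = new XMLHttpRequest();
      xhr.open('GET', '/api/pages/' + slug + '.json');
      xhr.onload = function () {
        var page = JSON.parse(xhr.responseText);
        document.getElementById('content').innerHTML = page.html; // swap in the new content
        document.title = page.title;
      };
      xhr.send();
    }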
> When the article says "put the CSS inside a <style> tag on the page - and the JS inside a <script> tag", it's just horrible, fuck it.
First of all, this is not a requirement of single page apps, just an option. Secondly, when you're developing, you would still have your JS and CSS in separate files. Your compiler would then minify the whole thing and put it inside your minified HTML file.
In any case, if you want a solution to this SEO problem now, I created SnapSearch (https://snapsearch.io).
There is a lot more to it. I am pretty well known as an SEO, and while I would love this to be true it isn't.
Google's improvements to Googlebot are mostly targeting spam and obfuscation of content. The idea is not to discover content so much as to avoid having content hidden.
Previously you could have a webpage that appeared to be about puppies, but then used any number of dynamic methods to instead show naked cam girls. Google worked to fix this.
Google is now doing some indexing of named anchors, and this allows for linking to a page within a page, as it were. But that is a long way from building indexable single-page applications.
-Brandon Wirtz SEO (formerly Greatest Living American) http://www.blackwaterops.com
1. The state of JavaScript-only application development is still nascent. The number of JS-only sites I see that are buggy, don't use PushState correctly, or have other shortcomings is growing faster than the overall trend. Not that it can't be done well, but if your JavaScript-only "app" is really just a standard website, you might want to re-think your approach.
2. There has to be a better way. There are distinct benefits to approaches in caching content and providing feedback, but JavaScript seems to be a kluge-y approach. It reminds me of frames back in the day. Some of this is browser support; some of this is lack of standardization; some is perhaps a missing piece of the HTML spec; etc.
A single page app isn't JS-heavy by definition, and a "normal" page (with HTML generated on the server) can easily be JS-heavy. It all depends on how you program it. Just keep in mind that single page apps don't necessarily need to use heavy frontend frameworks such as Knockout, Ember or AngularJS.
As far as I can see, throughout this thread the performance benefits of single page apps are touted as being fantastic, making it worthwhile to use the new technology etc.
When has performing operations efficiently ever been the domain of the web? Websites in my experience have the worst performance of almost any software I use! I've seen developers cite 200ms or longer to load a page as being a good benchmark - that seems pretty awful to me.
If getting this tiny performance improvement (which often results in poorer performance on the first load (not ideal for many)) is so critical, why do the same developers not invest in writing more performant server apps? Yes, often the database is a bottleneck, but these problems can in general be worked around (either by use of faster queries or caching etc).
Why attempt to get a small performance benefit by saving 30-odd kB of HTML on each page load (static and so essentially free for the server), when one could get a much larger performance benefit by optimising the backend?
Almost all serious sites will still see their page load being limited by the time it takes to produce the page. It's possible to write really fast websites (try http://forum.dlang.org/) but no one seems to do it :(
Unless almost all of your website is static, you won't be saving all that much time.
Single page apps can easily be static (static HTML page + static JSON). The point of this would be to decrease the download size for each new page visited by the user.
Some sites obviously inline CSS or JavaScript, but that can be eliminated if necessary (and only affects the first page load anyway).
This information is free to generate on the server side, so it's not slowing down that computation at all (it's just a stringbuilder function, essentially). Furthermore, the transfer time is generally not the deciding factor - it's the server side time to put the rest of the information together.
To give one example, I went to a typical website - the Guardian (a fairly standard high-traffic news website). Chrome informs me that in order to request one article, it took 160ms to load the HTML - 140ms of waiting and 20ms of downloading. Now, the RTT is about 14ms, so that's about 110ms of generating the web page and 20ms of actually downloading it. It's about 30kB of compressed HTML (150kB uncompressed), and most of it is 'static content' - inlined CSS and JS.
If they used the single-page model, it would reduce the page download time (apart from the first page) by an absolute maximum of 20ms - which means that the time to load each page would be reduced by about 12%.
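Spelling out that arithmetic (assuming the waiting time includes roughly two round trips before the server starts responding):

    // All times in milliseconds, from the Guardian example above.
    var waiting = 140, downloading = 20, rtt = 14;
    var total   = waiting + downloading;   // 160ms to load the HTML
    var server  = waiting - 2 * rtt;       // ~112ms, the "about 110ms" of page generation
    var saving  = downloading / total;     // 0.125, i.e. the ~12% best case from skipping the HTML download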
This is fine, but almost all of the data is just the result of string concatenations and formatting - i.e. free processing (or at least almost-free processing). It's getting the rest of the data together that's somehow taking the 100ms (or crap implementations).
The cost of moving data around on websites is typically small compared to the actual production time of the content. That's why we see people preferring to inline huge amounts of CSS etc on each web page and having people download it time after time - because it's only about 10kB compressed the data transfer is inconsequential, and normally is dominated by the RTT.
Spending all the time writing these frameworks because of performance benefits is a fallacy - the data still has to be generated somewhere, and if it happens dynamically it's slow as hell. The savings can never become that great - at most they lead to 20-30ms of improvements if bandwidth is acceptable.
Writing the frameworks because they make development easier is a much more reasonable argument.
This all still distracts from the fact that non-static websites are typically dog slow, and they shouldn't be.
Reloading via JS fails silently when you are on a bad Internet connection, like the one I am on here in Nepal. You cannot simply do a reload, as you can with a simple HTML page.
Facebook, for example, is unusable here because of the same issue. It only works when you request Facebook's mobile site in the browser.
So please, get rid of that damn JS, if you care about your user base and usability.
BTW: that problem also happens on bad hotel Wifi in the US.
Either you have a highly interaction-heavy web-app, where it makes sense to execute most of the code on the client and deliver the content as JSON or you have a content-heavy website, where it makes sense to deliver cached content to the client.
There are some apps in between, which are highly interactive and content-heavy, like the web versions of social apps. For them, the additional question arises of whether they want to be crawled by Google, and whether Google wants to index their content.
To profit from this, you need an app whose content users search for and interact with several times after they have found it. So I guess "revolutionize" seems a bit much to me.
We shouldn't let one company, google, dictate how the web works, simply because of their proprietary technological innovation.
My man, if we only used infrastructure and technology in the way it was originally intended and narrowly imagined, the world would be a dim place.
But, have Yahoo or Bing or DuckDuckGo made the transition to be able to crawl the web with a full JS & DOM rendering engine? I doubt it. By eschewing that compatibility we're setting a very high bar for what any competitor to google would have to achieve.
I like google. I just don't think it's good to have one company own a market so completely.
That said, I'd rather see more server-side HTML instead of client-side JS when it comes to the web. If you're developing a game, client-side JS is fine. If you're serving textual content with the odd image, please serve it in the way the web was won: HTML. Use CSS if you feel the need for some 'style', but remember that perfection is reached when there is nothing left to take away, rather than nothing more to add.
What? This article is about how a constraint due to a single company (and to a lesser extent, other search engines) is being _relaxed_. Before this, single-page apps were a riskier proposition because of many sites' reliance on Google traffic. Now that we're moving closer to that technical limitation being overcome, this is one _less_ constraint "dictated" by Google that site owners have to deal with.