How strong an advantage that will be in the long run is uncertain. I would rather see a web that ships pages with actual content in them than one that ships empty containers, for a variety of reasons (most of which have to do with accessibility and the fact that not all clients are browsers, or even capable of running JavaScript).
This 'new web' is going off in a direction that is harmful; coupled with the mobile app walled gardens, it is turning back the clock in a hurry.
I'm fairly sure this is not the web that Tim Berners-Lee envisioned.
It may be that I encounter some of these "modern" pages without knowing it because some dev has put in the time to make them work well, but it seems that the vast majority are absolutely terrible. It's a small decrease in developer effort for a huge decrease in user satisfaction. I remember the terrible, terrible #! days of Twitter with sadness.
Some developers have a tendency to go for the internally sophisticated/beautiful in preference to the best experience for the user. I hope that blog posts like this one don't let loose these developers' worst tendencies.
Google doesn't really have competitors in search, they just don't. I mean look at how we even define 'searching the internet' in our spoken language: 'google it'. Google has become the identity of search on the web in most people's minds. And to top it off, they are really, really good at it. Google getting better is going to widen the gap between them and everyone else, but the gap is already pretty damn wide. Was there really any chance of someone catching them in any foreseeable future?
But the wider that gap gets, the more motivation there is not to attack the gap directly, but to go in a different direction altogether. Nobody talks about DuckDuckGo because their search results are better than Google's; they talk about it because DuckDuckGo is all about your privacy. They found a different way to make a search engine that people want to use. The wider that gap gets, the more motivated some will be to try something truly novel to compete with Google.
'Nobody' is such a strong word.
http://devblog.avdi.org/2014/02/16/why-duckduckgo-is-better-...
So? None of these are convincing arguments for application developers. Having to rewrite an application to perform all its UI logic on the server side in addition to client side is a lot of work, for almost no benefit to the people paying to make the application.
As a user, as long as URLs work so I can send locations to other people, most of my accessibility scenarios are solved.
The rest is simply a lack of technology in other clients.
When you receive a GET request for a URL, and the browser tells you it accepts text/html, it is expected that you answer with the content stored at that URL in the requested format. It is not expected that you answer with an application that, when run, will eventually produce the content.
The correct way to do what this post is saying is to create a new mime type for this content delivery method. Then, if the browser actively tells you it accepts that content type, deliver it.
What the OP proposes is not text/html. It's something else.
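As a rough sketch of what that would look like on the server (my own illustration, not from the article - using Node's built-in http module, with 'application/x-spa-bootstrap' as a made-up name for such a mime type):

    var http = require('http');

    http.createServer(function (req, res) {
      var accept = req.headers['accept'] || '';
      if (accept.indexOf('application/x-spa-bootstrap') !== -1) {
        // The client explicitly said it accepts the app-shell format,
        // so it is fine to answer with an application that renders the content.
        res.writeHead(200, {'Content-Type': 'application/x-spa-bootstrap'});
        res.end('<script src="/app.js"></script>');
      } else {
        // Default case: the client asked for text/html, so answer with the content itself.
        res.writeHead(200, {'Content-Type': 'text/html'});
        res.end('<h1>Article title</h1><p>The actual article body...</p>');
      }
    }).listen(8080);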
Several well-known web search engines are now defunct or have switched to a meta-search business (like Yahoo, which uses Bing data).
There are only a few international/world-wide search engines with a crawler:
Google, Bing, Yandex, Baidu, Gigablast, (Archive.org/Wayback Machine)
The gap this will really widen is the one between sites that do the necessary work themselves, and those who don't.
There are free and open source tools available that would help search engines parse pages containing JS (PhantomJS comes to mind).
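For example, a crawler could shell out to something like the following minimal PhantomJS script (my own sketch; the one-second wait for client-side rendering is an arbitrary simplification):

    // render.js - run as: phantomjs render.js http://example.com/some-page
    var system = require('system');
    var page = require('webpage').create();
    var url = system.args[1];

    page.open(url, function (status) {
      if (status !== 'success') {
        console.log('Failed to load ' + url);
        phantom.exit(1);
      } else {
        // Give the page's JavaScript a moment to populate the DOM,
        // then dump the rendered HTML for the indexer to parse.
        window.setTimeout(function () {
          console.log(page.content);
          phantom.exit(0);
        }, 1000);
      }
    });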
1. Google still needs a URL-addressable "PAGE" to which it can send Users.
2. This "PAGE" needs to be find-able via LINKS (javascript or HTML) and it needs to exist within a sensible hierarchy of a SITE.
3. This "PAGE" needs to have unique and significant content visible immediately to the user, and on a single topic, and it needs to be sufficiently different from other pages on the site so as not to be discarded as duplicate content.
URLs for single-page applications are a serialization of application state. The fact that we now have an application platform (JavaScript/HTTP) providing sharable, mostly-human-readable state sharing (URLs) and is also indexed and searchable is nothing short of incredible.
Yes, the basic abstractions we use are the same. We will have URLs that address content in our applications. But now these are applications running on Google's own servers. Google is running my application (and hundreds of thousands more), and trying to understand what they mean to humans. This is a pretty amazing step forward.
Imagine Apple announcing it would run all iOS applications, interacting like a user to build a search index. IMO, this parallel shows what makes Google's commitment to running JavaScript apps exciting.
With every new capability from Googlebot comes new opportunities for us to screw it up as developers.
If we were to replace PAGE with URL, and URL is simply a serialization of application STATE, we could easily end up with infinite URLs that lead to STATES that are not really that different, unique or appealing as answers to queries users type into Google.
When deciding how to build Search-accessible Web Apps, and specifically what to expose to Google, we need to keep in mind that Google likes PAGES that follow the requirements I detailed above.
This is also very beneficial for Google, as they'll likely be the only company doing it for a while, and the one able to do it for the most sites for a long time to come, maintaining Google's search index lead.
It might be good for deep-linking to change the URL every time any type of state change is made (for example, sorting a list by date instead of name) - but exposing that many URLs to Google would be bad.
(This is the modern equivalent of the age-old "infinite calendar" problem that Googlebot had to deal with when dynamic calendar apps let you navigate to dates 2 millennia in the future.)
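One way to square those two goals, sketched below with made-up element and parameter names: push UI state such as sort order into the URL so users can deep-link to it, while pointing crawlers at a single canonical version of the page. (Serving the canonical link in the initial HTML is safer than injecting it with JS, which Google may or may not honour.)

    // Serialize the sort state into the URL so users can share or bookmark it.
    document.querySelector('#sort-by-date').addEventListener('click', function () {
      renderList('date');                        // hypothetical client-side re-render
      history.pushState({sort: 'date'}, '', '?sort=date');
    });

    // Tell crawlers that every sort variant is the same logical page.
    var canonical = document.createElement('link');
    canonical.rel = 'canonical';
    canonical.href = location.origin + location.pathname;
    document.head.appendChild(canonical);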
Create web pages, rank as a web page.
This is a band-aid by Google. Developers created inaccessible websites (JS-only, no HTML fallback) and Google still wanted to give those sites a chance to be in the index. Like when Google made it possible to index text inside .swf movies. This did not mean that flash sites suddenly ranked alongside accessible websites. No, it only meant that you could now find content with a very targeted search query.
Don't think you are gaining any SEO-benefit from one-page JS-only applications, just because Google made it possible for you to start ranking.
And don't forget your responsibility as a web developer to create accessible content. Forgetting progressive enhancement, fallbacks, a noscript explanation of why you need JS, and ARIA is devolution. If Google can index your site, but a blind user has a problem with your bouncy Ajax widget, then you have failed to cater to all your users. If you lazily let Google repair your mistakes, then soon you will be a Google-only website.
There is no evidence that Google is going to punish my website for being rendered with JavaScript, as you imply with your first two comments.
Google is indexing the HTML generated by JavaScript, and the links in that HTML. Not some non-web custom format like SWF.
JavaScript-driven sites work just fine with modern screen readers: https://developer.mozilla.org/en-US/docs/Web/Accessibility/A... and in 2014, 97.6% of screen-reader users had JavaScript enabled: http://webaim.org/projects/screenreadersurvey5/#javascript
In 2013, 92 out of 93 visitors to a UK government webpage supported JavaScript: https://gds.blog.gov.uk/2013/10/21/how-many-people-are-missi... And mixed into that 1.1% were users getting broken JS, behind firewalls, disabling JS, etc.
Google making this change does not force you to build a JavaScript-driven website, but it does make it more attractive.
Accessibility is not a numbers game. In many countries it is a legal requirement. And adhering to the WCAG means providing non-JS fallbacks or progressive enhancement. RMS not being able to access your content is an accessibility issue too; it does not have to involve a disability. It can be technical in nature, like disabled JS, a corporate firewall, or a browser that does not support pushState.
If you want to look at stats, take a look at the stats and surveys on the accessibility of dynamic web applications. Just because your screen reader supports JavaScript does not mean you have no accessibility issues due to JavaScript. Rich internet applications should use WAI-ARIA. I don't think people who create websites without a fallback (avoiding this issue entirely) will worry about creating websites with ARIA support. And if they do care about such accessibility, they should also provide a non-ARIA, non-JS fallback.
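To make that concrete, here is the kind of minimal ARIA plumbing a dynamic widget needs on top of merely working JavaScript (a sketch with hypothetical element IDs): announce injected content through a live region instead of assuming the screen reader noticed the DOM change.

    var results = document.getElementById('search-results');
    results.setAttribute('role', 'region');
    results.setAttribute('aria-live', 'polite'); // screen readers announce updates here
    results.setAttribute('tabindex', '-1');      // allow the region to receive focus

    function showResults(html) {
      results.innerHTML = html;  // the visual update
      results.focus();           // move keyboard and screen-reader users to the new content
    }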
Google making this change makes it possible to have your non-fallback JS-only application be indexed. It does not make it more attractive from an SEO or accessibility viewpoint.
No one is expecting to get any SEO benefits that "normal" pages don't have. We are expecting to get the same chance of ranking as normal pages.
You mentioned that single page apps might rank differently or worse than normal pages. Do you have any source for that? (A source that is current, since Googlebot's improvements are quite new).
Then you should probably adjust this expectation. You say in your article:
>While having this sort of HTML fallback was technically possible, it added a lot of extra work to public-facing single page apps, to the point where many developers dropped the idea...
A JS-driven site with an HTML fallback is a normal page. Then you don't need any tricks, or to force Google to run your application and hope it makes pages out of it. Start with the fallback and enhance.
This is a serious mistake with consequences. The Tor bundle and Firefox ship with JavaScript enabled because disabling JS broke too much of the current web. JS-only sites cause accessibility issues (remember when Twitter changed to hash-bang URLs?), if not for Googlebot, then for regular users. From the Webmaster Guidelines:
>Following these guidelines will help Google find, index, and rank your site.
>Use a text browser such as Lynx to examine your site, because most search engine spiders see your site much as Lynx would. If fancy features such as JavaScript, cookies, session IDs, frames, DHTML, or Flash keep you from seeing all of your site in a text browser, then search engine spiders may have trouble crawling your site.
>Make pages primarily for users, not for search engines.
I am still going on the assumption that you created a one-page application without a time-consuming fallback, and that you rely on Google to make rankable pages from it. That leaves some users out in the cold, so why should it deserve to rank equally with a user-friendly, accessible web page?
> ... single page apps might rank differently or worse than normal pages ...
From the original article, the most current source on this:
> Sometimes things don't go perfectly during rendering, which may negatively impact search results for your site.
> It's always a good idea to have your site degrade gracefully. This will help users enjoy your content even if their browser doesn't have compatible JavaScript implementations. It will also help visitors with JavaScript disabled or off, as well as search engines that can't execute JavaScript yet.
> Sometimes the JavaScript may be too complex or arcane for us to execute, in which case we can’t render the page fully and accurately.
> Some JavaScript removes content from the page rather than adding, which prevents us from indexing the content.
In the SEO community Googlebot's improvements were noted for a while now. See for example: http://ipullrank.com/googlebot-is-chrome/
Single-page websites, or application-as-content websites, are not popular among SEOs. One reason is that they don't allow fine-grained control over keyword targeting or keeping the site canonical, and they can waste domain authority when you have fewer targeted pages in the index than you could rank for. Experiment and find out for yourself.
If you love your users, give them HTML and let the Javascript enhance it.
Projects like Facebook's React ( http://facebook.github.io/react/docs/top-level-api.html#reac... ) and Rendr (https://github.com/rendrjs/rendr/) let you use server rendering as well as the single page technologies on the client side.
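As a rough illustration of the React side of that (the exact API names have moved around between React versions, and the Article component is assumed): render the same component to an HTML string on the server, then let the client-side bundle take over the already-rendered markup.

    // server.js (Node) - shares the Article component with the client bundle
    var React = require('react');
    var ReactDOMServer = require('react-dom/server'); // older releases exposed React.renderToString
    var Article = require('./article');

    function renderPage(props) {
      var html = ReactDOMServer.renderToString(React.createElement(Article, props));
      return '<!doctype html><html><body>' +
             '<div id="root">' + html + '</div>' +   // real content in the initial response
             '<script src="/bundle.js"></script>' +  // client-side code enhances it afterwards
             '</body></html>';
    }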
Sure, that requires jQuery, but most "normal" sites require jQuery, too.
And now your poor little mobile user has to wait for the page to download, then to execute javascript, then to wait for the API response, then wait for the DOM to update.
Or you could put it in the HTML, and then they just have to wait for the page to download.
You have two choices when taking an SPA approach:
1) Give your users a webpage that is usable, readable, and navigable as soon as they get it.
2) Give your users the skeleton of a webpage that they then have to accept the latency of javascript parsing and execution, and then the latency of data fetches.
We have the ability to do #1.
I just launched a website. It's a weekly periodical with political analysis, word-count on articles 1500-6000. It needs to carve up the content in a few different ways (categories, issue numbers etc), decorate an article with links to other relevant content, and provide a nice CMS for non-tech people to use. So it's on Django, with the regulation sprinkling of JQuery. (If it were only techies updating it, you could probably do it with a static site generator...)
To me, the idea that you'd try and force something that is plainly a big collection of pages into a 'single page' is just philosophically bizarre, like printing Moby Dick on a square mile of paper, using some amazing origami skills to present it to the reader, all in order to save a bit of effort at the paper mill.
The googlebot business is one aspect of a bigger issue, which is that a website needs to be consumable by a host of different clients. I don't see how you can do the SPA thing without making major assumptions about those clients.
Sometimes, of course, those assumptions can be justified - it depends on the job. And Angular etc are enormously fun to play with, and handled well can enable a great UX for certain jobs. But I don't think it's 'the future'. It's another tool in the box.
Relevant here, a nice talk by John Alsopp from Full Frontal 2012:
https://www.youtube.com/watch?v=KTqIGKmCqd0
EDIT: clarification
A simple guide can be found here: https://developers.google.com/webmasters/ajax-crawling/. Although I suspect it needs to be updated, since it's from 2012.
If you create an application, make sure it alters the URL when applicable. For simple apps, the following repos will be useful:
The old way, that still works: https://github.com/asual/jquery-address
The better way, preferred: https://github.com/browserstate/history.js or https://github.com/defunkt/jquery-pjax - not sure which is better, to be honest. Feel free to chime in.
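If you would rather not pull in a library at all, the underlying History API is small enough to use directly. A bare-bones sketch (loadContent is a placeholder for your own fetch-and-render function):

    // Navigate without a full reload, keeping the URL in sync with the view.
    function navigate(url) {
      loadContent(url);                       // placeholder: fetch and render the new view
      history.pushState({url: url}, '', url); // make the state shareable and bookmarkable
    }

    // Handle the back/forward buttons the same way.
    window.addEventListener('popstate', function (e) {
      if (e.state && e.state.url) loadContent(e.state.url);
    });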
If your users (I'm talking humans, not bots) have to download a mountain of JavaScript and execute it before seeing any content, your site is slower than it could be for everyone. We should stop saying that "single page apps", i.e. sites rendered in JavaScript in a browser, are bad because they can't be scraped by a bot. They are bad for EVERYONE who wants to view the site because of the network and CPU time it takes to download the assets and render the site in the browser.
There is some strange logic there.
And I'm not sure where you're getting that it's "easier to program" single-page apps than it is to simply render HTML on the server. The fact that it's not easy is the very reason we have so many competing front-end frameworks trying to solve the problem elegantly.
As for speed: Sure there are exceptions when developers go crazy with tons of heavy JS, but that doesn't need to be the case at all.
I find the cheerleading for single page websites disconcerting and the proposed benefits unconvincing. Why should this be the default way to build websites? A few desultory upsides are presented without a full consideration of the multiple downsides to client-side development.
The biggest advantages of thick-client architecture are sending less data to the client and, if you like using javascript, writing everything in JS. But there are multiple downsides compared to more traditional thin-client websites: load times that depend more on client capabilities (hugely variable and out of your control) than on servers, dependence on JS on the client, pages that load before your content is placed in the DOM by JS, forcing everyone to write JS instead of switching language on the server whenever they like, and ignoring the simple document model of HTML served at predictable URIs, which has served the web so well and means you can use dynamic or static documents, cache full documents for very quick serving and by intermediaries, etc., etc. Of course some of these can be overcome, but there are serious obstacles, and the advantages are meagre to non-existent unless you enjoy javascript and feel it's the only language you'll ever need.
For someone who doesn't like working in JS, and/or doesn't have a huge amount of logic already in JS (many websites work just fine with some limited Ajax), trying to force every website into the Procrustean bed of client-side development is not an appealing prospect. I can see why it appeals to those who have already invested in JS frameworks, but predictions of its future dominance on the web, like predictions that eWorld, ActiveX or mobile would replace the web, are overblown.
I suspect the birth and death of Javascript will be a footnote in the history of the web, rather than taking it over as this article suggests. If anything we should be looking to replace our dependence on js, not making it mandatory.
I really, really dislike this approach to building websites; it makes the code and the design hard to understand, and it causes a lot of problems when it comes time to debug or make major changes.
There is no magic when using javascript: it will slow down the client, and manipulating the DOM is very slow. Doing things server-side costs nothing compared to javascript. Remember that loading a webpage is very fast when you have only CSS and HTML, because it is very easy to cache and to apply some pretty nice optimizations to.
With frameworks, making a "webapp" becomes a huge nightmare: things become overly complex and bloated, the request has to pass through a lot of layers before ending up somewhere, and development is not faster than writing custom code. Good frameworks don't make good programmers.
When the article says "put the CSS inside a <style> tag on the page - and the JS inside a <script> tag", it's just horrible, fuck it.
Who says the page will become bloated with JS just because you use clientside loading? The mechanism I'm talking about can be done with something like 10 lines of JS or less. No one's saying you have to use AngularJS with every web page you make.
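Something in this ballpark is all I mean - a bare-bones client-side loader, give or take error handling (the container id and JSON endpoint are invented for the example):

    function loadPage(slug) {
      var xhr = new XMLHttpRequest();
      xhr.open('GET', '/api/pages/' + slug + '.json');
      xhr.onload = function () {
        var page = JSON.parse(xhr.responseText);
        document.getElementById('content').innerHTML = page.html; // swap in the new content
        document.title = page.title;
      };
      xhr.send();
    }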
> When the article says "put the CSS inside a <style> tag on the page - and the JS inside a <script> tag", it's just horrible, fuck it.
First of all, this is not a requirement of single page apps, just an option. Secondly, when you're developing, you would still have your JS and CSS in separate files. Your compiler would then minify the whole thing and put it inside your minified HTML file.
In any case, if you want a solution to this SEO problem now, I created SnapSearch (https://snapsearch.io).
There is a lot more to it. I am pretty well known as an SEO, and while I would love this to be true it isn't.
Google's improvements to Googlebot are mostly targeting spam and obfuscation of content. The idea is not to discover content so much as to avoid having content hidden.
Previously you could have a webpage that appeared to be about puppies, but then used any number of dynamic methods to instead show naked cam girls. Google worked to fix this.
Google is now doing some indexing of named anchors, and this allows for linking to a page within a page, as it were. But that is a long way from building indexable single-page applications.
-Brandon Wirtz SEO (formerly Greatest Living American) http://www.blackwaterops.com
1. The state of JavaScript-only application development is still nascent. The number of JS-only sites I see that are buggy, don't use PushState correctly, or have other shortcomings is growing faster than the overall trend. Not that it can't be done well, but if your JavaScript-only "app" is really just a standard website, you might want to re-think your approach.
2. There has to be a better way. There are distinct benefits to approaches in caching content and providing feedback, but JavaScript seems to be a kluge-y approach. It reminds me of frames back in the day. Some of this is browser support; some of this is lack of standardization; some is perhaps a missing piece of the HTML spec; etc.
A single page app isn't JS-heavy by definition, and a "normal" page (with HTML generated on the server) can easily be JS-heavy. It all depends on how you program it. Just keep in mind that single page apps don't necessarily need to use heavy frontend frameworks such as Knockout, Ember or AngularJS.
As far as I can see, throughout this thread the performance benefits of single page apps are touted as being fantastic, making it worthwhile to use the new technology etc.
When has performing operations efficiently ever been the domain of the web? Websites in my experience have the worst performance of almost any software I use! I've seen developers cite 200ms or longer to load a page as being a good benchmark - that seems pretty awful to me.
If getting this tiny performance improvement (which often results in poorer performance on the first load (not ideal for many)) is so critical, why do the same developers not invest in writing more performant server apps? Yes, often the database is a bottleneck, but these problems can in general be worked around (either by use of faster queries or caching etc).
Why attempt to get a small performance benefit by saving 30-odd kB of HTML on each page load (static and so essentially free for the server), when one could get a much larger performance benefit by optimising the backend?
Almost all serious sites will still see their page load being limited by the time it takes to produce the page. It's possible to write really fast websites (try http://forum.dlang.org/) but no one seems to do it :(
Unless almost all of your website is static, you won't be saving all that much time.
Single page apps can easily be static (static HTML page + static JSON). The point of this would be to decrease the download size for each new page visited by the user.
Some sites obviously inline CSS or JavaScript, but that can be eliminated if necessary (and only affects the first page load anyway).
This information is free to generate on the server side, so it's not slowing down that computation at all (it's just a stringbuilder function, essentially). Furthermore, the transfer time is generally not the deciding factor - it's the server side time to put the rest of the information together.
To give one example, I went to a typical website - the Guardian (a fairly standard high-traffic news website). Chrome informs me that in order to request one article, it took 160ms to load the HTML - 140ms of waiting and 20ms of downloading. Now, the RTT is about 14ms, so that's about 110ms of generating the web page and 20ms of actually downloading it. It's about 30kB of compressed HTML (150kB uncompressed), and most of it is 'static content' - inlined CSS and JS.
If they used the single-page model, it would reduce the page download time (apart from the first page) by an absolute maximum of 20ms - which means that the time to load each page would be reduced by about 12%.
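Spelling out that arithmetic (assuming the waiting time includes roughly two round trips before the server starts responding):

    // All times in milliseconds, from the Guardian example above.
    var waiting = 140, downloading = 20, rtt = 14;
    var total   = waiting + downloading;   // 160ms to load the HTML
    var server  = waiting - 2 * rtt;       // ~112ms, the "about 110ms" of page generation
    var saving  = downloading / total;     // 0.125, i.e. the ~12% best case from skipping the HTML download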
This is fine, but almost all of the data is just the result of string concatenations and formatting - i.e. free processing (or at least almost-free processing). It's getting the rest of the data together that's somehow taking the 100ms (or crap implementations).
The cost of moving data around on websites is typically small compared to the actual production time of the content. That's why we see people preferring to inline huge amounts of CSS etc on each web page and having people download it time after time - because it's only about 10kB compressed the data transfer is inconsequential, and normally is dominated by the RTT.
Spending all the time writing these frameworks because of performance benefits is a fallacy - the data still has to be generated somewhere, and if it happens dynamically it's slow as hell. The savings can never become that great - at most they lead to 20-30ms of improvements if bandwidth is acceptable.
Writing the frameworks because they make development easier is a much more reasonable argument.
This all still distracts from the fact that non-static websites are typically dog slow, and they shouldn't be.
Reloading via JS fails silently when you are on a bad Internet connection, like the one I am on here in Nepal. You cannot simply do a reload, as you can with a simple HTML page.
Facebook, for example, is unusable here because of the same issue. It only works when you request Facebook's mobile site in the browser.
So please, get rid of that damn JS, if you care about your user base and usability.
BTW: that problem also happens on bad hotel Wifi in the US.
Either you have a highly interaction-heavy web-app, where it makes sense to execute most of the code on the client and deliver the content as JSON or you have a content-heavy website, where it makes sense to deliver cached content to the client.
There are some apps in between, which are highly interactive and content-heavy, like the web versions of social apps. For them, the additional question arises of whether they want to be crawled by Google, and whether Google wants to index their content.
To profit from this, you need an app whose content users search for and interact with several times after they have found it. So I guess "revolutionize" seems a bit much to me.
We shouldn't let one company, google, dictate how the web works, simply because of their proprietary technological innovation.
My man, if we only used infrastructure and technology in the way it was originally intended and narrowly imagined, the world would be a dim place.
But, have Yahoo or Bing or DuckDuckGo made the transition to be able to crawl the web with a full JS & DOM rendering engine? I doubt it. By eschewing that compatibility we're setting a very high bar for what any competitor to google would have to achieve.
I like google. I just don't think it's good to have one company own a market so completely.
That said, I'd rather see more server-side HTML instead of client-side JS when it comes to the web. If you're developing a game, client-side JS is fine. If you're serving textual content with the odd image, please serve it in the way the web was won: HTML. Use CSS if you feel the need for some 'style', but remember that perfection is reached when there is nothing left to take away, rather than nothing more to add.
What? This article is about how a constraint due to a single company (and to a lesser extent, other search engines) is being _relaxed_. Before this, single-page apps were a riskier proposition because of many sites' reliance on Google traffic. Now that we're moving closer to that technical limitation being overcome, this is one _less_ constraint "dictated" by Google that site owners have to deal with.