One web page for every book ever published. It's a lofty but achievable goal.
To build Open Library, we need hundreds of millions of book records, a wiki interface, and lots of people who are willing to contribute their time and effort to building the site.
To date, we have gathered over 20 million records from a variety of large catalogs as well as single contributions, with more on the way.
Open Library is an open project: the software is open, the data are open, the documentation is open, and we welcome your contribution. Whether you fix a typo, add a book, or write a widget--it's all welcome. We have a small team of fantastic programmers who have accomplished a lot, but we can't do it alone!
---
They also seem to provide an API[2].
For an invite, please send me an email at mek@archive.org or go to: https://openlibrary.org/volunteer
# APIs & Data Dumps
- https://openlibrary.org/developers/api
- https://openlibrary.org/dev/docs/api/books
- https://openlibrary.org/developers/dumps monthly data dumps, for when you need bulk access and the APIs are not enough.
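For anyone who wants to try the Books API linked above, here is a minimal Python sketch. It assumes the `/api/books` endpoint accepts `bibkeys`, `format=json` and `jscmd=data` query parameters and returns a JSON object keyed by bibkey, which is my reading of the Books API docs. The helper names are made up for illustration, and the network call is kept in its own function so the URL building and parsing can be checked offline.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; check the Books API docs linked above before relying on it.
BASE_URL = "https://openlibrary.org/api/books"


def build_books_url(isbns):
    """Build a Books API URL for a list of ISBNs (hypothetical helper)."""
    bibkeys = ",".join(f"ISBN:{isbn}" for isbn in isbns)
    query = urllib.parse.urlencode(
        {"bibkeys": bibkeys, "format": "json", "jscmd": "data"}
    )
    return f"{BASE_URL}?{query}"


def titles_from_response(payload):
    """Map each bibkey in a decoded JSON response to its title, if present."""
    return {key: record.get("title") for key, record in payload.items()}


def fetch_titles(isbns, timeout=10):
    """Fetch titles over the network; for bulk work the dumps are the kinder option."""
    with urllib.request.urlopen(build_books_url(isbns), timeout=timeout) as resp:
        return titles_from_response(json.load(resp))
```

Batching several ISBNs into one `bibkeys` parameter keeps the number of requests down, which matters for a small team's servers.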
# Spread the word
Also, if you want to help raise awareness of this resource, please help us get the word out on Twitter!
1. https://twitter.com/openlibrary/status/1338185940469051392
2. https://twitter.com/openlibrary/status/1338186553915367425
# Issues
Thank you all for helping us discover some issues with our Goodreads importer and search (recently migrated to Python 3), and thanks to @cdrini et al. for the fast bug fixes! If you notice a problem, please help by opening an issue here: https://github.com/internetarchive/openlibrary/issues/new/ch...
# Learn More
- https://archive.org/details/openlibrary-tour-2020/openlibrar... if you want to learn more about Open Library, here's a short intro vid.
- https://github.com/internetarchive/openlibrary if you want to follow on GitHub.
How do I go about claiming my author page? The current book listed there has been officially "retired" for over a decade now, and I have plenty of other books that I'd be happy to add.
Would anyone be interested in having an instant search experience for this books dataset, like the one I built for the 2M recipes database posted on HN earlier this week: https://news.ycombinator.com/item?id=25365397
[1] https://www.goodreads.com/review/import (export)
Although, I just tried importing my Goodreads export into Open Library and I get the following "Internal Error":
> Hmm...
> Sorry. There seems to be a problem with what you were just looking at.
> We've noted the error xxxx-xx-xx/yyyyyy and will look into it as soon as possible. Head for home?
Anyone else facing this issue?
And yeah, searching needs some work! That's on my task list for this month. Just this Friday I spent most of my day working on updating our search engine, Solr, from 3.6 to 8.7 (wip!). But search is a _BIG_ pain point. We're a small team with a big long list of things to do, but we are making progress! This year we updated to Python 3, switched most of our production environments to docker-based for easier deploys and to give open source contributors more control of production infra, added reading history stats for users, added a new interface for exploring books, worked on a novel recommendation system, added text selection to the online BookReader for public domain books, added GoodReads importing, grew our community, added the ability to search by classification, and much, much more (you can see highlights from our year here: https://github.com/internetarchive/openlibrary/issues/3891 ).
There is still _definitely_ a lot to do, but I think the biggest reason worth using/contributing to Open Library is likely its open source community. Anyone can jump in and help make improvements to the system (as they very often do!). Personally, I think it's more likely that a system with a community will survive/flourish than one maintained by a single person (I also wondered whether I should just create my own before contributing to and now working on Open Library!). And there are also loads of different tasks associated with a site like OL, which would be impossible for me to do if I was going it alone.
If you would be interested, check out the GitHub repo: https://github.com/internetarchive/openlibrary . It's very active, and you can get an idea of how we work :)
How do you even get an ISBN for a candle?
Great points here
1. The banner happens last month of the year (Wikipedia being the perfect analog). Yes, there are mixed feelings and it's not the world's best experience :P
2. Our entire data set is available to download in bulk at https://openlibrary.org/developers/dumps because we'd love to see a decentralized p2p version
3. https://github.com/mouse-reeve/bookwyrm Mouse who used to work @ Internet Archive has a decentralized version of Open Library (Bookwyrm) and it's worth checking out.
4. For the last 5 or so years the Internet Archive has been cultivating a dweb/dapp community, integrating with IIIF, Dat, IPFS, gun, BitTorrent, WebTorrent, and others, and hosting regular summits and meetups: https://blog.archive.org/2018/07/21/decentralized-web-faq/
5. The Wayback Machine is an interesting case study: it turns out incentive structures (even things like FIL/Filecoin) haven't been able to perfectly crack the nut on getting folks interested enough to preserve the whole Wayback Machine. There's petabytes of material and there's a power law about what people care about today. Internet Archive realized what we care about today may not be the same as tomorrow, and so there's a cost eaten (the incentive comes from economies of scale generated by intrinsic desire rather than $). And in a way, this centralized solution (economies of scale) IS the solution a community came up with. It has flaws and advantages (tradeoffs), such as centralized points of failure, and I think the archive would be (and has been) ecstatic to explore improving these opportunities.
I haven't shared this yet -- it's more for the community, but I've tried to address various questions from the community and distill answers + resources for Open Library here:
https://blog.openlibrary.org/2020/12/13/importing-your-goodr...
Tried my best to include other players in the space (wikidata, inventaire, bookbrainz, worldcat, bookwyrm) who are doing great work, and to pay respects to readng, storygraph and other innovative services which are breaking onto the scene.
I get that Open Library doesn't have as much data as GoodReads, but I wish it would show me the data it couldn't import so I could add it manually to Open Library's data store.
Nevertheless, I love the idea and I'll be opening bug reports and maybe code contributions if something looks easy enough.
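In the meantime, one rough way to spot rows that are likely to be hard to match is to scan the Goodreads export yourself. The sketch below is not Open Library's importer logic. It assumes the export is a CSV with `Title`, `Author`, `ISBN` and `ISBN13` columns, and that ISBN cells are wrapped like `="0441172717"`. A missing ISBN is only one plausible reason for a failed import, so treat the output as a starting list for manual additions.

```python
import csv
import io


def clean_isbn(raw):
    """Strip the ="..." wrapper that Goodreads exports are assumed to use."""
    return (raw or "").strip().lstrip("=").strip('"')


def rows_without_isbn(csv_text):
    """Return (title, author) pairs for rows with neither ISBN nor ISBN13."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = []
    for row in reader:
        if not clean_isbn(row.get("ISBN")) and not clean_isbn(row.get("ISBN13")):
            missing.append((row.get("Title", ""), row.get("Author", "")))
    return missing
```

To run it on a real export, read the file as text (for example with `open(path, encoding="utf-8", newline="").read()`) and pass the contents to `rows_without_isbn`.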
https://github.com/internetarchive/openlibrary/issues/new/ch...
If you tag @tabshaikh who helped implement the importer and me @mekarpeles we can make sure it gets triaged and tagged correctly this week :)
Dissatisfied with how slow and clunky Goodreads is, I actually thought about making my own (albeit much simpler) version of Goodreads to keep track of my reading habits. I often dig through Goodreads to find books or authors I can't remember the names of -- and Goodreads isn't great for that.
Open Library actually provides the missing piece. The fact that they offer bulk downloads also makes it easier to be a good internet citizen and not send tons of API traffic their way.
Looks like I'll have to set up a monthly donation. I'd really like to see Open Library succeed.
https://openlibrary.org/developers/dumps
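If you go the bulk route, here is a small sketch for streaming a dump in Python. It assumes each line of a dump file has five tab-separated columns (record type, key, revision, last-modified timestamp, and a JSON document) and that the file is gzip-compressed. That is my understanding of the format described on the dumps page, so verify it against the file you actually download. The function names are illustrative.

```python
import gzip
import json


def parse_dump_line(line):
    """Split one dump line into (type, key, record); assumes 5 tab-separated columns."""
    type_, key, _revision, _last_modified, raw_json = line.rstrip("\n").split("\t", 4)
    return type_, key, json.loads(raw_json)


def iter_dump(path):
    """Stream records from a gzipped dump without loading it all into memory."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            yield parse_dump_line(line)
```

Streaming line by line matters here because the full dumps are far too large to load into memory on a typical laptop.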
The project is also open source, and you can find the code (and contribute!) on GitHub: https://github.com/internetarchive/openlibrary
After some looking, there are some private databases with millions of numbers but no official site. E.g.
For books in the collections of large libraries (like the LoC) there will be a public catalogue entry with the ISBN attached, but they don't assign it.
There were also a lot of books published before ISBNs were created, and not every book has an ISBN attached even to this day.
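Since ISBNs keep coming up as the lookup key, a quick sanity check on the number itself can save a wasted query. The sketch below uses the standard check-digit rules: ISBN-10 uses weights 10 down to 1 modulo 11, with `X` standing for 10 in the last position, and ISBN-13 uses alternating weights 1 and 3 modulo 10. The function names are my own, and passing the check only means the number is well-formed, not that any catalog knows the book.

```python
def is_valid_isbn10(isbn):
    """Check an ISBN-10: weighted sum (weights 10..1) must be divisible by 11."""
    chars = isbn.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10 or not chars[:9].isdigit() or chars[9] not in "0123456789X":
        return False
    values = [int(c) for c in chars[:9]] + [10 if chars[9] == "X" else int(chars[9])]
    return sum(v * w for v, w in zip(values, range(10, 0, -1))) % 11 == 0


def is_valid_isbn13(isbn):
    """Check an ISBN-13: alternating weights 1 and 3, sum must be divisible by 10."""
    chars = isbn.replace("-", "").replace(" ", "")
    if len(chars) != 13 or not chars.isdigit():
        return False
    return sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(chars)) % 10 == 0
```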
> WorldCat is a union catalog that itemizes the collections of 17,900 libraries in 123 countries and territories[4] that participate in the OCLC global cooperative. It is operated by OCLC, Inc.[5] The subscribing member libraries collectively maintain WorldCat's database, the world's largest bibliographic database.[6]
My primary interest is in recording my thoughts on books and stories I've read, so a review site is what I'm looking for. This is more challenging for short stories as they don't have an ISBN, and may appear online, in a fiction magazine, or in an anthology.
So, I may put down my thoughts on short story X that I read in book Y, but if I look up book Z that also features that short story, my thoughts on the story would also appear there.
In short, I'm looking for a site that also records short stories like the ISFDB [1], but allows users to add reviews.
So far, I haven't found one. I'm now putting down my notes on short stories in a Zotero database.
https://www.theguardian.com/books/2013/apr/02/amazon-purchas...
I would like to understand the true strategic interest behind this. Is Amazon simply penny-pinching now that they’ve successfully obliterated the market for both new and used books online? There’s way more to this story than appears on the surface.
The strategic reason for Amazon is obvious. As someone else mentioned, Amazon doesn't want Goodreads data to be used to add value to their competitors' offerings.
Speaking as a developer who tried to build on top of Goodreads API, I also want to add that this was a long time coming. The API had been neglected for some time. And some of the most interesting datasets weren't even made available through the API.
This is it exactly. Goodreads was/is the best/largest source of information on books available online.
I guess the writing's on the wall...
What I think all of these large companies are doing is pulling up the open-web ladder after they've climbed it to dominant positions. The problem with anti-trust action is that it's reactive; we wait until a company has gotten too big, and then hope we can cut it down to size. I'd love to see moves toward proactive open-data and open-algorithm requirements, so that we guarantee a level playing field. That won't be easy, but neither is trying to rein in companies with annual profits in the tens of billions.
[1] https://www.cnn.com/2020/10/24/tech/facebook-nyu-political-a...
I think you're flattening a lot of complexity here. Yes, I agree that Mark Zuckerberg is probably not terribly interested in our privacy. But one of the biggest outrages in FB privacy history, the Cambridge Analytica Scandal, was driven entirely by a researcher inappropriately accessing and sharing user data. While you might not trust Zuckerberg, and maybe you shouldn't, there is definitely some "there there" when it comes to how researchers handle our data too.
A bunch of book reviews and book recommendations that can be used separately from Amazon doesn't help Amazon.
Regulators should step in when businesses/startups are being harmed by a monopoly via unfair practices such as bundling. But intervening otherwise will simply deny founders and employees a decent exit. And if the economics are not sound enough it'll be harmful for customers as well - the app will get loaded with too many ads or simply shutdown.
So IMHO not a good case for regulation.
Not that I like Amazon, but it is not a philanthropy. I assume Goodreads has non-trivial costs for operations, moderation, and running the site, and just because Amazon is doing well financially doesn't mean it has free money. This move makes it easier for Amazon to monetize Goodreads; it is as simple as that.
Effectively the only reason it's not supporting itself is precisely because Amazon bought it.
The web has to mature beyond advertising as a business model. For this to happen people are going to have to open their wallets, pay for the services they use, and support independent businesses. That’s how we build a web where indies can thrive - one that’s more village centre than financial centre. I think the shift is underway.
True/false?
If I pay I want: No ads, no tracking, full access to my own data in sane export formats, schemas, no data mining, no data selling, no "sharing data with our partners", encryption options, no dumb hoops, no dark patterns, the ability to point a product at an API endpoint of my choosing, backup options that default to my infrastructure first and so on.
Actually let's add more: The data generated by my use of my data in the product. Non-canned support responses that don't ask for information I literally put in the ticket three weeks ago. Prominent indication of where (geographically and legally) data is stored and used. If/how often you do backups. If/how often you practice disaster recovery.
So really what I want to pay for is sanity and no bullshit.
Yet if I do pay, many services and companies will still do all of this shit in the background until midnight the day it's finally made illegal, all the while gaslighting me about "how much they value me as a customer" and how they "respect" our "relationship".
It's literally obscene.
Because you, a paying customer, are worth the most to the advertisers.
If only...
Maybe one day these things will be standard. We have to convince the mainstream these are goals worth pursuing... As long as most people accept how shit the status quo is, it won't improve.
The reason all of these things happen is because it's easy to slip into them in a tight financial spot and there's usually no instantaneous backlash.
Unfortunately:
- competitors use revenue from ads, tracking, selling data to their advantage and undercut your value based on price.
- and free users love free stuff. I had a user ask me to include ads in exchange for paid features.
- and yet people still distrust you. Once I proudly posted a new feature to Show HN and the first response was "How does this post not qualify as 'spam'?". Some people automatically think you're a bad guy since you sell something.
- without advertising, how do you get the word out?
I rely on word-of-mouth, since it carries much trust, and news articles. This means it's a slow game.
It is very hard to find mutual respect (both ways) between user and maker. Most relationships start with distrust or "what is in it for me".
Nearly every time I contact customer services these days I'm fobbed off with obnoxious PR speak instead of just telling me straight.
GDPR's right to data portability provides much of the export functionality you're after. It must be structured, in a format that is commonly-used and machine-readable. The ICO's guidance suggests that CSV, XML and JSON best meet this requirement.
Tracking is something else that GDPR helps with. Tracking of personal information via e.g. cookies requires active consent. Silence is not consent.
"sharing data with our partners" requires a lawful basis when dealing with EU data subjects. This will normally be consent where data is sold to third-parties for e.g. marketing, so data subjects will be able to make an informed decision and opt out of this. Again, silence is not consent - and burying data sharing in an unreadable legal document is not informed consent.
> the ability to point a product at an API endpoint of my choosing
The right to data portability includes this:
> Individuals have the right to ask you to transmit their personal data directly to another controller without hindrance. If it is technically feasible, you should do this.
> Actually let's add more: The data generated by my use of my data in the product.
This is in scope for a Subject Access Request.
> Non-canned support responses that don't ask for information I literally put in the ticket three weeks ago
This is difficult to solve with regulation but I think it's an entirely reasonable thing to expect for your money. GDPR does not help here.
Hopefully if there are multiple competitors in the space, customer support is something that providers can compete on.
> Prominent indication of where (geographically and legally) data is stored and used
Privacy information already must contain a transparent list of data processors:
> This includes anyone that processes the personal data on your behalf, as well as all other organisations.
What we really need is for other countries to start taking data protection regulation seriously.
Idea: Move all online businesses to a new digital-only currency. Let people earn that currency by donating the processing power/storage/bandwidth of their devices, like the @Home projects. Of course people could always convert existing currency to the new e-currency.
Let's say an hour of donating an average laptop = an hour of using Google, Facebook, etc.
Yours is a great list of stipulations. I would just add: support for open + interoperable protocols such as activitypub and RSS.
> Non-canned support responses that don't ask for information I literally put in the ticket three weeks ago.
No, sorry. I don't want to put too fine a point on it, but you get this if you pay what I want, not what you want.
All this crying about tracking: how else can the owner of the place make the product better? If I own a store, I can see where people go, how they shop, how they walk around, which basket size they prefer, what they buy. If I don't collect data on my website, I can't THINK about how to make my service better. I can only GUESS. What about data collection for simple functionality, like when you come back to a half-filled form and I remember the values you already submitted by matching your cookies? Are you against this as well?
Sorry, but if this is the alternative - I would rather have Google know everything I do online and hope that they honestly don't store data on my Incognito browsing. If they do - worse for them.
As for legalities. It is a global world. There is no way to enforce this effectively globally. This is a pipe dream.
Also data in some cases must be shared with partners, those might be payment processor, ID checks etc.
The only problem is that the consumers want to trade the tokens at a profit instead of as purchases.
But that isn’t really a problem for the service that sold them. It is revenue. But people have an uncomfortable relationship with other people making money when they can extrapolate how much and consider that a problem.
Many services now are completely client side and use the nearest node that you connect to as the backend. They store enough variables in their smart contracts on chain and do the rest of the calculations client side. So their web service isn’t tracking you. But if you reuse addresses other people are.
I'd argue the term 'pay for the services [they use]' is too vague here to be meaningful - there are too many options that would drastically change the incentives.
Pay per API call? APIs start to need 5 calls to get all the info you'd need for one request. Subscription model? Consumers are going to have to juggle a different account for every provider they use. Subscription to an aggregator who then pays content providers based on usage? We're back to the clickbait situation we were in in the early days of advertising and are arguably still in.
For me the insane thing is that there are no options. I can't universally buy any song I want DRM free from a range of providers. I can't pay per article for news from an RSS feed.
TV is an interesting one because the industry has convinced the user base to pay for content, but the subscription model is already showing some of the limitations shown above.
I feel like the biggest innovations in this space aren't so much new ideas or 'just convincing people to pay for things', it's a case of making the payments as easy and understandable as their previous counterparts.
Blame AML/KYC.
Frictionless, permissionless micropayments are illegal. On purpose.
This is not a social or technological problem. It is a legal problem. AML/KYC does not scale.
Yes but to be frank if I had to pay for everything I use I would go broke.
We're not all on SV salaries here. I have about 100€ of disposable income per month. If I put it on online things, it means I would have to stay at home (I mean pre or post pandemic times).
Unfortunately, advertising has also created an unsustainably high "standard of living", so to speak - you get so many services and applications for free these days that would realistically not exist or cost much more than you were willing to pay for them had it not been for advertising.
Personally, I don't think there's a way out of it until someone comes up with an alternative that brings the benefits of advertising without all the downsides it has, because individual consumer incentives are just not aligned. I'll gladly pay a one-time fee for some productivity app, or a small subscription for something I use almost daily. But if e.g. goodreads wanted to charge a subscription, community size would probably dwindle, and personally I'd just start keeping a spreadsheet of my books again.
People don't want to pay for content. If people pay for content and feel ripped off, they can ask for a refund. Then cheaters can pay to access the content then ask for their money back if they want to. This puts the content provider in a bad position.
If people pay for content, then they want to have that content themselves forever. In some sense, this is fair. But then they want to share that content with others. Then the other person/people don't need to pay. Now you have a problem where everyone can just get the content when one person has paid for it. This is a bad dilemma where content providers and consumers both seem to have a good case.
Since neither pay model seems to work, companies just show ads. Then people ignore ads, so the ad companies make them more attention-grabbing and intrusively targeted. So people use reader mode and ad blockers. Now no one is looking at the ads that pay for the content.
I wish I knew the answer.
But I think individual action isn't sufficient to get us over the hump. There are just too many things we use on a daily basis, and often those things use things that use things. "Free" is an illusion, but it's an illusion with a very low cognitive load. Manually supporting each and every thing I appreciate at the right level is a complex and taxing process. In practice, I'm sure I miss a lot.
In the physical world, we have some solutions for this. I don't have to subscribe to each park I use. I don't have to kick in for each sidewalk tree I walk by. I live in a neighborhood with a lot of street and alley murals, all community supported in various ways. I think the next step forward for the web involves finding ways for collective action with low individual cognitive load. It wouldn't be perfect, but it could be better than what we have now.
Switching away from ads will feel good until some do-nothing decides they'll hit their bonus targets by re-introducing ads in addition to the subscriptions.
This has happened many times. It's obviously going to happen again.
You do not want to normalize subscriptions for every old Web service, trust me.
Iow, I believe advertising is symptomatic of our current state of consciousness as a collective, and so it is not a cause.
Ofc, this could exist with actual paid services.
I’ve not used GoodReads before but it appears to be a book recommendation service. I’m surprised they can’t run the main site & API on Amazon referral links for the recommended books.
Ads convince people to spend money on some product they wouldn't buy otherwise. This in turn finances those ads, which finance the service. I'm not convinced this is better in the long run for many than paying services directly.
I think in order for this to happen, there's going to have to be better payment models than $4-9 a month subscriptions.
Crowdfunding is probably the best solution, if it can be made to work - it would allow for some kind of monetization on a voluntary basis, while preserving free access for most users. But I don't know of any site that really uses it successfully, aside from Wikipedia.
Yes indeed. Right now news & opinion are largely limited to what is considered "advertiser friendly" - think short news cycle, and other problems.
The prospect of paying pennies for everything introduces a lot of friction. Luckily an alternative model is making headway, mostly in YouTube circles: pay a couple of your favorite creators, and watch the rest for free.
It's an extension of the old "freemium" model, but this time based around a stochastic balance between viewers who pay this or that creator. The paying subscribers often get access to additional material, geared more towards fans or deeply interested viewers, while the regular, free-to-watch viewers get the general content just fine.
I've said it many times before but it's worth repeating...
The Internet survived just fine before all the ads and tracking and it will survive fine without it.
Today it’s run by people, standing on their shoulders, whose dominant motivation is making money or how to “capitalise” on something, something which they have no fundamental interests or passion for.
Obviously the outcome is going to be different.
If you allow the web to move to a model where you have to pay dollars for a service worth cents it turns into the same kind of market as the mobile phone operators market charging fees for sms messages: a total rip off.
Why does this need to be a business anyway? The only reason is because it can be.
Is the Web a business model?
The Web really is machines serving HTML, JSON, XML,... over HTTP/HTTPS across physical connections. There are several ways of looking at this, but often enough the debate gets reduced into a dichotomy.
I'll put this into two simplified extremes.
The Web is a shared infrastructure seen as a "public commons". You can access that infrastructure, request/receive bits and bytes from other machines for free and if you want to host content yourself, you connect your own machine to that infrastructure and you share content via your own machine for free. You carry the costs of the usage of the infrastructure yourself, regardless of the direction of data traffic across the network.
The Web and its infrastructure are commodities. Storage, maintenance, bandwidth,... are expenses that should be offloaded. The main goal of hosting and making information available on the Web is to, either directly or indirectly, make a marginal profit. You pay for the privilege of accessing someone else's machine to download data, and you get paid by those who want to gain access to information you host on your own machine.
The problem with the statement above is that it implies that both extremes are mutually exclusive, and only the latter one is viable.
This is false.
The Web is ultimately a decentralized network which is built on top of the intentions and goals of humans. And those intentions and goals can differ wildly. There are parts of the Web that operate according to the former idea, and there are parts that operate according to the latter. Both exist, and there's a spectrum in between.
In the analogue world, the same notion translates into private businesses, non-profits, cooperatives, community initiatives, charities, public initiatives and so on.
Goodreads choosing to close down its public API is just one case of choosing to move towards one side of the spectrum. It's by no means an indication that the entirety of the Web and, more specifically, its denizens have decided to move towards that side.
That spectrum does emerge based on laws of economics, though.
Goodreads has always been a private business. Public APIs of private businesses are never truly "public". They are either a courtesy or a business investment. And they will step away from such courtesy if the costs outstrip the benefits.
The Web isn't quite the same as public space though - parks, beaches, forests, streets, grasslands,... - because the vast amount of infrastructure is privately owned. In that regard, the notion of "The Web is a Commons" is only true to the extent that private people are willing to accept and support that idea, and are willing to carry a shared part of the costs.
It's that last part which makes all the difference. Operating a basic website with a limited number of visitors comes at a low cost, and so one could operate a small Goodreads like website with a niche of books. There are plenty of examples of people keeping freely accessible blogs and the like with their own book reviews.
Goodreads tries to turn that idea into a business model. The intention of generating a profit is very distinct in that regard. However, not only is it hard to sell the opinions of other people, it's even harder if costs generated by trying to cater to an audience of millions outstrip the revenue generated.
The Web isn't a financial centre, just as neither London nor New York is representative of the entirety of human society. Big businesses are, ultimately, only a part of the Web, just as much as they are only a part of society. And if their business models fail to keep them operational, the Web, and society, will ultimately churn on without them.
As the infrastructure changes the dominating idea shapes it.
As an example, there used to be (and still are, but the change is evident) vendors selling music recordings: vinyl, tape, CD. The market traded in tangible artifacts that could change owner, could be copied (legally or not), and could be put in a library and lent out.
In Sweden we even pay a special copy compensation tax when buying any device with storage capacity to tunnel some money towards “content creators” in support of this distribution form.
However as the technology has shifted, allowing for direct streaming, the trading of artifacts has disappeared. The laws and economic realities now promote a market with fewer vendors offering only a limited catalogue of recordings, and only in a form that can never leave their control effectively.
This is only an example. And I think this particular one is mostly about reviewing the legal landscape.
Another example might be how protocols like RSS, XMPP, and SMTP were used for interoperability, allowing different vendors to offer compatible services. As things shift, this time perhaps more due to economic realities, the dominating tendency is to erode interoperability, and dominating players shape the technology towards their more siloed reality. Perhaps we need more tax-funded players (public service?), simply competing and collaborating to tilt things back again.
In general, I doubt that the network effect can be overcome for consumer platforms. People want to share their book reviews. Why should they limit themselves to people who pay? Paying only works for those who want social filtering.
To compete with Goodreads, the data would have to be free like Wikipedia, for other competitors to emerge so that it is not just a village but a country with many villages. But then, it's a rtf reader situation where it is difficult to survive as an app creating company.
No, I don’t have a great solution other than the status quo.
The whole process doesn’t have to be expensive, but it’s certainly not free. You can build very cool stuff and give it away for free, but sustainability and scalability ultimately require revenue. The magic thing about a successful business is the ability to cover execution costs, support the development team, and still leave value on the table for users.
I think digital services have drastically different economics depending on (1) how adding a NEW user changes the value proposition for EXISTING users, and (2) something like the user’s start up/discovery cost relative to lifetime value
A direct payment model makes sense when your users’ value is independent from the size of the user base (assuming performance scales at least linearly). These services can tolerate moderate startup/discovery costs. For critical enterprise services, startup costs can be high because the lifetime value is still much larger.
If value scales with the user base, as in a social network like Goodreads, then startup/discovery cost must be pushed to zero to grow the user base as quickly as possible. A paywall slows user base growth and reduces value for those users that actually choose to pay.
So far, advertising is the only known way to monetize (and thus sustain) a digital service while maintaining near-zero startup/discovery costs for individual users. Micropayments, even with good UX and low fees, increase joining costs relative to advertising. Thus they will reduce the value of the service to paying users if value scales with user base size, but would benefit services where value is independent of user base size.
Federation is maybe the best way out of this dilemma IMHO. The value of the overall network grows with the user base, so adding new federation partners should be near-free. Each instance is small relative to the network as a whole, and thus can focus on individual user value rather than growing the network, which means it can charge users directly. This is why you’re willing to buy a great Twitter or Reddit client app, but would never pay for a Twitter or Reddit subscription. (Yes I know those are centralized services, but the model holds if you look at the business relationships).
As far as the media is concerned, everyone here under the age of sixty has always been the product. And it’s deeply naïve and idealistic to think that “paying for their services” (how will they do that? Oh well) will change something which is now completely fundamental to media as we know it.
This is a very poetic way to describe API deprecation. I'm gonna steal this.
(It’s not quite cut and dried in this case because there may still be some people that still have access to the API—but those that have been cut off look to have no recourse.)
At AWS, I was "put on a pedestal" for stating (in an internal forum) that X was deprecated (I meant it in a sense it was "not recommended anymore" and wasn't updated at all)... The management thought it sent the wrong message (that is, deprecation == removal).
People often attach the wrong meaning to deprecation (ironic in my case, given Amazon is a Java shop).
https://blog.stephanieawilkinson.com/posts/2020-12-10-yonder...
https://www.librarything.com/more/importgoodreads
The general LibraryThing UI is a bit scary at first, don't let that stop you. All "Pro" accounts were made free for everyone some years back.
Those were good times.
People sneer now?
Book Id,Title,Author,Author l-f,Additional Authors,ISBN,ISBN13,My Rating,Average Rating,Publisher,Binding,Number of Pages,Year Published,Original Publication Year,Date Read,Date Added,Bookshelves,Bookshelves with positions,Exclusive Shelf,My Review,Spoiler,Private Notes,Read Count,Recommended For,Recommended By,Owned Copies,Original Purchase Date,Original Purchase Location,Condition,Condition Description,BCID
The CSV export mixes quoted and unquoted fields within the same record, which is unfortunate, but it works.
One unfortunate bug that they seem to have put onto the 'wont-fix' pile is that for many recent-ish books, the 'date read' field isn't properly exported, so if you try to make reading stats you have to cheat a bit by approximating the 'finished date' with the 'book added' date.
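For anyone wanting to try that workaround, here is a minimal sketch using Python's standard csv module, which handles the mix of quoted and unquoted fields. The column names come from the export header above; the YYYY/MM/DD date format and the sample rows are assumptions for illustration, so check them against your own export.

```python
import csv
import io
from datetime import datetime

def finished_dates(csv_text):
    """Map each title to its 'Date Read', or to 'Date Added' when that is empty."""
    result = {}
    # csv.DictReader copes with quoted and unquoted fields in the same record.
    for row in csv.DictReader(io.StringIO(csv_text)):
        date_read = (row.get("Date Read") or "").strip()
        date_added = (row.get("Date Added") or "").strip()
        raw = date_read or date_added  # fall back to the approximation
        if not raw:
            continue
        # Assumption: dates are exported as YYYY/MM/DD.
        result[row["Title"]] = datetime.strptime(raw, "%Y/%m/%d").date()
    return result

# Made-up sample rows with only a few of the real columns.
sample = (
    "Title,Author,Date Read,Date Added\n"
    '"Dune, Book One",Frank Herbert,2020/01/15,2019/12/01\n'
    'Neuromancer,"Gibson, William",,2020/03/02\n'
)
dates = finished_dates(sample)
```

The second row has no 'Date Read', so it ends up with its 'Date Added' value, which is only an approximation of when the book was finished.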
It's sad that Goodreads' way forward is to stifle competition, rather than innovate.
Where Goodreads has really failed to change is something not easily visible in pithy screenshots: there are a lot of bugs and missing functionality with regard to book metadata, but these flaws are something you don't quite grasp unless you become a Goodreads Librarian.
And even that is a stretch, you can see there are plenty of adjustments in the layout, and the right one looks fine by today's standard.
I disagree. It might be subjective, or a matter of taste.
[1] - https://borges.ai
I have a use case: the bibliographies (recommended reading pages) for my own books. Could I send readers to their choice of bookseller?
Would love to have a public API for readng but don't want supporting it slowing us down when we need to change something. It's just the three of us in our spare time at the moment, so we need the agility!
I really, really like that use case too! Right now all you could do would be to create a collection of books on your profile, but would be nice to have on the book page I think.
> Could I send readers to their choice of bookseller?
We would like to refer people to libraries and indies, but these are obviously fragmented so a bit difficult. I think allowing authors to set a preferred book seller probably makes sense.
[0] https://en.wikipedia.org/wiki/Shelfari#Amazon_and_shutdown
Personally I'd like to see audible integration, which ironically goodreads does not provide :(
I feel bad for people who put more work than I did into using the API. It seems kind of short-sighted to me; I'd have thought anything promoting book-reading would be good for Amazon/Goodreads.
Hopefully it's the catalyst for a good alternative to appear. I know developers weren't happy with the API, or users with the site in general.
Although there is always the risk of spiralling down an echo chamber, having two or three people you know give a decent review to a book is a strong signal that you might enjoy it (or not, once you know those people's tastes).
So it saves me time and ensures a steady pipeline of "likely interesting stuff" to follow up on. Their e-mail newsletter is also likely to be the only one I let into my inbox willingly, just so I have an idea of what might be interesting to add to my queue.
Of course, the API deprecation is just bad. I've always been frustrated with the way Goodreads was integrated with the Kindle (it hardly, if ever, worked) and now syncing between Calibre and Goodreads is likely to stop working as well, so I won't have a non-wetbrain list of what I've read over the last decade or so.
I actually use Last.fm to do this for musical taste.
Those who will use your info legitimately will probably use an API, but those who only want your data can hide their IP across a number of cloud agents to extract all of the data from your site regardless of whether you offer an API or not unless you use CAPTCHA.
But if you need predictability and reliability (e.g. you're providing a service to other people) for whatever you implement using this 3rd party service you don't control, relying on their UI, which they can break any time they feel like it, will lead to more downtime than APIs, for which you're usually given some notice before they're deprecated.
I'd like to make a recommendation engine: the user would input 1-3 books they liked and it would suggest more books that are similar.
What sort of algorithms should I look into to do that sort of thing?
Note that I don't have user profiles with the books they read, only the database of books, so I can't base recommendations on "two users liked the same book, so each will also like other books the other liked".
It's easy for recommendations to get stuck on a local maximum if they only look at one metric at a time, like "similarity" weights. But if you have a lot of metadata about each title, you can break out of those "loops" by sprinkling in metrics like ratings/genres/release date/popularity/etc. This doesn't have to hurt from a performance perspective, either; you can filter on the same single metric, but request more recommendations than you need and pluck out a pseudo-random set in the application logic.
That also lets you provide context for the recommendation. "It's like this, but [older/more obscure/with vampires]."
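A rough sketch of that idea, with no user profiles needed: rank by one metric (here Jaccard similarity over metadata tags, which is my choice, not something the parent specified), over-fetch, then pick a pseudo-random subset and report the tags that differ as context. The catalogue, tags, and function names are all made up for illustration.

```python
import random

# Hypothetical catalogue: title -> set of metadata tags (made up for illustration).
BOOKS = {
    "Dracula": {"horror", "vampires", "classic"},
    "Carmilla": {"horror", "vampires", "classic", "novella"},
    "Salem's Lot": {"horror", "vampires", "modern"},
    "Frankenstein": {"horror", "classic", "science-fiction"},
    "Twilight": {"romance", "vampires", "modern"},
    "Emma": {"romance", "classic"},
}

def jaccard(a, b):
    """Size of the tag overlap divided by the size of the tag union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(liked, k=2, overfetch=2, rng=None):
    """Return k (title, extra_tags) pairs for a list of 1-3 liked titles."""
    rng = rng or random.Random()
    # Merge the tags of every liked book into one profile.
    profile = set().union(*(BOOKS[t] for t in liked))
    candidates = [t for t in BOOKS if t not in liked]
    # Single metric: similarity between the profile and each candidate.
    ranked = sorted(candidates, key=lambda t: jaccard(profile, BOOKS[t]), reverse=True)
    # Request more than needed, then pluck a pseudo-random set.
    pool = ranked[: k * overfetch]
    picks = rng.sample(pool, min(k, len(pool)))
    # Tags the pick has that the profile lacks give the "like this, but ..." context.
    return [(t, sorted(BOOKS[t] - profile)) for t in picks]

recs = recommend(["Dracula"], k=2, rng=random.Random(0))
```

Passing a seeded random.Random makes the picks repeatable for testing; in production you would probably leave it unseeded so the suggestions vary between visits.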
https://www.newstatesman.com/science-tech/social-media/2020/...
For example, searching Old Man and the Sea on isbndb returns (as you’d expect) many isbns:
https://isbndb.com/search/books/Old%2Bman%2Band%2Bthe%2Bsea
Do books have another identifier that logically consolidates editions, foreign language prints, etc.?
(Note the work id is in the url: OL6307W)
If you want to get the work id for a given ISBN: https://openlibrary.org/isbn/2070360075.json will redirect to the edition page, and there you can get "works[0].key".
Or, you can search by the isbn: https://openlibrary.org/search.json?q=isbn:2070360075
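A minimal sketch of the first approach in Python, using only the standard library. It assumes the edition JSON carries a "works" list whose keys look like "/works/OL6307W"; the helper names are mine, and the sample record below is hand-written rather than a real API response.

```python
import json
import urllib.request

def work_id_from_edition(edition):
    """Return the ID at the end of works[0].key, or None if there is no work."""
    works = edition.get("works") or []
    if not works:
        return None
    # Assumption: keys look like "/works/OL6307W"; keep the last path segment.
    return works[0]["key"].rsplit("/", 1)[-1]

def fetch_work_id(isbn):
    """Fetch the edition record for an ISBN and return its work ID."""
    url = f"https://openlibrary.org/isbn/{isbn}.json"
    # urlopen follows the redirect to the edition page automatically.
    with urllib.request.urlopen(url, timeout=10) as resp:
        return work_id_from_edition(json.load(resp))

# Hand-written sample record, only to show the shape being parsed.
sample_edition = {"works": [{"key": "/works/OL6307W"}]}
sample_work_id = work_id_from_edition(sample_edition)
```

Calling fetch_work_id("2070360075") would hit the live site, so be polite with request rates; for bulk lookups the data dumps are probably the better route.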
Except these things began as hobbies before business became involved at all.
It’s a message board for book lovers ffs, pre web people used to build things themselves, at their own expense, just for fun.
I hope it does! I'm banking on it too for my project.
I'm very disappointed to see that they're killing their API. I'll be looking for an alternative.
Its load times are among the longest of the sites I commonly visit, and it has barely had any new functionality added in the last 10 years.
I'm struggling to imagine what they could be; Goodreads doesn't exactly have much data that's really useful.
Guess it won't be happening now...
Goodreads is basically the only major social network for bookworms, yet the majority of its users hate it (including me), but are forced to use it to their chagrin.
You would think that would make the market ripe for a disruptor to arrive and topple the incumbent leader, yet each year nothing happens.
I personally have also thought of making a new rival product, but when you do the math on the market potential and the financial benefits, I just don't see a viable way.
People who read books, even if they read them every day, won't use your social network each day since books by themselves are the type of content which is consumed the longest (compared to a movie, tv show, song, or video game).
So you have a social network where users come back on a whim. Even if you read like a maniac and try to read one book per week (I tried it one year, it was crazy, you are basically spending all your free time reading), you wouldn't use the app each day, but perhaps once or twice a week, and who knows how long the average session would last?
To make things worse, you could maybe even get away with users using your app once a week if you have a big enough market (user base), but the number of book readers is not that great (especially compared to other media consumption)... The median American reads 4 books a year [1], or simply put, one book in three months.
So you have a social network where users don't need to use it often and there aren't a lot of users, that already spells trouble, but there is another major issue.
You could even succeed with those issues if you had a highly commoditized product to advertise, let's say a social network for yacht lovers, even if you have a small number of users and they do not use it much, you can still manage to succeed with it, since if you advertise yachts to potential yacht owners, you have a very valuable marketing channel which is worth quite an amount to the right people.
You can see where I am going with this, just compare a 5% commission on a yacht, vs a 5% commission on a book... Unfortunately books are not so highly valued (in monetary terms) nor sought after.
To sum up, you have a social network where users don't spend a lot of time, you don't have a lot of users and it is centered around a low profit product... Of course Goodreads has no competition, no sane person would touch that market with a ten foot pole.
Yet, to quote George Bernard Shaw, "all progress depends on the unreasonable man" [2]. If someone manages to solve this problem and find a profitable way to survive, I would not be surprised to see Goodreads fall.
I even thought of contacting Scribd to work with them, since I think they might currently have the best shot at positioning themselves as market leaders. They have an excellent product (Netflix for books) and already have a sizable user base. It would be interesting to see them expand and also become a social network for book lovers.
[1] - https://www.bustle.com/p/how-many-books-did-the-average-amer...
[2] - https://www.goodreads.com/quotes/536961-the-reasonable-man-a...
I believe that Amazon owns GoodReads. There's a natural conflict of interest, but at the end of the day sales are sales.
Last I heard, commissions for Amazon affiliates were not only for the linked item. For example, if you link them to the book, but they go on to buy a toaster oven, you'll receive a commission for the entire purchase.
If you think more user engagement is necessary, that's fine. Strategies could be developed to drive more engagement.
If the problems you see in the site are shared by others, I think you could have something viable. Please don't let a can't-do attitude stop you from experimenting. Web development is an open landscape for adventurers.
I think a site trying to make me “interact daily” about books would suck. But I do want a way to use reading and collection information of others to help me pick my own books.
But having a database of "all books" is not necessarily trivial. Even though OpenLibrary really does provide a great start, the contents seem to come from Goodreads/Amazon in a lot of cases, and I'm concerned about the legality of making a commercial competitor based on it.
Also, it would take a lot of time and data to get a good recommendation engine going. Amazon really is in the best position to do this. Just a shame that Goodreads gets so little love from them.