Rubygems.org AWS bill for Feb 2014 [pdf] (opens in new tab)

(dropbox.com)

251 pointsvbrendel12y ago147 comments

147 comments

I thought I'd answer some of your questions, as the person that pays the bill.

1. This can be cheaper on AWS. We've been meaning to move to reserve instances, paying a year at a time, for a while and simply haven't done it yet.

2. Fastly has already donate CDN usage to us, but we haven't fully utilized it yet as we're (slowly) sort out some issues between primary gem serving and the bundler APIs.

3. RubyCentral pays the bill and can afford to do so via the proceeds generated from RubyConf and RailsConf.

4. The administration is an all volunteer (myself included) effort. Because of that, paying a premium to use AWS has it's advantages because it allows more volunteers have help out given the well traveled platform. In the past, RubyGems was hosted on dedicated hardware within Rackspace. While this was certainly cheaper, it created administrative issues. Granted those can be solved without using AWS, but we get back to again desiring to have as low of friction on the administration as possible.

Any other questions?

briancurtin12y ago

> In the past, RubyGems was hosted on dedicated hardware within Rackspace. While this was certainly cheaper, it created administrative issues. Granted those can be solved without using AWS, but we get back to again desiring to have as low of friction on the administration as possible.

If Rackspace can be of assistance in the future, feel free to reach out (brian.curtin@rackspace.com). We currently donate hosting to many open source projects, including ones in a similar space, like the Python Package Index.

evanphx12y ago

Thanks! I'll bring it up with the team.

1 more reply

mattbee12y ago

Hey Evan, as with Rubyforge the last 7-odd years, you'd be welcome to a free account on Bytemark's UK cloud platform bigv.io, or dedicated servers, or a mix on the same VLAN. We're a Ruby shop ourselves, and we host a fair chunk of Debian in our data centre too these days (https://www.debian.org/News/2013/20130404). I So just drop me a line if that's of interest <matthew@bytemark.co.uk>.

I assume this was posted because it's an enormous bill :) but obviously if you're happy with it, carry on!

knyt12y ago

Did you consider using a mirror network, with servers run by external organizations, instead of going with AWS bandwidth for rubygems? Seems like that would be a good approach for the static/bulk part of your dataset, and there are lots of companies and universities who are set up to serve software. (The mirror I manage serves about 50 TB/month for several Linux distros, and many sites are larger.) Do the work and infrastructure required to manage these networks make them not worthwhile?

Edit: Found a post [0] calling for a rubygems mirror network. Otherwise there is lots of information about setting up local mirrors of the repository.

[0] http://binarymentalist.com/post/1314642927/proposal-we-have-...

evanphx12y ago

It's been discussed many times before, yes. Rubygems usage pattern by our users make any kind of mirror delay unacceptable. We currently run a number of mirrors, configured as caching proxies. I want to get us going on a CDN like Fastly soon because they provide effectively the same functionality but distributed to many, many more POPs that I will ever setup.

2 more replies

grey-area12y ago

It would be really interesting to see the bandwidth broken down by gem - I suspect rails would be at the top, but it'd be interesting to see.

If most of the installs are on servers, have you considered talking to server providers about setting up internal mirrors on their networks? That might save everyone a lot of bandwidth.

Of course, people shouldn't really be installing their gems from ruby gems on servers anyway, is there any way to prod bundler to make it default to package gems and do a local install where possible, rather than downloading them every time there is a deploy (the current default)? At present you use double bandwidth from people downloading once on their local machine, and once on their server to update.

Fetching the ruby gems index with bundler/rubygems still takes a while every time I bundle update, have you looked at optimising that part of the process further (at least it doesn't fetch a list of all gems now, but it still fetches a list of all versions of each gem doesn't it?), say caching older gem results? The list of gem versions available should not change for old ones, so you should really only need to fetch a very small list of latest versions. The memory usage and bandwidth usage is still quite high there.

jasonkester12y ago

Hey, a chance to plug my thing!

I built S3stat (https://www.s3stat.com/) to fix this opaqueness that comes with using Cloudfront as a CDN and get you at least back to the level of analytics you'd get if you were hosting files from one of your own servers.

RubyGems guys, if you have logging set up already, I'd be happy to run reports for all your old logs (gratis, naturally) so you can get a better idea of which files (and as another commenter wondered about, which sources) are costing you the most.

1 more reply

Meekro12y ago

I still don't understand how AWS was preferable to a dedicated server host. Could you elaborate on that?

evanphx12y ago

Virtualization allows us to spin up new instances and migrate traffic to them. This means we can work entirely from chef and keep things clean. This is important for our volunteers to have a complete picture of an instance and to be able to make new ones.

1 more reply

teeparham12y ago

Do you know who are the biggest consumers of bandwidth? I would guess the CI servers (Travis, Circle)

dwwoelfel12y ago

I think that bandwidth consumed by Circle should be free, since we're also hosted in AWS. Maybe somebody who knows more about the details of Amazon's billing can confirm/deny.

2 more replies

evanphx12y ago

A very good question. I'll see about crunching some of the logs to break it down by subnet.

2 more replies

dwwoelfel12y ago

How have the costs changed in the last year or so? It would be cool to see a month-over-month graph.

evanphx12y ago

I'll put that on my todo list.

VeejayRampay12y ago

No question. Though I'll use the occasion to thank you for all the dedication, financial commitment and awesome software you've provided us with in the Ruby community.

jacquesct12y ago

Need help with the VCL with Fastly? Drop me a line to my name minus ct @npmjs.com.

evanphx12y ago

Thanks! I'll definitely keep you in mind as we're (finally) getting around to setting up correctly.

1 more reply

ChrisDiNicolas12y ago

Have you looked into the Rackspace Cloud offering?

patio1112y ago

While one could probably knock a couple thousand bucks off that if one cared to (which is probably penny wise and pound foolish but invariably comes up in HN discussions of hosting costs), the amazing thing is that hundreds of thousands of people worldwide are able to use core infrastructure which costs less than the fully-loaded cost of a single billing clerk in your local municipal water department.

duaneb12y ago

> which costs less than the fully-loaded cost of a single billing clerk in your local municipal water department.

To be fair, a lot of maintenance value goes into the software that is never quantified. Broken software breaks hard, not partially, so maintenance is even more crucial.

fivethree12y ago

When a levy breaks people die. Software maintenance and damage is nothing compared to real engineering.

5 more replies

jcampbell112y ago

What is funny is that Github is footing the bill for most package systems, which were likely inspired by ruby gems, yet Github itself was built with Ruby gems. I am pretty sure the hosting costs for homebrew/npm round to nil (I could be wrong).

jpfuentes212y ago

If you mean the npm homebrew package, then yes. If you mean npm packages, then you might be living under a rock.

michaelmior12y ago

Why do you say GitHub is footing the bill? I understand for Homebrew since it uses GitHub repositories, but npm is served via a CouchDB instance which I believe is sponsored by Joyent.

adventured12y ago

Serious question: muni water dept billing clerks make over $84,000 per year all in?

Mister_Snuggles12y ago

Once you factor in employer-paid benefits, pension, etc, the employer's cost is something like 140% of the employee's gross pay. This turns the $84,000 into a $60,000 gross salary.

If you include other costs, like the office space and equipment used by the employee, it starts to sound pretty reasonable.

1 more reply

fletchowns12y ago

An employee costs a lot more than just the cost of their salary.

1 more reply

amalag12y ago

BTW this is paid for by : http://rubycentral.org/

evjan12y ago

Thank you, Ruby Central!

area51org12y ago

Was wondering that, thanks.

incision12y ago

At a glance, this looks like AWS being used like a dedicated host, which as demonstrated, isn't exactly cheap.

There's no spot or even reserved pricing, just a bunch of on-demand instances that were up 24/7 for all 28 days in February.

Seems like a genuine dedicated host, reserved instances or an architecture that leverages the elastic in elastic compute cloud would be worth considering.

saurik12y ago

A lot of the price is bandwidth. They are effectively being reamed by using CloudFront instead of negotiating a better rate with a "real" CDN (which will also give then much better performance, as CloudFront doesn't have many edge locations).

(Although, actually, while I verified their total dollars spent is greater than what would be required to get a fundamentally better deal on bandwidth, I didn't take into consideration that once you slash their costs the amount they would be paying might no longer be ;P.)

toomuchtodo12y ago

> They are effectively being reamed by using CloudFront instead of negotiating a better rate with a "real" CDN (which will also give then much better performance, as CloudFront doesn't have many edge locations).

You can negotiate with AWS to get the same Cloudfront pricing as you would with Akamai. I know because I'm in the the process right now.

More importantly, they could be running on 2-3 dedicated servers at OVH or Hetzer, and have Cloudflare in front of them instead of Cloudfront. Or, if they insist on Cloudfront, switch to Price Class 100 (US and EU only). Its cheaper, and latency isn't that much higher vs serving out of all Cloudfront locations.

As long as most of your content is static, and you have a solid CDN, your origin doesn't have to be highly reliable or scalable. Its just an object store to persist data for the CDN.

vacri12y ago

CloudFront doesn't have many edge locations

This is nonsense. They have more edge locations than most. I didn't try all comparators in the list, but out of half of them I tried, none had more than Cloudfront: http://www.cdnplanet.com/compare/cloudfront/maxcdn/

So if Cloudfront has 'not many', who has 'many', and how many is that?

1 more reply

vertis12y ago

Could probably fix some of this just by talking to Amazon about it. It's not like this is a 'for profit' setup.

1 more reply

stickydink12y ago

Right now we're a top 25 grossing iPhone game developer. The last AWS bill I saw was January's, a little under $200k.

I'm not on the server team, so I don't know exactly what contributes most to it. But part of me really thinks it could be reduced!

jcampbell112y ago

This bill is 2/3 bandwidth, and 1/3 compute.

Some games require massive amounts of compute, but the bandwidth to deliver the assets is generally paid by Apple.

I can guarantee you, your company is paying a metric fuck-ton more. It is called Apple's 30% cut.

Your company is paying AWS $200k to pass json messages around for analytics and social aspects of the game. You are paying Apple something like $1 million per week to distribute, market, and collect payments for the game.

I am not saying your company is dumb, or Apple is evil. I am saying your experience and anecdote isn't relevant to Ruby Gems, and offering a different way to think about the games industry vs. the open source software distribution world.

stickydink12y ago

We aren't paying that much in cut just yet. We're a small team (6 engineers in total). You don't have to be pulling in millions per week to get high on the grossing charts. We're probably around 1/4 of what you estimated.

Though you mention delivering the assets. Actually (like a lot of games) we make a big effort in getting under 50MB over-the-air limit on the App Store. The total content for retina iPhone is ~300MB, delivered in parts as you progress in the game. That's kept on S3, downloaded through CloudFront.

But yes! You're right, it's mostly a hell of a lot of JSON flying around.

2 more replies

toomuchtodo12y ago

If you've got Business or Enterprise support from AWS, look at their Trusted Advisor product. Its included, and does sanity checks for cost and security against all of the AWS resources you're using.

newobj12y ago

I love that you qualify it with "right now". Applaud your anti-hubris, stickydink.

ukd112y ago

The most interesting thing that I found about dealing with stuff at 6 figure+ scale per month on AWS was the un-advertised limits (nodes, provisioned volumes / total size, snapshots, elbs, etc) that you have to either hit, or extract from your account manager.

If anyone ever ends up doing something like this; ask them upfront!

ceejayoz12y ago

Amazon has a whole section in their documentation for the default limits. http://docs.aws.amazon.com/general/latest/gr/aws_service_lim...

When I've hit them, I've usually had a response to the "raise my limit" form within an hour or two.

wbond12y ago

Package Control is a far cry from the scale of RubyGems. PC uses a little over 2TB a month, whereas my calculations show RubyGems using around 50TB.

That said, early on I chose Linode because of their generous bandwidth that is included with the boxes. For the price of less than 1TB of AWS bandwidth, I get 8TB, plus a decent box. The bigger boxes have an even bigger proportion.

I'm not posting this to give any suggestions for RubyGems - I know nothing of the complexity of that setup. Mostly just figured I'd share the research I did for finding reasonably priced bandwidth.

ilaksh12y ago

The thing is there are many providers who can do the same and most of them will do it for less than half of this. Some less than 1/5th. I think they should move this to Digital Ocean and save $5000.

The bias towards AWS for this type of application is ridiculous and a big waste of money.

ghshephard12y ago

Whenever anybody makes this type of statement, I'm alway interested in knowing if they've ever run a site with this type of traffic, and this many customers.

In particular, have you ever run a site that consistently serves over 25 Terabytes of traffic/month, or have you worked with someone who has?

I guarantee you that no company I have worked for in the last 15 years, could have ever run this type of infrastructure for $7K/month. Its absolutely amazing.

Zarel12y ago

My site serves 25 TB/mo, and it costs me $80/mo...

$60/mo for a dedicated server, $20/mo for CloudFlare. The dedicated server only serves 1 TB of it, the other 24 TB is static assets cached and served directly by CloudFlare.

Here's a screenshot of CloudFlare Analytics for the last 30 days: http://d.pr/i/6Z8S/5GU2Ni8t

2 more replies

gaadd3312y ago

I'm not sure if RubyGems gets more traffic/has more intense computational needs/has more users than OkCupid, but that used to be hosted for about ~2-3K/mo from what I recall. However, that's not amortizing the cost of the hardware.

toomuchtodo12y ago

> I guarantee you that no company I have worked for in the last 15 years, could have ever run this type of infrastructure for $7K/month. Its absolutely amazing.

True, but lets not compare offerings of the past to now. There is still room for practical efficiency gains.

adrianpike12y ago

Are you including person costs in that $7k? If so, I totally agree.

1 more reply

ilaksh12y ago

Have you worked for Rackspace, Linode or Digital Ocean?

1 more reply

joshmn12y ago

> The bias towards AWS for this type of application is ridiculous and a big waste of money.

They could get an even better deal by just going through a dedicated server provider (or even better, colocating).

There's little advantage with choosing DO versus going with a dedicated server provider (and again, colocating). I guess the advantage would be the control panel that they wouldn't use, having a few one-click stacks that they won't use, stuff like that.

If someone can afford a $7,000 AWS bill they can afford to put some money towards hardware and an onApp license if they want "cloudy" stuff. To colocate their hardware it would probably run them anywhere from $400-$800 a month depending on where they go. Their total bill would be decreased by $5500 a month. The upfront investment of the hardware wouldn't be more than $12,000 either. LOE? Probably two weeks with a competent sysadmin.

Yes you can have issues with your hardware and stuff and then you have to take care of that, but if you're good with your DC, they're great to you.

vacri12y ago

at $600/month you've only saved them $1500/month (the hosting portion is only $2.1k), and now they also have physical hardware to manage, requiring a broader skillset from the volunteers, plus someone having to be in physical proximity for 'on-call' issues.

I don't know what datacentres tend to charge for data transfer, but as that's the largest item on the bill, it's the more salient point.

Also, just because it's not on the bill doesn't mean they're not using other AWS services; there are several free ones.

ceejayoz12y ago

> To colocate their hardware it would probably run them anywhere from $400-$800 a month depending on where they go.

For one datacenter, but CloudFront gets you 40+.

gaadd3312y ago

Bandwidth is by far their biggest cost, colocation/dedicated hosting would save a substantial amount but you are still going to be looking at something in the ballpark of $1,000/mo for 1Gbps. (Unless Cogent has slashed prices even further)

zimbatm12y ago

Digital Ocean doesn't provide a CDN. EC2 only account for 1.4k in the bill so I don't see how you would spare $5000.-

TylerE12y ago

Bandwidth on DO is (worst case), $20.48/TB, so their 17TB usage would cost $348/month. A far cry from $5,000.

1 more reply

jdubs12y ago

It's often difficult to migrate providers when the application is complex or the owners may see the value in the provider. They're falling right into the hands of most cloud providers evil plans, they make it cheap to get started but as time moves on, it becomes more difficult to migrate away.

vertis12y ago

We moved the stack to Amazon in about 60 hrs last year (gems were already on S3). Given that time involved writing a lot of chef recipes, I'd say if pushed we could move out again in an even shorter period of time.

Everything needed to build the rubygems.org stack can be found at https://github.com/rubygems/rubygems-aws

xs_kid12y ago

I guess host it in AWS is a benefit for the integration with other services hosted in Amazon like TravisCI (the most popular CI for open-source Ruby projects) and Heroku (the most popular hosting for Ruby projects)

jpfuentes212y ago

There's already 30+ comments on this thread and no one has pointed out the obvious: this is all for the peanut gallery to laugh at Npm, Inc.

If the bill remained relatively consistent they could host Rubygems.org for ~28 months with 200K.

hurrycane12y ago

We run into the same cost-related problems for our CDN. What we did to solve it was to rent dedicated servers that are near AWS regions. We used Route53 latency based routing to route traffic to that dedicated servers + Nginx + LUA. We're serving 300+ TB of traffic per month and the total price is just a percent of the RubyGems AWS Bill. There is some maintenance included with this solution and the problem is finding the right dedicated server providers.

reustle12y ago

That's not as bad as I was expecting. I was once working with a startups infrastructure (>100 servers) and it was near 20k/mo (mostly reserved instances)

jayvanguard12y ago

Yes, this seems quite reasonable considering the scale it handles.

jtrtoo12y ago

Since it can take a bit of time to read through the invoice, here's a summary of the bill:

CloudFront $1,071 Data Transfer $3,597 EC2 $2,184 S3 $ 228

While "bandwidth" costs equate to ~$4,668/month, only $1,071 is CDN (CloudFront), with the balance just raw Data Transfer.

Since lots of folks are commenting, and not everyone realizes the difference it's also a good time to point out the CloudFront vs. Data Transfer distinction.

Using Amazon's terms... Data Transfer means anything directly served/coming from EC2 or S3 (or a few other services which aren't relevant here), but NOT anything for CloudFront (which is, obviously, a separate line item, as shown above).

The bulk of CDN (CloudFront) usage ($735 worth or 69%) is US.

The bulk of Data raw bandwidth (Data Transfer) usage ($2,931 ~80%) is US East.

jtrtoo12y ago

Is any of this good/bad/right/wrong? I have no idea. That depends quite a bit on what THEY are doing with it and why. For example, it can be cheaper to distribute from CloudFront versus straight from S3 for some use cases. Though, generally, you are not only looking at using CloudFront to save money over S3 ...there's typically a performance reason.

And sometimes the hosting costs simply don't matter. It's easy for us engineers - siting here on HN - to sit at our keyboards and play around with hypothetical ways to save money. This isn't necessarily a bad thing, but there are numerous things in IT that it doesn't make sense to optimize. Why? Because the ROI on the engineering time, CapEx, and OpEx (and the time, energy, and focus of ANYONE involved or impacted at all) to do the optimization doesn't outweigh the opportunity cost.

Sometimes there are simply better uses of our limited capital and time.

Not everything needs to be optimized. And the argument gets stronger when there are other factors more difficult to factor in: adopting a platform that isn't as widely known or isn't backed by a similar level of maturity (even with it's quirks, at least they are well known), etc.

The risks/concerns not only vary between organizations, but often from one period of an organization's growth to the next. The beauty is every organization gets to make their own decision ...and none of them have to give a damn if the HN community agrees or not. :-)

SeoxyS12y ago

While by no means insignificant, this bill is no where near what I'd imagine would warrant a HN post. I wouldn't be surprised if most startups beat this regularly.

The startup whose backend I co-created racks up an AWS bill that hovers around a half million dollars a month. We make use of all of the ways to save with Amazon: pre-paid reserved instances, negotiated deals, etc. And we're not even that big; imagine what Netflix's AWS bill must cost?

We've tried other providers, toyed with co-locating, but at the end of the day the flexibility and cost benefit of IaaS outweighed the lower base price of CPU cycles when you roll it yourself.

twotwotwo12y ago

> this bill is no where near what I'd imagine would warrant a HN post.

Can only guess at why folks like any post, but it's not necessarily how large the bill is. Maybe it's how low it is for a service that's widely relied on, or maybe it's the level of transparency, which turned out to include evanphx above showing up to answer questions about the project.

matthewrudy12y ago

Absolutely, this is a transparency thing.

Compared to npm asking for $300,000 in donations to keep the thing running. I'm glad RubyGems can run for relatively so little, and be transparent in doing so.

ne0lithic12y ago

With most of this being bandwidth costs, it seems like switching to a host like Digital Ocean would make more sense here. The bandwidth costs are a fraction of Amazon's in comparison.

As for the CDN, switching to something like Cloudflare might make more sense rather than relying on Cloudfront. At the least, there's a "US and EU only" option for edge locations to use which si considerably cheaper than the default option of all edge locations.

mje__12y ago

Wow, as someone who uses rubygems all day and is not in "US and EU only", I'm glad you're not involved in this project.

jtrtoo12y ago

I presume you mentioned Cloudflare because of their "unlimited bandwidth". That comes with some constraints as to the use/application: https://www.cloudflare.com/terms.html

It's possible RubyGems.org would be classified under one of the "not really allowed here" terms.

ceejayoz12y ago

> With most of this being bandwidth costs, it seems like switching to a host like Digital Ocean would make more sense here. The bandwidth costs are a fraction of Amazon's in comparison.

That's just replacing bandwidth costs with build-and-run-your-own-CDN costs.

sampierson12y ago

Why was this even posted? Looking for help reducing it? Complaining about the amount spent? Looking for a pat on the back?

I saw a talk at Ruby/RailsConf about the work spent building and maintaining rubygems.org. It smelled a bit martyrish. "Look at the thankless work we perform behind the scenes".

Well, if help is required building or operating rubygems.org, please just say so. As a seasoned Ruby developer I'd be more than happy to contribute development time, and as a daily user I'd be willing to commit financially in a small way towards operating costs. Not that that is required - given all the offers of free hosting this post received in response.

If we don't know about a problem, we can't help. Just ask if help is what you want. It's not like the Ruby community doesn't have great communication channels.

reillyse12y ago

This seems reasonable to me? Why is this a newsworthy item?

pataphysician12y ago

Transparency is nice and the guy paying the bill answered questions for curious minds.

trustfundbaby12y ago

exactly. I came here expecting to see a $200k/mo bill tbh

rebyn12y ago

How can I donate to Ruby Central? Checked out their "Support" page yet it didn't help much. Any easier ways like donating via Paypal?

tyw12y ago

could easily knock multiple thousand bucks off of that by just reserving the ec2 servers you know you'll need, plus reserve the cloudfront bandwidth you know you'll need (for the amount of data served I believe you should be able to cut CF costs by at least half).

3 year heavy EC2 reservations pay for themselves in ~7 months, cloudfront reserved bandwidth is just a 12 month agreement so that costs nothing up front. You might want to experiment with some different instance types though, depending on your resource utilization. Personally I really like using the new c3.large instances for my web servers and anything else that needs more CPU than memory, proportionately. If the standard instances suit your needs better you still might want to move to the m3 class.

Aside from those two items it looks like you are sending out a considerable amount of stuff from EC2->internet (27 TB transfer out from US-East to internet). I'd recommend looking at whether you could set up a cloudfront distribution with your EC2 servers as its origin.

colinbartlett12y ago

I had no idea it cost this much to host rubygems.org.

The website says that hosting is provided by BlueBox?

howardr12y ago

They might provide by paying for it rather than hosting themselves

dschwartz8812y ago

This seems to be mostly their CDN bill. Not sure, but I don't really consider a CDN as part of hosting fees, more of a general infrastructure fee.

davidradcliffe12y ago

This includes all our compute fees too.

qrush12y ago

Bluebox hosts one of the gem mirrors...AWS is the primary source though.

lassebunk12y ago

Could it be possible to cache the version list locally and then just update it incrementally, e.g. via Git? Wouldn't this save both download time (for us), and bandwidth (for RubyGems)?

ww52012y ago

Interesting. Looks like most are bandwidth cost.

senthilnayagam12y ago

assuming there is a direct correlation between requests and projects, we can do a guestimate on ruby developers and projects which are active

soheil12y ago

That's it?

dushyant12y ago

WHY DOWNLOAD!!! WHY

j / k navigate · click thread line to collapse

147 comments

evanphx12y ago

I thought I'd answer some of your questions, as the person that pays the bill.

1. This can be cheaper on AWS. We've been meaning to move to reserve instances, paying a year at a time, for a while and simply haven't done it yet.

2. Fastly has already donate CDN usage to us, but we haven't fully utilized it yet as we're (slowly) sort out some issues between primary gem serving and the bundler APIs.

3. RubyCentral pays the bill and can afford to do so via the proceeds generated from RubyConf and RailsConf.

Any other questions?

briancurtin12y ago

evanphx12y ago

Thanks! I'll bring it up with the team.

1 more reply

mattbee12y ago

I assume this was posted because it's an enormous bill :) but obviously if you're happy with it, carry on!

knyt12y ago

Edit: Found a post [0] calling for a rubygems mirror network. Otherwise there is lots of information about setting up local mirrors of the repository.

[0] http://binarymentalist.com/post/1314642927/proposal-we-have-...

evanphx12y ago

2 more replies

grey-area12y ago

It would be really interesting to see the bandwidth broken down by gem - I suspect rails would be at the top, but it'd be interesting to see.

If most of the installs are on servers, have you considered talking to server providers about setting up internal mirrors on their networks? That might save everyone a lot of bandwidth.

jasonkester12y ago

Hey, a chance to plug my thing!

1 more reply

Meekro12y ago

I still don't understand how AWS was preferable to a dedicated server host. Could you elaborate on that?

evanphx12y ago

1 more reply

teeparham12y ago

Do you know who are the biggest consumers of bandwidth? I would guess the CI servers (Travis, Circle)

dwwoelfel12y ago

I think that bandwidth consumed by Circle should be free, since we're also hosted in AWS. Maybe somebody who knows more about the details of Amazon's billing can confirm/deny.

2 more replies

evanphx12y ago

A very good question. I'll see about crunching some of the logs to break it down by subnet.

2 more replies

dwwoelfel12y ago

How have the costs changed in the last year or so? It would be cool to see a month-over-month graph.

evanphx12y ago

I'll put that on my todo list.

VeejayRampay12y ago

No question. Though I'll use the occasion to thank you for all the dedication, financial commitment and awesome software you've provided us with in the Ruby community.

jacquesct12y ago

Need help with the VCL with Fastly? Drop me a line to my name minus ct @npmjs.com.

evanphx12y ago

Thanks! I'll definitely keep you in mind as we're (finally) getting around to setting up correctly.

1 more reply

ChrisDiNicolas12y ago

Have you looked into the Rackspace Cloud offering?

patio1112y ago

duaneb12y ago

> which costs less than the fully-loaded cost of a single billing clerk in your local municipal water department.

To be fair, a lot of maintenance value goes into the software that is never quantified. Broken software breaks hard, not partially, so maintenance is even more crucial.

fivethree12y ago

When a levy breaks people die. Software maintenance and damage is nothing compared to real engineering.

5 more replies

jcampbell112y ago

jpfuentes212y ago

If you mean the npm homebrew package, then yes. If you mean npm packages, then you might be living under a rock.

michaelmior12y ago

Why do you say GitHub is footing the bill? I understand for Homebrew since it uses GitHub repositories, but npm is served via a CouchDB instance which I believe is sponsored by Joyent.

adventured12y ago

Serious question: muni water dept billing clerks make over $84,000 per year all in?

Mister_Snuggles12y ago

Once you factor in employer-paid benefits, pension, etc, the employer's cost is something like 140% of the employee's gross pay. This turns the $84,000 into a $60,000 gross salary.

If you include other costs, like the office space and equipment used by the employee, it starts to sound pretty reasonable.

1 more reply

fletchowns12y ago

An employee costs a lot more than just the cost of their salary.

1 more reply

amalag12y ago

BTW this is paid for by : http://rubycentral.org/

evjan12y ago

Thank you, Ruby Central!

area51org12y ago

Was wondering that, thanks.

incision12y ago

At a glance, this looks like AWS being used like a dedicated host, which as demonstrated, isn't exactly cheap.

There's no spot or even reserved pricing, just a bunch of on-demand instances that were up 24/7 for all 28 days in February.

Seems like a genuine dedicated host, reserved instances or an architecture that leverages the elastic in elastic compute cloud would be worth considering.

saurik12y ago

toomuchtodo12y ago

You can negotiate with AWS to get the same Cloudfront pricing as you would with Akamai. I know because I'm in the the process right now.

As long as most of your content is static, and you have a solid CDN, your origin doesn't have to be highly reliable or scalable. Its just an object store to persist data for the CDN.

vacri12y ago

CloudFront doesn't have many edge locations

So if Cloudfront has 'not many', who has 'many', and how many is that?

1 more reply

vertis12y ago

Could probably fix some of this just by talking to Amazon about it. It's not like this is a 'for profit' setup.

1 more reply

stickydink12y ago

Right now we're a top 25 grossing iPhone game developer. The last AWS bill I saw was January's, a little under $200k.

I'm not on the server team, so I don't know exactly what contributes most to it. But part of me really thinks it could be reduced!

jcampbell112y ago

This bill is 2/3 bandwidth, and 1/3 compute.

Some games require massive amounts of compute, but the bandwidth to deliver the assets is generally paid by Apple.

I can guarantee you, your company is paying a metric fuck-ton more. It is called Apple's 30% cut.

stickydink12y ago

But yes! You're right, it's mostly a hell of a lot of JSON flying around.

2 more replies

toomuchtodo12y ago

If you've got Business or Enterprise support from AWS, look at their Trusted Advisor product. Its included, and does sanity checks for cost and security against all of the AWS resources you're using.

newobj12y ago

I love that you qualify it with "right now". Applaud your anti-hubris, stickydink.

ukd112y ago

If anyone ever ends up doing something like this; ask them upfront!

ceejayoz12y ago

Amazon has a whole section in their documentation for the default limits. http://docs.aws.amazon.com/general/latest/gr/aws_service_lim...

When I've hit them, I've usually had a response to the "raise my limit" form within an hour or two.

wbond12y ago

Package Control is a far cry from the scale of RubyGems. PC uses a little over 2TB a month, whereas my calculations show RubyGems using around 50TB.

I'm not posting this to give any suggestions for RubyGems - I know nothing of the complexity of that setup. Mostly just figured I'd share the research I did for finding reasonably priced bandwidth.

ilaksh12y ago

The thing is there are many providers who can do the same and most of them will do it for less than half of this. Some less than 1/5th. I think they should move this to Digital Ocean and save $5000.

The bias towards AWS for this type of application is ridiculous and a big waste of money.

ghshephard12y ago

Whenever anybody makes this type of statement, I'm alway interested in knowing if they've ever run a site with this type of traffic, and this many customers.

In particular, have you ever run a site that consistently serves over 25 Terabytes of traffic/month, or have you worked with someone who has?

I guarantee you that no company I have worked for in the last 15 years, could have ever run this type of infrastructure for $7K/month. Its absolutely amazing.

Zarel12y ago

My site serves 25 TB/mo, and it costs me $80/mo...

$60/mo for a dedicated server, $20/mo for CloudFlare. The dedicated server only serves 1 TB of it, the other 24 TB is static assets cached and served directly by CloudFlare.

Here's a screenshot of CloudFlare Analytics for the last 30 days: http://d.pr/i/6Z8S/5GU2Ni8t

2 more replies

gaadd3312y ago

toomuchtodo12y ago

> I guarantee you that no company I have worked for in the last 15 years, could have ever run this type of infrastructure for $7K/month. Its absolutely amazing.

True, but lets not compare offerings of the past to now. There is still room for practical efficiency gains.

adrianpike12y ago

Are you including person costs in that $7k? If so, I totally agree.

1 more reply

ilaksh12y ago

Have you worked for Rackspace, Linode or Digital Ocean?

1 more reply

joshmn12y ago

> The bias towards AWS for this type of application is ridiculous and a big waste of money.

They could get an even better deal by just going through a dedicated server provider (or even better, colocating).

Yes you can have issues with your hardware and stuff and then you have to take care of that, but if you're good with your DC, they're great to you.

vacri12y ago

I don't know what datacentres tend to charge for data transfer, but as that's the largest item on the bill, it's the more salient point.

Also, just because it's not on the bill doesn't mean they're not using other AWS services; there are several free ones.

ceejayoz12y ago

> To colocate their hardware it would probably run them anywhere from $400-$800 a month depending on where they go.

For one datacenter, but CloudFront gets you 40+.

gaadd3312y ago

zimbatm12y ago

Digital Ocean doesn't provide a CDN. EC2 only account for 1.4k in the bill so I don't see how you would spare $5000.-

TylerE12y ago

Bandwidth on DO is (worst case), $20.48/TB, so their 17TB usage would cost $348/month. A far cry from $5,000.

1 more reply

jdubs12y ago

vertis12y ago

Everything needed to build the rubygems.org stack can be found at https://github.com/rubygems/rubygems-aws

xs_kid12y ago

jpfuentes212y ago

There's already 30+ comments on this thread and no one has pointed out the obvious: this is all for the peanut gallery to laugh at Npm, Inc.

If the bill remained relatively consistent they could host Rubygems.org for ~28 months with 200K.

hurrycane12y ago

reustle12y ago

That's not as bad as I was expecting. I was once working with a startups infrastructure (>100 servers) and it was near 20k/mo (mostly reserved instances)

jayvanguard12y ago

Yes, this seems quite reasonable considering the scale it handles.

jtrtoo12y ago

Since it can take a bit of time to read through the invoice, here's a summary of the bill:

CloudFront $1,071 Data Transfer $3,597 EC2 $2,184 S3 $ 228

While "bandwidth" costs equate to ~$4,668/month, only $1,071 is CDN (CloudFront), with the balance just raw Data Transfer.

Since lots of folks are commenting, and not everyone realizes the difference it's also a good time to point out the CloudFront vs. Data Transfer distinction.

The bulk of CDN (CloudFront) usage ($735 worth or 69%) is US.

The bulk of Data raw bandwidth (Data Transfer) usage ($2,931 ~80%) is US East.

jtrtoo12y ago

Sometimes there are simply better uses of our limited capital and time.

SeoxyS12y ago

While by no means insignificant, this bill is no where near what I'd imagine would warrant a HN post. I wouldn't be surprised if most startups beat this regularly.

We've tried other providers, toyed with co-locating, but at the end of the day the flexibility and cost benefit of IaaS outweighed the lower base price of CPU cycles when you roll it yourself.

twotwotwo12y ago

> this bill is no where near what I'd imagine would warrant a HN post.

matthewrudy12y ago

Absolutely, this is a transparency thing.

Compared to npm asking for $300,000 in donations to keep the thing running. I'm glad RubyGems can run for relatively so little, and be transparent in doing so.

ne0lithic12y ago

With most of this being bandwidth costs, it seems like switching to a host like Digital Ocean would make more sense here. The bandwidth costs are a fraction of Amazon's in comparison.

mje__12y ago

Wow, as someone who uses rubygems all day and is not in "US and EU only", I'm glad you're not involved in this project.

jtrtoo12y ago

I presume you mentioned Cloudflare because of their "unlimited bandwidth". That comes with some constraints as to the use/application: https://www.cloudflare.com/terms.html

It's possible RubyGems.org would be classified under one of the "not really allowed here" terms.

ceejayoz12y ago

> With most of this being bandwidth costs, it seems like switching to a host like Digital Ocean would make more sense here. The bandwidth costs are a fraction of Amazon's in comparison.

That's just replacing bandwidth costs with build-and-run-your-own-CDN costs.

sampierson12y ago

Why was this even posted? Looking for help reducing it? Complaining about the amount spent? Looking for a pat on the back?

I saw a talk at Ruby/RailsConf about the work spent building and maintaining rubygems.org. It smelled a bit martyrish. "Look at the thankless work we perform behind the scenes".

If we don't know about a problem, we can't help. Just ask if help is what you want. It's not like the Ruby community doesn't have great communication channels.

reillyse12y ago

This seems reasonable to me? Why is this a newsworthy item?

pataphysician12y ago

Transparency is nice and the guy paying the bill answered questions for curious minds.

trustfundbaby12y ago

exactly. I came here expecting to see a $200k/mo bill tbh

rebyn12y ago

How can I donate to Ruby Central? Checked out their "Support" page yet it didn't help much. Any easier ways like donating via Paypal?

tyw12y ago

colinbartlett12y ago

I had no idea it cost this much to host rubygems.org.

The website says that hosting is provided by BlueBox?

howardr12y ago

They might provide by paying for it rather than hosting themselves

dschwartz8812y ago

This seems to be mostly their CDN bill. Not sure, but I don't really consider a CDN as part of hosting fees, more of a general infrastructure fee.

davidradcliffe12y ago

This includes all our compute fees too.

qrush12y ago

Bluebox hosts one of the gem mirrors...AWS is the primary source though.

lassebunk12y ago

Could it be possible to cache the version list locally and then just update it incrementally, e.g. via Git? Wouldn't this save both download time (for us), and bandwidth (for RubyGems)?

ww52012y ago

Interesting. Looks like most are bandwidth cost.

senthilnayagam12y ago

assuming there is a direct correlation between requests and projects, we can do a guestimate on ruby developers and projects which are active

soheil12y ago

That's it?

dushyant12y ago

WHY DOWNLOAD!!! WHY

j / k navigate · click thread line to collapse