The Cloud is Just Someone Else's Computer

#13

What do you do when you can’t remote login to your server anymore because it crashed? When it’s in your own place you can still (physically) press that reboot button… And with a VPS you can still reboot it from a web based control panel…

#14

The vCPUs at DO are threads, not actual cores, so your mini-PC would actually be 12 vCPUs.

The memory in the mini-PC isn’t ECC.

The cloud also allows scaling up and down. So assuming you don’t need the CPU running 24 hours a day, it should be quite a bit cheaper.

I think there could be some new DO price drop soon. I really wish they could offer 1:1 CPU / Memory instances.

#15

Heavy Azure user here.

Fun fact: the fastest F-series Azure VM is the same speed per CPU core (both measured inside a VM) as our 2012 Xeon E5-2670 dev server.

Still happy to use Azure as - for regulatory purposes - we need to be able to quickly spin up machines in secure hosting locations in various countries around the world.

#16

Ah I should have mentioned that in the post – you do get managed power rails with EndOffice so you can do a power cycle reboot via a web UI.

But you’re right, for real server infrastructure you definitely want built in hardware KVM over TCP/IP, which goes by a few different names, for SuperMicro it is IPMI, for Dell it is DRAC, for HP it is ILO, for IBM it is IMM etc.
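For anyone who has not used it, here is a rough sketch of what a remote power cycle looks like when the box has a BMC that speaks IPMI over LAN. It assumes the `ipmitool` CLI is installed on the machine you run it from; the host name, user name, and the two helper functions are made up for illustration, and vendor-specific tools for DRAC, ILO, or IMM may look different.

```python
import os
import subprocess

# Power actions accepted by `ipmitool chassis power <action>`.
ALLOWED_ACTIONS = {"status", "on", "off", "cycle", "reset"}


def ipmi_power_command(host: str, user: str, action: str = "status") -> list[str]:
    """Build the ipmitool argument list for a chassis power action.

    The password is not placed on the command line: `-E` tells ipmitool
    to read it from the IPMI_PASSWORD environment variable.
    """
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported power action: {action!r}")
    return [
        "ipmitool",
        "-I", "lanplus",   # IPMI v2.0 over LAN
        "-H", host,        # BMC address (hypothetical in the example below)
        "-U", user,        # BMC user name (hypothetical in the example below)
        "-E",              # read password from IPMI_PASSWORD
        "chassis", "power", action,
    ]


def run_ipmi_power(host: str, user: str, password: str, action: str = "status") -> str:
    """Run the command and return ipmitool's output; raises if it fails."""
    env = dict(os.environ, IPMI_PASSWORD=password)
    result = subprocess.run(
        ipmi_power_command(host, user, action),
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Something like `run_ipmi_power("bmc.example.net", "admin", password, "cycle")` would then hard power cycle a crashed machine without anyone touching the reboot button.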

This doesn’t really matter unless you run hundreds of servers. Some interesting 2018 data here:

RAM used to be among the least reliable component in a computer, but over the last 5 or so years it has improved greatly. In 2018, RAM in general had an overall failure rate of .41% (1 in every 244), but the field failure rate was just .07% (1 in every 1400). So, while RAM is still at risk of failing - especially since you often have 4 or more sticks in a system - after it goes through all our testing and quality control process it is actually very reliable for our customers.

That said, they did see improved field failure rates for ECC of 1:5000 versus 1:1000 for plain old DDR4.
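To put those ratios in fleet terms, here is a back-of-the-envelope sketch. It takes the 1:1000 and 1:5000 field failure rates quoted above at face value, and assumes 4 sticks per server (the "4 or more sticks" in the quote) and independent failures. The quote does not say what time window the rates cover, so treat the output as per that same unstated window.

```python
def p_any_failure(per_stick_rate: float, sticks: int) -> float:
    """Chance that at least one of `sticks` modules fails, assuming independence."""
    return 1.0 - (1.0 - per_stick_rate) ** sticks


def expected_failed_sticks(per_stick_rate: float, sticks_per_server: int, servers: int) -> float:
    """Expected number of failed modules across a whole fleet."""
    return per_stick_rate * sticks_per_server * servers


NON_ECC_RATE = 1 / 1000   # field failure rate quoted for plain DDR4
ECC_RATE = 1 / 5000       # field failure rate quoted for ECC
STICKS_PER_SERVER = 4     # assumption, taken from "4 or more sticks"

for servers in (1, 100, 1000):
    non_ecc = expected_failed_sticks(NON_ECC_RATE, STICKS_PER_SERVER, servers)
    ecc = expected_failed_sticks(ECC_RATE, STICKS_PER_SERVER, servers)
    print(f"{servers:>5} servers: ~{non_ecc:.3f} failed non-ECC sticks vs ~{ecc:.3f} ECC")

print(f"one 4-stick non-ECC box: {p_any_failure(NON_ECC_RATE, STICKS_PER_SERVER):.2%} chance of any failure")
```

Under these assumptions a single box expects about 0.004 failed sticks either way you round it, while a 1000-server fleet expects about 4 non-ECC failures versus about 0.8 with ECC, which is why the difference only starts to show up at scale.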

#17

Yep, this is where I landed a few years ago and I’m still there. The provider is responsible for hardware repairs, you get remote control (KVM/IPMI/DRAC), and you can do OS installs remotely using the provider’s provisioning system. Lots of options out there: Hetzner, OVH, online.net.

My current box at OVH (which I use for offsite backup and web hosting) is an SP-32 (Xeon E3-1270v6, 4c/8t, 3.8GHz/4.2GHz, 32GB DDR4 ECC 2400 MHz, SoftRaid 3x4TB SATA, 1 Gbps bandwidth) and costs about $100/month.

#19

On the following:

@codinghorror I’d love to see a write-up on how you’re tackling this side of things. It’d be fun to see the automation and tooling you’re using, or your monitoring setup. Even a “we spend way more time on this than we should, but it’s fun!” post would be an awesome read.

#20

Well it is fairly boring.

Setup — simply plug in memory and the ssd and then power it on. The only little tweaky thing is you need to turn on power resume after power loss in the BIOS.

Software — an absolutely bog standard Ubuntu Server 18 LTS install with all defaults. Turn on automatic OS updates, that’s about it.

Testing — documented in the blog post already linked but here it is again https://blog.codinghorror.com/is-your-computer-stable/ this takes about a day as memory testing with 32gb is a few hours, and you should let mprime run overnight.

There is honestly not much more to it than that. Modern Linux is extremely mature, modern PC hardware is easy to set up and exceedingly reliable.

#21

Thanks for the link :smile:

When I see a tweet like the one I posted above, it implies that there exists in the world a very unreliable server. A server that hosts software that takes days to deploy, or needs constant tweaking due to underlying misunderstandings of system settings, or doesn’t have any monitoring, or hasn’t been secured properly, or that goes down unexpectedly, or… (I could go on.)

#22

I used to be afraid of that too, but we started deploying Discourse instances on Docker containers at scale in 2014 – as an open source project free to anyone and everyone – and those servers have been remarkably stable over the years. I even upgraded Ubuntu Server versions with little incident.

(I know this because I personally installed and supported around 325 of them under our early $99 Discourse install plan, which ran from 2014 to 2016. We don’t do that any more, but you can head to https://www.literatecomputing.com/ which does :wink: )

#23

Recommendations on memory and storage for self-purchase and installation in the Partaker B18? Compatibility specs are a bit unclear on the websites. I’m going to try this fun experiment. Looking for 32gb memory and 1tb nvme.

Your blog says B19 but the sites say B18. Minor typo?

#24

I run IT for a small non-profit with big computing needs. I inherited a setup consisting of a boatload of colocated ‘ancient’ servers and network gear, alongside a corporate attitude that we would naturally migrate everything to the cloud once contracts/warranties were up. We had already started putting new systems into AWS piecemeal, so it felt inevitable. The colo bills seemed frightening, as did the software licensing costs per core, etc. I was no stranger to bitching about these myself.
Then I decided to run the numbers, due diligence, you know, of a COMPLETE hardware replacement, colocated, financed over 3 years, versus the required cloud servers to do the same job. Er… Absolutely no contest. We could buy and colocate all the hardware we needed for those first 3 years for the cost of 1 (one) year in the cloud. And in years 4+, we’d have no hardware cost (above support contracts). Now, admittedly, by then we’d be running on ‘old’ hardware, but we’re vanishingly close to 100% uptime and hitting performance goals on kit from when God was a boy, so this stuff would be around for a long time, living in its perfectly air conditioned, power smoothed environment.
And, y’know, no money, so we’re used to making do.
So, much as I would have loved to be able to put all our stuff in a crusher, there was no viable case I could make. The numbers were simply overwhelmingly stacked towards our own kit. Once it hits you that there are 8765 hours in a year, and you’re running 200 VMs, that colo bill turned into a per VM hour figure? Forget about it. :slight_smile:
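The per-VM-hour conversion is simple enough to sketch. The 200 VMs and 8765 hours come from the post above; the annual bill is a purely hypothetical placeholder, since the real figure was not given.

```python
HOURS_PER_YEAR = 8765   # figure used in the post (365.25 days * 24 h is about 8766)
VM_COUNT = 200          # number of VMs mentioned in the post


def cost_per_vm_hour(annual_bill: float,
                     vm_count: int = VM_COUNT,
                     hours_per_year: int = HOURS_PER_YEAR) -> float:
    """Spread a yearly colocation bill across every VM for every hour of the year."""
    return annual_bill / (vm_count * hours_per_year)


# Hypothetical all-in yearly cost (colo fees plus financed hardware); not a real number.
example_annual_bill = 120_000.0

print(f"${cost_per_vm_hour(example_annual_bill):.4f} per VM hour")
```

With that made-up $120,000 bill, the result works out to roughly 7 cents per VM hour, which is the kind of number to set against a cloud provider's hourly instance price.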

#25

Ah sorry about that, fixed the typo.

Any DDR4 SO-DIMM 2400 should be fine, and any M.2 NVMe PCIe drive. I went with Corsair Vengeance 32GB and Samsung 970 Pro 512GB.

I tend to agree that you want to spec up on boxes you plan to have in colocation service for 3 years. I’ve never regretted going for the slightly bigger / better boxes in these roles, as they tend to age better when you do. Note that 64GB isn’t possible, I looked. Nobody makes DDR4 SO-DIMMs in 32GB capacity quite yet :wink:

(That said if you do want to save a few $, I’d possibly go with the 970 Evo series SSDs, and maybe the 4 core / 8 thread CPU in the B18. Those are viable tradeoffs.)

Let us know how your experiment goes. Reply here, I will definitely see it, even years later. I’m in this kind of stuff for the long haul. :truck:

#26

Just curious, but has anyone here tried to run a globally distributed set of services which need 3 nines of availability in multiple geographies and be able to handle peak QPS of between 1,000 and 10,000 on a couple of PCs you bought from Newegg?

I always find these comparisons between the cost of the cloud vs DIY approaches kind of hilarious.

#27

For nominal meteor strike protection, and true global availability you definitely need the :cloud: no question.

The funny thing is, today’s hardware is so crazy fast – particularly if you’re not paying Intel’s “Xeon Platinum” enterprise tax – that these kinds of super high throughput numbers are kinda not that tough to hit on just a few colocated boxes like the one in this blog post.

#28

My point is… the cloud is not built for a couple of guys in their basement. It’s built for large orgs who need multiple levels of redundancy, have to deal with all sorts of data sovereignty and HIPAA, GDPR (etc.) compliance issues, and quite frankly aren’t in the business of operating servers because they’re a shoe manufacturer or an insurance company. Whilst I don’t doubt that a single hardcore server can crank out some QPS, when you start adding globally consistent data stores, external APIs, or dependent systems, you quickly realize that you’re actually I/O bound, and the fattest machine on the planet isn’t going to help unless you can scale up to a few thousand instances of your app in a few seconds, then scale back down to nothing so you’re not paying out the ass for machines you’re not using.

And don’t get me started on security. Sure, you can run your own kit and save some pennies, until you’re pwned by the next Linux kernel vulnerability and you have to take your entire system down to patch the OS. And I’m sure it’s fine that you just run the OSS version of every single cloud service you’d want, and that it won’t have any security holes that mean your customers’ personal information ends up being downloaded by your second cousin using a Russian VPN and a Raspberry Pi. I mean, how hard is it to run a messaging bus, database, search index, map reduce cluster, index service, logs analysis and data storage system, all securely and using a common set of control credentials? Should be pretty easy, right?

Comparing the cost of a server you buy against the cost of a cloud provider is nonsensical unless you plan to actually measure it correctly, and when you do generally you’ll find the cloud is the biggest bargain out there.

</end_rant>

#29

A few thousand? I don’t think so. As an example, Stack Overflow, which I co-founded, ran on 25 servers in 2014:

http://highscalability.com/blog/2014/7/21/stackoverflow-update-560m-pageviews-a-month-25-servers-and-i.html

Servers, drives, and CPUs have all gotten larger and faster since then, for basically the same prices.

As long as you take OS defaults, use the auto-update mechanism built into the OS, and reboot every 6 months or so, you’re fine. Remember those ~325 Discourse $99 cloud installations from 2014-2016 I referenced, above? Guess how many got p0wned with security issues? Zero. I’ve lived this test (literally, as I was personally responsible for all those installs) and I’m comfortable with what the real world data showed me in that two year period.

Note that I’m definitely not arguing Stack Overflow should switch to these boxes in the blog post. That’d be ridiculous, of course! But it is downright amazing how far you can get with a few basic modern colocated boxes today. :heart_eyes:

#30

Fair point, machines are faster now, no doubt, but I’ll still maintain that the title of this post is kind of misleading. Look, I don’t care, and if you can run your site for less than a cloud provider charges then more power to you, but I think it’s a little unfair to compare the raw cost of a machine with the cost of a VM (or other compute form) in a public cloud. Full respect for Stack Overflow, but number of servers and amount of downtime are not necessarily equivalent (also 560M pageviews per month is really only about 200 QPS, and presumably much of it was cached at a CDN, so maybe half that?). As may be apparent from my comments, I work for a major cloud provider, and we have lots and lots of customers who routinely run 1000s of QPS with peaks in the 10s of 1000s, and while they could no doubt do that on a few hundred machines, it makes zero sense for them to do that themselves given all the limitations and risks I mentioned. Stack Overflow is a great example because it’s NOT a shoe company, but 90%+ of businesses haven’t a hope of running at scale with a DIY approach. Now that said, your average shoe company is not running 1000s of QPS either, so your point is valid, although some of us believe that if they’re not eventually software companies then they won’t be shoe companies for much longer.
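For reference, the monthly-to-per-second conversion behind that "about 200" figure can be sketched as below. It assumes a 30-day month and perfectly flat traffic, so it gives an average and says nothing about peaks; the function name and the 50% CDN share (the "maybe half that" guess above) are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_per_second(events_per_month: float, days_in_month: int = 30) -> float:
    """Average rate per second for a monthly total, assuming evenly spread traffic."""
    return events_per_month / (days_in_month * SECONDS_PER_DAY)


pageviews_per_month = 560_000_000          # figure from the linked Stack Overflow write-up
avg = average_per_second(pageviews_per_month)
after_cdn = avg * 0.5                      # assumption: half the traffic served from a CDN

print(f"average: {avg:.0f} pageviews/s, origin after CDN: {after_cdn:.0f}/s")
```

That comes out to roughly 216 pageviews per second on average, or about 108 if half is absorbed by a CDN. A pageview can fan out into several backend requests, so this is a floor on request rate and not a direct QPS measurement.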

#31

Yes. I can only wish Discourse got anywhere close to that. Even just 50% of the load with 100% more resources would be quite an achievement.

#32

We host thousands of Discourse sites on maybe ~40-50 servers at the moment. So the ratio isn’t that far off, but it is true the .NET Framework is extremely efficient in ways that Ruby + Rails is … uhhh… not? Postgres has gotten quite competitive with SQL Server in the last decade as well, so that’s good news, particularly in the “if you have to ask how much the licensing of SQL Server is gonna cost, you can’t afford it” department :wink:

Our yearly software licensing bill is basically $0, though we do give back when we can. I can assure you Stack Overflow’s yearly software licensing bill is more on the order of $100,000 … perhaps a whole lot more. It wouldn’t surprise me if it was closer to $250k.

#33

Other than the convenience of managed servers, the only argument for cloud I’ve found convincing is being able to scale up resources (CPU, RAM, storage, or bandwidth) on demand. That matters if you run an ecommerce site (Black Friday), get hit by the Slashdot effect (sudden web traffic), or are the USGS after a major earthquake.

For static load, I’ve used baremetal servers for years on online.net (now scaleway) for under $20/mo. Hetzner auction for i7 servers around $25/mo. OVH/kimsufi for personal dediservers (low end CPU or storage ARM server) for $4-10/mo. For some reason, Europe seems to have the best deals on VPS & baremetal. Granted, they aren’t RackSpace, but I’m not paying RackSpace prices either.

For personal servers, I’ve contemplated signing up for dynamic DNS on Hurricane Electric. Then, I could potentially run a server at home with port forwarding. I’d have to manage the uptime and backups myself, but I’d only pay for hardware upfront and ongoing electricity.
