5.11.07

Scalable Web Architectures: Common Patterns and Approaches

Ah, I'm having a great time in Berlin! But instead of a personal blog post, this is a "live blogging"-like post. These are my notes on the "Scalable Web Architectures: Common Patterns and Approaches" workshop Cal Henderson (Flickr) gave. As you'll see in this and future posts, this isn't me talking about the workshop but just the notes I took from it...



-- Stuff you have to care about...
  • * What is scalability
  • * Traffic growth
  • * Dataset growth
  • * Maintainability
  • * High Availability
  • * Performance

-- Scalability
  • * Vertical (get bigger)
  • * Sometimes that's alright: it's quicker, and sometimes less expensive, if the software isn't written for horizontal scalability or when rearchitecting it costs more than just buying "more RAM" or something
  • * Horizontal (get more)

-- Architecture
  • * finding the right balance for good/fast/cheap

-- App servers
  • * state and sessions? please, no sessions, or at least no local sessions, if you want a scalable app... -> store them in a central database instead of using sticky sessions
  • * going without sessions can be implemented like this:
  • - stash everything in a cookie
  • - sign the cookie
  • - put a $timestamp in it, so you can easily see when the cookie expires
  • * "super slim sessions"
  • - no account info in the session (use the cookie described above), but you can fetch database stuff and put it in a "super slim session"
  • * the Rasmus way - architect your app so that app servers don't know about each other, so they can scale horizontally without problems
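The "stash everything in a cookie, sign it, add a timestamp" idea could look roughly like the sketch below. This is only my illustration, not code from the workshop: the secret key, the function names and the JSON/HMAC-SHA256 encoding are all assumptions.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET_KEY = b"change-me"  # hypothetical secret, shared by all app servers


def make_cookie(data, ttl_seconds, now=None):
    """Pack data plus an expiry timestamp into a signed cookie value."""
    now = time.time() if now is None else now
    payload = dict(data, expires=int(now) + ttl_seconds)
    body = base64.urlsafe_b64encode(
        json.dumps(payload, sort_keys=True).encode("utf-8"))
    sig = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest()
    return body.decode("ascii") + "." + sig


def read_cookie(cookie, now=None):
    """Return the payload, or None if the cookie is forged or expired."""
    now = time.time() if now is None else now
    try:
        body, sig = cookie.rsplit(".", 1)
    except ValueError:
        return None  # no signature part at all
    expected = hmac.new(
        SECRET_KEY, body.encode("utf-8"), hashlib.sha256).hexdigest()
    # Compare as bytes, in constant time, before trusting the body.
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body))
    if payload["expires"] < now:
        return None  # the timestamp says the cookie has expired
    return payload
```

Any app server that knows the secret can check the cookie by itself, so no server has to remember anything about the user between requests.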

# presentation
# markup
| page logic
| business logic
$ database

  • * Availability - have everything doubled and, if you can, double it again in another datacenter

-- Amazon
  • * S3 - storage
  • * EC2 - compute
  • * SQS - queueing
  • * all scale horizontally; it's cheap when you're small, but not really cheap at scale

-- Load balancing

-- Parallelizable == easy

-- Databases
  • * usually the hardest part to scale (unless we're doing a lot of file serving)
  • * when starting out, scale vertically
  • * usually you need (especially on the web) more read power than write power
  • - database replication (writes go to master, reads go to master or slaves)
  • - this sucks when you need to scale writes, because you'll have to add a lot of new slaves that will just be sleeping...
  • * caching avoids needing to scale
  • * About MySQL Cluster: NDB allows a mesh, RSN will be great in an upcoming version
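
The "writes go to master, reads go to master or slaves" rule can be sketched as a tiny router. This is just my own illustration: the class name, the server names and the "starts with SELECT" test are assumptions, and here reads only fall back to the master when there are no slaves.

```python
import random


class ReplicatedDB:
    """Pick a server for each statement in a master/slave setup."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)

    def pick(self, sql):
        # Naive read detection: anything starting with SELECT is a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self.slaves:
            return random.choice(self.slaves)  # spread reads over slaves
        return self.master  # every write has to go to the master


db = ReplicatedDB("db-master", ["db-slave-1", "db-slave-2"])
```

Adding slaves adds read power only: every write still goes through the one master.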

-- Federation (at this moment we're talking about /big/ scaling)
  • * Simple things first: divide your tables into slices and put each slice on its own cluster
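
A minimal sketch of "each slice on its own cluster", assuming the slices are whole tables (that's how I read my notes). The table and cluster names are made up for the example.

```python
# Hypothetical mapping: which cluster holds which table.
TABLE_TO_CLUSTER = {
    "users": "cluster-accounts",
    "sessions": "cluster-accounts",
    "photos": "cluster-photos",
    "comments": "cluster-photos",
}


def cluster_for(table):
    """Return the cluster that stores the given table."""
    try:
        return TABLE_TO_CLUSTER[table]
    except KeyError:
        raise ValueError("no cluster assigned to table %r" % table)
```

Tables that live on different clusters can no longer be joined in a single query, so tables that get joined together should stay in the same slice.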

-- Akamai...

-- Serving Files

-- High Availability
  • * RAID 5 is cheap, RAID 10 has speeeeeed! ;-)
  • * MogileFS
  • * FlickrFS ( http://sourceforge.net/projects/flickrfs/ )
  • * Amazon S3 (cheap!)

-- Field Work

( iamcal.com/talks )