Cloud Connect live: Todd Papaioannou of Yahoo! keynote

Yahoo runs a “private cloud” of 400,000 servers. “Real-time” elasticity is the goal. What is elasticity? The ability to dynamically provision computing resources to meet a business need.

The pants metaphor: the traditional approach to dealing with expanding services is to change the pants or loosen the belt buckle. Maternity pants, by contrast, expand to accommodate the growth on demand.

Yahoo! Cloud supports 600M+ users, 200PB of data, 100Bn events per day.

Spin-up time for traffic spikes is a major problem; load shedding is the only current option: turn off things that aren’t critical to accommodate the load.

What kind of pants are you wearing?

Cloud Connect live: Kevin McEntee of Netflix keynote

Netflix’s August 2008 outage was a major black eye. Their architecture was Big Java and Big Oracle, with no real high availability.

Why Cloud? Even in 2008 startups were growing in the cloud, and Netflix wanted to benefit from the continuous improvement of AWS. What they found was that it conferred tremendous agility for developers and the business, because of the reduction in complexity.

Accidental complexity is generational. Data centers are accidental complexity. Data center planning is driven by capacity forecasting, which in turn is driven by business forecasting, and that can be impossible to do accurately.

Cloud was an opportunity to eliminate process and control.

Netflix culture is of freedom and responsibility. There is no single point of control over cloud spending.

What did Netflix get from using cloud computing? High availability, the elimination of complexity, process, and control, and greater freedom and responsibility.

Cloud Connect live – Randy Bias of Cloudscaling

Myth: we need enterprise clouds because Amazon’s cloud platform doesn’t suit enterprise. Enterprise needs something different.

Who is actually adopting Amazon Web Services? Enterprises. Most of the adoption today is being driven by greenfield applications and NOT legacy applications.

Public clouds are fighting over greenfield applications. Enterprise clouds are fighting over legacy applications.

It is an error to treat cloud as outsourcing; cloud is multitenant. The enterprise cloud has no clothes: where is the enterprise cloud business model equivalent of Amazon Web Services?

Amazon is winning. Their growth and momentum are staggering. S3 is on track for 150 billion stored objects.

Rackspace’s hosting business grows 3-5% per quarter, compared with commodity cloud growth of 20-25% per quarter.

Enterprise clouds have a disproportionate spend, at 5-10x commodity cloud pricing. Initial capital expense for enterprise cloud is 6-8x the cost of commodity cloud.

Go commodity, serve greenfield applications, embrace the change.

Cloud Connect live – Lew Tucker keynote

More than ever, the network is the computer. And the network is growing fast.

There is an architectural battle over how applications are built. Horizontal scalability is key, along with eventual consistency. Tightly coupled architectures cause a lot of problems: inflexible and complex, they don’t scale.

Layering decouples parts of the system. Provisioning of applications is decoupled from the provisioning of the infrastructure. This leads to a revolution in how applications are built.

The cloud becomes “turtles all the way down”.

Networking is a platform…needs to be managed as a system. It is the computer, and this is just the beginning.

Cloud Connect live: Werner keynote

Werner Vogels kicks off by talking about Alfred Korzybski’s famous statement “the map is not the territory”, and Richard Feynman’s observation that even when the “laws” of physics change, it doesn’t mean that nature has actually changed; only our models have.

The models that we have of what is cloud and how cloud works are flawed because they don’t adequately represent the complex ecosystem of services that comprise cloud computing.

Startups are quickly reaching enterprise scale using cloud computing. Enterprises are learning from consumer businesses and are approaching problems with a startup mentality.

The Cloud is an ecosystem. It can’t be defined by quadrants and grids.

Werner’s talk reminds me of a tweet that caught his attention a few years ago: RT @Werner RT @ianrae “#AWS is the coral reef of internet computing. There’s a rich and intertwined ecosystem growing on a simple substructure” (11:50 AM, May 19th, 2009, via web).

Werner shows a slide of all the features released in 2009, and it’s clear that the substructure itself has evolved significantly in complexity.

Werner finishes with “It’s still day one in cloud.”

Cloud Connect live: Data Storage and the Cloud

James Duncan and Jason Hoffman of Joyent are reviewing the state of storage within the cloud. Jason Hoffman gives an overview of where we are, how we got here, and where things might be going. He explains the evolution of cloud computing as follows:

1995 – Internet – Cloud networking – network I/O turned into a utility

2005 – Intercomputer – Cloud computing – CPU, memory (RAM), and disk I/O

2015 – Interdata – Cloud data management – governance, policy, etc.

Our ability to generate data and our appetite for storage are unlimited. Figuring out what data is unique and what is redundant is critical. Figuring out what data is important is critical.

The key issues in scaling storage, which are driving many new storage solutions such as cloud storage and NoSQL platforms, are administrative, geographic, load, and capacity.

James Duncan delves into some of the platforms being used for cloud storage:

A popular solution for improving storage access is Memcached, which protects expensive backend I/O by caching frequently accessed data in memory. Many folks are migrating to Redis, which has richer functionality and increased durability.
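
To make the cache-aside pattern concrete, here is a minimal sketch in Python using the redis-py client. The backend lookup, key naming, and TTL are hypothetical placeholders for whatever expensive query you are protecting.

```python
import json
import redis  # pip install redis

# Connection details are illustrative; point this at your own Redis instance.
cache = redis.Redis(host="localhost", port=6379, db=0)

CACHE_TTL_SECONDS = 300  # expire entries after five minutes


def fetch_user_from_db(user_id):
    """Hypothetical expensive backend call we want to protect."""
    return {"id": user_id, "name": "example"}


def get_user(user_id):
    key = f"user:{user_id}"

    # 1. Try the cache first to avoid hitting the backend.
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)

    # 2. On a miss, read from the backend and populate the cache with a TTL.
    user = fetch_user_from_db(user_id)
    cache.setex(key, CACHE_TTL_SECONDS, json.dumps(user))
    return user
```

The same read-through logic works with Memcached; what Redis adds is richer data types and optional persistence, which is the durability point above.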

Eventually consistent document stores include Mongo and Riak; Riak is clustered and configurable, but has no indexes. Others are Project Voldemort, Cassandra, and Hadoop. They have a lot of similarities but differ in the details; for example, Riak is excellent at reads, whereas Voldemort excels at write performance.

Blobstores are scalable object stores like S3; examples include MogileFS and OpenStack’s object store, which is based on Rackspace’s Cloud Files.
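
For context, working against an S3-style blobstore amounts to keyed puts and gets of opaque objects. Below is a minimal sketch using boto3; the bucket name and keys are made up, and the same calls can be pointed at S3-compatible stores by overriding the client’s endpoint URL.

```python
import boto3  # pip install boto3

# Credentials and region come from the usual AWS config; endpoint_url can
# point at an S3-compatible store instead of S3 itself.
s3 = boto3.client("s3")

BUCKET = "example-blobstore-bucket"  # hypothetical bucket name

# Store an object under a key.
s3.put_object(Bucket=BUCKET, Key="logs/2011-03-08.txt", Body=b"event data")

# Retrieve it later by the same key.
response = s3.get_object(Bucket=BUCKET, Key="logs/2011-03-08.txt")
data = response["Body"].read()
print(len(data), "bytes retrieved")
```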

Ceph is an interesting project in the mainline Linux kernel that is more production-worthy than its “alpha” status indicates, and it is possibly the closest open source contender for building an S3-like object store in house.

Jason: Test reliability and durability of data by unplugging systems and seeing what happens. For example, Mongo is known to lose data under these circumstances.

Cloud Connect live: Migrating your existing applications to the AWS Cloud

Jinesh Varia (@jinman) is walking through how to move applications to AWS, after a review of the AWS infrastructure and services. What is immediately striking is how similar Amazon’s approach is to the CloudOps cloud migration services.

AWS Migration Phase 1: Cloud Assessment

This includes financial assessment, operating expense budgeting, security and compliance assessment, technical assessment (choosing the right candidate, migrating licensed products, identifying tools that can be reused, functional assessment).

Financial assessment: there is an AWS Simple Monthly Calculator and a whitepaper on the Economics of the AWS cloud which are available at http://aws.amazon.com/economics

Security and compliance assessment: AWS has many security certifications, including SAS 70. It’s important to determine risk tolerance, regulatory compliance requirements, and intellectual property concerns. Jinesh points out that you own the data, not AWS; AWS simply provides the infrastructure. You also choose where to store the data and how to handle encryption at rest and in transit. More details about the security considerations are here: http://aws.amazon.com/security/

Migration suitability assessment: Create an application dependency and classification chart to identify candidates for migration. Sort application components by their security, performance, and scalability requirements, and by their degree of coupling with other components and applications. Good candidates are loosely coupled. Marketing, content management, backup, log processing, development and staging, and customer- and partner-facing systems tend to be easily justified use cases.

Know your objectives and your success criteria! Create a plan including a roadmap.

Hands-on AWS lab

Jinesh is now running an AWS lab for attendees of the workshop. He created a Windows Server AMI (Amazon Machine Image), and we are walking through the steps of creating an AWS account, obtaining access keys, provisioning and launching an EC2 instance based on the AMI, using EBS for storage and resizing it on the fly, and finally snapshotting the running volumes to S3.
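
The lab itself used the AWS console, but the same instance / volume / snapshot workflow can be scripted. Here is a rough sketch with boto3 (a later SDK than what was shown); the AMI ID, region, availability zone, and device name are placeholders.

```python
import boto3  # pip install boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# 1. Launch an instance from an existing AMI (the ID is a placeholder).
run = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",
    InstanceType="t2.micro",
    MinCount=1,
    MaxCount=1,
)
instance_id = run["Instances"][0]["InstanceId"]
ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

# 2. Create an EBS volume and attach it to the instance for extra storage.
#    NB: the volume's availability zone must match the instance's; it is
#    hard-coded here for brevity.
volume = ec2.create_volume(AvailabilityZone="us-east-1a", Size=20)  # GiB
volume_id = volume["VolumeId"]
ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device="/dev/sdf")

# 3. Snapshot the volume; EBS snapshots are stored durably in S3 behind the scenes.
snapshot = ec2.create_snapshot(VolumeId=volume_id, Description="lab snapshot")
print("Snapshot started:", snapshot["SnapshotId"])
```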

Elastic Beanstalk demonstration

Jinesh is doing a demonstration of Elastic Beanstalk: uploading a .war file and showing how the file goes into an S3 bucket while the system automatically creates an ELB (Elastic Load Balancer), an auto-scaling group of EC2 Micro instances running Apache and Tomcat, and EBS volumes with snapshots that store the data back on S3. It all happens automagically, behind the scenes. PHP and RoR support are apparently coming soon!
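
What Beanstalk wires up can also be driven through the API. The following is a hedged sketch of the same .war deployment using boto3; the application name, bucket, key, and solution stack string are all illustrative rather than taken from the demo.

```python
import boto3  # pip install boto3

eb = boto3.client("elasticbeanstalk", region_name="us-east-1")

APP = "petstore"                  # hypothetical application name
BUCKET = "example-deploy-bucket"  # bucket already holding the uploaded .war
KEY = "petstore-v1.war"

# Register the .war sitting in S3 as an application version.
eb.create_application(ApplicationName=APP)
eb.create_application_version(
    ApplicationName=APP,
    VersionLabel="v1",
    SourceBundle={"S3Bucket": BUCKET, "S3Key": KEY},
)

# Launching an environment is what creates the ELB, the auto-scaling group of
# instances running Tomcat, and the rest of the stack behind the scenes.
# The stack name below is illustrative; list valid ones with
# eb.list_available_solution_stacks().
eb.create_environment(
    ApplicationName=APP,
    EnvironmentName="petstore-env",
    VersionLabel="v1",
    SolutionStackName="64bit Amazon Linux 2 v4.5.0 running Tomcat 8.5 Corretto 11",
)
```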

Cloud Connect – Cloudy Operations Session Live Blog

The Cloud Connect Conference in Santa Clara is getting underway. This will be my attempt to share interesting tidbits from today’s sessions.

Cloudy Operations Session – 9:00am

Very excited for this day-long session, MC’d by John Willis from Opscode. Should be great to get deep into devops tech and procedures.

* Some big names in the audience today, including Adobe, PayPal, and the State of Nevada. Apparently this “devops” thing has real legs 😉

* Once you get over all the “Cloud” hype, devops is still just about managing systems in a datacenter.

* John Willis quote: “When you grow up you want to build a Facebook app”

* According to the Google Chief Economist, we are in a period of “Combinatorial Innovation”.

* Devops is about trying to build “bulletproof” infrastructure. There’s still a lot of cultural baggage in how we view operations. This requires a cultural shift to support future innovations.

* Operations can be a real competitive advantage and not just some commoditized function. As Tim O’Reilly puts it, “Operations is the elephant in the room.”

* Big players like Amazon, Microsoft, and Google understand the realities of operating large-scale infrastructure and have built their own tools accordingly.

* Who cares whether you’re working with IaaS, PaaS, SaaS, or XaaS? They’re all services, whether they live in the public cloud or a private datacenter, and you want to treat them similarly.

* Currently we compete on Scale and Velocity of Innovation. How do you get an idea from the whiteboard to a bulletproof production system in the shortest amount of time? You can’t do it without the proper operational culture.

* Devops is based on three things: 1) cooperation between developers and operations, 2) a renaissance of tools and API-based infrastructure, and 3) a global community of practice.

* The walls between developers and operations must be broken down. You need developers who think like operations folks and ops technicians that think like developers.

* Agile: Where you get stuff done by doing stuff. Instead of having meetings about how you’re going to do stuff.

* Who has had a two-week devops agile sprint decimated by the pager? Hands up all over the room. Great line from Andrew Shafer.

* The security guys who say you can’t automate security probably don’t understand what’s going on. Systems like Chef and Puppet allow you to know and dictate exactly what’s on one system or on thousands of systems. Try manually verifying 300 new systems before deployment in a timely manner.

* Test-driven systems are a requirement. When you create things like Chef or Puppet recipes to deploy services, those same recipes MUST include adding the service to monitoring, etc. (see the sketch after this list).

* “Cowboy” based operations is fun, but if you rely on cowboys you’ll be stepping in horse shit for the rest of your life.

* Being able to experiment in operations is a critical asset. There’s no teacher better than pain.

* In the devops world, you need to evaluate your team not only on technical skills and abilities but also on team culture and team cohesion.

* Boundary objects are key. You need data to drive devops, and that means providing real data in real time on system performance as well as business-based KPIs.

* Fault tolerance shouldn’t just apply to your infrastructure. It must apply to your staff as well. Knowledge can’t be isolated in silos which are “vulnerable”.

* Interesting discussion of Kanban as opposed to Scrum based Agile devops.
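
To illustrate the test-driven point raised earlier in this list (the “recipes MUST include adding the service to monitoring” item), here is a tiny Python sketch, not Chef or Puppet code, of a deploy routine that registers the service with monitoring in the same step, plus a test that asserts it happened. Every name in it is hypothetical.

```python
# Minimal illustration of "the recipe that deploys a service also wires it
# into monitoring" - not real Chef/Puppet code, just the shape of the idea.

def deploy_service(name, monitoring, deployed):
    """Deploy a service and, in the same step, register it for monitoring."""
    deployed.add(name)                      # stand-in for the actual deploy work
    monitoring.add(f"{name}:health_check")  # monitoring is not a separate, later task


def test_deploy_registers_monitoring():
    deployed, monitoring = set(), set()
    deploy_service("web-frontend", monitoring, deployed)
    assert "web-frontend" in deployed
    assert "web-frontend:health_check" in monitoring


if __name__ == "__main__":
    test_deploy_registers_monitoring()
    print("deploy + monitoring check passed")
```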

Configuration Management

Industry has changed:

1990: Systems are inventoried and packaged, and files are transferred

2005: Unattended bare-metal provisioning is “very very” hard; 7K nodes took 5 days with 90% success

2007: Unattended bare metal in under 10 minutes, fully configured in 3 minutes

2008: Unattended server in 2 minutes, 5K servers in a week

2010: 10K nodes in under 5 minutes

* For operations, cloud means that demand is dynamic and developers are crucial to operations.

Config management has four components:

  1. Provisioning (turn on the system)
  2. Configuration (get the system to a certain state)
  3. Systems integration (get the service to an operational state; let systems know about each other)
  4. Orchestration

* Devops has forced operations to get better at what they do.

* Config management treats infrastructure as code, using development tools, mindsets, and techniques to manage operations.

* Infrastructure as code means the infrastructure is self-documenting. You get version control, process control, and application control.

* In a cloudy world your prime constraint for recovery should be the time it takes to restore your application data.

* CM apps like Chef allow you to treat your infrastructure like a programming language or subroutines therein.
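
As a closing illustration of treating infrastructure as code, here is a small Python sketch of a declarative, idempotent resource of the kind a Chef or Puppet recipe describes: you declare the desired state, and applying it a second time changes nothing. It is a conceptual sketch only, not any tool’s real API, and the path and contents are illustrative.

```python
import os


class FileResource:
    """Declarative 'file' resource: desired path and contents, applied idempotently."""

    def __init__(self, path, content):
        self.path = path
        self.content = content

    def apply(self):
        # Only act if the current state differs from the desired state.
        if os.path.exists(self.path):
            with open(self.path) as f:
                if f.read() == self.content:
                    return "unchanged"
        with open(self.path, "w") as f:
            f.write(self.content)
        return "updated"


if __name__ == "__main__":
    motd = FileResource("/tmp/motd", "managed by configuration management\n")
    print(motd.apply())  # "updated" on first run
    print(motd.apply())  # "unchanged" on every run after that
```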