CIN: Cloud Infrastructure DSLs

23-02-2012 | Category: Programming | Tags: Cloud Computing, Metaprogramming, Ruby

Imagine you are a server admin and need to create a complete IT infrastructure consisting of an application server, a database server, and a backup server. First, you need to set up the servers, physically or virtually. Second, you need to install an operating system of your choice. Third, you select, configure, and install the needed applications.

No doubt, these steps are easily manageable for experienced admins. Yet they are time-consuming. And if you need to prepare larger infrastructures over and over again, for example for testing purposes, this becomes a tedious, repetitive task.

In this post, I present my answer to this challenge. In one of the biggest case studies during my PhD, two students and I developed several DSLs that automate the creation of complete IT infrastructures. I briefly present each of these DSLs and show you how to use them.

What Is Cloud Computing?

Cloud computing describes the concept of hosting data, files, applications, or even complete systems on the Internet. These resources are seamlessly blended into your local computer. The economic benefits are clear: you do not need to own the storage, computing power, or machines yourself, but rent them. Maintenance, updates, and backups of the resources are done for you too. Cloud computing is especially beneficial for businesses, because they can better focus on their core products and services.

When I speak of cloud computing in this post, I mean the virtualization of complete computer systems in the cloud. Computer systems are created by a hypervisor, a tool that abstracts from a physical machine and provides a virtualized environment. In this environment, operating systems and other software can be installed like on physical computers.

Steps in Infrastructure Management

Virtualization in the cloud allows you to create a complete IT infrastructure consisting of web application servers, database servers, servers that run your tests, and backup servers. To create these virtual machines, as they are called in the cloud computing domain, the following steps are performed:

  • Identification – The available computer systems and their hardware capabilities are identified.
  • Machine Configuration – The operating system is determined, a hostname assigned, and the mechanism to access the system provided (commonly with SSH).
  • Machine Installation – The operating system is installed and accordingly configured.
  • Package Configuration – Each machine fulfills a specific role in the infrastructure, and this role is realized by its software packages. This step identifies the packages, configures them, and resolves possible dependencies (Note: this step can be combined with the system configuration as well).
  • Package Installation – Finally the packages are installed on the machines.

To create a complete IT infrastructure, you have to repeat these steps for all machines. As explained in the introduction, for either very complex infrastructures or short-lived, temporary ones, doing all these steps manually is tedious and error-prone. The CIN (Cloud Infrastructure) DSLs help to automate these steps.
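To make the repetition concrete, here is a minimal Ruby sketch. It is not part of CIN: the step names, the machine names, and the provision method are made up for illustration, and nothing is actually installed. It only shows that every step has to be carried out for every machine.

```ruby
# The five steps from the list above, in order.
# The symbol names are assumptions for this sketch, not CIN identifiers.
STEPS = [
  :identification,
  :machine_configuration,
  :machine_installation,
  :package_configuration,
  :package_installation
].freeze

# Returns one log entry per machine and step, in execution order.
def provision(machines)
  machines.flat_map do |machine|
    STEPS.map { |step| "#{machine}: #{step}" }
  end
end

log = provision(["app-server", "db-server"])
puts log
```

With two machines this already yields ten manual actions, which is the kind of work the DSLs are meant to take over.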

An Example Infrastructure

To illustrate how the different DSLs work together, we want to create a simple infrastructure consisting of two machines: one application server hosting Apache and Redmine (a project management application written with Ruby on Rails), and a database server with MySQL.

Identification & Machine Configuration

Defining a machine involves determining its hostname, its operating system, and its hardware resources. You also need to decide which hypervisor you want to use. There are several providers to choose from: Amazon offers EC2, where machines are created in Amazon’s data centers and users configure machines via a web interface or remote APIs; VMware offers ESXi, software to virtualize machines in a local data center; and there are others.

Currently, CIN only supports the EC2 Hypervisor.

For the first application server that hosts Redmine behind an Apache server, we only need a small machine. As the operating system, we choose Debian. Here is the machine management DSL to express these decisions:

 1 app_server_1 = Machine "AppServer1" do
 2   owner ""
 3   os :debian
 4   hypervisor :EC2 do  
 5     ami "ami-dcf615b5"
 6     source "alestic/debian-5.0-lenny-base-2009..."
 7     size :m1_small
 8     securitygroup "default"
 9     private_key "ec2-us-east"
10   end
11   hostname ""
12 end 

I bet you can understand most of this code if you are familiar with cloud computing. Nevertheless, let’s go through these lines:

  • Line 1 – A new Machine object is created by providing it with a domain name and a block of configuration options.
  • Lines 2-3 – The most basic configuration options are the definition of the machine’s owner and the operating system.
  • Line 4 – The hypervisor block begins with identifying which hypervisor to use, and it receives a block with the hypervisor-specific options.
  • Lines 5-6 – The ami is the ID of the Amazon Machine Image (AMI), a unique identifier, which we complement with the source, an XML file, of the machine image.
  • Line 7 – Amazon offers several sizes for the machine, which govern the speed of the CPU, the number of cores, the RAM, and the hard disk size.
  • Line 8 – Security groups define a firewall-like policy that determines which ports of the machine are open to the outside world. You definitely want to use SSH to connect to your machine and execute commands.
  • Line 9 – This line identifies the private key file that you can use for SSH connections.
  • Line 11 – The hostname with which the machine can be accessed.
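If you wonder how such a block-based syntax can work in Ruby, the following is a minimal sketch using instance_eval. It is not CIN's actual implementation: the classes MachineSpec and HypervisorSpec and their accessors are assumptions made for this example, and the sketch only collects the options without contacting any cloud provider.

```ruby
# Hypothetical sketch of a block-based DSL; not CIN's real code.
class HypervisorSpec
  # Option names taken from the listing above.
  OPTIONS = [:ami, :source, :size, :securitygroup, :private_key].freeze

  attr_reader :kind, :options

  def initialize(kind)
    @kind = kind
    @options = {}
  end

  # Generate one setter-style method per option, e.g. `ami "..."`.
  OPTIONS.each do |opt|
    define_method(opt) { |value| @options[opt] = value }
  end
end

class MachineSpec
  attr_reader :name, :settings, :hypervisor_spec

  def initialize(name)
    @name = name
    @settings = {}
  end

  def owner(value)
    @settings[:owner] = value
  end

  def os(value)
    @settings[:os] = value
  end

  def hostname(value)
    @settings[:hostname] = value
  end

  # The nested block is evaluated in the context of the hypervisor spec.
  def hypervisor(kind, &block)
    @hypervisor_spec = HypervisorSpec.new(kind)
    @hypervisor_spec.instance_eval(&block) if block
  end
end

# A capitalized method name, so `Machine "name" do ... end` reads like a declaration.
def Machine(name, &block)
  spec = MachineSpec.new(name)
  spec.instance_eval(&block) if block
  spec
end

app_server = Machine "AppServer1" do
  os :debian
  hypervisor :EC2 do
    ami "ami-dcf615b5"
    size :m1_small
  end
end

puts app_server.settings[:os]
puts app_server.hypervisor_spec.options[:size]
```

The design choice here is that each block runs with the spec object as self, which is why the option names can be written without a receiver.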

Machine Installation

Once the machines are configured, the following command creates this machine:


Other commands that can be used on machines are destroy! and reboot, which do what their names suggest.

Package Configuration

In our example infrastructure, we need to install Apache and Redmine on the application server. Package configuration is supported by the package management DSL. It defines software packages and configuration options as an overlay to the package management of the concrete operating system. This DSL is important for the interoperability of the CIN DSLs.

Let’s consider how the apache2 package is defined with the following expressions:

package :apache2 do
  platforms :redhat, :debian, :ubuntu
  features "mod_ssh, mod_php, mod_rails"
  license "Apache Licence 2.0"
  tags :webserver, :apache, :opensource
  description "Apache is a Webserver"
  versions "2.2.5<=X<=2.2.15"
  installation_script :apache2
end

As we see, most of these options describe specific properties of the software package and identify alternative installations:

  • Line 1 – A package is defined with the package method, which receives a name and a block of configuration options.
  • Line 2 – The operating systems for which this package configuration can be used.
  • Line 3 – features describe additional functionality that this package can provide. For Apache, several modules can be added, for example to support SSH communication, to enable PHP programs to run on the server, to host Ruby on Rails web applications, and more.
  • Line 4 – License information, used to distinguish open source from paid software.
  • Line 5 – Tags that can be used to find this package in the package repository.
  • Line 6 – A simple description of this package.
  • Line 7 – versions are needed to distinguish differences in a package’s functionality, especially its compatibility with other packages.
  • Line 8 – The installation_script identifies the installation script with which this software package is installed on the target operating system.

Note that the package management DSL only needs to be used when a software package needs to be installed for which there is no entry in the package repository. Most likely the user can access and use one of the existing package declarations.
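The versions option deserves a closer look, because a range like "2.2.5<=X<=2.2.15" has to be compared segment by segment rather than as plain text. The following sketch shows one way to check a version against such a range. The helper version_in_range? is hypothetical and not part of CIN. It relies on Gem::Version from RubyGems, which ships with Ruby.

```ruby
require "rubygems"

# Hypothetical helper: checks a version against a range written in the
# "LOW<=X<=HIGH" notation used in the package declaration above.
def version_in_range?(version, range)
  low, high = range.split("<=X<=")
  raise ArgumentError, "unsupported range: #{range}" if low.nil? || high.nil?

  candidate = Gem::Version.new(version)
  Gem::Version.new(low) <= candidate && candidate <= Gem::Version.new(high)
end

range = "2.2.5<=X<=2.2.15"
puts version_in_range?("2.2.9", range)   # inside the range
puts version_in_range?("2.2.16", range)  # above the upper bound
```

Comparing the raw strings would go wrong here, because "2.2.9" sorts after "2.2.15" alphabetically. Gem::Version compares the numeric segments instead.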

Package Installation

Finally, we can install the software packages with the package deployment DSL. This DSL reflects the configuration options of the package declarations, especially dependency resolution, and identifies on which server the packages get installed. The following expression is all that is needed:

deploy :Redmine, :on => ["Application Server"] do
  enroll "Redmine/Database", :on => ["Database Server"] do
    prefer :package => "MySQL"
  end
  enroll "Redmine/Rails Server", :on => ["Application Server"] do
    force :package => "Apache2"
  end
end

Again, let’s walk through this DSL line by line:

  • Line 1 – We want to deploy the :Redmine package on the server with the hostname "Application Server".
  • Lines 2-4 – The first dependency, the "Database", is enrolled on the "Database Server", where we prefer the "MySQL" package (if "MySQL" is not available, we need to choose another package to fulfill this dependency).
  • Lines 5-7 – The second dependency is the "Rails Server", which we also enroll on the "Application Server" and for which we force the "Apache2" package.

Executing this DSL will install the packages Redmine and Apache on the application server, and MySQL on the database server.
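To illustrate the difference between prefer and force, here is a small sketch of how such a resolution could work. It is not CIN's resolver. The methods resolve and installation_plan, the hash layout, and the "PostgreSQL" and "Nginx" alternatives are assumptions for this example. In particular, I assume here that force fails when the requested package is unavailable, while prefer falls back to another candidate.

```ruby
# Hypothetical resolver for a single dependency.
# :prefer falls back to another candidate, :force insists on the wanted package.
def resolve(candidates, wanted, mode)
  return wanted if candidates.include?(wanted)
  raise ArgumentError, "#{wanted} is required but not available" if mode == :force

  fallback = candidates.first
  raise ArgumentError, "no package fulfills this dependency" if fallback.nil?
  fallback
end

# Builds a plan that maps each server to the packages installed on it.
def installation_plan(deployments)
  deployments.each_with_object(Hash.new { |h, k| h[k] = [] }) do |d, plan|
    plan[d[:on]] << resolve(d[:candidates], d[:wanted], d[:mode])
  end
end

deployments = [
  { on: "Application Server", wanted: "Redmine", candidates: ["Redmine"],             mode: :force },
  { on: "Database Server",    wanted: "MySQL",   candidates: ["MySQL", "PostgreSQL"], mode: :prefer },
  { on: "Application Server", wanted: "Apache2", candidates: ["Apache2", "Nginx"],    mode: :force }
]

plan = installation_plan(deployments)
plan.each { |server, packages| puts "#{server}: #{packages.join(', ')}" }

# With MySQL unavailable, :prefer picks the remaining candidate instead.
fallback = resolve(["PostgreSQL"], "MySQL", :prefer)
puts fallback
```

Under these assumptions the plan matches the result described above: Redmine and Apache2 end up on the application server, and MySQL on the database server.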

Our infrastructure is complete.


CIN automates all steps of IT infrastructure management. It provides readable and concise DSLs that help to identify, configure, and install complete machines and their software packages. Three DSLs are used. The machine management DSL describes physical or virtual machines and uses a hypervisor API to communicate with cloud providers to create the machines. The package management DSL provides an operating-system-independent overlay to identify software packages with their configuration options, installation features, dependencies, and installation scripts. Finally, the package deployment DSL expresses which software packages, and their dependencies, are installed on which machines. CIN enables easy replication of the IT infrastructure and also serves as its documentation. Overall, it reduces the effort of infrastructure creation to writing a few small, concise DSL expressions.
