Digital Elf

The Future is Now

IPv6 the SmartOS Way

There have been a lot of requests for IPv6 support in SmartOS. I’m happy to say that there is now full support for IPv6 in SmartOS, though it’s not enabled by default and there may be some things you don’t expect. This essay is specific to running stand-alone SmartOS systems on bare metal. This doesn’t apply to running instances in the Joyent Cloud or for private cloud SDC.

Update: I now have a project up on Github that fully automates enabling SLAAC IPv6 on SmartOS. It works for global and non-global zones and automatically identifies all interfaces available, regardless of the driver name.

First, some definitions so we’re all speaking the same language.

  • Compute Node (CN): A non-virtualized physical host.
  • Global Zone (GZ): The Operating System instance in control of all real hardware resources.
  • OS Zone: A SmartMachine zone using OS virtualization. This is the same thing as a Solaris zone.
  • KVM Zone: A zone running a KVM virtual machine using hardware emulation.
  • Compute Instance (CI): A SmartMachine zone or KVM virtual machine.
  • Smart Data Center (SDC): Joyent’s Smart Data Center private cloud product. SDC backends the Joyent Cloud.

There are two modes of networking with SmartOS. The default is for the global zone to control the address and routes: a static IP is assigned in the zone definition when it’s created, along with a netmask and default gateway, and network access is restricted to the assigned IP to prevent tenants from causing shenanigans on your network. The other is to set the IP to DHCP, enable allow_ip_spoofing and be done with it. The former mode is preferred for public cloud providers (such as Joyent); the latter may be preferred for private cloud providers (i.e., enterprises) or small deployments where all tenants are trusted. For example, at home where I have only a single CN and I’m the only operator, I just use DHCP and allow_ip_spoofing.

By far the easiest way to permit IPv6 in a SmartOS zone is to have router-advertisements on your network and enable allow_ip_spoofing. As long as the CI has IPv6 enabled (see below for enabling IPv6 within the zone) you’re done. But some don’t want to abandon the protection that anti-spoofing provides.

Whether you use static assignment or DHCP in SmartOS, the CI (and probably you too) doesn’t care what the IP is. In fact, KVM zones with static IP configuration are configured for DHCP with the Global Zone acting as the DHCP server. If you have another DHCP server on your network it will never see the requests and they will not conflict. In SDC, entire networks are allocated to SDC. By default SDC itself will assign IPs to CIs. In the vast majority of cases it doesn’t matter which IP a host has, just as long as it has one.

Which brings us to IPv6. It’s true that in SmartOS when a NIC is defined for a CI you can’t define an IPv6 address in the ip field (in my testing this is because netmask is a required parameter for static address assignment, but there’s no valid way to express an IPv6 netmask that is acceptable to vmadm). But like it or not, IPv4 is still a required part of our world. A host without some type of IPv4 network access will be extremely limited. There’s also no ip6 field.

But there doesn’t need to be. Remembering that in almost all cases we don’t care which IP so long as there is one, IPv6 can be enabled without allowing IP spoofing by adding IPv6 addresses to the allowed_ips property of the NIC. The most common method of IPv6 assignment is SLAAC. If you’re using SLAAC then you neither want, nor need, SmartOS handing out IPv6 addresses. The global and link-local addresses can be derived from the mac property of the CI’s NIC. Add these to the allowed_ips property of the NIC definition and the zone definition is fully configured for IPv6 (you don’t need an IPv6 gateway definition because it will be picked up automatically by router-advertisements).
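For example, here’s a rough sketch of the EUI-64 derivation in shell (the 2001:db8::/64 prefix is a placeholder; substitute the prefix advertised on your network):

mac="72:9c:d5:34:47:59"
IFS=: read -r a b c d e f <<< "$mac"
a=$(printf '%02x' $(( 0x$a ^ 0x02 )))   # flip the universal/local bit
iid="${a}${b}:${c}ff:fe${d}:${e}${f}"   # insert ff:fe in the middle
echo "fe80::${iid}"                     # link-local address
echo "2001:db8::${iid}"                 # global address (placeholder prefix)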

Permitting IPv6 in a Zone

Here’s an example nic from a zone I have with IPv6 addresses allowed. Note that both the derived link-local and global addresses are permitted.

[root@wasp ~]# vmadm get 94ff50ad-ac74-46ac-8b9d-c05ddf55f434 | json -a nics
[
  {
    "interface": "net0",
    "mac": "72:9c:d5:34:47:59",
    "nic_tag": "external",
    "gateway": "198.51.100.1",
    "allowed_ips": [
      "fe80::709c:d5ff:fe34:4759",
      "2001:db8::709c:d5ff:fe34:4759"
    ],
    "ip": "198.51.100.37",
    "netmask": "255.255.0.0",
    "model": "virtio",
    "primary": true
  }
]

In my workflow, I create zones with autoboot set to false, then add IPv6 addresses based on the mac assigned by vmadm, then enable autoboot and boot the zone. This is scripted, of course, so it’s a single atomic action.
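Roughly, that script boils down to something like this (the UUID and addresses are the ones from the example above; the address derivation is the sketch shown earlier):

uuid=94ff50ad-ac74-46ac-8b9d-c05ddf55f434   # from vmadm create
mac=$(vmadm get $uuid | json nics.0.mac)
# ...derive the link-local and global addresses from $mac as shown above...
vmadm update $uuid <<EOF
{ "update_nics": [ { "mac": "$mac",
    "allowed_ips": [ "fe80::709c:d5ff:fe34:4759",
                     "2001:db8::709c:d5ff:fe34:4759" ] } ] }
EOF
vmadm update $uuid autoboot=true
vmadm boot $uuid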

Enabling IPv6 in a SmartMachine Instance

Once the zone definition has the IPv6 address(es) allowed it needs to be enabled in the zone. For KVM images, most vended by Joyent will already have IPv6 enabled (even Ubuntu Certified images in Joyent Cloud will boot with link-local IPv6 addresses, though they will be mostly useless). For SmartOS instances you will need to enable it.

In order to enable IPv6 in a SmartOS zone you need to enable ndp and use ipadm create-addr.

svcadm enable ndp
ipadm create-addr -t -T addrconf net0/v6

Instead of doing this manually I’ve taken the extra step and created an SMF manifest for IPv6.

I have a user-script that downloads this from github, saves it to /opt/custom/smf/ipv6.xml and restarts manifest-import. After the import is finished, IPv6 can be enabled with svcadm. Using the -r flag enables all dependencies (i.e., ndp) as well.

svcadm enable -r site/ipv6

Enabling the service is also done as part of the user-script.
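The relevant part of the user-script is roughly this (the manifest URL is a placeholder; use the raw Github URL of ipv6.xml):

mkdir -p /opt/custom/smf
curl -sL -o /opt/custom/smf/ipv6.xml https://example.com/ipv6.xml   # placeholder URL
svcadm restart manifest-import
svcadm enable -r site/ipv6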

If you do actually want specific static IPv6 assignment, do everything I’ve described above. Then, in addition to that, use mdata-get sdc:nics to pull the NIC definition, extract the IPv6 addresses from allowed_ips and explicitly assign them. I admit that for those who want explicit static addresses this is less than ideal, but with a little effort it can be scripted and made completely automatic.
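Inside the zone, something along these lines would do it (a rough sketch; the interface name and the /64 prefix length are assumptions you may need to adjust):

addr=$(mdata-get sdc:nics | json 0.allowed_ips | json -a | grep -v '^fe80:')
ipadm create-addr -t -T static -a "$addr/64" net0/v6static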

A Primer on CFEngine 3.6 Autorun

Update: For CFEngine 3.6.2.

CFEngine recently released version 3.6, which makes deploying and using cfengine easier than ever before. The greatest improvement in 3.6, in my opinion, is the autorun feature.

I’m going to demonstrate how to get a policy server set up with autorun properly configured.

Installing CFEngine 3.6.2

The first step is to install the cfengine package, which I’m not going to cover. But I will say that I recommend using an existing repository. Instructions on how to set this up are here. Or you can get binary packages here. If you’re not using Linux (like me) you can get binary packages from cfengineers.net. Or for SmartOS try my repository here (IPv6 only). If you’re inclined to build from source I expect that you don’t need my help with that.

Having installed the cfengine package, the first thing to do is to generate keys. The keys may have already been generated for you, but running the command again won’t harm anything.

/var/cfengine/bin/cf-key

Setting up Masterfiles and Enabling Autorun

Next you’ll need a copy of masterfiles. If you downloaded a binary community package from cfengine.com you’ll find a copy in /var/cfengine/share/CoreBase/masterfiles.

As of 3.6 the policy files have been decoupled from the core source code distribution, so if you’re getting cfengine from somewhere else it may not come with CoreBase. In this case you’ll want to get a copy of the masterfiles repository at the tip of the branch for your version of CFEngine (in this case, 3.6.2), not from the master branch where the main development happens. There’s already development going on for 3.7 in master, so for consistency and repeatability grab an archive of 3.6.2. Going this route you also need a copy of the cfengine core source code (although you do not need to build it).

curl -LC - -o masterfiles-3.6.2.tar.gz https://github.com/cfengine/masterfiles/archive/3.6.2.tar.gz
curl -LC - -o core-3.6.2.tar.gz https://github.com/cfengine/core/archive/3.6.2.tar.gz
tar zxf masterfiles-3.6.2.tar.gz
tar zxf core-3.6.2.tar.gz

You’ll now have the main masterfiles distribution unpacked. This isn’t something that you can just copy into place; you need to run make to install it.

cd masterfiles-3.6.2
./autogen.sh --with-core=../core-3.6.2
make install INSTALL=/opt/local/bin/install datadir=/var/cfengine/masterfiles

Note: Here I’ve included the path to install. This is required for SmartOS. For other systems you can probably just run make install.

At this point it’s time to bootstrap the server to itself.

/var/cfengine/bin/cf-agent -B <host_ip_address>

You should get a message here saying that the host has been successfully bootstrapped and a report stating ‘I’m a policy hub.’

To enable autorun, simply make the following change in def.cf.

-      "services_autorun" expression => "!any";
+      "services_autorun" expression => "any";

Note: There’s a bug in masterfiles-3.6.0, so make sure to use at least 3.6.2.

Using Autorun

With the default configuration, autorun will search services/autorun/ for any bundles tagged autorun and execute them. At this point you can see autorun working for yourself.

/var/cfengine/bin/cf-agent -K -f update.cf
/var/cfengine/bin/cf-agent -Kv

Here I’ve enabled verbose mode. You can see in the verbose output that autorun is working.

Now, like Han Solo, I’ve made a couple of special modifications myself. I also like to leave the default files in as pristine a condition as possible; this helps when upgrading, and it’s why I’ve made only very few changes to the default policies. It also means that instead of using services/autorun.cf I’ll create a new autorun entry point. This entry point is the only bundle executed by the default autorun.

I’ve saved this to services/autorun/digitalelf.cf

body file control
{
   agent::
      inputs => { @(digitalelf_autorun.inputs) };
}

bundle agent digitalelf_autorun
{
  meta:
      "tags" slist => { "autorun" };

  vars:
      "inputs" slist => findfiles("$(sys.masterdir)/services/autorun/*.cf");
      "bundle" slist => bundlesmatching(".*", "digitalelf");

  methods:
      "$(bundle)"
          usebundle => "$(bundle)",
          ifvarclass => "$(bundle)";

  reports:
    inform_mode::
      "digitalelf autorun is executing";
      "$(this.bundle): found bundle $(bundle) with tag 'digitalelf'";
}

This works exactly the same as autorun.cf, except that it looks for bundles matching digitalelf and only runs them if the bundle name matches a defined class. Also note that enabling inform_mode (i.e., cf-agent -I) will report which bundles have been discovered for automatic execution.

For example I have the following services/autorun/any.cf.

bundle agent any {

meta:

    # You must uncomment this line to enable autorun.
    "tags" slist => { "digitalelf" };

vars:

    linux::
        "local_bin_dir" string => "/usr/local/bin/";

    smartos::
        "local_bin_dir" string => "/opt/local/bin/";

files:

    "/etc/motd"
        edit_line => insert_lines("Note: This host is managed by CFEngine."),
        handle => "declare_cfengine_in_motd",
        comment => "Make sure people know this host is managed by cfengine";

}

Since the tag is digitalelf it will be picked up by services/autorun/digitalelf.cf, and because the bundle name is any, it will match the class any in the methods promise and therefore run.

You can drop in bundles that match any existing hard class and it will automatically run. Want all linux or all debian hosts to have a particular configuration? There’s a bundle for that.

Extending Autorun

You may already be familiar with my cfengine layout for dynamic bundlesequence and bundle layering. My existing dynamic bundlesequence is largely obsolete with autorun, but I still extensively use bundle stack layering. I’ve incorporated the classifications from bundle common classify directly into the classes: promises of services/autorun/digitalelf.cf. I can trigger bundles by discovered hard classes or with any user defined class created in bundle agent digitalelf_autorun. By using autorun bundles based on defined classes you can define classes from any source: hostname (as I do), LDAP, DNS, the filesystem, network API calls, etc.


Using 2048-bit DSA Keys With OpenSSH

There’s a long running debate about which is better for SSH public key authentication, RSA or DSA keys, with “better” in this context meaning “harder to crack/spoof” the identity of the user. This generally comes down in favor of RSA because ssh-keygen can create RSA keys up to 2048 bits while the DSA keys it creates must be exactly 1024 bits.

Here’s how to use openssl to create 2048-bit DSA keys that can be used with OpenSSH.

(umask 077 ; openssl dsaparam -genkey 2048 | openssl dsa -out ~/.ssh/id_dsa)
ssh-keygen -y -f ~/.ssh/id_dsa > ~/.ssh/id_dsa.pub
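
You can confirm the key size afterwards; ssh-keygen should report 2048 bits:

ssh-keygen -l -f ~/.ssh/id_dsa.pub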

After this, add the contents of id_dsa.pub to ~/.ssh/authorized_keys on remote hosts and remove your RSA keys (if any). I’m not recommending either RSA or DSA keys. You need to make that choice yourself. But key length is no longer an issue. We can now go back to having this debate on the merit of math.

How the NSA Is Breaking SSL

This isn’t a leak. I don’t have any direct knowledge. But I have been around the block a few times. It’s now widely known that the NSA is breaking most encryption on the Internet. What’s not known is how.

We also know that the Flame malware was signed by a rogue Microsoft certificate. That rogue Microsoft certificate was hashed with MD5, which is what allowed it to be impersonated.

On my Ubuntu box I just ran an analysis of the Root CA certificates (from the ca-certificates package, which itself comes from Mozilla). This certificate list is widely used by third-party programs as an authoritative list. But other distributors (e.g., Google, Apple, Microsoft) have a substantially similar list due to the need for SSL to work in all browsers. If any one vendor shipped a substantially different list then end users would merely perceive that browser as being broken and not use it.

Back to my analysis. Mozilla includes 20 Root CA certificates that use MD5 and 2 that use MD2. This is frightening. We already know that a Microsoft certificate with MD5 was used to distribute the Flame malware and it is all but proven that Flame was created and distributed by the U.S. government.
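
A rough sketch of the kind of check that produces those numbers (not my exact commands; paths are from Ubuntu’s ca-certificates package and the output format varies by openssl version):

for crt in /etc/ssl/certs/*.pem; do
  openssl x509 -in "$crt" -noout -text | grep 'Signature Algorithm' | head -1
done | sort | uniq -c | sort -rn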

The situation is clear. The NSA is in possession of one or more Root CA keys. It is only prudent to expect that the NSA has spoofed copies of all 22 CAs that use MD5 or MD2. It is also possible that they have exact copies (i.e., true keys, not spoofed) of other major U.S. based certificate authorities (I shudder to think of a world where a national security letter requests a Root CA key as being relevant to an investigation).

The NSA would then use these keys to spoof SSL certificates in real time, creating Subjects identical to the target web site, becoming a completely invisible man-in-the-middle. This method would be impossible to detect for all but the most skilled users.

Edit: Turns out I was right on the money.
Edit April 2014: Heartbleed notwithstanding, I still firmly believe the NSA is actively executing MITM attacks using genuine or spoofed Root CA keys. Why let an IDS fingerprint you when you can engage in active and undetectable surveillance?

Timeout in a Shell Script

Although GNU coreutils includes a timeout command, sometimes that’s not available. There are a lot of ham-fisted approaches by very intelligent people.

The “right” way to do this is with the ALRM signal. That’s what it’s for. So rather than reinvent the wheel, here’s a correctly working timeout function. This works in at least bash and zsh.

cleanup () {
  # Called from the ALRM trap: terminate the timed-out command, then
  # force-kill it if it is still running a second later.
  [[ -n $! ]] && kill -s TERM $!
  sleep 1
  [[ -n $! ]] && kill -s KILL $!
}

timeout () {
  # Schedule an ALRM for ourselves after $1 seconds, then run the rest
  # of the arguments as the command and wait for it.
  ( sleep $1 ; kill -s ALRM $$ ) &
  shift
  "$@" &
  wait $!
}

trap cleanup ALRM
timeout 5 sleep 7

In this case, timeout 5 executes with a timeout of 5 seconds and sleep 7 is the command to execute. This example will time out. The timeout function will return 142 if the process timed out.
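
So a calling script can branch on that return code, for example:

timeout 5 sleep 7
if [ $? -eq 142 ]; then
  echo "command timed out"
fi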

Statement by Edward Snowden to Human Rights Groups at Moscow’s Sheremetyevo Airport

Republished from WikiLeaks.


Friday July 12, 15:00 UTC

Edward Joseph Snowden delivered a statement to human rights organizations and individuals at Sheremetyevo airport at 5pm Moscow time today, Friday 12th July. The meeting lasted 45 minutes. The human rights organizations included Amnesty International and Human Rights Watch and were given the opportunity afterwards to ask Mr Snowden questions. The Human Rights Watch representative used this opportunity to tell Mr Snowden that on her way to the airport she had received a call from the US Ambassador to Russia, who asked her to relay to Mr Snowden that the US Government does not categorise Mr Snowden as a whistleblower and that he has broken United States law. This further proves the United States Government’s persecution of Mr Snowden and therefore that his right to seek and accept asylum should be upheld. Seated to the left of Mr. Snowden was Sarah Harrison, a legal advisor in this matter from WikiLeaks and to Mr. Snowden’s right, a translator.

Transcript of Edward Joseph Snowden statement, given at 5pm Moscow time on Friday 12th July 2013. (Transcript corrected to delivery)


Hello. My name is Ed Snowden. A little over one month ago, I had family, a home in paradise, and I lived in great comfort. I also had the capability without any warrant to search for, seize, and read your communications. Anyone’s communications at any time. That is the power to change people’s fates.

It is also a serious violation of the law. The 4th and 5th Amendments to the Constitution of my country, Article 12 of the Universal Declaration of Human Rights, and numerous statutes and treaties forbid such systems of massive, pervasive surveillance. While the US Constitution marks these programs as illegal, my government argues that secret court rulings, which the world is not permitted to see, somehow legitimize an illegal affair. These rulings simply corrupt the most basic notion of justice — that it must be seen to be done. The immoral cannot be made moral through the use of secret law.

I believe in the principle declared at Nuremberg in 1945: “Individuals have international duties which transcend the national obligations of obedience. Therefore individual citizens have the duty to violate domestic laws to prevent crimes against peace and humanity from occurring.”

Accordingly, I did what I believed right and began a campaign to correct this wrongdoing. I did not seek to enrich myself. I did not seek to sell US secrets. I did not partner with any foreign government to guarantee my safety. Instead, I took what I knew to the public, so what affects all of us can be discussed by all of us in the light of day, and I asked the world for justice.

That moral decision to tell the public about spying that affects all of us has been costly, but it was the right thing to do and I have no regrets.

Since that time, the government and intelligence services of the United States of America have attempted to make an example of me, a warning to all others who might speak out as I have. I have been made stateless and hounded for my act of political expression. The United States Government has placed me on no-fly lists. It demanded Hong Kong return me outside of the framework of its laws, in direct violation of the principle of non-refoulement — the Law of Nations. It has threatened with sanctions countries who would stand up for my human rights and the UN asylum system. It has even taken the unprecedented step of ordering military allies to ground a Latin American president’s plane in search for a political refugee. These dangerous escalations represent a threat not just to the dignity of Latin America, but to the basic rights shared by every person, every nation, to live free from persecution, and to seek and enjoy asylum.

Yet even in the face of this historically disproportionate aggression, countries around the world have offered support and asylum. These nations, including Russia, Venezuela, Bolivia, Nicaragua, and Ecuador have my gratitude and respect for being the first to stand against human rights violations carried out by the powerful rather than the powerless. By refusing to compromise their principles in the face of intimidation, they have earned the respect of the world. It is my intention to travel to each of these countries to extend my personal thanks to their people and leaders.

I announce today my formal acceptance of all offers of support or asylum I have been extended and all others that may be offered in the future. With, for example, the grant of asylum provided by Venezuela’s President Maduro, my asylee status is now formal, and no state has a basis by which to limit or interfere with my right to enjoy that asylum. As we have seen, however, some governments in Western European and North American states have demonstrated a willingness to act outside the law, and this behavior persists today. This unlawful threat makes it impossible for me to travel to Latin America and enjoy the asylum granted there in accordance with our shared rights.

This willingness by powerful states to act extra-legally represents a threat to all of us, and must not be allowed to succeed. Accordingly, I ask for your assistance in requesting guarantees of safe passage from the relevant nations in securing my travel to Latin America, as well as requesting asylum in Russia until such time as these states accede to law and my legal travel is permitted. I will be submitting my request to Russia today, and hope it will be accepted favorably.

If you have any questions, I will answer what I can.

Thank you.


For further information, see:

http://wikileaks.org/Statement-from-Edward-Snowden-in.html

http://wikileaks.org/Statement-by-Julian-Assange-after,249.html

Raspberry Pi and EW-7811Un

I’m setting up Raspberry Pis using the Edimax EW-7811Un wifi module available on Amazon for a mere $11.

Following the Debian WiFi wiki page initially didn’t work. The EW-7811Un uses an RTL8188CUS chipset which requires the rtl8192cu kernel driver. There’s no firmware-realtek package on Raspbian, and the best answer I found was to download some dude’s hacked kernel module. No thanks.

Instead, install the rpi-update package, then run rpi-update. The firmware will be updated in a way officially supported by Raspbian (if there is such a thing). Then reboot.
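
On Raspbian that amounts to something like this:

sudo apt-get install rpi-update
sudo rpi-update
sudo reboot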

A Case Study in CFEngine Layout

I’ve been working a lot with CFEngine newbies. CFEngine has been described as flour, eggs, milk and butter: all the ingredients needed to make a cake. Once the new CFEngine user recognizes, and becomes excited about, the possibilities that CFEngine provides, they are faced with the question of “What next?”

Indeed, anybody can throw some flour, eggs, milk and butter into a bowl, mix and bake it. But will it taste good?

This is an exposé of how I have managed my CFEngine repository for more than eight years. This design was used to manage over 1,000 host instances.

This works best if you have an agile infrastructure. Use SmartOS, OpenStack, Amazon EC2, CloudStack or similar.

The repository and version control

Firstly, place your cfengine repository in some revision control. I am highly partial to git. Get the Pro Git book (or download it). Read chapters 1, 2, 3. This will make you a git power user. After you’re comfortable using git read chapters 6 and 7. When you’re hungry for more, read the rest.

I symlink /var/cfengine/masterfiles to /cfengine/inputs. This contains all of my policy files.

I also create /cfengine/files for files that get copied to remote systems. This mostly contains my configuration files.

/cfengine/ is initialized as a git repository. Changes made to either inputs or files should be atomic. Adding something new for Apache? Any inputs and files involved should be checked in as a single commit. This makes reverting a change easier.
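
Concretely, the layout described above is just this (move any existing masterfiles directory out of the way first):

mkdir -p /cfengine/inputs /cfengine/files
ln -s /cfengine/inputs /var/cfengine/masterfiles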

Environments

I use four environments.

  • Alpha
  • Beta
  • pre-Production
  • Production

I also lied about initializing /cfengine as a git repository. I use a central repository server that contains only a bare git repository. The central repository has four branches.

  • master
  • beta
  • preprod
  • prod

Astute readers will notice there’s no alpha branch. I’ll get to that later.

beta is a full integration environment. Everything in beta is expected to work, yet not to be relied upon. That is to say, nothing should move to beta that is known broken. Beta will break. But don’t do that intentionally. If it’s half finished keep it out of beta.

prod is the full production environment. Breaking this means losing money. Don’t break this. Prod is tagged daily. Rolling back is done by checking out the appropriate tag.

preprod is for final quality assurance testing. Preprod should be identical to prod except for changes to be imminently released to prod. Preprod can also be used for offline testing of the production environment without affecting capacity or availability. Preprod should be in your production network fabric.

master is the trunk. All code is initially merged here, then merged to the appropriate branches. No one should be allowed to merge directly to any branch other than master. The repository czar merges commits to other branches.
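
For example, the czar promoting reviewed work from master into beta looks roughly like this (branch names as above):

git checkout beta
git merge --no-ff master
git push origin beta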

A DevOps Workflow

This is why there’s no alpha branch.

Let’s assume that you’re going to be making a change to the configuration of Tomcat.

  1. Spawn a new cfengine instance.
  2. git clone the cfengine master branch and bootstrap the server to itself.
  3. Spawn as many instances as necessary for your application to work. This will be at least the Tomcat instances, possibly also Apache and Postgres instances. Bootstrap all of them to your new cfengine server instance.
  4. By editing only the cfengine files out of the cloned repository make all of your updates.
  5. Code review
  6. When that feature is ready push a single commit to master and merge to beta
  7. Integration testing in Beta
  8. Changes that need to be made are done in the private instance set. When they’re ready proceed from step 5.
  9. When that feature is ready merge from beta to preprod.
  10. Final QA testing.
  11. Changes that need to be made are done in the private instance set. When they’re ready proceed from step 5. (Yes, that means it goes through integration again).
  12. When that feature is ready merge from preprod to prod.

Managing The CFEngine Repository: Layers

I use a layered approach. Each layer is contained within a single bundle.

  • Meta – These are things that affect every host that runs cfengine.

Layers that are based on intrinsic characteristics:

  • Operating system families – windows, unix (anything Unix like)
  • Operating System – linux, solaris, bsd, darwin
  • Distribution – debian, redhat, solaris11, omnios, freebsd, openbsd
    • Distro Version sub-layer – debian_6, redhat_6, centos_6

Layers that are based on the role:

  • Application – apache, postgresql, mysql, bind, tomcat (These are often named after packages)
    • Application sub-layer – apache1_3, apache2, tomcat6, tomcat7
  • Role – external_web, internal_web, proxy, smarthost
  • Hostname – web_f7f274, web_4d06a8

Bundle <--> Layer Mapping

I generally keep one bundle per file, one per layer. The default policy files that come with cfengine are in what I consider the meta layer.

This is a subset of my policy files to give you an idea of the organization.

  • unix.cf – bundle agent unix
  • linux.cf – bundle agent linux
  • debian.cf – bundle agent debian
  • redhat.cf – bundle agent redhat
  • solaris.cf – bundle agent solaris
  • apache2.cf – bundle agent apache2
  • bind9.cf – bundle agent bind9
  • web_ext.cf – bundle agent web_ext (policy for public facing web servers)
  • dpkg.cf – bundle agent dpkg (Package management common to Debian)
  • rpm.cf – bundle agent rpm (Package management common to RedHat)
  • ips.cf – bundle agent ips (Package management common to the Image Package System, used by Solaris)
  • digitalelf_stdlib.cf – Private library of bundles and bodies. This is similar in nature to cfengine_stdlib.cf, but I never change cfengine_stdlib.cf. I put things into my private library. When they are well tested I open a pull request with cfengine/core to contribute them.

All promises are added to the lowest layer bundle (with global being the lowest and hostname being the highest). Thus, because all Unix-like systems treat /etc/resolv.conf alike, changes to /etc/resolv.conf go into the unix layer. The sysctl handling is different per operating system, so it goes into the linux and bsd bundles at the OS layer.

An external facing web server, by nature of being a web server, must include apache, as does an internal facing web server, so each automatically pulls in apache2. Likewise canonical DNS servers and caching DNS servers alike pull in bind9.

Dynamic bundlesequence

Because of the layered approach, the inputs and bundles that need to be run are dynamically generated. Public web servers running on Debian Linux will automatically select the web_ext, apache2, debian, and linux bundles. I can have the same web content on Solaris 11 and it will instead choose the web_ext, apache2, and solaris bundles.

I have a very large header in promises.cf to facilitate this. Here is an excerpt of my promises.cf, along with additional commentary, to show how the bundlesequence is dynamically generated.

bundle common classify {

  # This section classifies host instances into roles based on the hostname.
  # I use a completely virtualized infrastructure with hostnames determined
  # by a role specific prefix and a hex string separated by an underscore.
  # The hex string is the last 3 bytes of the MAC address of the lowest
  # numbered interface (e.g., eth0). Instances are created this way by my
  # provisioning system.

  classes:

    "dns_ns"        or => { classmatch("ns[0-9]*") };
    "dns_forwarder" or => { classmatch("dns_[0-9a-f]*") };
    "db_server"     or => { classmatch("db_[0-9a-f]*") };
    "gitlab"        or => { classmatch("gitlab_[0-9a-f]*") };
    "web_ext"       or => { classmatch("www_[0-9a-f]*") };
    "web_int"       or => { classmatch("web_[0-9a-f]*") };
    "xwiki"         or => { classmatch("xwiki_[0-9a-f]*") };

    # Roles choose application bundles
    "apache"        expression => "dpkg_repo|web_ext|web_int";
    "bind"          expression => "dns_ns|dns_forwarder";
    "postgresql"    expression => "db_server";
    "tomcat"        expression => "xwiki|jira";
    "rails"         expression => "gitlab"

    # Roles and/or applications can be grouped
    "app_server"    expression => "rails|tomcat"

    # Applications may also depend on other applications
    "sql_client"    expression => "app_server";
    "ssl"           expression => "apache|tomcat|rails";
    "stunnel"       expression => "mysql";

}

bundle common g {

  # This section assigns bundles to application/role/grouping classes.
  # An array is created, named **bundles**. Each *key* is named after
  # a *bundle*. The *value* of each key is the input file where that
  # bundle can be found.

  vars:

    # These classes were defined by me in the classify bundle
    apache::
      "bundles[apache]"     string => "apache.cf";

    bind::
      "bundles[bind]"       string => "bind.cf";

    postgresql::
      "bundles[postgresql]"      string => "postgresql.cf";

    ssl::
      "bundles[ssl]"        string => "ssl.cf";

    stunnel::
      "bundles[stunnel]"    string => "stunnel.cf";

    # Thse are hard classes determined by cfengine. I don't need to explicitly
    # classify them.
    debian::
      "bundles[dpkg]"       string => "dpkg.cf";
      "bundles[debian]"     string => "debian.cf";

    centos::
      "bundles[rpm]"        string => "rpm.cf";
      "bundles[centos]"     string => "centos.cf"

    sunos_5_11::
      "bundles[ips]"        string => "ips.cf";
      "bundles[solaris]"    string => "solaris.cf";

    xen_dom0::
      "bundles[xen_dom0]",  string => "xen_dom.cf0";

    # Now the magic.
    # I create two slists. One named "sequence" and one named "inputs".
    # The "sequence" slist contains a list of bundle names.
    # The "inputs" slist contains a list of input files.
    any::
      "sequence"  slist => getindices("bundles");
      "inputs"    slist => getvalues("bundles");

}

body common control {

  # The bundlesequence now includes those things which are common to all, plus
  # the contents of the slist "sequence" (which has been dynamically generated),
  # plus the unqualified hostname.
  bundlesequence => { "global", "main", "@{g.sequence}", "${sys.uqhost}"};

  # The inputs now includes common libraries and main.cf which will be run by
  # all systems, plus the contents of the slist "inputs" (which has been
  # dynamically generated), plus an input based on the unqualified hostname.
  inputs => { "cfengine_stdlib.cf", "digitalelf_stdlib.cf", "main.cf", "@{g.inputs}", "${sys.uqhost}.cf" };

  # Sometimes I need to have any specific configuration for a single host (e.g.,
  # one of dns_ns will be the master and the rest will be slaves so the master
  # needs special configuration). The following options will allow cfengine to
  # skip the hostname bundle/input if one does not exist (which it usually
  # doesn't).
  ignore_missing_bundles => "true";
  ignore_missing_inputs  => "true";

  version => "Community Promises.cf 1.0.0";
}

Notice that instances are automatically classified by their hostname. So if I need a new external web server I provision a new instance with the name prefix www_ (I can also choose the OS at provisioning time). My provisioning system automatically assigns them a unique ID, creates the instance, installs the OS, installs cfengine, bootstraps it to the cfengine master server, runs cfengine to apply the final configuration and finally adds the instance’s services to the appropriate load balancer entries.

I have repository mirrors of all platforms I run so a newly provisioned host can be in production with a perfect configuration in as little as five minutes.