1.1. Introduction to Configuration Management with Puppet
1.1.2. What is Configuration Management?
Within virtually every organization, there are probably a number of systems running Linux, Solaris, Mac OS X and/or HP-UX. These systems need to be configured appropriately in order to function properly. Some will need special drivers, and all of them will need correct DNS settings, certain packages installed and certain other packages removed, users created, and SSH host keys exchanged. The more systems there are, the more they diverge in the configuration they need and in the way that configuration needs to be applied, and the more discrepancies their configurations will show over time.
More specifically, an organization may have a couple of web servers, file servers, a DNS and a DHCP server, a number of desktop PCs, and a number of laptops. The laptops may need a slightly different system configuration (no LDAP authentication, and a VPN client installed, for example), the desktop PCs may need different applications installed than the servers, and so forth. Yet between, say, a hundred desktop PCs, you would want the configuration to be as similar as possible. You may want to distinguish between a software developer's desktop PC and a desktop PC in Human Resources, but in essence these are desktop profiles that diverge at the application level, applied on top of a stable system configuration that remains the same, or at least similar.
As the organization grows, replaces hardware, upgrades to another version of the operating system, or applies changes, the challenge of making everything work while maintaining a similar configuration across all nodes becomes bigger. While every attempt made to control the situation can be called a form of configuration management, the solution without a configuration management framework often consists of:
a number of scripts (with or without revision control), to move around files, install packages, perform daily check-ups,
NFS mounts with programs pre-installed, so that nodes can mount these NFS shares and the software needs to be provided once, in one location, for all to share,
file server shares with pre-compiled drivers, or driver sources being compiled on the nodes by scripts running on the nodes,
terminal servers or desktop servers (with FreeNX, for example), so that configuration concentrates on a smaller number of boxes.
This means that workarounds for actual (user) problems may require an additional if-then-else in one script or another, and updates to installed programs require manual compilation and installation. The success rate of these solutions never reaches 100%. As it turns out, the longer such a solution runs, the more exotic the problems become and the more machines fail to remain up-to-date regardless of any attempt made to fix the issue, simply because the environment becomes too diverse and unmaintainable.
1.1.2.1. Configuration Management
Generally speaking, configuration management is about managing the configuration of one or more organizational resources so that they are in a state in which they can perform the operations required by, and possibly critical to, the organization. In addition, configuration management often concerns administrative records: which systems provide a service and what SLA or OLA is applicable to that service, as well as the purchase date, location of the system, responsible party, et cetera.
In this workshop, though, we are not going to explore configuration management of a coffee machine. Instead, we look at the computers in a network running any platform but the one from a prominent proprietary North America-based vendor. We are talking about automating and further enhancing computer systems administration.
When managing the operating system and software running on mainframes, servers, desktop PCs and laptops, you may find yourself looking for answers to questions such as:
How do I manage what packages are installed on a given system?
How do I make sure the services that every machine needs to run are actually running?
How do I manage monitoring the services or a machine's state?
A job needs to run periodically (maybe via crontab), but how do I make sure it is run, and how can I change or remove the job later?
Given different operating systems and operating system versions, how do I make sure I apply the correct routine for adding a user, starting a service, or installing, updating, or removing a package?
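The last question in particular can be made concrete with a small sketch. The Python snippet below is only an illustration of the idea, not a description of how Puppet is implemented: the `install_command` helper and the `INSTALL_COMMANDS` table are hypothetical names, and the table deliberately covers just two platform families. It shows how stating what you want (a package installed) can be separated from the platform-specific routine that achieves it.

```python
# Hypothetical lookup table: OS family -> command prefix that installs a package.
# Only two families are listed; a real framework covers many more platforms.
INSTALL_COMMANDS = {
    "redhat": ["yum", "-y", "install"],
    "debian": ["apt-get", "-y", "install"],
}


def install_command(os_family, package):
    """Return the argument list that installs `package` on `os_family`."""
    try:
        prefix = INSTALL_COMMANDS[os_family.lower()]
    except KeyError:
        # Refuse to guess a routine for a platform we know nothing about.
        raise ValueError(
            f"no install routine known for OS family {os_family!r}"
        ) from None
    return prefix + [package]


if __name__ == "__main__":
    # The caller names the package; the lookup picks the routine.
    print(" ".join(install_command("redhat", "httpd")))
    print(" ".join(install_command("debian", "apache2")))
```

The caller never writes an if-then-else per platform. Supporting another operating system would mean adding one entry to the table, which is the kind of abstraction a configuration management framework is expected to provide.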
1.1.2.2. Configuration Management Requirements
This section is about what you would want Configuration Management to do for you, as a system administrator for the systems within your organization. These could very well include:
Consistency across systems is key in understanding where a problem might come from and assessing where problems may be first introduced. If each and every system is unique, you may end up searching for unique aspects of the system's configuration in order to determine the cause of a problem, while if systems are mostly consistent and the exceptions to the rule are easily determined, you may have found the problem even before your users experience the consequences.
Of course, keeping systems consistent in their configuration doesn't mean all your systems should be entirely identical, because that would not be feasible for many organizations and would defeat the purpose of configuration management. Needless to say, though, having all systems be entirely unique defeats part of the purpose of configuration management as well.
Grouping systems into categories like (for example) desktop, server and/or laptop, helps in applying changes to one category, such as installing GNOME or keeping systems up-to-date according to a schedule that may (servers) or may not (desktops, laptops) need a service or maintenance window.
More generally speaking, different profiles for each of these categories may be defined as well. A developer's desktop most likely has different requirements than a publicly accessible information booth at the reception desk.
Version control lets you keep track of changes applied to the overall configuration management framework. This is important because if you are managing different aspects of a (large) number of systems and something goes wrong, the changes applied to the configuration Puppet uses will most likely be the first clue as to what caused the new problem, and the recorded history lets you recover relatively fast. Additionally, version control adds a layer that gives you the chance to perform access control, to have notifications of applied changes sent to interested people, and to branch off.
Being able to quickly tell what exactly a system does, and how it differs from another system, not only aids in assessing the risk of a given change beforehand, but also helps in determining the impact of an unexpected system or service interruption. As an example of the latter: if you update httpd across systems (whether tested or untested), but the new software version doesn't work as expected, a configuration management framework should be able to quickly give you an overview of impacted systems and services.
Some systems, such as desktop PCs, can be updated irregularly, but need to be kept up-to-date nonetheless. Other systems, such as servers, have service and/or maintenance windows, and thus need a very regular and strict update schedule, compliant with the update policies in place.
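Two of the requirements above, grouping systems into categories and profiles, and quickly getting an overview of impacted systems, can be sketched together. The Python snippet below is a minimal illustration under stated assumptions: the `Node` class, the `packages_for` and `impacted_by` helpers, the node names, and the package sets are all hypothetical, chosen only to echo the examples in the text (GNOME on desktops, a VPN client on laptops, httpd on some systems).

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    category: str             # e.g. "desktop", "server" or "laptop"
    profile: str = "default"  # e.g. "developer" or "webserver"


# Hypothetical package sets per category and per profile (illustrative names).
CATEGORY_PACKAGES = {
    "server": {"openssh-server"},
    "desktop": {"gnome"},
    "laptop": {"gnome", "vpn-client"},
}
PROFILE_PACKAGES = {
    "default": set(),
    "webserver": {"httpd"},
    "developer": {"gcc", "httpd"},
}


def packages_for(node):
    """Union of the packages implied by the node's category and profile."""
    by_category = CATEGORY_PACKAGES.get(node.category, set())
    by_profile = PROFILE_PACKAGES.get(node.profile, set())
    return by_category | by_profile


def impacted_by(package, nodes):
    """Names of all nodes whose configuration includes `package`, sorted."""
    return sorted(n.name for n in nodes if package in packages_for(n))


if __name__ == "__main__":
    nodes = [
        Node("web01", "server", "webserver"),
        Node("dev-pc", "desktop", "developer"),
        Node("hr-pc", "desktop"),
    ]
    # Which systems would a broken httpd update touch?
    print(impacted_by("httpd", nodes))
```

Because each node's configuration is derived from its category and profile rather than maintained by hand, the question of which systems are affected by an httpd update becomes a simple query over the same data that drives the configuration.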