I spent the weekend playing with CoreOS. Here are some things that I found interesting.
I think the first thing to really understand is the set of concepts behind systemd units. They aren’t particularly tricky, and going through the entire official documentation is very enlightening. The main things I took away from it are the following:
The [Unit] section is primarily for a simple description of the unit (for use in log
and status output) and dependency definitions. The main attributes for defining
dependencies include Requires and After. It is important to note that neither of
these implies the other. As an example, if serviceA depends on serviceB to be
started successfully as a prerequisite, the unit file for serviceA would need to
indicate that with both attributes set to serviceB.
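As a rough sketch of that serviceA/serviceB example, the [Unit] section of serviceA might look like the following. The description text is made up, and I'm assuming the dependency is named serviceB.service.

```ini
# serviceA.service (illustrative only)
[Unit]
Description=Service A, which needs Service B to be up first
# Requires= pulls serviceB in when serviceA is started
Requires=serviceB.service
# After= only controls ordering: serviceA waits until serviceB has started
After=serviceB.service
```

With only Requires set, both units would be started at the same time. With only After set, serviceB would not be pulled in at all.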
I think it’s important to understand the [Install] section, but nothing about it is
worth pointing out. Well, except that fleet-managed service units on a CoreOS
cluster will explicitly and intentionally exclude this section. Instead, a
service unit configuration will likely include some attributes in an [X-Fleet]
section. More on that later.
On a CoreOS cluster, most user services will be configured through service
configuration files and scheduled by fleet. In the [Service] section, the
Type attribute is important to understand. I found that the oneshot
value can be used for something like initializing a dataset. For instance,
I might have a database.service of Type simple. Then a second unit of Type
oneshot Requires and runs After database.service to initialize the dataset,
and a webapp.service unit can be ordered after that initialization unit.
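Here is a minimal sketch of that chain, under a few assumptions of mine: the initialization unit is called database-init.service, the script path is a placeholder, and I've set RemainAfterExit so the unit still counts as active once the script has finished.

```ini
# database-init.service (hypothetical name)
[Unit]
Description=Initialize the dataset once the database is running
Requires=database.service
After=database.service

[Service]
Type=oneshot
# Placeholder path; the command must be given as an absolute path
ExecStart=/opt/bin/init-dataset.sh
# Keep the unit reported as active after the script exits
RemainAfterExit=yes

# webapp.service (only the dependency part is shown)
# [Unit]
# Requires=database-init.service
# After=database-init.service
```

Units ordered After a oneshot unit wait for its command to finish, which is what makes it a reasonable fit for setup steps like this.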
This can be useful for Docker volume containers, too. In this case, you
would probably also want to enable the RemainAfterExit boolean. This is
mostly because the convention for volume containers is that they don’t run a
long-running service, but rather usually just an echo and a quick exit.
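A volume container unit along those lines might look roughly like this. The unit name, container name, volume path and busybox image are all just placeholders I picked for illustration.

```ini
# data-volume.service (hypothetical name)
[Unit]
Description=Data volume container
Requires=docker.service
After=docker.service

[Service]
Type=oneshot
# Without this the unit would show as inactive as soon as echo returns
RemainAfterExit=yes
# The container only has to exist; it prints a message and exits right away
ExecStart=/usr/bin/docker run --name data-volume -v /data busybox /bin/echo "data volume created"
```

One caveat with this sketch: starting the unit a second time fails if a container named data-volume already exists, so a real unit would need to handle that case.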
There are a few Exec* attributes available to service units and they are
generally pretty self-explanatory. The main ones are ExecStart and ExecStop,
along with their pre and post variants such as ExecStartPre and ExecStopPost.
The value of these options needs to be an absolute path to an executable
followed by arguments to that command. If the command filename is prefixed
with -, a failing exit code from that command is ignored.
It’s also important to note that while the command syntax is intended to be similar to shell syntax, there are many aspects of shell syntax which are intentionally unsupported.
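To make the Exec* behavior a bit more concrete, here is a small sketch. The unit name, container name, image and log path are placeholders of mine, not anything CoreOS prescribes.

```ini
# hello.service (hypothetical unit)
[Unit]
Description=Exec* example
Requires=docker.service
After=docker.service

[Service]
# The leading "-" ignores a failing exit code, e.g. when no old container exists yet
ExecStartPre=-/usr/bin/docker rm -f hello
ExecStart=/usr/bin/docker run --name hello busybox /bin/sleep 3600
ExecStop=/usr/bin/docker stop hello
# Redirection is not interpreted by systemd itself,
# so this command invokes a shell explicitly
ExecStopPost=/bin/sh -c "echo stopped >> /tmp/hello.log"
```

Wrapping a command in /bin/sh -c is the usual way around the unsupported shell features, at the cost of an extra process.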
CoreOS recognizes a subset of configuration items of the cloud-init project’s
cloud-config file. This is generally used to initialize individual nodes of a
cluster through a cloud provider’s user-data option. This usually starts by
generating a new
etcd discovery token.
The trick here is that your cluster won’t be considered healthy until you have
the minimum number of nodes in your cluster. You configure that when you
generate your discovery token. If you don’t explicitly set a cluster size, the
default is 3 nodes. This was tripping me up, as I couldn’t figure out why the
fleet service was failing when I had just one node in the cluster. Looking
back, this seems a little more obvious.
One notable component is the fleet metadata
attribute. You might use this to distinguish between cluster nodes in different
regions, for instance.
CoreOS expects that you will be managing service units on your cluster through
an abstraction provided as fleet. These service units will be Docker
containers, since that’s the only real service container supported natively at
the time of this writing. You will be defining service units as you would for
systemd, except without an [Install] section, and you might optionally include
some fleet-specific properties in an [X-Fleet] section. You might use these
options to manage where your service is scheduled on your cluster.
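As a hedged sketch, a fleet unit with scheduling hints could look something like this. The unit name, container and region value are made up. The option names are the ones I believe fleet documents, but they have changed between releases (older versions used names like X-ConditionMachineMetadata and X-Conflicts), so check them against your fleet version.

```ini
# hello@.service (hypothetical template unit submitted to fleet)
[Unit]
Description=Hello container scheduled by fleet
Requires=docker.service
After=docker.service

[Service]
ExecStart=/usr/bin/docker run --name hello busybox /bin/sleep 3600
ExecStop=/usr/bin/docker stop hello

# Note that there is no [Install] section
[X-Fleet]
# Only schedule on machines whose fleet metadata contains region=us-west
MachineMetadata=region=us-west
# Do not schedule on a machine that already runs another hello@ instance
Conflicts=hello@*.service
```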