A few weeks ago I got a new laptop and, even though I’ve been doing quite a bit of development work with it, I haven’t bothered to install virtualenv, rvm, bundler, or any other dev management tool… besides Docker and Compose.
Compose (formerly known as Fig) is a layer on top of Docker that, in a very simple way, allows for both runtime configuration of containers and linking between them. Runtime configuration is the kind of stuff you may pass on the command line to a Docker image when running it: port forwarding, volume mounting, entrypoints… Linking containers allows them to see each other without the need to expose any ports. And all this is done using a really simple YAML configuration file that you can check into your repository to make the whole environment replicable.
So, if you have a Ruby web app that uses MongoDB for persistence and Memcache as a caching service, instead of installing every single dependency on your dev machine, you can just create a Dockerfile for the app and then a Compose configuration that mounts the app’s source as a volume (so that changes you make to your code are immediately available in the container), sets up port forwarding so you can see the app from your browser, and links it with containers for Memcache and MongoDB. You probably won’t even need to do anything special with these two, since they are already built on the Docker Hub. Then, just by running

docker-compose up

you’ll have your whole environment up and ready to use. Make sure to check Compose’s quickstart for an example of all this in action.
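As a sketch of what such a Compose file might look like (the service names, ports and the app’s Dockerfile are my assumptions, not from the quickstart):

```yaml
web:
  build: .            # Dockerfile for the Ruby app, assumed to exist
  volumes:
    - .:/app          # mount the source so code changes show up immediately
  ports:
    - "3000:3000"     # port forwarding so the app is visible from the browser
  links:              # linking, so the app can see mongo and memcached
    - mongo
    - memcached
mongo:
  image: mongo        # prebuilt image from the Docker Hub
memcached:
  image: memcached    # ditto
```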
Impressive as this is (and it is, not only in what it does but in how easy and painless the whole process is), it isn’t enough for a development machine. The first issue I encountered was doing stuff inside the container besides the entrypoint declared in the Dockerfile, like running a REPL, a debugger or a db console. Compose’s default way of doing this is through docker-compose run SERVICE COMMAND, which will start a new container for the same image and run COMMAND. This is not that great, since it takes a while for the container to start (a simple echo for the container I use for this blog takes about 1.5 seconds). Docker already has exec, which runs the command in the same container (the same test using docker exec takes 0.33 seconds). If I understand correctly there’s already work to integrate it into Compose, so hopefully this won’t be an issue for much longer. In the meantime I’ve been playing around with an idea to tie commands to containers so that they can then be run using docker exec, in Fede (keep in mind this is extremely alpha software).
The next problem I found is a permissions one. Even though I use Docker as a non-root user, files created in mounted volumes inside the container are created as root (since I just use root as the user for the container), so checking logs or uploaded files sometimes means chmodding them. Luckily this was solved a couple of weeks ago in Docker, so it should be included in Docker’s next release.
And finally, the last issue I’ve been having is long build times when using external package managers like bundler or pip. Since Docker’s cache is per command, adding a new dependency to your requirements.txt means that when rebuilding the image, once it gets to RUN pip install -r requirements.txt it won’t have any package cached, so if you have enough packages this will take a long time. In one project where I was using scipy it got so bad that I just added a RUN pip install scipy before the requirements line, so that at least that package was cached. While I’m not the first one to notice this, I haven’t yet seen any good solution to the problem. I’d like to explore the possibility of having these tools emit a series of Docker commands that could be run sequentially and that Docker could cache (a kind of Docker meta command), so pip install -r requirements.txt --docker would return a series of RUN pip install <package> lines that Docker could then run and cache independently in future builds.
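Just to make the idea concrete, here’s a minimal sketch of what such an expansion could do. Note that pip has no --docker flag; this is only an illustration of the concept, and it ignores real-world wrinkles like includes or environment markers in the requirements file:

```shell
#!/bin/sh
# Sample requirements file, created here so the sketch is self-contained.
cat > requirements.txt <<'EOF'
flask
scipy
EOF

# Emit one RUN instruction per package, so Docker could cache each
# install independently instead of redoing them all on any change.
while read -r pkg; do
  # skip blank lines and comments
  case "$pkg" in ''|\#*) continue ;; esac
  printf 'RUN pip install %s\n' "$pkg"
done < requirements.txt
```

Pasting the resulting lines into a Dockerfile before the final RUN pip install -r requirements.txt would give you per-package caching while keeping the requirements file authoritative.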
All things considered, I’m quite happy with using Docker for the kind of hobby stuff I do at home. The UX can certainly be improved, but being able to tap into the Docker Hub for all kinds of services and base images for tons of programming languages, which I can link together and bring up and down with a single command without any discernible performance penalty, is just so great that I’m willing to put up with it while the kinks are ironed out, which in my experience with Docker will probably be pretty soon.
A templating language or library should be in the toolbox of every programming environment. While mostly used in web development to output HTML, templates are useful in plenty of other situations, from creating dynamic configuration files to metaprogramming and preprocessing.
Today we are going to take a look at a couple of ways to do templates in bash, half for fun and half because it’s actually useful: creating a configuration file with dynamic fields from a simple script can be a pain without them.
If you google for bash templates, you’ll eventually find m4. m4 is a pretty old-school macro preprocessor that has found its way into the standard unix toolset, mostly due to its use in autotools. Its mode of operation can be quite complex, with directives to control output, rule rewriting, recursive looping… Have a look at the example on Wikipedia for a small taste. This complexity mostly kills it for me; it seems that once you start using m4 you are going to be in one of those “now you have 2 problems” scenarios.
However, if our templating needs are limited to some string substitutions, m4 can be a good option. With a template like this:
GREET, USER, your home is HOME
We can simply run m4 -DUSER=$USER -DHOME=$HOME -DGREET=Hi template.m4 (careful here: the order of the arguments is important, -DNAME definitions always go before the template file) and get
Hi, diego, your home is /home/diego
If we are limiting ourselves to simple substitutions we have at least to make a passing mention of sed. Let’s replicate the simple m4 example with sed. The template is the same, and our command line would be something like this:
sed template.m4 -e "s/USER/$USER/g" -e "s/HOME/$HOME/g" -e "s/GREET/Hi/g"
However, using double quotes to allow shell expansion of $HOME makes the expression invalid, as after substitution it’ll be s/HOME/home/diego/g. We can avoid this error by changing the separator like this:
sed template.m4 -e "s/USER/$USER/g" -e "s|HOME|$HOME|g" -e "s/GREET/Hi/g"
Which works, but we’ve arrived at a solution that is both more verbose and more brittle when compared to the m4 one, while having no obvious benefits, so I think we can safely ignore sed for this use case.
For now we are stuck with m4, which is very powerful but complex, and involves learning a new language. Can’t we simply use bash and have it work like PHP, where plain text is output as-is and code is executed and its result inserted?
The best idea I’ve come up with for this is hackish, and not as pretty as a pure templating language, but it works. The idea is to use bash’s string interpolation to expand variables passed to the template. To output we’ll use cat, and to provide multi-line strings, heredocs. Let’s have a look:
cat <<EOF
$GREET, $USER, your home is $HOME
EOF
This is our template. While there’s a little bit of noise at the beginning and end of it, it’s mostly plain text with the variables marked by a leading $. To run this template we can simply do this:
GREET=Hi . template.sh
More complex stuff is possible, but it may not be very pretty. For example, loops:
for i in {1..5}
do
cat <<EOF
$GREET $USER
EOF
done
cat <<EOF
your home is $HOME
EOF
Which outputs
Hi diego
Hi diego
Hi diego
Hi diego
Hi diego
your home is /home/diego
As I said, it’s not the prettiest, and it’s not quite the same as a pure templating language, since the primary mode of operation is execution + output rather than plain output, but for simple use cases it can work pretty well.
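Putting the pieces together, here’s a self-contained sketch of the whole approach (the file name template.sh is just illustrative):

```shell
#!/bin/sh
# Write the template to a file. The outer heredoc delimiter is quoted
# ('OUTER') so $GREET and friends are NOT expanded here, only when the
# template itself runs.
cat > template.sh <<'OUTER'
cat <<EOF
$GREET, $USER, your home is $HOME
EOF
OUTER

# Render by running the template with the variables set:
GREET=Hi USER=diego HOME=/home/diego sh template.sh
# -> Hi, diego, your home is /home/diego
```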
So far I’ve tried to stay true to the standard unix toolbox; however, since we are in bash, nothing is preventing us from using PHP, Perl, Ruby or whatever. Setting up the environment for each template can be more complex, but the power and simplicity they provide probably can’t be matched by any of the options we’ve talked about.
If you remember, last time we were in the spooky mansion of the evil toymaker, cutting equivalent pieces of cake. We already have a way to generate the states adjacent to a given one and to check for solutions, but we haven’t yet explored how to actually use these functions to iterate and solve the problem.
We first used best-first search to solve the sliding tiles puzzle with great results, so let’s try it here:
(let [initial (build-initial-state 6 cake)
      solution? (build-solution-checker 6 cake)
      get-next-states (build-get-next-states 6 cake)]
  (bfser/solve initial solution? get-next-states))
Which arrives at the following solution in almost no time:
[[3 8 8 8 7 6]
 [3 3 3 8 7 6]
 [4 4 3 8 7 6]
 [4 4 4 7 7 6]
 [5 5 5 5 5 6]]
However, let’s remember for a moment something we said when we first introduced best-first:
Since we know that our solution must lie somewhat close to the initial node (It’s unlikely that any game designer would be so evil to have us make hundreds of moves to solve the puzzle), our best bet is to use a breadth-first approach.
Does the solution for our current problem lie closer to the root node, or nearer the leaves? Think about the tree we are generating: it starts with all chunks available for taking, and at each level we add one chunk to the active piece, until in the last level there are no available chunks left. That is, in the last level all the pieces are complete for every branch; put yet another way, all the solutions are in the last level!
Thinking about it this way, exploring a full level before moving on to the next seems like a waste. If we explore the tree in a depth-first manner, always moving down before moving sideways, we’ll probably find a solution sooner, since we’ll be reaching the last level of the tree regularly, while a breadth-first search only gets to the last level once it has explored all the previous ones.
As for the implementation of a depth-first search, thanks to the magic arts of recursion it’s as simple as:
(defn solve [state solution? get-next-states]
  (if (solution? state)
    [state]
    (if-let [states (some identity (map #(solve % solution? get-next-states)
                                        (get-next-states state)))]
      (conj states state))))
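To see the shape of what solve returns, here’s a toy run with a made-up “puzzle” (count from 0 up to 3) rather than the cake, just for illustration:

```clojure
;; States are numbers, the solution is 3, and each state's only
;; successor is its increment.
(solve 0
       #(= % 3)
       (fn [n] (when (< n 3) [(inc n)])))
;; => [3 2 1 0] — the solution comes first, the initial state last
```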
For consistency with the other solvers we are returning a list (a vector here) with the history of all the states that lead to the solution. We don’t need it this time, but who knows what future puzzles we’ll find in our quest?
The solution this solver comes up with is the same one best-first provided, and since it’s a pretty simple problem the difference in execution time is negligible, but for bigger cakes the best-first solver would probably get stuck much sooner than the backtracker.
What would you expect if an evil toymaker invited you and six other random third-rate actors to spend a night in a campy early-nineties CGI haunted house for the chance to win a fortune? The answer is obvious: horror-themed logic puzzles!
Let’s delve into 1993’s The 7th Guest (now playable almost anywhere with ScummVM), a series of puzzles barely held together by a “horror” storyline. One of the first puzzles we’ll have to solve involves a cake with skull, tombstone and plain toppings, which we have to cut into six pieces with the same type and number of toppings.
Let’s start by getting a representation for the cake. I don’t think anyone will be surprised when we turn to our trusty matrix, with 1s being skulls, 2s tombstones and 0s plain:
(def cake [[1 2 2 1 2 1]
           [0 2 1 1 2 1]
           [0 2 2 0 1 0]
           [1 1 2 0 1 2]
           [2 2 1 0 1 2]])
For the solution, we can use numbers from 3 to 8 to represent each of the pieces, so this would be one of the multiple solutions:
(def solution [[3 8 8 8 7 6]
               [3 3 3 8 7 6]
               [4 4 3 8 7 6]
               [4 4 4 7 7 6]
               [5 5 5 5 5 6]])
As you can see each piece has 2 skulls, 2 tombstones and a plain topping. For starters, we need to get the number of each type of topping that a piece will have, something like this:
{0 1, 1 2, 2 2}
Starting from the cake and the number of pieces we want to make, here’s a way of getting this piece spec:
(defn get-spec [pieces cake]
  (into {} (map #(vector (first %) (-> % last count (/ pieces)))
                (group-by identity (flatten cake)))))
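For our cake and six pieces this produces exactly the spec from above (the printed key order may vary):

```clojure
(get-spec 6 cake)
;; => {1 2, 2 2, 0 1}
```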
Basically, we group the toppings, count each group and divide by the number of pieces, collecting the results into a hash. With this spec it now becomes easier to think about how we are going to solve the puzzle: starting from a fixed position, in each iteration we are going to add a chunk to a piece, making sure that it conforms to the spec.
For the next iteration the cake will have one less available chunk (the one we added to the piece) and the spec will reflect that it needs one less of the topping we added. Once a piece is complete, we’ll reset the spec to start creating a new piece.
The following piece of code updates all these parts. Note that we are using a map to store all the different pieces of information we move around, and we merge it with the updated version of each of them:
(defn update-position [pos state]
  (let [{:keys [cake piece-num spec]} state]
    (merge state {:cake (assoc-in cake pos piece-num)
                  :last-pos pos
                  :spec (update-in spec [(get-in cake pos)] dec)})))
With this, the next part should be pretty familiar to us: from a position we gather all the adjacent chunks that could be added to the current piece and return a list of these new states:
(defn valid-next [{:keys [last-pos spec cake]}]
  (filter #(valid-val? (spec (get-in cake %)))
          (for [coord [[-1 0] [0 -1] [1 0] [0 1]]]
            (map + last-pos coord))))

(defn get-next-states [original-state state]
  (let [state (update-spec original-state state)]
    (map #(update-position % state) (valid-next state))))
Next up, we need a way of checking for solutions. One way would be to check that each piece conforms to the spec; however, since we are ensuring that each piece conforms while we are creating it, checking for the solution is as simple as ensuring that there are no toppings left on the cake:
(defn solution? [{piece-num :piece-num} state]
  (every? #(>= % piece-num) (flatten (:cake state))))
We are almost done. If you check the repo you’ll see some more functions that, given a cake and a number of pieces, create the map we use to iterate, and curry the solution? and get-next-states functions so that they can be used in our solvers.
I think we’ve had enough cake for one sitting, come back next time to see how our generic solvers fare when confronted with this sweet problem.
One neat little feature in clojure is that sets and hashes can be used as functions. While weird at first, it’s actually pretty useful, more so when combined with higher order functions like filter:
(#{:a :b} :b); => :b
(#{:a :b} :c); => nil
({:a 1 :b 2} :a); => 1
({:a 1 :b 2} :c); => nil
(filter #{1 3} [1 2 3]); => (1 3)
I hadn’t seen this data-structures-as-functions construct before, but I really like it, it exploits the fact that there’s a clear default action associated with the type, looking up keys in a hash or elements in a set in the examples, and uses it to make code both terser and more self-explanatory.
Actually, come to think of it, this is not as alien as it may seem. In ruby we use Symbol#to_proc probably once every 5 lines. In fact, we are so used to it that we probably don’t think much about what it really means:
[1, 2, 3].inject(&:+)
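Spelled out, that shorthand is roughly equivalent to an explicit block, and Symbol#to_proc can also be called by hand:

```ruby
# The &:+ version, desugared into an explicit block:
[1, 2, 3].inject { |sum, x| sum + x }  # => 6

# Symbol#to_proc builds that block for us: the resulting proc sends
# the symbol as a message to its first argument.
add = :+.to_proc
add.call(1, 2)  # => 3
```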
What this code does is take the symbol +, use the unary operator & to call to_proc on it and ‘turn’ it into a block, which is then used as the block for inject. The key thing to realize here is that we can provide a default way for a type (Symbol in this case) to become a block, by using & and to_proc. This does sound similar to what clojure is doing, doesn’t it? It’s really just a matter of some monkeypatching to create hashes and sets that work like functions:
require 'set'

class Hash
  def to_proc
    Proc.new { |x| self[x] }
  end
end

class Set
  def to_proc
    # Set has no contains? method; include? is the membership check
    Proc.new { |x| self.include?(x) }
  end
end
Let’s see how it works:
[1, 2, 3].select(&Set.new([2])) # => [2]
h = {:a => 1, :b => 2, :c => 3}
[:a, :b, :c].map(&h) # => [1, 2, 3]
Again, while unfamiliar at first, the more I look at it the more I like it, especially the hash example. Another type that has a well defined default action is the regular expression:
class Regexp
  def to_proc
    Proc.new { |x| x.match(self) }
  end
end
Which makes code like this possible:
["a", "b", "c"].detect(&/a/) # => "a"
["a", "b", "c"].select(&/a/) # => ["a"]
# Note that this is the same as ["a", "b", "c"].grep(/a/)
["a", "b", "c"].reject(&/a/) # => ["b", "c"]
This one I like even better: it’s much easier to spot when using literal regexes and, as evidenced by the existence of Enumerable#grep, it’s a pretty useful feature.
Array may seem like another candidate, but what should the action be: look up elements by position, or check for element presence? Having to choose probably means that it’s better to leave it alone. As for others, I can’t really think of more types with default actions clear enough to merit a to_proc, but I’d gladly hear any suggestions.