The Importance of Compiled Metadata

Metadata is extremely important. It tells our clients what to expect from the cookbook we're about to execute. Back in September of 2012 I opened a ticket about the name attribute not being required in a cookbook's metadata. The gist of the problem is that the name of a cookbook, if not explicitly defined, is derived from the name of the directory that contains metadata.rb. This causes a severe problem when working with cookbook source, because you may not have cloned a cookbook into a directory that shares the cookbook's name (especially since the naming convention I follow on GitHub is {cookbook_name}-cookbook).
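To make the failure mode concrete, here is a minimal Ruby sketch of the fallback described above. The helper name effective_cookbook_name is hypothetical and is not part of Chef; it only mimics the rule that an explicit name wins and the directory name is used otherwise.

```ruby
# Hypothetical helper (not Chef's API): mimics how a cookbook name is
# resolved when metadata.rb may or may not declare `name`.
def effective_cookbook_name(metadata_path, explicit_name = nil)
  # An explicitly declared name always wins.
  return explicit_name if explicit_name && !explicit_name.empty?

  # Otherwise fall back to the directory that holds metadata.rb.
  File.basename(File.dirname(File.expand_path(metadata_path)))
end

# Cloned from a repository named "nginx-cookbook":
effective_cookbook_name("/src/nginx-cookbook/metadata.rb")          # => "nginx-cookbook"
effective_cookbook_name("/src/nginx-cookbook/metadata.rb", "nginx") # => "nginx"
```

Without an explicit name, the clone directory leaks into the cookbook's identity, which is exactly the mismatch the {cookbook_name}-cookbook convention runs into.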

The good thing is that the auto-magic cookbook naming problem currently only manifests on our development machines, but what if the Chef-Client started storing cookbooks in a directory not named after the cookbook itself? Well, you'd have a problem since the Chef-Client currently evaluates your cookbook's raw metadata at runtime.

In the current version of Chef-Client (and all prior versions) the raw metadata file (metadata.rb) takes precedence over the compiled metadata file (metadata.json). You never see your cookbook's compiled metadata on your development machine because it's automatically generated by Knife (or Ridley) on upload. If you've ever authored a RubyGem you should be familiar with this. Your code repository contains a .gemspec file, which holds arbitrary Ruby code that is executed when you build your gem. What would happen, though, if that gemspec were evaluated on the RubyGems servers before sending you the gem, or worse, on your own machine?

Well, you'd for sure have a problem on your hands if you didn't have Git installed. It's a very common pattern to generate the list of files contained in your gem automatically by shelling out to git ls-files from the gemspec. Luckily that isn't the case with RubyGems: the gemspec is evaluated on the developer's machine when the gem is built, before the gem is ever sent to RubyGems.
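The pattern in question usually looks something like the sketch below. The method name tracked_files and its scm parameter are made up for illustration; in a real gemspec the backtick call typically sits directly on the files assignment.

```ruby
# Illustrative only: list the files tracked by version control by
# shelling out, as many gemspecs do to populate their file list.
# `scm` is a hypothetical parameter so the failure can be demonstrated.
def tracked_files(scm = "git")
  # Backticks run the command and return its standard output.
  # If the executable cannot be found, Ruby raises Errno::ENOENT.
  `#{scm} ls-files`.split("\n")
end

begin
  tracked_files("scm-that-is-not-installed")
rescue Errno::ENOENT => e
  warn "could not build file list: #{e.message}"
end
```

On a machine without the executable, the unrescued version of this call would abort whatever is evaluating the file, which is harmless at gem build time but not at runtime on a node.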

As I mentioned before, since the raw metadata is preferred by the Chef-Client over the compiled metadata, unhandled exceptions like the gemspec git ls-files example are possible at runtime when the Chef-Client attempts to load your cookbook. I created these two tickets to address the issue:

In the meantime I patched Ridley to no longer include raw metadata when uploading cookbooks to the Chef Server. This prevents the Chef-Client from prioritizing the raw metadata, since the raw file never reaches an instance of Chef-Client.

Berkshelf 3.0 has also been updated so the raw metadata will be compiled and stripped out during a berks upload, berks vendor, or berks package.

With these changes we can now feel confident in dynamically generating portions of our metadata. We can now do things like:

  • Gather a list of Public Recipes in our cookbook and add an entry for each in our metadata
  • Gather a list of Public Attributes in our cookbook and add an entry for each in the metadata including their default value, type, etc.
  • Automatically version our cookbook from the contents of a VERSION file or a version control tag
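As a rough sketch of the first and third ideas, helpers like the ones below could be defined at the top of a metadata.rb. Both helper names are hypothetical, and treating recipe files that start with an underscore as private is my assumption for this example, not a Chef rule.

```ruby
# Hypothetical helpers for a dynamically generated metadata.rb.

# Read the cookbook version from a VERSION file in the cookbook root.
def version_from_file(cookbook_root)
  File.read(File.join(cookbook_root, "VERSION")).strip
end

# Return qualified names for every "public" recipe. Assumption: files
# whose names begin with an underscore are private and are skipped.
def public_recipes(cookbook_root, cookbook_name)
  Dir.glob(File.join(cookbook_root, "recipes", "*.rb"))
     .map { |path| File.basename(path, ".rb") }
     .reject { |recipe_name| recipe_name.start_with?("_") }
     .sort
     .map { |recipe_name| "#{cookbook_name}::#{recipe_name}" }
end

# Inside metadata.rb these could then feed the metadata DSL, e.g.:
#
#   root = File.dirname(__FILE__)
#   name    "my_cookbook"
#   version version_from_file(root)
#   public_recipes(root, "my_cookbook").each { |r| recipe r, "" }
```

Because this code reads the filesystem, it is only safe when the metadata is compiled before upload, so that nothing here is evaluated on the node.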

I will be adding helper functions that automatically generate metadata for your cookbooks in a future version of Berkshelf. In the meantime, feel free to add these things yourself, provided you're using Ridley or Berkshelf to upload your cookbooks!