urbanautomaton

Bacon Driven Development

Git Grep and Language-aware Diffs

I’m a heavy user of git-diff - before every commit I review my changes for a final sanity check. However, I only recently discovered that my diffs weren’t telling me as much as they might.

Git’s diffs are presented in hunks, with each set of adjacent changes shown with some surrounding context, and a header line providing further contextual information. If I run the command

$ git diff app/models/landing_page.rb

I see:

diff --git a/app/models/landing_page.rb b/app/models/landing_page.rb
index 6ad43a7..3524d43 100644
--- a/app/models/landing_page.rb
+++ b/app/models/landing_page.rb
@@ -29,6 +29,7 @@ class LandingPage < ActiveRecord::Base
"http://tribesports.com/products/#{cheapest_result}"
end
end
+ hello
end

Here we can see some file information, some git object information, and then a single diff hunk, prefixed with the header:

@@ -29,6 +29,7 @@ class LandingPage < ActiveRecord::Base

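After the line information there is some local context for the hunk. In this case, however, we’re only being shown the class declaration, which says little about where in the class the change was made. This is because git doesn’t automatically know it’s looking at a Ruby file, so it falls back on its default heuristic: show the nearest preceding line that starts in the first column with a letter, underscore or dollar sign. But git has a number of language-aware diff modes that allow it to provide finer-grained hunk headers. If I add the line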

*.rb diff=ruby

to the .gitattributes file in my project root, I get the following output:

diff --git a/app/models/landing_page.rb b/app/models/landing_page.rb
index 6ad43a7..3524d43 100644
--- a/app/models/landing_page.rb
+++ b/app/models/landing_page.rb
@@ -29,6 +29,7 @@ def cheapest_url
"http://tribesports.com/products/#{cheapest_result}"
end
end
+ hello
end

Now diff hunks are presented with the appropriate context; in this case, a method declaration.
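To make the difference concrete, here is a rough Ruby model of how the text after the `@@ ... @@` markers gets chosen: walk backwards from the start of the hunk and take the first line that matches a pattern. The two patterns are my approximations of git's default heuristic and of its Ruby mode, and `hunk_context` and `SOURCE` are names invented for this sketch, not anything git exposes.

```ruby
# Approximation of git's default: an unindented line starting with a
# letter, underscore or dollar sign.
DEFAULT_PATTERN = /\A[A-Za-z_$]/

# Approximation of what diff=ruby looks for: class, module or def,
# at any indentation.
RUBY_PATTERN = /\A[ \t]*(class|module|def)[ \t]/

# Walk backwards from the line before `hunk_start` (1-indexed) and return
# the first line matching `pattern`, or nil if there is none.
def hunk_context(lines, hunk_start, pattern)
  (hunk_start - 2).downto(0) do |index|
    line = lines[index].chomp
    return line.strip if line.match?(pattern)
  end
  nil
end

# A made-up file with the same shape as the example above.
SOURCE = <<~'RUBY'.lines
  class LandingPage < ActiveRecord::Base
    def refresh_results
      fetch_results
    rescue
      []
    end

    def cheapest_url
      if cheapest_result
        "http://tribesports.com/products/#{cheapest_result}"
      end
    end
  end
RUBY

puts hunk_context(SOURCE, 10, DEFAULT_PATTERN) # class LandingPage < ActiveRecord::Base
puts hunk_context(SOURCE, 10, RUBY_PATTERN)    # def cheapest_url
```

The same lookup applied to a single matching line rather than a hunk gives the kind of label `git grep -p` prints: `hunk_context(SOURCE, 4, RUBY_PATTERN)` returns `def refresh_results` for the `rescue` on line 4.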

This seems like a fairly small benefit, but the language-aware diff features are used in other contexts. For example, the command

$ git grep -p <pattern>

will find all instances of <pattern> in your checked-in code, labelling them by the detected context. Without Ruby diffs enabled, if I run

$ git grep -p rescue

I will see all classes that contain rescue statements. With Ruby diffs enabled, I’ll see all methods containing rescue statements:

$ git grep -p rescue
app/models/landing_page.rb=  def refresh_results
app/models/landing_page.rb:      rescue

Nifty, eh?

Paperclip: Timestamping Attachments on Upload

It’s not my intention to make this a Paperclip-only blog, but I recently had another requirement that was quite simply accomplished, and thought it was common enough to be useful to others.

Timestamping attachments could be valuable for several reasons – in our case, it became necessary when our site moved to S3 and Cloudfront for the delivery of our static assets. If the content of a file changes but its name remains the same, Cloudfront’s edge nodes will happily continue serving stale content for any asset they’ve cached, which is obviously undesirable.

We therefore need to ensure that attachments have their filename timestamped on upload. Initially I did this with a ghastly hack involving subclassed Tempfiles with an overridden #original_filename method, but this was, well, ghastly. I also briefly tried using Paperclip’s built-in callbacks, but it turned out to be much nicer to do it with a custom processor, like so:
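What follows is a minimal sketch of such a processor rather than the original listing. It assumes Paperclip's `Processor` base class (constructed with `file, options, attachment` and asked to `make`) and the attachment's `#instance_read` and `#instance_write` methods. The class name `Timestamper`, the `:date_format` option, the default format and the `-` separator are choices made for this sketch. The stand-in `Processor` and `FakeAttachment` exist only so the example runs without the gem.

```ruby
# Stand-in base class so the sketch runs outside Rails; skipped when the
# real Paperclip is loaded.
unless defined?(Paperclip::Processor)
  module Paperclip
    class Processor
      attr_reader :file, :options, :attachment

      def initialize(file, options = {}, attachment = nil)
        @file = file
        @options = options
        @attachment = attachment
      end
    end
  end
end

module Paperclip
  class Timestamper < Processor
    DEFAULT_FORMAT = "%Y%m%d%H%M%S" # assumed default, not from the original

    def make
      format   = options[:date_format] || DEFAULT_FORMAT
      original = attachment.instance_read(:file_name)
      # Rewrite the stored filename with a date string prepended.
      attachment.instance_write(:file_name, "#{Time.now.strftime(format)}-#{original}")
      file # pass the file contents on untouched
    end
  end
end

# Tiny fake attachment, only for trying the sketch out.
FakeAttachment = Struct.new(:file_name) do
  def instance_read(attr)
    self[attr]
  end

  def instance_write(attr, value)
    self[attr] = value
  end
end

attachment = FakeAttachment.new("logo.png")
Paperclip::Timestamper.new(nil, { date_format: "%Y%m%d" }, attachment).make
puts attachment.file_name # an eight-digit date, then "-logo.png"
```

One caveat with this sketch: as far as I can tell Paperclip runs its processors once per style, so a real version would need to make sure the name is only stamped once. In a model it would presumably be enabled with something like `processors: [:thumbnail, :timestamper]` in the `has_attached_file` options.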

The key bit is using the attachment’s #instance_write method, as suggested by Trevor Turk for a slightly different purpose. This sets the instance variables that Paperclip uses to determine the uploaded filename.

Place this file somewhere in your load path (ideally lib/paperclip_processors), and tell your models to use it after thumbnailing (or before; it doesn’t really matter). Note that once you list processors yourself, Paperclip no longer falls back on its default, so you now have to explicitly tell it that you want to use the Thumbnail processor as well.

You can specify the date format using the standard format strings, or you can omit the option and a default will be used. Now whenever an attachment is created or altered, its filename will have a date string prepended to it, and your new content will be served as intended, whatever your distribution details are.
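For a feel of what different format strings produce, here is a small illustration using Ruby's `Time#strftime`. The helper `timestamped_name`, its default format and the `-` separator are hypothetical, chosen for this example only; the time is fixed so the output is predictable.

```ruby
# Hypothetical helper: prepend a formatted date string to a filename.
def timestamped_name(file_name, time, date_format = "%Y%m%d%H%M%S")
  "#{time.strftime(date_format)}-#{file_name}"
end

uploaded_at = Time.utc(2012, 3, 4, 5, 6, 7)

puts timestamped_name("logo.png", uploaded_at)             # 20120304050607-logo.png
puts timestamped_name("logo.png", uploaded_at, "%Y-%m-%d") # 2012-03-04-logo.png
```

A format with second-level resolution is the safer choice for cache busting, since two uploads on the same day would otherwise end up with the same name.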

Paperclip: Padding Image Attachments

At Tribesports we deal with a vast quantity of external images. We use thoughtbot’s excellent Paperclip gem for all of our image attachments; it allows us to manage everything from design formats to CDN storage with so little effort it’s almost insulting.

I’m currently resizing the images for our site to fit a new design, and came across a requirement not supported by Paperclip’s image format strings (which are inherited from ImageMagick); namely, to pad images to the desired aspect ratio, rather than crop or distort them.

This is very simple to achieve using a custom processor. First, we set up a simple model with our desired geometries, and tell Paperclip that we want to use a custom processor called Padder:

Next, we create our custom Padder processor. This needs to go somewhere in our application’s auto-load path; I placed it in lib/paperclip.

We want to inject our custom behaviour at the resizing stage, so we sub-class Paperclip’s Thumbnail class, and override its #transformation_command method. This returns an array of strings that are joined to produce an ImageMagick convert command. To pad an image to a specified size, we use ImageMagick’s -extent option with a background colour and an alignment. Then we simply append it to the super’s transformation command:

The #geometry_extent method just creates a pure geometry specification string (without any special ImageMagick modifiers). And that’s all there is to it.
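The approach described above can be sketched roughly as follows. This is an illustration, not the original listing: it assumes that Paperclip's `Thumbnail` exposes `target_geometry` with `width` and `height`, and that `#transformation_command` returns an array of strings. The white background and centre alignment are arbitrary choices. The stand-in `Thumbnail` (with a simplified constructor) is there only so the example runs without the gem.

```ruby
# Stand-in for Paperclip's Thumbnail so the sketch runs outside Rails;
# skipped when the real Paperclip is loaded.
unless defined?(Paperclip::Thumbnail)
  module Paperclip
    Geometry = Struct.new(:width, :height)

    class Thumbnail
      attr_reader :target_geometry

      # Simplified constructor: the real one takes (file, options, attachment).
      def initialize(width, height)
        @target_geometry = Geometry.new(width, height)
      end

      def transformation_command
        ["-resize", %["#{target_geometry.width}x#{target_geometry.height}"]]
      end
    end
  end
end

module Paperclip
  class Padder < Thumbnail
    # Append the padding options to whatever Thumbnail would have done.
    # -gravity and -background are settings, so they must precede -extent.
    def transformation_command
      super + ["-gravity", "center",
               "-background", "white",
               "-extent", %["#{geometry_extent}"]]
    end

    # Plain WIDTHxHEIGHT, with no modifiers such as ">" or "#".
    def geometry_extent
      "#{target_geometry.width.to_i}x#{target_geometry.height.to_i}"
    end
  end
end

puts Paperclip::Padder.new(200, 100).transformation_command.join(" ")
# -resize "200x100" -gravity center -background white -extent "200x100"
```

In a model this would presumably be selected with `processors: [:padder]` in the `has_attached_file` options, using plain geometries so that images are scaled to fit inside the box before being padded out to it.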