Conductor Asset References

Warning: technical content ahead

This August I rolled out a new feature for Conductor: Asset References. In short, Conductor now provides a means of tracking which Pages, News, and/or Events make use of a given Asset. This feature helps both with troubleshooting your site and with understanding where your content is being used.

I’m going to step through the system development that went into this feature.

Data Structures

Conductor is backed by a relational database that stores the content of the site. The database has a table each for Pages, Events, and News. Each of those tables can have one or more of the following attributes: content, excerpt, “parts”, and “meta-fields”. A reference to an Asset can appear in any of these attributes as an HTML link, but not all of the tables have the same attributes (see below).

Attribute     Events  News  Pages
content       yes     yes   yes
excerpt       yes     yes   no
parts         no      no    yes
meta-fields   yes     yes   yes

As a result, there is a bit of a challenge: I don’t want to write three separate functions that look for references to Assets (i.e. one for Events, one for News, etc.). So let’s do a little bit of abstraction, blur your eyes if you will, and say that the heterogeneous Pages, Events, and News are all “ThingsWithContent”.

Calculating an asset reference is not the primary role of a “ThingWithContent”; managing content is. So I’m going to create another table called AssetReference whose sole purpose is to record the relation between ThingsWithContent and Assets by using a many-to-many relationship.

An AssetReference is created by taking a “ThingWithContent” and then checking each of its attributes that could have references to an Asset. This process is run each time a “ThingWithContent” is created or updated.
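
The many-to-many relationship can be pictured with a small in-memory model. This is only a sketch: the column names (referrer_type, referrer_id, asset_id), the AssetReferenceTable class, and the choice to replace all of a ThingWithContent’s references on every save are assumptions for illustration, not Conductor’s actual schema or code.

```ruby
# Hypothetical join record: one row per (ThingWithContent, Asset) pair.
# The column names are assumptions, not Conductor's actual schema.
AssetReference = Struct.new(:referrer_type, :referrer_id, :asset_id)

# In-memory stand-in for the AssetReference table.
class AssetReferenceTable
  def initialize
    @rows = []
  end

  # Replaces every reference held by one ThingWithContent. Replacing rather
  # than appending is an assumption about what "run on each update" means.
  def rebuild(referrer_type, referrer_id, asset_ids)
    @rows.reject! do |row|
      row.referrer_type == referrer_type && row.referrer_id == referrer_id
    end
    asset_ids.uniq.each do |asset_id|
      @rows << AssetReference.new(referrer_type, referrer_id, asset_id)
    end
  end

  # Answers "which Pages, News, and Events use this Asset?"
  def referrers_of(asset_id)
    @rows.select { |row| row.asset_id == asset_id }.
      map { |row| [row.referrer_type, row.referrer_id] }
  end
end

table = AssetReferenceTable.new
table.rebuild('Page', 1, [22, 305])
table.rebuild('Event', 7, [22])
table.rebuild('Page', 1, [305]) # Page 1 was edited and no longer links asset 22

p table.referrers_of(22)
p table.referrers_of(305)
```

Because each row only pairs a referrer with an asset, one Asset can be referenced by many ThingsWithContent and one ThingWithContent can reference many Assets.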

Observers

Since the primary role of a Page, News item, or Event is not checking for AssetReferences, I wanted to loosely couple the AssetReferences calculation. The goal of loose coupling is to insulate each of the coupled objects from changes in the other. By implementing the Observer Pattern, I’m able to loosely couple a ThingWithContent to an AssetReference. I created a ReferenceObserver which is notified when a ThingWithContent is created or updated. The ReferenceObserver then builds the AssetReferences by inspecting the observed ThingWithContent and only processing the attributes (i.e. content, excerpt, parts, meta-fields) that the ThingWithContent has implemented. In short, the ReferenceObserver can take any type of ThingWithContent and successfully check for AssetReferences.
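
Here is roughly what that inspection looks like in plain Ruby. It is a minimal sketch, not the Conductor code: the Observed module, the after_save callback name, and the Struct-based Page and Event are hypothetical stand-ins so the example runs without Rails. Only the ReferenceObserver name and the attribute names come from the description above.

```ruby
# Minimal hand-rolled notification so the sketch runs without Rails.
module Observed
  def observers
    @observers ||= []
  end

  def add_observer(observer)
    observers << observer
  end

  # Stand-in for a real save; it only notifies the observers.
  def save
    observers.each { |observer| observer.after_save(self) }
    true
  end
end

# Hypothetical models: a Page has no excerpt, an Event has no parts.
class Page < Struct.new(:content, :parts, :meta_fields)
  include Observed
end

class Event < Struct.new(:content, :excerpt, :meta_fields)
  include Observed
end

class ReferenceObserver
  CANDIDATE_ATTRIBUTES = [:content, :excerpt, :parts, :meta_fields]

  attr_reader :scanned

  def initialize
    @scanned = []
  end

  # Called whenever an observed ThingWithContent is created or updated.
  def after_save(thing)
    @scanned = text_from(thing)
  end

  private

  # Object inspection: only read the attributes this object implements.
  def text_from(thing)
    CANDIDATE_ATTRIBUTES.
      select { |name| thing.respond_to?(name) }.
      map { |name| thing.send(name) }.
      flatten.compact
  end
end

observer = ReferenceObserver.new
page = Page.new('<a href="/assets/22/a.pdf">A</a>', ['sidebar text'], nil)
page.add_observer(observer)
page.save
p observer.scanned
```

The Page and Event classes know nothing about asset references; they only announce that they were saved, and the observer decides what to read.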

Pattern Matching

Each of the implemented fields that might have an AssetReference is then checked using a Regular Expression to find all of the links to an Asset. Regular expressions are a powerful tool for finding and even replacing text. If you haven’t seen them before, they are extremely daunting. And if you’ve seen them before, they are merely daunting. The regular expression I use is as follows:

/\/assets\/(a-conductor-site\.nd\.edu\/)?(\d+)\//

I’m going to break this down so you can understand what’s going on. To simplify, I’m going to remove most of the back-slashes (\). They are a technical necessity but in this instance a learning obstacle, so I’ll remove them.

//assets/(a-conductor-site.nd.edu/)?(\d+)//

Then I’m going to remove the leading and trailing forward-slash (/) as those are used to indicate a regular expression.

/assets/(a-conductor-site.nd.edu/)?(\d+)/

This is a bit more legible but still needs explanation. Take the chunk “/assets/(a-conductor-site.nd.edu/)?”: it says to find text that includes “/assets/” and optionally includes “a-conductor-site.nd.edu/”. The parentheses indicate a Group of text and the question mark means the Group is optional. The next chunk, “(\d+)/”, means a Group of one or more digits followed by a forward-slash. The \d stands for any digit and the plus-sign (+) means one or more of the preceding item.

For example: “/assets/22/” would match the pattern, but “/assets/2a/” would not.
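
The pattern is easy to try out in irb. In this small sketch the regular expression is the one shown above, while the helper name (asset_ids_in) and the sample HTML are made up for illustration.

```ruby
# The regular expression from above: an optional host segment, then the
# numeric asset id captured in the second group.
ASSET_PATTERN = /\/assets\/(a-conductor-site\.nd\.edu\/)?(\d+)\//

# Hypothetical helper: returns the unique asset ids linked from some HTML.
def asset_ids_in(html)
  # With groups in the pattern, String#scan yields one [host, id] pair per match.
  html.to_s.scan(ASSET_PATTERN).map { |_host, id| id.to_i }.uniq
end

sample = '<a href="/assets/22/report.pdf">Report</a> ' \
         '<img src="/assets/a-conductor-site.nd.edu/305/logo.png" /> ' \
         '<a href="/assets/2a/broken.pdf">Broken</a>'

p asset_ids_in(sample) # prints [22, 305]
```

The “/assets/2a/” link is skipped because the digits must be followed immediately by a forward-slash.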

Once I have the asset URLs, it is a matter of connecting the Asset to the ThingWithContent.

Conclusion

By keeping the responsibility of determining AssetReferences away from the ThingWithContent, I’m able to keep the core responsibility of a ThingWithContent intact (i.e. provide content). By use of object inspection, I’m able to handle a heterogeneous set of ThingWithContent. And by use of Regular Expressions, I’m able to find and register links to multiple assets.

Posted: 2010-08-23

Pretty Output for irb and script/console

A rather simple means of adding pretty output to irb:

  • Get the hirb gem
    sudo gem install cldwalker-hirb --source http://gems.github.com
  
  • Configure ~/.irbrc
    require 'rubygems'
    require 'hirb'
    Hirb.enable
    extend Hirb::Console
  
  • Look at the output
    >> Site.find(:first)
    +----+-----------+-----------+-----------+-----------+-----------+
    | id | name      | domain    | site_s... | create... | update... |
    +----+-----------+-----------+-----------+-----------+-----------+
    | 1  | Web Group | webgro... | 1         | 2007-0... | 2008-0... |
    +----+-----------+-----------+-----------+-----------+-----------+
    1 row in set
    >>
  

This makes me happy. A prettier output for Ruby objects. For more information, see http://github.com/cldwalker/hirb/tree/master

Posted: 2009-07-22

Rails Subdomain caching

Today I spent some time implementing caching in a Rails application that makes use of subdomains.
With a quick bit of google-fu, I stumbled upon Nathaniel Bibler’s post.
It is a very informative guide; however, the Apache rewrite rules proved to be a bit more involved. In addition, the shenanigans of the subdomain seemed a little unnecessary, so I opted for host instead of subdomain. The changes are below.

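    class ApplicationController < ActionController::Base
      private

      # Writes the cached page under a directory named after the request's
      # host instead of directly under the page cache directory.
      def cache_page_with_host(content = nil, options = nil)
        path = case options
          when Hash
            url_for(options.merge(:only_path => true, :skip_relative_url_root => true, :format => params[:format]))
          when String
            options
          else
            request.path
        end
        cache_page_without_host(content, File.join("/#{request.host}/", path == "/" ? 'index.html' : path))
      end

      # Assumed wiring, not part of the original listing: cache_page_without_host
      # only exists once cache_page has been wrapped, e.g. with ActiveSupport's
      # alias_method_chain.
      alias_method_chain :cache_page, :host
    end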
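
To see which file path that File.join call produces, the path logic can be pulled out into a plain function. This is only an illustration: host_cache_path is a hypothetical helper, and the host and request paths are sample values.

```ruby
# Hypothetical helper mirroring the File.join call in cache_page_with_host.
def host_cache_path(host, path)
  # The site root is stored as index.html; other paths are kept as requested.
  File.join("/#{host}/", path == "/" ? 'index.html' : path)
end

p host_cache_path('a-conductor-site.nd.edu', '/')
p host_cache_path('a-conductor-site.nd.edu', '/news/2009')
```

File.join collapses the doubled slash where the host directory meets the request path, so each host gets its own tidy directory in the page cache.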

Apache Config

RewriteEngine On

# Rewrite rule to check for host cached pages.
# This will check the file system for a cached copy, if that exists
# use the static cached page. 
# The [QSA,L] is very important. See http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#rewriteflags
RewriteCond %{DOCUMENT_ROOT}/cache/%{HTTP_HOST}/index.html -f
RewriteRule ^/$ /cache/%{HTTP_HOST}/index.html [QSA,L]
RewriteCond %{DOCUMENT_ROOT}/cache/%{HTTP_HOST}/%{REQUEST_URI}.html -f
RewriteRule ^(.*)$ /cache/%{HTTP_HOST}/$1.html [QSA,L]

Posted: 2009-05-28

First Post and in all likelihood the last.

I have begun to explore using Jekyll and Github pages. Oh github I <3 you.

Posted: 2009-05-19

git add --patch file.name

I would highly recommend reading this post (http://tomayko.com/writings/the-thing-about-git); it highlights yet another awesome feature of git. Git lets you do your work how you do it, instead of how the SCM says you should. Here is a great example. I have a hopelessly muddled set of changes in file_the_other.txt. The changes are for two (or more) different purposes, and I only need to push a portion of them. Enter git add --patch. I can go through each hunk of the diff and choose whether to put it in the queue (the Index) for what will be committed.

  $ git add --patch file_the_other.txt 
  diff --git a/file_the_other.txt b/file_the_other.txt
  index 2120d21..e3aa90d 100644
  --- a/file_the_other.txt
  +++ b/file_the_other.txt
  @@ -1,3 +1,6 @@
  +World is here
   A file like another file, but with different content.

  -There is even more content here.
  \ No newline at end of file
  +There is even more content here.
  +
  +Hello World
  \ No newline at end of file
  Stage this hunk [y/n/a/d/s/?]?

Posted: 2008-04-08
