Conductor Asset References
Warning: technical content ahead
This August I rolled out a new feature for Conductor: Asset References. In short, Conductor now provides a means of tracking which Pages, News, and/or Events make use of a given Asset. This feature helps both with troubleshooting your site and with understanding where your content is being used.
I’m going to step through the system development that went into this feature.
Data Structures
Conductor is backed by a relational database responsible for storing the content of the site. As part of the database, we have tables for Pages, Events, and News. Each of those tables can have one or more of the following fields: content, excerpt, “parts” and “meta-fields”. A reference to an Asset can appear in any of these attributes as an HTML link, but not all of the tables have the same attributes (see below).
| Attribute | Events | News | Pages |
| content | yes | yes | yes |
| excerpt | yes | yes | no |
| parts | no | no | yes |
| meta-fields | yes | yes | yes |
As a result there is a bit of a challenge: I don’t want to write three separate functions that look for references to Assets (i.e. one for Events, one for News, etc.). So let’s do a little bit of abstraction, blur your eyes if you will, and say that the heterogeneous Pages, Events and News are all “ThingsWithContent”.
Calculating an asset reference is not the primary role of a “ThingWithContent”; managing content is. So I’m going to create another table called AssetReference whose sole purpose is to record the relation between ThingsWithContent and Assets by using a many-to-many relationship.
An AssetReference is created by taking a “ThingWithContent” and then checking each of its attributes that could have references to an Asset. This process is run each time a “ThingWithContent” is created or updated.
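To make the many-to-many relationship concrete, here is a minimal in-memory sketch in plain Ruby. It is not Conductor's schema: the class name AssetReferenceTable, the Row fields (owner_type, owner_id, asset_id) and both method names are assumptions made for illustration. It shows the two operations described above: rebuilding the rows for one ThingWithContent on every save, and asking which things use a given Asset.

```ruby
# Hypothetical stand-in for the AssetReference join table (not Conductor's schema).
class AssetReferenceTable
  # One row links one ThingWithContent (type + id) to one Asset id.
  Row = Struct.new(:owner_type, :owner_id, :asset_id)

  def initialize
    @rows = []
  end

  # Replace every row for one owner. Intended to run on each create or update.
  def rebuild(owner_type, owner_id, asset_ids)
    @rows.reject! { |row| row.owner_type == owner_type && row.owner_id == owner_id }
    asset_ids.uniq.each { |asset_id| @rows << Row.new(owner_type, owner_id, asset_id) }
    self
  end

  # Answer the question the feature exists for: which things use this Asset?
  def owners_of(asset_id)
    @rows.select { |row| row.asset_id == asset_id }
         .map { |row| [row.owner_type, row.owner_id] }
         .sort
  end
end

table = AssetReferenceTable.new
table.rebuild("Page", 1, [22, 7])
table.rebuild("Event", 3, [22])
p table.owners_of(22)
```

Deleting and re-inserting the rows on every save keeps the table consistent without having to diff the old and new content.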
Observers
Since the primary role of a Page, News, or Event is not checking for AssetReferences, I wanted to loosely couple the AssetReferences calculation. The goal of loose coupling is to insulate each of the coupled objects from changes in the other. By implementing the Observer Pattern, I’m able to loosely couple a ThingWithContent to an AssetReference. I created a ReferenceObserver which is notified when a ThingWithContent is created or updated. The ReferenceObserver then builds the AssetReferences by inspecting the observed ThingWithContent and processing only the attributes (i.e. content, excerpt, parts, meta-fields) that the ThingWithContent has implemented. In short, the ReferenceObserver can take any type of ThingWithContent and successfully check for AssetReferences.
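The inspection step can be sketched in plain Ruby with respond_to?. This is an illustration rather than Conductor's code: the Struct-based Page and Event stand-ins, the meta_fields attribute name, and the texts_for method are all assumptions. In the real application the observer would be notified by the framework on create or update.

```ruby
# Hypothetical stand-ins for two kinds of ThingWithContent.
Page  = Struct.new(:content, :parts)
Event = Struct.new(:content, :excerpt)

class ReferenceObserver
  # Every attribute that might hold a link to an Asset (names are assumed).
  CANDIDATE_ATTRIBUTES = [:content, :excerpt, :parts, :meta_fields]

  # Collect the text of only those attributes the observed object implements.
  def texts_for(thing)
    CANDIDATE_ATTRIBUTES
      .select { |name| thing.respond_to?(name) }
      .flat_map { |name| Array(thing.public_send(name)) }
      .compact
  end
end

observer = ReferenceObserver.new
p observer.texts_for(Page.new("body text", ["sidebar", "footer"]))
p observer.texts_for(Event.new("body text", "short summary"))
```

Because the observer asks each object what it implements, supporting a new kind of ThingWithContent needs no change to the observer itself.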
Pattern Matching
Each of the implemented fields that might have an AssetReference is then checked using a Regular Expression to find all of the links to an Asset. Regular expressions are a powerful tool for finding and even replacing text. If you haven’t seen them before, they are extremely daunting. And if you’ve seen them before, they are merely daunting. The regular expression I use is as follows:
/\/assets\/(a-conductor-site\.nd\.edu\/)?(\d+)\//
I’m going to break this down so you can understand what’s going on. To simplify, I’m going to remove most of the backslashes (\). They are a technical necessity (they escape characters such as / and . that would otherwise have a special meaning), but in this instance they are a learning obstacle.
//assets/(a-conductor-site.nd.edu/)?(\d+)//
Then I’m going to remove the leading and trailing forward-slash (/) as those are used to indicate a regular expression.
/assets/(a-conductor-site.nd.edu/)?(\d+)/
This is a bit more legible but still needs explanation. The first chunk, “/assets/(a-conductor-site.nd.edu/)?”, says to find text that includes “/assets/” and optionally includes “a-conductor-site.nd.edu/”. The parentheses indicate a Group of text and the question mark means the Group is optional. The next chunk, “(\d+)/”, means a Group of one or more digits followed by a forward-slash. The \d stands for any digit and the plus-sign (+) means one or more of the preceding character.
For example: “/assets/22/” would match the pattern, but “/assets/2a/” would not.
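Here is a short Ruby sketch of the pattern at work. The regular expression is the one shown above, unchanged; the helper name asset_ids_in and the sample markup are made up for illustration.

```ruby
# The pattern from the post, unchanged.
ASSET_LINK = /\/assets\/(a-conductor-site\.nd\.edu\/)?(\d+)\//

# Hypothetical helper: returns the unique Asset ids linked from a piece of text.
def asset_ids_in(text)
  # scan yields one [host, id] pair per match because the pattern has two Groups.
  text.to_s.scan(ASSET_LINK).map { |_host, id| id.to_i }.uniq
end

html = '<a href="/assets/22/report.pdf">Report</a> ' \
       '<img src="/assets/a-conductor-site.nd.edu/7/logo.png">'
p asset_ids_in(html)          # => [22, 7]
p asset_ids_in('/assets/2a/') # => []
```

The second Group is the one that matters: it captures the digits that identify the Asset, whether or not the optional host Group is present.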
Once I have the asset URLs, it is a matter of connecting the Asset to the ThingWithContent.
Conclusion
By keeping the responsibility of determining AssetReferences away from the ThingWithContent, I’m able to keep the core responsibility of a ThingWithContent intact (i.e. provide content). By use of object inspection, I’m able to handle a heterogeneous set of ThingWithContent. And by use of Regular Expressions, I’m able to find and register links to multiple assets.
Posted: 2010-08-23
Pretty Output for irb and script/console
Rather simple means of adding pretty output to irb
- Get the hirb gem
sudo gem install cldwalker-hirb --source http://gems.github.com
- Configure ~/.irbrc
require 'rubygems'
require 'hirb'
Hirb.enable
extend Hirb::Console
- Look at the output
>> Site.find(:first)
+----+-----------+-----------+-----------+-----------+-----------+
| id | name | domain | site_s... | create... | update... |
+----+-----------+-----------+-----------+-----------+-----------+
| 1 | Web Group | webgro... | 1 | 2007-0... | 2008-0... |
+----+-----------+-----------+-----------+-----------+-----------+
1 row in set
>>
This makes me happy. A prettier output for Ruby objects. For more information, see http://github.com/cldwalker/hirb/tree/master
Posted: 2009-07-22
Rails Subdomain caching
Today I spent some time implementing caching in a Rails application that makes use of subdomains.
With a quick bit of google-fu, I stumbled upon Nathaniel Bibler’s post.
It is a very informative guide; however, the Apache rewrite rules proved to be a bit more involved. In addition, the shenanigans of the subdomain seemed a little unnecessary, so I opted for host instead of subdomain. The changes are below.
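class ApplicationController < ActionController::Base
  private

  # Cache the page under a directory named after the request host,
  # so each host gets its own cached copy.
  def cache_page_with_host(content = nil, options = nil)
    path = case options
           when Hash
             url_for(options.merge(:only_path => true, :skip_relative_url_root => true, :format => params[:format]))
           when String
             options
           else
             request.path
           end
    cache_page_without_host(content, File.join("/#{request.host}/", path == "/" ? 'index.html' : path))
  end
  # Assumed wiring, not shown in the original snippet: makes cache_page call
  # cache_page_with_host and keeps the original as cache_page_without_host.
  alias_method_chain :cache_page, :host
end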
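To see which path ends up being handed to the original cache_page, here is the File.join expression from the controller pulled out into a standalone helper. The helper name host_cache_path and the host example.nd.edu are hypothetical; only the expression itself comes from the code above.

```ruby
# Hypothetical helper mirroring the File.join call in cache_page_with_host.
def host_cache_path(host, path)
  # The site root is stored as index.html; every other path is nested under the host.
  File.join("/#{host}/", path == "/" ? "index.html" : path)
end

p host_cache_path("example.nd.edu", "/")        # => "/example.nd.edu/index.html"
p host_cache_path("example.nd.edu", "/news/42") # => "/example.nd.edu/news/42"
```

File.join collapses the doubled slash where the host directory meets the request path, which is why the helper can pass request paths through untouched.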
Apache Config
RewriteEngine On
# Rewrite rule to check for host cached pages.
# This will check the file system for a cached copy, if that exists
# use the static cached page.
# The [QSA,L] is very important. See http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#rewriteflags
RewriteCond %{DOCUMENT_ROOT}/cache/%{HTTP_HOST}/index.html -f
RewriteRule ^/$ /cache/%{HTTP_HOST}/index.html [QSA,L]
RewriteCond %{DOCUMENT_ROOT}/cache/%{HTTP_HOST}/%{REQUEST_URI}.html -f
RewriteRule ^(.*)$ /cache/%{HTTP_HOST}/$1.html [QSA,L]
Posted: 2009-05-28
First Post and in all likelihood the last.
I have begun to explore using Jekyll and Github pages. Oh github I <3 you.
Posted: 2009-05-19
git add --patch file.name
I would highly recommend reading this post (http://tomayko.com/writings/the-thing-about-git); it highlights yet another awesome feature of git. Git lets you do your work how you do it, instead of how the SCM says you should. Here is a great example. I have a hopelessly muddled set of changes in file_the_other.txt. The changes are for two (or more) different purposes, and I need to push only a portion of them. Enter git add --patch. I can go through each diff hunk and choose whether to put it in the queue (the Index) for what will be committed.
$ git add --patch file_the_other.txt
diff --git a/file_the_other.txt b/file_the_other.txt
index 2120d21..e3aa90d 100644
--- a/file_the_other.txt
+++ b/file_the_other.txt
@@ -1,3 +1,6 @@
+World is here
 A file like another file, but with different content.
 
-There is even more content here.
\ No newline at end of file
+There is even more content here.
+
+Hello World
\ No newline at end of file
Stage this hunk [y/n/a/d/s/?]?
Posted: 2008-04-08