Tuesday, November 27, 2012

Why Not To Use My Library clj-record

In 2008 I discovered Clojure, and I was extremely excited to finally have an easy entry-way to Lisp. Pretty early on I wanted to get my head around one of Lisp's killer features: macros. A little playing around in the REPL was easy but not satisfying: I wanted to build something powerful.

Having spent a few years doing Ruby on Rails, I thought ActiveRecord's macro-like capabilities (has_many and whatnot) would be a good goal. So I built clj-record.

In retrospect, I went too far trying to build in Rails-like magic. I want to point out the design flaws so that you can avoid them in your own work, and to recommend that you choose something other than clj-record for your Clojure projects that use an RDBMS.

The API of clj-record is all in one macro: clj-record.core/init-model. It commits a few macro sins.

  • It looks at the namespace from which you call it to find the name of your model.
  • It intentionally captures a db var from your namespace.
  • It defs a bunch of fns in your namespace (most of which are just partially applied versions of clj-record.core fns).

So you do this:
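Something along these lines (a sketch rather than the exact clj-record syntax; the namespace, db settings, and association names are illustrative):

    (ns my.models.widget
      (:require clj-record.boot))

    ;; init-model captures this var from the calling namespace
    (def db {:classname   "org.postgresql.Driver"
             :subprotocol "postgresql"
             :subname     "//localhost/my_app"})

    ;; the model name ("widget") is inferred from the namespace
    (clj-record.core/init-model
      (:associations
        (belongs-to manufacturer)
        (has-many parts)))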

and that turns into something like this (leaving out the ns):
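A simplified sketch rather than the literal macroexpansion:

    ;; model metadata (db, columns, associations) goes into clj-record's
    ;; internal registry, keyed by the model name "widget"...
    ;; ...and a pile of partially applied fns gets def'd into this namespace:
    (def find-records   (partial clj-record.core/find-records "widget"))
    (def get-record     (partial clj-record.core/get-record "widget"))
    (def create         (partial clj-record.core/create "widget"))
    (def insert         (partial clj-record.core/insert "widget"))
    (def destroy-record (partial clj-record.core/destroy-record "widget"))
    ;; ...and so on for the rest of clj-record.core's model and association fns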

This provided really good practice (and an example I've gone back to more than once) for quoting, syntax quoting, unquoting, and unquote-splicing.

But this sucks!

What init-model should do is return a data-structure that you then pass to clj-record's find, insert, and update fns. In addition to being simpler to understand than the hidden registry of model metadata that clj-record has to maintain, it would make it more natural to write generic, composable code, since the model itself would be a value.

If that were all init-model needed to do, the code (both inside clj-record and user code) would probably be even simpler if there were no macro at all. A set of functions could enhance the model data-structure as needed so that even model setup code could be composed like any other series of fn-calls.
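Something like this, for example (the model, belongs-to, and has-many builder fns here are hypothetical; clj-record never grew this API):

    ;; hypothetical fn-based API: the model is just a value
    (def widget
      (-> (model :widget db)
          (belongs-to :manufacturer)
          (has-many :parts)))

    ;; generic fns then take the model value explicitly
    (clj-record.core/find-records widget {:name "sprocket"})
    (clj-record.core/insert widget {:name "sprocket" :manufacturer-id 1})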

Maybe it would be worthwhile having a macro that went something like this:
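Again purely hypothetical, since none of these names exist in clj-record:

    ;; hypothetical macro: its only job is an obvious def
    (defmodel widget db
      (belongs-to :manufacturer)
      (has-many :parts))

    ;; ...which would expand to nothing more mysterious than
    (def widget
      (-> (model :widget db)
          (belongs-to :manufacturer)
          (has-many :parts)))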

This macro would do a def, but it's completely intuitive what that def is. The third "sin" on my list above was about defining a bunch of fns in your namespace. I call this a sin because it's not clear from reading the code that there are a bunch of vars defined, which makes both discovery and debugging more difficult. There may be cases where "hidden defs" are justified by performance gains or code reduction, but the fns defined by init-model don't pass that test: (clj-record.core/find-records widget attributes) is no worse than (my.model.widget/find-records attributes), and it composes more easily with other generic model-processing fns.

If no one else were writing libraries, I'd try to remedy this in the next major revision of clj-record, but someone else has done this work (or something very much like it) already. Chris Granger wrote korma, and it offers similar functionality in a saner way. I'm also not using Clojure with a SQL database, so I wouldn't be able to eat my own dogfood.

If you're modeling entities in a relational database, I'd recommend korma over clj-record. There may be even better options for you (including just using clojure.java.jdbc directly), so look around.

If you're writing macros, don't capture vars, don't make their functionality depend on globals like *ns*, and consider alternatives to creating vars in the caller's namespace. While you're at it, consider whether the job would be better done with plain old functions.

Wednesday, November 14, 2012

Using Aleph's Asynchronous HTTP Client for OAuth

If you're implementing authentication in your web app on the server-side using some third party's OAuth, you ideally want to dedicate minimal system resources to waiting for HTTP responses. If you're working in Clojure, you can do this using Aleph's asynchronous http client. Here I'll walk through using it to do server-side authentication with Facebook.

[There's nothing auth-specific about this. Any integration with HTTP end points might benefit from similar treatment.]

Here's what our application needs to do:

  1. Send the user to Facebook's OAuth "dialog," passing along a redirect_uri parameter. On the dialog they'll see what permissions your app wants and click a button to say they want to log in. Facebook will redirect them back to your redirect_uri with a code parameter in the query string.
  2. Make a server-side request for an access-token for the user, passing along the code parameter and your own redirect_uri.
  3. Make a second server-side request for whatever information the app needs (e.g., the user's name and email address), passing along the access-token.

We can't do anything for the user between steps two and three, so we'll perform them one after another. It's easy to picture what that might look like if we aren't worried about tying up a thread waiting for responses.
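A synchronous sketch might look something like this (get-access-token, get-user-data, and combine-results are hypothetical helpers standing in for the HTTP calls and response parsing):

    ;; hypothetical synchronous version: each call blocks until the response arrives
    (defn authenticate [code]
      (let [token-data (get-access-token code)                       ; first HTTP request
            user-data  (get-user-data (:access-token token-data))]   ; second HTTP request
        (combine-results token-data user-data)))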

Ideally we'd like an asynchronous implementation to read just as clearly.

We'll start with the easiest implementation to understand, which uses on-realized to register success and error callbacks on each HTTP request. The fn now takes success and error callbacks. In an aleph web application, those would each enqueue a ring-style response map onto the response channel.
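A sketch of that shape (access-token-request, user-data-request, parse-token-response, parse-user-response, and combine-results are hypothetical helpers that build the Facebook request maps and parse the responses):

    (require '[aleph.http :refer [http-request]]
             '[lamina.core :refer [on-realized]])

    ;; nested callbacks: each request registers success/error handlers on the
    ;; result-channel returned by http-request
    (defn authenticate [code success-callback error-callback]
      (on-realized
        (http-request (access-token-request code))
        (fn [response]
          (let [token-data (parse-token-response response)]
            (on-realized
              (http-request (user-data-request (:access-token token-data)))
              (fn [response]
                (success-callback (combine-results token-data
                                                   (parse-user-response response))))
              error-callback)))
        error-callback))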

Unfortunately this reads terribly due to the nested callbacks.

We can use lamina's run-pipeline macro to create a cleaner version of the same thing.
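The same flow as a pipeline (same hypothetical helpers as above):

    (require '[lamina.core :refer [run-pipeline]])

    ;; run-pipeline threads each realized value through the stages;
    ;; when a stage returns a result-channel, the pipeline waits for it
    (defn authenticate [code success-callback error-callback]
      (run-pipeline
        (http-request (access-token-request code))
        {:error-handler error-callback}
        parse-token-response
        (fn [token-data]
          (http-request (user-data-request (:access-token token-data))))
        parse-user-response
        success-callback))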

At this point, however, we realize that we're missing something. In addition to the user data we get from the second HTTP request, we want the "fb-user" passed to the success callback to have the access-token and access-token-expiration we got in response to the first HTTP request.

We can achieve that by breaking the pipeline into two and holding onto the result of the first so that we can refer to it again later (in the call to combine-results).
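Roughly like so (still the same hypothetical helpers):

    ;; two pipelines: the second starts from the first's result-channel and
    ;; derefs it again at the end to get the token data back
    (defn authenticate [code success-callback error-callback]
      (let [access-token-result
            (run-pipeline
              (http-request (access-token-request code))
              parse-token-response)]
        (run-pipeline
          access-token-result
          {:error-handler error-callback}
          (fn [token-data]
            (http-request (user-data-request (:access-token token-data))))
          parse-user-response
          (fn [user-data]
            (success-callback (combine-results @access-token-result user-data))))))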

Note the deref ("@") in that second reference to access-token-result. It's needed because the pipeline returns a result-channel which may not yet be realized. We don't have to worry about the deref blocking, since the value is guaranteed to be realized before the second pipeline will proceed past the initial value, but it looks like something that might block. The whole thing's also a bit sloppy. Zack Tellman pointed out that it's a bit cleaner to split them like so.

The deref is no longer needed because the nested fn will be called with the realized value once it's ready.

If we extract the details so that we have the same level of abstraction we had in the initial (synchronous) version, we end up with this.
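Something like the following, assuming the two requests get pulled out into get-fb-access-token and get-fb-user-data fns (names taken from the discussion below; the request and parsing helpers are still hypothetical):

    ;; each "get-fb-" fn wraps one HTTP request in its own pipeline; the empty
    ;; :error-handler just suppresses lamina's unhandled-exception logging,
    ;; since the outer pipeline handles errors
    (defn get-fb-access-token [code]
      (run-pipeline
        (http-request (access-token-request code))
        {:error-handler (fn [_])}
        parse-token-response))

    (defn get-fb-user-data [access-token]
      (run-pipeline
        (http-request (user-data-request access-token))
        {:error-handler (fn [_])}
        parse-user-response))

    (defn authenticate [code success-callback error-callback]
      (run-pipeline
        (get-fb-access-token code)
        {:error-handler error-callback}
        (fn [token-data]
          (run-pipeline
            (get-fb-user-data (:access-token token-data))
            (fn [user-data] (combine-results token-data user-data))))
        success-callback))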

Comparing that last fn to the synchronous version, we come out looking pretty good here! The only thing I find a little awkward is that the options to run-pipeline are the second argument. It breaks up the flow when the first "value" is just a call to another pipeline.

Assorted Thoughts

Reusing the connection

Aleph's http-client fn lets you reuse one connection for multiple requests to the same host. Reusing a connection mucks the code up a bit, though, since you have to make sure the connection gets closed at the end.
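A rough sketch of how that might look, assuming the client fn gets passed down into the "get-fb-" fns in place of http-request, and assuming lamina.connections/close-connection is the way to shut the client down (both are assumptions about the aleph/lamina versions in use):

    (require '[aleph.http :refer [http-client]]
             '[lamina.connections :refer [close-connection]])

    ;; one client per authentication, closed on both the success and error paths
    (defn authenticate [code success-callback error-callback]
      (let [client (http-client {:url "https://graph.facebook.com"})]
        (run-pipeline
          (get-fb-access-token client code)
          {:error-handler (fn [error]
                            (close-connection client)
                            (error-callback error))}
          (fn [token-data]
            (run-pipeline
              (get-fb-user-data client (:access-token token-data))
              (fn [user-data] (combine-results token-data user-data))))
          (fn [fb-user]
            (close-connection client)
            (success-callback fb-user)))))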

Not being able to just wrap a (try ... (finally ...)) around the whole thing is a bummer, but it's still not awful. Zach has said there will likely soon be a cleaner way to do "finally" in pipelines.

The :error-handler

The empty :error-handler fns on the "get-fb-" fns suppress "unhandled exception" logging from lamina. Since both of those pipelines run as part of the outer pipeline, an error in either one triggers the outer pipeline's error-handler, so that's the only place we need to use the error-callback. In my real app, those inner error-handlers just log, but you could just as easily leave the options out completely if you're ok with lamina's logging.

The rest

I've left out many details, like the FB-specific URL-templates and response parsing. If there's interest, I'm happy to share those. I just thought the lamina/aleph stuff was the interesting part.

Monday, November 05, 2012

How to Handle File Uploads from Clojure on Heroku (or anywhere else)

I recently wanted to deploy a Clojure web app that allowed users to upload a CSV and work with the data. Heroku is as easy as it gets for web app deployment, but I figured I'd have to jump through some extra hoops to do the file upload. It turns out there are no hoops. Because each Heroku dyno has a writable (though ephemeral) filesystem, we can handle file uploads just as we would on any other system, so long as we either process the file immediately or send it someplace else to be stored.

Here's how to handle a file upload in a very simple ring app.
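A sketch of what that can look like (render-form and render-csv-data are placeholders, as noted below, and "file" is whatever name the form's file input uses):

    (ns upload-example.core
      (:require [ring.adapter.jetty :refer [run-jetty]]
                [ring.middleware.multipart-params :refer [wrap-multipart-params]]))

    (declare render-form render-csv-data)

    ;; wrap-multipart-params writes the upload to a temp file on the (ephemeral)
    ;; filesystem and puts {:filename ... :tempfile ...} into :params
    (defn handler [{:keys [request-method params]}]
      (if (= :post request-method)
        (let [{:keys [filename tempfile]} (get params "file")]
          {:status  200
           :headers {"Content-Type" "text/html"}
           :body    (render-csv-data filename (slurp tempfile))})
        {:status  200
         :headers {"Content-Type" "text/html"}
         :body    (render-form)}))

    (def app (wrap-multipart-params handler))

    (defn -main []
      (run-jetty app {:port (Integer/parseInt (or (System/getenv "PORT") "8080"))}))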

The 'render-' fns are left to you!