Friday September 07, 2007
I’ve been watching CouchDb for a while, but it wasn’t until it recently changed its transport format from XML to JSON that I got really interested in doing something with it, and apparently I’m not alone in that.
One of the things I’m doing with it is a library called CouchObject. Among other things, it lets you serialize arbitrary Ruby objects to and from CouchDb JSON documents by including a module and defining a few methods on your class:
class Bike
  include CouchObject::Persistable
  def initialize(wheels)
    @wheels = wheels
  end
  attr_accessor :wheels
  def to_couch
    {:wheels => @wheels}
  end
  def self.from_couch(attributes)
    new(attributes["wheels"])
  end
end
The #to_couch method describes how we want an instance’s attributes serialized as a document in the CouchDb database:
{
  "_id": "6FA2AFB684A93ECE77DEAAF52BB02565",
  "_rev": 1745167971,
  "attributes": {
    "wheels": 4
  },
  "class": "Bike"
}
The return value of #to_couch is stored under the attributes key, and the object’s class name is stored under the class key for querying purposes (_id and _rev are CouchDb document attributes).
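To make that envelope concrete, here is a minimal sketch that runs without a CouchDb server. It is not CouchObject’s actual implementation: the EnvelopeSketch module and its dump and load methods are hypothetical names invented for this example, and Bike is redefined without the Persistable include so the snippet stands alone.

```ruby
require "json"

# Hypothetical helpers for illustration only, not part of CouchObject.
module EnvelopeSketch
  # Wrap the object's #to_couch hash in the "class"/"attributes" layout.
  def self.dump(object)
    JSON.generate("class" => object.class.name, "attributes" => object.to_couch)
  end

  # Look up the class named in the document and hand it the attributes hash.
  def self.load(json)
    document = JSON.parse(json)
    klass = Object.const_get(document["class"])
    klass.from_couch(document["attributes"])
  end
end

class Bike
  attr_accessor :wheels

  def initialize(wheels)
    @wheels = wheels
  end

  def to_couch
    { :wheels => @wheels }
  end

  # Keys arrive as strings because they have been through JSON.
  def self.from_couch(attributes)
    new(attributes["wheels"])
  end
end

json = EnvelopeSketch.dump(Bike.new(4))
bike = EnvelopeSketch.load(json)
puts bike.wheels
```

Note that the symbol key :wheels comes back as the string "wheels" after the JSON round trip, which is why from_couch reads attributes["wheels"].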
The from_couch class method describes how to set up a new Bike object loaded from the database. Its attributes parameter is the value of the attributes key from the CouchDb document. In this case we just instantiate a new Bike with a number of wheels:
>> bike_4wd = Bike.new(4)
=> #<Bike:0x6a0a68 @wheels=4>
>> bike_4wd.save("couchobject")
=> {"_rev"=>1745167971, "_id"=>"6FA2AFB623A93E0E77DEAAF59BB02565", "ok"=>true}
>> bike = Bike.get_by_id("couchobject", bike_4wd.id)
=> #<Bike:0x64846c @wheels=4>
As I started on this last night there are still lots of little things to add, like better server and database semantics (in the above #save call, the argument is the database name and the host is hardcoded for now; not pretty).
Another thing I’ve been thinking about doing is a more formal way to describe “models”, something along the lines of the DataMapper pattern perhaps, but we’ll see if I actually need it once I’ve given the Persistable module some more features.
Update: I’ve uploaded the Git repository here. I want to add a few things before I do a release.
Sep 07 at 13:56
Excellent – I’ve been hoping for something a bit more robust for a while, but not got round to writing anything.
Is the source available anywhere? I’d love to have a play around with it – I’d especially like to see if the to_couch and from_couch methods could be dropped in standard use cases.
Sep 07 at 14:38
It’ll be up on RubyForge soon, hopefully sometime after the weekend.
My first approach was actually to just copy the instance variables in and out.
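A minimal sketch of what that instance-variable approach could look like, assuming every instance variable should be persisted. The AutoAttributes module name is made up for this example and is not part of CouchObject:

```ruby
# Hypothetical mixin for illustration only; not CouchObject code.
module AutoAttributes
  def self.included(base)
    base.extend(ClassMethods)
  end

  # Copy every instance variable into a hash, dropping the leading "@".
  def to_couch
    instance_variables.each_with_object({}) do |ivar, attributes|
      attributes[ivar.to_s.sub(/\A@/, "")] = instance_variable_get(ivar)
    end
  end

  module ClassMethods
    # Build the object without calling #initialize, then set each variable.
    def from_couch(attributes)
      object = allocate
      attributes.each do |name, value|
        object.instance_variable_set("@#{name}", value)
      end
      object
    end
  end
end

class Bike
  include AutoAttributes
  attr_accessor :wheels, :color

  def initialize(wheels, color)
    @wheels = wheels
    @color = color
  end
end

attributes = Bike.new(2, "red").to_couch
copy = Bike.from_couch(attributes)
puts copy.wheels
puts copy.color
```

This works for simple value objects, but it bypasses #initialize and stores everything, including caches or other state you may not want in the document.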
Sep 07 at 18:44
Could the wheels attribute be put at the same level as the class, _id, and _rev attributes instead of in an “attributes” field? Is this a convention, something forced by CouchDb, or what? I believe it only adds unnecessary clutter to the data structure.
Sep 09 at 02:27
Does it make sense to have an ActiveRecord adapter for CouchDB?
Sep 09 at 20:22
I’ve been thinking about ActiveRecord and CouchDB, maybe ActiveCouch. :)
It’s perhaps not the best fit. A Rails-like storage model is a great idea, but because CouchDB doesn’t have set schemas, we need to define that in our model. Perhaps an AR-style definition of the fields would be a good addition to this lib.
Sep 10 at 06:01
@Dado: not enforced by CouchDb at all, I just think it’s nice to separate the metadata from the actual object data.
@Dan/rabble: Yeah, after working a bit with the approach from the post here, I find that I need, or want rather, a more formal and descriptive model of my data, since my current wish isn’t really to store arbitrary Ruby objects in CouchDb, but rather a domain-specific set of objects.
I don’t think the ActiveRecord pattern at its core maps too well to CouchDb’s loose (schemaless) structure. But I’d certainly want to do something along these lines:
Sep 10 at 09:24
I’m not fully up to speed with Couch, but I’ve been interested in the query side of it, which uses JavaScript constructs to declare the map functions. Could it be modelled in Ruby in the same way with a block, and then the block serialised to a JavaScript construct? Does that make sense?
Sep 10 at 09:32
There’s ruby2js, though I’ve never used it. But more interesting is the fact that it looks fairly easy to change the query engine in CouchDb (it’s essentially shelling out to SpiderMonkey right now).
Next on my list is obviously to try and make the query engine use Ruby instead of JavaScript :)
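Just to give a feel for what a map function written as a Ruby block might look like, here is a toy, purely in-memory sketch. It does not talk to CouchDb and says nothing about how its query engine really works; run_view and the emit callback are hypothetical names chosen for this example.

```ruby
# Sample documents in the "class"/"attributes" layout from the post.
documents = [
  { "class" => "Bike", "attributes" => { "wheels" => 2 } },
  { "class" => "Car",  "attributes" => { "wheels" => 4 } },
  { "class" => "Bike", "attributes" => { "wheels" => 3 } }
]

# Hypothetical helper: call the block once per document, collect whatever
# it emits as [key, value] rows, and return the rows sorted by key.
def run_view(documents)
  rows = []
  emit = lambda { |key, value| rows << [key, value] }
  documents.each { |doc| yield doc, emit }
  rows.sort_by { |key, _value| key }
end

# The "map function": emit only Bike documents, keyed by wheel count.
bike_rows = run_view(documents) do |doc, emit|
  emit.call(doc["attributes"]["wheels"], doc["class"]) if doc["class"] == "Bike"
end

bike_rows.each { |key, value| puts "#{key} #{value}" }
```

The class key is what makes this kind of per-class filtering possible, which is the “querying purposes” the post mentions.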
Sep 10 at 15:43
@rabble: Could the schema be derived from db/schema.rb in the rails app rather than the models?
Sep 13 at 16:57
Or perhaps a CouchDB document for CouchDB documents including the schema description, using CouchDB to describe itself (sort of).
Sep 14 at 10:50
If you are interested in existing implementations of formal schema definitions and in data modelling with pure dynamic objects, there have been a lot of different projects within the Python community.
In Django the models contain the schema definition directly – they’ve experimented with schema inheritance but I think that’s on-hold since they have an ORM to deal with:
http://code.djangoproject.com/wiki/ModelInheritance
In the Zope and Plone world we have been publishing persistent dynamic objects to the web for a long time using the Zope Object Database (ZODB). This is very similar to the method used by Gemstone – implementation details are of course quite different, but the core concept is the same. Plone developed Archetypes, which uses multiple inheritance to do schema inheritance, so mix-in style schemas are possible. Archetypes does a good job, but like Django, Archetypes tightly couples Widget objects by embedding them within the schema, making code reuse hard. It has its other warts too:
http://plone.org/documentation/tutorial/borg/to-archetype-or-not-to-archetype
When the core Zope developers did the whole let’s-start-over-from-scratch thing after they had been working on Zope 2 for a long time, it took them a lot more years to produce Zope 3. The zope.interface and zope.schema packages in Zope 3 provide a very formal way of specifying both APIs and Schemas respectively. These are very well written packages. Schemas are considered an aspect of your API, since in the world of objects the two are tightly linked. Interfaces are just objects though, and your model declares that it implements specific schemas. This is a much more pleasant way of doing it, IMO.
http://pypi.python.org/pypi/zope.schema/
Except of course Zope 3 requires a great deal of explicit configuration in the form of XML, which isn’t always the most fun stuff to write. Recently there has been a movement to create a way of working with Zope 3 that uses a lot of the same ideas where Ruby on Rails did a lot of innovation, such as convention over configuration. This project is called Grok and it makes Zope 3 a heck of a lot more fun to play with. It can also give you a glimpse of what Ruby on Rails might be like if it used an OODB:
http://grok.zope.org
Sep 15 at 06:52
@Maraby: That strikes me as throwing away the benefit of having schema-free storage. It makes much more sense (to me anyway) to define fields in one’s model – you know, close to the validation rules and other smartness that go with the object.
In fact, it seems to me that CouchDb makes it far easier to embrace ruby’s dynamism, since neither ruby nor couch really cares what you store in your attributes. The default behavior should be to just store your data and get on with it. If you want specific behavior, ruby already has many excellent ways of doing that, like actually defining the setter/getter methods with specific code, validation macros, type-casting macros etc etc.
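As a rough illustration of that last point, a hand-written setter can do the type-casting and validation with no schema involved. This is only a sketch; the rule that wheels must be a positive integer is an assumption made up for the example.

```ruby
class Bike
  attr_reader :wheels

  # Cast the incoming value to an Integer and reject non-positive counts.
  # Integer() raises ArgumentError for strings that are not numbers.
  def wheels=(value)
    count = Integer(value)
    raise ArgumentError, "wheels must be positive" unless count > 0
    @wheels = count
  end
end

bike = Bike.new
bike.wheels = "4" # e.g. a value that arrived as a string
puts bike.wheels
```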
Sep 21 at 15:21
Looks great, but why not use yaml instead of manually creating the to/from methods?
Sep 21 at 16:16
Well, the idea was that there’s no general way of knowing exactly how any particular object should map its attributes (could be into accessors, methods, class/instance/local variables etc etc), hence the mapping methods.