Implementing Burt Beckwith’s GORM Performance – No Collections

Posted on February 7, 2011. Filed under: grails

No More GORM/Hibernate Collections?

Burt Beckwith (http://burtbeckwith.com/) gave a talk on improving GORM performance in Grails. His presentation is available at http://www.infoq.com/presentations/GORM-Performance.

One of his points was to avoid collections on domain objects, because a collection may need to be fully loaded from the database before an object can be added to it.

I decided to try this on one of my work projects, which has 18 domain classes. I ran into a few problems, scratched my head for a few hours, and then fixed them. It took about half a day to convert the project from a “hasMany/belongsTo” style to a no-collections style. Below is a very simplified domain that uses a typical GORM one-to-many cascading relationship.

Sample Shopping Cart Application

Let’s start with sample code for a shopping cart. We have a Cart object that contains many Item objects.

Typical GORM Relationship

grails-app\domain\shopping\Cart.groovy

package shopping

class Cart {

	String name

	static hasMany = [ items : Item ]

	static mapping = {
		sort "name"
		items sort:"name"
	}

    static constraints = {
    	name nullable:false, blank:false, unique:true
    }

    String toString() {
    	"Cart $name"
    }
}

grails-app\domain\shopping\Item.groovy

package shopping

class Item {

	String name
	Integer quantity

	static belongsTo = [ cart : Cart ]

	static mapping = {
		sort "name"
	}

    static constraints = {
    	name nullable:false, blank:false, unique:['cart']
    	quantity nullable:false
    }

    String toString() {
    	"$quantity $name"
    }
}

grails-app\conf\BootStrap.groovy

import shopping.*

class BootStrap {

    def init = { servletContext ->
    
    	def cart = new Cart(name:"alpha").save()
    	cart.addToItems new Item(name:"apples", quantity:10)
    	cart.addToItems new Item(name:"milk", quantity:2)
    	cart.addToItems new Item(name:"bread", quantity:3)
    	cart.save()
    }

    def destroy = {
    }
}

Removing hasMany and belongsTo

Now, following Burt’s advice, we remove the hasMany and belongsTo relationship and put a reference to the parent object in the child.

Now our code looks like this:

BelongsTo and HasMany Removed

grails-app\domain\shopping\Cart.groovy

package shopping

class Cart {
	
	String name
	
	static mapping = {
		sort "name"
	}
	
    static constraints = {
    	name nullable:false, blank:false, unique:true
    }
    
    String toString() {
    	"Cart $name"
    }
}

grails-app\domain\shopping\Item.groovy

package shopping

class Item {
	
	String name
	Integer quantity
	Cart cart
	
	static mapping = {
		sort "name"
	}
	
    static constraints = {
    	name nullable:false, blank:false, unique:['cart']
    	quantity nullable:false
    }
    
    String toString() {
    	"$quantity $name"
    }
}

grails-app\conf\BootStrap.groovy

import shopping.*

class BootStrap {

    def init = { servletContext ->
    	def cart = new Cart(name:"alpha").save()
    	new Item(cart:cart, name:"apples", quantity:10).save()
    	new Item(cart:cart, name:"milk", quantity:2).save()
    	new Item(cart:cart, name:"bread", quantity:3).save()
    }
    
    def destroy = {
    }
}

Here’s what changed.

in Cart.groovy

  • removed static hasMany = [ items : Item ]
  • removed items sort:"name" from mapping

in Item.groovy

  • removed belongsTo
  • added Cart cart

in BootStrap.groovy

  • Changed the way we created child objects.
    • from: cart.addToItems new Item(name:"apples", quantity:10)
    • to: new Item(cart:cart, name:"apples", quantity:10).save()

New Problems and Solutions

Are we done? Of course not – it’s not quite that easy. There were a few other problems that needed to be addressed:

  1. No longer have easy access to the collection of child items.
  2. Sorting of collections (defined in mapping) no longer takes place.
  3. Can’t delete the parent object because of foreign-key relationships.
  4. Scaffolding no longer shows the child objects.

Let’s solve these, one at a time.

No longer have easy access to the collection of child items

This is easily fixed. We just add a new method to the parent object (Cart) that returns the collection of child objects (Item) that are in the cart:

grails-app\domain\shopping\Cart.groovy

    ....
    def getItems() {
        Item.findAllByCart(this)
    }
    ....

Sorting of collections (defined in mapping) no longer takes place.

This is also easy. We just need a slight modification to the code above. We can add a sort parameter to the findAllBy* method.

grails-app\domain\shopping\Cart.groovy

    ....
    def getItems() {
        Item.findAllByCart(this, [sort:"name"])
    }
    ....

Now, we can access the cart’s items just like we did when we had a hasMany relationship.

    println cart.items
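
The property-style access works because Groovy resolves cart.items to a call to getItems() when the class has no items field. Here is a minimal plain-Groovy sketch of that mechanism. It runs without Grails: ALL_ITEMS is a hypothetical in-memory stand-in for the Item table, and the filter inside getItems() stands in for the GORM finder Item.findAllByCart.

```groovy
// Plain-Groovy sketch: no Grails or database involved.
// ALL_ITEMS is a hypothetical in-memory stand-in for the Item table.
class Item {
    String name
    Integer quantity
    Cart cart
    static List<Item> ALL_ITEMS = []

    String toString() { "$quantity $name" }
}

class Cart {
    String name

    // Stand-in for Item.findAllByCart(this, [sort: "name"])
    List<Item> getItems() {
        Item.ALL_ITEMS.findAll { it.cart.is(this) }.sort { it.name }
    }
}

def cart = new Cart(name: "alpha")
Item.ALL_ITEMS << new Item(cart: cart, name: "milk", quantity: 2)
Item.ALL_ITEMS << new Item(cart: cart, name: "apples", quantity: 10)

// Cart has no 'items' field, so Groovy calls getItems() here
println cart.items
```

Note that every access to cart.items runs the lookup again, so in real code it is worth holding the result in a local variable when it is used more than once.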

Can’t delete the parent object because of foreign-key relationships.

This was the head-scratcher, not because it’s difficult, but because it took a little research to find an elegant solution.

The problem is that cart.delete() will fail if there are items in the cart. The old hasMany / belongsTo mapping would automatically cascade the delete to the items and then delete the cart. We don’t have that now, so we must do it manually.
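
To make the ordering problem concrete, here is a small plain-Groovy model of it. This is only a sketch: Db is a hypothetical in-memory stand-in for the database (it is not a GORM API), and the exception it throws merely imitates a foreign-key constraint violation.

```groovy
// Plain-Groovy model of the foreign-key problem; Db is a hypothetical
// in-memory stand-in for the database, not a GORM API.
class Db {
    static List<Map> carts = []
    static List<Map> items = []

    // Mimics a foreign-key constraint: refuse while child rows exist
    static void deleteCart(Map cart) {
        if (items.any { it.cart.is(cart) }) {
            throw new IllegalStateException("items still reference cart ${cart.name}")
        }
        carts.remove(cart)
    }

    // Manual cascade: children first, then the parent
    static void deleteCartWithItems(Map cart) {
        items.removeAll { it.cart.is(cart) }
        deleteCart(cart)
    }
}

def cart = [name: "alpha"]
Db.carts << cart
Db.items << [name: "apples", cart: cart]

def failed = false
try {
    Db.deleteCart(cart)          // fails: an item still points at the cart
} catch (IllegalStateException e) {
    failed = true
}

Db.deleteCartWithItems(cart)     // succeeds: the items are removed first
```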

The best solution that I know of (let me know if there is a better one) is to use the beforeDelete event on the Cart object.

Before an object is deleted, GORM fires the beforeDelete event. You can place whatever code you need in the handler. In our case, we will delete the child Item objects. Take a look at the documentation for beforeDelete.

Let’s update the Cart.groovy code and add the beforeDelete method.

grails-app\domain\shopping\Cart.groovy

    ....
    def beforeDelete() {
        Item.withNewSession { items*.delete() }	
    }
    ....

(This is what I love about groovy – one line of code does so much work!)

First, we create a new Hibernate session. If you read the docs for beforeDelete, you know that (in this case) you should use a new session for the delete in order to avoid StackOverflow exceptions.

Next, we delete all of the child items. We use Groovy’s spread operator (*.) to call a method on every item in a list.

So, with that line of code, all child objects (Item) will be deleted when the parent object (Cart) is deleted.
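
If the spread operator is new to you, this plain-Groovy sketch shows what items*.delete() expands to. The Item class here is a hypothetical stand-in whose delete() only records that it was called; it does not touch a database.

```groovy
// Plain-Groovy sketch of the spread operator (*.); Item here is a
// hypothetical stand-in whose delete() only records that it was called.
class Item {
    String name
    boolean deleted = false
    void delete() { deleted = true }
}

def items = [new Item(name: "apples"), new Item(name: "milk")]

// items*.delete() calls delete() on every element, like:
//   items.each { it.delete() }
items*.delete()

// The spread operator also collects results: one value per element
def names = items*.name
```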

Scaffolding no longer shows the child objects.

Since the parent object no longer knows about the child objects, the default Grails scaffolding will no longer display the items in a cart. I don’t know a way around this, other than not using scaffolding.

Final Cart and Item Code

grails-app\domain\shopping\Cart.groovy

package shopping

class Cart {
	
	String name

	static mapping = {
		sort "name"
	}

	static constraints = {
		name nullable:false, blank:false, unique:true
	}

	String toString() {
		"Cart $name"
	}

	def getItems() {
		Item.findAllByCart(this, [sort:"name"])
	}

	def beforeDelete() {
		Item.withNewSession { items*.delete() }	
	}
    
}

grails-app\domain\shopping\Item.groovy

package shopping

class Item {

	String name
	Integer quantity
	Cart cart

	static mapping = {
		sort "name"
	}

	static constraints = {
		name nullable:false, blank:false, unique:['cart']
		quantity nullable:false
	}

	String toString() {
		"$quantity $name"
	}
}

My Opinions

Implementation Difficulty

Once I worked out how to implement this solution, applying it to my domain objects was pretty simple. Remember that you may need to update any code that uses the addTo* and removeFrom* methods.

Complexity / Maintenance

Does this solution make my apps more complex to understand? I don’t think so. An experienced Groovy person can understand the cart.getItems() method. The beforeDelete handler takes a little time to understand, but it is quickly learned.

There is also complexity in using belongsTo / hasMany. See GORM Gotchas Part 2. I don’t think this solution is any more complex than using belongsTo / hasMany.

Performance

I don’t know yet. I don’t have any real-world apps in which loading child objects has been a performance problem.

Would I Do It Again

Definitely. I had a project fall apart because it had more than 20 objects and more than 30 relationships between them. Hibernate and GORM were causing various exceptions, and at the time I was too inexperienced to know how to solve the problems. I am considering attempting that project again, using this “no collections” technique, to see if I can finish it. I have a good feeling that this method will work, but I need to prove it.

23 Responses to “Implementing Burt Beckwith’s GORM Performance – No Collections”

RSS Feed for Mr Paul Woods's Weblog Comments RSS Feed

How can you do items*.delete() when “items” has not been defined?

When groovy sees the .items field, it doesn’t find it. So, it looks for the getter method (getItems()). It finds it and calls it.

See http://groovy.codehaus.org/Groovy+Beans

Excellent run-through of what to do with that talk that Burt gave. Thanks for posting.

Great post. I saw Burt’s presentation when it came out and he’s had me thinking about this ever since.

Like yourself, I may go down this route due to the potential performance problems. But it feels weird to abandon such a core part of GORM/Grails (the hasMany/belongsTo). I’m surprised this wasn’t discovered years ago and handled in a major release.

But this post is very helpful in being very detailed about the before and after of making these changes, some of the challenges faced, and the solutions to those new challenges.

Very cool. Thanks.

Michael.

Thanks.

I agree. It is strange to stop using hasMany/belongsTo. I believe that most programs can use hasMany / belongsTo with no problem at all.

I wonder if this used to be a standard technique (in the old days) that was dropped when ORMs became popular, and now is coming back into style.

Paul

Excellent, thanks a lot for sharing.

I think this implementation is easier than using belongsTo and hasMany (given the reasons in the GORM Gotchas link) and I think it is more intuitive.

Cheers!

Thanks,

I agree. I like having the relationship explicitly defined, without the cascading-delete & foreign key relationship implied.

Paul

Excellent post. I think logging the SQL queries that get generated in each case would help even more to understand the real gain from not using collections.

Regarding performance, as Burt commented during his talk, this should only be considered when we can foresee huge volumes of data; therefore it’s a design decision: either you’re in a hurry and you know your data is not going to grow in a way that OOTB collections will shoot you in the foot, or you take some more time to avoid OOTB collections where needed and rest in peace 🙂

From Burt’s perspective, he does this for the performance improvement.

My programs don’t suffer from the GORM collection performance problem. They suffer from relationships between large numbers of objects.

So far (I’ve used this on only two projects), my domains are far more stable and understandable. I’m not getting spurious HibernateExceptions like I was before.

Thanks
Paul

Very useful post. Thanks for sharing this.

Excellent post, but would declaring my attributes as lazy be a solution?

Sorry for my English.

Eager and lazy loading are performance-improving solutions. This technique breaks relationships between database tables.

Using the beforeDelete technique could be a problem. If there is an error when deleting the cart, it will not be deleted, but its items will be.

True, but if you’re deleting multiple objects (especially across more than one domain class), that code should be in a transaction.

Yes, I agree, but then we would have to write code that deletes a Cart and its Items inside a transaction, and we would lose the cascade behavior, since instead of calling delete on Cart we would need to call the method that deals with the transaction.
After I posted here, I tried to overcome the problem with some groovy “magic”, and I came up with this:

class DeleteCascadeInterceptor implements ApplicationListener {
    GrailsApplication grailsApplication
    static final Log log = LogFactory.getLog(DeleteCascadeInterceptor)

    void register() {
        // Load all dynamic methods.
        // It is required to do this before overriding the methods because of
        // how HibernatePluginSupport deals with inheritance.
        grailsApplication.domainClasses.each { GrailsDomainClass domainClass ->
            log.debug("Loading dynamic methods: ${domainClass}")
            HibernatePluginSupport.initializeDomain(domainClass.clazz)
        }
        grailsApplication.domainClasses.each { GrailsDomainClass domainClass ->
            register(domainClass)
        }
    }

    void register(GrailsDomainClass domainClass) {
        Class clazz = domainClass.clazz
        MetaMethod cascadeDelete = clazz.metaClass.getMetaMethod('cascadeDelete', [Map] as Object[])

        if (cascadeDelete) {
            log.info("Overriding delete() method of ${domainClass}")
            MetaMethod delete = clazz.metaClass.getMetaMethod('delete', [] as Object[])
            clazz.metaClass.delete {->
                def obj = delegate
                clazz.withTransaction { TransactionStatus status ->
                    cascadeDelete.invoke(obj, [null] as Object[])
                    delete.invoke(obj)
                }
            }

            log.info("Overriding delete(Map) method of ${domainClass}")
            MetaMethod deleteMap = clazz.metaClass.getMetaMethod('delete', [Map] as Object[])
            clazz.metaClass.delete { Map map ->
                def obj = delegate
                clazz.withTransaction { TransactionStatus status ->
                    Object[] args = [map] as Object[]
                    cascadeDelete.invoke(obj, args)
                    deleteMap.invoke(obj, args)
                }
            }
        }
    }

    void onApplicationEvent(ApplicationEvent e) {
        if (e instanceof GrailsContextEvent && e.eventType == 0) {
            register()
        }
    }
}

What this class does is to check if a domain class has a cascadeDelete(Map) method declared, and if the method exists, it injects a new delete method in the domain class, which will call cascadeDelete and then the old delete method (the one injected by the hibernate plugin) inside a transaction.
To use it, just register it as a spring bean in resources.groovy
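
A registration sketch for grails-app/conf/spring/resources.groovy could look like the following. It assumes the class sits in the default package (otherwise an import is needed), and the bean name deleteCascadeInterceptor is an arbitrary, hypothetical choice.

```groovy
// grails-app/conf/spring/resources.groovy
beans = {
    // The bean name is arbitrary (an assumption, not from the comment above);
    // grailsApplication is wired in by reference to the standard Grails bean.
    deleteCascadeInterceptor(DeleteCascadeInterceptor) {
        grailsApplication = ref('grailsApplication')
    }
}
```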

Now, if we declare a method in Cart like this:

void cascadeDelete(Map params) {
    items*.delete()
}

We can call cart.delete() and all items will be deleted in the same transaction.
It works, but there is a problem: it only works if we call delete. If we do something like Cart.executeUpdate('delete from Cart'), cascadeDelete will not be called.
Anyway, it seems like the bag support coming in Grails 2.0 will address this performance problem with collections.

Great blog post.

One comment:

Item.withNewSession { items*.delete() } is highly inefficient (it issues n “delete from Item where id = ?” statements).

This should perform much better:

Item.withNewSession { Item.executeUpdate("delete Item where cart=:cart", [cart: this]) }

Yes, the executeUpdate is much better. I wonder if the withNewSession call will still be needed?

I suspect the “withNewSession” would not be needed, but have not tested (I will).

Have a look at https://github.com/bjornerik/sandbox

It is a Grails project, inspired by your post, that tries to get some numbers. I would say this is a good pattern to use in cases with potentially many items on the “many” end, like citizens in a country (my example) or, as in the video you posted, Acegi’s (now Spring Security) User *—* Role, where adding a very popular role takes a long time in a webapp with lots of users.

It won’t work if the hierarchy has more than two levels.
Say A has many B and B has many C. If you delete a B, you can get an integrity exception because its Cs still exist.

This seems like a lot of complexity to solve a problem that exists regardless of Grails/GORM/Hibernate/Toplink … the core issue is really around locking, blocking, and deadlocking and large transaction scopes. Think databases and semaphores.

You see, the paradigm that Mr. Beckwith leads us into is one where updates are issued to child records prior to updates [and thus locks] on the parent object. Right from the get-go, this is a recipe for disaster, regardless of the technology used, and is a great way to create deadlocks in any application. I’ve seen it both in databases and in threading.

So, let’s talk about how we can address deadlocks …

First, have a plan for how DML statements will be issued against your tables, where that plan defines a specific order in which table locks will occur. When formulating this plan, order your inserts/updates so that parent objects are updated prior to child objects, in a top-to-bottom approach. You would do this for inserts, right? The order in which you do inserts should be a good model for the order in which to do updates and deletes (or pseudo-deletes).

Second, does it always make sense to do a real database delete? If you do a real database delete, then yes, either 1) the child objects would need to be deleted first, or 2) your tables would need to support cascading deletes at the database level. So what about all of those times that folks need an ‘undo’ feature for something that they accidentally deleted? I would propose that more often than not, a real database delete could be replaced with an enabled/disabled column (a.k.a. pseudo-delete), providing a very useful audit trail (who deleted my record?) as well as helping to facilitate our planned ordering of table updates.

Third, if you need stronger performance for updates and deletes, refactor your apps to do more inserts and fewer updates and deletes, again from the top down. Depending on the database, an update statement may take 1) an exclusive row-level lock, 2) an exclusive write lock (still allowing non-blocking readers), or 3) no lock at all (MySQL: MyISAM vs. InnoDB). Inserts, however, tend to be done with higher levels of concurrency. The trick to pulling this off is to have your new ID values known in advance of issuing the DML statements to the database. Hibernate’s sequence pooling is an absolutely fantastic model for this. Additionally, if need be, make ‘aggregation tables’ to store the references from the parent object to the min/max child record IDs, so that your tables are updated in the order: parent, aggregation table, child table.
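
The pseudo-delete idea from the second point can be sketched in plain Groovy as follows. This is only an illustration under stated assumptions: the field names (enabled, disabledBy, disabledAt) and the softDelete/restore methods are hypothetical, and a list stands in for the table.

```groovy
// Plain-Groovy sketch of a pseudo-delete; the field and method names
// (enabled, disabledBy, disabledAt, softDelete, restore) are hypothetical.
class Cart {
    String name
    boolean enabled = true
    String disabledBy
    Date disabledAt

    // Flag the row instead of removing it; child rows are left untouched
    void softDelete(String user) {
        enabled = false
        disabledBy = user        // audit trail: who "deleted" the record
        disabledAt = new Date()  // audit trail: when
    }

    // "Undo" is just flipping the flag back
    void restore() {
        enabled = true
        disabledBy = null
        disabledAt = null
    }
}

def carts = [new Cart(name: "alpha"), new Cart(name: "beta")]
carts[0].softDelete("paul")

// Queries must now filter on the flag
def visible = carts.findAll { it.enabled }
```

The trade-off is that every query for live rows has to remember the enabled filter, and unique constraints need to account for disabled rows.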

Now, back to Hibernate and Grails and Spring. I love them all, really I do. There are a lot of really smart folks that have contributed to those projects. At some point, it’s up to the folks that are using the tools to get it right for our specific use cases. That said, I really wish Hibernate and Grails were a bit more consistent with the order in which DML statements are issued upon flush … To solve entry-level issues with Hibernate/Grails, I find it much more natural to think like a database, and maybe place an extra flush here or there, and to use criteria queries with explicit fetching strategies on an as-needed basis.

Don’t Hibernate bags solve this issue?

I believe you can declare your collection to be of type Collection, and in Grails 2.0+ it maps to a Hibernate bag as opposed to a set or a list.

Hello,
Thanks a lot for a great blog post. I read it a few days ago and went back to the drawing board to redesign my app based on its suggestions. But I came across a problem that is not discussed here and that I believe is a very common scenario. In a one-to-many application there are many cases where one wants to retrieve the data in one go, and a left outer join is not possible because the parent class has no reference to the child. Here is a more detailed question: http://stackoverflow.com/questions/16982161/grails-left-outer-join-between-unrelated-domains-one-way-one-to-many-relationsh I would really appreciate it if you could have a look.
Thanks

Hi Sapan. You just found one of the reasons this technique doesn’t always work. Since you now must link tables manually, I think in your case (and in many cases) you should stick to the Grails way of using associations, and use other techniques for improving performance.

Paul.

