
I have 3 models that look like this (simplified):

User:

id
username

Resource:

id
name
description

Comment:

id
user_id (relationship User)
resource_id (relationship Resource)
data
date_created

I am trying to query the comments for a user and group them by Resource. I'd like the results to come back as: [(Resource A, [Comment, Comment, Comment, ...]), (Resource B, [Comment, Comment, ...]), (Resource X, [Comment])]

I have tried various ways of constructing this and I just can't seem to figure it out. What would be the proper way to do something like this?

EDIT

Right now the code looks like this:

contrib = (
    db_session.query(Resource)
    .filter(Comment.user == user, Resource.uuid == Comment.resource_id)
    .distinct(Comment.resource_id)
    .order_by(desc(Comment.date_created))
)
comments = (
    db_session.query(Comment, Resource)
    .filter(
        Comment.user == user,
        Comment.resource_id.in_([r.uuid for r in contrib]),
        Resource.uuid == Comment.resource_id,
    )
    .order_by(desc(Comment.date_created))
)

I then use some list/dictionary comprehension to combine these results into something that looks like

[{resource: Resource, comments:[Comment, Comment, Comment]}, {resource: Resource, comments:[Comment, .....]}, .....]

There has got to be a better way to do this!


2 Answers


You could use a custom MappedCollection to group the comments:

from sqlalchemy.orm.collections import collection, MappedCollection

class GroupedCollection(MappedCollection):

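  def __init__(self):
    # MappedCollection takes only the key function; `self` is bound
    # by super() and must not be passed again
    super(GroupedCollection, self).__init__(
      lambda e: e.resource_id # the key we want to group by
    )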

  @collection.internally_instrumented
  def __setitem__(self, key, value, _sa_initiator=None):
    if key in self:
      # there is already another comment for that resource
      # we simply append the comment (or you could do something
      # more fancy here if you would like to order the comments)
      self[key]['comments'].append(value)
    else:
      # we create a new entry with a dictionary containing the
      # resource and comment
      super(GroupedCollection, self).__setitem__(
        key,
        {'resource': value.resource, 'comments': [value]},
        _sa_initiator
      )

You then add the corresponding relationship on your User class:

from sqlalchemy.orm import relationship

class User(Base):

  # ...

  grouped_comments = relationship(
    'Comment',
    collection_class=GroupedCollection
  )

Accessing it will give you the comments grouped by resource:

>>> user.grouped_comments
{
  'resource_id_1': {'resource': <Resource 1>, 'comments': [<Comment ...>, <Comment ...>]},
  'resource_id_2': {'resource': <Resource 2>, 'comments': [<Comment ...>]}
}
>>> user.grouped_comments.values()
[
  {'resource': <Resource 1>, 'comments': [<Comment ...>, <Comment ...>]},
  {'resource': <Resource 2>, 'comments': [<Comment ...>]}
]

Note that this relationship should only be used to view the related models; supporting adding or deleting models through it would require extra work.

Finally, if this is a pattern you would like to reproduce, you can easily create a GroupedCollection factory function where you can specify the grouping key.


3 Comments

Wow - this is incredible and exactly what I was looking for. Thank you so much. I have a question: if I were to slice this, so .values()[5:10], would it actually be returning all the results under the hood and then slicing the resulting Python list, or would the slice apply to the query that SQLAlchemy was issuing? Is there any way to affect the ordering or filtering of the comments being used to populate this structure? Maybe something like .grouped_comments.filter(Comment.startswith('the')).order_by(desc(Comment.date_created)) -> would that apply? Thanks again!
Slicing u.grouped_comments.values() will slice the list and not the query. On the plus side, though, the query will only run once: say you call u.grouped_comments.values()[1:3] and then u.grouped_comments.values()[4:6], the query is only emitted once (as all the comments have been loaded the first time). You have a few options for ordering your results: you can either specify an order_by in the relationship or change the way they are appended in the GroupedCollection class.
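A small plain-Python illustration of the slicing point, using an ordinary dict as a stand-in for the already-loaded collection (the keys and values here are made up, and no database is involved). One thing to keep in mind on Python 3: .values() returns a view that cannot be sliced directly, so it has to be turned into a list first.

```python
# Plain dict standing in for an already-loaded user.grouped_comments
# (hypothetical data; a real collection would hold ORM objects).
grouped = {
    "resource_id_1": {"resource": "Resource 1", "comments": ["c1", "c2"]},
    "resource_id_2": {"resource": "Resource 2", "comments": ["c3"]},
    "resource_id_3": {"resource": "Resource 3", "comments": ["c4"]},
}

# Python 3: dict views are not subscriptable, so materialise a list first.
entries = list(grouped.values())

# The slice is applied to the in-memory list, not to any SQL query.
page = entries[1:3]
```

Because everything is already in memory at this point, paging this way does not reduce the amount of data fetched from the database.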
If you often have to filter your results, I would recommend using a dynamic relationship instead (setting lazy='dynamic') and implementing a separate grouping logic.
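One possible shape for that separate grouping logic is sketched below. It is only an illustration: Resource and Comment are namedtuple stand-ins for the mapped classes, and group_by_resource is a hypothetical helper name, not part of SQLAlchemy. The iterable passed in could be the result of a query that has already been filtered and ordered.

```python
from collections import namedtuple

# Stand-ins for the ORM models (hypothetical; real code would use mapped classes).
Resource = namedtuple("Resource", "id name")
Comment = namedtuple("Comment", "id resource data")


def group_by_resource(comments):
    """Return [(resource, [comment, ...]), ...] in first-seen resource order."""
    groups = {}  # resource id -> (resource, list of its comments)
    for comment in comments:
        resource = comment.resource
        if resource.id not in groups:
            groups[resource.id] = (resource, [])
        groups[resource.id][1].append(comment)
    # dicts keep insertion order (Python 3.7+), so the incoming order is preserved
    return list(groups.values())


a = Resource(1, "A")
b = Resource(2, "B")
comments = [
    Comment(10, a, "first"),
    Comment(11, b, "second"),
    Comment(12, a, "third"),
]
grouped = group_by_resource(comments)
```

Since the helper keeps the order in which resources are first seen, feeding it comments sorted by date_created descending would list the most recently commented resource first, which is the [(Resource, [Comment, ...]), ...] shape asked for in the question.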

Please check demo: http://sqlfiddle.com/#!2/9f2ea/2

Hope this helps

1 Comment

I didn't downvote you, just so you know, but this was really helpful for understanding how to set up the query. Not exactly what I am looking for (trying to return SQLAlchemy ORM objects), but much appreciated.
