Dynamically creating django models for m2m relationships

Occasionally when using a django ManyToManyField I find myself wanting to query the join table directly. Consider for example the models below:

class Category(models.Model):
    name = models.TextField()

class Entry(models.Model):
    data = models.TextField()
    categories = models.ManyToManyField(Category)

Suppose I have an entry, and want to find entries which are similar, defined here as having the most categories in common. I could define a through table:

class Relationship(models.Model):
    entry = models.ForeignKey('Entry2')
    category = models.ForeignKey('Category')

class Entry2(models.Model):
    data = models.TextField()
    categories = models.ManyToManyField(Category, through=Relationship)

However, if all I want to do is query the relationship (rather than store extra data with each relationship, which is the common use case for through tables), there is another way: generating the model on the fly.

Digression: Dynamic classes in python

We take a step back to briefly mention dynamically created classes in python.

Beyond "the usual" way to define classes in python using the class keyword, it is also possible to create classes dynamically. The following are equivalent:

class Spam(object):
    parrots = 2
    def speak(self):
        return "We have %s parrots" % self.parrots

Ham = type(
    'Ham',
    (object, ),
    dict(
        parrots = 2,
        speak = lambda self: "We have %s parrots" % self.parrots
    )
)

>>> s = Spam()
>>> s.parrots
2
>>> s.speak()
'We have 2 parrots'
>>> h = Ham()
>>> h.parrots
2
>>> h.speak()
'We have 2 parrots'

Everything in python is an object, classes included. Just as we create the instance s by calling Spam(), we can create the class Ham by calling type() with a name, a tuple of base classes and a dict of attributes. A good explanation of this (along with loads of other good stuff) can be found in this stackoverflow answer.
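The point of the three-argument form of type() is that the class name and its attributes can be computed at runtime. Here is a minimal sketch of that idea; the helper make_animal_class and the attribute names are made up for illustration only.

```python
def make_animal_class(name, count):
    """Build a class whose name and attributes are decided at runtime."""
    def speak(self):
        return "We have %s %s" % (self.count, self.plural)

    # type(name, bases, namespace) returns a brand new class object
    return type(
        name.capitalize(),
        (object,),
        {'count': count, 'plural': name + 's', 'speak': speak},
    )


# hypothetical usage: the class name comes from a plain string
Parrot = make_animal_class('parrot', 2)
message = Parrot().speak()
```

Nothing here is specific to django; it is the same call as in the Ham example, just wrapped in a function so that the inputs can vary.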

Back to django

Now that we know how to dynamically create classes, we can apply this knowledge to django models. Before continuing, we need a few more pieces.

Models for existing tables

Normally when you create a django model, django will create database tables for you. However, it is also possible to create django models that "sit on top of" already existing database tables. This is accomplished by specifying a db_table property on the Meta class:

class Relationship(models.Model):
    entry = models.ForeignKey('Entry')
    category = models.ForeignKey('Category')

    class Meta:
        db_table = 'app_entry_categories'

Now for the fun bit: we can read the model metadata and create this model on the fly, at runtime.

from django.db import models

def create_m2m_model(cls, field_name):
    """ dynamically create a model to query m2m relations directly

        Example:
            m2m = create_m2m_model(Entry, 'categories')

            m2m.objects.count()
              37
            m2m.objects.all()[0].category
              <Category: Pirates>
    """

    field = cls._meta.get_field(field_name)
    subfield1 = field.m2m_field_name()
    subrelation1 = cls
    subcolumn1 = field.m2m_column_name()

    subfield2 = field.m2m_reverse_field_name()
    subrelation2 = field.rel.to
    subcolumn2 = field.m2m_reverse_name()

    table_name = field.m2m_db_table()

    module = cls.__module__

    # we only want to use this for reads, not writes
    def save(self, *args, **kwargs):
        pass

    attrs = {
        'save': save,
        subfield1: models.ForeignKey(subrelation1, db_column=subcolumn1),
        subfield2: models.ForeignKey(subrelation2, db_column=subcolumn2),
    }

    model = type(
        table_name,
        (models.Model,),
        dict(
            __module__ = module,
            Meta = type(
                'Meta',
                (),
                dict(
                    managed = False,
                    db_table = table_name,
                )
            ),
            **attrs
        )
    )
    return model

Some explanation

We use the _meta property of our django model to introspect the relationship. Note: _meta is technically an undocumented API and, as such, is subject to change. Use at your own risk.

We find the types involved in the relationship, the name of the join table and the columns used. We then create a model class that is mapped to that table. Note the Meta class being created as well, as another instance of type. The only extra piece is the __module__ attribute, which is required by django. We set it explicitly to the module of the model.
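The two tricks in that function (a nested Meta class built with type(), and attribute names that are only known at runtime) can be tried out without django at all. The sketch below uses plain object as the base class and strings as stand-ins for the ForeignKey fields; build_class, the module name 'myapp.models' and the placeholder values are assumptions for illustration, not part of django.

```python
def build_class(name, module, table_name, **attrs):
    """Plain-python imitation of the class construction in create_m2m_model."""
    # Meta is itself just a class, so it can be built with type() too
    meta = type('Meta', (), {'managed': False, 'db_table': table_name})

    namespace = {'__module__': module, 'Meta': meta}
    # the attribute names arrive as keyword arguments, decided by the caller
    namespace.update(attrs)
    return type(name, (object,), namespace)


# hypothetical stand-ins for the two ForeignKey fields
Join = build_class(
    'app_entry_categories',
    'myapp.models',
    'app_entry_categories',
    entry='fk to Entry',
    category='fk to Category',
)
```

In the real function the base class is models.Model, so django's model machinery picks up Meta and the fields when the class is created.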

Addendum

How did I use this to find similar entries?

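import operator
from functools import reduce  # a builtin on python 2, needs importing on python 3

from django.db.models import Count, Q

EntryCategories = create_m2m_model(Entry, 'categories')
categories = entry.categories.all()
if categories:
    # match join table rows pointing at any of this entry's categories
    filters = reduce(
        operator.or_, (Q(category=category) for category in categories))

    related_qs = EntryCategories.objects.exclude(entry=entry.pk
        ).filter(filters
        ).values('entry').annotate(count=Count('category')
        ).order_by('-count'
        )[:5]
    # we can't just add another .values() clause to the qs
    related_pks = [value['entry'] for value in related_qs]
    # managers are only accessible via the model class, not an instance;
    # note that pk__in does not preserve the order of related_pks
    related = Entry.objects.filter(pk__in=related_pks)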
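To make it clearer what the query above computes, here is a rough pure-python equivalent that works on the join table rows as (entry, category) pairs. It is only a sketch of the logic, not how django executes it; most_similar and the sample rows are invented for illustration.

```python
from collections import Counter


def most_similar(entry_pk, rows, limit=5):
    """rows: iterable of (entry_pk, category_pk) pairs, i.e. join table rows."""
    rows = list(rows)
    # the categories of the entry we start from
    wanted = set(cat for pk, cat in rows if pk == entry_pk)
    # count, per other entry, how many of those categories it shares
    shared = Counter(
        pk for pk, cat in rows if pk != entry_pk and cat in wanted)
    # entries with the most categories in common come first
    return [pk for pk, count in shared.most_common(limit)]


# made-up join table contents
rows = [(1, 'a'), (1, 'b'), (1, 'c'),
        (2, 'a'), (2, 'b'),
        (3, 'c'),
        (4, 'd')]
similar = most_similar(1, rows)
```

With these rows, entry 2 shares two categories with entry 1 and entry 3 shares one, so similar is [2, 3]; entry 4 shares none and is left out.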
