eat some code

Chained Queries in Django

Improving the quality of your code by extending models.QuerySet

January 2016 #models.Manager  #models.QuerySet  #ORM  #Clean code  #Django 

Writing clean code is as important as getting things working. In this article I'll show you three different ways to handle queries in Django; from worst to best.

What's clean code?

Clean code is code that you're happy to work with. In my opinion, respecting the following principles is the key to keep your code clean (for more information, I'd advise you to read Clean Code):

  • Refactor as you go; that requires automated tests
  • KISS - Keep It Simple Stupid (or Keep it Simple and Straighforward)
  • Keep the functions / methods / procedures short
  • 1 level of abstraction per function / method / procedure
  • DRY - Don't Repeat Yourself

I love that illustration where discovering new good code involves a couple of headaches (different ways of working + discovering business rules) - while the bad code involves plenty of headaches. Now, let's talk about queries in Django !

For example

Let's pretend that you're building an application to review products online. The initial models are really simple:

class Product(models.Model):
    name = models.CharField(max_length=100)
    slug = models.SlugField(max_length=120, unique=True, db_index=True)
    # ... description, link etc.
    is_approved = models.BooleanField(
        default=False, help_text="Check to show this product and its approved reviews"
    )
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)


class Review(models.Model):
    product = models.ForeignKey(Product, related_name="reviews")
    author = models.ForeignKey(User, related_name="reviews")
    rating = models.SmallIntegerField()
    is_approved = models.BooleanField(
        default=False, help_text="Check to show this review"
    )
    created_at = models.DateTimeField(auto_now_add=True, verbose_name="Date")
    updated_at = models.DateTimeField(auto_now=True)

Basically a product can have zero to many reviews. A review requires an admin approval to be shown to end users (and for its rating to be taken into account). A user can add a review to a new product - in which case that product needs admin approval too.

For example, you want to show:

  • a product page, this is: product_detail
  • a page to show the a latest reviews: latest_reviews

1.The "Easy" Way (directly in the view)

Django ORM has a really neat syntax so it's really tempting to just write the queries in the views. This is fine when getting started but will need refactoring at some point.

def product_detail(request, product_slug):
    product = get_object_or_404(Article.objects.filter(is_approved=True), slug=product_slug)
    reviews = Review.objects.filter(product=product).filter(is_approved=True)

    return render(request, 'product/detail.html', {
        'product': product,
        'reviews': reviews
    })


def latest_reviews(request):
    reviews = Review.objects.filter(product__is_approved=True).filter(is_approved=True).order_by('-created_at')

    return render(request, 'product/detail.html', {
        'reviews': reviews
    })

I can hear some developers swearing that this is really simple and perfectly fine but there are 2 obvious problems:

Problem 1: No reuse of ".filter(is_approved=True)"

That particular piece of code is likely to change in future. For example, the approval date could be recorded (instead of just a boolean) and new rules might be added such as hiding a product or a review temporary even though it was previously approved

Problem 2: Different levels of abstraction

The view is meant to handle the request from a high level of abstraction (e.g. Find the product, find its review, render the template). Therefore, it should not contain details of what rules/fields make a product visible to end users or not.

2.The "Old" Way (in the Manager)

By extending models.Manager, you can solve the second issue but the first problem (not DRY) remains:

class ProductManager(models.Manager):

    def get_published_product_by_slug(self, slug):
        return self.filter(is_approved=True).filter(slug=slug)


class ReviewManager(models.Manager):

    def get_published_product_reviews(self, product):
        return self.filter(product=product).filter(is_approved=True)
    
    def get_latest_reviews(self):
        return self.filter(product__is_approved=True)\
            .filter(is_approved=True)\
            .order_by('-created_at')
def product_detail(request, product_slug):
    try:
        product = Product.objects.get_published_product_by_slug(product_slug)
    except Product.DoesNotExist:
        raise Http404('No product matches the given query.')
    reviews = Review.objects.get_published_product_reviews(product)

    return render(request, 'product/detail.html', {
        'product': product,
        'reviews': reviews
    })


def latest_reviews(request):
    reviews = Review.objects.get_latest_reviews()

    return render(request, 'product/detail.html', {
        'reviews': reviews
    })

3.The Nice Way (in the QuerySet)

You can make it even better by extending models.QuerySet. As a result we have reusable shorter functions:

class ProductQuerySet(models.QuerySet):

    def published(self):
        return self.filter(is_approved=True)


class ReviewQuerySet(models.Manager):

    def published(self):
        return self.filter(is_approved=True)

    def by_product(self, product):
        return self.filter(product=product)

    def for_published_product(self):
        return self.filter(product__is_approved=True)

    def latest_first(self):
        return self.order_by('-created_at')
def product_detail(request, product_slug):
    product = get_object_or_404(Article.objects.published(), slug=product_slug)
    reviews = Review.objects.by_product(product).published().latest_first()

    return render(request, 'product/detail.html', {
        'product': product,
        'reviews': reviews
    })


def latest_reviews(request):
    reviews = Review.objects.for_published_product().published().latest_first()

    return render(request, 'product/detail.html', {
        'reviews': reviews
    })

The above code is the most readable and easier to update. Imagine if you had to change the publication rules as previously mentioned. Imagine the need for a new feature: override the order of the reviews in the admin.

Bonus: Some extra love

The method by_product(product) obviously expects a Product instance as a parameter. In Django, if you run Review.objects.by_product(john), it will "work" !! - that would return results based on the User ID. To make debugging easier it is helpful to add assumptions (especially the obvious ones) in the form of "pre-conditions":

    def by_product(self, product):
        assert isinstance(product, Product)
        return self.filter(product=product)

Lovely, isn't it?

Image Credits

Chains Picture - by Julia Freeman-Woolpert via Free Images