Custom lookups

Django offers a wide variety of built-in lookups for filtering (for example, exact and icontains). This document explains how to write custom lookups and how to alter the behavior of existing lookups. For the API reference of lookups, see the Lookup API reference.

A simple lookup example

Let's start with a simple custom lookup. We will write a custom lookup named ne, which works as the opposite of exact. Author.objects.filter(name__ne='Jack') translates to the following SQL:

"author"."name" <> 'Jack'

This SQL is backend independent, so we don't need to worry about different databases.

There are two steps to making this lookup work. First, we need to implement the lookup. Second, we need to tell Django about it. The implementation is quite straightforward:

from django.db.models import Lookup

class NotEqual(Lookup):
    lookup_name = 'ne'

    def as_sql(self, compiler, connection):
        lhs, lhs_params = self.process_lhs(compiler, connection)
        rhs, rhs_params = self.process_rhs(compiler, connection)
        params = lhs_params + rhs_params
        return '%s <> %s' % (lhs, rhs), params

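To register the NotEqual lookup, we only need to call register_lookup on the field class we want the lookup to be available for. In this case the lookup makes sense on all Field subclasses, so we register it directly on the parent class, Field: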

from django.db.models.fields import Field
Field.register_lookup(NotEqual)

Lookups can also be registered using the decorator pattern:

from django.db.models.fields import Field

@Field.register_lookup
class NotEqualLookup(Lookup):
    # ...

From now on, foo__ne can be used for any field foo. Note that the lookup must be registered before you create any queryset that uses it. You can place the implementation in a models.py file, or register the lookup in the ready() method of an AppConfig.

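Taking a closer look at the implementation, the first required attribute is lookup_name. This attribute allows the ORM to interpret name__ne and to use NotEqual to generate the SQL. By convention, these names are always lowercase strings containing only letters, but the only hard requirement is that the name must never contain the string __.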
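To see why the double underscore is off limits, it may help to look at a simplified model of how a filter keyword is broken into parts. The sketch below does not import Django. LOOKUP_SEP mirrors the separator Django uses, while split_filter_key is a hypothetical helper written only for this illustration.

```python
# Simplified, Django-free model of how a filter keyword is split into parts.
LOOKUP_SEP = "__"


def split_filter_key(key):
    """Split a filter keyword such as 'name__ne' into its components."""
    return key.split(LOOKUP_SEP)


# A well-formed lookup name survives as a single component.
print(split_filter_key("name__ne"))

# A lookup name containing '__' would be split into two separate names.
print(split_filter_key("name__not__equal"))
```

Under this model, a lookup registered as not__equal could never be matched, because the ORM would only ever see the separate names not and equal.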

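Next, we need to define the as_sql method. It takes a SQLCompiler object, called compiler, and the active database connection. SQLCompiler objects are not documented, but the only thing we need to know about them is that they have a compile() method which returns a tuple containing an SQL string and the parameters to be interpolated into that string. In most cases you don't need to use it directly and can pass it on to process_lhs() and process_rhs().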

A Lookup works against two values, lhs and rhs, standing for left-hand side and right-hand side. The left-hand side is usually a field reference, but it can be anything implementing the query expression API. The right-hand side is the value given by the user. In the example Author.objects.filter(name__ne='Jack'), the left-hand side is a reference to the name field of the Author model, and 'Jack' is the right-hand side.

We call process_lhs and process_rhs to convert them into the values we need for SQL using the compiler object described before. These methods return tuples containing some SQL and the parameters to be interpolated into that SQL, just as we need to return from our as_sql method. In the above example, process_lhs returns ('"author"."name"', []) and process_rhs returns ('%s', ['Jack']). In this example there were no parameters for the left-hand side, but this would depend on the object we have, so we still need to include them in the parameters we return.

Finally, we combine the parts into an SQL expression with <> and supply all the parameters for the query. We then return a tuple containing the generated SQL string and the parameters.
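The combination step can be tried out without Django. The following is a minimal sketch, assuming the left-hand side compiles to '"author"."name"' with no parameters and the right-hand side to a bare %s placeholder with the parameter 'Jack'. The function not_equal_sql is a hypothetical stand-in for the body of as_sql.

```python
def not_equal_sql(lhs, rhs):
    """Combine two (sql, params) pairs into one '<>' comparison."""
    lhs_sql, lhs_params = lhs
    rhs_sql, rhs_params = rhs
    # Parameters must be listed in the same order as their placeholders.
    params = list(lhs_params) + list(rhs_params)
    return "%s <> %s" % (lhs_sql, rhs_sql), params


# Stand-ins for the values process_lhs() and process_rhs() would return.
sql, params = not_equal_sql(('"author"."name"', []), ("%s", ["Jack"]))
print(sql)
print(params)
```

The SQL string keeps its placeholder. The value 'Jack' travels separately in the parameter list and is bound by the database driver, which is why the lookup never needs to quote it.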

A simple transformer example

The custom lookup above is great, but in some cases you may want to be able to chain lookups together. For example, let's suppose we are building an application where we want to make use of the abs() operator. We have an Experiment model which records a start value, end value, and the change (start - end). We would like to find all experiments where the change was equal to a certain amount (Experiment.objects.filter(change__abs=27)), or where it did not exceed a certain amount (Experiment.objects.filter(change__abs__lt=27)).

Note

This example is somewhat contrived, but it nicely demonstrates the range of functionality which is possible in a database backend independent manner, and without duplicating functionality already in Django.

We will start by writing an AbsoluteValue transformer. This will use the SQL function ABS() to transform the value before comparison:

from django.db.models import Transform

class AbsoluteValue(Transform):
    lookup_name = 'abs'
    function = 'ABS'

Next, let's register it for IntegerField:

from django.db.models import IntegerField
IntegerField.register_lookup(AbsoluteValue)

We can now run the queries we had before. Experiment.objects.filter(change__abs=27) will generate the following SQL:

SELECT ... WHERE ABS("experiments"."change") = 27

Using Transform instead of Lookup means we can chain further lookups afterwards. So Experiment.objects.filter(change__abs__lt=27) will generate the following SQL:

SELECT ... WHERE ABS("experiments"."change") < 27

Note that in case there is no other lookup specified, Django interprets change__abs=27 as change__abs__exact=27.
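The chaining behavior can be modeled in a few lines of plain Python. This is only a sketch of the idea, not Django's implementation: apply_transform, build_condition and the OPERATORS table are hypothetical names, and the right-hand side is shown as a %s placeholder rather than an inlined value.

```python
# Tiny subset of comparison operators, for illustration only.
OPERATORS = {"exact": "=", "lt": "<"}


def apply_transform(function, lhs_sql):
    """Wrap the left-hand side in an SQL function call, as a Transform does."""
    return "%s(%s)" % (function, lhs_sql)


def build_condition(lhs_sql, lookup_name="exact", placeholder="%s"):
    """Compare the (possibly transformed) left-hand side with a placeholder."""
    return "%s %s %s" % (lhs_sql, OPERATORS[lookup_name], placeholder)


lhs = apply_transform("ABS", '"experiments"."change"')
print(build_condition(lhs))        # change__abs=27 falls back to 'exact'
print(build_condition(lhs, "lt"))  # change__abs__lt=27
```

Because the transform only rewrites the left-hand side, any lookup can be applied to its result afterwards.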

This also allows the result to be used in ORDER BY and DISTINCT ON clauses. For example Experiment.objects.order_by('change__abs') generates:

SELECT ... ORDER BY ABS("experiments"."change") ASC

And on databases that support distinct on fields (such as PostgreSQL), Experiment.objects.distinct('change__abs') generates:

SELECT ... DISTINCT ON ABS("experiments"."change")

Changed in Django 2.1:

Ordering and distinct support as described in the last two paragraphs was added.

When looking for which lookups are allowable after the Transform has been applied, Django uses the output_field attribute. We didn't need to specify this here as it didn't change, but supposing we were applying AbsoluteValue to some field which represents a more complex type (for example a point relative to an origin, or a complex number) then we may have wanted to specify that the transform returns a FloatField type for further lookups. This can be done by adding an output_field attribute to the transform:

from django.db.models import FloatField, Transform

class AbsoluteValue(Transform):
    lookup_name = 'abs'
    function = 'ABS'

    @property
    def output_field(self):
        return FloatField()

This ensures that further lookups like abs__lte behave as they would for a FloatField.

Writing an efficient abs__lt lookup

When using the above written abs lookup, the SQL produced will not use indexes efficiently in some cases. In particular, when we use change__abs__lt=27, this is equivalent to change__gt=-27 AND change__lt=27. (For the lte case we could use the SQL BETWEEN).

So we would like Experiment.objects.filter(change__abs__lt=27) to generate the following SQL:

SELECT ... WHERE "experiments"."change" < 27 AND "experiments"."change" > -27

The implementation is:

from django.db.models import Lookup

class AbsoluteValueLessThan(Lookup):
    lookup_name = 'lt'

    def as_sql(self, compiler, connection):
        lhs, lhs_params = compiler.compile(self.lhs.lhs)
        rhs, rhs_params = self.process_rhs(compiler, connection)
        params = lhs_params + rhs_params + lhs_params + rhs_params
        return '%s < %s AND %s > -%s' % (lhs, rhs, lhs, rhs), params

AbsoluteValue.register_lookup(AbsoluteValueLessThan)

There are a couple of notable things going on. First, AbsoluteValueLessThan isn't calling process_lhs(). Instead it skips the transformation of the lhs done by AbsoluteValue and uses the original lhs. That is, we want to get "experiments"."change" not ABS("experiments"."change"). Referring directly to self.lhs.lhs is safe as AbsoluteValueLessThan can be accessed only from the AbsoluteValue lookup, that is the lhs is always an instance of AbsoluteValue.

Notice also that as both sides are used multiple times in the query the params need to contain lhs_params and rhs_params multiple times.

The final query does the inversion (27 to -27) directly in the database. The reason for doing this is that if self.rhs is something other than a plain integer value (for example an F() reference) we can't do the transformation in Python.
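The parameter ordering and the in-database negation can be checked against a real database using Python's sqlite3 module. This is a minimal sketch that does not use Django: abs_less_than_sql is a hypothetical stand-in for the as_sql body, the table and its rows are made up for the demonstration, and ? is used as the placeholder because that is what sqlite3 expects.

```python
import sqlite3


def abs_less_than_sql(lhs, rhs):
    """Build 'lhs < rhs AND lhs > -rhs' from two (sql, params) pairs."""
    lhs_sql, lhs_params = lhs
    rhs_sql, rhs_params = rhs
    # Each side appears twice in the SQL, so its parameters appear twice too,
    # in the same left-to-right order as the placeholders.
    params = list(lhs_params) + list(rhs_params) + list(lhs_params) + list(rhs_params)
    return "%s < %s AND %s > -%s" % (lhs_sql, rhs_sql, lhs_sql, rhs_sql), params


sql, params = abs_less_than_sql(('"experiments"."change"', []), ("?", [27]))

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "experiments" ("change" INTEGER)')
conn.executemany(
    'INSERT INTO "experiments" VALUES (?)',
    [(-30,), (-27,), (-5,), (0,), (26,), (27,)],
)
rows = conn.execute(
    'SELECT "change" FROM "experiments" WHERE ' + sql + ' ORDER BY "change"',
    params,
).fetchall()
matching = [row[0] for row in rows]
print(sql, params)
print(matching)
```

The value 27 is bound twice and the database applies the unary minus itself, so the same SQL would keep working if the right-hand side were a column reference instead of a literal.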

Note

In fact, most lookups with __abs could be implemented as range queries like this, and on most database backends it is likely to be more sensible to do so as you can make use of the indexes. However with PostgreSQL you may want to add an index on abs(change) which would allow these queries to be very efficient.

A bilateral transformer example

The AbsoluteValue example we discussed previously is a transformation which applies to the left-hand side of the lookup. There may be some cases where you want the transformation to be applied to both the left-hand side and the right-hand side. For instance, if you want to filter a queryset based on the equality of the left and right-hand side insensitively to some SQL function.

Let's examine the simple example of case-insensitive transformation here. This transformation isn't very useful in practice as Django already comes with a bunch of built-in case-insensitive lookups, but it will be a nice demonstration of bilateral transformations in a database-agnostic way.

We define an UpperCase transformer which uses the SQL function UPPER() to transform the values before comparison. We define bilateral = True to indicate that this transformation should apply to both lhs and rhs:

from django.db.models import Transform

class UpperCase(Transform):
    lookup_name = 'upper'
    function = 'UPPER'
    bilateral = True

Next, let's register it:

from django.db.models import CharField, TextField
CharField.register_lookup(UpperCase)
TextField.register_lookup(UpperCase)

Now, the queryset Author.objects.filter(name__upper="doe") will generate a case insensitive query like this:

SELECT ... WHERE UPPER("author"."name") = UPPER('doe')
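The effect of a bilateral transformation can be reproduced directly with Python's sqlite3 module. The sketch below does not use Django: bilateral_exact_sql is a hypothetical helper, the table contents are invented for the demonstration, and the right-hand side is passed as a bound ? parameter rather than an inlined literal.

```python
import sqlite3


def bilateral_exact_sql(function, lhs_sql, placeholder="?"):
    """Apply the same SQL function to both sides of an equality test."""
    return "%s(%s) = %s(%s)" % (function, lhs_sql, function, placeholder)


condition = bilateral_exact_sql("UPPER", '"author"."name"')

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "author" ("name" TEXT)')
conn.executemany(
    'INSERT INTO "author" VALUES (?)',
    [("Doe",), ("DOE",), ("doe",), ("Smith",)],
)
rows = conn.execute(
    'SELECT "name" FROM "author" WHERE ' + condition,
    ["doe"],
).fetchall()
names = sorted(row[0] for row in rows)
print(condition)
print(names)
```

Every spelling of "doe" matches because both the column and the supplied value pass through UPPER() before they are compared.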

Writing alternative implementations for existing lookups

Sometimes different database vendors require different SQL for the same operation. For this example we will write a custom implementation of the NotEqual operator for MySQL. Instead of <> we will use the != operator. (Note that in reality almost all databases support both, including all the official databases supported by Django.)

We can change the behavior on a specific backend by creating a subclass of NotEqual with an as_mysql method:

class MySQLNotEqual(NotEqual):
    def as_mysql(self, compiler, connection):
        lhs, lhs_params = self.process_lhs(compiler, connection)
        rhs, rhs_params = self.process_rhs(compiler, connection)
        params = lhs_params + rhs_params
        return '%s != %s' % (lhs, rhs), params

Field.register_lookup(MySQLNotEqual)

We can then register it with Field. It takes the place of the original NotEqual class as it has the same lookup_name.

When compiling a query, Django first looks for a method named 'as_%s' % connection.vendor, and then falls back to as_sql. The vendor names for the built-in backends are sqlite, postgresql, oracle and mysql.
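The vendor-specific dispatch can be pictured with a small stand-alone model. This is a sketch of the lookup order described above, not Django's compiler: NotEqualNode and compile_for_vendor are hypothetical names, and the methods take no arguments to keep the example short.

```python
class NotEqualNode:
    """Stand-in for a lookup that has a generic and a MySQL-specific form."""

    def as_sql(self):
        return "%s <> %s"

    def as_mysql(self):
        return "%s != %s"


def compile_for_vendor(node, vendor):
    """Prefer as_<vendor>() when the node defines it, otherwise use as_sql()."""
    vendor_impl = getattr(node, "as_" + vendor, None)
    if vendor_impl is not None:
        return vendor_impl()
    return node.as_sql()


print(compile_for_vendor(NotEqualNode(), "mysql"))
print(compile_for_vendor(NotEqualNode(), "postgresql"))
```

Only backends whose vendor name matches a defined as_<vendor> method take the special path. All others use the generic as_sql.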

How Django determines the lookups and transforms which are used

In some cases you may wish to dynamically change which Transform or Lookup is returned based on the name passed in, rather than fixing it. As an example, you could have a field which stores coordinates of an arbitrary dimension, and wish to allow a syntax like .filter(coords__x7=4) to return the objects where the 7th coordinate has value 4. In order to do this, you would override get_lookup with something like:

class CoordinatesField(Field):
    def get_lookup(self, lookup_name):
        if lookup_name.startswith('x'):
            try:
                dimension = int(lookup_name[1:])
            except ValueError:
                pass
            else:
                return get_coordinate_lookup(dimension)
        return super().get_lookup(lookup_name)

You would then define get_coordinate_lookup appropriately to return a Lookup subclass which handles the relevant value of dimension.
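One possible shape for such a helper is a class factory that builds a Lookup subclass per dimension. The sketch below runs without Django, so Lookup is a local stand-in for django.db.models.Lookup. The coordinate attribute and resolve_lookup_name are hypothetical, and the SQL for reading a coordinate is deliberately left out because it depends on how the field stores its data.

```python
class Lookup:
    """Stand-in for django.db.models.Lookup so the sketch runs without Django."""

    lookup_name = None


def get_coordinate_lookup(dimension):
    """Return a Lookup subclass bound to one coordinate index."""

    class CoordinateLookup(Lookup):
        lookup_name = "x%d" % dimension
        # Hypothetical attribute: a real as_sql() would use it to pick
        # the right coordinate out of the stored value.
        coordinate = dimension

    return CoordinateLookup


def resolve_lookup_name(lookup_name):
    """Same name parsing as the get_lookup() override, minus the super() call."""
    if lookup_name.startswith("x"):
        try:
            dimension = int(lookup_name[1:])
        except ValueError:
            pass
        else:
            return get_coordinate_lookup(dimension)
    return None  # the real method would defer to super().get_lookup()


lookup_class = resolve_lookup_name("x7")
print(lookup_class.lookup_name, lookup_class.coordinate)
```

Names that do not parse as x followed by an integer fall through, which is where the real field would hand over to the default lookup resolution.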

There is a similarly named method called get_transform(). get_lookup() should always return a Lookup subclass, and get_transform() a Transform subclass. It is important to remember that Transform objects can be further filtered on, and Lookup objects cannot.

When filtering, if there is only one lookup name remaining to be resolved, Django looks for a Lookup. If there are multiple names, it looks for a Transform. When there is only one name and no Lookup is found, it looks for a Transform and then the exact lookup on that Transform. All call sequences always end with a Lookup. To clarify:

  • .filter(myfield__mylookup) will call myfield.get_lookup('mylookup').
  • .filter(myfield__mytransform__mylookup) will call myfield.get_transform('mytransform'), and then mytransform.get_lookup('mylookup').
  • .filter(myfield__mytransform) will first call myfield.get_lookup('mytransform'), which will fail, so it will fall back to calling myfield.get_transform('mytransform') and then mytransform.get_lookup('exact').
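The three cases above can be traced with a small stand-alone model of the resolution order. This is a simplified sketch, not Django's code: Node, resolve and the registries are hypothetical, and lookups and transforms are represented by their names only.

```python
class Node:
    """A field or transform with its own registries of lookups and transforms."""

    def __init__(self, name, lookups=(), transforms=None):
        self.name = name
        self.lookups = set(lookups)
        self.transforms = transforms or {}

    def get_lookup(self, lookup_name):
        return lookup_name if lookup_name in self.lookups else None

    def get_transform(self, lookup_name):
        return self.transforms.get(lookup_name)


def resolve(field, names):
    """Return (names of the transforms applied, name of the final lookup)."""
    *transform_names, lookup_name = names or ["exact"]
    node, applied = field, []
    for name in transform_names:
        node = node.get_transform(name)
        if node is None:
            raise LookupError("unsupported transform: %s" % name)
        applied.append(node.name)
    if node.get_lookup(lookup_name) is None:
        # The final name is not a lookup: treat it as a transform
        # followed by the implicit 'exact' lookup.
        transform = node.get_transform(lookup_name)
        if transform is None:
            raise LookupError("unsupported lookup: %s" % lookup_name)
        node = transform
        applied.append(node.name)
        lookup_name = "exact"
        if node.get_lookup(lookup_name) is None:
            raise LookupError("unsupported lookup: exact")
    return applied, lookup_name


mytransform = Node("mytransform", lookups={"exact", "mylookup"})
myfield = Node(
    "myfield",
    lookups={"exact", "mylookup"},
    transforms={"mytransform": mytransform},
)

print(resolve(myfield, ["mylookup"]))
print(resolve(myfield, ["mytransform", "mylookup"]))
print(resolve(myfield, ["mytransform"]))
```

In every successful case the second element of the result is a lookup name, which reflects the rule that each call sequence ends with a Lookup.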