Fields

gcloudc includes a collection of Django model fields which leverage aspects of the Google Cloud Datastore.

ListField

This field takes advantage of the Datastore's ability to store multiple values under a single field/property name.

The field takes another field definition as its first argument, and will then store a list of values of the given field.

Example usage:

from django.db import models

from gcloudc.db.models.fields.charfields import CharField
from gcloudc.db.models.fields.iterable import ListField

class MyModel(models.Model):
    # choices are declared on the ListField, not on the inner CharField
    things = ListField(CharField(), choices=[('a', 'A'), ('b', 'B')])

Note that the choices are set on the outer ListField, not on the inner CharField.

See also: Querying on iterable fields.

SetField

This is essentially the same as ListField, except that duplicate values are removed and the values are unordered.

CharOrNoneField

A field that stores only non-empty strings or None (it won't store empty strings). This is useful if you want values to be unique but also want to allow empty values.

It is generally not good practice to put null=True on a Django CharField, and even if you do, when you edit such a field through a Django form, an empty value will be stored as an empty string rather than None.

This field will instead store empty values as None, which then allows enforcement of uniqueness for non-empty values (because multiple empty string values will violate a unique constraint, whereas multiple None values will not).
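The effect of storing empty values as None can be sketched in plain Python. The helper names below (normalise_char_or_none, violates_unique) are hypothetical and are not part of gcloudc; they only illustrate why multiple None values pass a unique constraint that would reject multiple empty strings.

```python
def normalise_char_or_none(value):
    """Return None for empty values, otherwise the string unchanged."""
    if value is None or value == "":
        return None
    return value


def violates_unique(existing_values, new_value):
    """Mimic a unique constraint that ignores None values (an assumption
    matching the behaviour described above)."""
    if new_value is None:
        return False
    return new_value in existing_values


# Two empty strings and one None all end up stored as None.
stored = [normalise_char_or_none(v) for v in ["abc", "", "", None]]

# Another empty value is accepted; a second "abc" is not.
another_empty_ok = not violates_unique(stored, normalise_char_or_none(""))
duplicate_rejected = violates_unique(stored, "abc")
```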

TrueOrNoneField

A field that stores either True or None. This is useful for when you want to enforce that only one object can have this field set to True. Any objects on which the field value is None will be ignored by unique constraints, thereby allowing you to set unique=True on the field to enforce that only one object can have the field set to True.

Similarly, you can use the field in a unique_together constraint, and any None values will be ignored by the constraint.

CharField

On the Cloud Datastore, every string property has the same maximum length of 1500 bytes (not characters). The max_length argument that Django's CharField requires is therefore ignored by the Datastore, and it is unreliable as a check because Django counts characters rather than bytes.

gcloudc's CharField sets a default max_length of 1500 and enforces it in bytes, the same way as the Datastore does.

This means you can omit the max_length argument from the field, and can rely on the field to correctly validate the values to the requirements of the database.
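The difference between a character limit and a byte limit is easy to see in plain Python. The following is a minimal sketch, assuming the value is encoded as UTF-8; fits_in_datastore_string is a hypothetical helper, not gcloudc's own validator.

```python
MAX_BYTES = 1500  # the Datastore limit stated above


def fits_in_datastore_string(value, max_bytes=MAX_BYTES):
    """Return True if the UTF-8 encoding of value is within max_bytes."""
    return len(value.encode("utf-8")) <= max_bytes


ascii_value = "a" * 1500          # 1500 characters, 1500 bytes
accented_value = "\u00e9" * 1500  # 1500 characters, 2 bytes each in UTF-8
```

Both strings would pass a character-based max_length=1500 check, but only the first one fits within the byte limit.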

RelatedListField

This is a specific variation of ListField which essentially stores a list of ForeignKeys.

This is a powerful and very useful alternative to Django's ManyToManyField. There is a limit to the number of related objects which can be stored in a single field, so you need to think about your architecture carefully in terms of how it will scale, but for many use cases it can be a great solution.

For example, if you have an application in which each user can browse products and add products to a list of favourites, you could have a User model with a RelatedListField of Product objects. So long as you restrict each user to a maximum of 500 favourited products, the application can scale to an unlimited number of users without any problems.
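One way to apply such a cap is in application code, before the object is saved. The sketch below works on a plain list of product IDs; add_favourite and MAX_FAVOURITES are hypothetical names for illustration, and gcloudc is not assumed to enforce any such limit itself.

```python
MAX_FAVOURITES = 500  # application-level cap taken from the example above


def add_favourite(favourite_ids, product_id, limit=MAX_FAVOURITES):
    """Append product_id to favourite_ids unless it is already present.

    Raises ValueError once the list has reached the cap.
    """
    if product_id in favourite_ids:
        return favourite_ids
    if len(favourite_ids) >= limit:
        raise ValueError("favourites limit of %d reached" % limit)
    favourite_ids.append(product_id)
    return favourite_ids


favourites = add_favourite([], 42)
favourites = add_favourite(favourites, 42)  # already present, so no change
```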

RelatedListField(to, limit_choices_to=None, related_name=None, on_delete=models.DO_NOTHING, **kwargs)

Example usage:

from django.db import models

from gcloudc.db.models.fields.related import RelatedListField

class Product(models.Model):
    ...

class User(models.Model):
    favourites = RelatedListField(Product)

# Add a product to an existing user's favourites and persist the change.
user = User.objects.get(pk=1)
product = Product.objects.get(name="kettle")
user.favourites.append(product)
user.save()

RelatedSetField

This is essentially the same as RelatedListField, except that duplicate values are removed and the values are unordered.

Computed fields

In order to ensure that all queries are efficient, the Cloud Datastore does not allow queries which perform query-time calculations on the field values, such as HOUR(field_name) or field_a > field_b. In some of these cases (such as the HOUR example), gcloudc's special indexes functionality can pre-compute these values for you and save them to an index. But sometimes you'll want to pre-compute values yourself in order to query on them.

gcloudc.db.models.fields.computed provides a set of fields which allow you to easily compute a value at save-time which can then be queried directly.

Each computed field generally takes the same values as the standard version of the field, but the first argument must be either:

  • A callable which will be passed the model instance and should return the computed value to be stored.
  • A string giving the name of a method on the model class which will return the computed value to be stored.

Example usage:

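from django.db import models

from gcloudc.db.models.fields.computed import ComputedPositiveIntegerField

class MyModel(models.Model):
    wholesale_price = models.PositiveIntegerField()
    retail_price = models.PositiveIntegerField()

    # The string names the method that computes the value at save time.
    profit = ComputedPositiveIntegerField("_calculate_profit")

    def _calculate_profit(self):
        return self.retail_price - self.wholesale_price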

With all computed fields, the values are computed and saved during the execution of the save method. So if you're adding a computed field to an existing model, you'll need to re-save all existing objects before you can query on the computed values.
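The save-time behaviour, and the reason existing objects need re-saving, can be sketched without Django. Everything below (compute_value, Record, the computed_fields registry) is a hypothetical stand-in for illustration and does not reflect how gcloudc implements computed fields internally.

```python
def compute_value(instance, computer):
    """Resolve `computer` against `instance`.

    `computer` is either a callable that takes the instance, or the name
    of a method on the instance's class.
    """
    if callable(computer):
        return computer(instance)
    return getattr(instance, computer)()


class Record:
    """Stand-in for a model instance; `stored` plays the role of the saved entity."""

    # Hypothetical registry of computed fields: field name -> computer.
    computed_fields = {}

    def __init__(self, wholesale_price, retail_price):
        self.wholesale_price = wholesale_price
        self.retail_price = retail_price
        self.stored = {}

    def _calculate_profit(self):
        return self.retail_price - self.wholesale_price

    def save(self):
        # Plain values are copied as-is; computed values are evaluated now.
        self.stored["wholesale_price"] = self.wholesale_price
        self.stored["retail_price"] = self.retail_price
        for name, computer in self.computed_fields.items():
            self.stored[name] = compute_value(self, computer)


old = Record(wholesale_price=70, retail_price=100)
old.save()  # saved before the computed field existed
had_profit_before = "profit" in old.stored

# "Add" the computed field, then re-save the existing object.
Record.computed_fields = {"profit": "_calculate_profit"}
old.save()
profit_after = old.stored["profit"]
```

Until the second save, the stored data has no profit value at all, so a query on it could not match the object.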

The available computed fields are:

  • ComputedCharField
  • ComputedTextField
  • ComputedIntegerField
  • ComputedPositiveIntegerField
  • ComputedBooleanField
  • ComputedCollationField (see below)

If you need another type of computed field, you can easily make your own using ComputedFieldMixin, like this:

from django.db import models

from gcloudc.db.models.fields.computed import ComputedFieldMixin

class ComputedDateField(ComputedFieldMixin, models.DateField):
    pass

ComputedCollationField works slightly differently to the others. App Engine sorts strings by the Unicode code points that make them up. For strings containing non-ASCII characters this gives an incorrect sort order (e.g. Ł will be sorted after Z). This field uses the pyuca library to calculate a sort key using the Unicode Collation Algorithm, which can then be used for ordering querysets correctly.

Unlike the other computed fields, this field should be passed the name of the field whose collation it is storing as the first argument.

Example usage:

from django.db import models

from gcloudc.db.models.fields.charfields import CharField
from gcloudc.db.models.fields.computed import ComputedCollationField

class Customer(models.Model):
    name = CharField()
    sortable_name = ComputedCollationField("name")

Customer.objects.create(name="Ale")
Customer.objects.create(name="Łukasz")
Customer.objects.create(name="Rachel")

# Ordering by the raw string sorts by code point, so Ł comes after R.
Customer.objects.order_by("name")
# [<Customer Ale>, <Customer Rachel>, <Customer Łukasz>]

# Ordering by the collation key gives the expected alphabetical order.
Customer.objects.order_by("sortable_name")
# [<Customer Ale>, <Customer Łukasz>, <Customer Rachel>]
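The code point ordering in the example above can be reproduced in plain Python, whose default string comparison also works on code points. This sketch only demonstrates the problem; it does not call pyuca or compute a collation key.

```python
names = ["Ale", "Łukasz", "Rachel"]

# Python compares strings by code point, like the ordering on "name".
codepoint_order = sorted(names)

# Ł is U+0141, which is greater than the code point of any ASCII letter,
# so it sorts after both "R" and "Z".
l_stroke_codepoint = ord("Ł")
z_codepoint = ord("Z")
```

Here codepoint_order places "Łukasz" last, which matches the ordering on the name field rather than the ordering on sortable_name.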

Querying on iterable fields

ListField, SetField, RelatedListField and RelatedSetField can all be queried using the __contains and __overlap filters that you can use on the Postgres ArrayField. The __contained_by filter is not currently supported.

Note that for __contains filters, if you're only querying with a single value then you can pass it directly: .filter(iterable_field__contains=value) rather than .filter(iterable_field__contains=[value]).

For the RelatedListField and RelatedSetField you can pass either model instances or primary keys as values to the filter.
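The matching rules of the two filters can be sketched in plain Python, assuming they mirror the ArrayField lookups of the same names: __contains matches when every given value is present, and __overlap matches when at least one is. The function names below are hypothetical and only model the semantics, not how gcloudc builds Datastore queries.

```python
def as_values(lookup):
    """Wrap a single value in a list; leave lists, tuples and sets alone."""
    if isinstance(lookup, (list, tuple, set, frozenset)):
        return list(lookup)
    return [lookup]


def matches_contains(field_values, lookup):
    """True if every looked-up value is present in the field."""
    return set(as_values(lookup)) <= set(field_values)


def matches_overlap(field_values, lookup):
    """True if at least one looked-up value is present in the field."""
    return bool(set(as_values(lookup)) & set(field_values))


things = ["a", "b", "c"]

contains_both = matches_contains(things, ["a", "b"])
contains_single = matches_contains(things, "a")  # single value, no list needed
contains_missing = matches_contains(things, ["a", "z"])
overlaps_one = matches_overlap(things, ["a", "z"])
```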