Concepts & Limitations
Cloud Datastore
It is strongly recommended that you read the Cloud Datastore API documentation before using this ORM backend. Understanding how the Datastore differs from a SQL database will help you avoid surprises!
The Google Cloud Datastore is not a traditional SQL database, so the Datastore backend doesn't support all of the functionality of the Django ORM (although it supports the majority), and some things don't work the way you'd expect. Because the Datastore is a NoSQL database, anything that relies on cross-table queries or aggregates is essentially unsupported.
Here are some of the limitations and differences:
- There is no support for savepoints, so nested atomic() blocks are effectively a no-op
- No support for select_related(), although prefetch_related() works
- No support for cross-table ordering
- Only up to 500 entities can be read or written inside an atomic() block
- No support for aggregate queries (yet)
- Queries can only contain a single inequality operation (gt, lt, lte, gte, isnull=False), and the result set must be ordered by the field you're testing for inequality
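One way to live with the 500-entity limit on atomic() blocks is to split a large write into batches and run each batch in its own transaction. The helper below is a minimal sketch, not part of gcloudc: the name `chunked` and the constant `MAX_ENTITIES_PER_TRANSACTION` are made up for illustration, and the 500 figure is simply the limit quoted in the list above.

```python
from itertools import islice

# Hypothetical constant: the per-transaction limit quoted in the list above.
MAX_ENTITIES_PER_TRANSACTION = 500


def chunked(iterable, size=MAX_ENTITIES_PER_TRANSACTION):
    """Yield successive lists of at most `size` items from `iterable`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Intended use inside a Django project (not executed here):
#
#     from django.db import transaction
#
#     for batch in chunked(instances):
#         with transaction.atomic():
#             for instance in batch:
#                 instance.save()
```

Note that each batch commits independently, so the write as a whole is no longer atomic: a failure part-way through leaves the earlier batches saved.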
The advantage, of course, is that you can build your Django application for near-infinite data scalability and increased uptime.
Further differences are discussed below.
Foreign Keys
The Cloud Datastore doesn't have the concept of a foreign key constraint, so relations are not enforced at the database level in the same way that they would be on a classic SQL database, but the ForeignKey field is supported and generally works entirely as expected.
Many-to-many relationships
As the Cloud Datastore can't do JOIN queries between tables, Django's ManyToManyField is not supported, but gcloudc provides RelatedSetField and RelatedListField, which provide functionality for many-to-many relationships.
See fields.
Migrations
As the Cloud Datastore is schemaless, the concept of migrations doesn't really apply in the same way, as there's no schema to update. So in most cases, where you would normally need to run a migration on a SQL database (e.g. to add a new table or new column), you can generally do without a migration with gcloudc, because the required schema will just be applied to new rows as they are saved.
There are however a few occasions when you might need to apply some form of "migration". gcloudc does not currently provide functionality for applying these changes, but it may do in the future.
Deleting fields
If you delete a field but do not apply a migration to delete the "column" from the database, then as with a SQL database, Django will ignore any columns which are not defined in the model, and so the existing data for this deleted field will remain in the DB. If you want to delete the data from the old column, then you'll need to do this directly on the Datastore.
Example: you've committed crimes against the internet by storing users' plain-text passwords in your database and need to delete them.
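from django.db import models

class User(models.Model):
    name = models.CharField(max_length=100)  # illustrative length
    # raw_password = models.CharField(max_length=10)

# You've removed the offending field, but the data is still in the entities in the Datastore.
# Let's fix that...
from google.cloud import datastore

client = datastore.Client(project="my-cloud-project-id", namespace=None)
# client.query() only accepts keyword arguments, so the kind is passed by name.
query = client.query(kind=User._meta.db_table, namespace=None)
for entity in query.fetch():
    # Skip entities that never had the property, to avoid a KeyError and a pointless write.
    if "raw_password" in entity:
        del entity["raw_password"]
        client.put(entity)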
Making new fields queryable
If you add a new field to a model, then so long as you have provided a default for the field, you can immediately create/update model instances with this new field without the need to run any sort of migration.
However, if you want to be able to query the existing objects on this new field, then you will first need to re-save the existing objects in order to populate the new value into the rows in the DB. Without doing this, rows in which the new field has not yet been saved will not be returned by the query.
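The re-save step can be as simple as loading every object and calling save() on it. The function below is a minimal sketch rather than a gcloudc feature: `resave_all` is a made-up name, and it only assumes that each item has a save() method, as Django model instances do.

```python
def resave_all(instances):
    """Call save() on every instance and return how many were saved."""
    saved = 0
    for instance in instances:
        instance.save()
        saved += 1
    return saved


# Intended use inside a Django project, where MyModel is a hypothetical
# model that has just gained a new field (not executed here):
#
#     resave_all(MyModel.objects.all().iterator())
```

For a large number of objects you would probably want to run this from a background task or management command rather than inside a web request.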
Complex queries
The Cloud Datastore can only perform queries using an index. In other words, it won't do a table scan. As such, it can generally only perform queries which can be done by working its way through a single, ordered index. Therefore some types of queries are not possible, or can only be performed with additional trickery.
Inequalities
Queries which contain multiple inequality filters, e.g. .filter(field_a__gte=x, field_b__gte=y), are not possible.
Inequalities are any of __gt, __lt, __lte, __gte, __isnull=False.
Additionally, if your query contains an inequality then it must be ordered by the field you're testing for inequality.
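To make these rules concrete, here is a small, self-contained checker that inspects the keyword arguments you would pass to .filter() and the field names you would pass to .order_by(). It is only an illustration of the rules as described here, not how gcloudc validates queries: the names `inequality_fields` and `query_problems` are made up, and the sketch makes two assumptions: that two bounds on the same field (e.g. a `__gte` and a `__lt`) count as a single inequality, and that "ordered by" means the inequality field comes first in the ordering.

```python
INEQUALITY_LOOKUPS = {"gt", "gte", "lt", "lte"}


def inequality_fields(lookups):
    """Return the set of field names that `lookups` tests with an inequality."""
    fields = set()
    for lookup, value in lookups.items():
        field, separator, suffix = lookup.rpartition("__")
        if not separator:
            continue  # plain "field=value" is an equality filter
        # isnull=False counts as an inequality; isnull=True does not.
        if suffix in INEQUALITY_LOOKUPS or (suffix == "isnull" and value is False):
            fields.add(field)
    return fields


def query_problems(lookups, order_by=()):
    """Return a list of reasons the query breaks the rules described above."""
    problems = []
    fields = inequality_fields(lookups)
    if len(fields) > 1:
        problems.append(
            "inequality filters on more than one field: " + ", ".join(sorted(fields))
        )
    elif fields:
        (field,) = fields
        # A leading "-" only reverses the direction, so strip it before comparing.
        if not order_by or order_by[0].lstrip("-") != field:
            problems.append("query must be ordered by " + field + " first")
    return problems
```

For example, `query_problems({"field_a__gte": 1, "field_b__gte": 2})` reports the two-field inequality, while `query_problems({"field_a__gte": 1}, order_by=("field_a",))` returns an empty list.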