Friday, 21 October 2011

Datastore

Datastore Overview

The App Engine datastore provides robust, scalable storage for your web application, with an emphasis on read and query performance. An application creates entities, with data values stored as properties of an entity. The app can perform queries over entities. All queries are pre-indexed for fast results over very large data sets.

Introducing the Datastore

App Engine provides two different data storage options differentiated by their availability and consistency guarantees:
  • In the High Replication datastore, data is replicated across data centers using a system based on the Paxos algorithm. High Replication provides very high availability for reads and writes (at the cost of higher-latency writes). Most queries are eventually consistent. Storage quota and CPU costs are approximately three times those of the Master/Slave option. The High Replication datastore is the only datastore type supported by Python 2.7.
  • The Master/Slave datastore uses a master-slave replication system, which asynchronously replicates data as you write it to a physical data center. Since only one data center is the master for writing at any given time, this option offers strong consistency for all reads and queries, at the cost of periods of temporary unavailability during data center issues or planned downtime. This option also offers the lowest storage and CPU costs for storing data. Not supported in Python 2.7.
For more information, please see Choosing a Datastore.
The App Engine datastore saves data objects, known as entities. An entity has one or more properties, named values of one of several supported data types. For instance, a property can be a string, an integer, or even a reference to another entity.
The datastore can execute multiple operations in a single transaction. By definition, a transaction cannot succeed unless every operation in the transaction succeeds. If any of the operations fail, the transaction is automatically rolled back. This is especially useful for distributed web applications, where multiple users may be accessing or manipulating the same data at the same time.
Unlike traditional databases, the datastore uses a distributed architecture to automatically manage scaling to very large data sets. It is very different from a traditional relational database in how it describes relationships between data objects. Two entities of the same kind can have different properties. Different entities can have properties with the same name, but different value types. While the datastore interface has many of the same features of traditional databases, the datastore's unique characteristics imply a different way of designing and managing data to take advantage of the ability to scale automatically. This documentation explains how to design your application to take the greatest advantage of the datastore's distributed architecture.

Introducing the Python Datastore API

In Python, datastore entities are created from Python objects; the object's attributes become properties of the entity. To create a new entity, you instantiate a model class (optionally specifying a parent entity), set the object's attributes, then save the object by calling a method such as put(). The name of the model class becomes the name of the entity's kind. Updating an existing entity requires you to get the entity's object (for example, by using a query), modify its attributes, and then save it with the new values.
Datastore entities are schemaless: Two entities of the same kind need not have the same properties; nor do they need to use the same value types for the same properties. The application is responsible for ensuring that entities conform to a schema when needed. For this purpose, the Python SDK includes a rich library of data modeling features that make enforcing a schema easy.
In the Python API, a model describes a kind of entity, including the types and configuration for its properties. An application defines a model using Python classes, with class attributes describing the properties. Entities of the same kind are represented by instances of the corresponding model class, with instance attributes representing the property values. An entity can be created by calling the constructor of the class, then stored by calling the put() method.
import datetime
from google.appengine.ext import db
from google.appengine.api import users
class Employee(db.Model):
    name = db.StringProperty(required=True)
    role = db.StringProperty(required=True, choices=set(["executive", "manager", "producer"]))
    hire_date = db.DateProperty()
    new_hire_training_completed = db.BooleanProperty(indexed=False)
    account = db.UserProperty()

e = Employee(name="John",
             role="manager",
             account=users.get_current_user())
e.hire_date = datetime.datetime.now().date()
e.put()
The datastore API provides two interfaces for queries: a query object interface, and a SQL-like query language called GQL. A query returns entities in the form of instances of the model classes that can be modified and put back into the datastore.
training_registration_list = [users.User("Alfred.Smith@example.com"),
                              users.User("jharrison@example.com"),
                              users.User("budnelson@example.com")]
employees_trained = db.GqlQuery("SELECT * FROM Employee WHERE account IN :1",
                                training_registration_list)
for e in employees_trained:
    e.new_hire_training_completed = True
    db.put(e)

Entities and Properties

A data object in the App Engine datastore is known as an entity. An entity has one or more properties, named values of one of several data types, including integers, floating point values, strings, dates, binary data, and more.
Each entity also has a key that uniquely identifies the entity. The simplest key has a kind and an entity ID provided by the datastore. The kind categorizes the entity so you can query it more easily. The entity ID can also be a string provided by the application.
An application can fetch an entity from the datastore by using its key, or by performing a query that matches the entity's properties. A query can return zero or more entities, and can return the results sorted by property values. A query can also limit the number of results returned by the datastore to conserve memory, run time, and CPU time.
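Both access patterns can be sketched as follows, assuming a minimal Employee model (the kind, property names, and key name here are illustrative):

```python
from google.appengine.ext import db

class Employee(db.Model):
    name = db.StringProperty()
    role = db.StringProperty()

# Fetch a single entity directly by its key (kind plus key name or ID).
employee = db.get(db.Key.from_path('Employee', 'asalieri'))

# Or query by property values, sorted, with a limit on the number of
# results returned by the datastore.
managers = (Employee.all()
            .filter('role =', 'manager')
            .order('name')
            .fetch(limit=20))
```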
Unlike relational databases, the App Engine datastore does not require that all entities of a given kind have the same properties. The application can specify and enforce its data model using libraries included with the SDK, or its own code.
A property can have one or more values. A property with multiple values can have values of mixed types. A query on a property with multiple values tests whether any of the values meets the query criteria. This makes such properties useful for testing for membership.

Queries and Indexes

An App Engine datastore query operates on every entity of a given kind (a data class). It specifies zero or more filters on entity property values and keys, and zero or more sort orders. If a given entity has at least one (possibly null) value for every property in the filters and sort orders, and all the filter criteria are met by the property values, then that entity is returned as a result.
Every datastore query uses an index, a table that contains the results for the query in the desired order. An App Engine application defines its indexes in a configuration file (although indexes for some types of queries are provided automatically). The development web server automatically adds suggestions to this file when it encounters queries that do not yet have indexes configured. You can tune indexes manually by editing the file before uploading the application. As the application changes datastore entities, the datastore updates the indexes with the correct results. When the application executes a query, the datastore fetches the results directly from the corresponding index.
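The configuration file is index.yaml in the application's root directory. A minimal sketch (the kind and property names here are illustrative):

```yaml
indexes:
# A composite index supporting a query such as:
#   SELECT * FROM Employee WHERE role = :1 ORDER BY hire_date DESC
- kind: Employee
  properties:
  - name: role
  - name: hire_date
    direction: desc
```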
This mechanism supports a wide range of queries and is suitable for most applications. However, it does not support some kinds of queries common in other database technologies. In particular, joins and aggregate queries aren't supported.

Transactions and Entity Groups

With the App Engine datastore, every attempt to create, update or delete an entity happens in a transaction. A transaction ensures that every change made to the entity is saved to the datastore, or, in the case of failure, none of the changes are made. This ensures consistency of data within an entity.
You can perform multiple actions on an entity within a single transaction using the transaction API. For example, say you want to increment a counter field in an object. To do so, you need to read the value of the counter, calculate the new value, then store it. Without a transaction, it is possible for another process to increment the counter between the time you read the value and the time you update the value, causing your app to overwrite the updated value. Doing the read, calculation, and write in a single transaction ensures that no other process interferes with the increment.
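The counter scenario above can be sketched with db.run_in_transaction (the Counter model and the 'hits' key name are hypothetical):

```python
from google.appengine.ext import db

class Counter(db.Model):
    count = db.IntegerProperty(default=0)

def increment(key):
    # The get, the computation, and the put all execute atomically; if
    # another process commits to this entity group first, the commit
    # fails and the function is retried.
    counter = db.get(key)
    counter.count += 1
    counter.put()

counter = Counter.get_or_insert('hits')
db.run_in_transaction(increment, counter.key())
```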
You can make changes to multiple entities within a single transaction using entity groups. You declare that an entity belongs to an entity group when you create the entity. For apps using the Master-Slave datastore, all entities fetched, created, updated, or deleted in a transaction must be in the same entity group. For apps using the High Replication datastore (HRD), the entities in a transaction can either be in a single entity group or they can be in different entity groups. (See cross-group transactions).
Entity groups are defined by a hierarchy of relationships between entities. To create an entity in a group, you declare that the entity is a child of another entity already in the group. The other entity is the parent. An entity created without a parent is a root entity. A root entity without any children exists in an entity group by itself. Each entity has a path of parent-child relationships from a root entity to itself (the shortest path being no parent). This path is an essential part of the entity's complete key. A complete key can be represented by the kind and ID or key name of each entity in the path.
The datastore uses optimistic concurrency to manage transactions. While one app instance is applying changes to entities in an entity group, all other attempts to update the group, either by updating existing entities or creating new entities, fail on commit. The app can try the transaction again to apply it to the updated data. Note that because the datastore works this way, using entity groups limits the number of concurrent writes you can do on any entity in that group.

Cross-Group Transactions

Applications using High Replication Datastore (HRD) can perform transactions on entities that belong to different entity groups. This feature is called cross-group transactions, or XG transactions for short. XG transactions give you more flexibility in deciding how to divide your data amongst entity groups because you are not forced to put two disparate pieces of data in the same entity group just because you need atomic writes on that data.
XG transactions can be used across a maximum of 5 entity groups. An XG transaction will succeed as long as no concurrent transaction touches any of the entity groups used in the transaction, which is an extension of the behavior users experience with single-group transactions. An XG transaction that only touches a single entity group will behave like a single entity group transaction. With regard to billing and quota impact, operations within an XG transaction have the same performance and cost as the equivalent single-group transactions, but the commit itself will be slower.
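An XG transaction is requested through transaction options; the following sketch assumes a hypothetical Account model whose two entities may live in different entity groups:

```python
from google.appengine.ext import db

def transfer(from_key, to_key, amount):
    # Both entities are read and written in one atomic unit, even though
    # they belong to different entity groups.
    from_acct = db.get(from_key)
    to_acct = db.get(to_key)
    from_acct.balance -= amount
    to_acct.balance += amount
    db.put([from_acct, to_acct])

xg_on = db.create_transaction_options(xg=True)
db.run_in_transaction_options(xg_on, transfer, from_key, to_key, 100)
```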
Similar to single-entity-group transactions, you cannot perform a non-ancestor query in an XG transaction. However, you can perform ancestor queries on separate entity groups.
Note: The first read of an entity group in an XG transaction may throw a TransactionFailedError exception if there is a conflict with other transactions accessing that entity group. This means that an XG transaction that performs only reads can fail with a concurrency exception.
Non-transactional (non-ancestor) queries may see all, some, or none of the results of a previously committed transaction. (For background on this issue, see Understanding Datastore Writes: Commit, Apply, and Data Visibility.) However, such non-transactional queries are more likely to see the results of a partially committed XG transaction than the results of a partially committed single-entity-group transaction.

Differences From SQL

The App Engine datastore differs from a traditional relational database in several important ways.
The App Engine datastore is designed to scale, allowing apps to maintain high performance as they receive more traffic. Datastore writes scale by automatically distributing data as necessary. Datastore reads scale because the only supported queries are those whose performance scales with the size of the result set (as opposed to the data set). This means that a query whose result set contains 100 entities performs the same whether it searches over a hundred entities or a million entities. This property is the key reason some types of queries are not supported.
Because all queries on App Engine are served by pre-built indexes, the types of queries that can be executed are more restrictive than those allowed on a relational database with SQL. No joins are supported in the datastore. The datastore also does not allow inequality filtering on multiple properties or filtering of data based on results of a sub-query.
Unlike traditional relational databases, the App Engine datastore doesn't require data kinds to have a consistent property set (although you can choose to enforce this requirement in your application's code). When querying the datastore, it is not currently possible to return only a subset of kind properties. The App Engine datastore can either return entire entities or only entity keys from a query.
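The keys-only option can be sketched in both query interfaces (the Employee kind and role property are illustrative):

```python
from google.appengine.ext import db

class Employee(db.Model):
    role = db.StringProperty()

# Query object interface: keys_only skips fetching the full entities.
keys = Employee.all(keys_only=True).filter('role =', 'manager').fetch(100)

# GQL equivalent: select the special __key__ property.
keys = db.GqlQuery("SELECT __key__ FROM Employee WHERE role = :1",
                   'manager').fetch(100)
```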
For more in-depth information on the design of the datastore, read our Mastering the datastore series of articles.

Understanding Datastore Writes: Commit, Apply, and Data Visibility

Note: For a full discussion of this topic, see Life of a Datastore Write and Transaction Isolation in App Engine.
For App Engine apps, data is written to the datastore in two phases: Commit and Apply. The Commit phase occurs first; in it, the entity data is recorded in certain logs. The Apply phase occurs after the Commit phase. The Apply phase consists of two actions done in parallel: (a) the entity data is written, and (b) the index rows for the entity are written. (Notice that it can take longer for the index rows to be written than for the entity data to be written.) For apps using Master-Slave datastore (not recommended), the datastore usually returns after everything is written, that is, after the end of the Apply phase. For apps using High Replication Datastore (HRD), the datastore returns after the Commit phase and then the Apply phase is done asynchronously.
If there is a failure during the Commit phase, there are automatic retries, but if failures continue, the datastore returns an error message that your app receives as an exception. If the Commit phase succeeds but the Apply fails, the Apply is rolled forward to completion when one of the following occurs:
  • Periodic datastore "sweeps" check for uncompleted Commit jobs and applies them.
  • The next app read*, write, or transaction that uses the impacted entity group causes the committed but not applied changes to be applied before the read, write, or transaction.
Note: *For HRD, reads that cause an apply are: get or ancestor query. For Master-Slave, reads that cause an apply are: get or ancestor query in a transaction.

Data Visibility after Datastore Writes

The datastore write behavior described above has an impact on how and when data is visible to your app at different parts of the Commit and Apply phases. Data visibility is usually not an issue for an app using Master-Slave datastore, because the entire transaction is normally completely applied before the datastore returns. However, for apps using HRD the transaction may not be completely applied for a few hundred milliseconds or so after the datastore returns. In this event, data can be visible with updates that are only partially complete. The following list shows how your app might be impacted by this:
  • Subsequent datastore gets, writes, and ancestor queries always reflect the results of the Commit phase because these operations apply any outstanding modifications before executing. This means that a request looking up an updated entity by its key (a get) is guaranteed to see the latest version of that entity, even if the indexes for the entity are not yet completed.
  • Queries spanning more than one entity group cannot determine if there are any outstanding modifications before executing and may return stale or partially applied results.
  • A concurrent request executing a query whose predicate (the 'where clause' in SQL/GQL) is not satisfied by the pre-update entity, but is satisfied by the post-update entity, will include that entity in its result set only if the query executes after the Apply phase has completed, that is, after the indexes are written.

Datastore Statistics

The datastore maintains statistics about the data stored for an application, such as how many entities there are of a given kind, or how much space is used by property values of a given type. You can view these statistics in the Administration Console, under Datastore > Statistics.
You can also access these values programmatically within the application by querying for specially named entities using the datastore API. For more information, see Datastore Statistics.
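Programmatic access goes through the db.stats module; a sketch (the specific statistic kinds available can vary by datastore type):

```python
from google.appengine.ext.db import stats

# Summary statistics across all entities in the application.
global_stat = stats.GlobalStat.all().get()
if global_stat:
    print 'entities: %d, bytes: %d' % (global_stat.count, global_stat.bytes)

# Per-kind statistics.
for kind_stat in stats.KindStat.all():
    print '%s: %d entities' % (kind_stat.kind_name, kind_stat.count)
```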

Quotas and Limits

Each call to the datastore API counts toward the Datastore API Calls quota. Note that some library calls result in multiple calls to the API, and so use more of your quota.
Data sent to the datastore by the app counts toward the Data Sent to (Datastore) API quota. Data received by the app from the datastore counts toward the Data Received from (Datastore) API quota.
The total amount of data currently stored in the datastore for the app cannot exceed the Stored Data (billable) quota. This includes all entity properties and keys and the indexes necessary to support querying these entities. See How Entities and Indexes are Stored for a complete breakdown of the metadata required to store entities and indexes at the Bigtable level.
The amount of CPU time consumed by datastore operations applies to the following quotas:
  • CPU Time (billable)
  • Datastore CPU Time
For more information on quotas, see Quotas, and the "Quota Details" section of the Admin Console.
In addition to quotas, the following limits apply to the use of the datastore:


Limit | Amount
maximum entity size | 1 megabyte
maximum number of values in all indexes for an entity (1) | 5,000 values
  1. An entity uses one value in an index for every column × every row that refers to the entity, in all indexes. The number of index values for an entity can grow large if an indexed property has multiple values, requiring multiple rows with repeated values in the table.






Entities, Properties, and Keys

The App Engine datastore is best understood as an object database. Each data record is an entity, represented in code as an object. Each entity has a key that uniquely identifies the entity across all entities in the datastore. Each entity has one or more named properties, represented as attributes of the object. Keys and properties are the basis of many useful features of the datastore.

Overview

Objects in the App Engine datastore are known as entities. Each entity in the datastore has a key that uniquely identifies the entity across all entities. A key has several components: the entity's kind, its identifier (either a key name string or a system-assigned numeric ID), and an optional ancestor path locating the entity within the datastore hierarchy.
Each entity has one or more properties, named values of one of several data types. Supported data types include integers, floating point values, strings, dates, binary data, and others. A property can have one or more values, and a property with multiple values can have values of mixed types. For details, see Properties and Value Types.
An app can fetch an entity from the datastore using its key, or it can perform a query based on the entity's key or property values. After an entity has been created, its key can't be changed. For more information on queries, see the Queries and Indexes section.
An app can only access entities that it created itself. An app can't access data that belongs to other apps.
The Python SDK includes a data modeling library for representing datastore entities as instances of Python classes, and for storing and retrieving those instances in the datastore. The data modeling features of this API are described in Data Modeling in Python.
Here is a brief example of defining a data class using the data modeling API, creating an instance of this class, and storing it as an entity in the datastore:
import datetime
from google.appengine.ext import db
class Employee(db.Model):
    first_name = db.StringProperty()
    last_name = db.StringProperty()
    hire_date = db.DateProperty()
    attended_hr_training = db.BooleanProperty()
# ...
employee = Employee(first_name='Antonio',
                    last_name='Salieri')
employee.hire_date = datetime.datetime.now().date()
employee.attended_hr_training = True

employee.put()
This example defines a class for modeling data entities. The class is a subclass of db.Model. Instances of this class represent entities of the kind 'Employee', where the name of the kind is derived from the name of the class.
The Employee class also declares four properties for the data model: 'first_name', 'last_name', 'hire_date', and 'attended_hr_training'. The Model superclass ensures that the attributes of Employee objects conform to this model. For example, an attempt to assign a string value to the hire_date attribute would result in a runtime error, since the data model for hire_date was declared as a db.DateProperty.
Note: The datastore itself does not enforce any restrictions on the structure of entities, such as whether a given property has a value of a particular type. This task is left to the application code and the data modeling library. For more information, see Data Modeling in Python.

Kinds, IDs, and Names

Each datastore entity is of a particular kind, which is simply a name specified by the app. A kind categorizes the entity for the purpose of queries. For example, a Human Resources application might represent each employee at a company with an entity of the kind "Employee." Unlike rows in a table, two entities of the same kind need not have the same properties. If required, an app can establish such a restriction in its data model.
In addition to a kind, each entity has an identifier. The identifier is assigned in one of two ways:
  • An app can assign its own identifier (called the key name) for use in the key.
  • An app can have the datastore assign a numeric ID when the entity is first stored. The datastore will never pick the same ID for two entities with the same parent, or for two entities without a parent. Be aware, however, that the datastore is not guaranteed to avoid app-assigned IDs. It is possible, though unlikely, that the datastore will assign a numeric ID that will cause a conflict with an entity with an app-assigned ID. The only way to avoid this problem is to have your app use allocate_ids() to obtain a batch of IDs. The datastore's automatic ID generator will not use IDs that have been obtained using allocate_ids(), so your app can use these IDs without conflict.
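Reserving a batch of IDs can be sketched with db.allocate_ids (the Employee kind is illustrative; the ID of 1 in the path is only a placeholder identifying the kind and parent):

```python
from google.appengine.ext import db

# Reserve 10 IDs for entities of kind Employee with no parent.
# Returns the first and last ID of the allocated range, inclusive.
first, last = db.allocate_ids(db.Key.from_path('Employee', 1), 10)

# IDs in first..last are now guaranteed never to be assigned by the
# datastore's automatic ID generator.
key = db.Key.from_path('Employee', first)
```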
Because the identifier is part of the key, you cannot change the key name (or datastore-assigned ID) once the entity has been created.
The kind of the entity is derived from the name of a Python class that you define in your app. This class must be a subclass of db.Model. In the example above, the Employee class defines a data model for entities of the kind 'Employee'.
You specify whether an entity ought to use an app-assigned key name string or a system-assigned numeric ID as its identifier when you create the object. To set a key name, provide the key_name named argument to the model class constructor:
# Create an entity with the key Employee:'asalieri'.
employee = Employee(key_name='asalieri')
To specify that the datastore should assign a unique numeric ID automatically, simply omit the argument:
# Create an entity with a key such as Employee:8261.
employee = Employee()
Note: In addition to arguments such as key_name, the Model constructor accepts initial values for properties as keyword arguments. This makes it inconvenient to have a property named key_name. For more information, see The Model Class: Disallowed Property Names.

Entity Groups and Ancestor Paths

When you use the App Engine datastore, every attempt to create, update, or delete an entity happens in a transaction. A transaction ensures that every change made to the entity is saved to the datastore, or, in the case of failure, none of the changes are made. This ensures consistency of data within an entity. This section discusses transactions briefly. Details on transactions are available in the Transactions section.
Note: If your app receives an exception when submitting a transaction, it does not always mean that the transaction failed. You can receive Timeout, TransactionFailedError, or InternalError exceptions in cases where transactions have been committed and eventually will be applied successfully. Whenever possible, make your datastore transactions idempotent so that if you repeat a transaction, the end result will be the same.
You can perform a single transaction with multiple entities as long as the entities belong to the same entity group. When you design your data model, you should determine which entities you want to be able to process in the same transaction. Then, when you create each entity, you declare that it belongs to the same group as another entity. This tells App Engine that the entities will be updated together, so they can be stored in a way that supports transactions. You define entity groups by specifying a hierarchy among entities. All the entities fetched, created, updated, or deleted in a single transaction must be in the same entity group.
In the High Replication Datastore, entity groups are also a unit of consistency. The only way to guarantee that the results of a query are strongly consistent is to use an ancestor query.
To create an entity in a group, you declare that another entity is the parent of the new entity when you create it. An entity created without a parent is a root entity. A root entity without any children is an entity group by itself. Each entity's key contains a path of entities starting with the entity group root, which is just the entity itself when the entity has no parent. This path is an essential part of the entity's complete key. A complete key can be represented by the kind and ID (or key name) of each entity in the path.
An entity that is a parent for another entity can also have a parent. A chain of parent entities from an entity up to the root is the path for the entity, and members of the path are the entity's ancestors. The parent of an entity is defined when the entity is created, and cannot be changed later.
To specify that an entity be created in an existing entity group, provide the parent argument to the model class constructor. The value of this argument can be the model class instance of the parent entity, or a db.Key value that represents the entity:
employee = Employee()
employee.put()

# Create an Address in the same entity group as the Employee by setting
# the Employee as the Address's parent.
address = Address(parent=employee)

# ...using the db.Key of the Employee.
e_key = employee.key()
address = Address(parent=e_key)
The complete key of an entity is the key of the parent, if any, followed by the kind of the entity, followed by the key name or system ID. The key of the parent may also include a parent, so the complete key represents the complete path of ancestors from the root entity for the group to the entity itself. In this example, if the key for the Employee entity is Employee:8261, the key for the address may be:
Employee:8261 / Address:1

Properties and Value Types

The data for each entity is stored in one or more properties. Each property has a name and at least one value. Each value is of one of several supported data types, such as Unicode string, integer, date-time, or byte string.
A property can have multiple values. Each value can be of a different type. Multiple values are useful, for example, when you perform queries with equality filters. When using equality filters, if any value of the property matches the filter, the query will return the entity. For more details on multi-valued properties, including issues you should be aware of, see Queries and Indexes.
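The membership-style matching described above can be sketched with a list property (the Recipe model and tag values here are illustrative):

```python
from google.appengine.ext import db

class Recipe(db.Model):
    tags = db.StringListProperty()

recipe = Recipe(tags=['vegan', 'quick', 'dessert'])
recipe.put()

# The equality filter matches the entity because ANY of its tags
# values equals 'vegan'.
vegan_recipes = Recipe.all().filter('tags =', 'vegan').fetch(10)
```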
The following table describes the distinct property value types supported by the datastore:
Value type | Python type | Sort order | Notes
Boolean | bool | False < True |
Byte string, short | db.ByteString | byte order | up to 500 bytes
Byte string, long | db.Blob | n/a | up to 1 megabyte; not indexed
Category | db.Category | Unicode |
Date and time | datetime.datetime | chronological |
Email address | db.Email | Unicode |
Floating point number | float | numeric | 64-bit double precision, IEEE 754
Geographical point | db.GeoPt | by latitude, then longitude |
Google Accounts user | users.User | email address in Unicode order |
Integer | int or long | numeric | 64-bit integer, signed
Key, Blobstore | blobstore.BlobKey | byte order |
Key, datastore | db.Key | by path elements (kind, ID or name, kind, ID or name...) |
Link | db.Link | Unicode |
Messaging handle | db.IM | Unicode |
Null | None | n/a |
Postal address | db.PostalAddress | Unicode |
Rating | db.Rating | numeric |
Telephone number | db.PhoneNumber | Unicode |
Text string, short | str or unicode | Unicode (str stored as ASCII) | up to 500 Unicode characters
Text string, long | db.Text | n/a | up to 1 megabyte; not indexed
It's possible for two entities to have values of different types for the same property. The datastore uses a deterministic ordering for values of mixed types based on the internal representations:
  • Null values
  • integer, date-time, and rating
  • Boolean values
  • Byte string (short)
  • Unicode strings: text strings (short), category, email address, IM handle, link, telephone number, postal address
  • floating point numbers
  • geographical points
  • Google Account users
  • datastore keys
  • Blobstore keys
Long text strings and long byte strings are not indexed by the datastore, and so have no ordering defined.
Note: Integer values and floating point numbers are considered separate types in the datastore. If entities use a mix of integers and floats for the same property, all integers sort before all floats; for example, 7 < 3.2 in this ordering.

Text Strings and Byte Strings

The datastore supports two value types for storing Unicode text: short text strings up to 500 Unicode characters in length, and long text strings up to 1 megabyte in length. Short strings are indexed and can be used in query filter conditions and sort orders. Long strings are not indexed and cannot be used in filter conditions or sort orders.
The datastore supports two similar value types for unencoded binary data: short byte strings up to 500 bytes in length, and long byte strings up to 1 megabyte in length. As with text strings, short byte strings are indexed, and can be used in query filters or sort orders; long byte strings are not indexed and cannot be used in query filters or sort orders.
Note: The long byte string type is referred to as "Blob" in the datastore API. This type is not related to blobs as used by the Blobstore API.

Saving, Getting, and Deleting Entities

Apps use the datastore API to create new entities, update existing entities, get entities, and delete entities. If the app knows the complete key for an entity (or can derive it from its parent key, kind, and ID), the app can act directly on the entity using the key. An app can also perform a query to determine the keys of entities whose properties meet certain criteria. See Queries and Indexes for more information about datastore queries.
In Python, you create a new entity by creating a new instance of a model class and calling the instance's put() method. If the object was not created with a key_name argument passed to the constructor, calling put() populates the object's key with a system-generated ID.
        employee = Employee(first_name='Antonio',
                            last_name='Salieri')
        employee.hire_date = datetime.datetime.now().date()
        employee.attended_hr_training = True

        employee.put()
You can also call the db.put() function with a model instance to save the instance.
To get an entity with a given key, call the db.get() function with the Key object. You can produce the Key object using the Key.from_path() class method. The complete path is a sequence of entities in the ancestor path, with each entity represented by the kind (a string) followed by the key name string or numeric ID:
        address_k = db.Key.from_path('Employee', 'asalieri', 'Address', 1)

        address = db.get(address_k)
db.get() returns an instance of the appropriate model class. Be sure that the model class for the entity being fetched has been imported.
To update an existing entity, modify the attributes of the object, then call the put() method. The object data overwrites the existing entity. The entire object is sent to the datastore with every call to put().
Note: The datastore API does not distinguish between creating a new entity and updating an existing entity. If the object's key represents an entity that exists, calling its put() method overwrites the entity. You can use a transaction to test whether an entity with a given key exists before creating one. See also the get_or_insert() method of the Model class.
Tip: To delete a property, delete the attribute from the Python object, then save the object: del address.postal_code
You can delete an entity given its key by calling the db.delete() function, or by calling the delete() method of the model instance.
        address_k = db.Key.from_path('Employee', 'asalieri', 'Address', 1)
        db.delete(address_k)

        employee_k = db.Key.from_path('Employee', 'asalieri')
        employee = db.get(employee_k)

        # ...

        employee.delete()

Batch Operations

The db.put(), db.get(), and db.delete() functions can accept multiple parameters to act on multiple entities in a single datastore call. A batch call to the datastore is faster than making separate calls for each entity because it only incurs the overhead of one service call.
        # A batch put.
        db.put([e1, e2, e3])

        # A batch get.
        entities = db.get([k1, k2, k3])

        # A batch delete.
        db.delete([k1, k2, k3])
A batch call to db.put() or db.delete() may succeed for some entities but not others. If it is important that the call succeed completely or fail completely, you must use a transaction, and all affected entities must be in the same entity group.

Deleting Entities in Bulk via the Admin Console

You can use the Datastore Admin tab of the Admin Console to delete all entities of a kind, or all entities of all kinds, in the default namespace. To enable this feature, simply include the datastore_admin builtin handler in app.yaml:
builtins:
- datastore_admin: on
Adding this builtin enables the Datastore Admin screen in the Data section of the Admin Console. From this screen, you can select the entity kind(s) to delete individually or in bulk, and delete them using the Delete Entities button.
Warning! Deleting entities in bulk happens within your application, and thus counts against your quota.
This feature is currently experimental. We believe this feature is currently the fastest way to bulk-delete data, but it is not yet stable and you may encounter occasional bugs.

Understanding Write Costs

When your application executes a datastore put(), the datastore must perform a number of writes to persist the entity. Your application is charged for each one of these writes. You can easily see the number of writes required to persist an entity by looking at the data viewer in the SDK Development Console. In this section we explain how App Engine calculates these values.

Built In Property Indexes

Every entity requires a minimum of 2 writes to persist: 1 for the entity itself and 1 for the built-in EntitiesByKind index, which is used by the query planner to service a variety of queries. There is no way to avoid these 2 index writes. All other index writes, however, are completely under your control.
In addition to the EntitiesByKind index, the datastore maintains 2 other built-in indexes: EntitiesByProperty and EntitiesByPropertyDesc. These indexes provide efficient scans of entities by single property values in ascending and descending order, respectively. Each indexed property value on your entity requires 1 write to each of these indexes. Let's take an entity with properties 'A', 'B', and 'C' as an example:
Key: 'Foo:1' (kind = 'Foo', id = 1, no parent)
A: 1
B: null
C: 'this', 'that'
Assuming we don't have any composite indexes for entities of this kind (more on this later), this entity requires 10 writes to persist. Why 10? Let's break it down. First, we have the 2 unavoidable writes for the entity itself and the EntitiesByKind index. Then we have 2 writes for property 'A', 2 writes for property 'B' (a null value still requires a write), and 4 writes for property 'C' (2 for each value in the list), so 2 + 2 + 2 + 4 = 10.
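The arithmetic above can be sketched as a small helper. The function name and the dict-of-value-lists representation are illustrative, not part of the datastore API:

```python
def builtin_write_cost(properties):
    """Writes needed to persist an entity with no composite indexes:
    1 for the entity + 1 for EntitiesByKind, plus 2 writes per indexed
    property value (EntitiesByProperty asc + EntitiesByPropertyDesc)."""
    cost = 2  # the entity itself + the EntitiesByKind index
    for values in properties.values():
        cost += 2 * len(values)  # one write per value per built-in index
    return cost

# The sample entity: A = 1, B = null, C = ['this', 'that']
foo = {'A': [1], 'B': [None], 'C': ['this', 'that']}
print(builtin_write_cost(foo))  # → 10
```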

Composite Indexes

Composite Indexes (the indexes you define and upload with your application) also require writes to maintain. Suppose you have the same entity we defined in the previous section along with the following composite index:
Kind: 'Foo'
A ▲, B ▼
When we put this entity we need 1 additional write for this composite index, bringing us to a total of 11. Now let's add property 'C' to this composite index:
Kind: 'Foo'
A ▲, B ▼, C ▼
In order to maintain this index we need to perform one write for every combination of 'A', 'B', and 'C', which is (1, null, 'this') and (1, null, 'that'), so 2 writes for the composite index instead of 1, giving us a grand total of 2 + 2 + 2 + 4 + 2 = 12. Maintaining indexes that involve multi-value properties can get particularly expensive if there is more than one multi-value property (or a single multi-value property is referenced more than once). These are called exploding indexes, and they can be very expensive to maintain.
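The entry count for a composite index is simply the product of the value counts of its properties, which is why multi-value properties make it grow. A sketch with illustrative names:

```python
from itertools import product

def composite_index_entries(values_by_property, index_properties):
    """One index row per combination of values of the indexed properties."""
    return list(product(*(values_by_property[p] for p in index_properties)))

foo = {'A': [1], 'B': [None], 'C': ['this', 'that']}

# Index on (A, B): a single combination, so 1 extra write.
print(len(composite_index_entries(foo, ['A', 'B'])))       # → 1
# Index on (A, B, C): one combination per value of C, so 2 extra writes.
print(len(composite_index_entries(foo, ['A', 'B', 'C'])))  # → 2
```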
Now let's update our composite index to include ancestors:
Kind: 'Foo'
A ▲, B ▼, C ▼
Ancestor: True
Creating our sample entity in the datastore with this index present requires the same number of writes as creating it in the presence of the non-ancestor version of the index. However, if we give our sample entity a parent, or a grandparent, then the difference becomes clear:
Key: 'FooGrandpa:1/FooPa:1/Foo:1' (kind = 'Foo', id = 1, parent = 'FooGrandpa:1/FooPa:1')
A: 1
B: null
C: 'this', 'that'

In order to maintain the ancestor version of this composite index, we need one write for every combination of 'A', 'B', and 'C' for every entity in the ancestor path: (1, null, 'this', 'FooGrandpa:1'), (1, null, 'that', 'FooGrandpa:1'), (1, null, 'this', 'FooGrandpa:1/FooPa:1'), (1, null, 'that', 'FooGrandpa:1/FooPa:1'), (1, null, 'this', 'FooGrandpa:1/FooPa:1/Foo:1'), and (1, null, 'that', 'FooGrandpa:1/FooPa:1/Foo:1'). That's 6 writes for the composite index, giving us a grand total of 2 + 2 + 2 + 4 + 6 = 16.
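In other words, the ancestor option multiplies the combination count by the number of entities in the key path. A sketch, again with illustrative names and a string-based key path:

```python
from itertools import product

def ancestor_index_rows(values_by_property, index_properties, key_path):
    """One row per (property-value combination, ancestor-path prefix)."""
    combos = list(product(*(values_by_property[p] for p in index_properties)))
    # Prefixes: 'FooGrandpa:1', 'FooGrandpa:1/FooPa:1', '.../Foo:1'
    prefixes = ['/'.join(key_path[:i + 1]) for i in range(len(key_path))]
    return [combo + (prefix,) for combo in combos for prefix in prefixes]

foo = {'A': [1], 'B': [None], 'C': ['this', 'that']}
rows = ancestor_index_rows(foo, ['A', 'B', 'C'],
                           ['FooGrandpa:1', 'FooPa:1', 'Foo:1'])
print(len(rows))  # → 6 (2 combinations x 3 path entities)
```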

Queries and Indexes

Overview

A query retrieves datastore entities that meet a specified set of conditions. The query specifies an entity kind, zero or more conditions based on entity property values (sometimes called "filters"), and zero or more sort order descriptions. When the query is executed, it fetches all entities of the given kind that meet all of the given conditions, sorted in the order described.
The Master/Slave Datastore and the High Replication Datastore have different guarantees when it comes to query consistency. By default:
  • The Master/Slave datastore is strongly consistent for all queries.
  • The High Replication datastore is strongly consistent by default for queries within an entity group. With the High Replication Datastore, non-ancestor queries are always eventually consistent.
The datastore Python API provides two interfaces for preparing and executing queries: the Query interface, which uses methods to prepare the query, and the GqlQuery interface, which uses a SQL-like query language called GQL to prepare the query from a query string.
class Person(db.Model):
    first_name = db.StringProperty()
    last_name = db.StringProperty()
    city = db.StringProperty()
    birth_year = db.IntegerProperty()
    height = db.IntegerProperty()
# The Query interface prepares a query using instance methods.
q = Person.all()
q.filter("last_name =", "Smith")
q.filter("height <", 72)
q.order("-height")
# The GqlQuery interface prepares a query using a GQL query string.
q = db.GqlQuery("SELECT * FROM Person " +
                "WHERE last_name = :1 AND height < :2 " +
                "ORDER BY height DESC",
                "Smith", 72)
# The query is not executed until results are accessed.
results = q.fetch(5)
for p in results:
    print "%s %s, %d inches tall" % (p.first_name, p.last_name, p.height)
A filter includes a property name, a comparison operator, and a value. An entity passes the filter if it has a property of the given name and its value compares to the given value as described by the operator. The entity is a result for the query if it passes all of its filters.
The filter operator can be any of the following:
  • < less than
  • <= less than or equal to
  • = equal to
  • > greater than
  • >= greater than or equal to
  • != not equal to
  • IN equal to any of the values in the provided list
The != operator actually performs two queries: one where all other filters are the same and the not-equal filter is replaced with a less-than filter, and one where the not-equal filter is replaced with a greater-than filter. The results are merged, in order. As described below in the discussion of inequality filters, a query can only have one not-equal filter, and such a query cannot have other inequality filters.
The IN operator also performs multiple queries, one for each item in the provided list value where all other filters are the same and the IN filter is replaced with an equal-to filter. The results are merged, in the order of the items in the list. If a query has more than one IN filter, the query is performed as multiple queries, one for each combination of values in the IN filters.
A single query containing != or IN operators is limited to 30 sub-queries.
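The fan-out into sub-queries can be counted directly: each IN filter multiplies the count by its list length, and each != filter multiplies it by 2. The (property, operator, value) tuple representation below is illustrative, not the datastore API:

```python
def count_subqueries(filters):
    """Count the sub-queries a query with IN and != filters expands into."""
    count = 1
    for prop, op, value in filters:
        if op == 'IN':
            count *= len(value)   # one sub-query per list item
        elif op == '!=':
            count *= 2            # less-than query + greater-than query
    return count

filters = [('last_name', '=', 'Smith'),
           ('city', 'IN', ['London', 'Paris', 'Rome']),
           ('height', '!=', 72)]
print(count_subqueries(filters))  # → 6, well under the 30 sub-query limit
```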
A query can also return just the keys of the result entities instead of the entities themselves.
query = db.Query(Person, keys_only=True)

Kindless Queries

Queries with no kind can be used to find all entities in your datastore. This includes entities created and managed by other App Engine features, including statistics entities (which exist for all apps) and Blobstore metadata entities (if any). When you include a key filter, kindless queries return all entities in a specific key range. Kindless queries cannot include filters on properties.

Ancestor Queries

You can filter your datastore queries to a specified ancestor, so that the results include only entities descended from that ancestor. In other words, every result has the ancestor as its parent, its parent's parent, and so on. Passing None as the ancestor does not query for entities without ancestors; it raises an error.
q = db.Query()

q.ancestor(ancestor_key)

Kindless Ancestor Queries

Using GQL or the Python query interface, you can perform queries for entities with a given ancestor regardless of kind. Such queries can also include equality and inequality filters on __key__. Kindless queries cannot include sort orders (and thus the results are unordered) or filters on properties.
To perform a kindless ancestor query using the Query class, call the constructor without a kind class:
q = db.Query()

q.ancestor(ancestor_key)
q.filter('__key__ >', last_seen_key)
To perform a kindless ancestor query using GQL (either in the Administrator Console or using the GqlQuery class), omit the FROM Kind clause:
q = db.GqlQuery('SELECT * WHERE ANCESTOR IS :1 AND __key__ > :2', ancestor_key, last_seen_key)
Kindless ancestor queries do not require custom indexes. You can use query_descendants() to return a query for all descendants of the model instance.

Restrictions on Queries

The nature of the index query mechanism imposes a few restrictions on what a query can do. These restrictions are described in this section.

Filtering Or Sorting On a Property Requires That the Property Exists

If a property has a query filter condition or sort order, the query returns only those datastore entities that have a value (including null) for that property.
Entities of a kind need not have the same properties. A filter on a property can only match an entity with a value for that property. If an entity has no value for a property used in a filter or sort order, that entity is omitted from the index built for the query.
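Why a filtered query can never return an entity that lacks the property can be sketched with a toy index: only entities that have a value for the indexed property get an index row at all. The dict-based entities are illustrative:

```python
def build_index(entities, prop):
    """Index rows exist only for entities that have the property."""
    return sorted((e[prop], key) for key, e in entities.items() if prop in e)

entities = {
    'p1': {'last_name': 'Smith', 'height': 72},
    'p2': {'last_name': 'Jones'},  # no height: never appears in the index
}
index = build_index(entities, 'height')
print(index)  # → [(72, 'p1')] -- 'p2' can never match a height filter
```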

No Use of Filters That Match Entities Missing a Property

It is not possible to query for entities that are missing a given property. One alternative is to create a fixed (modeled) property with a default value of None, then create a filter for entities with None as the property value.

Inequality Filters Are Allowed on One Property Only

A query may only use inequality filters (<, <=, >=, >, !=) on one property across all of its filters.
For example, this query is allowed:
SELECT * FROM Person WHERE birth_year >= :min
                       AND birth_year <= :max
However, the following query is not allowed, because it uses inequality filters on two different properties in the same query:
SELECT * FROM Person WHERE birth_year >= :min_year
                       AND height >= :min_height     # ERROR
Filters can combine equal (=) comparisons for different properties in the same query, including queries with one or more inequality conditions on a property. This is allowed:
SELECT * FROM Person WHERE last_name = :last_name
                       AND city = :city
                       AND birth_year >= :min_year
The query mechanism relies on all results for a query to be adjacent to one another in the index table, to avoid having to scan the entire table for results. A single index table cannot represent multiple inequality filters on multiple properties while maintaining that all results are consecutive in the table.

Properties in Inequality Filters Must Be Sorted before Other Sort Orders

If a query has both a filter with an inequality comparison and one or more sort orders, the query must include a sort order for the property used in the inequality, and the sort order must appear before sort orders on other properties.
This query is not valid, because it uses an inequality filter and does not order by the filtered property:
SELECT * FROM Person WHERE birth_year >= :min_year
                     ORDER BY last_name              # ERROR
Similarly, the following query is not valid because it does not order by the filtered property before ordering by other properties:
SELECT * FROM Person WHERE birth_year >= :min_year
                     ORDER BY last_name, birth_year  # ERROR
This query is valid:
SELECT * FROM Person WHERE birth_year >= :min_year
                     ORDER BY birth_year, last_name
To get all results that match an inequality filter, a query scans the index table for the first matching row, then returns all consecutive results until it finds a row that doesn't match. For the consecutive rows to represent the complete result set, the rows must be ordered by the inequality filter before other sort orders.
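The scan described above can be sketched over a toy index. Because the rows are sorted by the inequality property first, all matches for birth_year >= min are consecutive, so a binary search finds the first one and the rest follow:

```python
import bisect

# Toy index sorted by (birth_year, last_name): the inequality property
# comes first so that all matching rows are adjacent.
rows = [(1950, 'Adams'), (1960, 'Smith'), (1965, 'Brown'), (1970, 'Jones')]

def scan(index_rows, min_year):
    """Return the consecutive rows with birth_year >= min_year."""
    start = bisect.bisect_left(index_rows, (min_year,))
    return index_rows[start:]  # every later row also matches

print(scan(rows, 1960))  # → [(1960, 'Smith'), (1965, 'Brown'), (1970, 'Jones')]
```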

Properties With Multiple Values Can Have Surprising Behaviors

Due to the way they're indexed, properties with multiple values interact with query filters and sort orders in specific and sometimes surprising ways.
First, if a query has multiple inequality filters on a given property, an entity matches the query only if it has an individual value for that property that matches all of the inequality filters. For example, if an entity has the values [1, 2] for property x, it will not match the query WHERE x > 1 AND x < 2. Each filter does match one of x's values, but no single value matches both filters.
(Note that this does not apply to = filters. For example, the query WHERE x = 1 AND x = 2 will return the above entity.)
The != operator works as a "value is other than" test. So, for example, the filter x != 1 matches an entity with a value greater than or less than 1.
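These matching rules can be sketched directly: all inequality filters must be satisfied by one individual value, while each equality filter may be satisfied by a different value. The helper names are illustrative:

```python
def matches_inequalities(values, predicates):
    """True if some single value satisfies every inequality filter."""
    return any(all(p(v) for p in predicates) for v in values)

def matches_equalities(values, targets):
    """True if each equality target is matched by some value
    (not necessarily the same one)."""
    return all(t in values for t in targets)

x = [1, 2]
print(matches_inequalities(x, [lambda v: v > 1, lambda v: v < 2]))  # → False
print(matches_equalities(x, [1, 2]))                                # → True
print(matches_inequalities(x, [lambda v: v != 1]))                  # → True
```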
Similarly, the sort order for multiply valued properties is unusual:
  • If the entities are sorted by a multi-valued property in ascending order, the smallest value is used for ordering.
  • If the entities are sorted by a multi-valued property in descending order, the greatest value is used for ordering.
  • Other values do not affect the sort order, nor does the number of values.
This ordering takes place because multivalued properties appear in an index once for each unique value they contain, but the datastore removes duplicate values from the index, so the first value seen in the index determines its sort order.
This sort order has the unusual consequence that [1, 9] comes before [4, 5, 6, 7] in both ascending and descending order.
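This behavior amounts to sorting by the smallest value when ascending and by the greatest value when descending, which can be reproduced in plain Python:

```python
a, b = [1, 9], [4, 5, 6, 7]

# Ascending: the smallest value of each list decides (1 < 4).
ascending = sorted([a, b], key=min)
# Descending: the greatest value of each list decides (9 > 7).
descending = sorted([a, b], key=max, reverse=True)

print(ascending)   # → [[1, 9], [4, 5, 6, 7]]
print(descending)  # → [[1, 9], [4, 5, 6, 7]]
```

[1, 9] comes first in both orders, matching the consequence described above.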

Sort Orders Are Ignored on Properties With Equality Filters

One important caveat is queries with both an equality filter and a sort order on a multi-valued property. In those queries, the sort order is disregarded. For single-valued properties, this is a simple optimization. Every result would have the same value for the property, so the results do not need to be sorted further.
However, multi-valued properties may have additional values. Since the sort order is disregarded, the query results may be returned in a different order than if the sort order were applied. (Restoring the dropped sort order would be expensive and require extra indices, and this use case is rare, so the query planner leaves it off.)

Query Ordering when Sort Order Is Unspecified

When a query does not specify an order, the datastore returns the results in the order of retrieval. As we make changes to the datastore implementation, the order of results in an unordered query may change as well. Therefore, if you require a specific sort ordering for your query results, be sure to specify it in the query. If you don't specify a sort order, the ordering of results is undefined and may change over time.

Queries Inside Transactions Must Include Ancestor Filters

Queries are only supported inside transactions if they include an ancestor filter. The query's ancestor must be in the same entity group as the other operations in the transaction. This preserves the restriction that a transaction can only operate on entities in a single entity group.

Fetching Results

After constructing your query, you can specify a number of fetch options to control which results get returned for your query.
If you want to return only a single entity matching your query, you can use the query method get(). This will return the first result that matches the query.
Key-only queries return just the keys to the entities that match your query from the datastore. Key-only queries run faster and consume less CPU than queries that return complete entities. To return only the keys, when constructing the query object, set keys_only=True.
You can specify a limit and offset with your query to control the number and range of results returned in one batch. Specifying an integer limit returns up to that number of results that match the query. Using an integer offset will skip that number of results and return the rest, up to the limit specified. For example, to fetch the tallest five people from your datastore, you would construct your query as follows:
# Prepare a query.
q = Person.all()
q.order("-height")
# The query is not executed until results are accessed.
results = q.fetch(5)
for p in results:
    print "%s %s, %d inches tall" % (p.first_name, p.last_name, p.height)
If you wanted the 6th through 10th tallest people, you would instead use:
# Prepare a query.
# The query is not executed until results are accessed.
results = q.fetch(limit=5, offset=5)
When iterating through the results of your query, the datastore fetches them in batches. By default, each batch contains 20 results, and you cannot change the size of this result set. You can continue iterating through query results until all are returned or the request times out.

Introduction to Indexes

Every datastore query uses an index, a table containing the results for the query in the desired order. An App Engine application defines its indexes in a configuration file named index.yaml. The development web server automatically adds suggestions to this file as it encounters queries that do not yet have indexes configured. You can tune indexes manually by editing the file before uploading the application.
Note: You can get the list of indexes for your application at runtime by calling get_indexes or get_indexes_async.
The index-based query mechanism supports most common kinds of queries, but it does not support some queries common in other database technologies. Restrictions on queries, and their explanations, are described below.
The datastore maintains an index for every query an application intends to make. As the application makes changes to datastore entities, the datastore updates the indexes with the correct results. When the application executes a query, the datastore fetches the results directly from the corresponding index.
The datastore executes a query using the following steps:
  1. The datastore identifies the index that corresponds with the query's kind, filter properties, filter operators, and sort orders.
  2. The datastore starts scanning the index at the first entity that meets all of the filter conditions using the query's filter values.
  3. The datastore continues to scan the index, returning each entity, until it finds the next entity that does not meet the filter conditions, reaches the end of the index, or has collected the maximum number of results requested by the query.
An index table contains columns for every property used in a filter or sort order. The rows are sorted by the following aspects, in order:
  • ancestors
  • property values used in equality filters
  • property values used in inequality filters
  • property values used in sort orders
This puts all results for every possible query that uses this index in consecutive rows in the table.
An index contains an entity only if the index refers to every property in the entity. If the index does not reference a property of the entity, that entity will not appear in the index, and will never be a result for the query that uses the index.
Note that the App Engine datastore makes a distinction between an entity that does not possess a property and an entity that possesses a property with a value of None. If you want every entity of a kind to be a potential result for a query, you can use a data model that assigns a default value (such as None) to properties used by query filters.

Unindexed Properties

Queries can't find property values that aren't indexed. This includes properties that are marked as not indexed, as well as properties with values of the long text value type (Text) or the long binary value type (Blob).
A query with a filter or sort order on a property will never match an entity whose value for the property is a Text or Blob, or which was written with that property marked as not indexed. Properties with such values behave as if the property is not set with regard to query filters and sort orders.
If you have a property that you know you will not need to filter or sort by, make it an unindexed property. This will decrease the number of datastore writes needed, because the datastore will not need to maintain index entries for the property. To specify that a property not be indexed, set indexed=False in the Property constructor.

Mixed Types

When two entities have properties of the same name but of different value types, an index of the property sorts the entities first by value type, then by an order appropriate to the type. For example, if two entities each have a property named "age," one with an integer value and one with a string value, the entity with the integer value always appears before the entity with the string value when sorted by the "Age" property, regardless of the values themselves.
This is especially worth noting in the case of integers and floating point numbers, which are treated as separate types by the datastore. A property with the integer value 38 is sorted before a property with the floating point value 37.5, because all integers are sorted before floats.

Which Queries Need Indexes

App Engine builds indexes for several simple queries by default. For other queries, the application must specify the indexes it needs in a configuration file named index.yaml. If the application running under App Engine tries to perform a query for which there is no corresponding index (either provided by default or described in index.yaml), the query fails with a NeedIndexError.
App Engine suggests automatic indexes for the following forms of queries:
  • queries using only equality and ancestor filters
  • queries using only inequality filters (which can only be on a single property)
  • kindless queries using only ancestor/key filters
  • queries with no filters and only one sort order on a property, either ascending or descending
  • queries using only equality filters on properties, and inequality filters on keys and ancestor filters
Other forms of queries require their indexes to be specified in index.yaml, including:
  • queries with multiple sort orders
  • queries with a sort order on keys in descending order
  • queries with one or more inequality filters on a property and one or more equality filters over other properties
  • queries with inequality filters and ancestor filters
Note: The App Engine SDK suggests indexes that are appropriate for most applications. Depending on your application's use of the datastore and the size and shape of your application's data, manual adjustments to your indexes may be warranted. For example, writing entities with multiple property values may result in an exploding index with high write costs. For more information, see Big Entities and Exploding Indexes.
The development web server makes managing index configuration easy: Instead of failing to execute a query that does not have an index and requires it, the development web server can generate configuration for an index that would allow the query to succeed. If your local testing of an application calls every possible query the application will make, using every combination of filter and sort order, the generated entries will represent a complete set of indexes. If your testing does not exercise every possible query form, you can review and adjust the index configuration before uploading the application.

Big Entities and Exploding Indexes

As described above, every property (that doesn't have a Text or Blob value) of every entity is added to at least one index table, including a simple index provided by default, and any indexes described in the application's index.yaml file that refer to the property. For an entity that has one value for each property, App Engine stores a property value once in its simple index, and once for each time the property is referred to in a custom index. Each of these index entries must be updated every time the value of the property changes, so the more indexes that refer to the property, the more datastore CPU time required to update the property.
The datastore limits the number of index entries that a single entity can have. The index entry limit is large, and most applications are not affected. However, there are some circumstances where you might encounter the limit. For example, an entity with a large number of single-value properties can exceed the index entry limit.
Properties with multiple values store each value as a separate entry in an index. An entity with a single property with a large number of values can exceed the index entry limit.
Custom indexes that refer to multiple properties with multiple values can get very large with only a few values. To completely record such properties, the index table must include a row for every permutation of the values of every property for the index.
For example, consider the following query:
SELECT * FROM MyModel WHERE x=1 AND y=2 ORDER BY date
This query causes the SDK to suggest the following index:
indexes:
- kind: MyModel
  properties:
  - name: x
  - name: y
  - name: date
The following code creates an entity with four values for the property x, three values for the property y, and sets the date to the current date:
class MyModel(db.Expando):
  pass

e2 = MyModel()
e2.x = [1, 2, 3, 4]
e2.y = ['red', 'green', 'blue']
e2.date = datetime.datetime.now()
e2.put()
To accurately represent multiple properties with multiple values, the index must store a value for each permutation of values contained in the properties x, y, and date, as modeled by the following equation:
index_value_count = |x| * |y| * |date|
These indexes are called "exploding indexes" because they can become very large with just a few values. Exploding indexes cause entity writes to cost more because you're writing more data. They can also very easily cause an entity to exceed the per-entity index value limit.
For example, the code snippet above generates the following values for x, y, and date:
x = [1, 2, 3, 4]
y = ['red', 'green', 'blue']
date = <now>
This index stores the following permutations:
Table 1. Property values saved to the datastore in an exploding index defined by the SDK.
x value    y value    date value
1          'blue'     <now>
1          'green'    <now>
1          'red'      <now>
2          'blue'     <now>
2          'green'    <now>
2          'red'      <now>
3          'blue'     <now>
3          'green'    <now>
3          'red'      <now>
4          'blue'     <now>
4          'green'    <now>
4          'red'      <now>
Total: 12 values saved to the datastore.
You can avoid exploding indexes by defining these indexes instead in index.yaml:
indexes:
- kind: MyModel
  properties:
  - name: x
  - name: date
- kind: MyModel
  properties:
  - name: y
  - name: date
With the above index, the number of values stored for the same query is additive:
index_value_count = |x| * |date| + |y| * |date|
Table 2. Property values saved to the datastore in a normal index defined in index.yaml.
x value    date value    y value    date value
1          <now>         'red'      <now>
2          <now>         'green'    <now>
3          <now>         'blue'     <now>
4          <now>
Total: 7 values saved to the datastore.
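The two index shapes compare as plain arithmetic, mirroring the equations above:

```python
# Value counts from the example entity: 4 values of x, 3 of y, 1 date.
x_count, y_count, date_count = 4, 3, 1

# Single (x, y, date) index: multiplicative (exploding).
exploding = x_count * y_count * date_count
# Separate (x, date) and (y, date) indexes: additive.
additive = x_count * date_count + y_count * date_count

print(exploding, additive)  # → 12 7
```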
The App Engine SDK can only detect exploding indexes when the same property is repeated multiple times. In this case, the SDK suggests an alternative index. However, in all other circumstances (such as indexes generated for queries like the one in this example), the SDK generates an exploding index. In this circumstance, you can manually configure an index in index.yaml to circumvent the exploding index.
If a put() would result in a number of index entries that exceeds the limit, the call fails with an exception. If you create a new index that would contain a number of index entries that exceeds the limit for any entity when built, queries against the index fail, and the index appears in the "Error" state in the Admin Console.
To handle "Error" indexes, first remove them from your index.yaml file and run appcfg.py vacuum_indexes. Then, either reformulate the index definition and corresponding queries or remove the entities that are causing the index to "explode." Finally, add the index back to index.yaml and run appcfg.py update_indexes.
You can avoid exploding indexes by avoiding queries that would require a custom index using a list property. As described above, this includes queries with multiple sort orders or queries with a mix of equality and inequality filters.

Queries and Indexes

An application has an index for each combination of kind, filter property, and operator, as well as sort order used in a query. For example, see the following query, stated in GQL:
SELECT * FROM Person WHERE last_name = "Smith"
                       AND height < 72
                  ORDER BY height DESC
The index for this query is a table of keys for entities of the kind Person, with columns for the values of the height and last_name properties. The index is sorted by height in descending order.
Two queries of the same form but with different filter values use the same index. For example, the following query uses the same index as the query above:
SELECT * FROM Person WHERE last_name = "Jones"
                       AND height < 63
                     ORDER BY height DESC
The following two queries use the same index as well, despite their different forms:
SELECT * FROM Person WHERE last_name = "Friedkin"
                       AND first_name = "Damian"
                     ORDER BY height DESC
SELECT * FROM Person WHERE last_name = "Blair"
                  ORDER BY first_name, height DESC

Queries on Keys

Keys are ordered first by parent path, then by kind, then by key name or ID. Kinds and key names are strings and are ordered by byte value. IDs are integers and are ordered numerically. If entities of the same parent and kind use a mix of key name strings and numeric IDs, entities with numeric IDs are considered to be less than entities with key name strings. Elements of the parent path are compared similarly: by kind (string), then by key name (string), or ID (number).
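These ordering rules can be modeled with a plain-Python sort key (a sketch of the comparison logic, not SDK code): tagging numeric IDs as (0, id) and key names as (1, name) makes all IDs sort before all key names within the same parent and kind.

```python
def key_sort_value(parent_path, kind, id_or_name):
    """Build a sortable tuple mirroring datastore key ordering.

    parent_path is a tuple of (kind, id_or_name) pairs. Each
    id-or-name element is tagged so numeric IDs sort before
    key-name strings.
    """
    def element(value):
        if isinstance(value, int):
            return (0, value)   # numeric IDs order numerically, first
        return (1, value)       # key names order by byte value, after IDs

    path = tuple((k, element(v)) for k, v in parent_path)
    return (path, kind, element(id_or_name))

keys = [
    ((), 'Person', 'bob'),
    ((), 'Person', 42),
    ((('Group', 1),), 'Person', 7),
    ((), 'Animal', 'cat'),
]
keys.sort(key=lambda k: key_sort_value(*k))
# Root entities come first (shorter parent path), ordered by kind,
# then numeric IDs before key names: Animal/'cat', Person/42,
# Person/'bob', then the Group-parented Person.
```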
Queries involving keys use indexes just like queries involving properties. Queries on keys require custom indexes in the same cases as queries on properties, with a couple of exceptions: inequality filters or an ascending sort order on __key__ do not require a custom index, but a descending sort order on __key__ does. As with all queries, the development web server creates appropriate configuration entries in index.yaml when a query that needs a custom index is tested.

Query Cursors

Query cursors allow an app to perform a query and retrieve a batch of results, then fetch additional results for the same query in a subsequent web request without the overhead of a query offset. After the app fetches some results for a query, it can ask for an encoded string that represents the location in the result set after the last result fetched (the "cursor"). The app can use the cursor to fetch additional results starting from that point at a later time.
A cursor is an opaque base64-encoded string that represents the next starting position of a query after a fetch operation. The app can embed the cursor in web pages as HTTP GET or POST parameters, or it can store the cursor in the datastore or memcache, or in a task queue task payload. A future request handler can perform the same query and include the cursor with the query to tell the datastore to start returning results from the location represented by the cursor. A cursor can only be used by the app that performed the original query, and can only be used to continue the same query.
Tip: It is generally safe to pass a datastore cursor to a client, such as in a web form, and accept a cursor value from a client. A client cannot change the cursor value to access results outside of the original query. However, the base64-encoded value can be decoded to expose information about result entities, such as the key (app ID, kind, key name or ID, and all ancestor keys) and properties used in the query (including filters and sort orders). If you don't want users to have access to that information, you can encrypt the cursor, or store it and provide the user with an opaque key.
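One lightweight way to detect a modified cursor is to sign it before sending it to a client, using only the standard library. This sketch (the secret key and helper names are hypothetical) makes tampering detectable; it does not hide the cursor's contents, which still requires encryption or server-side storage as described above:

```python
import hashlib
import hmac

SECRET = b'replace-with-a-real-app-secret'  # hypothetical secret key

def sign_cursor(cursor):
    """Append an HMAC so a tampered cursor can be rejected."""
    mac = hmac.new(SECRET, cursor.encode('utf-8'), hashlib.sha256)
    return cursor + '.' + mac.hexdigest()

def verify_cursor(signed):
    """Return the original cursor, or raise if it was modified."""
    cursor, _, mac = signed.rpartition('.')
    expected = hmac.new(SECRET, cursor.encode('utf-8'), hashlib.sha256)
    if not hmac.compare_digest(mac, expected.hexdigest()):
        raise ValueError('cursor was modified')
    return cursor
```

The request handler would call verify_cursor() on the value received from the client before passing it to with_cursor().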

Cursors and Data Updates

The cursor's position is defined as the location in the result list after the last result returned. A cursor is not a relative position in the list (it's not an offset); it's a marker to which the datastore can jump when starting an index scan for results. If the results for a query change between uses of a cursor, the query notices only changes that occur in results after the cursor. If a new result appears before the cursor's position for the query, it will not be returned when the results after the cursor are fetched. Similarly, if an entity is no longer a result for a query but had appeared before the cursor, the results that appear after the cursor do not change. If the last result returned is removed from the result set, the cursor still knows how to locate the next result.
An interesting application of cursors is to monitor entities for unseen changes. If the app sets a timestamp property with the current date and time every time an entity changes, the app can use a query sorted by the timestamp property, ascending, with a datastore cursor to check when entities are moved to the end of the result list. If an entity's timestamp is updated, the query with the cursor returns the updated entity. If no entities were updated since the last time the query was performed, no results are returned, and the cursor does not move.
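The monitoring pattern can be sketched without the SDK by treating the cursor as "the position after the last result" in a timestamp-sorted list (fetch_after is a hypothetical helper standing in for a query with a cursor):

```python
def fetch_after(rows, cursor):
    """Return rows, in timestamp order, that fall after the cursor.

    rows: list of (timestamp, name) pairs. cursor: the last result
    previously seen, or None to start from the beginning.
    """
    ordered = sorted(rows)
    if cursor is None:
        return ordered
    return [r for r in ordered if r > cursor]

rows = [(1, 'a'), (2, 'b'), (3, 'c')]
first = fetch_after(rows, None)   # initial query sees all three
cursor = first[-1]                # cursor: position after (3, 'c')

# Nothing changed: the query with the cursor returns no results.
assert fetch_after(rows, cursor) == []

# Updating 'a' moves it to the end of the timestamp order...
rows = [(4, 'a'), (2, 'b'), (3, 'c')]
# ...so the same cursor now picks up just the updated entity.
assert fetch_after(rows, cursor) == [(4, 'a')]
```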
When retrieving query results, you can use both a start cursor and an end cursor to return a continuous group of results from the datastore. When using a start and end cursor to retrieve the results, you are not guaranteed that the size of the results will be the same as when you generated the cursors. Entities may be added or deleted from the datastore between the time the cursors are generated and when they are used in a query.
In Python, the app gets the cursor after fetching results by calling the cursor() method of the Query object. To fetch additional results, the app prepares a similar query (with the same filters, sort orders and ancestors), and calls the with_cursor() method with the cursor before executing the query.
from google.appengine.api import memcache
from google.appengine.ext import db
# class Person(db.Model): ...
# Start a query for all Person entities.
people = Person.all()
# If the app stored cursors during a previous request, use them.
start_cursor = memcache.get('person_start_cursor')
end_cursor = memcache.get('person_end_cursor')
if start_cursor or end_cursor:
    people.with_cursor(start_cursor=start_cursor, end_cursor=end_cursor)
# Iterate over the results.
for person in people:
    # Do something with each person.
    pass
Note that because of how the iterator interface fetches results in batches, getting a cursor may result in an additional call to the datastore to position the cursor where the iterator left off. If using only a start cursor, and if you know how many results you need ahead of time, it's faster to use fetch().

Limitations of Cursors

A few things to know about cursors:
  • You cannot use cursors with queries that use the IN or != filter operators.
  • To use a cursor, the app must perform the same query that provided the cursor, including the same kind, filters and filter values, ancestor filter, and sort orders. There is no way to fetch results using a cursor without setting up the same query.
  • A query with a cursor does not always work as expected when the query uses an inequality filter or a sort order on a property with multiple values. Logic that de-duplicates the multivalued property for the query does not persist between queries, and may return a result more than once.
  • Cursors depend on the index configuration that supports the query. In cases where multiple index configurations can support the same query, changing the index configuration for the query will invalidate cursors for that query. (For example, a query of only equality filters and no sort orders can be supported by the built-in indexes in most cases, and can also use a custom index if one is provided.) Cursors also depend on implementation details that may change with a new App Engine release that invalidates them. If an app attempts to use a cursor that is no longer valid, the datastore raises an exception. In Python, it raises a datastore_errors.BadRequestError.

Setting the Read Policy and Datastore Call Deadline

In order to increase data availability, you can set the datastore read policy so that all reads and queries are eventually consistent. While the API also allows you to explicitly set a strong consistency policy, in the High Replication Datastore non-ancestor queries are always eventually consistent. Setting the read policy to strong consistency for a non-ancestor query in the High Replication Datastore will have no effect.
When you select eventual consistency for a datastore query, the indexes used by the query to gather results are also accessed with eventual consistency. Eventual consistency queries occasionally return entities that don't match the query criteria, while strong consistency queries are always transactionally consistent. (You can use transactions to ensure a consistent result set if the query uses an ancestor filter.) See Transaction Isolation in App Engine for more information on how entities and indexes are updated.
You can set the read policy (strong consistency vs. eventual consistency) and the datastore call deadline when you execute a query. To do this, create a configuration object with these options set, then pass the object to the method that performs the query.
To create the configuration object, you call the function create_config() in the google.appengine.ext.db module. The read_policy argument specifies the read policy, as either db.EVENTUAL_CONSISTENCY or db.STRONG_CONSISTENCY. (The default is strong consistency.) The deadline argument specifies the datastore call deadline, as a number of seconds.
config = db.create_config(deadline=5, read_policy=db.EVENTUAL_CONSISTENCY)
To use this configuration with the fetch() method, pass the object to the method as the config argument:
config = db.create_config(deadline=5, read_policy=db.EVENTUAL_CONSISTENCY)
results = Employee.all().fetch(10, config=config)
To use this configuration with the iterator interface, call the run() method with this object as the config argument to get the iterable:
config = db.create_config(deadline=5, read_policy=db.EVENTUAL_CONSISTENCY)
for result in Kind.all().run(config=config):
    # ...
The get() and count() methods of Query and GqlQuery also support the config argument in this way.
A configuration object can be used any number of times.

Transactions

The App Engine datastore supports transactions. A transaction is an operation or set of operations that is atomic—either all of the operations in the transaction occur, or none of them occur. An application can perform multiple operations and calculations in a single transaction.

Using Transactions

A transaction is a set of datastore operations on one or more entities. Each transaction is guaranteed to be atomic, which means that transactions are never partially applied. Either all of the operations in the transaction are applied, or none of them are applied.
An operation may fail due to a high rate of contention when too many users try to modify an entity group at the same time. Or an operation may fail due to the application reaching a quota limit. Or there may be an internal error with the datastore. In all cases, the datastore API raises an exception.
Note: If your app receives an exception when submitting a transaction, it does not always mean that the transaction failed. You can receive Timeout, TransactionFailedError, or InternalError exceptions in cases where the transaction was in fact committed and will eventually be applied successfully. Whenever possible, make your datastore transactions idempotent so that repeating a transaction produces the same end result.
Transactions are an optional feature of the datastore; you're not required to use transactions to perform datastore operations.
An application can execute a set of statements and datastore operations in a single transaction, such that if any statement or operation raises an exception, none of the datastore operations in the set are applied. The application defines the actions to perform in the transaction using a Python function. The application starts the transaction using one of the run_in_transaction functions, depending on whether the transaction accesses entities within a single entity group or is a cross-group (XG) transaction. (For XG transactions, see Using Cross-Group Transactions.) For transactions on entities within a single entity group, an app calls db.run_in_transaction() with the function as an argument:
from google.appengine.ext import db
class Accumulator(db.Model):
    counter = db.IntegerProperty()
def increment_counter(key, amount):
    obj = db.get(key)
    obj.counter += amount
    obj.put()

q = db.GqlQuery("SELECT * FROM Accumulator")
acc = q.get()

db.run_in_transaction(increment_counter, acc.key(), 5)
db.run_in_transaction() takes the function object, and positional and keyword arguments to pass to the function. If the function returns a value, db.run_in_transaction() returns that value.
If the function returns, the transaction is committed, and all effects of datastore operations are applied. If the function raises an exception, the transaction is "rolled back," and the effects are not applied. See the note above about exceptions.

Using Cross-Group (XG) Transactions

XG transactions, which operate across multiple entity groups, behave similarly to the single-group transactions described above. The main difference is the use of transaction options, as shown in the following snippet:
  from google.appengine.ext import db

  xg_on = db.create_transaction_options(xg=True)

  def my_txn():
    x = MyModel(a=3)
    x.put()
    y = MyModel(a=7)
    y.put()

  db.run_in_transaction_options(xg_on, my_txn)
As shown above, to perform an XG transaction, create a transaction options object with the xg parameter set to True: xg_on = db.create_transaction_options(xg=True). Then run your transaction with db.run_in_transaction_options(xg_on, my_txn).

What Can Be Done In a Transaction

The datastore imposes several restrictions on what can be done inside a single transaction.
All datastore operations in a transaction must operate on entities in the same entity group if the transaction is a single group transaction, or on entities in a maximum of five entity groups if the transaction is a cross-group (XG) transaction. This includes querying for entities by ancestor, retrieving entities by key, updating entities, and deleting entities. Notice that each root entity belongs to a separate entity group, so a single transaction cannot create or operate on more than one root entity unless it is an XG transaction.
When two or more transactions simultaneously attempt to modify entities in one or more common entity groups, only one of those transactions can succeed. While an app is applying changes to entities in one or more entity groups, all other attempts to update any entity in the group or groups fail at commit time. Because of this design, using entity groups limits the number of concurrent writes you can do on any entity in the groups. When a transaction starts, App Engine uses optimistic concurrency control by checking the last update time for the entity groups used in the transaction. Upon committing a transaction for the entity groups, App Engine again checks the last update time for the entity groups used in the transaction. If it has changed since the initial check, App Engine throws an exception. For an explanation of entity groups, see Keys and Entity Groups.
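The last-update-time check amounts to optimistic concurrency control, which can be sketched in plain Python (a simplified model of the commit-time version check, not SDK code):

```python
class ConcurrentModification(Exception):
    pass

def commit(group, read_version, new_data):
    """Apply new_data only if the group is unchanged since it was read."""
    if group['version'] != read_version:
        raise ConcurrentModification('entity group changed; retry')
    group['data'] = new_data
    group['version'] += 1

group = {'version': 0, 'data': {'counter': 0}}

v = group['version']                 # transaction starts: note the version
commit(group, v, {'counter': 1})     # succeeds, bumps version to 1

try:
    commit(group, v, {'counter': 2})  # stale version: conflict detected
except ConcurrentModification:
    pass                              # a real app would retry the transaction
```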
An app can perform a query during a transaction, but only if it includes an ancestor filter. (You can actually perform a query without an ancestor filter, but the results won't reflect any particular transactionally consistent state). An app can also get datastore entities by key during a transaction. You can prepare keys prior to the transaction, or you can build keys inside the transaction with key names or IDs.

Isolation and Consistency

The datastore's isolation level outside of transactions is closest to READ_COMMITTED. Inside transactions, on the other hand, the isolation level is SERIALIZABLE, specifically a form of snapshot isolation. See the Transaction Isolation article for more information on isolation levels.
In a transaction, all reads reflect the current, consistent state of the datastore at the time the transaction started. This does not include previous puts and deletes inside the transaction. Queries and gets inside a transaction are guaranteed to see a single, consistent snapshot of the datastore as of the beginning of the transaction. Entities and index rows in the transaction's entity group are fully updated so that queries return the complete, correct set of result entities, without the false positives or false negatives described in Transaction Isolation that can occur in queries outside of transactions.
This consistent snapshot view also extends to reads after writes inside transactions. Unlike with most databases, queries and gets inside a datastore transaction do not see the results of previous writes inside that transaction. Specifically, if an entity is modified or deleted within a transaction, a query or get returns the original version of the entity as of the beginning of the transaction, or nothing if the entity did not exist then.

Uses for Transactions

This example demonstrates one use of transactions: updating an entity with a new property value relative to its current value.
def increment_counter(key, amount):
    obj = db.get(key)
    obj.counter += amount
    obj.put()
Warning! The above sample depicts transactionally incrementing a counter only for the sake of simplicity. If your app has counters that are updated frequently, you should not increment them transactionally, or even within a single entity. A best practice for working with counters is to use a technique known as counter-sharding.
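The sharding technique mentioned in the warning spreads increments across N counter rows so that concurrent writers rarely contend on the same one; reading the total means summing the shards. A minimal in-memory sketch (the real pattern stores each shard as its own datastore entity):

```python
import random

NUM_SHARDS = 20
shards = [0] * NUM_SHARDS  # in the real pattern, one entity per shard

def increment():
    # Each write picks a random shard, so concurrent writers usually
    # update different rows instead of contending on a single entity.
    shards[random.randrange(NUM_SHARDS)] += 1

def get_count():
    # Reading the counter sums all shards.
    return sum(shards)

for _ in range(100):
    increment()
print(get_count())  # 100
```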
This requires a transaction because the value may be updated by another user after this code fetches the object, but before it saves the modified object. Without a transaction, the user's request uses the value of count prior to the other user's update, and the save overwrites the new value. With a transaction, the application is told about the other user's update. If the entity is updated during the transaction, then the transaction is retried until all steps are completed without interruption.
Another common use for transactions is to update an entity with a named key, or create it if it doesn't yet exist:
class SalesAccount(db.Model):
    address = db.PostalAddressProperty()
    phone_number = db.PhoneNumberProperty()
def create_or_update(parent_obj, account_id, address, phone_number):
    obj = db.get(Key.from_path("SalesAccount", account_id, parent=parent_obj))
    if not obj:
        obj = SalesAccount(key_name=account_id,
                           parent=parent_obj,
                           address=address,
                           phone_number=phone_number)
    else:
        obj.address = address
        obj.phone_number = phone_number

    obj.put()
As before, a transaction is necessary to handle the case where another user is attempting to create or update an entity with the same string ID. Without a transaction, if the entity does not exist and two users attempt to create it, the second overwrites the first without knowing that it happened. With a transaction, the second attempt retries, notices that the entity now exists, and updates the entity instead.
When a transaction fails, you can have your app retry the transaction until it succeeds, or you can let your users deal with the error by propagating it to your app's user interface level. You do not have to create a retry loop around every transaction.
Create-or-update is so useful that there is a built-in method for it: Model.get_or_insert() takes a key name, an optional parent, and arguments to pass to the model constructor if an entity of that name and path does not exist. The get attempt and the create happen in one transaction, so (if the transaction is successful) the method always returns a model instance that represents an actual entity.
Tip: A transaction should happen as quickly as possible to reduce the likelihood that the entities used by the transaction will change, causing the transaction to fail. As much as possible, prepare data outside of the transaction, then execute the transaction to perform datastore operations that depend on a consistent state. The application should prepare keys for objects used outside the transaction, then fetch the entities inside the transaction.
Finally, you can use a transaction to read a consistent snapshot of the datastore. This can be useful when multiple reads are needed to render a page or export data that must be consistent. These kinds of transactions are often called read-only transactions, since they perform no writes. Read-only single-group transactions never fail due to concurrent modifications, so you don't have to implement retries upon failure. However, XG transactions can fail due to concurrent modifications, so these should have retries. Committing and rolling back a read-only transaction are both no-ops.
class Customer(db.Model):
    user = db.UserProperty()
class Account(db.Model):
    """An Account has a Customer as its parent."""
    address = db.PostalAddressProperty()
    balance = db.FloatProperty()
def get_all_accounts():
    """Returns a consistent view of the current user's accounts."""
    accounts = []
    for customer in Customer.all().filter('user =', users.get_current_user()):
        accounts.extend(Account.all().ancestor(customer))
    return accounts

Transactional Task Enqueuing

You can enqueue a task as part of a datastore transaction, such that the task is only enqueued—and guaranteed to be enqueued—if the transaction is committed successfully. If the transaction does not get committed, the task is guaranteed not to be enqueued. If the transaction does get committed, the task is guaranteed to be enqueued. Once enqueued, the task is not guaranteed to execute immediately, so the task is not atomic with the transaction. Still, once enqueued, the task will retry until it succeeds. This applies to any task enqueued during a run_in_transaction() function.
Transactional tasks are useful because they allow you to combine non-datastore actions to a transaction that depend on the transaction succeeding (such as sending an email to confirm a purchase). You can also tie datastore actions to the transaction, such as to commit changes to entity groups outside of the transaction if and only if the transaction succeeds.
An application cannot insert more than five transactional tasks into task queues during a single transaction. Transactional tasks must not have user-specified names.

def do_something_in_transaction(...):
  taskqueue.add(url='/path/to/my/worker', transactional=True)
  ...

db.run_in_transaction(do_something_in_transaction, ....)
Data Modeling in Python

Overview

A datastore entity has a key and a set of properties. An application uses the datastore API to define data models, and create instances of those models to be stored as entities. Models provide a common structure to the entities created by the API, and can define rules for validating property values.

Model Classes

The Model Class

An application describes the kinds of data it uses with models. A model is a Python class that inherits from the Model class. The model class defines a new kind of datastore entity and the properties the kind is expected to take.
Model properties are defined using class attributes on the model class. Each class attribute is an instance of a subclass of the Property class, usually one of the provided property classes. A property instance holds configuration for the property, such as whether or not the property is required for the instance to be valid, or a default value to use for the instance if none is provided.
from google.appengine.ext import db
class Pet(db.Model):
    name = db.StringProperty(required=True)
    type = db.StringProperty(required=True, choices=set(["cat", "dog", "bird"]))
    birthdate = db.DateProperty()
    weight_in_pounds = db.IntegerProperty()
    spayed_or_neutered = db.BooleanProperty()
    owner = db.UserProperty(required=True)
An entity of one of the defined entity kinds is represented in the API by an instance of the corresponding model class. The application can create a new entity by calling the constructor of the class. The application accesses and manipulates properties of the entity using attributes of the instance. The model instance constructor accepts initial values for properties as keyword arguments.
from google.appengine.api import users

pet = Pet(name="Fluffy",
          type="cat",
          owner=users.get_current_user())
pet.weight_in_pounds = 24
Note: The attributes of the model class are configuration for the model properties, whose values are Property instances. The attributes of the model instance are the actual property values, whose values are of the type accepted by the Property class.
The Model class uses the Property instances to validate values assigned to the model instance attributes. Property value validation occurs when a model instance is first constructed, and when an instance attribute is assigned a new value. This ensures that a property can never have an invalid value.
Because validation occurs when the instance is constructed, any property that is configured to be required must be initialized in the constructor. In this example, name, type, and owner are all required values, so their initial values are specified in the constructor. weight_in_pounds is not required by the model, so it starts out unassigned, then is assigned a value later.
An instance of a model created using the constructor does not exist in the datastore until it is "put" for the first time.
Note: As with all Python class attributes, model property configuration is initialized when the script or module is first imported. Because App Engine caches imported modules between requests, module configuration may be initialized during a request for one user, and re-used during a request for another. Do not initialize model property configuration, such as default values, with data specific to the request or the current user. See App Caching for more information.

The Expando Class

A model defined using the Model class establishes a fixed set of properties that every instance of the class must have (perhaps with default values). This is a useful way to model data objects, but the datastore does not require that every entity of a given kind have the same set of properties.
Sometimes it is useful for an entity to have properties that aren't necessarily like the properties of other entities of the same kind. Such an entity is represented in the datastore API by an "expando" model. An expando model class subclasses the Expando superclass. Any value assigned to an attribute of an instance of an expando model becomes a property of the datastore entity, using the name of the attribute. These properties are known as dynamic properties. Properties defined using Property class instances in class attributes are fixed properties.
An expando model can have both fixed and dynamic properties. The model class simply sets class attributes with Property configuration objects for the fixed properties. The application creates dynamic properties when it assigns them values.
class Person(db.Expando):
    first_name = db.StringProperty()
    last_name = db.StringProperty()
    hobbies = db.StringListProperty()

p = Person(first_name="Albert", last_name="Johnson")
p.hobbies = ["chess", "travel"]

p.chess_elo_rating = 1350

p.travel_countries_visited = ["Spain", "Italy", "USA", "Brazil"]
p.travel_trip_count = 13
Because dynamic properties do not have model property definitions, dynamic properties are not validated. Any dynamic property can have a value of any of the datastore base types, including None. Two entities of the same kind can have different types of values for the same dynamic property, and one can leave a property unset that the other sets.
Unlike fixed properties, dynamic properties need not exist. A dynamic property with a value of None is different from a non-existent dynamic property. If an expando model instance does not have an attribute for a property, the corresponding data entity does not have that property. You can delete a dynamic property by deleting the attribute.
del p.chess_elo_rating
Attributes whose names begin with an underscore (_) are not saved to the datastore entity. This allows you to store values on the model instance for temporary internal use without affecting the data saved with the entity.
Note: Fixed (static) properties are always saved to the datastore entity, regardless of whether the model is a Model or an Expando and regardless of whether the property name begins with an underscore (_).
A query that uses a dynamic property in a filter returns only entities whose value for the property is of the same type as the value used in the query. Similarly, the query returns only entities with that property set.
p1 = Person()
p1.favorite = 42
p1.put()

p2 = Person()
p2.favorite = "blue"
p2.put()

p3 = Person()
p3.put()

people = db.GqlQuery("SELECT * FROM Person WHERE favorite < :1", 50)
# people has p1, but not p2 or p3

people = db.GqlQuery("SELECT * FROM Person WHERE favorite > :1", 50)
# people has no results
Note: The above sample uses queries across entity groups. In the High Replication datastore, queries across entity groups may return stale results. For strongly consistent results, use ancestor queries within entity groups.
The Expando class is a subclass of the Model class, and inherits all of its methods.

The PolyModel Class

The Python API includes another class for data modeling that allows you to define hierarchies of classes, and perform queries that can return entities of a given class or any of its subclasses. Such models and queries are called "polymorphic," because they allow instances of one class to be results for a query of a parent class.
The following example defines a Contact class, with the subclasses Person and Company:
from google.appengine.ext import db
from google.appengine.ext.db import polymodel
class Contact(polymodel.PolyModel):
    phone_number = db.PhoneNumberProperty()
    address = db.PostalAddressProperty()
class Person(Contact):
    first_name = db.StringProperty()
    last_name = db.StringProperty()
    mobile_number = db.PhoneNumberProperty()
class Company(Contact):
    name = db.StringProperty()
    fax_number = db.PhoneNumberProperty()
This model ensures that all Person entities and all Company entities have phone_number and address properties, and queries for Contact entities can return either Person or Company entities. Only Person entities have mobile_number properties.
The subclasses can be instantiated just like any other model class:
p = Person(phone_number='1-206-555-9234',
           address='123 First Ave., Seattle, WA, 98101',
           first_name='Alfred',
           last_name='Smith',
           mobile_number='1-206-555-0117')
p.put()

c = Company(phone_number='1-503-555-9123',
            address='P.O. Box 98765, Salem, OR, 97301',
            name='Data Solutions, LLC',
            fax_number='1-503-555-6622')
c.put()
A query for Contact entities can return instances of Contact, Person, or Company. The following code prints information for both entities created above:
for contact in Contact.all():
    print 'Phone: %s\nAddress: %s\n\n' % (contact.phone_number,
                                          contact.address)
A query for Company entities returns only instances of Company:
for company in Company.all():
    # ...
For now, polymorphic models should not be passed to the Query class constructor directly. Instead, use the all() method, as in the example above.
For more information on how to use polymorphic models, and how they are implemented, see The PolyModel Class.

Property Classes and Types

The datastore supports a fixed set of value types for entity properties, including Unicode strings, integers, floating point numbers, dates, entity keys, byte strings (blobs), and various GData types. Each of the datastore value types has a corresponding Property class provided by the google.appengine.ext.db module.
Types and Property Classes describes all of the supported value types and their corresponding Property classes. Several special value types are described below.

Strings and Blobs

The datastore supports two value types for storing text: short text strings up to 500 characters in length, and long text strings up to one megabyte in length. Short strings are indexed and can be used in query filter conditions and sort orders. Long strings are not indexed and cannot be used in filter conditions or sort orders.
A short string value can be either a unicode value or a str value. If the value is a str, an encoding of 'ascii' is assumed. To specify a different encoding for a str value, you can convert it to a unicode value with the unicode() type constructor, which takes the str and the name of the encoding as arguments. Short strings can be modeled using the StringProperty class.
class MyModel(db.Model):
    string = db.StringProperty()

obj = MyModel()
# Python Unicode literal syntax fully describes characters in a text string.
obj.string = u"kittens"
# unicode() converts a byte string to a Unicode string using the named codec.
obj.string = unicode("kittens", "latin-1")
# A byte string is assumed to be text encoded as ASCII (the 'ascii' codec).
obj.string = "kittens"
# Short string properties can be used in query filters.
results = db.GqlQuery("SELECT * FROM MyModel WHERE string = :1", u"kittens")
A long string value is represented by a db.Text instance. Its constructor takes either a unicode value, or a str value and optionally the name of the encoding used in the str. Long strings can be modeled using the TextProperty class.
class MyModel(db.Model):
    text = db.TextProperty()

obj = MyModel()
# Text() can take a Unicode string.
obj.text = u"lots of kittens"
# Text() can take a byte string and the name of an encoding.
obj.text = db.Text("lots of kittens", encoding="latin-1")
# If no encoding is specified, a byte string is assumed to be ASCII text.
obj.text = "lots of kittens"
# Text properties can store large values.
obj.text = db.Text(open("a_tale_of_two_cities.txt").read(), encoding="utf-8")
The datastore also supports two similar types for non-text byte strings: db.ByteString and db.Blob. These values are strings of raw bytes, and are not treated as encoded text (such as UTF-8).
Like db.StringProperty values, db.ByteString values are indexed. Unlike db.TextProperty properties, db.ByteString values are limited to 500 bytes (not characters). A ByteString instance represents a short string of bytes, and takes a str value as an argument to its constructor. Byte strings are modeled using the ByteStringProperty class.
Like db.Text, a db.Blob value can be as large as one megabyte, but is not indexed, and cannot be used in query filters or sort orders. The db.Blob class takes a str value as an argument to its constructor, or you can assign the value directly. Blobs are modeled using the BlobProperty class.
class MyModel(db.Model):
    blob = db.BlobProperty()

obj = MyModel()

obj.blob = open("image.png").read()

Lists

A property can have multiple values, represented in the datastore API as a Python list. The list can contain values of any of the value types supported by the datastore. A single list property may even have values of different types.
Order is generally preserved, so when entities are returned by queries and get(), the list properties values are in the same order as when they were stored. There's one exception to this: Blob and Text values are moved to the end of the list; however, they retain their original order relative to each other.
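This reordering exception can be illustrated outside App Engine with a small plain-Python sketch; the is_big predicate below is a hypothetical stand-in for "is a Blob or Text value":

```python
# Sketch of the list-ordering exception: Blob and Text values move to
# the end of a stored list, but keep their order relative to each other.
def stored_order(values, is_big):
    # is_big is a hypothetical stand-in for "is a Blob or Text value".
    small = [v for v in values if not is_big(v)]
    big = [v for v in values if is_big(v)]
    return small + big

vals = ["a", "BIG1", "b", "BIG2"]
print(stored_order(vals, lambda v: v.startswith("BIG")))
# ['a', 'b', 'BIG1', 'BIG2']
```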
The ListProperty class models a list, and enforces that all values in the list are of a given type. For convenience, the library also provides StringListProperty, similar to ListProperty(basestring).
class MyModel(db.Model):
    numbers = db.ListProperty(long)

obj = MyModel()
obj.numbers = [2, 4, 6, 8, 10]

obj.numbers = ["hello"]  # ERROR: MyModel.numbers must be a list of longs.
A query with filters on a list property tests each value in the list individually. The entity will match the query only if some value in the list passes all of the filters on that property. See Properties With Multiple Values for more.
# Get all entities where numbers contains a 6.
results = db.GqlQuery("SELECT * FROM MyModel WHERE numbers = 6")
# Get all entities where numbers contains at least one element less than 10.
results = db.GqlQuery("SELECT * FROM MyModel WHERE numbers < 10")
Query filters only operate on list members. There is no way to test two lists for similarity in a query filter.
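As a plain-Python sketch (no App Engine SDK, hypothetical dict entities), the matching rule for a filter on a list property — the entity matches if any element in the list satisfies the filter — looks like this:

```python
# Sketch: an entity matches a filter on a multi-valued property if
# ANY value in its list satisfies that filter.
entities = [
    {"numbers": [2, 4, 6, 8, 10]},
    {"numbers": [12, 14]},
]

def matches(entity, prop, predicate):
    # A list-valued property matches when some element passes the filter.
    return any(predicate(v) for v in entity[prop])

# Equivalent of "WHERE numbers = 6"
eq_6 = [e for e in entities if matches(e, "numbers", lambda v: v == 6)]
# Equivalent of "WHERE numbers < 10"
lt_10 = [e for e in entities if matches(e, "numbers", lambda v: v < 10)]

print(len(eq_6))   # 1 (only the first entity contains a 6)
print(len(lt_10))  # 1 (only the first entity has an element below 10)
```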
Internally, the datastore represents a list property value as multiple values for the property. If a list property value is the empty list, then the property has no representation in the datastore. The datastore API treats this situation differently for static properties (with ListProperty) and dynamic properties:
  • A static ListProperty can be assigned the empty list as a value. The property does not exist in the datastore, but the model instance behaves as if the value is the empty list. A static ListProperty cannot have a value of None.
  • A dynamic property with a list value cannot be assigned an empty list value. However, it can have a value of None, and can be deleted (using del).
The ListProperty model tests that a value added to the list is of the correct type, and throws a BadValueError if it isn't. This test occurs (and potentially fails) even when a previously stored entity is retrieved and loaded into the model. Because str values are converted to unicode values (as ASCII text) prior to storage, ListProperty(str) is treated as ListProperty(basestring), the Python data type which accepts both str and unicode values. You can also use StringListProperty() for this purpose.
For storing non-text byte strings, use db.Blob values. The bytes of a blob string are preserved when they are stored and retrieved. You can declare a property that is a list of blobs as ListProperty(db.Blob).
List properties interact in unusual ways with sort orders. See Queries and Indexes: Sort Orders and Properties With Multiple Values for details.

References

A property value can contain the key of another entity. The value is a Key instance.
The ReferenceProperty class models a key value, and enforces that all values refer to entities of a given kind. For convenience, the library also provides SelfReferenceProperty, equivalent to a ReferenceProperty that refers to the same kind as the entity with the property.
Assigning a model instance to a ReferenceProperty property automatically uses its key as the value.
class FirstModel(db.Model):
    prop = db.IntegerProperty()
class SecondModel(db.Model):
    reference = db.ReferenceProperty(FirstModel)

obj1 = FirstModel()
obj1.prop = 42
obj1.put()

obj2 = SecondModel()
# A reference value is the key of another entity.
obj2.reference = obj1.key()
# Assigning a model instance to a property uses the entity's key as the value.
obj2.reference = obj1
obj2.put()
A ReferenceProperty property value can be used as if it were the model instance of the referenced entity. If the referenced entity is not in memory, using the property as an instance automatically fetches the entity from the datastore. A ReferenceProperty also stores a key, but using the property causes the related entity to be loaded.
obj2.reference.prop = 999
obj2.reference.put()

results = db.GqlQuery("SELECT * FROM SecondModel")
another_obj = results.fetch(1)[0]
v = another_obj.reference.prop
If a key points to a non-existent entity, then accessing the property raises an error. If an application expects that a reference could be invalid, it can test for the existence of the object using a try/except block:
try:
    obj1 = obj2.reference
except db.ReferencePropertyResolveError:
    # Referenced entity was deleted or never existed.
    obj1 = None
ReferenceProperty has another handy feature: back-references. When a model has a ReferenceProperty to another model, each referenced entity gets a property whose value is a Query that returns all of the entities of the first model that refer to it.
# To fetch and iterate over every SecondModel entity that refers to the
# FirstModel instance obj1:
for obj in obj1.secondmodel_set:
    # ...
The name of the back-reference property defaults to modelname_set (with the name of the model class in lowercase letters, and "_set" added to the end), and can be adjusted using the collection_name argument to the ReferenceProperty constructor.
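The default naming rule can be sketched in plain Python (an illustration of the convention, not SDK code):

```python
# Default back-reference name: the referencing model's class name,
# lowercased, with "_set" appended.
def default_collection_name(model_class_name):
    return model_class_name.lower() + "_set"

print(default_collection_name("SecondModel"))  # secondmodel_set
```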
If you have multiple ReferenceProperty values that refer to the same model class, the default construction of the back-reference property raises an error:
class FirstModel(db.Model):
    prop = db.IntegerProperty()
# This class raises a DuplicatePropertyError with the message
# "Class Firstmodel already has property secondmodel_set"
class SecondModel(db.Model):
    reference_one = db.ReferenceProperty(FirstModel)
    reference_two = db.ReferenceProperty(FirstModel)
To avoid this error, you must explicitly set the collection_name argument:
class FirstModel(db.Model):
    prop = db.IntegerProperty()
# This class runs fine
class SecondModel(db.Model):
    reference_one = db.ReferenceProperty(FirstModel,
        collection_name="secondmodel_reference_one_set")
    reference_two = db.ReferenceProperty(FirstModel,
        collection_name="secondmodel_reference_two_set")

Automatic referencing and dereferencing of model instances, type checking and back-references are only available using the ReferenceProperty model property class. Keys stored as values of Expando dynamic properties or ListProperty values do not have these features.





GQL Reference

GQL is a SQL-like language for retrieving entities or keys from the App Engine scalable datastore. While GQL's features are different from those of a query language for a traditional relational database, the GQL syntax is similar to that of SQL.
The GQL syntax can be summarized as follows:
SELECT [* | __key__] FROM <kind>
    [WHERE <condition> [AND <condition> ...]]
    [ORDER BY <property> [ASC | DESC] [, <property> [ASC | DESC] ...]]
    [LIMIT [<offset>,]<count>]
    [OFFSET <offset>]

  <condition> := <property> {< | <= | > | >= | = | != } <value>
  <condition> := <property> IN <list>
  <condition> := ANCESTOR IS <entity or key>
As with SQL, GQL keywords are case insensitive. Kind and property names are case sensitive.
A GQL query returns zero or more entities or Keys of the requested kind. Every GQL query always begins with either SELECT * FROM or SELECT __key__ FROM, followed by the name of the kind. (A GQL query cannot perform a SQL-like "join" query.)
Tip: SELECT __key__ queries are faster and cost less CPU than SELECT * queries.
The optional WHERE clause filters the result set to those entities that meet one or more conditions. Each condition compares a property of the entity with a value using a comparison operator. If multiple conditions are given with the AND keyword, then an entity must meet all of the conditions to be returned by the query. GQL does not have an OR operator. However, it does have an IN operator, which provides a limited form of OR.
The IN operator compares the value of a property to each item in a list. The IN operator is equivalent to many = queries, one for each value, ORed together. An entity whose value for the given property equals any of the values in the list can be returned by the query.
Note: The IN and != operators use multiple queries behind the scenes. For example, the IN operator executes a separate underlying datastore query for every item in the list. The entities returned are a result of the cross-product of all the underlying datastore queries and are de-duplicated. A maximum of 30 datastore queries are allowed for any single GQL query.
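A plain-Python sketch (no App Engine SDK, hypothetical dict entities) of this expansion — one equality pass per list item, with duplicates removed by key:

```python
# Sketch of how "WHERE title IN :1" expands into one equality query per
# list item, with the combined results de-duplicated by entity key.
entities = [
    {"key": 1, "title": "a"},
    {"key": 2, "title": "b"},
    {"key": 3, "title": "a"},
]

def run_in_query(items, values):
    seen, results = set(), []
    for value in values:            # one underlying query per value
        for e in items:
            if e["title"] == value and e["key"] not in seen:
                seen.add(e["key"])
                results.append(e)
    return results

# Equivalent of "WHERE title IN :1" bound to ['a', 'b', 'a']
print([e["key"] for e in run_in_query(entities, ["a", "b", "a"])])
# [1, 3, 2]
```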
A condition can also test whether an entity has a given entity as an ancestor, using the ANCESTOR IS operator. The value is a model instance or Key for the ancestor entity. For more information on ancestors, see Keys and Entity Groups.
The left-hand side of a comparison is always a property name. Property names are typically composed of alphanumeric characters, optionally mixed with underscores; in other words, they match the regular expression [a-zA-Z0-9_]+. Property names containing other printable characters must be quoted with double quotes. For example: "first.name". Spaces and non-printable characters in property names are not supported.
The right-hand side of a comparison can be one of the following (as appropriate for the property's data type):
  • a str literal, as a single-quoted string. Single-quote characters in the string must be escaped as ''. For example: 'Joe''s Diner'
  • an integer or floating point number literal. For example: 42.7
  • a Boolean literal, as TRUE or FALSE.
  • the NULL literal, which represents the null value (None in Python).
  • a datetime, date, or time literal, with either numeric values or a string representation, in the following forms:
    • DATETIME(year, month, day, hour, minute, second)
    • DATETIME('YYYY-MM-DD HH:MM:SS')
    • DATE(year, month, day)
    • DATE('YYYY-MM-DD')
    • TIME(hour, minute, second)
    • TIME('HH:MM:SS')
  • an entity key literal, with either a string-encoded key or a complete path of kinds and key names/IDs:
    • KEY('encoded key')
    • KEY('kind', 'name'/ID [, 'kind', 'name'/ID...])
  • a User object literal, with the user's email address:
    USER('email-address')
  • a GeoPt literal, with the latitude and longitude as floating point values:
    GEOPT(lat, long)
  • a bound parameter value. In the query string, positional parameters are referenced by number (title = :1); keyword parameters are referenced by name (title = :mytitle).
Note: A condition of the form property = NULL checks whether a null value is explicitly stored in the datastore for that property. This is not the same as checking whether the entity lacks any value for the property: datastore queries that refer to a property never return entities that have no value for that property.
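The distinction can be sketched in plain Python with hypothetical dict entities: a property = NULL filter matches only entities that explicitly store None, never entities that lack the property entirely.

```python
# Sketch: "nickname = NULL" matches an explicitly stored null value,
# not an entity that has no nickname property at all.
entities = [
    {"name": "a", "nickname": None},  # null stored explicitly
    {"name": "b"},                    # property missing entirely
]

# Equivalent of "WHERE nickname = NULL"
results = [e for e in entities
           if "nickname" in e and e["nickname"] is None]
print([e["name"] for e in results])  # ['a']
```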
Bound parameters can be bound as positional arguments or keyword arguments passed to the GqlQuery constructor or a Model class's gql() method. Property data types that do not have corresponding value literal syntax must be specified using parameter binding, including the list data type. Parameter bindings can be re-bound with new values during the lifetime of the GqlQuery instance (such as to efficiently reuse a query) using the bind() method.
The optional ORDER BY clause indicates that results should be returned sorted by the given properties, in either ascending (ASC) or descending (DESC) order. The ORDER BY clause can specify multiple sort orders as a comma-delimited list, evaluated from left to right. If the direction is not specified, it defaults to ASC. If no ORDER BY clause is specified, the order of the results is undefined and may change over time.
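The left-to-right evaluation of multiple sort orders can be mimicked in plain Python with a compound sort key, negating a numeric value to get descending order (hypothetical dict entities):

```python
# Sketch of "ORDER BY last_name ASC, age DESC" over plain dicts.
people = [
    {"last_name": "Smith", "age": 30},
    {"last_name": "Jones", "age": 25},
    {"last_name": "Smith", "age": 40},
]
# Compound key: compare last_name first, then age in reverse.
ordered = sorted(people, key=lambda p: (p["last_name"], -p["age"]))
print([(p["last_name"], p["age"]) for p in ordered])
# [('Jones', 25), ('Smith', 40), ('Smith', 30)]
```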
An optional LIMIT clause causes the query to stop returning results after the first <count> entities. The LIMIT clause can also include an <offset>, which skips that many results before returning the first one. An optional OFFSET clause can specify an <offset> if no LIMIT clause is present.
Note: Like the offset parameter for the fetch() method, an OFFSET in a GQL query string does not reduce the number of entities fetched from the datastore. It only affects which results are returned by the fetch() method. A query with an offset has performance characteristics that correspond linearly with the offset size plus the limit size.
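The linear cost of an offset can be sketched in plain Python: the skipped rows are still walked, so the work grows with offset + limit.

```python
# Sketch of why OFFSET does not save work: every skipped result is
# still scanned before the first returned result is reached.
def fetch(results, limit, offset=0):
    scanned = 0
    returned = []
    for r in results:          # skipped rows are still touched
        scanned += 1
        if scanned > offset:
            returned.append(r)
        if len(returned) == limit:
            break
    return returned, scanned

rows = list(range(100))
returned, scanned = fetch(rows, limit=5, offset=90)
print(returned)  # [90, 91, 92, 93, 94]
print(scanned)   # 95 rows touched to return 5
```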

For information on executing GQL queries, binding parameters, and accessing results, see the GqlQuery class and the Model.gql() class method.
