Querying API¶

Constructing queries¶

Queries in peewee are constructed one piece at a time.

The “pieces” of a peewee query are generally representative of clauses you might find in a SQL query. Most methods are chainable, so you build your query up one clause at a time. This way, rather complex queries are possible.

Here is a barebones select query:

>>> user_q = User.select() # <-- query is not executed
>>> user_q
<peewee.SelectQuery object at 0x7f6b0810c610>

>>> [u.username for u in user_q] # <-- query is evaluated here
[u'admin', u'staff', u'editor']

We can build up the query by adding some clauses to it:

>>> user_q = user_q.where(username__in=['admin', 'editor'])
>>> user_q = user_q.order_by(('username', 'desc'))
>>> [u.username for u in user_q] # <-- query is re-evaluated here
[u'editor', u'admin']

Django-style queries¶

If you are already familiar with the Django ORM, you can construct SelectQuery instances using the familiar “double-underscore” syntax to generate the proper JOINs and WHERE clauses.

Comparing the two methods of querying¶

Get active users:

User.select().where(active=True)

User.filter(active=True)

Get users who are either staff or superusers:

User.select().where(Q(is_staff=True) | Q(is_superuser=True))

User.filter(Q(is_staff=True) | Q(is_superuser=True))

Get tweets by user named “charlie”:

Tweet.select().join(User).where(username='charlie')

Tweet.filter(user__username='charlie')

Get tweets by staff or superusers (assumes FK relationship):

Tweet.select().join(User).where(
    Q(is_staff=True) | Q(is_superuser=True)
)

Tweet.filter(Q(user__is_staff=True) | Q(user__is_superuser=True))

Where clause¶

All queries except InsertQuery support the where() method. If you are familiar with Django’s ORM, it is analagous to the filter() method.

>>> User.select().where(is_staff=True).sql()
('SELECT * FROM user WHERE is_staff = ?', [1])

Note

User.select() is equivalent to SelectQuery(User).

The where() method acts on the Model that is the current “query context”. This is either:

the model the query class was initialized with
the model most recently JOINed on

Here is an example using JOINs:

>>> User.select().where(is_staff=True).join(Blog).where(status=LIVE)

This query grabs all staff users who have a blog that is “LIVE”. This does the opposite, grabs all the blogs that are live whose author is a staffer:

>>> Blog.select().where(status=LIVE).join(User).where(is_staff=True)

Note

to join() from one model to another there must be a ForeignKeyField linking the two.

Another way to write the above query would be:

>>> Blog.select().where(
...     status=LIVE,
...     user__in=User.select().where(is_staff=True)
... )

The above bears a little bit of explanation. First off the SQL generated will not perform any explicit JOIN - it will rather use a subquery in the WHERE clause:

# using subqueries
SELECT * FROM blog
WHERE (
    status = ? AND
    user_id IN (
        SELECT t1.id FROM user AS t1 WHERE t1.is_staff = ?
    )
)

And here it is using joins:

# using joins
SELECT t1.* FROM blog AS t1
INNER JOIN user AS t2
    ON t1.user_id = t2.id
WHERE
    t1.status = ? AND
    t2.is_staff = ?

Column lookups¶

The other bit that’s unique about the query is that it specifies "user__in". Users familiar with Django will recognize this syntax - lookups other than “=” are signified by a double-underscore followed by the lookup type. The following lookup types are available in peewee:

__eq:: x = y, the default
__lt:: x < y
__lte:: x <= y
__gt:: x > y
__gte:: x >= y
__ne:: x != y
__is:: x IS y, used for testing against NULL values
__contains:: case-sensitive check for substring
__icontains:: case-insensitive check for substring
__in:: x IN y, where y is either a list of values or a SelectQuery

Performing advanced queries¶

As you may have noticed, all the examples up to now have shown queries that combine multiple clauses with “AND”. Taking another page from Django’s ORM, peewee allows the creation of arbitrarily complex queries using a special notation called Q objects.

>>> sq = User.select().where(Q(is_staff=True) | Q(is_superuser=True))
>>> print sq.sql()[0]
SELECT * FROM user WHERE (is_staff = ? OR is_superuser = ?)

Q objects can be combined using the bitwise “or” and “and” operators. In order to negate a Q object, use the bitwise “invert” operator:

>>> staff_users = User.select().where(is_staff=True)
>>> Blog.select().where(~Q(user__in=staff_users))

This query generates the following SQL:

SELECT * FROM blog
WHERE
    NOT user_id IN (
        SELECT t1.id FROM user AS t1 WHERE t1.is_staff = ?
    )

Rather complex lookups are possible:

>>> sq = User.select().where(
...     (Q(is_staff=True) | Q(is_superuser=True)) &
...     (Q(join_date__gte=datetime(2009, 1, 1)) | Q(join_date__lt=datetime(2005, 1 1)))
... )
>>> print sq.sql()[0] # cleaned up
SELECT * FROM user
WHERE (
    (is_staff = ? OR is_superuser = ?) AND
    (join_date >= ? OR join_date < ?)
)

This query selects all staff or super users who joined after 2009 or before 2005.

Note

If you need more power, check out RawQuery

Comparing against column data¶

Suppose you have a model that looks like the following:

class WorkerProfiles(Model):
    salary = IntegerField()
    desired = IntegerField()

What if we want to query WorkerProfiles to find all the rows where “salary” is greater than “desired” (maybe you want to find out who may be looking for a raise)?

To solve this problem, peewee borrows the notion of F objects from the django orm. An F object allows you to query against arbitrary data present in another column:

WorkerProfile.select().where(salary__gt=F('desired'))

That’s it. If the other column exists on a model that is accessed via a JOIN, you will need to specify that model as the second argument to the F object. Let’s supposed that the “desired” salary exists on a separate model:

WorkerProfile.select().join(Desired).where(desired_salary__lt=F('salary', WorkerProfile))

Atomic updates¶

The F object also works for updating data. Suppose you cache counts of tweets for every user in a special table to avoid an expensive COUNT() query. You want to update the cache table every time a user tweets, but do so atomically:

cache_row = CacheCount.get(user=some_user)
update_query = cache_row.update(tweet_count=F('tweet_count') + 1)
update_query.execute()

Aggregating records¶

Suppose you have some blogs and want to get a list of them along with the count of entries in each. First I will show you the shortcut:

query = Blog.select().annotate(Entry)

This is equivalent to the following:

query = Blog.select({
    Blog: ['*'],
    Entry: [Count('id')],
}).group_by(Blog).join(Entry)

The resulting query will return Blog objects with all their normal attributes plus an additional attribute ‘count’ which will contain the number of entries. By default it uses an inner join if the foreign key is not nullable, which means blogs without entries won’t appear in the list. To remedy this, manually specify the type of join to include blogs with 0 entries:

query = Blog.select().join(Entry, 'left outer').annotate(Entry)

You can also specify a custom aggregator:

query = Blog.select().annotate(Entry, peewee.Max('pub_date', 'max_pub_date'))

Conversely, sometimes you want to perform an aggregate query that returns a scalar value, like the “max id”. Queries like this can be executed by using the aggregate() method:

max_id = Blog.select().aggregate(Max('id'))

SQL Functions, “Raw expressions” and the R() object¶

If you’ve been reading in order, you will have already seen the Q and F objects. The R object is the final query helper and its purpose is to allow you to express arbitrary expressions as part of your structured query without having to result to using a RawQuery.

Selecting users whose username begins with “a”:

# select the users' id, username and the first letter of their username, lower-cased
query = User.select(['id', 'username', R('LOWER(SUBSTR(username, 1, 1))', 'first_letter')])

# now filter this list to include only users whose username begins with "a"
a_users = query.where(R('first_letter=%s', 'a'))

>>> for user in a_users:
...    print user.first_letter, user.username

a alpha
A Alton

This same functionality could be easily exposed as part of the where clause, the only difference being that the first letter is not selected and therefore not an attribute of the model instance:

a_users = User.filter(R('LOWER(SUBSTR(username, 1, 1)) = %s', 'a'))

We can query for multiple values using R objects, for example selecting users whose usernames begin with a range of letters “b” through “d”:

letters =  ('b', 'c', 'd')
bcd_users = User.filter(R('LOWER(SUBSTR(username, 1, 1)) IN (%s, %s, %s)', *letters))

We can write subqueries as part of a SelectQuery, for example counting the number of entries on a blog:

entry_query = R('(SELECT COUNT(*) FROM entry WHERE entry.blog_id=blog.id)', 'entry_count')
blogs = Blog.select(['id', 'title', entry_query]).order_by(('entry_count', 'desc'))

for blog in blogs:
    print blog.title, blog.entry_count

It is also possible to use subqueries as part of a where clause, for example finding blogs that have no entries:

no_entry_query = R('NOT EXISTS (SELECT * FROM entry WHERE entry.blog_id=blog.id)')
blogs = Blog.filter(no_entry_query)

for blog in blogs:
    print blog.title, ' has no entries'

Speeding up simple select queries¶

Simple select queries can get a performance boost (especially when iterating over large result sets) by calling naive(). This method simply patches all attributes directly from the cursor onto the model. For simple queries this should have no noticeable impact. The main difference is when multiple tables are queried, as in the previous example:

# above example
entries = Entry.select({
    Entry: ['*'],
    Blog: ['*'],
}).order_by(('pub_date', 'desc')).join(Blog)

for entry in entries.limit(10):
    print '%s, posted on %s' % (entry.title, entry.blog.title)

And here is how you would do the same if using a naive query:

# very similar query to the above -- main difference is we're
# aliasing the blog title to "blog_title"
entries = Entry.select({
    Entry: ['*'],
    Blog: [('title', 'blog_title')],
}).order_by(('pub_date', 'desc')).join(Blog)

entries = entries.naive()

# now instead of calling "entry.blog.title" the blog's title
# is exposed directly on the entry model as "blog_title" and
# no blog instance is created
for entry in entries.limit(10):
    print '%s, posted on %s' % (entry.title, entry.blog_title)

Query evaluation¶

In order to execute a query, it is always necessary to call the execute() method.

To get a better idea of how querying works let’s look at some example queries and their return values:

>>> dq = User.delete().where(active=False) # <-- returns a DeleteQuery
>>> dq
<peewee.DeleteQuery object at 0x7fc866ada4d0>
>>> dq.execute() # <-- executes the query and returns number of rows deleted
3

>>> uq = User.update(active=True).where(id__gt=3) # <-- returns an UpdateQuery
>>> uq
<peewee.UpdateQuery object at 0x7fc865beff50>
>>> uq.execute() # <-- executes the query and returns number of rows updated
2

>>> iq = User.insert(username='new user') # <-- returns an InsertQuery
>>> iq
<peewee.InsertQuery object at 0x7fc865beff10>
>>> iq.execute() # <-- executes query and returns the new row's PK
3

>>> sq = User.select().where(active=True) # <-- returns a SelectQuery
>>> sq
<peewee.SelectQuery object at 0x7fc865b7a510>
>>> qr = sq.execute() # <-- executes query and returns a QueryResultWrapper
>>> qr
<peewee.QueryResultWrapper object at 0x7fc865b7a6d0>
>>> [u.id for u in qr]
[1, 2, 3, 4, 7, 8]
>>> [u.id for u in qr] # <-- re-iterating over qr does not re-execute query
[1, 2, 3, 4, 7, 8]

>>> [u.id for u in sq] # <-- as a shortcut, you can iterate directly over
>>>                    #     a SelectQuery (which uses a QueryResultWrapper
>>>                    #     behind-the-scenes)
[1, 2, 3, 4, 7, 8]

Note

Iterating over a SelectQuery will cause it to be evaluated, but iterating over it multiple times will not result in the query being executed again.

QueryResultWrapper¶

As I hope the previous bit showed, Delete, Insert and Update queries are all pretty straightforward. Select queries are a little bit tricky in that they return a special object called a QueryResultWrapper. The sole purpose of this class is to allow the results of a query to be iterated over efficiently. In general it should not need to be dealt with explicitly.

The preferred method of iterating over a result set is to iterate directly over the SelectQuery, allowing it to manage the QueryResultWrapper internally.

SelectQuery¶

class SelectQuery¶

By far the most complex of the 4 query classes available in peewee. It supports JOIN operations on other tables, aggregation via GROUP BY and HAVING clauses, ordering via ORDER BY, and can be iterated and sliced to return only a subset of results.

__init__(model, query=None)¶

Parameters:	model – a `Model` class to perform query on query – either a dictionary, keyed by model with a list of columns, or a string of columns

If no query is provided, it will default to '*'. this parameter can be either a dictionary or a string:

>>> sq = SelectQuery(Blog, {Blog: ['id', 'title']})
>>> sq = SelectQuery(Blog, {
...     Blog: ['*'],
...     Entry: [peewee.Count('id')]
... }).group_by('id').join(Entry)
>>> print sq.sql()[0] # formatted
SELECT t1.*, COUNT(t2.id) AS count
FROM blog AS t1
INNER JOIN entry AS t2
    ON t1.id = t2.blog_id
GROUP BY t1.id

>>> sq = SelectQuery(Blog, 'id, title')
>>> print sq.sql()[0]
SELECT id, title FROM blog

filter(*args, **kwargs)¶

Parameters:	args – a list of `Q` or `Node` objects kwargs – a mapping of column + lookup to value, e.g. “age__gt=55”
Return type:	a `SelectQuery` instance

Provides a django-like syntax for building a query. The key difference between filter() and where() is that filter supports traversing joins using django’s “double-underscore” syntax:

>>> sq = SelectQuery(Entry).filter(blog__title='Some Blog')

This method is chainable:

>>> base_q = User.filter(active=True)
>>> some_user = base_q.filter(username='charlie')

get(*args, **kwargs)¶

Parameters:	args – a list of `Q` or `Node` objects kwargs – a mapping of column + lookup to value, e.g. “age__gt=55”
Return type:	`Model` instance or raises `DoesNotExist` exception

Get a single row from the database that matches the given query. Raises a <model-class>.DoesNotExist if no rows are returned:

>>> active = User.select().where(active=True)
>>> try:
...     user = active.get(username=username, password=password)
... except User.DoesNotExist:
...     user = None

This method is also exposed via the Model api:

>>> user = User.get(username=username, password=password)

where(*args, **kwargs)¶

Parameters:	args – a list of `Q` or `Node` objects kwargs – a mapping of column + lookup to value, e.g. “age__gt=55”
Return type:	a `SelectQuery` instance

Calling where() will act on the model that is currently the query context. Unlike filter(), only columns from the current query context are exposed:

>>> sq = SelectQuery(Blog).where(title='some title', author=some_user)
>>> sq = SelectQuery(Blog).where(Q(title='some title') | Q(title='other title'))

Note

where() calls are chainable

join(model, join_type=None, on=None, alias=None)¶

Parameters:

model – the model to join on. there must be a ForeignKeyField between the current query context and the model passed in.
join_type – allows the type of JOIN used to be specified explicitly
on – if multiple foreign keys exist between two models, this parameter is a string containing the name of the ForeignKeyField to join on.
alias – if provided, will be the name used to alias columns from this table in query

Return type:

a SelectQuery instance

Generate a JOIN clause from the current query context to the model passed in, and establishes model as the new query context.

>>> sq = SelectQuery(Blog).join(Entry).where(title='Some Entry')
>>> sq = SelectQuery(User).join(Relationship, on='to_user_id').where(from_user=self)

naive()¶

Return type:	`SelectQuery`

indicates that this query should only attempt to reconstruct a single model instance for every row returned by the cursor. if multiple tables were queried, the columns returned are patched directly onto the single model instance.

Note

this can provide a significant speed improvement when doing simple iteration over a large result set.

switch(model)¶

Parameters:	model – model to switch the `query context` to.
Return type:	a `SelectQuery` instance

Switches the query context to the given model. Raises an exception if the model has not been selected or joined on previously.

>>> sq = SelectQuery(Blog).join(Entry).switch(Blog).where(title='Some Blog')

count()¶

Return type:	an integer representing the number of rows in the current query

>>> sq = SelectQuery(Blog)
>>> sq.count()
45 # <-- number of blogs
>>> sq.where(status=DELETED)
>>> sq.count()
3 # <-- number of blogs that are marked as deleted

exists()¶

Return type:	boolean whether the current query will return any rows. uses an optimized lookup, so use this rather than `get()`.

>>> sq = User.select().where(active=True)
>>> if sq.where(username=username, password=password).exists():
...     authenticated = True

annotate(related_model, aggregation=None)¶

Parameters:	related_model – related `Model` on which to perform aggregation, must be linked by `ForeignKeyField`. aggregation – the type of aggregation to use, e.g. `Max('pub_date', 'max_pub')`
Return type:	`SelectQuery`

Annotate a query with an aggregation performed on a related model, for example, “get a list of blogs with the number of entries on each”:

>>> Blog.select().annotate(Entry)

if aggregation is None, it will default to Count(related_model, 'count'), but can be anything:

>>> blog_with_latest = Blog.select().annotate(Entry, Max('pub_date', 'max_pub'))

Note

If the ForeignKeyField is nullable, then a LEFT OUTER join will be used, otherwise the join is an INNER join. If an INNER join is used, in the above example blogs with no entries would not be returned. To avoid this, you can explicitly join before calling annotate():

>>> Blog.select().join(Entry, 'left outer').annotate(Entry)

aggregate(aggregation)¶

Parameters:	aggregation – a function specifying what aggregation to perform, for example `Max('id')`. This can be a 3-tuple if you would like to perform a custom aggregation: `("Max", "id", "max_id")`.

Method to look at an aggregate of rows using a given function and return a scalar value, such as the count of all rows or the average value of a particular column.

group_by(clause)¶

Parameters:	clause – either a single field name or a list of field names, in which case it takes its context from the current query_context. it can also be a model class, in which case all that models fields will be included in the `GROUP BY` clause
Return type:	`SelectQuery`

>>> # get a list of blogs with the count of entries each has
>>> sq = Blog.select({
...     Blog: ['*'],
...     Entry: [Count('id')]
... }).group_by('id').join(Entry)

>>> # slightly more complex, get a list of blogs ordered by most recent pub_date
>>> sq = Blog.select({
...     Blog: ['*'],
...     Entry: [Max('pub_date', 'max_pub_date')],
... }).join(Entry)
>>> # now, group by the entry's blog id, followed by all the blog fields
>>> sq = sq.group_by('blog_id').group_by(Blog)
>>> # finally, order our results by max pub date
>>> sq = sq.order_by(peewee.desc('max_pub_date'))

having(clause)¶

Parameters:	clause – Expression to use as the `HAVING` clause
Return type:	`SelectQuery`

>>> sq = Blog.select({
...     Blog: ['*'],
...     Entry: [Count('id', 'num_entries')]
... }).group_by('id').join(Entry).having('num_entries > 10')

order_by(*clauses)¶

Parameters:	clauses – Expression(s) to use as the `ORDER BY` clause, see notes below
Return type:	`SelectQuery`

Note

Adds the provided clause (a field name or alias) to the query’s ORDER BY clause. It can be either a single field name, in which case it will apply to the current query context, or a 2- or 3-tuple.

The 2-tuple can be either (Model, 'field_name') or ('field_name', 'ASC'/'DESC').

The 3-tuple is (Model, 'field_name', 'ASC'/'DESC').

If the field is not found on the model evaluated against, it will be treated as an alias.

example:

>>> sq = Blog.select().order_by('title')
>>> sq = Blog.select({
...     Blog: ['*'],
...     Entry: [Max('pub_date', 'max_pub')]
... }).join(Entry).order_by(desc('max_pub'))

slightly more complex example:

>>> sq = Entry.select().join(Blog).order_by(
...     (Blog, 'title'), # order by blog title ascending
...     (Entry, 'pub_date', 'DESC'), # then order by entry pub date desc
... )

check out how the query context applies to ordering:

>>> blog_title = Blog.select().order_by('title').join(Entry)
>>> print blog_title.sql()[0]
SELECT t1.* FROM blog AS t1
INNER JOIN entry AS t2
    ON t1.id = t2.blog_id
ORDER BY t1.title

>>> entry_title = Blog.select().join(Entry).order_by('title')
>>> print entry_title.sql()[0]
SELECT t1.* FROM blog AS t1
INNER JOIN entry AS t2
    ON t1.id = t2.blog_id
ORDER BY t2.title # <-- note that it's using the title on Entry this time

paginate(page_num, paginate_by=20)¶

Parameters:	page_num – a 1-based page number to use for paginating results paginate_by – number of results to return per-page
Return type:	`SelectQuery`

applies a LIMIT and OFFSET to the query.

>>> Blog.select().order_by('username').paginate(3, 20) # <-- get blogs 41-60

distinct()¶

Return type:	`SelectQuery`

indicates that this query should only return distinct rows. results in a SELECT DISTINCT query.

execute()¶

Return type:	`QueryResultWrapper`

Executes the query and returns a QueryResultWrapper for iterating over the result set. The results are managed internally by the query and whenever a clause is added that would possibly alter the result set, the query is marked for re-execution.

__iter__()¶

Executes the query:

>>> for user in User.select().where(active=True):
...     print user.username

UpdateQuery¶

class UpdateQuery¶

Used for updating rows in the database.

__init__(model, **kwargs)¶

Parameters:	model – `Model` class on which to perform update kwargs – mapping of field/value pairs containing columns and values to update

>>> uq = UpdateQuery(User, active=False).where(registration_expired=True)
>>> print uq.sql()
('UPDATE user SET active=? WHERE registration_expired = ?', [0, True])

>>> atomic_update = UpdateQuery(User, message_count=F('message_count') + 1).where(id=3)
>>> print atomic_update.sql()
('UPDATE user SET message_count=(message_count + 1) WHERE id = ?', [3])

where(*args, **kwargs)¶

Parameters:	args – a list of `Q` or `Node` objects kwargs – a mapping of column + lookup to value, e.g. “age__gt=55”
Return type:	a `UpdateQuery` instance

Note

where() calls are chainable

execute()¶

Return type:	Number of rows updated

Performs the query

DeleteQuery¶

class DeleteQuery¶

Deletes rows of the given model.

Note

It will not traverse foreign keys or ensure that constraints are obeyed, so use it with care.

__init__(model)¶

creates a DeleteQuery instance for the given model:

>>> dq = DeleteQuery(User).where(active=False)
>>> print dq.sql()
('DELETE FROM user WHERE active = ?', [0])

where(*args, **kwargs)¶

Parameters:	args – a list of `Q` or `Node` objects kwargs – a mapping of column + lookup to value, e.g. “age__gt=55”
Return type:	a `DeleteQuery` instance

Note

where() calls are chainable

execute()¶

Return type:	Number of rows deleted

Performs the query

InsertQuery¶

class InsertQuery¶

Creates a new row for the given model.

__init__(model, **kwargs)¶

creates an InsertQuery instance for the given model where kwargs is a dictionary of field name to value:

>>> iq = InsertQuery(User, username='admin', password='test', active=True)
>>> print iq.sql()
('INSERT INTO user (username, password, active) VALUES (?, ?, ?)', ['admin', 'test', 1])

execute()¶

Return type:	primary key of the new row

Performs the query

RawQuery¶

class RawQuery¶

Allows execution of an arbitrary SELECT query and returns instances of the model via a QueryResultsWrapper.

__init__(model, query, *params)¶

creates a RawQuery instance for the given model which, when executed, will run the given query with the given parameters and return model instances:

>>> rq = RawQuery(User, 'SELECT * FROM users WHERE username = ?', 'admin')
>>> for obj in rq.execute():
...     print obj
<User: admin>

execute()¶

Return type:	a `QueryResultWrapper` for iterating over the result set. The results are instances of the given model.

Performs the query

Querying API¶

Constructing queries¶

Django-style queries¶

Comparing the two methods of querying¶

Where clause¶

Column lookups¶

Performing advanced queries¶

Comparing against column data¶

Atomic updates¶

Aggregating records¶

SQL Functions, “Raw expressions” and the R() object¶

Speeding up simple select queries¶

Query evaluation¶

QueryResultWrapper¶

SelectQuery¶

UpdateQuery¶

DeleteQuery¶

InsertQuery¶

RawQuery¶

Table Of Contents

Previous topic

Next topic

This Page

Querying API¶

Constructing queries¶

Django-style queries¶

Comparing the two methods of querying¶

Where clause¶

Column lookups¶

Performing advanced queries¶

Comparing against column data¶

Atomic updates¶

Aggregating records¶

SQL Functions, “Raw expressions” and the R() object¶

Saving Queries by Selecting Related Models¶

Speeding up simple select queries¶

Query evaluation¶

QueryResultWrapper¶

SelectQuery¶

UpdateQuery¶

DeleteQuery¶

InsertQuery¶

RawQuery¶