Deferring Execution

The Problem

Dealing with Blocking Code

When coding I/O based programs - networking code, databases, file access - there are many APIs that are blocking, and many methods where the common idiom is to block until a result is gotten.

class Getter:

    def getData(self, x):
        self.blockUntilResult(x)
        return result

g = Getter()
print g.getData(3)

Don't Call Us, We'll Call You

Twisted can not support blocking calls in most of its code, since it is single threaded, and event based. The solution for this issue is to refactor the code, so that instead of blocking until data is available, we return immediately, and use a callback to notify the requester once the data eventually arrives. Looking at how this is usually implemented will help us understand the necessity for Deferreds.

class Getter:

    def getData(self, x, callback):
        self.callback = callback
        # this call does not block, it ensure self.gotResult is called
        # when we have the result
        self.onResult(x, self.gotResult)
    
    def gotResult(self, result):
        self.callback(result)

def gotData(d):
    print d

g = Getter()
g.getData(3, gotData)

There are several things missing in this simple example. There is no way to know if the data never comes back; no mechanism for handling errors. There is no way to distinguish between different calls to gotData from different sessions. Deferred solves these problems, by creating a single, unified way to defer execution of code that depends on blocking calls.

Deferreds

A twisted.internet.defer.Deferred is a promise that a function will at some point have a result. We can attach callback functions to a Deferred, and once it gets a result these callbacks will be called. In addition Deferreds allow the developer to register a callback for an error, with the default behavior of logging the error. This is an asynchronous equivalent of the common idiom of blocking until a result is returned or an exception it raised.

As we said, multiple callbacks can be added to a Deferred. The first callback in the Deferred's callback chain will be called with the result, the second with the result of the first callback, and so on. Why do we need this? Well, consider a Deferred returned by twisted.enterprise.adbapi - the result of a SQL query. A web widget might add a callback that converts this result into HTML, and pass the Deferred onwards, where the callback will be used by twisted to return the result to the HTTP client.

import sys
from twisted.internet import defer

class Getter:

    def getResult(self, x):
        self.d = defer.Deferred()
        self.doNonblockingStuff(x)
        return self.d
    
    def gotResult(self, result):
        """Called when we get some info from somewhere via the event loop.

        E.g. this may be called because we got a chunk of data off a socket.
        """
        if self.goodResult(result):
            # tell the Deferred that we have a result for it
            self.d.callback(result)
        else:
            # tell the Deferred that we have an error
            self.d.errback("An error has occured.")

def printData(d): sys.stdout.write(d)
def printError(e): sys.stderr.write(e)

g = Getter()
d = g.getResult(3) # notice how this is similar to the blocking version
d.addCallback(printData) # printData will be called when a result is available
d.addErrback(printError) # printError will be called on an error

# run main event loop here
from twisted.internet import reactor
reactor.run()

Visual Explanation

Requesting method (data sink) requests data, gets Deferred object.
Requesting method attaches callbacks to Deferred object.

When the result is ready, give it to the Deferred object. .callback(result) if the operation succeeded, .errback(failure) if it failed. Note that failure is typically an instance of a twisted.python.failure.Failure instance.
Deferred object triggers previously-added (call/err)back with the result or failure. Execution then follows the following rules, going down the chain of callbacks to be processed.
- Result of the callback is always passed as the first argument to the next callback, creating a chain of processors.
- If a callback raises an exception, switch to errback.
- An unhandled failure gets passed down the line of errbacks, this creating an asynchronous analog to a series to a series of except: statements.
- If an errback doesn't raise an exception or return a twisted.python.failure.Failure instance, switch to callback.

More about callbacks

You add multiple callbacks to a Deferred:

g = Getter()
d = g.getResult(3)
d.addCallback(processResult)
d.addCallback(printResult)

Each callback feeds its return value into the next callback (callbacks will be called in the order you add them). Thus in the previous example, processResult's return value will be passed to printResult, instead of the value initially passed into the callback. This gives you a flexible way to chain results together, possibly modifying values along the way, (for example, you may wish to pre-processed database query results).

More about errbacks

Deferred's error handling is modeled after Python's exception handling. In the case that no errors occur, all the callbacks run, one after the other, as described above.

If the errback is called instead of the callback (e.g. because a DB query raised an error), then a twisted.python.failure.Failure is passed into the first errback (you can add multiple errbacks, just like with callbacks). You can think of your errbacks as being like except blocks of ordinary Python code.

Unless you explicitly raise an error in except block, the Exception is caught and stops propagating, and normal execution continues. The same thing happens with errbacks: unless you explicitly return a Failure or (re-)raise an exception, the error stops propagating, and normal callbacks continue executing from that point (using the value returned from the errback). If the errback does returns a Failure or raise an exception, then that is passed to the next errback, and so on.

Note: If an errback doesn't return anything, then it effectively returns None, meaning that callbacks will continue to be executed after this errback. This may not be what you expect to happen, so be careful. Make sure your errbacks return a Failure (probably the one that was passed to it), or a meaningful return value for the next callback.

Also, twisted.python.failure.Failure instances have a useful method called trap, allowing you to effectively do the equivalent of:

try:
    # code that may throw an exception
    cookSpamAndEggs()
except (SpamException, EggException):
    # Handle SpamExceptions and EggExceptions
    ...

You do this by:

def errorHandler(failure):
    failure.trap(SpamException, EggException)
    # Handle SpamExceptions and EggExceptions

d.addCallback(cookSpamAndEggs)
d.addErrback(errorHandler)

If none of arguments passed to failure.trap match the error encapsulated in that Failure, then it re-raises the error.

There's another potential gotcha here. There's a convenience method twisted.internet.defer.Deferred.addCallbacks which is similar to, but not exactly the same as, addCallback followed by addErrback. In particular, consider these two cases:

# Case 1
d = getDeferredFromSomewhere()
d.addCallback(callback1)
d.addErrback(errback1)
d.addCallback(callback2)
d.addErrback(errback2)

# Case 2
d = getDeferredFromSomewhere()
d.addCallbacks(callback1, errback1)
d.addCallbacks(callback2, errback2)

If an error occurs in callback1, then for Case 1 errback1 will be called with the failure. For Case 2, errback2 will be called. Be careful with your callbacks and errbacks.

Unhandled Errors

If a Deferred is garbage-collected with an unhandled error (i.e. it would call the next errback if there was one), then Twisted will write the error's traceback to the log file. This means that you can typically get away with not adding errbacks and still get errors logged. Be careful though; if you keep a reference to the Deferred around, preventing it from being garbage-collected, then you may never see the error (and your callbacks will mysteriously seem to have never been called). If unsure, you should explicitly add an errback after your callbacks, even if all you do is:

# Make sure errors get logged
from twisted.python import log
d.addErrback(log.err)

Class Overview

This is the overview API reference for Deferred. It is not meant to be a substitute for the docstrings in the Deferred class, but can provide guidelines for its use.

Basic Callback Functions

addCallbacks(self, callback[, errback, callbackArgs, errbackArgs, errbackKeywords, asDefaults])
This is the method with which you will use to interact with Deferred. It adds a pair of callbacks parallel to each other (see diagram above) in the list of callbacks made when the Deferred is called back to. The signature of a method added using addCallbacks should be myMethod(result, *methodArgs, **methodKeywords). If your method is passed in the callback slot, for example, all arguments in the tuple callbackArgs will be passed as *methodArgs to your method.

There exist various convenience methods that are derivative of addCallbacks. I will not cover them in detail here, but it is important to know about them in order to create concise code.
- addCallback(callback, *callbackArgs, **callbackKeywords)
  Adds your callback at the next point in the processing chain, while adding an errback that will re-raise its first argument, not affecting further processing in the error case.
- addErrback(errback, *errbackArgs, **errbackKeywords)
  Adds your errback at the next point in the processing chain, while adding a callback that will return its first argument, not affecting further processing in the success case.
- addBoth(callbackOrErrback, *callbackOrErrbackArgs, **callbackOrErrbackKeywords)
  This method adds the same callback into both sides of the processing chain at both points. Keep in mind that the type of the first argument is indeterminate if you use this method! Use it for finally: style blocks.
callback(result)
Run success callbacks with the given result. This can only be run once. Later calls to this or errback will raise twisted.internet.defer.AlreadyCalledError. If further callbacks or errbacks are added after this point, addCallbacks will run the callbacks immediately.
errback(failure)
Run error callbacks with the given failure. This can only be run once. Later calls to this or callback will raise twisted.internet.defer.AlreadyCalledError. If further callbacks or errbacks are added after this point, addCallbacks will run the callbacks immediately.

Chaining Deferreds

If you need one Deferred to wait on another, all you need to do is return a Deferred from a method added to addCallbacks. Specifically, if you return Deferred B from a method added to Deferred A using A.addCallbacks, Deferred A's processing chain will stop until Deferred B's .callback() method is called; at that point, the next callback in A will be passed the result of the last callback in Deferred B's processing chain at the time.

If this seems confusing, don't worry about it right now -- when you run into a situation where you need this behavior, you will probably recognize it immediately and realize why this happens. If you want to chain deferreds manually, there is also a convenience method to help you.

chainDeferred(otherDeferred)
Add otherDeferred to the end of this Deferred's processing chain. When self.callback is called, the result of my processing chain up to this point will be passed to otherDeferred.callback. Further additions to my callback chain do not affect otherDeferred

This is the same as self.addCallbacks(otherDeferred.callback, otherDeferred.errback)

Automatic Error Conditions

setTimeout(seconds[, timeoutFunc])
Set a timeout function to be triggered if this Deferred is not called within that time period. By default, this will raise a TimeoutError after seconds.

A Brief Interlude: Technical Details

While deferreds greatly simplify the process of writing asynchronous code by providing a standard for registering callbacks, there are some subtle and sometimes confusing rules that you need to follow if you are going to use them. This mostly applies to people who are writing new systems that use Deferreds internally, and not writers of applications that just add callbacks to Deferreds produced and processed by other systems. Nevertheless, it is good to know.

Deferreds are one-shot. A generalization of the Deferred API to generic event-sources is in progress -- watch this space for updates! -- but Deferred itself is only for events that occur once. You can only call Deferred.callback or Deferred.errback once. The processing chain continues each time you add new callbacks to an already-called-back-to Deferred.

The important consequence of this is that sometimes, addCallbacks will call its argument synchronously, and sometimes it will not. In situations where callbacks modify state, it is highly desirable for the chain of processing to halt until all callbacks are added. (For the curious: the code for twisted.web.widgets has a textbook example of this.) For this, it is possible to pause and unpause a Deferred's processing chain while you are adding lots of callbacks.

Be careful when you use these methods! If you pause a Deferred, it is your responsibility to make sure that you unpause it; code that calls callback or errback should never call unpause, as this would negate its usefulness!

Advanced Processing Chain Control

pause()
Cease calling any methods as they are added, and do not respond to callback, until self.unpause() is called.
unpause()
If callback has been called on this Deferred already, call all the callbacks that have been added to this Deferred since pause was called.

Whether it was called or not, this will put this Deferred in a state where further calls to addCallbacks or callback will work as normal.

DeferredList

Sometimes you want to be notified after several different events have all happened, rather than individually waiting for each one. For example, you may want to wait for all the connections in a list to close. twisted.internet.defer.DeferredList is the way to do this.

To create a DeferredList from multiple Deferreds, you simply pass a list of the Deferreds you want it to wait for:

# Creates a DeferredList
dl = defer.DeferredList([deferred1, deferred2, deferred3])

You can also add the Deferreds later:

dl.addDeferred(deferred4)

You can now treat the DeferredList like an ordinary Deferred; you can call addCallbacks and so on. The DeferredList will call its callback when all the deferreds have completed. The callback will be called with a list of the results of the Deferreds it contains, like so:

def printResult(result):
    print result
deferred1 = defer.Deferred()
deferred2 = defer.Deferred()
deferred3 = defer.Deferred()
dl = defer.DeferredList([deferred1, deferred2, deferred3])
dl.addCallback(printResult)
deferred1.callback('one')
deferred2.errback('bang!')
deferred3.callback('three')
# At this point, dl will fire its callback, printing:
#     [(1, 'one'), (0, 'bang!'), (1, 'three')]
# (note that defer.SUCCESS == 1, and defer.FAILURE == 0)

A standard DeferredList will never call errback.

If you want to apply callbacks to the individual Deferreds that go into the DeferredList, you should be careful about when those callbacks are added. The act of adding a Deferred to a DeferredList inserts a callback into that Deferred (when that callback is run, it checks to see if the DeferredList has been completed yet). The important thing to remember is that it is this callback which records the value that goes into the result list handed to the DeferredList's callback.

Therefore, if you add a callback to the Deferred after adding the Deferred to the DeferredList, the value returned by that callback will not be given to the DeferredList's callback. To avoid confusion, we recommend not adding callbacks to a Deferred once it has been used in a DeferredList.

def printResult(result):
    print result
def addTen(result):
    return result + " ten"

# Deferred gets callback before DeferredList is created
deferred1 = defer.Deferred()
deferred2 = defer.Deferred()
deferred1.addCallback(addTen)
dl = defer.DeferredList([deferred1, deferred2])
dl.addCallback(printResult)
deferred1.callback("one") # fires addTen, checks DeferredList, stores "one ten"
deferred2.callback("two")
# At this point, dl will fire its callback, printing:
#     [(1, 'one ten'), (1, 'two')]

# Deferred gets callback after DeferredList is created
deferred1 = defer.Deferred()
deferred2 = defer.Deferred()
dl = defer.DeferredList([deferred1, deferred2])
deferred1.addCallback(addTen) # will run *after* DeferredList gets its value
dl.addCallback(printResult)
deferred1.callback("one") # checks DeferredList, stores "one", fires addTen
deferred2.callback("two")
# At this point, dl will fire its callback, printing:
#     [(1, 'one), (1, 'two')]

Other behaviours

DeferredList accepts two keywords arguments that modify its behaviour: fireOnOneCallback and fireOnOneErrback. If fireOnOneCallback is set, the DeferredList will immediately call its callback as soon as any of its Deferreds call their callback. Similarly, fireOnOneErrback will call errback as soon as any of the Deferreds call their errback. Note that DeferredList is still one-shot, like ordinary Deferreds, so after a callback or errback has been called the DeferredList will do nothing further (it will just silently ignore any other results from its Deferreds).

The fireOnOneErrback option is particularly useful when you want to wait for all the results if everything succeeds, but also want to know immediately if something fails.