uarray
uarray is built around a back-end protocol and overridable multimethods. It is necessary to define multimethods for back-ends to be able to override them. See the documentation of generate_multimethod on how to write multimethods.
Let’s start with the simplest: __ua_domain__ defines the back-end domain. The domain is a period-separated string consisting of the modules you extend plus the submodule. For example, if a submodule module2.submodule extends module1 (i.e., it exposes dispatchables marked as types available in module1), then the domain string should be "module1.module2.submodule".
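As a small illustration of the rule above, the domain string can be assembled by joining the extended module and the extending submodule with a period. The helper make_domain below is hypothetical and not part of uarray; it only shows the string format.

```python
def make_domain(extended_module, submodule):
    """Build a domain string: the module being extended, then the submodule.

    Hypothetical helper for illustration only; uarray just expects the
    finished string in __ua_domain__.
    """
    # Drop empty parts so a stand-alone domain has no stray leading period.
    parts = [part for part in (extended_module, submodule) if part]
    return ".".join(parts)


print(make_domain("module1", "module2.submodule"))
print(make_domain("", "ua_examples"))
```

The first call reproduces the example from the text; the second shows a backend that extends nothing and simply names its own domain.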
For the purpose of this demonstration, we’ll be creating an object and setting its attributes directly. However, note that you can use a module or your own type as a backend as well.
>>> class Backend: pass
>>> be = Backend()
>>> be.__ua_domain__ = "ua_examples"
It might be useful at this point to sidetrack to the documentation of generate_multimethod to find out how to generate a multimethod overridable by uarray. Writing a backend and creating multimethods are mostly orthogonal activities: knowing one doesn’t require knowledge of the other, although it is certainly helpful. We expect core API designers/specifiers to write the multimethods, and implementors to override them. But, as is often the case, the same people end up writing both.
Without further ado, here’s an example multimethod:
>>> import uarray as ua
>>> from uarray import Dispatchable
>>> def override_me(a, b):
...     return (Dispatchable(a, int),)
>>> def override_replacer(args, kwargs, dispatchables):
...     return (dispatchables[0], args[1]), {}
>>> overridden_me = ua.generate_multimethod(
...     override_me, override_replacer, "ua_examples"
... )
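To see what the argument replacer does on its own, it can be called by hand without any dispatch machinery. The sketch below repeats override_replacer from the example and feeds it a made-up converted value (the string "1" stands in for whatever a backend's conversion produced). It illustrates the replacer's contract only, not how uarray invokes it internally.

```python
def override_replacer(args, kwargs, dispatchables):
    # Put the (possibly converted) dispatchable back in the first slot and
    # keep the second positional argument untouched.
    return (dispatchables[0], args[1]), {}


original_args = (1, "2")
converted = ("1",)  # assumed result of a backend converting the first argument

new_args, new_kwargs = override_replacer(original_args, {}, converted)
print(new_args, new_kwargs)
```

The replacer returns a fresh (args, kwargs) pair, so the original arguments are left unmodified.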
Next comes the part about overriding the multimethod. This requires the __ua_function__ protocol and the __ua_convert__ protocol. The __ua_function__ protocol has the signature (method, args, kwargs), where method is the passed multimethod and args/kwargs specify the arguments, with any dispatchables already converted and substituted back in.
>>> def __ua_function__(method, args, kwargs):
...     return method.__name__, args, kwargs
>>> be.__ua_function__ = __ua_function__
The other protocol of interest is the __ua_convert__ protocol. It has the signature (dispatchables, coerce). When coerce is False, conversion between the formats should ideally be an O(1) operation; that is, no memory copying should be involved, only views of the existing data.
>>> def __ua_convert__(dispatchables, coerce):
...     for d in dispatchables:
...         if d.type is int:
...             if coerce and d.coercible:
...                 yield str(d.value)
...             else:
...                 yield d.value
>>> be.__ua_convert__ = __ua_convert__
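The view-versus-copy distinction can be made concrete with the standard library alone. The following is a minimal sketch, not uarray code: FakeDispatchable is a hypothetical stand-in exposing the value, type and coercible attributes used above, and bytearray plays the role of the backend's native format. Without coercion only zero-copy views are handed out; a conversion that would copy is attempted only when coerce is True.

```python
from dataclasses import dataclass


@dataclass
class FakeDispatchable:
    """Hypothetical stand-in with the attributes used in the text."""
    value: object
    type: type
    coercible: bool = True


def convert(dispatchables, coerce):
    """Return converted values, or NotImplemented if a copy would be needed
    but coercion was not requested."""
    converted = []
    for d in dispatchables:
        if d.type is not bytearray:
            converted.append(d.value)  # not ours to convert; pass through
        elif isinstance(d.value, bytearray):
            converted.append(memoryview(d.value))  # O(1): shares memory
        elif coerce and d.coercible:
            converted.append(memoryview(bytearray(d.value)))  # copies data
        else:
            return NotImplemented
    return converted


buf = bytearray(b"abc")
(view,) = convert([FakeDispatchable(buf, bytearray)], coerce=False)
buf[0] = ord("z")  # the change is visible through the view
print(bytes(view))

print(convert([FakeDispatchable(b"abc", bytearray)], coerce=False))
(copied,) = convert([FakeDispatchable(b"abc", bytearray)], coerce=True)
print(bytes(copied))
```

Because the first result is a view, mutating buf afterwards is reflected in it, which is the behavior the no-copy requirement is meant to preserve.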
Now that we have defined the backend, the next thing to do is to call the multimethod.
>>> with ua.set_backend(be):
...     overridden_me(1, "2")
('override_me', (1, '2'), {})
Note that the marked type has no effect on the actual type of the passed object. We can also coerce the type of the input.
>>> with ua.set_backend(be, coerce=True):
...     overridden_me(1, "2")
...     overridden_me(1.0, "2")
('override_me', ('1', '2'), {})
('override_me', ('1.0', '2'), {})
Another feature is that if you remove __ua_convert__, the arguments are not converted at all and it’s up to the backend to handle that.
>>> del be.__ua_convert__
>>> with ua.set_backend(be):
...     overridden_me(1, "2")
('override_me', (1, '2'), {})
You also have the option to return NotImplemented, in which case processing moves on to the next back-end, which in this case doesn’t exist. The same applies to __ua_convert__.
>>> be.__ua_function__ = lambda *a, **kw: NotImplemented
>>> with ua.set_backend(be):
...     overridden_me(1, "2")
Traceback (most recent call last):
  ...
uarray.BackendNotImplementedError: ...
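The fall-through behavior described here can be sketched in a few lines of plain Python. This is a simplified model under stated assumptions, not uarray's actual dispatch loop: dispatch, Declines, Handles and NoBackendError are all hypothetical names, and backends are simply tried in list order.

```python
class NoBackendError(NotImplementedError):
    """Hypothetical stand-in for uarray.BackendNotImplementedError."""


class Declines:
    @staticmethod
    def __ua_function__(method, args, kwargs):
        return NotImplemented  # ask the dispatcher to move on


class Handles:
    @staticmethod
    def __ua_function__(method, args, kwargs):
        return method.__name__, args, kwargs


def dispatch(backends, method, args, kwargs):
    """Try each backend in order; the first real result wins."""
    for backend in backends:
        result = backend.__ua_function__(method, args, kwargs)
        if result is not NotImplemented:
            return result
    raise NoBackendError(f"no backend implements {method.__name__}")


def override_me(a, b):
    """Placeholder multimethod; only its name is used here."""


print(dispatch([Declines, Handles], override_me, (1, "2"), {}))

try:
    dispatch([Declines], override_me, (1, "2"), {})
except NoBackendError as exc:
    print("failed:", exc)
```

With a second backend available the call succeeds; with only the declining backend, the dispatcher runs out of candidates and raises, mirroring the traceback in the example.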
The last possibility is if we don’t have __ua_convert__, in which case the job is left up to __ua_function__, but putting things back into arrays after conversion will not be possible.
Functions
Marks all unmarked arguments as a given type.
Creates a decorator for generating multimethods.
Generates a multimethod.
Creates a utility function to mark something as a specific type.
A context manager that sets the preferred backend.
This utility method replaces the default backend for permanent use.
This utility method registers a backend for permanent use.
This utility method clears registered backends.
A context manager that allows one to skip a given backend from processing entirely.
Wraps a
Returns an opaque object containing the current state of all the backends.
A context manager that sets the state of the backends to one returned by
Returns a context manager that resets all state once exited.
Set the backend to the first active backend that supports
Set a backend supporting all
Classes
A utility class which marks an argument with a specific dispatch type.
Exceptions
An exception that is thrown when no compatible backend is found for a method.
Design Philosophies
The following section discusses the design philosophies of uarray, and the reasoning behind some of these philosophies.
Modularity
uarray (and its sister modules unumpy and others to come) was designed from the ground up to be modular. This is part of why uarray itself holds the core backend and dispatch machinery, and unumpy holds the actual multimethods. Also, unumpy can be developed completely separately from uarray, although the ideal place to have it would be NumPy itself.
However, the benefit of having it separate is that it can span multiple NumPy versions, including those from before NEP-18 (or even NEP-13) was available. Another benefit is that it can have a faster release cycle to help it achieve this.
Separate Imports
Code wishing to use the backend machinery for NumPy (as an example) will use the statement import unumpy as np instead of the usual import numpy as np. This is deliberate: it makes dispatching opt-in, so that users are not forced to use it or to pay the overhead associated with it. However, a package is free to define its main methods as the dispatchable versions, thereby allowing dispatch on the default implementation.
Extensibility and Choice
If some effort is put into the dispatch machinery, it’s possible to dispatch over arbitrary objects, including arrays, dtypes, and so on. A method defines the type of each dispatchable argument, and when deciding whether or not to use a backend, that backend is only passed types it knows how to dispatch over. For example, if a backend doesn’t know how to dispatch over dtypes, it won’t be asked to decide on that basis.
Methods can have a default implementation in terms of other methods, but they’re still overridable.
This means that only one framework is needed to, for example, dispatch over ufuncs, arrays, dtypes and all other primitive objects in NumPy, while keeping the core uarray code independent of NumPy and even unumpy.
Backends can span modules, so SciPy could jump in and define its own methods on NumPy objects and make them overridable within the NumPy backend.
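The idea that a method can have a default implementation in terms of other methods, while remaining overridable, can be sketched with a toy dispatcher. Everything here is hypothetical and independent of uarray: make_multimethod, add, total and the two backend classes exist only for this illustration, and backends are passed explicitly rather than through context managers.

```python
import functools

calls = []  # records which add() calls reached the backend


def make_multimethod(name, default=None):
    """Create a toy multimethod that looks up `name` on each backend."""
    def call(backends, *args):
        for backend in backends:
            impl = getattr(backend, name, None)
            if impl is not None:
                result = impl(*args)
                if result is not NotImplemented:
                    return result
        if default is not None:
            return default(backends, *args)  # fall back to the default
        raise NotImplementedError(name)
    call.__name__ = name
    return call


add = make_multimethod("add")

# The default for `total` is written in terms of the `add` multimethod, so a
# backend that only provides `add` still gets a working `total`.
total = make_multimethod(
    "total",
    default=lambda backends, xs: functools.reduce(
        lambda a, b: add(backends, a, b), xs
    ),
)


class OnlyAdd:
    @staticmethod
    def add(a, b):
        calls.append((a, b))
        return a + b


class FastTotal(OnlyAdd):
    @staticmethod
    def total(xs):
        return sum(xs)  # overrides the default wholesale


print(total([OnlyAdd], [1, 2, 3]))    # goes through the default and add()
print(total([FastTotal], [1, 2, 3]))  # uses the backend's own total()
print(calls)
```

Only the first call routes through add, which is why calls records two pairwise additions and nothing from the second call.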
User Choice
The users of unumpy or uarray can choose which backend they want to prefer with a simple context manager. They also have the ability to force a backend, and to skip a backend. This is useful for array-like objects that provide other array-like objects by composing them. For example, Dask could perform all its blockwise function calls with the following pseudocode (obviously, this is simplified):
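def call_blockwise(blockwise_function, input_arrays, args, kwargs):
    # Pseudocode: call_blockwise, extract_inner_arrays, combine_arrays and
    # DaskBackend are placeholder names, not real APIs.
    in_arrays = extract_inner_arrays(input_arrays)
    out_arrays = []
    for input_arrays_single in in_arrays:
        args, kwargs = blockwise_function.replace_args_kwargs(
            args, kwargs, input_arrays_single)
        # Skip the Dask backend so the inner call reaches the wrapped arrays.
        with ua.skip_backend(DaskBackend):
            out_arrays_single = blockwise_function(*args, **kwargs)
        out_arrays.append(out_arrays_single)
    return combine_arrays(out_arrays)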
A user would simply do the following:
with ua.set_backend(DaskBackend):
    # Write all your code here
    # It will prefer the Dask backend
There is no default backend; to unumpy, NumPy is just another backend. One can register backends, which will all be tried in indeterminate order when no backend is selected.
Addressing past flaws
The progress on NumPy’s side for defining an override mechanism has been slow, with NEP-13 being first introduced in 2013. With the wealth of dispatchable objects (including arrays, ufuncs, and dtypes) and the advent of libraries like Dask, CuPy, Xarray, PyData/Sparse, and XND, it has become clear that the need for alternative array-like implementations is growing. There are even other libraries, like PyTorch and TensorFlow, that would be possible to express in NumPy API-like terms. Another example is the Keras API, for which an overridable ukeras could be created, similar to unumpy.
uarray is intended to develop quickly to fill the need posed by these communities, while keeping itself as general as possible, and to reach maturity quickly, after which backward compatibility will be guaranteed.
Performance considerations will come only after such a state has been reached.