
I am trying to build a small-scale deep learning framework. I defined my tensor class as follows:


class Tensor:
    __slots__ = (
        "_backend",
        "_data",
        "_requires_grad",
        "_node",
        "_grad"
    )

    def __init__(
        self,
        data: BackendArray,
        backend: Backend | None = None,
        requires_grad: bool = False,
    ) -> None:
        # Data
        self._backend = backend if backend else get_backend()
        self._data = self._backend.as_array(data)

        # Data for autograd
        self._requires_grad = requires_grad
        self._node: Node | None = None  # For function (autograd graph) nodes
        self._grad: "Tensor | None" = None  # Leaf gradient accumulator

    def backward(self):
        self._grad = backward(self)
    
    def _apply(self, f: Type[Function], *args, **kwargs):
        return _apply(self, f, *args, **kwargs)

The functions backward and _apply are supposed to call into my actual autograd engine: _apply constructs the autograd graph eagerly during the forward pass, and backward orchestrates the backward pass. I structured it this way so that Tensor stays a thin class that just holds data.

Later in the same file, I provide the implementations of backward and _apply as free functions, not class methods:


def _apply(t: Tensor, f: Type[Function], *args, **kwargs):
    """
    Apply a Function to a tensor.
    Build the computation graph eagerly. 
    """

    ctx = Context(t._backend)
    # We have to unwrap the args to get the raw data, else we create an infinite recursion
    # due to the way that functions are currently implemented
    raw_args = [
        a._data if isinstance(a, Tensor) else a for a in args
    ]
    out_data: BackendArray = f.forward(ctx, *raw_args, **kwargs)

    # determine requires_grad: True if any tensor arg requires grad
    requires_grad = any(isinstance(a, Tensor) and a._requires_grad for a in args)
    out = Tensor(out_data, backend=t._backend, requires_grad=requires_grad)  # keep the output on t's backend

    # If requires_grad, add to the graph for backprop
    if requires_grad:
        parents: list[Edge] = []
        for idx, a in enumerate(args):
            # Only differentiable tensor args become edges in the graph
            if isinstance(a, Tensor) and a._requires_grad:
                parents.append(Edge(a, idx))
        out._node = Node(
            grad_buffer = None,
            op = f,
            ctx = ctx,
            parents = parents,
            # Placeholder - will be computed just before doing backward as some nodes may not participate
            in_degree = None
        )

    return out


def backward(t: Tensor) -> Tensor:
    if t._node is None:
        raise RuntimeError("Tensor not attached to graph.")
    # Initial gradient
    root_grad = F.ones(t.shape, requires_grad=True, backend=t._backend).data
    t._node.grad_buffer = root_grad
    _backward(t._node)
    # return a Tensor wrapping the original root grad
    return Tensor(root_grad, backend=t._backend, requires_grad=False)

There is now a natural circular dependency in my code: Tensor needs to know about backward and _apply, and the backward and _apply implementations need to know about Tensor.

I want to move out my actual autograd engine logic into a different file and not have it all in one class, but this natural circular dependency is stopping me from doing it.

The reason I want the Tensor class to expose these as methods is to enable natural syntax like x.backward() when x is a tensor, instead of backward(x). Indeed, if I were willing to give up that syntax, I wouldn't have this circular dependency problem, because Tensor would no longer need handles to the functions that contain the autograd logic.

So what is a Pythonic way to solve this problem? My end goal is to separate the tensor (a thin wrapper class around some data) from the core autograd logic, while keeping syntax like x.backward() when x is a tensor.
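
To make the cycle concrete: if I split the engine into its own module, the top-level imports would chase each other (file names here are hypothetical):

# tensor.py (hypothetical)
from autograd import backward, _apply  # Tensor needs the engine...

class Tensor:
    def backward(self):
        self._grad = backward(self)

# autograd.py (hypothetical)
from tensor import Tensor  # ...and the engine needs Tensor back

def backward(t: Tensor) -> Tensor: ...
def _apply(t: Tensor, f, *args, **kwargs): ...

Importing either module then fails with an ImportError about a partially initialized module, since each top-level import triggers the other.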

  • Reading through the intermediate functions, it seems like the real core logic is the _backward function and the function parameter passed to _apply. To me it looks like the two intermediate functions you show are so closely tied to the Tensor object that they should just be class methods, but if you can replace _backward and that function parameter then you can still use different backends. Is that thinking consistent with what you've built, and does it help you move the code around?

2 Answers


As written, your backward and _apply functions only work on your Tensor objects. One way to break the circular dependency is to remove that requirement. To that end, you could create a mixin class that carries the _apply and backward logic without tying it to Tensor:

class AutoGradObject:
    """An object on which the gradient can be computed automatically."""
    def backward(self):
        self._grad = backward(self)
    
    def _apply(self, f: Type[Function], *args, **kwargs):
        return _apply(self, f, *args, **kwargs)

This class definition could be added to your autograd module. Then, you could change your tensor definition:

from custom_autograd import AutoGradObject

class Tensor(AutoGradObject):
    def __init__(self, *args, **kwargs):
        ...
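
For concreteness, here is a minimal sketch of how the two files might then fit together (the file names custom_autograd.py and tensor.py are assumptions, and note that the engine's isinstance(a, Tensor) checks would need to become isinstance(a, AutoGradObject)):

# custom_autograd.py (assumed name): mixin and engine live side by side,
# so the method bodies can reach the module-level functions directly.
class AutoGradObject:
    """An object on which the gradient can be computed automatically."""
    def backward(self):
        self._grad = backward(self)

    def _apply(self, f, *args, **kwargs):
        return _apply(self, f, *args, **kwargs)

def backward(t):
    ...  # engine logic, duck-typed against AutoGradObject attributes

def _apply(t, f, *args, **kwargs):
    ...  # engine logic; test isinstance(a, AutoGradObject), not Tensor

# tensor.py (assumed name): the import now points one way only, no cycle.
from custom_autograd import AutoGradObject

class Tensor(AutoGradObject):
    ...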

Disclaimer: I did not test this solution.




I've reviewed this, and I think the problem may be how you're importing things. Consider importing inside the methods instead.

That is, instead of importing backward and _apply at the top of the file, do the imports inside the Tensor methods:

class Tensor:
    ...

    def backward(self):
        # Deferred import: resolved at call time, so no cycle at module load
        from autograd import backward
        self._grad = backward(self)

    def _apply(self, f: Type[Function], *args, **kwargs):
        from autograd import _apply
        return _apply(self, f, *args, **kwargs)
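
One caveat worth adding: annotations like Type[Function] in the signature above are still evaluated when the method is defined, so the deferred import alone does not cover names that are needed only for typing. A common companion pattern is a typing.TYPE_CHECKING guard; here is a sketch, assuming Function lives in the autograd module:

from __future__ import annotations  # annotations become lazy strings

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by type checkers, never executed at runtime, so no cycle
    from autograd import Function

class Tensor:
    def _apply(self, f: type[Function], *args, **kwargs):
        from autograd import _apply  # runtime import stays deferred
        return _apply(self, f, *args, **kwargs)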

