Tensor

Core data structures for multi-dimensional tensors.

Op

Base class for all tensor operations.

This class defines the interface that all tensor operations must implement. Operations are callable objects that can be applied to tensors to create new tensors in the computation graph.

Methods:

  • __call__

    Apply the operation to the given arguments.

  • compute

    Compute the actual operation on the underlying arrays.

  • gradient

    Compute the gradient of the operation.

__call__(*args)

Apply the operation to the given arguments.

Parameters:

  • *args (Tensor, default: () ) –

    Input tensors to the operation.

Returns:

  • Tensor

    Result of applying the operation to the inputs.

compute(*args)

Compute the actual operation on the underlying arrays.

Parameters:

  • *args (tuple[NDArray], default: () ) –

    Input arrays to the operation.

Returns:

  • NDArray

    Result of the operation.

Raises:

  • NotImplementedError

    This method must be implemented by subclasses.

gradient(out_grad, out_node)

Compute the gradient of the operation.

Parameters:

  • out_grad (Tensor) –

    Gradient of the final result with respect to this operation's output.

  • out_node (Tensor) –

    The output tensor of this operation.

Returns:

  • Tensor or tuple[Tensor]

    Gradient(s) with respect to the input(s) of this operation.

Raises:

  • NotImplementedError

    This method must be implemented by subclasses.
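
As a rough illustration of the interface above, the sketch below defines a stand-alone `Op` base class and one subclass. It does not import tiny_pytorch: plain floats stand in for NDArray and Tensor values, and `MulScalar` with its `scalar` parameter is a hypothetical op invented for this example, not part of the library.

```python
class Op:
    """Stand-in base class mirroring the interface described above."""

    def __call__(self, *args):
        # A real implementation would wrap the result in a Tensor and record
        # the graph edge; this sketch simply forwards to compute().
        return self.compute(*args)

    def compute(self, *args):
        raise NotImplementedError()

    def gradient(self, out_grad, out_node):
        raise NotImplementedError()


class MulScalar(Op):
    """Hypothetical op: multiply the input by a fixed scalar."""

    def __init__(self, scalar):
        self.scalar = scalar

    def compute(self, a):
        return a * self.scalar

    def gradient(self, out_grad, out_node):
        # d(a * s)/da = s, so the incoming gradient is scaled by s.
        return out_grad * self.scalar


op = MulScalar(3.0)
y = op(2.0)                 # forward pass: 2.0 * 3.0
grad = op.gradient(1.0, y)  # backward pass with an incoming gradient of 1.0
print(y, grad)              # 6.0 3.0
```

Keeping `compute` separate from `gradient` means the forward arithmetic and the differentiation rule can each be overridden and tested on their own.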

Tensor

Tensor is the fundamental data structure in tiny_pytorch. It is a multi-dimensional array of numerical values used to represent inputs, outputs, and intermediate results in a computation graph.

Attributes:

  • cached_data (list[object]) –

    The cached data of the tensor.

  • inputs (list[Tensor]) –

    The input tensors to the operation that produced this tensor.

  • op (Op) –

    The operation that produced this tensor.

  • requires_grad (bool) –

    If True, the tensor will track gradients.

data property writable

Returns a detached Tensor with the original data.

device property

Returns the device on which the tensor is stored.

Returns:

  • device ( Device ) –

    The device on which the tensor is stored.

dtype property

Returns the data type of the tensor.

Returns:

  • dtype ( dtype ) –

    The data type of the tensor.

ndim property

Returns the number of dimensions of the tensor.

Returns:

  • int

    Number of dimensions of the tensor.

shape property

Returns the shape of the tensor as a tuple.

Returns:

  • tuple

    Shape of the tensor.

__add__(other)

Add another tensor or scalar to this tensor.

Parameters:

  • other (Tensor or scalar) –

    The tensor or scalar to add.

Returns:

  • Tensor

    Result of the addition operation.

__init__(array, *, device=None, dtype=None, requires_grad=True)

Construct a Tensor by copying array.

Parameters:

  • array (object) –

    The array to be copied.

  • device (Device, default: None ) –

    The device on which to place the tensor. Default is None.

  • dtype (str, default: None ) –

    The data type of the tensor. Default is None.

  • requires_grad (bool, default: True ) –

    If True, the tensor will track gradients. Default is True.

__matmul__(other)

Matrix multiplication with another tensor.

Parameters:

  • other (Tensor) –

    The tensor to multiply with.

Returns:

  • Tensor

    Result of the matrix multiplication.

__mul__(other)

Multiply this tensor by another tensor or scalar.

Parameters:

  • other (Tensor or scalar) –

    The tensor or scalar to multiply by.

Returns:

  • Tensor

    Result of the multiplication operation.

__neg__()

Negate this tensor.

Returns:

  • Tensor

    The negated tensor.

__pow__(other)

Raise this tensor to the power of another tensor or scalar.

Parameters:

  • other (Tensor or scalar) –

    The exponent.

Returns:

  • Tensor

    Result of the power operation.

__repr__()

String representation of the tensor.

Returns:

  • str

    String representation showing the tensor data.

__str__()

String representation of the tensor.

Returns:

  • str

    String representation of the tensor data.

__sub__(other)

Subtract another tensor or scalar from this tensor.

Parameters:

  • other (Tensor or scalar) –

    The tensor or scalar to subtract.

Returns:

  • Tensor

    Result of the subtraction operation.

__truediv__(other)

Divide this tensor by another tensor or scalar.

Parameters:

  • other (Tensor or scalar) –

    The tensor or scalar to divide by.

Returns:

  • Tensor

    Result of the division operation.

backward(out_grad=None)

Computes gradients for this tensor and the tensors it depends on, starting from the given output gradient.

Parameters:

  • out_grad (Tensor, default: None ) –

    The gradient used to seed backpropagation from this tensor. If None, a tensor of ones is used.

Returns:

  • None

    This method updates the grad attribute of the tensor and its dependencies with the computed gradients.

broadcast_to(shape)

Broadcasts the tensor to the specified shape.

Parameters:

  • shape (tuple of ints) –

    The new shape of the tensor.

Returns:

  • Tensor

    A new tensor with the specified shape.
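
The text above does not say which target shapes `broadcast_to` accepts. Assuming it follows NumPy-style broadcasting rules (an assumption, not something this page confirms), the check can be sketched in plain Python. `can_broadcast` is a hypothetical helper, not a tiny_pytorch function.

```python
def can_broadcast(src_shape, dst_shape):
    """Return True if src_shape can be broadcast to dst_shape.

    Assumes NumPy-style rules: shapes are aligned from the trailing
    dimension, and each source dimension must equal the target
    dimension or be 1.
    """
    if len(src_shape) > len(dst_shape):
        return False
    for s, d in zip(reversed(src_shape), reversed(dst_shape)):
        if s != d and s != 1:
            return False
    return True


print(can_broadcast((3, 1), (2, 3, 4)))  # True
print(can_broadcast((4,), (2, 3, 4)))    # True
print(can_broadcast((3, 2), (3, 4)))     # False
```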

detach()

Returns a new Tensor with no history (detached from the computation graph). The returned Tensor shares its data with the original.

from_constant(data, requires_grad=False) staticmethod

Creates a leaf node Tensor from the given data.

from_operation(op, inputs) staticmethod

Creates a node Tensor by applying the operation op to the input Tensors inputs.

is_leaf()

A Tensor is a leaf if requires_grad is False, or if it was created by the user rather than produced by an operation.
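
The leaf rule can be written as a single boolean expression. The sketch below assumes that a user-created tensor is one whose `op` is `None`; this page does not state that, and `is_leaf_sketch` is a hypothetical free function rather than the library's method.

```python
def is_leaf_sketch(requires_grad, op):
    """Leaf if gradients are not tracked, or if no op produced the tensor."""
    # Assumption: op is None for tensors created directly by the user.
    return (not requires_grad) or op is None


print(is_leaf_sketch(requires_grad=True, op=None))    # True: user-created
print(is_leaf_sketch(requires_grad=False, op="add"))  # True: no gradient tracking
print(is_leaf_sketch(requires_grad=True, op="add"))   # False: produced by an op
```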

numpy()

Returns the Tensor as a NumPy ndarray. The underlying data is shared between the Tensor and the ndarray.

realize_cached_data()

Runs the computation to produce the output if lazy mode is on; otherwise returns the cached data.
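
One common way to implement this kind of deferred evaluation is to compute a node's value the first time it is requested and reuse the cached result afterwards. The sketch below shows that pattern with a hypothetical `LazyNode` class and plain floats. It is an illustration of the general idea, not tiny_pytorch's actual implementation, and the `evaluations` counter exists only to make the caching visible.

```python
class LazyNode:
    """Hypothetical graph node that computes its value on first request."""

    def __init__(self, compute_fn=None, inputs=(), cached_data=None):
        self.compute_fn = compute_fn
        self.inputs = list(inputs)
        self.cached_data = cached_data
        self.evaluations = 0  # counts how often compute_fn actually ran

    def realize_cached_data(self):
        if self.cached_data is None:
            # Realize the inputs first, then compute and cache this node.
            args = [inp.realize_cached_data() for inp in self.inputs]
            self.cached_data = self.compute_fn(*args)
            self.evaluations += 1
        return self.cached_data


a = LazyNode(cached_data=2.0)
b = LazyNode(cached_data=5.0)
c = LazyNode(lambda x, y: x + y, [a, b])

first = c.realize_cached_data()   # triggers the computation
second = c.realize_cached_data()  # served from the cache
print(first, second, c.evaluations)  # 7.0 7.0 1
```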

reshape(shape)

Reshapes the tensor to the specified shape.

Parameters:

  • shape (tuple of ints) –

    The new shape of the tensor.

Returns:

  • Tensor

    A new tensor with the specified shape.

sum(axes=None)

Returns the sum of elements over specified axes.

Parameters:

  • axes (None or int or tuple of ints, default: None ) –

    Axis or axes along which a sum is performed. The default is to sum all of the elements of the input tensor.

Returns:

  • Tensor

    A new tensor with the sum of elements over specified axes.

transpose(axes=None)

Transposes the tensor according to the specified axes.

Parameters:

  • axes (tuple of ints, default: None ) –

    By default, reverse the dimensions, otherwise permute the axes according to the values given.

Returns:

  • Tensor

    A new tensor with the specified axes transposed.
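
The effect of `axes` on the resulting shape can be sketched without any array library. `transposed_shape` is a hypothetical helper that only computes the output shape under the rule described above: reverse the dimensions by default, otherwise permute them as given.

```python
def transposed_shape(shape, axes=None):
    """Return the shape that results from transposing with the given axes."""
    if axes is None:
        # Default: reverse the order of the dimensions.
        axes = tuple(reversed(range(len(shape))))
    return tuple(shape[ax] for ax in axes)


print(transposed_shape((2, 3, 4)))             # (4, 3, 2)
print(transposed_shape((2, 3, 4), (1, 0, 2)))  # (3, 2, 4)
```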

compute_gradients(out_tensor, out_grad)

Compute gradients for all nodes in the computation graph.

This function implements reverse-mode automatic differentiation by traversing the computation graph in reverse topological order and computing gradients for each node.

Parameters:

  • out_tensor (Tensor) –

    The output tensor for which gradients are computed.

  • out_grad (Tensor) –

    The gradient of the final result with respect to out_tensor.

Notes

This function modifies the tensors in the computation graph in place: it stores each computed gradient in the tensor's grad attribute.
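
The reverse traversal can be sketched on scalars. In the example below, `Value`, `add`, `mul`, and `compute_gradients_sketch` are hypothetical stand-ins written for illustration. They are not tiny_pytorch's classes, and the real function works on Tensors and Ops rather than floats and lambdas.

```python
class Value:
    """Scalar stand-in for a Tensor: data, inputs, and a local gradient rule."""

    def __init__(self, data, inputs=(), grad_fn=None):
        self.data = data
        self.inputs = list(inputs)
        self.grad_fn = grad_fn  # maps out_grad to a tuple of input gradients
        self.grad = None


def add(a, b):
    return Value(a.data + b.data, (a, b), lambda g: (g, g))


def mul(a, b):
    return Value(a.data * b.data, (a, b), lambda g: (g * b.data, g * a.data))


def topo_order(node, visited, order):
    """Post-order DFS along input edges."""
    if id(node) in visited:
        return
    visited.add(id(node))
    for inp in node.inputs:
        topo_order(inp, visited, order)
    order.append(node)


def compute_gradients_sketch(out, out_grad=1.0):
    order = []
    topo_order(out, set(), order)
    # Partial gradients contributed to each node by its consumers.
    partials = {id(out): [out_grad]}
    for node in reversed(order):
        # All consumers were processed earlier, so the sum is complete.
        node.grad = sum(partials.get(id(node), [0.0]))
        if node.grad_fn is None:
            continue  # leaf node: nothing to propagate further
        for inp, g in zip(node.inputs, node.grad_fn(node.grad)):
            partials.setdefault(id(inp), []).append(g)


x = Value(2.0)
y = Value(3.0)
z = add(mul(x, y), x)  # z = x * y + x
compute_gradients_sketch(z)
print(z.data, x.grad, y.grad)  # 8.0 4.0 2.0
```

Because `x` feeds two operations, its gradient is the sum of two partial contributions (3.0 from the product and 1.0 from the addition). Visiting nodes in reverse topological order ensures that every contribution has arrived before the sum is taken.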

find_topo_sort(node_list)

Find topological sort of nodes in the computation graph.

Given a list of nodes, return a topologically sorted list of the nodes that lead to them, ending in the given nodes. A simple approach is a post-order DFS from the given nodes, following input edges backwards. Because post-order DFS appends a node only after all of its inputs have been visited, the resulting order is a topological sort.

Parameters:

  • node_list (list[Tensor]) –

    List of tensors to sort topologically.

Returns:

  • list[Tensor]

    Topologically sorted list of tensors.
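
The post-order DFS described above can be sketched as follows. `Node` is a minimal hypothetical stand-in for a Tensor that keeps only a name and its inputs, and `topo_sort_sketch` is an illustrative function, not the library's own code.

```python
class Node:
    """Minimal stand-in for a Tensor: a name plus its input nodes."""

    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = list(inputs)


def topo_sort_sketch(node_list):
    visited = set()
    order = []

    def dfs(node):
        if id(node) in visited:
            return
        visited.add(id(node))
        for inp in node.inputs:
            dfs(inp)
        # Post-order: a node is appended only after all of its inputs.
        order.append(node)

    for node in node_list:
        dfs(node)
    return order


a = Node("a")
b = Node("b")
c = Node("c", [a, b])
d = Node("d", [c, a])

names = [n.name for n in topo_sort_sketch([d])]
print(names)  # ['a', 'b', 'c', 'd']
```

The `visited` set ensures that a node shared by several consumers, such as `a` here, appears only once in the result.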