Tensor

Core data structures for multi-dimensional tensors.

`Op`

Base class for all tensor operations.

This class defines the interface that all tensor operations must implement. Operations are callable objects that can be applied to tensors to create new tensors in the computation graph.

Methods:

__call__ –

Apply the operation to the given arguments.
compute –

Compute the actual operation on the underlying arrays.
gradient –

Compute the gradient of the operation.

`__call__(*args)`

Apply the operation to the given arguments.

Parameters:

*args (Tensor, default: () ) –

Input tensors to the operation.

Returns:

Tensor –

Result of applying the operation to the inputs.

`compute(*args)`

Compute the actual operation on the underlying arrays.

Parameters:

*args (tuple[NDArray], default: () ) –

Input arrays to the operation.

Returns:

NDArray –

Result of the operation.

Raises:

NotImplementedError –

This method must be implemented by subclasses.

`gradient(out_grad, out_node)`

Compute the gradient of the operation.

Parameters:

out_grad (Tensor) –

Gradient of the final result with respect to this operation's output (the upstream gradient).
out_node (Tensor) –

The output tensor of this operation.

Returns:

Tensor or tuple[Tensor] –

Gradient(s) with respect to the input(s) of this operation.

Raises:

NotImplementedError –

This method must be implemented by subclasses.
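
The split between compute (raw arrays in, raw array out) and gradient (graph-level bookkeeping) may be easier to see in a small stand-in. The sketch below is not tiny_pytorch's implementation: `Value`, `SketchOp`, and `Mul` are hypothetical names, plain Python floats stand in for arrays, and gradient returns floats rather than Tensors to keep the example short.

```python
class Value:
    """Hypothetical stand-in for a tensor: data plus links into the graph."""

    def __init__(self, data, op=None, inputs=()):
        self.data = data
        self.op = op
        self.inputs = list(inputs)


class SketchOp:
    """Hypothetical base class mirroring the three-method interface."""

    def __call__(self, *args):
        # Run compute on the raw data, then wrap the result in a new node
        # that remembers which op and inputs produced it.
        raw = self.compute(*(a.data for a in args))
        return Value(raw, op=self, inputs=args)

    def compute(self, *args):
        raise NotImplementedError

    def gradient(self, out_grad, out_node):
        raise NotImplementedError


class Mul(SketchOp):
    def compute(self, a, b):
        return a * b

    def gradient(self, out_grad, out_node):
        lhs, rhs = out_node.inputs
        # d(a*b)/da = b and d(a*b)/db = a, each scaled by the upstream gradient.
        return out_grad * rhs.data, out_grad * lhs.data


x, y = Value(3.0), Value(4.0)
z = Mul()(x, y)
grads = z.op.gradient(1.0, z)
```

In this sketch `z.data` holds the product and `grads` holds one gradient per input, in input order.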

`Tensor`

Tensor is the fundamental data structure in tiny_pytorch. It is a multi-dimensional array of numerical values used to represent inputs, outputs, and intermediate results in a computation graph.

Attributes:

cached_data (list[object]) –

The cached data of the tensor.
inputs (list[Tensor]) –

The input tensors to the operation that produced this tensor.
op (Op) –

The operation that produced this tensor.
requires_grad (bool) –

If True, the tensor will track gradients.

`data` `property` `writable`

Returns a detached Tensor with the original data.

`device` `property`

Returns the device on which the tensor is stored.

Returns:

device ( Device ) –

The device on which the tensor is stored.

`dtype` `property`

Returns the data type of the tensor.

Returns:

dtype ( dtype ) –

The data type of the tensor.

`ndim` `property`

Returns the number of dimensions of the tensor.

Returns:

int –

Number of dimensions of the tensor.

`shape` `property`

Returns the shape of the tensor as a tuple.

Returns:

tuple –

Shape of the tensor.

`__add__(other)`

Add another tensor or scalar to this tensor.

Parameters:

other (Tensor or scalar) –

The tensor or scalar to add.

Returns:

Tensor –

Result of the addition operation.

`__init__(array, *, device=None, dtype=None, requires_grad=True)`

Construct a Tensor by copying array.

Parameters:

array (object) –

The array to be copied.
device (Device, default: None ) –

The device on which to place the tensor. Default is None.
dtype (str, default: None ) –

The data type of the tensor. Default is None.
requires_grad (bool, default: True ) –

If True, the tensor will track gradients. Default is True.

`__matmul__(other)`

Matrix multiplication with another tensor.

Parameters:

other (Tensor) –

The tensor to multiply with.

Returns:

Tensor –

Result of the matrix multiplication.

`__mul__(other)`

Multiply this tensor by another tensor or scalar.

Parameters:

other (Tensor or scalar) –

The tensor or scalar to multiply by.

Returns:

Tensor –

Result of the multiplication operation.

`__neg__()`

Negate this tensor.

Returns:

Tensor –

Negated tensor.

`__pow__(other)`

Raise this tensor to the power of another tensor or scalar.

Parameters:

other (Tensor or scalar) –

The exponent.

Returns:

Tensor –

Result of the power operation.

`__repr__()`

String representation of the tensor.

Returns:

str –

String representation showing the tensor data.

`__str__()`

String representation of the tensor.

Returns:

str –

String representation of the tensor data.

`__sub__(other)`

Subtract another tensor or scalar from this tensor.

Parameters:

other (Tensor or scalar) –

The tensor or scalar to subtract.

Returns:

Tensor –

Result of the subtraction operation.

`__truediv__(other)`

Divide this tensor by another tensor or scalar.

Parameters:

other (Tensor or scalar) –

The tensor or scalar to divide by.

Returns:

Tensor –

Result of the division operation.

`backward(out_grad=None)`

Runs the backward pass starting from this tensor, using out_grad as the seed gradient.

Parameters:

out_grad (Tensor, default: None ) –

The gradient used to seed the backward pass at this tensor. If None, a tensor of ones is used.

Returns:

None –

This method updates the grad attribute of the tensor and its dependencies with the computed gradients.

`broadcast_to(shape)`

Broadcasts the tensor to the specified shape.

Parameters:

shape (tuple of ints) –

The new shape of the tensor.

Returns:

Tensor –

A new tensor with the specified shape.
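
Whether a given target shape is valid can be checked ahead of time. The sketch below assumes broadcast_to follows NumPy-style broadcasting rules (shapes are aligned from the right, and a dimension may expand only if it equals 1); that assumption is not stated in this reference, and `can_broadcast` is a hypothetical helper, not part of tiny_pytorch.

```python
def can_broadcast(shape, target):
    """Return True if `shape` can be broadcast to `target` under NumPy-style rules."""
    if len(shape) > len(target):
        return False
    # Left-pad the source shape with 1s so both shapes have the same rank.
    padded = (1,) * (len(target) - len(shape)) + tuple(shape)
    # Each dimension must already match or be 1 (and therefore expandable).
    return all(s == t or s == 1 for s, t in zip(padded, target))


ok = can_broadcast((3, 1), (2, 3, 4))
bad = can_broadcast((3,), (3, 4))
```

Here `(3, 1)` is padded to `(1, 3, 1)` and expands cleanly, while `(3,)` aligns against the trailing dimension 4 and fails.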

`detach()`

Returns a new Tensor with no history (detached from the computation graph). The returned Tensor shares its data with the original.

`from_constant(data, requires_grad=False)` `staticmethod`

Creates a leaf node Tensor from the given data.

`from_operation(op, inputs)` `staticmethod`

Creates a node Tensor by applying the operation op to the Tensors in inputs.

`is_leaf()`

A Tensor is a leaf if its requires_grad is False, or if it was created by the user rather than produced by an operation.
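
The rule can be written as a one-line predicate. This is a sketch only: `is_leaf_sketch` is a hypothetical free function, and it assumes that "created by the user" corresponds to the op attribute being None.

```python
def is_leaf_sketch(requires_grad, op):
    """Leaf if gradients are not tracked, or if no operation produced the tensor."""
    return (not requires_grad) or op is None


some_op = object()  # placeholder for any operation instance

user_created = is_leaf_sketch(requires_grad=True, op=None)
op_result = is_leaf_sketch(requires_grad=True, op=some_op)
untracked_result = is_leaf_sketch(requires_grad=False, op=some_op)
```

Only `op_result` is a non-leaf: it tracks gradients and was produced by an operation.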

`numpy()`

Returns the Tensor as a NumPy ndarray. The underlying data is shared between the Tensor and the ndarray.

`realize_cached_data()`

Runs the computation to produce the output if lazy mode is on; otherwise returns the cached data.
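
One common way to implement deferred evaluation is to compute a node's value on first access and cache it afterwards. The sketch below illustrates that pattern under stated assumptions: `LazyValue` and `tracked_add` are hypothetical, and tiny_pytorch's actual handling of lazy mode may differ.

```python
class LazyValue:
    """Hypothetical node whose data is computed on demand and then cached."""

    def __init__(self, compute_fn=None, inputs=(), cached_data=None):
        self.compute_fn = compute_fn
        self.inputs = list(inputs)
        self.cached_data = cached_data

    def realize_cached_data(self):
        if self.cached_data is None:
            # Realize the inputs first, then run this node's computation once.
            args = [inp.realize_cached_data() for inp in self.inputs]
            self.cached_data = self.compute_fn(*args)
        return self.cached_data


calls = []


def tracked_add(a, b):
    calls.append((a, b))  # record each invocation to show caching at work
    return a + b


a = LazyValue(cached_data=1.0)
b = LazyValue(cached_data=2.0)
c = LazyValue(tracked_add, [a, b])

first = c.realize_cached_data()
second = c.realize_cached_data()
```

The second call returns the stored value without invoking `tracked_add` again.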

`reshape(shape)`

Reshapes the tensor to the specified shape.

Parameters:

shape (tuple of ints) –

The new shape of the tensor.

Returns:

Tensor –

A new tensor with the specified shape.

`sum(axes=None)`

Returns the sum of elements over specified axes.

Parameters:

axes (None or int or tuple of ints, default: None ) –

Axis or axes along which a sum is performed. The default is to sum all of the elements of the input tensor.

Returns:

Tensor –

A new tensor with the sum of elements over specified axes.

`transpose(axes=None)`

Transposes the tensor according to the specified axes.

Parameters:

axes (tuple of ints, default: None ) –

By default, reverse the dimensions, otherwise permute the axes according to the values given.

Returns:

Tensor –

A new tensor with the specified axes transposed.
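
The effect of the axes argument on the result shape can be sketched without any tensor data. `transposed_shape` below is a hypothetical helper that follows the description above: reverse the dimensions when axes is None, otherwise permute them as given.

```python
def transposed_shape(shape, axes=None):
    """Return the shape produced by permuting `shape` according to `axes`."""
    if axes is None:
        # Default: reverse the order of the dimensions.
        axes = tuple(reversed(range(len(shape))))
    return tuple(shape[a] for a in axes)


default_result = transposed_shape((2, 3, 4))
permuted_result = transposed_shape((2, 3, 4), axes=(1, 0, 2))
```

With the default, `(2, 3, 4)` becomes `(4, 3, 2)`; with `axes=(1, 0, 2)` only the first two dimensions swap, giving `(3, 2, 4)`.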

`compute_gradients(out_tensor, out_grad)`

Compute gradients for all nodes in the computation graph.

This function implements reverse-mode automatic differentiation by traversing the computation graph in reverse topological order and computing gradients for each node.

Parameters:

out_tensor (Tensor) –

The output tensor for which gradients are computed.
out_grad (Tensor) –

The gradient of the final result with respect to out_tensor, used to seed the backward pass.

Notes

This function stores each computed gradient in the grad attribute of the corresponding tensor in the computation graph.
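
The traversal can be sketched on scalar values. The code below is an illustration under stated assumptions, not tiny_pytorch's implementation: `Node`, `add`, `mul`, `topo_order`, and `compute_gradients_sketch` are hypothetical, and each node carries a `grad_fn` closure in place of an Op's gradient method.

```python
class Node:
    """Hypothetical scalar graph node; grad_fn maps out_grad to input gradients."""

    def __init__(self, value, inputs=(), grad_fn=None):
        self.value = value
        self.inputs = list(inputs)
        self.grad_fn = grad_fn
        self.grad = None


def add(a, b):
    return Node(a.value + b.value, [a, b], lambda g: (g, g))


def mul(a, b):
    return Node(a.value * b.value, [a, b], lambda g: (g * b.value, g * a.value))


def topo_order(node):
    seen, order = set(), []

    def visit(n):
        if id(n) in seen:
            return
        seen.add(id(n))
        for inp in n.inputs:
            visit(inp)
        order.append(n)  # post-order: inputs come before their consumer

    visit(node)
    return order


def compute_gradients_sketch(out_node, out_grad):
    # Partial gradients contributed by each consumer, keyed by node identity.
    partials = {id(out_node): [out_grad]}
    for node in reversed(topo_order(out_node)):
        # All consumers were processed already, so the partials are complete.
        node.grad = sum(partials[id(node)])
        if node.grad_fn is None:
            continue  # leaf node: nothing to propagate further
        for inp, g in zip(node.inputs, node.grad_fn(node.grad)):
            partials.setdefault(id(inp), []).append(g)


x, y = Node(2.0), Node(3.0)
out = add(mul(x, y), x)  # out = x * y + x
compute_gradients_sketch(out, 1.0)
```

Because x feeds both the product and the sum, its gradient is the sum of two partial contributions (y from the product, 1 from the sum), which is why partials are accumulated in a list before being summed.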

`find_topo_sort(node_list)`

Find topological sort of nodes in the computation graph.

Given a list of nodes, return a topologically sorted list of nodes that ends in them. A simple approach is a post-order DFS from the given nodes, following input edges backwards. Because post-order DFS appends a node only after all of its predecessors have been visited, the resulting order is a topological sort.

Parameters:

node_list (list[Tensor]) –

List of tensors to sort topologically.

Returns:

list[Tensor] –

Topologically sorted list of tensors.
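
The post-order DFS described above can be sketched as follows. `Node` and `topo_sort_sketch` are hypothetical stand-ins; the only property assumed of a node is an inputs list, as documented for Tensor.

```python
class Node:
    """Hypothetical graph node with a name and a list of input nodes."""

    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = list(inputs)


def topo_sort_sketch(node_list):
    visited = set()
    order = []

    def visit(node):
        if id(node) in visited:
            return
        visited.add(id(node))
        # Walk backwards along input edges before emitting this node.
        for inp in node.inputs:
            visit(inp)
        order.append(node)

    for node in node_list:
        visit(node)
    return order


a = Node("a")
b = Node("b")
c = Node("c", [a, b])
d = Node("d", [c, a])

names = [n.name for n in topo_sort_sketch([d])]
```

Every node appears after all of its inputs, and the shared input a is visited only once even though both c and d depend on it.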
