Tensor
Core data structures for multi-dimensional tensors.
Op
Base class for all tensor operations.
This class defines the interface that all tensor operations must implement. Operations are callable objects that can be applied to tensors to create new tensors in the computation graph.
Methods:
- __call__ – Apply the operation to the given arguments.
- compute – Compute the actual operation on the underlying arrays.
- gradient – Compute the gradient of the operation.
__call__(*args)
compute(*args)
Compute the actual operation on the underlying arrays.
Parameters:
- *args (tuple[NDArray], default: ()) – Input arrays to the operation.
Returns:
- NDArray – Result of the operation.
Raises:
- NotImplementedError – This method must be implemented by subclasses.
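As a rough illustration of the interface described above, the sketch below defines a stand-in `Op` base class and a hypothetical `EWiseAdd` subclass. It does not import tiny_pytorch: plain Python lists stand in for `NDArray`, the graph-building part of `__call__` is omitted, and the `gradient(out_grad, inputs)` signature is an assumption, since this page does not show it.

```python
class Op:
    """Stand-in for the documented base class (not the library's code)."""

    def __call__(self, *args):
        # Assumption: the real __call__ wraps the result in a graph node;
        # here it simply forwards to compute().
        return self.compute(*args)

    def compute(self, *args):
        raise NotImplementedError("subclasses must implement compute()")

    def gradient(self, out_grad, inputs):
        # Hypothetical signature, chosen only for this sketch.
        raise NotImplementedError("subclasses must implement gradient()")


class EWiseAdd(Op):
    """Hypothetical element-wise addition over equal-length lists."""

    def compute(self, a, b):
        return [x + y for x, y in zip(a, b)]

    def gradient(self, out_grad, inputs):
        # d(a + b)/da = 1 and d(a + b)/db = 1, so the upstream gradient
        # passes through unchanged to both inputs.
        return out_grad, out_grad


result = EWiseAdd()([1.0, 2.0], [3.0, 4.0])
grads = EWiseAdd().gradient([1.0, 1.0], ([1.0, 2.0], [3.0, 4.0]))
```

Keeping `compute` separate from `__call__` lets a subclass describe only the array-level arithmetic while the base class decides how results are recorded.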
Tensor
Tensor is the fundamental data structure in tiny_pytorch. It is a multi-dimensional array of numerical values used to represent inputs, outputs, and intermediate results in a computation graph.
Attributes:
- cached_data (list[object]) – The cached data of the tensor.
- inputs (list[Tensor]) – The input tensors to the operation that produced this tensor.
- op (Op) – The operation that produced this tensor.
- requires_grad (bool) – If True, the tensor will track gradients.
data
property
writable
Returns a detached Tensor with the original data.
device
property
Returns the device on which the tensor is stored.
Returns:
- device (Device) – The device on which the tensor is stored.
dtype
property
Returns the data type of the tensor.
Returns:
- dtype (dtype) – The data type of the tensor.
ndim
property
Returns the number of dimensions of the tensor.
Returns:
- int – Number of dimensions of the tensor.
shape
property
Returns the shape of the tensor as a tuple.
Returns:
- tuple – Shape of the tensor.
__add__(other)
__init__(array, *, device=None, dtype=None, requires_grad=True)
Construct a Tensor by copying array.
Parameters:
- array (object) – The array to be copied.
- device (Device, default: None) – The device on which to place the tensor.
- dtype (str, default: None) – The data type of the tensor.
- requires_grad (bool, default: True) – If True, the tensor will track gradients.
__matmul__(other)
__mul__(other)
__neg__()
__pow__(other)
__repr__()
String representation of the tensor.
Returns:
- str – String representation showing the tensor data.
__str__()
String representation of the tensor.
Returns:
- str – String representation of the tensor data.
__sub__(other)
__truediv__(other)
backward(out_grad=None)
Computes gradients for the tensors in this tensor's computation graph, starting from out_grad.
Parameters:
- out_grad (Tensor, default: None) – The gradient flowing into this tensor, used to seed the backward pass. If None, a tensor of ones is used.
Returns:
- None – This method updates the grad attribute of the tensor and its dependencies with the computed gradients.
broadcast_to(shape)
Broadcasts the tensor to the specified shape.
Parameters:
- shape (tuple of ints) – The new shape of the tensor.
Returns:
- Tensor – A new tensor with the specified shape.
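Which target shapes are valid is not spelled out here. Assuming the method follows NumPy-style broadcasting rules (an assumption, not something this page confirms), a shape check could be sketched as below; `can_broadcast` is a hypothetical helper, not part of the library.

```python
def can_broadcast(src_shape, dst_shape):
    """Return True if src_shape can be broadcast to dst_shape.

    Assumes NumPy-style rules: shapes are aligned from the trailing
    dimension, and each source dimension must equal the target
    dimension or be 1.
    """
    if len(src_shape) > len(dst_shape):
        return False
    # Compare dimensions pairwise, starting from the last one.
    for s, d in zip(reversed(src_shape), reversed(dst_shape)):
        if s != d and s != 1:
            return False
    return True


ok = can_broadcast((3, 1), (2, 3, 4))
bad = can_broadcast((3,), (3, 4))
```

Here `(3, 1)` fits `(2, 3, 4)` because the trailing 1 stretches to 4 and a new leading axis is added, while `(3,)` does not fit `(3, 4)` because the trailing dimensions 3 and 4 disagree.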
detach()
Returns a new Tensor with no history (detached from the computation graph). The returned Tensor will share the same data with the original one.
from_constant(data, requires_grad=False)
staticmethod
Creates a leaf node Tensor from the given data.
from_operation(op, inputs)
staticmethod
Creates a node Tensor by applying the op operation on the inputs Tensors.
is_leaf()
A Tensor is considered a leaf if requires_grad is set to False, or if it was created by the user and is not the result of an operation.
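The rule above reduces to a single boolean expression. The sketch below uses a hypothetical `Node` dataclass as a stand-in for Tensor, and assumes that a user-created tensor is one whose `op` is None.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for Tensor, holding only the fields needed here."""
    op: Optional[str] = None      # None means "created by the user" (assumption)
    requires_grad: bool = True


def is_leaf(node):
    # Leaf if gradients are not tracked, or if no operation produced the node.
    return (not node.requires_grad) or node.op is None


user_made = is_leaf(Node())
frozen_result = is_leaf(Node(op="add", requires_grad=False))
tracked_result = is_leaf(Node(op="add", requires_grad=True))
```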
numpy()
Returns the Tensor as a NumPy ndarray. The underlying data is shared between the Tensor and the ndarray.
realize_cached_data()
Runs the computation to produce the output if lazy mode is on; otherwise returns the cached data.
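One way to picture this behaviour is a node that computes its value only on first request and then reuses it. The sketch below is an assumption about the general pattern, not the library's implementation; `LazyNode`, `fn`, and the use of None to mean "not yet computed" are all hypothetical.

```python
class LazyNode:
    """Hypothetical node that defers computation until the value is needed."""

    def __init__(self, fn=None, inputs=(), cached_data=None):
        self.fn = fn                  # function that produces this node's value
        self.inputs = inputs          # nodes whose values fn consumes
        self.cached_data = cached_data

    def realize_cached_data(self):
        # Compute only when nothing is cached yet, then reuse the result.
        if self.cached_data is None:
            args = [node.realize_cached_data() for node in self.inputs]
            self.cached_data = self.fn(*args)
        return self.cached_data


calls = []


def add(x, y):
    calls.append((x, y))  # record each real computation
    return x + y


a = LazyNode(cached_data=2)
b = LazyNode(cached_data=3)
c = LazyNode(fn=add, inputs=(a, b))
first = c.realize_cached_data()
second = c.realize_cached_data()
```

The second call returns the cached value, so `add` runs only once.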
reshape(shape)
Reshapes the tensor to the specified shape.
Parameters:
- shape (tuple of ints) – The new shape of the tensor.
Returns:
- Tensor – A new tensor with the specified shape.
sum(axes=None)
Returns the sum of elements over specified axes.
Parameters:
- axes (None or int or tuple of ints, default: None) – Axis or axes along which a sum is performed. The default is to sum all of the elements of the input tensor.
Returns:
- Tensor – A new tensor with the sum of elements over specified axes.
transpose(axes=None)
Transposes the tensor according to the specified axes.
Parameters:
- axes (tuple of ints, default: None) – By default, reverse the dimensions, otherwise permute the axes according to the values given.
Returns:
- Tensor – A new tensor with the specified axes transposed.
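To make the effect of `axes` on the result shape concrete, the sketch below computes the output shape only. It assumes the NumPy convention, where output dimension `i` takes the size of input dimension `axes[i]`; the library may interpret `axes` differently, and `transposed_shape` is a hypothetical helper.

```python
def transposed_shape(shape, axes=None):
    """Return the shape after a transpose (NumPy convention assumed)."""
    if axes is None:
        # Default: reverse the order of the dimensions.
        axes = tuple(reversed(range(len(shape))))
    if sorted(axes) != list(range(len(shape))):
        raise ValueError("axes must be a permutation of the dimensions")
    # Output dimension i takes its size from input dimension axes[i].
    return tuple(shape[ax] for ax in axes)


reversed_dims = transposed_shape((2, 3, 4))
swapped_first_two = transposed_shape((2, 3, 4), (1, 0, 2))
```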
compute_gradients(out_tensor, out_grad)
Compute gradients for all nodes in the computation graph.
This function implements reverse-mode automatic differentiation by traversing the computation graph in reverse topological order and computing gradients for each node.
Parameters:
- out_tensor (Tensor) – The output tensor for which gradients are computed.
- out_grad (Tensor) – The gradient of the output with respect to the final result.
Notes
This function stores the computed gradient in the grad attribute of each tensor in the computation graph.
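The reverse traversal described above can be illustrated on scalars. This is a minimal sketch under simplifying assumptions, not the library's code: the hypothetical `Node` class stores the local derivative for each input directly instead of calling an op's `gradient` method, and gradients are plain floats.

```python
class Node:
    """Hypothetical scalar graph node used only for this sketch."""

    def __init__(self, value, inputs=(), local_grads=()):
        self.value = value
        self.inputs = inputs            # nodes this node was computed from
        self.local_grads = local_grads  # d(self)/d(input), one per input
        self.grad = 0.0


def mul(a, b):
    return Node(a.value * b.value, (a, b), (b.value, a.value))


def add(a, b):
    return Node(a.value + b.value, (a, b), (1.0, 1.0))


def topo_order(out):
    """Post-order DFS: every node appears after all of its inputs."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent in node.inputs:
            visit(parent)
        order.append(node)

    visit(out)
    return order


def compute_gradients(out, out_grad=1.0):
    out.grad = out_grad
    # Walk from the output back to the leaves, accumulating contributions
    # so that a node used more than once sums all of its gradients.
    for node in reversed(topo_order(out)):
        for parent, local in zip(node.inputs, node.local_grads):
            parent.grad += node.grad * local


x = Node(2.0)
y = Node(3.0)
z = add(mul(x, y), x)  # z = x * y + x
compute_gradients(z)
```

Since `x` feeds both the product and the sum, its gradient is the sum of two contributions, `y + 1`, which is why accumulation rather than assignment is used.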
find_topo_sort(node_list)
Find topological sort of nodes in the computation graph.
Given a list of nodes, return a topological sort list of nodes ending in them. A simple algorithm is to do a post-order DFS traversal on the given nodes, going backwards based on input edges. Since a node is added to the ordering after all its predecessors are traversed due to post-order DFS, we get a topological sort.
Parameters:
- node_list (list[Tensor]) – List of tensors to sort topologically.
Returns:
- list[Tensor] – Topologically sorted list of tensors.
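The post-order DFS described above can be sketched on a plain dictionary graph. The extra `inputs_of` parameter is hypothetical and stands in for each tensor's `inputs` attribute, so the signature differs from the documented one.

```python
def find_topo_sort(node_list, inputs_of):
    """Return nodes ordered so that each one follows all of its inputs.

    inputs_of maps a node to the nodes it depends on (illustrative
    replacement for Tensor.inputs).
    """
    visited = set()
    order = []

    def dfs(node):
        if node in visited:
            return
        visited.add(node)
        # Visit every input first, then append the node itself (post-order).
        for parent in inputs_of.get(node, ()):
            dfs(parent)
        order.append(node)

    for node in node_list:
        dfs(node)
    return order


# d depends on c and a; c depends on a and b.
graph = {"c": ["a", "b"], "d": ["c", "a"]}
order = find_topo_sort(["d"], graph)
```

For this graph the result is `["a", "b", "c", "d"]`: the shared input `a` is visited once, and `d` comes last because it is appended only after everything it depends on.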