Operators
Tensor operations module for tiny-pytorch.
This module provides a comprehensive collection of fundamental tensor operations that form the building blocks of the computational graph in tiny-pytorch. Each operation is implemented as a class that inherits from the TensorOp base class, with corresponding helper functions for easier usage.
The module includes element-wise operations, matrix operations, reduction operations, activation functions, and various mathematical functions commonly used in deep learning and neural network computations.
Key Features
- Automatic differentiation support through gradient methods
- Element-wise and scalar operations
- Matrix operations (multiplication, transpose)
- Reduction operations (summation, log-sum-exp)
- Activation functions (ReLU, tanh)
- Shape manipulation (reshape, broadcast, stack, split)
- Convolutional operations
- Memory-efficient operations with strided arrays
Classes:
- TensorOp – Base class for all tensor operations.
- TensorTupleOp – Base class for operations that return tensor tuples.
- ScalarAdd – Addition of a scalar to a tensor.
- EWiseAdd – Element-wise addition of two tensors.
- ScalarMul – Multiplication of a tensor by a scalar.
- EWiseMul – Element-wise multiplication of two tensors.
- Negate – Negation of a tensor.
- ScalarPower – Raising tensor elements to a scalar power.
- EWisePower – Element-wise power operation between two tensors.
- ScalarDivide – Division of a tensor by a scalar.
- EWiseDivide – Element-wise division of two tensors.
- Reshape – Reshaping a tensor to a new shape.
- Summation – Summing tensor elements along specified axes.
- BroadcastTo – Broadcasting a tensor to a larger shape.
- Transpose – Transposing a tensor along specified axes.
- MatMul – Matrix multiplication between two tensors.
- Log – Natural logarithm of tensor elements.
- Exp – Exponential of tensor elements.
- ReLU – Rectified Linear Unit activation function.
- LogSumExp – Log-sum-exp operation, commonly used in softmax computation.
- Tanh – Hyperbolic tangent activation function.
- Stack – Stack a sequence of arrays along a new axis.
- Split – Split a tensor along a specified axis.
- Flip – Reverse the order of elements along specified axes.
- Dilate – Insert zeros between elements along specified axes.
- UnDilate – Remove zeros inserted by dilation along specified axes.
- Conv – 2D convolution operation.
Functions:
- add_scalar – Add a scalar to a tensor.
- add – Add two tensors element-wise.
- mul_scalar – Multiply a tensor by a scalar.
- multiply – Multiply two tensors element-wise.
- negate – Negate a tensor.
- power_scalar – Raise tensor elements to a scalar power.
- power – Element-wise power operation.
- divide_scalar – Divide a tensor by a scalar.
- divide – Element-wise division of tensors.
- reshape – Reshape a tensor.
- summation – Sum tensor elements along specified axes.
- broadcast_to – Broadcast tensor to a larger shape.
- transpose – Transpose tensor axes.
- matmul – Matrix multiplication.
- log – Natural logarithm.
- exp – Exponential function.
- relu – ReLU activation function.
- logsumexp – Log-sum-exp operation.
- tanh – Hyperbolic tangent function.
- stack – Stack a sequence of arrays along a new axis.
- split – Split a tensor along a specified axis.
- flip – Reverse the order of elements along specified axes.
- dilate – Insert zeros between elements along specified axes.
- undilate – Remove zeros inserted by dilation along specified axes.
- conv – 2D convolution operation.
Notes
All operations support automatic differentiation through their gradient methods, making them suitable for building and training neural networks. The operations are designed to work with the NDArray backend system and support multiple backends (CPU, CUDA, and NumPy).
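The pairing of a forward rule with a gradient rule can be illustrated with a small stand-alone sketch. The names `SketchOp` and `SketchMul` and the use of plain NumPy arrays are assumptions made for illustration only; they are not tiny-pytorch's actual classes, and the real `TensorOp` interface may differ.

```python
import numpy as np


class SketchOp:
    """Hypothetical stand-in for an op base class (not tiny-pytorch's TensorOp)."""

    def compute(self, *inputs):
        raise NotImplementedError

    def gradient(self, out_grad, inputs):
        raise NotImplementedError


class SketchMul(SketchOp):
    """Element-wise multiplication together with its backward rule."""

    def compute(self, a, b):
        return a * b

    def gradient(self, out_grad, inputs):
        a, b = inputs
        # d(a*b)/da = b and d(a*b)/db = a, each scaled by the upstream gradient.
        return out_grad * b, out_grad * a


a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
op = SketchMul()
out = op.compute(a, b)
grad_a, grad_b = op.gradient(np.ones_like(out), (a, b))
print(out.tolist())     # [4.0, 10.0, 18.0]
print(grad_a.tolist())  # [4.0, 5.0, 6.0]
print(grad_b.tolist())  # [1.0, 2.0, 3.0]
```

Each operation only needs to know its own local derivative; chaining these rules over the computational graph is what yields gradients for a whole model.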
Examples:
>>> import tiny_pytorch as tp
>>> x = tp.Tensor([[1, 2], [3, 4]])
>>> y = tp.Tensor([[5, 6], [7, 8]])
>>> z = tp.ops.add(x, y)     # Element-wise addition
>>> w = tp.ops.matmul(x, y)  # Matrix multiplication of two 2-D tensors
BroadcastTo
Bases: TensorOp
Broadcast a tensor to a larger shape.
Parameters:
- shape (tuple) – Target shape to broadcast to.
Methods:
- compute – Compute the broadcast operation.
- gradient – Compute the gradient of the operation.
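Because broadcasting copies values, the gradient of a broadcast has to add up the contributions of every copy. The sketch below shows one common way to do that with NumPy; `broadcast_to_grad` is a hypothetical helper, not the library's gradient method.

```python
import numpy as np


def broadcast_to_grad(out_grad, input_shape):
    """Reduce an upstream gradient back to `input_shape` (hypothetical helper)."""
    # Sum away the leading axes that broadcasting added.
    extra = out_grad.ndim - len(input_shape)
    grad = out_grad.sum(axis=tuple(range(extra))) if extra else out_grad
    # Sum (keeping the axis) over dimensions that were stretched from size 1.
    stretched = tuple(
        i for i, n in enumerate(input_shape) if n == 1 and grad.shape[i] != 1
    )
    if stretched:
        grad = grad.sum(axis=stretched, keepdims=True)
    return grad


x = np.array([[1.0], [2.0]])       # shape (2, 1)
y = np.broadcast_to(x, (4, 2, 3))  # shape (4, 2, 3)
g = broadcast_to_grad(np.ones(y.shape), x.shape)
print(g.shape)     # (2, 1)
print(g.tolist())  # [[12.0], [12.0]]
```

Each input element was copied 4 * 3 = 12 times, so with an upstream gradient of ones it receives a gradient of 12.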
Conv
Bases: TensorOp
2D convolution operation between input tensor and kernel.
This operation performs 2D convolution between an input tensor and a kernel tensor. The input is expected to be in NHWC format (batch, height, width, channels) and the kernel in KKCC format (kernel_height, kernel_width, input_channels, output_channels).
Parameters:
- stride (int, default: 1) – The stride of the convolution.
- padding (int, default: 0) – The amount of padding to apply to the input.
Methods:
- compute – Compute the 2D convolution operation using im2col and matrix multiplication.
- gradient – Compute the gradient with respect to both input and kernel tensors.
Notes
- Input tensor A should have shape (N, H, W, C_in)
- Kernel tensor B should have shape (K, K, C_in, C_out) where K is the kernel size
- Output tensor will have shape (N, out_H, out_W, C_out)
- Uses im2col transformation for efficient computation
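The im2col idea mentioned in the notes can be sketched in NumPy: view every K x K x C_in patch of the padded input as one row, flatten the kernel into a matrix, and do a single matrix multiplication. The functions `conv_im2col` and `conv_naive` are hypothetical illustrations under the NHWC and (K, K, C_in, C_out) layouts described above, not the library's implementation. As is conventional in deep learning, the kernel is not flipped.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def conv_im2col(a, b, stride=1, padding=0):
    """2D convolution for NHWC input `a` and (K, K, C_in, C_out) kernel `b`."""
    a = np.pad(a, ((0, 0), (padding, padding), (padding, padding), (0, 0)))
    n, h, w, c_in = a.shape
    k, _, _, c_out = b.shape
    out_h = (h - k) // stride + 1
    out_w = (w - k) // stride + 1
    ns, hs, ws, cs = a.strides
    # View every K x K x C_in patch without copying data.
    patches = as_strided(
        a,
        shape=(n, out_h, out_w, k, k, c_in),
        strides=(ns, hs * stride, ws * stride, hs, ws, cs),
    )
    # One row per output position, one column per output channel.
    cols = patches.reshape(n * out_h * out_w, k * k * c_in)
    out = cols @ b.reshape(k * k * c_in, c_out)
    return out.reshape(n, out_h, out_w, c_out)


def conv_naive(a, b, stride=1, padding=0):
    """Loop-based reference, used only to check the im2col version."""
    a = np.pad(a, ((0, 0), (padding, padding), (padding, padding), (0, 0)))
    n, h, w, _ = a.shape
    k, _, _, c_out = b.shape
    out_h, out_w = (h - k) // stride + 1, (w - k) // stride + 1
    out = np.zeros((n, out_h, out_w, c_out))
    for i in range(out_h):
        for j in range(out_w):
            patch = a[:, i * stride:i * stride + k, j * stride:j * stride + k, :]
            out[:, i, j, :] = np.tensordot(patch, b, axes=([1, 2, 3], [0, 1, 2]))
    return out


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 8, 8, 3))
kernel = rng.standard_normal((3, 3, 3, 4))
y = conv_im2col(x, kernel, stride=2, padding=1)
print(y.shape)                                                     # (2, 4, 4, 4)
print(np.allclose(y, conv_naive(x, kernel, stride=2, padding=1)))  # True
```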
Dilate
Bases: TensorOp
Dilate a tensor by inserting zeros between elements along specified axes.
This operation inserts zeros between elements along the specified axes, effectively increasing the size of the tensor in those dimensions. This is commonly used in convolutional neural networks for dilated convolutions.
Parameters:
- axes (tuple[int, ...]) – The axes along which to apply dilation. Each axis index must be valid for the tensor's dimensions.
- dilation (int) – The dilation factor. For each element in the original tensor, `dilation` zeros will be inserted after it along the specified axes.
Methods:
- compute – Compute the dilation operation on the input NDArray.
- gradient – Compute the gradient of the dilation operation (returns undilated gradient).
Examples:
>>> x = Tensor([[1, 2], [3, 4]])
>>> Dilate((0,), 1)(x)
Tensor([[1, 2], [0, 0], [3, 4]])
>>> Dilate((1,), 1)(x)
Tensor([[1, 0, 2], [3, 0, 4]])
>>> Dilate((0, 1), 1)(x)
Tensor([[1, 0, 2], [0, 0, 0], [3, 0, 4]])
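A NumPy sketch that reproduces the outputs above: allocate a zero array of the enlarged shape and write the original values at every (dilation + 1)-th position. `dilate_sketch` is a hypothetical helper. It assumes, as the examples show, that zeros appear only between elements; an implementation that also pads after the last element would produce a larger output.

```python
import numpy as np


def dilate_sketch(a, axes, dilation):
    """Insert `dilation` zeros between neighbouring elements along `axes`."""
    shape = list(a.shape)
    index = [slice(None)] * a.ndim
    for ax in axes:
        n = a.shape[ax]
        # n original elements plus `dilation` zeros in each of the n - 1 gaps.
        shape[ax] = n + max(n - 1, 0) * dilation
        index[ax] = slice(None, None, dilation + 1)
    out = np.zeros(shape, dtype=a.dtype)
    out[tuple(index)] = a
    return out


x = np.array([[1, 2], [3, 4]])
print(dilate_sketch(x, (0,), 1).tolist())    # [[1, 2], [0, 0], [3, 4]]
print(dilate_sketch(x, (1,), 1).tolist())    # [[1, 0, 2], [3, 0, 4]]
print(dilate_sketch(x, (0, 1), 1).tolist())  # [[1, 0, 2], [0, 0, 0], [3, 0, 4]]
```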
EWiseAdd
Bases: TensorOp
Element-wise addition of two tensors.
Methods:
- compute – Compute element-wise addition.
- gradient – Compute the gradient with respect to both inputs.
EWiseDivide
Bases: TensorOp
Element-wise division of two tensors.
Methods:
- compute – Compute element-wise division.
- gradient – Compute the gradient with respect to both inputs.
EWiseMul
Bases: TensorOp
Element-wise multiplication of two tensors.
Methods:
- compute – Compute element-wise multiplication.
- gradient – Compute the gradient with respect to both inputs.
EWisePower
Bases: TensorOp
Element-wise power operation between two tensors.
Methods:
- compute – Compute element-wise power operation.
- gradient – Compute the gradient with respect to both inputs.
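For reference, the two partial derivatives of a ** b are b * a ** (b - 1) with respect to a and a ** b * ln(a) with respect to b. The sketch below checks these formulas against finite differences in NumPy; `ewise_power_grads` is a hypothetical helper and assumes a > 0 so that the logarithm is defined.

```python
import numpy as np


def ewise_power_grads(a, b, out_grad):
    """Gradients of a ** b with respect to a and b (valid for a > 0)."""
    grad_a = out_grad * b * a ** (b - 1)
    grad_b = out_grad * a ** b * np.log(a)
    return grad_a, grad_b


a = np.array([1.5, 2.0, 3.0])
b = np.array([2.0, 0.5, 3.0])
grad_a, grad_b = ewise_power_grads(a, b, np.ones_like(a))

# Compare against central finite differences.
eps = 1e-6
num_a = ((a + eps) ** b - (a - eps) ** b) / (2 * eps)
num_b = (a ** (b + eps) - a ** (b - eps)) / (2 * eps)
print(np.allclose(grad_a, num_a, atol=1e-5))  # True
print(np.allclose(grad_b, num_b, atol=1e-5))  # True
```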
Exp
Bases: TensorOp
Exponential of tensor elements.
Methods:
- compute – Compute exponential.
- gradient – Compute the gradient of the operation.
Flip
Bases: TensorOp
Reverse (flip) the order of elements in a tensor along the specified axes.
Parameters:
- axes (tuple[int, ...] or None, default: None) – Axes along which to flip the tensor. Each axis index must be valid for the tensor's dimensions. If None, flip over all axes (reverse the tensor in every dimension).
Methods:
- compute – Compute the flip operation on the input NDArray.
- gradient – Compute the gradient of the flip operation (flip the gradient along the same axes).
Raises:
- AxisError – If the number of axes is greater than the number of dimensions, or if any axis is out of bounds.
Examples:
>>> x = Tensor([[1, 2], [3, 4]])
>>> Flip((0,))(x)
Tensor([[3, 4], [1, 2]])
>>> Flip((1,))(x)
Tensor([[2, 1], [4, 3]])
>>> Flip((0, 1))(x)
Tensor([[4, 3], [2, 1]])
>>> Flip()(x)
Tensor([[4, 3], [2, 1]])
Log
Bases: TensorOp
Natural logarithm of tensor elements.
Methods:
- compute – Compute natural logarithm.
- gradient – Compute the gradient of the operation.
LogSumExp
Bases: TensorOp
Log-sum-exp operation, commonly used in softmax computation.
Parameters:
- axes (tuple or None, default: None) – Axes along which to perform the operation. If None, use all axes.
Methods:
- compute – Compute log-sum-exp operation.
- gradient – Compute the gradient of the operation.
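Log-sum-exp is usually computed by subtracting the maximum before exponentiating, so that large inputs do not overflow. The NumPy sketch below shows that trick; `logsumexp_sketch` is a hypothetical helper, and whether tiny-pytorch's `compute` uses exactly this form is an assumption.

```python
import numpy as np


def logsumexp_sketch(z, axes=None):
    """Numerically stable log(sum(exp(z))) over `axes` (all axes if None)."""
    # Subtract the maximum so the largest exponent is exp(0) = 1.
    m = z.max(axis=axes, keepdims=True)
    out = np.log(np.exp(z - m).sum(axis=axes, keepdims=True)) + m
    return np.squeeze(out, axis=axes)


z = np.array([[1000.0, 1000.0], [0.0, 0.0]])
lse = logsumexp_sketch(z, axes=(1,))
print(np.allclose(lse, [1000.0 + np.log(2.0), np.log(2.0)]))  # True

# The gradient with respect to z is the softmax of z along the same axes.
softmax = np.exp(z - lse[:, None])
print(np.allclose(softmax.sum(axis=1), 1.0))  # True
```

A direct `np.log(np.exp(z).sum(axis=1))` would overflow to infinity for the first row.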
MatMul
Bases: TensorOp
Matrix multiplication between two tensors.
Methods:
- compute – Compute matrix multiplication.
- gradient – Compute the gradient with respect to both inputs.
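For two 2-D inputs, the gradients of `a @ b` are `out_grad @ b.T` with respect to `a` and `a.T @ out_grad` with respect to `b`. The NumPy sketch below verifies the first of these with a finite difference; `matmul_grads` is a hypothetical helper, and batched inputs (which would additionally need a sum over broadcast batch dimensions) are not covered.

```python
import numpy as np


def matmul_grads(a, b, out_grad):
    """Gradients of a @ b for 2-D inputs."""
    return out_grad @ b.T, a.T @ out_grad


rng = np.random.default_rng(0)
a = rng.standard_normal((2, 3))
b = rng.standard_normal((3, 4))
out_grad = rng.standard_normal((2, 4))
grad_a, grad_b = matmul_grads(a, b, out_grad)
print(grad_a.shape, grad_b.shape)  # (2, 3) (3, 4)


def loss(a_val):
    # Scalar whose gradient with respect to (a @ b) is exactly `out_grad`.
    return np.sum(out_grad * (a_val @ b))


# Perturb a single entry of `a` and compare with the analytic gradient.
eps = 1e-6
a_plus = a.copy()
a_plus[0, 1] += eps
numeric = (loss(a_plus) - loss(a)) / eps
print(np.isclose(numeric, grad_a[0, 1], atol=1e-5))  # True
```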
Negate
Bases: TensorOp
Negate a tensor element-wise.
Methods:
- compute – Compute the negation operation.
- gradient – Compute the gradient of the operation.
ReLU
Bases: TensorOp
Rectified Linear Unit activation function.
Methods:
- compute – Compute ReLU activation.
- gradient – Compute the gradient of the operation.
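ReLU passes positive inputs through and zeroes the rest, so its gradient is the upstream gradient masked by where the input was positive. A minimal NumPy sketch follows; the helper names are hypothetical, and taking the derivative at exactly 0 to be 0 is an assumption (other conventions exist).

```python
import numpy as np


def relu_forward(a):
    return np.maximum(a, 0)


def relu_grad(a, out_grad):
    # Pass the gradient through only where the input was strictly positive.
    return out_grad * (a > 0)


a = np.array([-1.0, 0.0, 2.0])
print(relu_forward(a).tolist())                 # [0.0, 0.0, 2.0]
print(relu_grad(a, np.ones_like(a)).tolist())   # [0.0, 0.0, 1.0]
```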
Reshape
Bases: TensorOp
Reshape a tensor to a new shape.
Parameters:
- shape (tuple) – The target shape for the tensor.
Methods:
- compute – Compute the reshape operation.
- gradient – Compute the gradient of the operation.
ScalarAdd
Bases: TensorOp
Add a scalar to a tensor.
Parameters:
- scalar (float) – The scalar value to add to the tensor.
Methods:
- compute – Compute the scalar addition operation.
- gradient – Compute the gradient of the operation.
ScalarDivide
Bases: TensorOp
Divide a tensor by a scalar.
Parameters:
- scalar (float) – The scalar value to divide by.
Methods:
- compute – Compute the scalar division.
- gradient – Compute the gradient of the operation.
ScalarMul
Bases: TensorOp
Multiply a tensor by a scalar.
Parameters:
- scalar (float) – The scalar value to multiply with the tensor.
Methods:
- compute – Compute the scalar multiplication.
- gradient – Compute the gradient of the operation.
ScalarPower
Bases: TensorOp
Raise tensor elements to a scalar power.
Parameters:
- scalar (float) – The power to raise tensor elements to.
Methods:
- compute – Compute the power operation.
- gradient – Compute the gradient of the operation.
Split
Bases: TensorTupleOp
Split a tensor along an axis into a tuple of tensors.
This operation is the inverse of Stack. It splits a tensor along a specified axis into multiple tensors, each with one fewer dimension than the input tensor.
Parameters:
- axis (int) – The axis along which to split the tensor. The axis dimension will be removed from each resulting tensor.
Methods:
- compute – Split the input array along the specified axis.
- gradient – Compute the gradient of the split operation (returns stack of out_grad tensors).
Stack
Bases: TensorOp
Stack a sequence of arrays along a new axis.
Parameters:
- axis (int) – The axis along which to stack. The new axis will be inserted at this position in the result array shape.
Methods:
- compute – Stack the input arrays along the specified axis.
- gradient – Compute the gradient of the stack operation (returns split of out_grad along axis).
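Stack and Split undo each other, which is also why each one's gradient is expressed through the other. The NumPy sketch below shows the round trip; `split_sketch` and `stack_sketch` are hypothetical stand-ins for the library's operations.

```python
import numpy as np


def split_sketch(a, axis):
    # Take each index along `axis`; the axis itself is dropped from every piece.
    return tuple(np.take(a, i, axis=axis) for i in range(a.shape[axis]))


def stack_sketch(arrays, axis):
    # Join equally shaped arrays along a newly inserted axis.
    return np.stack(arrays, axis=axis)


x = np.array([[1, 2, 3], [4, 5, 6]])
parts = split_sketch(x, axis=1)
print(len(parts), parts[0].shape)                      # 3 (2,)
print(np.array_equal(stack_sketch(parts, axis=1), x))  # True
```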
Summation
Bases: TensorOp
Sum tensor elements along specified axes.
Parameters:
- axes (tuple or None, default: None) – Axes along which to perform summation. If None, sum over all axes.
Methods:
- compute – Compute the summation operation.
- gradient – Compute the gradient of the operation.
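Since every summed element contributes to the result with weight 1, the gradient of a summation simply copies the upstream gradient back across the reduced axes. One way to sketch this in NumPy is to re-insert the reduced axes as size-1 dimensions and broadcast; `summation_grad` is a hypothetical helper, not the library's method.

```python
import numpy as np


def summation_grad(out_grad, input_shape, axes=None):
    """Expand the gradient of a sum back to the shape of the summed input."""
    if axes is None:
        axes = tuple(range(len(input_shape)))
    elif isinstance(axes, int):
        axes = (axes,)
    # Re-insert the reduced axes as size-1 dimensions, then broadcast.
    keep_shape = list(input_shape)
    for ax in axes:
        keep_shape[ax] = 1
    return np.broadcast_to(np.reshape(out_grad, keep_shape), input_shape)


x = np.arange(6.0).reshape(2, 3)
out_grad = np.array([10.0, 20.0])  # upstream gradient of x.sum(axis=1)
print(summation_grad(out_grad, x.shape, axes=(1,)).tolist())
# [[10.0, 10.0, 10.0], [20.0, 20.0, 20.0]]
```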
Tanh
Bases: TensorOp
Hyperbolic tangent activation function.
Methods:
- compute – Compute hyperbolic tangent.
- gradient – Compute the gradient of the operation.
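The derivative of tanh is 1 - tanh(x)^2, so the gradient can be formed from the forward output alone. A short NumPy sketch with a finite-difference check follows; `tanh_grad` is a hypothetical helper.

```python
import numpy as np


def tanh_grad(a, out_grad):
    # d/da tanh(a) = 1 - tanh(a)^2
    return out_grad * (1.0 - np.tanh(a) ** 2)


a = np.array([-2.0, 0.0, 0.5])
analytic = tanh_grad(a, np.ones_like(a))

eps = 1e-6
numeric = (np.tanh(a + eps) - np.tanh(a - eps)) / (2 * eps)
print(np.allclose(analytic, numeric, atol=1e-6))  # True
print(float(analytic[1]))                         # 1.0
```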
Transpose
Bases: TensorOp
Transpose a tensor along specified axes.
Parameters:
- axes (tuple or None, default: None) – Permutation of the dimensions. If None, swap the last two dimensions.
Methods:
- compute – Compute the transpose operation.
- gradient – Compute the gradient of the operation.
UnDilate
Bases: TensorOp
Undilate a tensor by removing zeros inserted by dilation along specified axes.
This operation is the inverse of Dilate. It removes the zeros that were inserted during dilation, effectively reducing the size of the tensor in those dimensions. This is commonly used in convolutional neural networks for dilated convolutions.
Parameters:
- axes (tuple[int, ...]) – The axes along which to apply undilation. Each axis index must be valid for the tensor's dimensions.
- dilation (int) – The dilation factor that was used in the original Dilate operation.
Methods:
- compute – Compute the undilation operation on the input NDArray.
- gradient – Compute the gradient of the undilation operation (returns dilated gradient).
Examples:
>>> x = Tensor([[1, 0, 2], [0, 0, 0], [3, 0, 4]])
>>> UnDilate((0,), 1)(x)
Tensor([[1, 0, 2], [3, 0, 4]])
>>> UnDilate((1,), 1)(x)
Tensor([[1, 2], [0, 0], [3, 4]])
>>> UnDilate((0, 1), 1)(x)
Tensor([[1, 2], [3, 4]])
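Undilation amounts to keeping every (dilation + 1)-th element along the chosen axes, which in NumPy is a strided slice. `undilate_sketch` below is a hypothetical helper that illustrates this; it is not the library's `compute` method.

```python
import numpy as np


def undilate_sketch(a, axes, dilation):
    """Keep every (dilation + 1)-th element along each axis in `axes`."""
    index = [slice(None)] * a.ndim
    for ax in axes:
        index[ax] = slice(None, None, dilation + 1)
    return a[tuple(index)]


x = np.array([[1, 0, 2], [0, 0, 0], [3, 0, 4]])
print(undilate_sketch(x, (0, 1), 1).tolist())  # [[1, 2], [3, 4]]
```

Because the slice is a view, no data needs to be copied until the result is written elsewhere.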
add(a, b)
add_scalar(a, scalar)
broadcast_to(a, shape)
conv(a, b, stride=1, padding=1)
Perform 2D convolution between input tensor and kernel.
This function performs 2D convolution between an input tensor and a kernel tensor. The input is expected to be in NHWC format (batch, height, width, channels) and the kernel in KKCC format (kernel_height, kernel_width, input_channels, output_channels).
Parameters:
- a (Tensor) – Input tensor with shape (N, H, W, C_in) in NHWC format.
- b (Tensor) – Kernel tensor with shape (K, K, C_in, C_out) where K is the kernel size.
- stride (int, default: 1) – The stride of the convolution.
- padding (int, default: 1) – The amount of padding to apply to the input. Note that the Conv class documents a default of 0.
Returns:
- Tensor – Convolved tensor with shape (N, out_H, out_W, C_out), where out_H = (H + 2 * padding - K) // stride + 1 and out_W = (W + 2 * padding - K) // stride + 1.
Notes
- Uses im2col transformation for efficient computation
- Supports automatic differentiation through gradient computation
- Kernel must be square (K x K)
- Input and kernel channel dimensions must match
Examples:
>>> x = Tensor.randn(1, 32, 32, 3) # 1 batch, 32x32 image, 3 channels
>>> kernel = Tensor.randn(3, 3, 3, 16) # 3x3 kernel, 3 input channels, 16 output channels
>>> result = conv(x, kernel, stride=1, padding=1)
>>> result.shape # (1, 32, 32, 16)
(1, 32, 32, 16)
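The output-shape formula and the two constraints from the notes (square kernel, matching channel counts) can be checked ahead of time with a few lines of plain Python. `conv_output_shape` is a hypothetical helper written for this page, not part of tiny-pytorch.

```python
def conv_output_shape(input_shape, kernel_shape, stride=1, padding=1):
    """Shape of the conv result for NHWC input and (K, K, C_in, C_out) kernel."""
    n, h, w, c_in = input_shape
    k_h, k_w, k_c_in, c_out = kernel_shape
    if k_h != k_w:
        raise ValueError("kernel must be square (K x K)")
    if c_in != k_c_in:
        raise ValueError("input and kernel channel dimensions must match")
    out_h = (h + 2 * padding - k_h) // stride + 1
    out_w = (w + 2 * padding - k_w) // stride + 1
    return (n, out_h, out_w, c_out)


print(conv_output_shape((1, 32, 32, 3), (3, 3, 3, 16)))             # (1, 32, 32, 16)
print(conv_output_shape((1, 32, 32, 3), (3, 3, 3, 16), stride=2))   # (1, 16, 16, 16)
print(conv_output_shape((1, 32, 32, 3), (3, 3, 3, 16), padding=0))  # (1, 30, 30, 16)
```

With a 3x3 kernel, stride 1, and padding 1, the spatial size is preserved, which matches the example above.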
dilate(a, axes, dilation)
Dilate a tensor by inserting zeros between elements along specified axes.
This function inserts zeros between elements along the specified axes, effectively increasing the size of the tensor in those dimensions. This is commonly used in convolutional neural networks for dilated convolutions.
Parameters:
- a (Tensor) – Input tensor to be dilated.
- axes (tuple[int, ...]) – The axes along which to apply dilation. Each axis index must be valid for the tensor's dimensions.
- dilation (int) – The dilation factor. For each element in the original tensor, `dilation` zeros will be inserted after it along the specified axes.
Returns:
- Tensor – A dilated tensor with zeros inserted along the specified axes.
Examples:
>>> x = Tensor([[1, 2], [3, 4]])
>>> dilate(x, (0,), 1)
Tensor([[1, 2], [0, 0], [3, 4]])
>>> dilate(x, (1,), 1)
Tensor([[1, 0, 2], [3, 0, 4]])
>>> dilate(x, (0, 1), 1)
Tensor([[1, 0, 2], [0, 0, 0], [3, 0, 4]])
divide(a, b)
divide_scalar(a, scalar)
exp(a)
flip(a, axes=None)
Reverse (flip) the order of elements in a tensor along the specified axes.
Parameters:
- a (Tensor) – Input tensor to be flipped.
- axes (tuple[int, ...] or None, default: None) – Axes along which to flip the tensor. Each axis index must be valid for the tensor's dimensions. If None, flip over all axes (reverse the tensor in every dimension).
Returns:
- Tensor – A tensor with the entries reversed along the specified axes.
Raises:
- AxisError – If the number of axes is greater than the number of dimensions, or if any axis is out of bounds.
Examples:
>>> x = Tensor([[1, 2], [3, 4]])
>>> flip(x, (0,))
Tensor([[3, 4], [1, 2]])
>>> flip(x, (1,))
Tensor([[2, 1], [4, 3]])
>>> flip(x, (0, 1))
Tensor([[4, 3], [2, 1]])
>>> flip(x)
Tensor([[4, 3], [2, 1]])
log(a)
logsumexp(a, axes=None)
matmul(a, b)
mul_scalar(a, scalar)
multiply(a, b)
negate(a)
power(a, b)
power_scalar(a, scalar)
relu(a)
reshape(a, shape)
split(a, axis)
Split a tensor along an axis into a tuple of tensors.
This function splits a tensor along a specified axis into multiple tensors. Each resulting tensor has one fewer dimension than the input tensor.
Parameters:
- a (Tensor) – Input tensor to split.
- axis (int) – The axis along which to split the tensor. The axis dimension will be removed from each resulting tensor.
Returns:
- TensorTuple – A tuple of tensors, each with the specified axis dimension removed. The number of tensors in the tuple equals the size of the input tensor along the specified axis.
Examples:
>>> x = Tensor([[1, 2, 3], [4, 5, 6]])
>>> result = split(x, axis=0)
>>> len(result) # Returns 2 tensors
2
>>> result[0].shape # Each tensor has shape (3,)
(3,)
stack(arrays, axis)
Stack a sequence of tensors along a new axis.
Parameters:
- arrays (list of Tensor) – Sequence of tensors to stack. All tensors must have the same shape.
- axis (int) – The axis along which to stack. The new axis will be inserted at this position in the result tensor shape.
Returns:
- Tensor – The stacked tensor with one more dimension than the input tensors.
summation(a, axes=None)
tanh(a)
transpose(a, axes=None)
undilate(a, axes, dilation)
Undilate a tensor by removing zeros inserted by dilation along specified axes.
This function is the inverse of dilate. It removes the zeros that were inserted during dilation, effectively reducing the size of the tensor in those dimensions. This is commonly used in convolutional neural networks for dilated convolutions.
Parameters:
- a (Tensor) – Input tensor to be undilated.
- axes (tuple[int, ...]) – The axes along which to apply undilation. Each axis index must be valid for the tensor's dimensions.
- dilation (int) – The dilation factor that was used in the original dilate operation.
Returns:
- Tensor – An undilated tensor with zeros removed along the specified axes.
Examples:
>>> x = Tensor([[1, 0, 2], [0, 0, 0], [3, 0, 4]])
>>> undilate(x, (0,), 1)
Tensor([[1, 0, 2], [3, 0, 4]])
>>> undilate(x, (1,), 1)
Tensor([[1, 2], [0, 0], [3, 4]])
>>> undilate(x, (0, 1), 1)
Tensor([[1, 2], [3, 4]])