Operators

Tensor operations module for tiny-pytorch implementation.

This module provides a comprehensive collection of fundamental tensor operations that form the building blocks of the computational graph in tiny-pytorch. Each operation is implemented as a class that inherits from the TensorOp base class, with corresponding helper functions for easier usage.

The module includes element-wise operations, matrix operations, reduction operations, activation functions, and various mathematical functions commonly used in deep learning and neural network computations.

Key Features
  • Automatic differentiation support through gradient methods
  • Element-wise and scalar operations
  • Matrix operations (multiplication, transpose)
  • Reduction operations (summation, log-sum-exp)
  • Activation functions (ReLU, tanh)
  • Shape manipulation (reshape, broadcast, stack, split)
  • Convolutional operations
  • Memory-efficient operations with strided arrays

Classes:

  • TensorOp

    Base class for all tensor operations.

  • TensorTupleOp

    Base class for operations that return tensor tuples.

  • ScalarAdd

    Addition of a scalar to a tensor.

  • EWiseAdd

    Element-wise addition of two tensors.

  • ScalarMul

    Multiplication of a tensor by a scalar.

  • EWiseMul

    Element-wise multiplication of two tensors.

  • Negate

    Negation of a tensor.

  • ScalarPower

    Raising tensor elements to a scalar power.

  • EWisePower

    Element-wise power operation between two tensors.

  • ScalarDivide

    Division of a tensor by a scalar.

  • EWiseDivide

    Element-wise division of two tensors.

  • Reshape

    Reshaping a tensor to a new shape.

  • Summation

    Summing tensor elements along specified axes.

  • BroadcastTo

    Broadcasting a tensor to a larger shape.

  • Transpose

    Transposing a tensor along specified axes.

  • MatMul

    Matrix multiplication between two tensors.

  • Log

    Natural logarithm of tensor elements.

  • Exp

    Exponential of tensor elements.

  • ReLU

    Rectified Linear Unit activation function.

  • LogSumExp

    Log-sum-exp operation, commonly used in softmax computation.

  • Tanh

    Hyperbolic tangent activation function.

  • Stack

    Stack a sequence of arrays along a new axis.

  • Split

    Split a tensor along a specified axis.

  • Flip

    Reverse the order of elements along specified axes.

  • Dilate

    Insert zeros between elements along specified axes.

  • UnDilate

    Remove zeros inserted by dilation along specified axes.

  • Conv

    2D convolution operation.

Functions:

  • add_scalar

    Add a scalar to a tensor.

  • add

    Add two tensors element-wise.

  • mul_scalar

    Multiply a tensor by a scalar.

  • multiply

    Multiply two tensors element-wise.

  • negate

    Negate a tensor.

  • power_scalar

    Raise tensor elements to a scalar power.

  • power

    Element-wise power operation.

  • divide_scalar

    Divide a tensor by a scalar.

  • divide

    Element-wise division of tensors.

  • reshape

    Reshape a tensor.

  • summation

    Sum tensor elements along specified axes.

  • broadcast_to

    Broadcast tensor to a larger shape.

  • transpose

    Transpose tensor axes.

  • matmul

    Matrix multiplication.

  • log

    Natural logarithm.

  • exp

    Exponential function.

  • relu

    ReLU activation function.

  • logsumexp

    Log-sum-exp operation.

  • tanh

    Hyperbolic tangent function.

  • stack

    Stack a sequence of arrays along a new axis.

  • split

    Split a tensor along a specified axis.

  • flip

    Reverse the order of elements along specified axes.

  • dilate

    Insert zeros between elements along specified axes.

  • undilate

    Remove zeros inserted by dilation along specified axes.

  • conv

    2D convolution operation.

Notes

All operations support automatic differentiation through their gradient methods, making them suitable for building and training neural networks. The operations are designed to work efficiently with the NDArray backend system and support multiple devices (CPU, CUDA, NumPy).

Examples:

>>> import tiny_pytorch as tp
>>> x = tp.Tensor([1, 2, 3])
>>> y = tp.Tensor([4, 5, 6])
>>> z = tp.ops.add(x, y)  # Element-wise addition
>>> w = tp.ops.matmul(x, y)  # Matrix multiplication
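
The class-plus-helper pattern described above can be sketched without the library. This is an illustration only: `Op`, `MulScalar`, and `mul_scalar_sketch` are hypothetical names, plain floats stand in for tensors, and the real `TensorOp` base class presumably records a node in the computational graph rather than evaluating eagerly.

```python
class Op:
    """Hypothetical stand-in for a TensorOp-like base class."""

    def compute(self, *inputs):
        raise NotImplementedError

    def gradient(self, out_grad, *inputs):
        raise NotImplementedError

    def __call__(self, *inputs):
        # Assumption: the real library would build a graph node here;
        # this sketch simply evaluates the forward pass.
        return self.compute(*inputs)


class MulScalar(Op):
    """Forward: y = a * scalar. Backward: dy/da = scalar."""

    def __init__(self, scalar):
        self.scalar = scalar

    def compute(self, a):
        return a * self.scalar

    def gradient(self, out_grad, a):
        # Chain rule: upstream gradient times the local derivative.
        return (out_grad * self.scalar,)


def mul_scalar_sketch(a, scalar):
    """Helper function that wraps the op class for easier calling."""
    return MulScalar(scalar)(a)


y = mul_scalar_sketch(3.0, 2.0)
(grad_a,) = MulScalar(2.0).gradient(1.0, 3.0)
```

Keeping `compute` and `gradient` on the same class puts each operation's forward and backward rules in one place, while the helper function hides the two-step "construct, then call" syntax.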

BroadcastTo

Bases: TensorOp

Broadcast a tensor to a larger shape.

Parameters:

  • shape (tuple) –

    Target shape to broadcast to.

Methods:

  • compute

    Compute the broadcast operation.

  • gradient

    Compute the gradient of the operation.
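
The backward pass of a broadcast is a sum over every axis that was added or stretched. The sketch below shows that rule with NumPy; `broadcast_to_grad` is a hypothetical helper, and whether `BroadcastTo.gradient` is written this way is an assumption.

```python
import numpy as np


def broadcast_to_grad(out_grad, input_shape):
    """Reduce out_grad back to input_shape by summing the broadcast axes."""
    # Axes that broadcasting prepended on the left.
    n_leading = out_grad.ndim - len(input_shape)
    grad = out_grad.sum(axis=tuple(range(n_leading))) if n_leading else out_grad
    # Axes that had size 1 in the input and were stretched.
    stretched = tuple(
        i for i, size in enumerate(input_shape)
        if size == 1 and grad.shape[i] != 1
    )
    if stretched:
        grad = grad.sum(axis=stretched, keepdims=True)
    return grad


a = np.array([[1.0], [2.0]])                 # shape (2, 1)
b = np.broadcast_to(a, (4, 2, 3))            # forward pass
g = broadcast_to_grad(np.ones(b.shape), a.shape)
```

Each input element is copied 4 * 3 = 12 times in the forward pass, so with an all-ones upstream gradient every entry of `g` is 12.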

Conv

Bases: TensorOp

2D convolution operation between input tensor and kernel.

This operation performs 2D convolution between an input tensor and a kernel tensor. The input is expected to be in NHWC format (batch, height, width, channels) and the kernel in KKCC format (kernel_height, kernel_width, input_channels, output_channels).

Parameters:

  • stride (int, default: 1 ) –

    The stride of the convolution. Default is 1.

  • padding (int, default: 0 ) –

    The amount of padding to apply to the input. Default is 0.

Methods:

  • compute

    Compute the 2D convolution operation using im2col and matrix multiplication.

  • gradient

    Compute the gradient with respect to both input and kernel tensors.

Notes
  • Input tensor A should have shape (N, H, W, C_in)
  • Kernel tensor B should have shape (K, K, C_in, C_out) where K is the kernel size
  • Output tensor will have shape (N, out_H, out_W, C_out)
  • Uses im2col transformation for efficient computation
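
The im2col idea in the notes above can be sketched in NumPy for the forward pass only. `conv2d_im2col` is a hypothetical function, not the library's code; it gathers patches with explicit loops for clarity, whereas an implementation built on strided arrays would more likely create the patch view without copying.

```python
import numpy as np


def conv2d_im2col(x, w, stride=1, padding=0):
    """NHWC input, (K, K, C_in, C_out) kernel -> (N, out_H, out_W, C_out)."""
    n, h, wd, c_in = x.shape
    k, _, _, c_out = w.shape
    x = np.pad(x, ((0, 0), (padding, padding), (padding, padding), (0, 0)))
    out_h = (h + 2 * padding - k) // stride + 1
    out_w = (wd + 2 * padding - k) // stride + 1

    # im2col: copy every K x K x C_in patch into its own row.
    cols = np.empty((n, out_h, out_w, k, k, c_in), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            cols[:, i, j] = x[:, r:r + k, c:c + k, :]
    cols = cols.reshape(n * out_h * out_w, k * k * c_in)

    # A single matrix multiplication replaces the convolution loops.
    out = cols @ w.reshape(k * k * c_in, c_out)
    return out.reshape(n, out_h, out_w, c_out)


x = np.arange(16, dtype=np.float64).reshape(1, 4, 4, 1)
w = np.ones((3, 3, 1, 1))
y = conv2d_im2col(x, w)              # each output is the sum of a 3x3 window
y_padded = conv2d_im2col(x, w, padding=1)
```

With an all-ones kernel the result is easy to check by hand: the top-left output is 0+1+2+4+5+6+8+9+10 = 45.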

Dilate

Bases: TensorOp

Dilate a tensor by inserting zeros between elements along specified axes.

This operation inserts zeros between elements along the specified axes, effectively increasing the size of the tensor in those dimensions. This is commonly used in convolutional neural networks for dilated convolutions.

Parameters:

  • axes (tuple[int, ...]) –

    The axes along which to apply dilation. Each axis index must be valid for the tensor's dimensions.

  • dilation (int) –

    The dilation factor. For each element in the original tensor, dilation zeros will be inserted after it along the specified axes.

Methods:

  • compute

    Compute the dilation operation on the input NDArray.

  • gradient

    Compute the gradient of the dilation operation (returns undilated gradient).

Examples:

>>> x = Tensor([[1, 2], [3, 4]])
>>> Dilate((0,), 1)(x)
Tensor([[1, 2], [0, 0], [3, 4]])
>>> Dilate((1,), 1)(x)
Tensor([[1, 0, 2], [3, 0, 4]])
>>> Dilate((0, 1), 1)(x)
Tensor([[1, 0, 2], [0, 0, 0], [3, 0, 4]])

EWiseAdd

Bases: TensorOp

Element-wise addition of two tensors.

Methods:

  • compute

    Compute element-wise addition.

  • gradient

    Compute the gradient with respect to both inputs.

EWiseDivide

Bases: TensorOp

Element-wise division of two tensors.

Methods:

  • compute

    Compute element-wise division.

  • gradient

    Compute the gradient with respect to both inputs.
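
For reference, the two gradients of a / b follow from the quotient rule. The sketch below uses plain floats and a finite-difference check; `divide_grads` and `numeric_grad` are hypothetical helpers, not library functions.

```python
def divide_grads(a, b, out_grad=1.0):
    """Gradients of a / b with respect to a and b."""
    grad_a = out_grad / b
    grad_b = -out_grad * a / (b * b)
    return grad_a, grad_b


def numeric_grad(f, x, eps=1e-6):
    """Central finite difference, used only to check the formulas."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


a, b = 3.0, 4.0
grad_a, grad_b = divide_grads(a, b)
num_a = numeric_grad(lambda t: t / b, a)
num_b = numeric_grad(lambda t: a / t, b)
```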

EWiseMul

Bases: TensorOp

Element-wise multiplication of two tensors.

Methods:

  • compute

    Compute element-wise multiplication.

  • gradient

    Compute the gradient with respect to both inputs.

EWisePower

Bases: TensorOp

Element-wise power operation between two tensors.

Methods:

  • compute

    Compute element-wise power operation.

  • gradient

    Compute the gradient with respect to both inputs.
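
The two partial derivatives of a ** b differ in form: the base follows the power rule, while the exponent brings in a logarithm. A scalar sketch, with `power_grads` as a hypothetical helper:

```python
import math


def power_grads(a, b, out_grad=1.0):
    """Gradients of a ** b with respect to the base a and the exponent b."""
    grad_a = out_grad * b * a ** (b - 1)
    # The exponent gradient needs log(a), so it is only defined for a > 0.
    grad_b = out_grad * a ** b * math.log(a)
    return grad_a, grad_b


grad_a, grad_b = power_grads(2.0, 3.0)
```

How the library handles non-positive bases in the exponent gradient is not stated here; the sketch simply assumes a > 0.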

Exp

Bases: TensorOp

Exponential of tensor elements.

Methods:

  • compute

    Compute exponential.

  • gradient

    Compute the gradient of the operation.

Flip

Bases: TensorOp

Reverse (flip) the order of elements in a tensor along the specified axes.

Parameters:

  • axes (tuple[int, ...] or None, default: None ) –

    Axes along which to flip the tensor. Each axis index must be valid for the tensor's dimensions. If None, flip over all axes (reverse the tensor in every dimension).

Methods:

  • compute

    Compute the flip operation on the input NDArray.

  • gradient

    Compute the gradient of the flip operation (flip the gradient along the same axes).

Raises:

  • AxisError

    If the number of axes is greater than the number of dimensions, or if any axis is out of bounds.

Examples:

>>> x = Tensor([[1, 2], [3, 4]])
>>> Flip((0,))(x)
Tensor([[3, 4], [1, 2]])
>>> Flip((1,))(x)
Tensor([[2, 1], [4, 3]])
>>> Flip((0, 1))(x)
Tensor([[4, 3], [2, 1]])
>>> Flip()(x)
Tensor([[4, 3], [2, 1]])
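
The same behaviour can be reproduced with `numpy.flip`, which also shows why the gradient is just another flip: the operation is its own inverse. This is a NumPy illustration, not the library's NDArray code.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
flipped = np.flip(x, axis=(0, 1))

# Flipping twice along the same axes restores the input, so the backward
# pass can apply the same flip to the upstream gradient.
restored = np.flip(flipped, axis=(0, 1))

out_grad = np.array([[10, 20], [30, 40]])
in_grad = np.flip(out_grad, axis=(0, 1))
```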

Log

Bases: TensorOp

Natural logarithm of tensor elements.

Methods:

  • compute

    Compute natural logarithm.

  • gradient

    Compute the gradient of the operation.

LogSumExp

Bases: TensorOp

Log-sum-exp operation, commonly used in softmax computation.

Parameters:

  • axes (tuple or None, default: None ) –

    Axes along which to perform the operation. If None, use all axes.

Methods:

  • compute

    Compute log-sum-exp operation.

  • gradient

    Compute the gradient of the operation.
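
A common way to compute log-sum-exp without overflow is to subtract the maximum before exponentiating. The NumPy sketch below shows that trick and how the result yields a softmax; `logsumexp_sketch` is a hypothetical name, and whether `LogSumExp.compute` uses the same stabilisation is an assumption.

```python
import numpy as np


def logsumexp_sketch(z, axes=None):
    """log(sum(exp(z))) along axes, shifted by the max for stability."""
    z_max = np.max(z, axis=axes, keepdims=True)
    out = np.log(np.sum(np.exp(z - z_max), axis=axes, keepdims=True)) + z_max
    return np.squeeze(out, axis=axes)


z = np.array([[1000.0, 1000.0], [0.0, 0.0]])
row_lse = logsumexp_sketch(z, axes=(1,))

# softmax(z) = exp(z - logsumexp(z)), computed row by row.
softmax = np.exp(z - row_lse[:, None])
```

A naive `np.log(np.sum(np.exp(z)))` would overflow on the first row, since `exp(1000)` is not representable in float64.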

MatMul

Bases: TensorOp

Matrix multiplication between two tensors.

Methods:

  • compute

    Compute matrix multiplication.

  • gradient

    Compute the gradient with respect to both inputs.
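
For two-dimensional inputs, the gradients of C = A @ B are the upstream gradient multiplied by the transpose of the other operand. A NumPy sketch, with `matmul_grads` as a hypothetical helper; batched inputs would additionally need a sum over the broadcast batch axes, which is omitted here.

```python
import numpy as np


def matmul_grads(a, b, out_grad):
    """Gradients of a @ b for 2-D a (m, k) and b (k, n)."""
    grad_a = out_grad @ b.T      # shape (m, k), same as a
    grad_b = a.T @ out_grad      # shape (k, n), same as b
    return grad_a, grad_b


a = np.arange(6.0).reshape(2, 3)
b = np.arange(12.0).reshape(3, 4)
out_grad = np.ones((2, 4))
grad_a, grad_b = matmul_grads(a, b, out_grad)
```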

Negate

Bases: TensorOp

Negate a tensor element-wise.

Methods:

  • compute

    Compute the negation operation.

  • gradient

    Compute the gradient of the operation.

ReLU

Bases: TensorOp

Rectified Linear Unit activation function.

Methods:

  • compute

    Compute ReLU activation.

  • gradient

    Compute the gradient of the operation.
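
ReLU keeps positive entries and zeroes the rest, and its gradient passes the upstream value through only where the input was positive. A NumPy sketch; treating the gradient at exactly 0 as 0 is a convention assumed here, not something this page states.

```python
import numpy as np

x = np.array([-2.0, 0.0, 3.0])
y = np.maximum(x, 0.0)              # forward pass

out_grad = np.array([1.0, 1.0, 1.0])
in_grad = out_grad * (x > 0)        # backward pass: mask by positive inputs
```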

Reshape

Bases: TensorOp

Reshape a tensor to a new shape.

Parameters:

  • shape (tuple) –

    The target shape for the tensor.

Methods:

  • compute

    Compute the reshape operation.

  • gradient

    Compute the gradient of the operation.

ScalarAdd

Bases: TensorOp

Add a scalar to a tensor.

Parameters:

  • scalar (float) –

    The scalar value to add to the tensor.

Methods:

  • compute

    Compute the scalar addition operation.

  • gradient

    Compute the gradient of the operation.

ScalarDivide

Bases: TensorOp

Divide a tensor by a scalar.

Parameters:

  • scalar (float) –

    The scalar value to divide by.

Methods:

  • compute

    Compute the scalar division.

  • gradient

    Compute the gradient of the operation.

ScalarMul

Bases: TensorOp

Multiply a tensor by a scalar.

Parameters:

  • scalar (float) –

    The scalar value to multiply with the tensor.

Methods:

  • compute

    Compute the scalar multiplication.

  • gradient

    Compute the gradient of the operation.

ScalarPower

Bases: TensorOp

Raise tensor elements to a scalar power.

Parameters:

  • scalar (float) –

    The power to raise tensor elements to.

Methods:

  • compute

    Compute the power operation.

  • gradient

    Compute the gradient of the operation.

Split

Bases: TensorTupleOp

Split a tensor along an axis into a tuple of tensors.

This operation is the inverse of Stack. It splits a tensor along the specified axis into one tensor per index on that axis, each with one fewer dimension than the input tensor.

Parameters:

  • axis (int) –

    The axis along which to split the tensor. The axis dimension will be removed from each resulting tensor.

Methods:

  • compute

    Split the input array along the specified axis.

  • gradient

    Compute the gradient of the split operation (returns stack of out_grad tensors).

Stack

Bases: TensorOp

Stack a sequence of arrays along a new axis.

Parameters:

  • axis (int) –

    The axis along which to stack. The new axis will be inserted at this position in the result array shape.

Methods:

  • compute

    Stack the input arrays along the specified axis.

  • gradient

    Compute the gradient of the stack operation (returns split of out_grad along axis).
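
The inverse relationship between stacking and splitting can be seen with NumPy arrays standing in for tensors. This only illustrates the shapes involved and does not use the library's own `stack` or `split`.

```python
import numpy as np

parts = [np.array([1, 2, 3]), np.array([4, 5, 6])]

stacked = np.stack(parts, axis=0)        # new axis first: shape (2, 3)
stacked_cols = np.stack(parts, axis=1)   # new axis second: shape (3, 2)

# Splitting removes the stacked axis again, one slice per index.
pieces = [np.take(stacked, i, axis=0) for i in range(stacked.shape[0])]
```

Because splitting undoes stacking, the gradient of one is naturally expressed with the other.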

Summation

Bases: TensorOp

Sum tensor elements along specified axes.

Parameters:

  • axes (tuple or None, default: None ) –

    Axes along which to perform summation. If None, sum over all axes.

Methods:

  • compute

    Compute the summation operation.

  • gradient

    Compute the gradient of the operation.
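
Since every input element contributes to its sum with weight 1, the gradient of a summation copies the upstream gradient back along the reduced axes. A NumPy sketch with a hypothetical `summation_grad` helper:

```python
import numpy as np


def summation_grad(out_grad, input_shape, axes=None):
    """Expand out_grad back to input_shape along the summed axes."""
    if axes is None:
        axes = tuple(range(len(input_shape)))
    # Re-insert the reduced axes with size 1, then broadcast.
    keep_shape = list(input_shape)
    for ax in axes:
        keep_shape[ax] = 1
    return np.broadcast_to(out_grad.reshape(keep_shape), input_shape)


x = np.arange(6.0).reshape(2, 3)
s = x.sum(axis=1)                                   # forward pass
g = summation_grad(np.array([10.0, 20.0]), x.shape, axes=(1,))
```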

Tanh

Bases: TensorOp

Hyperbolic tangent activation function.

Methods:

  • compute

    Compute hyperbolic tangent.

  • gradient

    Compute the gradient of the operation.

Transpose

Bases: TensorOp

Transpose a tensor along specified axes.

Parameters:

  • axes (tuple or None, default: None ) –

    Permutation of the dimensions. If None, swap the last two dimensions.

Methods:

  • compute

    Compute the transpose operation.

  • gradient

    Compute the gradient of the operation.

UnDilate

Bases: TensorOp

Undilate a tensor by removing zeros inserted by dilation along specified axes.

This operation is the inverse of Dilate. It removes the zeros that were inserted during dilation, effectively reducing the size of the tensor in those dimensions. This is commonly used in convolutional neural networks for dilated convolutions.

Parameters:

  • axes (tuple[int, ...]) –

    The axes along which to apply undilation. Each axis index must be valid for the tensor's dimensions.

  • dilation (int) –

    The dilation factor that was used in the original Dilate operation.

Methods:

  • compute

    Compute the undilation operation on the input NDArray.

  • gradient

    Compute the gradient of the undilation operation (returns dilated gradient).

Examples:

>>> x = Tensor([[1, 0, 2], [0, 0, 0], [3, 0, 4]])
>>> UnDilate((0,), 1)(x)
Tensor([[1, 0, 2], [3, 0, 4]])
>>> UnDilate((1,), 1)(x)
Tensor([[1, 2], [0, 0], [3, 4]])
>>> UnDilate((0, 1), 1)(x)
Tensor([[1, 2], [3, 4]])
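
Undilation amounts to strided slicing: keep every (dilation + 1)-th element along each listed axis and leave the other axes untouched. A NumPy sketch, with `undilate_sketch` as a hypothetical helper rather than the library's implementation:

```python
import numpy as np


def undilate_sketch(x, axes, dilation):
    """Keep every (dilation + 1)-th element along the given axes."""
    index = [slice(None)] * x.ndim
    for ax in axes:
        index[ax] = slice(None, None, dilation + 1)
    return x[tuple(index)]


x = np.array([[1, 0, 2], [0, 0, 0], [3, 0, 4]])
both = undilate_sketch(x, (0, 1), 1)       # shape (2, 2)
rows_only = undilate_sketch(x, (0,), 1)    # shape (2, 3)
cols_only = undilate_sketch(x, (1,), 1)    # shape (3, 2)
```

Undilating a single axis only shrinks that axis, so the zeros along the other axis remain in place.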

add(a, b)

Add two tensors element-wise.

Parameters:

  • a (Tensor) –

    First input tensor.

  • b (Tensor) –

    Second input tensor.

Returns:

  • Tensor

    Element-wise sum of the input tensors.

add_scalar(a, scalar)

Add a scalar value to a tensor.

Parameters:

  • a (Tensor) –

    Input tensor.

  • scalar (float) –

    Scalar value to add.

Returns:

  • Tensor

    A new tensor with the scalar added to each element.

broadcast_to(a, shape)

Broadcast a tensor to a larger shape.

Parameters:

  • a (Tensor) –

    Input tensor.

  • shape (tuple) –

    Target shape to broadcast to.

Returns:

  • Tensor

    Broadcasted tensor with the specified shape.

conv(a, b, stride=1, padding=1)

Perform 2D convolution between input tensor and kernel.

This function performs 2D convolution between an input tensor and a kernel tensor. The input is expected to be in NHWC format (batch, height, width, channels) and the kernel in KKCC format (kernel_height, kernel_width, input_channels, output_channels).

Parameters:

  • a (Tensor) –

    Input tensor with shape (N, H, W, C_in) in NHWC format.

  • b (Tensor) –

    Kernel tensor with shape (K, K, C_in, C_out) where K is the kernel size.

  • stride (int, default: 1 ) –

    The stride of the convolution. Default is 1.

  • padding (int, default: 1 ) –

    The amount of padding to apply to the input. Default is 1.

Returns:

  • Tensor

    Convolved tensor with shape (N, out_H, out_W, C_out) where:

      • out_H = (H + 2 * padding - K) // stride + 1
      • out_W = (W + 2 * padding - K) // stride + 1

Notes
  • Uses im2col transformation for efficient computation
  • Supports automatic differentiation through gradient computation
  • Kernel must be square (K x K)
  • Input and kernel channel dimensions must match

Examples:

>>> x = Tensor.randn(1, 32, 32, 3)  # 1 batch, 32x32 image, 3 channels
>>> kernel = Tensor.randn(3, 3, 3, 16)  # 3x3 kernel, 3 input channels, 16 output channels
>>> result = conv(x, kernel, stride=1, padding=1)
>>> result.shape  # (1, 32, 32, 16)
(1, 32, 32, 16)
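
The output-shape formula and the constraints listed in the notes can be checked with a few lines of plain Python. `conv_output_shape` is a hypothetical helper written for this illustration; it is not part of the library.

```python
def conv_output_shape(input_shape, kernel_shape, stride=1, padding=1):
    """Output shape of an NHWC convolution with a (K, K, C_in, C_out) kernel."""
    n, h, w, c_in = input_shape
    k, k2, kernel_c_in, c_out = kernel_shape
    if k != k2:
        raise ValueError("kernel must be square (K x K)")
    if c_in != kernel_c_in:
        raise ValueError("input and kernel channel dimensions must match")
    out_h = (h + 2 * padding - k) // stride + 1
    out_w = (w + 2 * padding - k) // stride + 1
    return (n, out_h, out_w, c_out)


same = conv_output_shape((1, 32, 32, 3), (3, 3, 3, 16), stride=1, padding=1)
halved = conv_output_shape((1, 32, 32, 3), (3, 3, 3, 16), stride=2, padding=1)
```

With a 3x3 kernel, padding 1 and stride 1 preserve the spatial size, and stride 2 halves it: (32 + 2 - 3) // 2 + 1 = 16.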

dilate(a, axes, dilation)

Dilate a tensor by inserting zeros between elements along specified axes.

This function inserts zeros between elements along the specified axes, effectively increasing the size of the tensor in those dimensions. This is commonly used in convolutional neural networks for dilated convolutions.

Parameters:

  • a (Tensor) –

    Input tensor to be dilated.

  • axes (tuple[int, ...]) –

    The axes along which to apply dilation. Each axis index must be valid for the tensor's dimensions.

  • dilation (int) –

    The dilation factor. For each element in the original tensor, dilation zeros will be inserted after it along the specified axes.

Returns:

  • Tensor

    A dilated tensor with zeros inserted along the specified axes.

Examples:

>>> x = Tensor([[1, 2], [3, 4]])
>>> dilate(x, (0,), 1)
Tensor([[1, 2], [0, 0], [3, 4]])
>>> dilate(x, (1,), 1)
Tensor([[1, 0, 2], [3, 0, 4]])
>>> dilate(x, (0, 1), 1)
Tensor([[1, 0, 2], [0, 0, 0], [3, 0, 4]])

divide(a, b)

Divide two tensors element-wise.

Parameters:

  • a (Tensor) –

    Numerator tensor.

  • b (Tensor) –

    Denominator tensor.

Returns:

  • Tensor

    Element-wise division of the input tensors.

divide_scalar(a, scalar)

Divide a tensor by a scalar value.

Parameters:

  • a (Tensor) –

    Input tensor.

  • scalar (float) –

    Scalar value to divide by.

Returns:

  • Tensor

    A new tensor with each element divided by the scalar.

exp(a)

Compute the exponential of tensor elements.

Parameters:

  • a (Tensor) –

    Input tensor.

Returns:

  • Tensor

    Exponential of input tensor elements.

flip(a, axes=None)

Reverse (flip) the order of elements in a tensor along the specified axes.

Parameters:

  • a (Tensor) –

    Input tensor to be flipped.

  • axes (tuple[int, ...] or None, default: None ) –

    Axes along which to flip the tensor. Each axis index must be valid for the tensor's dimensions. If None, flip over all axes (reverse the tensor in every dimension).

Returns:

  • Tensor

    A tensor with the entries reversed along the specified axes.

Raises:

  • AxisError

    If the number of axes is greater than the number of dimensions, or if any axis is out of bounds.

Examples:

>>> x = Tensor([[1, 2], [3, 4]])
>>> flip(x, (0,))
Tensor([[3, 4], [1, 2]])
>>> flip(x, (1,))
Tensor([[2, 1], [4, 3]])
>>> flip(x, (0, 1))
Tensor([[4, 3], [2, 1]])
>>> flip(x)
Tensor([[4, 3], [2, 1]])

log(a)

Compute the natural logarithm of tensor elements.

Parameters:

  • a (Tensor) –

    Input tensor.

Returns:

  • Tensor

    Natural logarithm of input tensor elements.

logsumexp(a, axes=None)

Compute log-sum-exp along specified axes.

Parameters:

  • a (Tensor) –

    Input tensor.

  • axes (tuple or None, default: None ) –

    Axes along which to perform the operation. If None, use all axes.

Returns:

  • Tensor

    Result of log-sum-exp operation.

matmul(a, b)

Perform matrix multiplication between two tensors.

Parameters:

  • a (Tensor) –

    First input tensor.

  • b (Tensor) –

    Second input tensor.

Returns:

  • Tensor

    Result of matrix multiplication.

mul_scalar(a, scalar)

Multiply a tensor by a scalar value.

Parameters:

  • a (Tensor) –

    Input tensor.

  • scalar (float) –

    Scalar value to multiply with.

Returns:

  • Tensor

    A new tensor with each element multiplied by the scalar.

multiply(a, b)

Multiply two tensors element-wise.

Parameters:

  • a (Tensor) –

    First input tensor.

  • b (Tensor) –

    Second input tensor.

Returns:

  • Tensor

    Element-wise product of the input tensors.

negate(a)

Negate a tensor element-wise.

Parameters:

  • a (Tensor) –

    Input tensor.

Returns:

  • Tensor

    A new tensor with each element negated.

power(a, b)

Raise elements of one tensor to powers specified by another tensor.

Parameters:

  • a (Tensor) –

    Base tensor.

  • b (Tensor) –

    Exponent tensor.

Returns:

  • Tensor

    Element-wise power operation result.

power_scalar(a, scalar)

Raise tensor elements to a scalar power.

Parameters:

  • a (Tensor) –

    Input tensor.

  • scalar (float) –

    Power to raise elements to.

Returns:

  • Tensor

    A new tensor with each element raised to the given power.

relu(a)

Apply Rectified Linear Unit (ReLU) activation function.

Parameters:

  • a (Tensor) –

    Input tensor.

Returns:

  • Tensor

    Tensor with ReLU activation applied.

reshape(a, shape)

Reshape a tensor to a new shape.

Parameters:

  • a (Tensor) –

    Input tensor.

  • shape (tuple) –

    Target shape for the tensor.

Returns:

  • Tensor

    A new tensor with the specified shape.

split(a, axis)

Split a tensor along an axis into a tuple of tensors.

This function splits a tensor along the specified axis into one tensor per index on that axis. Each resulting tensor has one fewer dimension than the input tensor.

Parameters:

  • a (Tensor) –

    Input tensor to split.

  • axis (int) –

    The axis along which to split the tensor. The axis dimension will be removed from each resulting tensor.

Returns:

  • TensorTuple

    A tuple of tensors, each with the specified axis dimension removed. The number of tensors in the tuple equals the size of the input tensor along the specified axis.

Examples:

>>> x = Tensor([[1, 2, 3], [4, 5, 6]])
>>> result = split(x, axis=0)
>>> len(result)  # Returns 2 tensors
2
>>> result[0].shape  # Each tensor has shape (3,)
(3,)

stack(arrays, axis)

Stack a sequence of tensors along a new axis.

Parameters:

  • arrays (list of Tensor) –

    Sequence of tensors to stack. All tensors must have the same shape.

  • axis (int) –

    The axis along which to stack. The new axis will be inserted at this position in the result tensor shape.

Returns:

  • Tensor

    The stacked tensor with one more dimension than the input tensors.

summation(a, axes=None)

Sum tensor elements along specified axes.

Parameters:

  • a (Tensor) –

    Input tensor.

  • axes (tuple or None, default: None ) –

    Axes along which to perform summation. If None, sum over all axes.

Returns:

  • Tensor

    Sum of elements along specified axes.

tanh(a)

Compute the hyperbolic tangent of tensor elements.

Parameters:

  • a (Tensor) –

    Input tensor.

Returns:

  • Tensor

    Hyperbolic tangent of input tensor elements.

transpose(a, axes=None)

Transpose a tensor along specified axes.

Parameters:

  • a (Tensor) –

    Input tensor.

  • axes (tuple or None, default: None ) –

    Permutation of the dimensions. If None, swap the last two dimensions, matching the default of the Transpose class.

Returns:

  • Tensor

    Transposed tensor.

undilate(a, axes, dilation)

Undilate a tensor by removing zeros inserted by dilation along specified axes.

This function is the inverse of dilate. It removes the zeros that were inserted during dilation, effectively reducing the size of the tensor in those dimensions. This is commonly used in convolutional neural networks for dilated convolutions.

Parameters:

  • a (Tensor) –

    Input tensor to be undilated.

  • axes (tuple[int, ...]) –

    The axes along which to apply undilation. Each axis index must be valid for the tensor's dimensions.

  • dilation (int) –

    The dilation factor that was used in the original dilate operation.

Returns:

  • Tensor

    An undilated tensor with zeros removed along the specified axes.

Examples:

>>> x = Tensor([[1, 0, 2], [0, 0, 0], [3, 0, 4]])
>>> undilate(x, (0,), 1)
Tensor([[1, 0, 2], [3, 0, 4]])
>>> undilate(x, (1,), 1)
Tensor([[1, 2], [0, 0], [3, 4]])
>>> undilate(x, (0, 1), 1)
Tensor([[1, 2], [3, 4]])