Element-wise operations
An element-wise operation acts on corresponding elements of two tensors, i.e., elements that occupy the same position within each tensor.
The two tensors must have the same shape in order to perform an element-wise operation on them.
Suppose we have the following two tensors (both are rank-2 tensors with a shape of 2 \(\times\) 2):
t1 = torch.tensor([
    [1, 2],
    [3, 4]
], dtype=torch.float32)

t2 = torch.tensor([
    [9, 8],
    [7, 6]
], dtype=torch.float32)
The elements of the first axis are arrays and the elements of the second axis are numbers.
# Example of the first axis
> print(t1[0])
tensor([1., 2.])

# Example of the second axis
> print(t1[0][0])
tensor(1.)
Addition is an element-wise operation.
> t1 + t2
tensor([[10., 10.],
        [10., 10.]])
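To see the same-shape requirement in action, here is a quick sketch using a hypothetical third tensor t3 with a different shape; the exact error message depends on the PyTorch version:

> t3 = torch.tensor([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
], dtype=torch.float32)

> t1 + t3
# raises a RuntimeError: the shapes (2, 2) and (3, 3) cannot be matched element-wise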
In fact, all of the arithmetic operations, add, subtract, multiply, and divide, are element-wise operations. We can also use them with scalar values; using t to refer to the 2 \(\times\) 2 tensor t1 from above, there are two ways we can do this:
1) Using these symbolic operations:
> t + 2
tensor([[3., 4.],
        [5., 6.]])

> t - 2
tensor([[-1., 0.],
        [1., 2.]])

> t * 2
tensor([[2., 4.],
        [6., 8.]])

> t / 2
tensor([[0.5000, 1.0000],
        [1.5000, 2.0000]])
2) Or equivalently, these built-in tensor methods:
> t.add(2)
tensor([[3., 4.],
        [5., 6.]])

> t.sub(2)
tensor([[-1., 0.],
        [1., 2.]])

> t.mul(2)
tensor([[2., 4.],
        [6., 8.]])

> t.div(2)
tensor([[0.5000, 1.0000],
        [1.5000, 2.0000]])
Broadcasting tensors
Broadcasting is the mechanism that allows tensors with different shapes to be used together in element-wise operations; it is what lets us add a scalar value to a higher dimensional tensor, as we did above.
We can see what the broadcasted scalar value looks like using the NumPy broadcast_to() function (with NumPy imported as np):
> np.broadcast_to(2, t.shape)
array([[2, 2],
       [2, 2]])

This means the scalar value is transformed into a rank-2 tensor just like t, and just like that, the shapes match and the element-wise rule of having the same shape is back in play.
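PyTorch provides an equivalent function as well; assuming a version that includes torch.broadcast_to(), we can do the same inspection without converting to NumPy:

> torch.broadcast_to(torch.tensor(2.0), t.shape)
tensor([[2., 2.],
        [2., 2.]])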
Trickier example of broadcasting
t1 = torch.tensor([
    [1, 1],
    [1, 1]
], dtype=torch.float32)

t2 = torch.tensor([2, 4], dtype=torch.float32)
Even though these two tensors have differing shapes, the element-wise operation is still possible, and broadcasting is what makes it possible.
> np.broadcast_to(t2.numpy(), t1.shape)
array([[2., 4.],
       [2., 4.]], dtype=float32)

> t1 + t2
tensor([[3., 5.],
        [3., 5.]])
When do we actually use broadcasting? We often need it when preprocessing our data, and especially during normalization routines.
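As a sketch of that idea (the tensor name and values here are made up for illustration), broadcasting lets us subtract a per-column mean and divide by a per-column standard deviation without writing any loops:

> data = torch.tensor([
    [1., 2.],
    [3., 4.],
    [5., 6.]
])

> mean = data.mean(dim=0)   # shape (2,), one mean per column
> std = data.std(dim=0)     # shape (2,), one std per column

> (data - mean) / std       # both (2,) tensors broadcast over the (3, 2) data
tensor([[-1., -1.],
        [ 0.,  0.],
        [ 1.,  1.]])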
Comparison operations are element-wise
For a given comparison operation between tensors, a new tensor of the same shape is returned with each element containing either a 0 or a 1.
> t = torch.tensor([
    [0, 5, 0],
    [6, 0, 7],
    [0, 8, 0]
], dtype=torch.float32)
Let's check out some of the comparison operations.
> t.eq(0)
tensor([[1, 0, 1],
        [0, 1, 0],
        [1, 0, 1]], dtype=torch.uint8)

> t.ge(0)
tensor([[1, 1, 1],
        [1, 1, 1],
        [1, 1, 1]], dtype=torch.uint8)

> t.gt(0)
tensor([[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]], dtype=torch.uint8)

> t.lt(0)
tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]], dtype=torch.uint8)

> t.le(7)
tensor([[1, 1, 1],
        [1, 1, 1],
        [1, 0, 1]], dtype=torch.uint8)
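Because the result is just another tensor of zeros and ones, it can be fed into further operations; as a small usage sketch (not from the original examples), summing the result counts how many elements satisfy the comparison:

> t.gt(0).sum()
tensor(4)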
Element-wise operations using functions
Here are some examples:
> t.abs()
tensor([[0., 5., 0.],
        [6., 0., 7.],
        [0., 8., 0.]])

> t.sqrt()
tensor([[0.0000, 2.2361, 0.0000],
        [2.4495, 0.0000, 2.6458],
        [0.0000, 2.8284, 0.0000]])

> t.neg()
tensor([[-0., -5., -0.],
        [-6., -0., -7.],
        [-0., -8., -0.]])

> t.neg().abs()
tensor([[0., 5., 0.],
        [6., 0., 7.],
        [0., 8., 0.]])