Discovering activation functions

Introduction to Deep Learning with PyTorch

Jasmin Ludolf

Senior Data Science Content Developer, DataCamp

Activation functions

  • Activation functions add non-linearity to the network
    • Sigmoid for binary classification
    • Softmax for multi-class classification
  • A network can learn more complex relationships with non-linearity
  • "Pre-activation" output passed to the activation function

Diagram of a neural network with linear layers and an activation function


Meet the sigmoid function

Diagram of part of a neural network with input, linear layers, and activation function

  • Mammal or not?

An illustration of a lemur


Meet the sigmoid function

Diagram of part of a neural network with inputs and linear layers

  • Mammal or not?

An illustration of a lemur

  • Input:
    • Limbs: 4
    • Eggs: 0
    • Hair: 1

Meet the sigmoid function

Diagram of part of a neural network with inputs and the number 6 as the output of the linear layers

  • Mammal or not?

An illustration of a lemur

  • The output of the linear layers is 6

Meet the sigmoid function

Diagram of part of a neural network with inputs, the number 6 as the output of the linear layers, and a sigmoid activation function

  • Mammal or not?

An illustration of a lemur

  • We take the pre-activation output (6) and pass it to the sigmoid function

Meet the sigmoid function

Diagram of part of a neural network with inputs, the number 6 as the output of the linear layers, a sigmoid activation function, and an output

  • Mammal or not?

An illustration of a lemur

  • We take the pre-activation output (6) and pass it to the sigmoid function
  • We obtain a value between 0 and 1

  • If output is > 0.5, class label = 1 (mammal)

  • If output is <= 0.5, class label = 0 (not mammal)
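The thresholding step above can be sketched in a few lines, using the pre-activation value 6 from the example:

```python
import torch

pre_activation = torch.tensor([[6.0]])       # pre-activation output of the linear layers
probability = torch.sigmoid(pre_activation)  # ~ 0.9975, bounded between 0 and 1
label = (probability > 0.5).int()            # 1 = mammal, 0 = not mammal
print(probability, label)
```

Here the probability is well above 0.5, so the lemur is classified as a mammal.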

Meet the sigmoid function

import torch
import torch.nn as nn

input_tensor = torch.tensor([[6.0]]) # float input; sigmoid is not defined for integer tensors
sigmoid = nn.Sigmoid()

output = sigmoid(input_tensor)
print(output)
tensor([[0.9975]])

Activation as the last layer

model = nn.Sequential(
  nn.Linear(6, 4), # First linear layer
  nn.Linear(4, 1), # Second linear layer
  nn.Sigmoid() # Sigmoid activation function
)

A sigmoid as the last step in a network of linear layers is equivalent to traditional logistic regression
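A quick way to see this network in action is to pass a sample through it. A sketch with an arbitrary input (weights are randomly initialized, so the exact output will vary):

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(6, 4),  # first linear layer
    nn.Linear(4, 1),  # second linear layer
    nn.Sigmoid()      # sigmoid activation function
)

# One sample with 6 features (values chosen arbitrarily for illustration)
sample = torch.tensor([[4.0, 0.0, 1.0, 0.0, 1.0, 0.0]])
output = model(sample)
print(output)  # a single value strictly between 0 and 1
```

Since there is no activation between the two linear layers, they collapse into one linear map, which is why this network behaves like logistic regression.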


Getting acquainted with softmax

  • Three classes:

A lemur with three options: bird, mammal, or reptile

Getting acquainted with softmax

Diagram of part of a neural network with inputs, a vector as the output of the linear layers, a softmax activation function, and an output

  • Takes a three-dimensional vector as input and outputs a vector of the same shape
  • Outputs a probability distribution:
    • Each element is a probability (bounded between 0 and 1)
    • The sum of the output vector is equal to 1
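Under the hood, softmax exponentiates each score and divides by the sum of the exponentials. A sketch computing it by hand on three example scores:

```python
import torch

scores = torch.tensor([4.3, 6.1, 2.3])  # pre-activation scores for three classes
exp_scores = torch.exp(scores)
probs = exp_scores / exp_scores.sum()   # softmax: exp(z_i) / sum_j exp(z_j)

print(probs)        # each element lies in (0, 1)
print(probs.sum())  # sums to 1
```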

Getting acquainted with softmax

import torch
import torch.nn as nn

# Create an input tensor
input_tensor = torch.tensor(
    [[4.3, 6.1, 2.3]])

# Apply softmax along the last dimension
probabilities = nn.Softmax(dim=-1)
output_tensor = probabilities(input_tensor)
print(output_tensor)
tensor([[0.1392, 0.8420, 0.0188]])
  • dim=-1 indicates softmax is applied to the input tensor's last dimension
  • nn.Softmax() can be used as last step in nn.Sequential()
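As the bullet above notes, nn.Softmax() can close out a multi-class network. A sketch with assumed layer sizes (6 input features, 3 output classes, arbitrary sample values):

```python
import torch
import torch.nn as nn

# Hypothetical multi-class network: 6 input features, 3 output classes
model = nn.Sequential(
    nn.Linear(6, 4),
    nn.Linear(4, 3),
    nn.Softmax(dim=-1)  # converts the 3 scores into class probabilities
)

sample = torch.tensor([[4.0, 0.0, 1.0, 0.0, 1.0, 0.0]])
print(model(sample))  # three probabilities that sum to 1
```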

Let's practice!
