type: Post

Created date: Mar 9, 2024 09:16 AM

category: Data Science

tags: Machine Learning, Artificial Intelligence

status: Published

Language: English

summary: Exploring sigmoid and softmax functions for predicting loan defaults effectively.

slug: softmax-sigmoid

#### Difference between Sigmoid and Softmax

## A conversation on the use of activation functions in the context of predicting loan defaults

A dialogue between Alex, a Machine Learning Engineer, and Jordan, a Product Manager, discussing the use of activation functions when predicting loan defaults.

**Jordan:** Alex, I'm trying to understand how we're using neural networks for our loan default prediction model. Specifically, what's this about using different activation functions?

**Alex:** Sure, Jordan. In our neural network, an activation function decides whether a neuron should be activated or not. It's like deciding if a piece of information is relevant for the prediction.

**Jordan:** Okay, and what's the role of the Sigmoid function here?

**Alex:** The Sigmoid function is perfect when we're making a binary decision. In the context of loan defaults, it helps us decide between two classes: will default or will not default. It outputs a value between 0 and 1, which we can interpret as a probability.
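A minimal sketch of what Alex describes. The logit value and the 0.5 cutoff below are illustrative, not taken from any real model:

```python
import numpy as np

def sigmoid(x):
    """Squash a raw model score (logit) into the (0, 1) range."""
    return 1 / (1 + np.exp(-x))

# Hypothetical raw score for one loan application
logit = 1.3
p_default = sigmoid(logit)
print(f"P(default) = {p_default:.3f}")

# A common (but tunable) decision rule: flag if the probability exceeds 0.5
will_default = p_default > 0.5
print("Flag as likely default:", will_default)
```

The output sits strictly between 0 and 1 regardless of how extreme the logit is, which is what lets us read it as a probability.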

**Jordan:** Got it. And the Softmax function?

**Alex:** Softmax is used when we have more than two classes. Although for loan defaults we generally have a yes or no decision, if we had multiple levels of risk we wanted to classify, like 'low', 'medium', or 'high', Softmax would be suitable as it gives a probability distribution across those classes.
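For instance, a three-way risk classifier might look like this. The logits here are made-up example scores, not outputs of a trained model:

```python
import numpy as np

def softmax(vector):
    """Turn a vector of scores into a probability distribution."""
    e = np.exp(vector)
    return e / e.sum()

# Hypothetical raw scores for one application across three risk tiers
risk_logits = np.array([2.0, 1.0, 0.1])
risk_probs = softmax(risk_logits)

for label, p in zip(['low', 'medium', 'high'], risk_probs):
    print(f"P(risk = {label}) = {p:.3f}")

print("Probabilities sum to:", risk_probs.sum())
```

Note that the probabilities always sum to 1, so raising the score of one class necessarily lowers the share of the others.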

**Jordan:** That makes sense. But what do you mean by using them in different layers?

**Alex:** In a neural network, we have an input layer, hidden layers, and an output layer. The Sigmoid can be used in the final layer for binary outcomes like our case, while Softmax is typically used in the final layer for multi-class problems. However, we can also use them in hidden layers to help model complex relationships.
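As a rough sketch of where each function can sit in the stack, here is a tiny forward pass with sigmoid in a hidden layer and softmax at the output. The weights are random and the feature names are invented; a real model would learn the weights from loan data:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def softmax(v):
    e = np.exp(v - v.max())  # subtract the max for numerical stability
    return e / e.sum()

# Toy applicant features, e.g. [income, debt ratio, credit history length]
features = np.array([0.5, 0.8, 0.3])

# Hidden layer (4 units) with sigmoid activation
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
hidden = sigmoid(W1 @ features + b1)

# Output layer: 3 risk classes with softmax
W2, b2 = rng.normal(size=(3, 4)), np.zeros(3)
class_probs = softmax(W2 @ hidden + b2)
print("Risk class probabilities:", class_probs)
```

The hidden sigmoid squashes intermediate signals, and the final softmax turns the three output scores into one probability distribution over risk classes.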

**Jordan:** So in hidden layers, they help in understanding the complex patterns regarding who might default on a loan?

**Alex:** Exactly! They determine what information is passed forward through the network, contributing to our final prediction.

**Jordan:** Makes sense now. The activation function is crucial in shaping the output at each layer, whether it's recognizing simple patterns or making the final prediction in our loan default scenario.

**Alex:** Precisely. Each function plays a significant role in our model's ability to learn from the data and make accurate predictions.

## A dialogue between Taylor, a Data Scientist, and Casey, a Data Analyst, discussing the code

Code from `Softmax/SoftmaxActivation.py` in the AIMLModeling/Softmax repository on GitHub:

```python
from numpy import exp
import numpy as np
import matplotlib.pyplot as plt

# calculate the softmax of a vector
def softmax(vector):
    e = exp(vector)
    return e / e.sum()

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# define data
data = [-1.5, 2.2, -0.8, 3.6]
print(f"Input vector: {data}")

# convert the list of numbers to a list of probabilities
result_softmax = softmax(data)
print(f"softmax result: {result_softmax}")

sum_softmax = 0.0
for i in range(len(result_softmax)):
    sum_softmax = sum_softmax + result_softmax[i]
print(f"Sum of all the elements of softmax results: {sum_softmax}")
print("")

sig_result = [0] * len(data)
sum_sigmoid = 0
for i in range(len(data)):
    sig_result[i] = sigmoid(data[i])
    print(f"Sigmoid result {i}: {sig_result[i]}")
    sum_sigmoid = sum_sigmoid + sig_result[i]
print(f"Sum of all the elements of Sigmoid results: {sum_sigmoid}")

x = np.linspace(-10, 10, 100)
y = softmax(x)
plt.scatter(x, y)
plt.title('Softmax Function')
plt.show()
```

**Casey:** Hey Taylor, I came across this code snippet that uses softmax and sigmoid functions, and I'm having trouble understanding it. Can you walk me through it?

**Taylor:** Of course, Casey! Let's start with the basics. Both softmax and sigmoid are activation functions in neural networks, which you already know. This code defines these functions and then applies them to a data vector.

**Casey:** Okay, I see two functions defined here, **softmax** and **sigmoid**. What's the difference between them?

**Taylor:** The **softmax** function converts a vector of numbers into a vector of probabilities, where the probability of each value is proportional to the exponential of the input number. On the other hand, the **sigmoid** function gives us a probability between 0 and 1 for an individual value.

**Casey:** I see, so **softmax** is about the whole vector, and **sigmoid** is for individual values. Why do we need to convert numbers into probabilities?

**Taylor:** In the context of machine learning, probabilities help us make decisions. For instance, if we're trying to classify data into categories, probabilities give us a measure of confidence about our classifications.
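Concretely, once we have probabilities, a decision can be as simple as taking the most likely class. The probabilities and risk labels below are hypothetical:

```python
import numpy as np

# Hypothetical softmax output for one application over three risk tiers
probs = np.array([0.10, 0.65, 0.25])
labels = ['low risk', 'medium risk', 'high risk']

# Pick the class with the highest probability
prediction = labels[int(np.argmax(probs))]
confidence = probs.max()
print(f"Prediction: {prediction} (confidence {confidence:.0%})")
```

The probability itself is the confidence measure: a 0.65 prediction warrants more caution downstream than a 0.99 one, even though both pick the same class.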

**Casey:** Got it. Now, the code has a data vector. What does it represent?

**Taylor:** It's just an example data vector to demonstrate the functions. Think of it as raw scores or logits that you might get from the output layer of a neural network before activation.

**Casey:** Makes sense. And then we apply **softmax** to this vector, right?

**Taylor:** Yes, we pass the data vector through the **softmax** function, which normalizes these values into probabilities that sum to 1, making it a proper probability distribution.

**Casey:** The code prints the result and the sum of the softmax results. Why is the sum important?

**Taylor:** It's to show that **softmax** has done its job correctly. The sum of the probabilities should be 1, which confirms that we have a valid probability distribution.

**Casey:** Okay, and the **sigmoid** function is applied in a loop. Why is that?

**Taylor:** The **sigmoid** function is meant for individual numbers. The loop applies **sigmoid** to each number in the data vector separately, giving us a list of probabilities.

**Casey:** So we end up with two different lists of probabilities, one from **softmax** and another from **sigmoid**?

**Taylor:** Exactly. **softmax** gives a distribution across our vector, useful for multi-class classification. **sigmoid** gives individual probabilities, useful for binary classification.

**Casey:** What about the scatter plot at the end?

**Taylor:** That's a visual representation of the softmax function. It plots the softmax probabilities for a range of values from -10 to 10. It's useful to see how the function behaves across different inputs.
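The binary-versus-multi-class contrast Taylor describes is also easy to see numerically: elementwise sigmoid gives per-value probabilities that need not sum to 1, while softmax always does. A quick check on the same example vector:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def softmax(v):
    e = np.exp(v)
    return e / e.sum()

data = np.array([-1.5, 2.2, -0.8, 3.6])
print("sigmoid sum:", sigmoid(data).sum())  # generally not equal to 1
print("softmax sum:", softmax(data).sum())  # always 1
```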

**Casey:** Now it's clearer. We're using these functions to understand the probabilities of different outcomes, and the plot shows how **softmax** assigns probabilities.

**Taylor:** You've got it, Casey! And remember, understanding the output of these functions is key to predicting outcomes like whether a loan will default, based on the learned patterns.

**Casey:** Thanks, Taylor. This was really helpful!


## A similar conversation between Taylor and Casey discussing how to apply the **softmax** and **sigmoid** functions in the context of loan default predictions

## Refined Code

```python
import numpy as np
import matplotlib.pyplot as plt

# Sigmoid function for binary classification
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Define logits for loan default probabilities
# A real-world model would output these logits based on application data
loan_logits = np.array([0.8, -1.2, 3.0])  # Example logits from our model

# Calculate the probability of default using sigmoid
default_probabilities = sigmoid(loan_logits)

# Print probabilities of default
print(f"Loan default probabilities: {default_probabilities}")

# Plotting the sigmoid function
x = np.linspace(-10, 10, 100)
y = sigmoid(x)
plt.plot(x, y)
plt.title('Sigmoid Function')
plt.xlabel('Logits')
plt.ylabel('Default Probability')
plt.show()
```

**Taylor:** Here, we have a list of logits, **loan_logits**, which our neural network has determined based on loan application data. The **sigmoid** function is then used to calculate the probability of default for each application.

**Casey:** I understand now. We're not using **softmax** here because we don't have multiple categories, right?

**Taylor:** Exactly. We're only predicting whether someone will default or not, which is a binary outcome. If we were assigning applications to different risk categories, that's when **softmax** would come into play.

**Casey:** And the plot shows us how the **sigmoid** function translates logits into probabilities?

**Taylor:** Correct. It's a visual way to understand how changes in logits affect the probability of default.

**Casey:** Thanks, that makes it clear how we apply these functions to loan defaults.
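Extending the refined code, a downstream decision step might simply threshold those probabilities. The 0.5 cutoff below is a placeholder for illustration; a real lender would tune it against business costs:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

loan_logits = np.array([0.8, -1.2, 3.0])  # same example logits as above
default_probabilities = sigmoid(loan_logits)

threshold = 0.5  # illustrative cutoff, not a recommendation
for i, p in enumerate(default_probabilities):
    decision = "flag for review" if p > threshold else "approve"
    print(f"Application {i}: P(default) = {p:.2f} -> {decision}")
```

Moving the threshold trades false positives for false negatives, which is a product decision as much as a modeling one.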

#### Reference:

**Author:** Jason Siu

**URL:** https://jason-siu.com/article/softmax-sigmoid

**Copyright:** All articles in this blog, except for special statements, adopt the BY-NC-SA agreement. Please indicate the source!
