Fundamentals of Deep Learning
Introduction to Deep Learning
Deep learning is a subset of machine learning in artificial intelligence (AI) that deals with artificial neural networks: algorithms inspired by the biological structure and functioning of the human brain. It learns from large amounts of data to extract meaningful insights for decision-making.
Deep learning is able to leverage surplus data more effectively for improved performance: unlike many traditional machine learning algorithms, whose performance tends to plateau, a deep learning model typically keeps improving as the size of the data grows.
Deep learning models are designed using neural network architectures, which enable learning by performing a task repeatedly and improving the outcome each time. A neural network is a hierarchical collection of neurons, similar to how the nervous system in the human body works. Each neuron is connected to other neurons and transmits information, or signals, to them.
A deep neural network consists of three types of layers:
- Input Layer
- Hidden Layer
- Output Layer
As shown above, the input layer takes the input data provided by the user; that data is consumed by the neurons in the first hidden layer, which perform various computations on it; and the output layer then produces the final output.
Each layer has one or more neurons, and each neuron computes some function (such as an activation function). The connection between two neurons carries a weight, which defines the impact of the input on the next neuron and, ultimately, on the final output produced by the output layer. In a neural network, the weights are initialized randomly at the start of training and are then updated, or learned, iteratively until the model predicts the correct output.
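Dense Layer
A dense (fully connected) layer connects every neuron in it to every neuron in the previous layer. It is the most common building block of a Keras model, and its full signature is shown below: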
tf.keras.layers.Dense(
units,
activation=None,
use_bias=True,
kernel_initializer="glorot_uniform",
bias_initializer="zeros",
kernel_regularizer=None,
bias_regularizer=None,
activity_regularizer=None,
kernel_constraint=None,
bias_constraint=None,
**kwargs
)
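In practice, most of these arguments keep their defaults. As a minimal sketch, a dense layer with 64 units and ReLU activation (the unit count is just an illustrative choice) looks like this:
import tensorflow as tf
from tensorflow.keras.layers import Dense

# A fully connected layer with 64 output units and ReLU activation;
# Keras infers the input shape from the previous layer.
layer = Dense(64, activation="relu")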
Dropout Layer
A dropout layer randomly drops out some neurons in a layer during training, which prevents all the neurons from synchronously co-adapting their weights. It helps reduce overfitting by introducing regularization and generalization capabilities into the model.
tf.keras.layers.Dropout(rate,
noise_shape=None,
seed=None,
**kwargs)
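For example, a dropout layer that drops 20% of its inputs (0.2 is just a typical illustrative rate; common values range from 0.1 to 0.5):
from tensorflow.keras.layers import Dropout

# Randomly sets 20% of the input units to zero during training;
# dropout is inactive at inference time.
layer = Dropout(rate=0.2)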
Loss Function
The loss function is an important concept in deep learning: it measures the prediction error of a neural network and helps the network understand whether learning is going in the right direction. Commonly used loss functions include the following (a short example follows the list):
- Mean squared error
- Mean absolute error
- Binary cross-entropy
- Categorical cross-entropy
- Sparse categorical cross-entropy
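As a quick illustration, here is binary cross-entropy computed on a few dummy labels and predicted probabilities (the values are made up for demonstration):
import tensorflow as tf

# Dummy ground-truth labels and predicted probabilities
y_true = [0.0, 1.0, 1.0, 0.0]
y_pred = [0.1, 0.9, 0.8, 0.3]

bce = tf.keras.losses.BinaryCrossentropy()
print(bce(y_true, y_pred).numpy())  # average loss over the four samples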
Optimizers
An optimizer is a mathematical algorithm that determines how much the network's weights should change in response to the loss function. It helps reduce the loss and reach good results faster.
Some popular optimizers are listed here (a short example follows the list):
- Adam (Adaptive Moment Estimation)
- SGD (Stochastic Gradient Descent)
- RMSprop (Root Mean Square Propagation)
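In Keras, an optimizer can be passed to compile() either by name, as in the training example below, or as a configured instance (the learning rate of 0.001 here is just an illustrative choice):
from tensorflow.keras.optimizers import Adam

# Configure Adam explicitly instead of using the string name 'Adam'
optimizer = Adam(learning_rate=0.001)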
Metrics
Metrics are functions used to judge the performance of the model. Unlike the loss, the results of evaluating a metric are not used to train the model during optimization. We can also define custom functions as model metrics, as sketched below.
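As a minimal sketch, a custom metric is any function that takes the true labels and the predictions and returns a score; the custom_binary_accuracy name below is hypothetical:
import tensorflow as tf

# Hypothetical custom metric: the fraction of predictions that fall
# on the correct side of the 0.5 threshold
def custom_binary_accuracy(y_true, y_pred):
    correct = tf.equal(y_true, tf.round(y_pred))
    return tf.reduce_mean(tf.cast(correct, tf.float32))

# Pass it alongside built-in metrics when compiling:
# model.compile(optimizer='Adam', loss='binary_crossentropy',
#               metrics=['accuracy', custom_binary_accuracy])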
Model Training
Once the model is configured, we are ready to train it with the training data, along with validation data that lets us evaluate after each epoch whether the model is performing as desired.
Example:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# Create a dummy dataset
# Training data
np.random.seed(1000)
X_train = np.random.random((10000, 10))
y_train = np.random.randint(2, size=(10000, 1))

# Validation data
X_val = np.random.random((2500, 10))
y_val = np.random.randint(2, size=(2500, 1))

# Test data
X_test = np.random.random((2500, 10))
y_test = np.random.randint(2, size=(2500, 1))

# Define the model architecture
model = Sequential()
model.add(Dense(64, input_dim=10, activation="relu"))
model.add(Dense(32, activation="relu"))
model.add(Dense(16, activation="relu"))
model.add(Dense(8, activation="relu"))
model.add(Dense(4, activation="relu"))
model.add(Dense(1, activation="sigmoid"))

# Compile the model
model.compile(optimizer='Adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(X_train,
          y_train,
          epochs=2,
          validation_data=(X_val, y_val))
Out[]:
Epoch 1/2
157/157 [=============] - 1s 5ms/step -
loss: 0.6934 - accuracy: 0.4964 -
val_loss: 0.6934 - val_accuracy: 0.4904
Epoch 2/2
157/157 [=============] - 1s 3ms/step -
loss: 0.6933 - accuracy: 0.5044 -
val_loss: 0.6933 - val_accuracy: 0.5044
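The dummy test set created above is never touched during training. Assuming the model from this listing, it can be evaluated on that held-out data as a final step (the printed numbers will of course vary):
# Evaluate the trained model on the held-out test set
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print("test loss:", test_loss, "test accuracy:", test_accuracy)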
Conclusion
In summary, I hope you now understand the fundamentals of deep learning. It becomes really easy once you also practice it hands-on. If you want to explore more, please check my blog site, Techy Scientists, and my GitHub.