Sunday, 5 December 2021

Fundamentals of Deep Learning | Basic Concepts


    In this blog, we will discuss the fundamentals of deep learning, starting with a brief introduction and then looking at the basic components of deep learning.


Introduction to Deep Learning


    Deep learning is a subset of machine learning in artificial intelligence (AI) that deals with artificial neural networks: algorithms inspired by the biological structure and functioning of the human brain, designed to give machines intelligence. A deep learning model learns from a large amount of data to bring out meaningful insights for decision-making.





    Deep learning is able to leverage surplus data more effectively than traditional machine learning: as the amount of training data grows, deep learning model performance tends to keep improving where classical algorithms plateau.




    Deep learning models are designed using neural network architectures, which enable learning by performing tasks repeatedly and improving the outcome each time. A neural network is a hierarchical structure of neurons, similar to how the nervous system in the human body works. Each neuron is connected to other neurons and transmits information or signals to them.





A deep neural network consists of three types of layers:

  • Input Layer
  • Hidden Layer
  • Output Layer


    As described above, the input layer takes the input data provided by the user; that data is consumed by the neurons in the first hidden layer, which perform various computations on it, and the result is finally produced by the output layer.





    Each layer has one or more neurons, and each neuron computes a function (such as an activation function) on its inputs. The connection between two neurons carries a weight, which defines the impact of that input on the next neuron and, ultimately, on the final output produced by the output layer. In a neural network, the weights are initialized randomly before model training and are then updated (learned) iteratively so that the network predicts the correct output.
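    To make this concrete, here is a minimal sketch (in plain NumPy, with made-up values) of what a single neuron computes: a weighted sum of its inputs plus a bias, passed through an activation function.

import numpy as np

# illustrative inputs from the previous layer and the weights on each connection
inputs = np.array([0.5, 0.3, 0.2])
weights = np.array([0.4, 0.7, 0.2])
bias = 0.1

# the neuron combines its weighted inputs, then applies an activation (ReLU here)
combined = np.dot(inputs, weights) + bias
output = max(0.0, combined)
print(output)  # ~0.55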



Basic Components of Deep Learning

Activation Function

    An activation function is a function that takes the combined input (as shown in the previous sketch), applies a function to it, and passes on the output value; it decides whether the neuron should be activated or not.


    

    There are many types of activation functions available in deep learning. The most commonly used are the sigmoid function, ReLU (rectified linear unit), the softmax function, and the tanh function.
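    As a rough sketch (in plain NumPy, not the exact Keras implementations), these four functions can be written as follows:

import numpy as np

def sigmoid(x):
    # squashes values into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # keeps positive values, zeroes out negatives
    return np.maximum(0.0, x)

def tanh(x):
    # squashes values into the range (-1, 1)
    return np.tanh(x)

def softmax(x):
    # turns a vector of scores into probabilities that sum to 1
    e = np.exp(x - np.max(x))
    return e / e.sum()

x = np.array([-2.0, 0.0, 3.0])
print(sigmoid(x), relu(x), tanh(x), softmax(x))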






Core Layers

    There are some important layers in a deep neural network (DNN) that we will use in most use cases.

Dense Layer

    A dense layer, also referred to as a fully connected layer, is a regular DNN layer in which every neuron is connected to every neuron in the previous layer. Its Keras signature is:
tf.keras.layers.Dense(
    units,
    activation=None,
    use_bias=True,
    kernel_initializer="glorot_uniform",
    bias_initializer="zeros",
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    **kwargs
)
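For example, a dense layer with 64 units and a ReLU activation (the values here are only illustrative) can be created like this:

from tensorflow.keras.layers import Dense

# a fully connected layer with 64 output units and ReLU activation
layer = Dense(units=64, activation="relu")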


Dropout Layer


    A dropout layer prevents all neurons in a layer from synchronously optimizing their weights. It helps reduce overfitting by introducing regularization into the model and improving its generalization capability. It does this by randomly dropping some neurons out of a layer during training. Its Keras signature is:

tf.keras.layers.Dropout(
    rate,
    noise_shape=None,
    seed=None,
    **kwargs
)
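For instance, a dropout layer that randomly drops 20% of its inputs during each training step (the rate is illustrative) looks like this:

from tensorflow.keras.layers import Dropout

# randomly sets 20% of the inputs to zero during each training step
layer = Dropout(rate=0.2)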


Loss Function


The loss function is an important concept of deep learning. It is nothing but the prediction error of a neural network, and it helps the network understand whether its learning is going in the right direction.


Some of the popular loss functions are:
  • Mean squared error
  • Mean absolute error
  • Binary cross-entropy
  • Categorical cross-entropy
  • Sparse categorical cross-entropy
and so on.
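As a minimal sketch (with made-up labels and predictions), a Keras loss class can be called directly to see the loss value it produces; here with binary cross-entropy:

import tensorflow as tf

y_true = [0.0, 1.0, 1.0, 0.0]  # illustrative true labels
y_pred = [0.1, 0.8, 0.6, 0.3]  # illustrative predicted probabilities

# binary cross-entropy: lower means the predictions match the labels better
bce = tf.keras.losses.BinaryCrossentropy()
print(bce(y_true, y_pred).numpy())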

Optimizers 


An optimizer is a mathematical algorithm that determines how to change the network's weights in order to reduce the value of the loss function. It helps to reduce losses and get results faster.


Some of the popular optimizers are:

  • Adam (Adaptive Moment Estimation)
  • SGD (Stochastic Gradient Descent)
  • RMSprop (Root Mean Square Propagation)
and so on.
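As a small sketch, an optimizer can also be created explicitly (0.001 is Keras's default learning rate for Adam) and passed to the model when compiling:

from tensorflow.keras.optimizers import Adam

# Adam optimizer with an explicit learning rate
optimizer = Adam(learning_rate=0.001)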

Metrics


    A metric can be understood as a function used to judge the performance of the model. Unlike the loss, the results of evaluating a metric are not used when training the model for optimization. We can also define custom functions to use as model metrics.
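    As a minimal sketch of that last point (the metric name and formula here are hypothetical, chosen only for illustration), a custom metric is just a function of y_true and y_pred that can be passed to compile():

import tensorflow as tf

# hypothetical custom metric: mean absolute error expressed as a percentage
def mae_percent(y_true, y_pred):
    return 100.0 * tf.reduce_mean(tf.abs(y_true - y_pred))

# it can be used alongside built-in metric names, e.g.:
# model.compile(optimizer="adam", loss="mse", metrics=["accuracy", mae_percent])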


Model Training


    Once we configure a model, we are ready to train it with the training data, using validation data to evaluate whether the model is performing as desired after each epoch.



Example:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Create dummy dataset  

# training dataset
np.random.seed(1000)
X_train = np.random.random((10000,10))
y_train = np.random.randint(2, size=(10000, 1))

# validation dataset
X_val = np.random.random((2500,10))
y_val = np.random.randint(2, size=(2500, 1))

# test dataset
X_test = np.random.random((2500,10))
y_test = np.random.randint(2, size=(2500, 1))

# Define the model architecture
model = Sequential()
model.add(Dense(64, input_dim=10, activation="relu"))
model.add(Dense(32, activation="relu"))
model.add(Dense(16, activation="relu"))
model.add(Dense(8, activation="relu"))
model.add(Dense(4, activation="relu"))
model.add(Dense(1, activation="sigmoid"))

# Compile the model
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])

# Train the model
# batch_size=64 gives the 157 steps per epoch shown in the output below
model.fit(X_train,
          y_train,
          batch_size=64,
          epochs=2,
          validation_data=(X_val, y_val))
Out[]:
Epoch 1/2
157/157 [=============] - 1s 5ms/step - 
loss: 0.6934 - accuracy: 0.4964 - 
val_loss: 0.6934 - val_accuracy: 0.4904
Epoch 2/2
157/157 [=============] - 1s 3ms/step - 
loss: 0.6933 - accuracy: 0.5044 - 
val_loss: 0.6933 - val_accuracy: 0.5044
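Note that the test dataset created above was not used during training; a typical next step (continuing the same example) is to evaluate the trained model on it:

# evaluate the trained model on the held-out test data
test_loss, test_acc = model.evaluate(X_test, y_test)
print(test_loss, test_acc)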



Conclusion


    In summary, I hope you now understand the fundamentals of deep learning. It's really easy once you understand it by doing it practically as well. If you want to explore more, please check my blog site, Techy Scientists, and my GitHub.



