What is a Neural Network?
A Neural Network is a multi-layer parametric function with learned parameters.
This is the definition given by Professor Sergey Levine at 25:50 in the second part of Lecture 1.
In a Neural Network, the parameters of every layer are usually (though not always) trained with respect to the overall task objective (e.g., accuracy, loss, or cumulative reward).
This is also called end-to-end learning.
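A minimal end-to-end training sketch in plain NumPy, assuming illustrative layer sizes, a squared-error objective on one toy example, and a hand-rolled backward pass (none of these choices come from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-layer parametric function f(x) = W2 @ relu(W1 @ x + b1) + b2.
# All sizes and the learning rate are illustrative choices.
W1, b1 = rng.normal(size=(4, 3)) * 0.1, np.zeros(4)
W2, b2 = rng.normal(size=(1, 4)) * 0.1, np.zeros(1)

def forward(x):
    h = np.maximum(0.0, W1 @ x + b1)   # layer 1: affine + ReLU
    return h, W2 @ h + b2              # layer 2: affine

# End-to-end: every layer's parameters receive a gradient from the
# single overall objective (here, squared error on one toy input).
x, target = rng.normal(size=3), np.array([1.0])
for _ in range(200):
    h, y = forward(x)
    dy = 2.0 * (y - target)            # d loss / d y
    dW2, db2 = np.outer(dy, h), dy
    dh = (W2.T @ dy) * (h > 0)         # backprop through the ReLU
    dW1, db1 = np.outer(dh, x), dh
    for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        p -= 0.1 * g                   # gradient step on every layer
```

After training, `forward(x)[1]` sits very close to the target; the point is that no layer has a hand-designed role — every parameter chases the one objective.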
Exceptions to this include freezing the parameters of certain layers and training only part of the network.
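A minimal sketch of that exception, assuming a made-up two-layer model in which the first layer is treated as pretrained and frozen while only the second layer is trained:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative sizes; pretend W1 came from a pretrained model.
W1 = rng.normal(size=(4, 3))            # frozen: never updated below
W2 = rng.normal(size=(1, 4)) * 0.1      # trainable head

x, target = np.array([1.0, -1.0, 0.5]), np.array([2.0])
for _ in range(300):
    h = np.tanh(W1 @ x)                 # frozen feature extractor
    y = W2 @ h
    dy = 2.0 * (y - target)             # gradient of squared error
    W2 -= 0.05 * np.outer(dy, h)        # only the head gets updates
```

Only `W2` ever changes; the frozen layer just supplies fixed features, so this is no longer fully end-to-end.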
Neural Networks can acquire representations by using high-capacity models and a lot of data without requiring manual engineering of features or representations.
Model capacity here means how many different functions a particular model class can represent.
This means that we do not need to know what good features are, and we expect the model to figure it out from data.
When representations are learned in an end-to-end fashion, they are better tailored to the current task.
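As a toy illustration of the capacity point above (my example, not from the lecture): no linear model can represent XOR, however it is fit, while a one-hidden-layer network with hand-picked weights represents it exactly:

```python
import numpy as np

# XOR truth table.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)

# Best least-squares linear fit (with bias) still cannot separate XOR:
# it predicts ~0.5 on every input.
A = np.hstack([X, np.ones((4, 1))])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
linear_pred = A @ w

# A hand-picked 2-unit hidden layer represents XOR exactly:
# h1 = relu(x1 + x2), h2 = relu(x1 + x2 - 1), out = h1 - 2*h2.
H = np.maximum(0, X @ np.array([[1, 1], [1, 1]]).T + np.array([0, -1]))
mlp_pred = H @ np.array([1, -2])
```

The linear model class simply does not contain any function that matches the data; the slightly larger class does.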
Pros and cons of Neural Networks
They need to be huge and require large amounts of data and compute.
As we add more layers, data, and compute, they become more and more powerful.
But they do plateau.
Learning (nurture) and Inductive Bias (nature)
Learning-dominant models are those that get most of their performance from data rather than from a designer's insight.
Scalability
Scalability is the property that performance keeps improving as we add more data, representational capacity, and compute.
Why do we call them Neural Networks?
Neural Networks were proposed as a rudimentary model of neurons in the brain.
In our brain, dendrites receive signals from other neurons.
The neuron decides whether to fire based on incoming signals.
The axon transmits the signal further to downstream neurons.
An artificial neuron sums up signals from upstream neurons (units).
The unit decides how much to fire based on incoming signals.
Its activation is transmitted to downstream units.
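The lines above describe a single artificial unit; a minimal sketch, with the weights, inputs, and sigmoid activation all chosen arbitrarily for illustration:

```python
import numpy as np

# A single artificial unit: sum the weighted signals from upstream
# units, then pass the sum through an activation to decide how much
# to "fire".
def unit(inputs, weights, bias):
    z = np.dot(weights, inputs) + bias   # summed incoming signal
    return 1.0 / (1.0 + np.exp(-z))      # sigmoid activation in (0, 1)

# Example: three upstream signals with arbitrary weights.
a = unit(np.array([0.5, -1.0, 2.0]), np.array([1.0, 0.5, 0.25]), bias=0.0)
```

Unlike the biological neuron's all-or-nothing spike, the artificial unit outputs a graded activation that is passed downstream.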
The function that represents the transformation from input to internal representation to output is usually a deep Neural Network.