Instructions:
- Answer all 3 questions.
- Interpret the questions logically, show your steps and write down your assumption(s) when necessary.
- Please submit your answer before the due date.
- Late Submission Policy
- A 3-hour "grace period" is given.
- 10% is deducted for every 3 hours late.
- Plagiarism Policy
- Both the giver and the receiver are subject to the same penalty below.
- All students involved will not only receive 0 marks for this assessment but will also have an additional 50% penalty applied, e.g., 5 marks for a 10-mark assessment.
- In our lecture notes, we have already seen the derivative of a sigmoid function with respect to its weight parameters, as follows.
we have
Let us now consider the softmax function with the function form
.
Derive the derivative of the softmax function with respect to . You MUST use the symbols above, i.e., , , etc., to present your answer.
Hint:
Using the quotient rule and letting and , we have
.
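A derived expression can be sanity-checked numerically before submission. The sketch below is only an illustration under stated assumptions: the symbols `z` (pre-activation inputs) and `s` (softmax outputs) are placeholder names, not the symbols from the lecture notes, and the derivative is taken with respect to the softmax inputs rather than the weights. Extending it to the weights would need the chain rule. It compares the standard result, ds_i/dz_j = s_i (delta_ij - s_j), against a central finite-difference estimate.

```python
import math

def softmax(z):
    """Numerically stable softmax of a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_jacobian(z):
    """Analytic Jacobian: J[i][j] = s_i * (delta_ij - s_j)."""
    s = softmax(z)
    n = len(s)
    return [[s[i] * ((1.0 if i == j else 0.0) - s[j]) for j in range(n)]
            for i in range(n)]

def numeric_jacobian(z, eps=1e-6):
    """Central finite-difference approximation of the same Jacobian."""
    n = len(z)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        z_plus = list(z)
        z_minus = list(z)
        z_plus[j] += eps
        z_minus[j] -= eps
        s_plus, s_minus = softmax(z_plus), softmax(z_minus)
        for i in range(n):
            jac[i][j] = (s_plus[i] - s_minus[i]) / (2 * eps)
    return jac

# Hypothetical test point (assumption: any real-valued vector works).
z = [1.0, 2.0, 0.5]
analytic = softmax_jacobian(z)
numeric = numeric_jacobian(z)
max_err = max(abs(analytic[i][j] - numeric[i][j])
              for i in range(len(z)) for j in range(len(z)))
print("largest analytic-vs-numeric difference:", max_err)
```

If the two Jacobians disagree by more than rounding error, the sign or the index handling in the quotient-rule step is the usual place to look.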
- Image inpainting refers to the task of rebuilding missing or damaged patches of an image. Given grayscale images of fixed size 300(h)x200(w) pixels with a rectangular patch of size 60(h)x100(w) pixels missing, as exemplified in the left photo below, we want to rebuild the missing pixels as in the photo on the right. Recall that grayscale images consist of 1 pixel channel with pixel values from 0 to 255 (represented by 1 byte).
- Suppose you are asked to use a multilayer perceptron (MLP) neural network as shown in Fig.1 to carry out the inpainting task, i.e., predicting the missing image patch.
Fig.1
- Describe how you would formulate the problem: state what the input (pixel vector xi) and the output (pixel vector sxi) should be.
- How should the training data be prepared to train the MLP-based image inpainting model?
- Suppose now that the MLP in part (a) is enhanced with convolutional layers and pooling layers, so that the resulting CNN carries out grayscale image inpainting. For the architecture in the following table, how many learnable parameters are there in each of the specified layers? Show the formula or calculations in your answers. Also, there are TWO unknown parameters in the Specification column, namely U1 and U2; write down their correct values.
Layer in CNN | Specification | Number of learnable parameters (formula answer is acceptable) |
Input Layer | Your solution in part (a) | 0 |
1st Convolutional Layer | 64 3x3xU1 filters; stride=1; no zero padding | |
1st Max Pooling Layer | 2×2 window; stride=2 | |
2nd Convolutional Layer | 256 3x3xU2 filters; stride=1; no zero padding | |
2nd Max Pooling Layer | 2×2 window; stride=2 | |
Input layer of fully connected (fc) feedforward network | Just the flattened output from previous layer | 0 |
1st hidden layer of fc feedforward network | NL2 hidden neurons | |
2nd hidden layer of fc feedforward network | NL3 hidden neurons | |
Output layer | Your solution in part (a) | |
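The bookkeeping behind a table like this can be sketched in a few lines of Python. This is a rough illustration under assumptions that the question itself leaves open: the network input is taken to be the full 300x200 single-channel image, every filter and neuron has one bias term, pooling layers have no learnable parameters, and odd sizes are floored when pooling. The values used for NL2, NL3, and the output size are hypothetical placeholders, and all function names are illustrative.

```python
def out_size(size, window, stride=1):
    """Output length along one axis for an unpadded convolution or pooling window."""
    return (size - window) // stride + 1

def conv_params(kernel_h, kernel_w, in_channels, num_filters, bias=True):
    """Weights per filter (kernel_h * kernel_w * in_channels) plus one optional bias."""
    return (kernel_h * kernel_w * in_channels + (1 if bias else 0)) * num_filters

def dense_params(n_in, n_out, bias=True):
    """Fully connected layer: n_in weights plus one optional bias per output neuron."""
    return (n_in + (1 if bias else 0)) * n_out

# Assumption: input is the 300(h) x 200(w) grayscale image, i.e. 1 channel.
h, w, c = 300, 200, 1

u1 = c                                    # filter depth matches input channels
p_conv1 = conv_params(3, 3, u1, 64)
h, w, c = out_size(h, 3), out_size(w, 3), 64          # after 1st convolution
h, w = out_size(h, 2, 2), out_size(w, 2, 2)           # after 1st max pooling

u2 = c                                    # filter depth matches previous filter count
p_conv2 = conv_params(3, 3, u2, 256)
h, w, c = out_size(h, 3), out_size(w, 3), 256         # after 2nd convolution
h, w = out_size(h, 2, 2), out_size(w, 2, 2)           # after 2nd max pooling

flat = h * w * c                          # size of the flattened fc input

# Hypothetical placeholder values: NL2, NL3 and the output size are not given.
nl2, nl3, n_out = 512, 512, 60 * 100
p_fc1 = dense_params(flat, nl2)
p_fc2 = dense_params(nl2, nl3)
p_output = dense_params(nl3, n_out)

print("U1 =", u1, "U2 =", u2)
print("conv1:", p_conv1, "conv2:", p_conv2, "flattened size:", flat)
print("fc1:", p_fc1, "fc2:", p_fc2, "output:", p_output)
```

Under a different answer to part (a), for example a different input representation or a bias-free convention, the same helpers apply with the starting shape and the `bias` flag changed accordingly.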
- The autoencoder is a widely used unsupervised deep learning model. Describe how it can be used in the following applications.
- k-means clustering of face images
Given a collection of human face images, the aim is to cluster them into natural groups of similar faces.
- Hierarchical clustering of animal images
Given a collection of animal images, the aim is to build a hierarchy of clusters of similar animal types.
For these two applications, state clearly the input-to-hidden and hidden-to-output arrangements of the autoencoder model used (e.g., convolutional autoencoder, denoising autoencoder, etc.), and how the representation learnt by the autoencoder is used to carry out the clustering tasks.