AI 4 Neural nets lab

1. Note - some of you pointed out that the program mlpdemo didn't run properly - you need to change the line
newrun = 0;
to
newrun = 1;
in mlpdemo.m (line 16) if you want it to re-initialise the weights at the start. Sorry about that.

2. Also, note that you will need to use multidimensional arrays to store the network weights (w1 & w2) for the different members of the population. A multi-dimensional array can be created e.g.
PopW1 = randn(Npop, ninput, nhidden);

PopW2 = randn(Npop, nhidden+1, noutput);
which creates randomly initialised weights for Npop individual networks. One quirk of MATLAB is that if you want to use the individual weights from one of these you need to do e.g.
w1 = squeeze(PopW1(i,:,:));
which takes the ith weight matrix from the multidimensional array (squeeze drops the singleton dimension left over from the indexing).

3. If you do try the classification problem I suggest avoiding making a 26-class classifier - stick to a two-class problem (try to separate one of the letters from all others).
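For the two-class version, the 26 letter labels can be collapsed into binary targets with a one-vs-rest scheme. A minimal sketch, assuming the letter labels have been read into a character array letters (the variable names here are illustrative, not from the lab files):

```matlab
% One-vs-rest targets: separate the letter 'A' from all the others.
% 'letters' is assumed to be an N-by-1 character array of class labels.
targets = double(letters == 'A');   % 1 for the chosen letter, 0 otherwise

% Check the class balance - roughly 1 in 26 positives is expected.
fprintf('Positives: %d of %d\n', sum(targets), numel(targets));
```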

 

Machine Learning

Adapt the MATLAB code in program mlpdemo.m to optimise the network weights using an evolutionary approach. This can be done by adding fewer than 20 lines of code, so it should not be too time-consuming to program. Experiment with different selection processes for choosing which networks survive to go on to the next generation, and with how those which remain are altered (a simple approach would just be to add some random change to the parameters). You can create your own dataset to learn from, or if you want, try the character recognition problem below.

Look at the mean square error history for the population of networks over the generations, and experiment with the effect of changes to the selection and mutation processes. Compare the results to the straightforward back-propagation algorithm in terms of classification performance and computational effort (for this you can use the flops command in MATLAB).
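One possible shape for the evolutionary loop is sketched below. It assumes a forward-pass function along the lines of mlpdemo.m (the names mlpforward, x, t, Npop and ngenerations are illustrative, not taken from the actual files). Selection here is simple truncation - keep the best half - and mutation is additive Gaussian noise:

```matlab
% Evolutionary weight optimisation - a minimal sketch.
% PopW1, PopW2: population weight arrays as in note 2 above.
sigma = 0.1;                         % mutation strength (worth tuning)
for gen = 1:ngenerations
    % Evaluate the mean square error of every individual.
    for i = 1:Npop
        w1 = squeeze(PopW1(i,:,:));
        w2 = squeeze(PopW2(i,:,:));
        y = mlpforward(x, w1, w2);   % assumed forward pass
        mse(i) = mean(sum((y - t).^2, 2));
    end
    errhistory(gen) = min(mse);      % track the best error per generation

    % Truncation selection: keep the best half, overwrite the worst half
    % with mutated copies of the survivors.
    [dummy, order] = sort(mse);
    nkeep = Npop/2;
    for i = 1:nkeep
        src = order(i);
        dst = order(nkeep + i);
        PopW1(dst,:,:) = PopW1(src,:,:) + sigma*randn(size(PopW1(src,:,:)));
        PopW2(dst,:,:) = PopW2(src,:,:) + sigma*randn(size(PopW2(src,:,:)));
    end
end
```

Both sigma and the fraction of the population kept change the error history markedly, so they are good first things to experiment with.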

About MATLAB

MATLAB is available on Windows and Linux. Use the help command and the demo command to get an overview of what's available. MATLAB is a commercial piece of software sold by MathWorks. It is an interpreted language with strong visualisation capabilities, which is well suited to numerical tasks such as neural network implementation.

Note that script or function files are M-files, and end in .m, while MATLAB data files are MAT-files, and end in .mat.
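For example, results can be written to and read back from a MAT-file with the save and load commands (the variable and file names below are just examples):

```matlab
errhistory = rand(1, 100);     % some results worth keeping
save results.mat errhistory    % writes the variable to results.mat
clear errhistory               % remove it from the workspace
load results.mat               % restores errhistory from the file
```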


Classification problem

The post office is running a competition to see who can produce the best classification on the following data set for character recognition. The details of this data are given below. Use a multi-layer perceptron to classify the data. The data is in Dataset.data. What do you notice about the progress of the training and test set errors (implementation: classdemo.m, mlpforward.m, mlpbackward.m, confusionmulti.m)? Describe what happens and experiment with the training time, the number of hidden units and the size of the training set.

You can change the loop so that you include some convergence criterion, rather than running until max_iters iterations. Rather than choosing the network that achieves the best classification on the training set, we recommend holding back some of the data as a separate test set. Discuss how the test behaviour changes during learning relative to the training error. You might also want to monitor the percentage of correctly classified cases - why might this not change in proportion to the sum of squared errors? Which letters does the classifier tend to make errors on? Why do you think this is?
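A minimal sketch of such a modified loop is given below. The signatures of mlpbackward and mlpforward are assumed, not taken from the actual files, and the tolerance value is illustrative; the targets are assumed to be 0/1 column vectors for the two-class version:

```matlab
tol = 1e-4;       % stop when the training error improves by less than this
olderr = Inf;
for it = 1:max_iters
    % One back-propagation update of the weights (assumed signature).
    [w1, w2, trainerr] = mlpbackward(x, t, w1, w2);

    % Monitor behaviour on the held-back test set.
    ytest = mlpforward(xtest, w1, w2);              % assumed forward pass
    testerr(it) = mean(sum((ytest - ttest).^2, 2));

    % Percentage correct for a single 0/1 output (threshold at 0.5).
    pctcorrect(it) = 100 * mean(round(ytest) == ttest);

    % Convergence criterion on the training error.
    if abs(olderr - trainerr) < tol
        break
    end
    olderr = trainerr;
end
```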

Note that after some initial experimentation it might be better to write some M-files to run a number of tests, as they are likely to be time-consuming.


The description of the data is as follows:

1. Title: Letter Image Recognition Data

2. Source Information

-- Creator: David J. Slate
-- Odesta Corporation; 1890 Maple Ave; Suite 115; Evanston, IL 60201
-- Donor: David J. Slate (dave@math.nwu.edu) (708) 491-3867
-- Date: January, 1991

3. Past Usage:

-- P. W. Frey and D. J. Slate (Machine Learning Vol 6 #2 March 91):

"Letter Recognition Using Holland-style Adaptive Classifiers".
The research for this article investigated the ability of several variations of Holland-style adaptive classifier systems to learn to correctly guess the letter categories associated with vectors of 16 integer attributes extracted from raster scan images of the letters. The best accuracy obtained was a little over 80%. It would be interesting to see how well other methods do with the same data.

4. Relevant Information:

The objective is to identify each of a large number of black-and-white rectangular pixel displays as one of the 26 capital letters in the English alphabet. The character images were based on 20 different fonts and each letter within these 20 fonts was randomly distorted to produce a file of 20,000 unique stimuli. Each stimulus was converted into 16 primitive numerical attributes (statistical moments and edge counts) which were then scaled to fit into a range of integer values from 0 through 15. We typically train on the first 16000 items and then use the resulting model to predict the letter category for the remaining 4000. See the article cited above for more details.
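Reading the data in and reproducing that split might look like the sketch below. It assumes Dataset.data is a plain ASCII matrix with one row per stimulus and the letter category in the first column; check the file before relying on this, and note all variable names are illustrative:

```matlab
D = load('Dataset.data');       % assumed plain ASCII, one row per stimulus
labels   = D(:, 1);             % assumed: first column is the letter category
features = D(:, 2:17);          % the 16 integer attributes

% The split used in the original study: first 16000 train, last 4000 test.
xtrain = features(1:16000, :);      ltrain = labels(1:16000);
xtest  = features(16001:20000, :);  ltest  = labels(16001:20000);
```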

5. Number of Instances: 20000

6. Number of Attributes: 17 (Letter category and 16 numeric features)

7. Attribute Information:

1.  lettr  capital letter                 (26 values from A to Z)
2.  x-box  horizontal position of box     (integer)
3.  y-box  vertical position of box       (integer)
4.  width  width of box                   (integer)
5.  high   height of box                  (integer)
6.  onpix  total # on pixels              (integer)
7.  x-bar  mean x of on pixels in box     (integer)
8.  y-bar  mean y of on pixels in box     (integer)
9.  x2bar  mean x variance                (integer)
10. y2bar  mean y variance                (integer)
11. xybar  mean x y correlation           (integer)
12. x2ybr  mean of x * x * y              (integer)
13. xy2br  mean of x * y * y              (integer)
14. x-ege  mean edge count left to right  (integer)
15. xegvy  correlation of x-ege with y    (integer)
16. y-ege  mean edge count bottom to top  (integer)
17. yegvx  correlation of y-ege with x    (integer)

8. Missing Attribute Values: None

9. Class Distribution:

789 A   766 B   736 C   805 D   768 E   775 F   773 G
734 H   755 I   747 J   739 K   761 L   792 M   783 N
753 O   803 P   783 Q   758 R   748 S   796 T   813 U
764 V   752 W   787 X   786 Y   734 Z