Homework 8
Task 1 (20 pts)
Read or skim "Strong and Weak Emergence" by David Chalmers (2006) (PDF; full citation). Answer these questions:
- Define what Chalmers means by weak emergence.
- Chalmers writes, "We can think of strongly emergent phenomena as being systematically determined by low-level facts without being deducible from those facts." Give an example (1-2 sentences) that might satisfy this definition of strong emergence.
- Are the NetLogo models we looked at (sheep and wolves, ants, termites) examples of strong or weak emergence? Provide a 1-2 sentence argument.
Task 2 (20 pts)
Execute the k-means algorithm by hand on the following data:
item # | w | x | y | z | true label |
---|---|---|---|---|---|
1 | 1.0 | 1.0 | 0.0 | 1.0 | A |
2 | 0.0 | 0.0 | 1.0 | 0.0 | B |
3 | 2.0 | 2.0 | 0.0 | 2.0 | A |
4 | 0.0 | 0.0 | 1.0 | 1.0 | B |
5 | 2.0 | 2.0 | 0.0 | 2.0 | A |
6 | 0.0 | 2.0 | 1.0 | 1.0 | B |
Use \(k=2\). Choose the starting centroid values yourself, randomly or deliberately. Show the centroids after each iteration, and give the final centroids. Finally, give the confusion matrix comparing the final clusters with the true labels.
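For reference when checking hand computations, here is a minimal sketch of the assign-and-update loop that k-means repeats. It assumes Euclidean distance and keeps an empty cluster's centroid unchanged; both are common conventions rather than requirements of this task. The function names and the toy 2-D points are illustrative only and are not the homework data.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def assign(points, centroids):
    """For each point, return the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: euclidean(p, centroids[c]))
        for p in points
    ]

def update(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for c, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == c]
        if members:
            new_centroids.append(
                tuple(sum(col) / len(members) for col in zip(*members))
            )
        else:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(old)
    return new_centroids

def kmeans(points, start_centroids, max_iter=100):
    """Run k-means until the centroids stop changing (max_iter >= 1)."""
    centroids = [tuple(c) for c in start_centroids]
    history = [list(centroids)]  # centroids after each change
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
        history.append(list(centroids))
    return centroids, labels, history

# Toy example (made-up points, not the table above).
toy_points = [(0.0, 0.0), (0.0, 1.0), (4.0, 4.0), (4.0, 5.0)]
toy_start = [(0.0, 0.0), (4.0, 4.0)]
final, labels, history = kmeans(toy_points, toy_start)
for step, cents in enumerate(history):
    print(step, cents)
```

The `history` list holds the centroids as they change, which is the same record the task asks for on paper.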
Task 3 (20 pts)
Run the k-means algorithm in Weka using this dataset: iris.arff (iris species clustering).
Choose \(k=3\) and \(k=4\). Give the confusion matrix for each value of \(k\). Also report the percent of correctly classified instances for each class, for each \(k\).
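As a rough illustration of how a per-class percentage can be read off a confusion matrix, the sketch below counts (true label, cluster) pairs and maps each cluster to its most common true label. That majority mapping is an assumption made for this example; Weka's own cluster-to-class assignment may differ, so report the numbers Weka prints. The labels and cluster ids are made up.

```python
from collections import Counter

def confusion_counts(true_labels, clusters):
    """Count how often each (true label, cluster) pair occurs."""
    return Counter(zip(true_labels, clusters))

def per_class_percent(true_labels, clusters):
    """Percent of each class that lands in a cluster mapped to that class."""
    counts = confusion_counts(true_labels, clusters)
    # Assumption: each cluster stands for its most common true label.
    cluster_to_class = {}
    for cluster in set(clusters):
        by_class = {t: n for (t, c), n in counts.items() if c == cluster}
        cluster_to_class[cluster] = max(by_class, key=by_class.get)
    totals = Counter(true_labels)
    correct = Counter()
    for (t, c), n in counts.items():
        if cluster_to_class[c] == t:
            correct[t] += n
    return {t: 100.0 * correct[t] / totals[t] for t in totals}

# Made-up example: three classes, three clusters.
toy_true = ["a", "a", "a", "b", "b", "c"]
toy_clusters = [0, 0, 1, 1, 1, 2]
print(per_class_percent(toy_true, toy_clusters))
```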
Task 4 (20 pts)
Execute the k-nearest neighbor algorithm by hand on the dataset below (the same data as in Task 2). Use \(k = 2\). Classify the data point \(\langle w, x, y, z \rangle = \langle 1, 0, 1, 2 \rangle\).
item # | w | x | y | z | true label |
---|---|---|---|---|---|
1 | 1.0 | 1.0 | 0.0 | 1.0 | A |
2 | 0.0 | 0.0 | 1.0 | 0.0 | B |
3 | 2.0 | 2.0 | 0.0 | 2.0 | A |
4 | 0.0 | 0.0 | 1.0 | 1.0 | B |
5 | 2.0 | 2.0 | 0.0 | 2.0 | A |
6 | 0.0 | 2.0 | 1.0 | 1.0 | B |
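For reference, a minimal sketch of the k-nearest neighbor procedure follows. It assumes Euclidean distance and a majority vote among the \(k\) closest training items, and it breaks voting ties in favor of the closest neighbor. The tie-breaking rule is an assumption of this sketch; with an even \(k\), state whichever rule you use. The training points and labels are made up and are not the table above.

```python
import math
from collections import Counter

def euclidean(p, q):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def knn_classify(train, query, k):
    """Classify query by majority vote among its k nearest training items.

    train is a list of (features, label) pairs.
    """
    neighbors = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    top = max(votes.values())
    tied = {label for label, n in votes.items() if n == top}
    # Assumption: on a tie, the label of the closest tied neighbor wins.
    for _, label in neighbors:
        if label in tied:
            return label

# Made-up training data (hypothetical labels "red" and "blue").
toy_train = [
    ((0.0, 0.0), "red"),
    ((0.0, 10.0), "red"),
    ((4.0, 0.0), "blue"),
    ((4.0, 10.0), "blue"),
]
print(knn_classify(toy_train, (1.0, 0.0), k=2))
print(knn_classify(toy_train, (3.0, 0.0), k=2))
```

In both toy queries the two nearest neighbors carry different labels, so the result depends entirely on the tie-breaking rule. This is why an even \(k\) needs one.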
Task 5 (10 pts)
Run the k-nearest neighbor algorithm in Weka using this dataset: letter.arff (handwritten letter classification). Find a good value of \(k\). Use 10-fold cross-validation, report the accuracy, and give the confusion matrix.
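To make the evaluation procedure concrete, here is a minimal sketch of what 10-fold cross-validation computes: the data is split into ten parts, each part is held out once while the rest serves as training data, and accuracy is the fraction of held-out items classified correctly. The function names and the `classify(train, features)` callback are hypothetical, and the plain shuffled split is an assumption of this sketch; Weka's procedure may differ in details such as how folds are formed.

```python
import random

def k_fold_indices(n, folds=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into `folds` parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::folds] for i in range(folds)]

def cross_validate(data, classify, folds=10, seed=0):
    """Return the accuracy of `classify` under k-fold cross-validation.

    data is a list of (features, label) pairs.
    classify(train, features) must return a predicted label.
    """
    correct = 0
    for held_out in k_fold_indices(len(data), folds, seed):
        held = set(held_out)
        train = [data[i] for i in range(len(data)) if i not in held]
        for i in held_out:
            features, label = data[i]
            if classify(train, features) == label:
                correct += 1
    return correct / len(data)

# Made-up data and a trivial classifier that always answers "x".
toy_data = [((float(i),), "x") for i in range(25)]
print(cross_validate(toy_data, lambda train, features: "x"))
```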
Task 6 (10 pts)
Explain the differences between k-means and k-nearest neighbor algorithms.
Extra credit (+20 pts)
Play around with Weka. Report how well at least three different classification algorithms (avoid k-means and k-nn) perform on the letter.arff data with 10-fold cross validation. Collect accuracies in a table.