Archive for February, 2009

k-MEANS CLUSTERING

Saturday, February 28th, 2009

The k-means clustering algorithm [1] is a straightforward and effective algorithm for finding clusters in data. The algorithm proceeds as follows.

  • Step 1: Ask the user how many clusters k the data set should be partitioned into.
  • Step 2: Randomly assign k records to be the initial cluster center locations. (more…)
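The two steps above can be sketched in Python. The excerpt is cut off after Step 2, so the assign-and-update loop below is the standard textbook iteration, included as an assumption and not as the source's own wording. The function name `k_means`, its parameters, and the sample records are all hypothetical.

```python
import random

def k_means(records, k, seed=0, max_iter=100):
    """Partition records (tuples of floats) into k clusters."""
    rng = random.Random(seed)
    # Step 1 is the caller's choice of k.
    # Step 2: pick k distinct records as the initial cluster centers.
    centers = rng.sample(records, k)
    for _ in range(max_iter):
        # Assumed step: assign each record to its nearest center
        # (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for r in records:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(r, centers[i])),
            )
            clusters[nearest].append(r)
        # Assumed step: move each center to the mean of its records;
        # an empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # nothing moved, so stop
            break
        centers = new_centers
    return centers, clusters

# Illustrative data: two visibly separate groups of 2-D records.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, clusters = k_means(data, k=2)
```

Because the initial centers are drawn at random, different seeds can give different final clusters on less cleanly separated data.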

The constant boost

Saturday, February 28th, 2009

The constant boost to recall contrasts with the typical curve of forgetting we could expect without review. It would not be over-optimistic to expect a 400-500% boost in learning from this learning plan.

In a unique study reported in “Practical Aspects of Memory”, Marigold Linton kept a diary over a four-year period. She was able to show that 65% of the events in the diary that she never reviewed were forgotten. Even a single review cut down forgetting significantly, whereas four reviews over the four-year period reduced the probability of forgetting to about 12%. Put positively, just 4 reviews could produce an 88% probability of recall!

Your memory and your ability to learn are much, much greater than you have supposed.

Yet we have not even begun to discuss the biggest single aid to fast and effective learning – the enormous power of association. Before we do, let us turn our attention from natural actions that aid memory, to some artificial but valuable and instructive aids.

Taken From: Accelerated Learning

Complete-Linkage Clustering

Friday, February 27th, 2009

Next, let's examine whether using the complete-linkage criterion would result in a different clustering of this sample data set. Complete linkage seeks to minimize the distance among the records in two clusters that are farthest from each other. Figure 8.3 illustrates complete-linkage clustering for this data set.

  • Step 1: Since each cluster contains a single record only, there is no difference between single linkage and complete linkage at step 1. The two clusters each containing 33 are again combined. (more…)
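A minimal sketch of the two criteria on one-dimensional records may make the comparison concrete. The function names and the sample clusters are hypothetical; the excerpt only states that two records share the value 33.

```python
def single_linkage(a, b):
    """Distance between the two closest records, one from each cluster."""
    return min(abs(x - y) for x in a for y in b)

def complete_linkage(a, b):
    """Distance between the two farthest records, one from each cluster."""
    return max(abs(x - y) for x in a for y in b)

# With single-record clusters there is only one pair of records,
# so both criteria return the same distance.
same_at_step_1 = single_linkage([33], [33]) == complete_linkage([33], [33])

# Once a cluster holds several records, the two criteria diverge.
near = single_linkage([15, 16, 18], [25])   # closest pair is 18 and 25
far = complete_linkage([15, 16, 18], [25])  # farthest pair is 15 and 25
```

Under complete linkage, the pair of clusters merged at each step is the pair whose farthest-apart records are closest together.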

Review

Friday, February 27th, 2009

If breaks enhance memory consolidation, there is a similar pattern that significantly enhances long-term memory consolidation, and which dramatically slashes the overall time spent in learning.

From various sources that include Tony Buzan’s excellent book ‘Use Your Head’, Peter Russell’s equally fascinating ‘The Brain Book’ and journals of experimental and educational psychology, the following pattern of Review would seem to be ideal. It assumes that the initial learning period is up to 45 minutes.

1. Learn material with immediate tests continuously built in to ensure the basic transfer from Short Term to Long Term Memory.

This pattern of review will take about 20 minutes per 45 minutes of initial learning, but will conservatively save many hours of learning compared with the normal instinctive urge to “learn it all in one go”.

Taken From: Accelerated Learning

Single-Linkage Clustering

Thursday, February 26th, 2009

Suppose that we are interested in using single-linkage agglomerative clustering on this data set. Agglomerative methods start by assigning each record to its own cluster. Then, single linkage seeks the minimum distance between any records in two clusters. Figure 8.2 illustrates how this is accomplished for this data set. The minimum cluster distance is clearly between the single-record clusters which each contain the value 33, for which the distance must be zero for any valid metric. Thus, these two clusters are (more…)
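The first merge described above can be sketched as follows. The list of values is illustrative (the excerpt only tells us that two records have the value 33), absolute difference stands in for the distance metric, and the names `single_linkage` and `closest_pair` are hypothetical.

```python
from itertools import combinations

def single_linkage(a, b):
    """Minimum distance between any record in a and any record in b."""
    return min(abs(x - y) for x in a for y in b)

def closest_pair(clusters):
    """Indices of the two clusters with the smallest single-linkage distance."""
    return min(
        combinations(range(len(clusters)), 2),
        key=lambda p: single_linkage(clusters[p[0]], clusters[p[1]]),
    )

# Agglomerative methods start with every record in its own cluster.
values = [2, 5, 9, 15, 16, 18, 25, 33, 33, 45]
clusters = [[v] for v in values]

i, j = closest_pair(clusters)
first_merge = clusters[i] + clusters[j]
first_distance = single_linkage(clusters[i], clusters[j])
```

Here the two single-record clusters holding 33 are selected, at distance zero.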

Taking a Break

Thursday, February 26th, 2009

The popular view that you need to take a break every now and then has been confirmed by French researcher Henri Pieron. He found that a planned series of breaks during a study period or lesson increases the probability of recall. A break every 30 minutes is probably optimum, and each break should last about 5 minutes. Certainly no improvement is gained when the break exceeds 10 minutes.

The break should be a complete rest from the type of study being undertaken, otherwise too many competing or interfering associations will be formed, and they will confuse the memory traces laid down in the study period. The deep breathing and relaxation exercises described in Chapter 12 are specifically designed to produce mental and physical relaxation and enhance oxygen flow to the brain.

The effect of the breaks will be to sustain recall in the way that the diagram opposite shows.

What is at work is the effect of Primacy and Recency coupled with the “Zeigarnik Effect”. Zeigarnik, a German researcher, found that interrupting a task in which a person is involved, even if that task is going well, can lead to appreciably higher subsequent recall.

Taken From: Accelerated Learning

HIERARCHICAL CLUSTERING METHODS

Wednesday, February 25th, 2009

Clustering algorithms are either hierarchical or nonhierarchical. In hierarchical clustering, a treelike cluster structure (dendrogram) is created through recursive partitioning (divisive methods) or combining (agglomerative) of existing clusters. Agglomerative clustering methods initialize each observation to be a tiny cluster of its own. Then, in succeeding steps, the two closest clusters are aggregated into a new combined cluster. In this way, the number of clusters in the data set is reduced by one at each step. (more…)
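The agglomerative procedure can be sketched as a short loop. This is a minimal illustration on one-dimensional values, not the source's implementation: the function name `agglomerate`, the `linkage` parameter (Python's built-in `min` for closest-record distance, `max` for farthest-record distance), and the sample values are assumptions.

```python
from itertools import combinations

def agglomerate(values, linkage=min):
    """Merge the two closest clusters until one remains.

    Returns the list of clusterings, one entry per step.
    """
    # Every observation starts as a tiny cluster of its own.
    clusters = [[v] for v in values]
    history = [[list(c) for c in clusters]]
    while len(clusters) > 1:
        def distance(pair):
            a, b = clusters[pair[0]], clusters[pair[1]]
            return linkage(abs(x - y) for x in a for y in b)

        # Find the two closest clusters and combine them.
        i, j = min(combinations(range(len(clusters)), 2), key=distance)
        merged = clusters[i] + clusters[j]
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)]
        clusters.append(merged)
        history.append([list(c) for c in clusters])
    return history

history = agglomerate([2, 5, 9, 15, 16])
cluster_counts = [len(step) for step in history]
```

Each pass through the loop removes two clusters and adds one, so `cluster_counts` falls by one per step. The recorded history holds the same information a dendrogram would display.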

Sleep Learning (Hypnopaedia)

Wednesday, February 25th, 2009

If sleep helps you to assimilate facts, form opinions, reach solutions, and indulge in a “test run” of behaviour, then can you actually use the period of sleep to learn actively?

Experiments began in the U.S.A. in 1942 and extended to Russia in the 1950s. “Hypnopaedia” was a popular idea but there has never been any real success with it.

On the basis of Dr. Chris Evans’ work and his conclusion that sleep is when the brain, as computer, is “off line”, we can understand why sleep learning has never succeeded. Learning is an activity when fresh information is presented. Sleep is precisely the period when the information is reviewed, not when new data are taken in.

We are now also sufficiently aware of the importance of holistic (left and right brain) learning, that we would not really expect sleep learning to work. Learning may be easiest in a state of calm, relaxed alertness, but that does not mean that you need not be fully conscious.

Taken From: Accelerated Learning

CLUSTERING TASK (2)

Tuesday, February 24th, 2009

Clustering is often performed as a preliminary step in a data mining process, with the resulting clusters being used as further inputs into a different technique downstream, such as neural networks. Due to the enormous size of many present-day databases, it is often helpful to apply clustering analysis first, to reduce the search space for the downstream algorithms. In this chapter, after a brief look at hierarchical clustering methods, we discuss in detail k-means clustering; in Chapter 9 we examine clustering using Kohonen networks, a structure related to neural networks. (more…)

CLUSTERING TASK

Monday, February 23rd, 2009

Clustering refers to the grouping of records, observations, or cases into classes of similar objects. A cluster is a collection of records that are similar to one another and dissimilar to records in other clusters. Clustering differs from classification in that there is no target variable for clustering. The clustering task does not try to classify, estimate, or predict the value of a target variable. Instead, clustering algorithms seek to segment the entire data set into relatively homogeneous subgroups or clusters, where the similarity of the records within the cluster is maximized, and the similarity to records outside this cluster is minimized. (more…)