Archive for April, 2009
Interpreting the Clusters (2)
Thursday, April 30th, 2009
Next, consider Figure 9.7, which is a distribution plot of the clusters, with an International Plan overlay. Clusters 12 and 22 contain records if and only if they are adopters of the International Plan, while the other clusters contain records if and only if they have not adopted the International Plan. This time, the clustering algorithm has found another perfect discrimination along this dimension, dividing the data set perfectly among adopters and nonadopters of the International Plan. (more…)
Interpreting the Clusters
Monday, April 27th, 2009
How are we to interpret these clusters? How can we develop cluster profiles? Consider Figure 9.5, which is a bar chart of the clusters, with a VoiceMail Plan overlay. Clusters 02 and 12 contain records only if they are adopters of the VoiceMail Plan, while clusters 00, 10, and 20 contain records if and only if they have not adopted the VoiceMail Plan. Cluster 22 contains only a tiny portion of voicemail adopters. Excepting this small proportion of records in cluster 22, the clustering algorithm has found a high-quality discrimination along this dimension, dividing the data set nearly perfectly among adopters and nonadopters of the VoiceMail Plan. (more…)
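The overlay in the bar chart amounts to a cross-tabulation of cluster label against plan membership. A minimal sketch follows, assuming a handful of made-up records; the counts are hypothetical and are not taken from the churn data set, and only the cluster labels come from the text.

```python
from collections import Counter

# Hypothetical (cluster, voicemail_plan) pairs; the counts are invented
# purely to illustrate the tabulation, not drawn from the churn data.
records = (
    [("00", "no")] * 6
    + [("02", "yes")] * 4
    + [("10", "no")] * 5
    + [("12", "yes")] * 3
    + [("20", "no")] * 7
    + [("22", "no")] * 9
    + [("22", "yes")] * 1
)

counts = Counter(records)
clusters = sorted({cluster for cluster, _ in records})

# Share of VoiceMail Plan adopters within each cluster.
adopter_share = {}
for cluster in clusters:
    yes = counts[(cluster, "yes")]
    no = counts[(cluster, "no")]
    adopter_share[cluster] = yes / (yes + no)

for cluster in clusters:
    print(cluster, f"{adopter_share[cluster]:.2f}")
```

A share of exactly 0 or 1 for a cluster corresponds to the perfect discrimination described above, while a small nonzero share mirrors the situation described for cluster 22.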
San Antonio Realtors Find You the Best Living
Sunday, April 26th, 2009
Imagine that you are living in Texas. Maybe this is one of your dreams. On the other hand, some of you want to sell your house in Texas and start a new life in another area. Buying and selling can be difficult if you don't have enough experience. You will probably be confused about pricing and about finding the right market. A realtor's service is here to help you with these problems. (more…)
APPLICATION OF CLUSTERING USING KOHONEN NETWORKS
Friday, April 24th, 2009
Next, we apply the Kohonen network algorithm to the churn data set from Chapter 3 (available at the book series Web site; also available from http://www.sgi.com/tech/mlc/db/). Recall that the data set contains 20 variables worth of information about 3333 customers, along with an indication of whether that customer churned (left the company) or not. (more…)
Enjoy Warcraft Game as Much You Want
Tuesday, April 21st, 2009
Everybody, no matter their age, likes spending time playing games with friends, with family, or alone. There are so many kinds of games, all of them made for entertainment and fun. Some games have a high level of difficulty and need a good strategy to win. I think such games offer more fun and challenge you to think further. (more…)
CLUSTER VALIDITY
Tuesday, April 21st, 2009
To avoid spurious results, and to assure that the resulting clusters are reflective of the general population, the clustering solution should be validated. One common validation method is to split the original sample randomly into two groups, develop cluster solutions for each group, and then compare their profiles using the methods below or other summarization methods. (more…)
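The split-sample check can be sketched in a few lines. This is only an illustration under stated assumptions: the data are synthetic, a tiny one-dimensional k-means stands in for whatever clustering method is actually used (it is not a Kohonen network), and the cluster "profile" is reduced to the cluster centers. The names `kmeans_1d`, `centers_a`, `centers_b`, and `max_gap` are hypothetical.

```python
import random


def kmeans_1d(values, k, iters=50):
    """Tiny 1-D k-means used as a stand-in clusterer; returns sorted centers."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted data.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Keep the old center if a cluster happens to be empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return sorted(centers)


rng = random.Random(0)

# Synthetic sample (an assumption): two well-separated groups of values.
data = [rng.gauss(10, 1) for _ in range(200)] + [rng.gauss(50, 1) for _ in range(200)]

# Split the sample randomly into two halves.
rng.shuffle(data)
half = len(data) // 2
group_a, group_b = data[:half], data[half:]

# Develop a cluster solution for each half, then compare the profiles.
centers_a = kmeans_1d(group_a, k=2)
centers_b = kmeans_1d(group_b, k=2)
max_gap = max(abs(a - b) for a, b in zip(centers_a, centers_b))

print("centers A:", [round(c, 2) for c in centers_a])
print("centers B:", [round(c, 2) for c in centers_b])
print("largest difference between matched centers:", round(max_gap, 2))
```

If the two halves yield similar centers, the solution is less likely to be an artifact of one particular sample; a large gap would suggest the clusters are not stable.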
CONFLUENCE OF RESULTS: APPLYING A SUITE OF MODELS
Tuesday, April 21st, 2009
In Olympic figure skating, the best-performing skater is not selected by a single judge alone. Instead, a suite of several judges is called upon to select the best skater from among all the candidate skaters. Similarly in model selection, whenever possible, the analyst should not depend solely on a single data mining method. Instead, he or she should seek a confluence of results from a suite of different data mining models. (more…)
EXAMPLE OF A KOHONEN NETWORK STUDY (3)
Monday, April 20th, 2009
The winning node is node 3 because its weights (0.1, 0.8) are the closest to the third record's field values. Hence, we may expect node 3 to represent a cluster of younger, high-income persons. (more…)
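The competition step can be sketched as a nearest-weights search. In the sketch below, only node 3's weights (0.1, 0.8) come from the text; the other nodes' weights and the record's (age, income) values are assumptions chosen for illustration, and Euclidean distance is assumed as the measure of closeness.

```python
import math

# Weights per node as (age, income). Node 3 is from the text; nodes 1, 2,
# and 4 are hypothetical values chosen only for illustration.
node_weights = {
    1: (0.9, 0.8),
    2: (0.9, 0.2),
    3: (0.1, 0.8),
    4: (0.1, 0.2),
}

# Hypothetical normalized record for a younger, high-income person.
record = (0.2, 0.9)


def distance(weights, values):
    """Euclidean distance between a node's weights and a record's values."""
    return math.sqrt(sum((w - x) ** 2 for w, x in zip(weights, values)))


# The winning node is the one whose weights lie closest to the record.
distances = {node: distance(w, record) for node, w in node_weights.items()}
winner = min(distances, key=distances.get)

for node in sorted(distances):
    print(node, round(distances[node], 3))
print("winning node:", winner)
```

With these assumed values, node 3 has the smallest distance, which matches the outcome described above.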
EXAMPLE OF A KOHONEN NETWORK STUDY (2)
Sunday, April 19th, 2009
Note the type of adjustment that takes place. The weights are nudged in the direction of the field values of the input record. That is, w11, the weight on the age connection for the winning node, was originally 0.9, but was adjusted in the direction of the normalized value for age in the first record, 0.8. Since the learning rate is 0.5, this adjustment is half (0.5) of the distance between the current weight and the field value. This adjustment will help node 1 to become even more proficient at capturing the records of older, high-income persons. (more…)
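The adjustment described here is consistent with the usual update rule, new weight = old weight + learning rate × (field value − old weight). A minimal sketch, using the values stated above for w11 (0.9), the age value (0.8), and the learning rate (0.5); the function name `update_weight` is hypothetical.

```python
def update_weight(weight, value, learning_rate):
    """Move a weight toward the input value by a fraction of the gap."""
    return weight + learning_rate * (value - weight)


learning_rate = 0.5
w11 = 0.9  # current weight on the age connection for the winning node
age = 0.8  # normalized age in the first record

new_w11 = update_weight(w11, age, learning_rate)
print(round(new_w11, 2))
```

The weight moves from 0.9 to about 0.85, that is, half of the way toward the record's value of 0.8.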
EXAMPLE OF A KOHONEN NETWORK STUDY
Saturday, April 18th, 2009
Consider the following simple example. Suppose that we have a data set with two attributes, age and income, which have already been normalized, and suppose that we would like to use a 2 × 2 Kohonen network to uncover hidden clusters in the data set. We would thus have the topology shown in Figure 9.2. (more…)


