$\begingroup$

I have a small dataset of 134 observations. The dataset consists of answers to a questionnaire. All the variables are measured on a Likert scale ranging from 1 to 5 (strongly agree).

I want to perform a hierarchical cluster analysis on the variables but I am unsure about which linkage method and distance measure to use.

I have tried running Ward's linkage with squared Euclidean distance, but the posts and comments I have read on different statistics forums give conflicting advice. I know that Ward's method may not be the most appropriate linkage for ordinal Likert-scale data like mine, but when I run the analysis with Ward's linkage and squared Euclidean distance I get clusters that make a lot of sense conceptually and are well separated from each other.

So my questions:

  1. Should I use this approach with my data?

  2. Should I use some different measure?

$\endgroup$
  • $\begingroup$ Both your questions are really "should", not "can", so I've taken the liberty of editing accordingly. Euclid, Likert and Ward are all people's names. $\endgroup$ Commented Jan 15, 2024 at 12:52
  • $\begingroup$ How many such variables? $\endgroup$ Commented Jan 15, 2024 at 12:53
  • $\begingroup$ Three variables, each measured on a Likert scale. $\endgroup$ Commented Jan 15, 2024 at 13:19
  • $\begingroup$ So, from one point of view there are $5^3 = 125$ distinct clusters -- from $111$ to $555$ --- that define themselves. Too many for some purposes, perhaps, but still worth looking at the pattern of frequencies. With any luck they won't all occur in the data. $\endgroup$ Commented Jan 15, 2024 at 13:22
  • $\begingroup$ Yes. I would like to keep the number of clusters as small as possible. Any thoughts on whether I should use this approach with my data or whether I should use some different measure? $\endgroup$ Commented Jan 15, 2024 at 14:04
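
The suggestion in the comments to look at the pattern of frequencies can be sketched in a few lines of Python. This is only an illustration: the `responses` list below is made-up data, not the questionnaire from the question, and `pattern_frequencies` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical responses: one (v1, v2, v3) tuple per respondent, each item coded 1-5.
responses = [
    (1, 1, 2), (1, 1, 2), (5, 4, 5), (5, 4, 5), (5, 4, 5),
    (3, 3, 3), (1, 2, 1), (3, 3, 3), (5, 5, 5), (1, 1, 2),
]

def pattern_frequencies(rows):
    """Count how often each distinct response pattern occurs, most frequent first."""
    return Counter(rows).most_common()

freqs = pattern_frequencies(responses)
n_possible = 5 ** 3  # 125 possible patterns for three 5-point items

for pattern, count in freqs:
    print(pattern, count)
print(f"{len(freqs)} of {n_possible} possible patterns occur")
```

With real data, a table like this shows how many of the 125 possible patterns actually occur and which ones dominate, which is a useful sanity check on any cluster solution.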

1 Answer

$\begingroup$

My view (and others may differ) is that when it comes to cluster analysis, if a method gives you something useful, you can use it. You say you get clusters that are well separated and that make sense. Good.

More generally, ordinal data is tricky. Strictly speaking, the fact that a variable is ordinal means that you can't assume that the distance between (say) 1 and 2 is the same as the distance between (say) 2 and 3. But the spacing is usually fairly close to equal, for some abstract definition of "close". Yes, you could recode 1, 2, 3, 4, 5 as 1, 2, 181, 100291, 239918188 (if you are strict about what ordinal means), but that just doesn't make any intuitive sense.
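
To make the spacing point concrete, here is a minimal sketch of how an order-preserving recoding can change which respondents look "close" under squared Euclidean distance. The two codings and the three respondents are invented for illustration, and `sq_euclidean` and `recode` are hypothetical helper names.

```python
def sq_euclidean(a, b):
    """Squared Euclidean distance between two equal-length numeric sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def recode(row, codes):
    """Map each 1-5 response to the numeric value given in `codes`."""
    return tuple(codes[v] for v in row)

# Two hypothetical codings that both preserve the order 1 < 2 < 3 < 4 < 5.
equal_spacing = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
stretched_top = {1: 1, 2: 2, 3: 3, 4: 4, 5: 10}

# Three hypothetical respondents answering three items.
a, b, c = (2, 2, 2), (4, 4, 4), (5, 5, 5)

results = {}
for name, codes in [("equal", equal_spacing), ("stretched", stretched_top)]:
    ra, rb, rc = (recode(r, codes) for r in (a, b, c))
    results[name] = (sq_euclidean(ra, rb), sq_euclidean(rb, rc))
    print(name, "d(a,b) =", results[name][0], "d(b,c) =", results[name][1])
```

Under equal spacing, respondent `b` is closer to `c` than to `a`; once the top category is stretched, `b` is closer to `a`. If the coding is kept roughly equally spaced, as argued above, this kind of reversal is much less of a concern.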

A long time ago, in grad school, I saw research that tried to figure out how close some Likert scales are to being equally spaced, but I don't remember the details, much less a citation.
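
If you want to check how much your clusters depend on the equal-spacing assumption, one option is to rerun Ward's method on a differently spaced coding and compare the two partitions. The sketch below assumes NumPy and SciPy are available and uses synthetic data (three invented response profiles plus noise), not your questionnaire. The alternative coding is an arbitrary choice, and `ward_labels` and `same_partition_rate` are hypothetical helper names. Note that SciPy's `linkage(..., method="ward")` takes the raw observations and works in Euclidean geometry; I am assuming this matches what other packages label as Ward's method with squared Euclidean distance.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Synthetic stand-in for the questionnaire: 134 respondents, 3 items coded 1-5.
# Three made-up response profiles plus noise; NOT the original data.
profiles = np.array([[1, 2, 1], [3, 3, 3], [5, 4, 5]])
membership = rng.integers(0, 3, size=134)
noise = rng.integers(-1, 2, size=(134, 3))  # each entry is -1, 0 or 1
X = np.clip(profiles[membership] + noise, 1, 5).astype(float)

def ward_labels(data, k):
    """Cut a Ward dendrogram into at most k clusters; labels start at 1."""
    Z = linkage(data, method="ward")
    return fcluster(Z, t=k, criterion="maxclust")

def same_partition_rate(l1, l2):
    """Share of respondent pairs on which two labelings agree (Rand index)."""
    same1 = l1[:, None] == l1[None, :]
    same2 = l2[:, None] == l2[None, :]
    iu = np.triu_indices(len(l1), k=1)
    return float(np.mean(same1[iu] == same2[iu]))

labels_equal = ward_labels(X, k=3)

# Alternative order-preserving coding (arbitrary, for illustration only).
codes = np.array([0.0, 1.0, 1.5, 3.0, 4.5, 5.0])  # index = response 1..5; index 0 unused
X_recoded = codes[X.astype(int)]
labels_recoded = ward_labels(X_recoded, k=3)

agreement = same_partition_rate(labels_equal, labels_recoded)
print("cluster sizes (equal spacing):", np.bincount(labels_equal)[1:])
print("Rand index between the two solutions:", agreement)
```

A Rand index near 1 would suggest the solution is not very sensitive to the exact spacing of the response codes; a much lower value would be a reason to look more carefully.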

$\endgroup$
  • $\begingroup$ The practical distinction between ordinal and other scales is much fuzzier than many of the more doctrinaire treatments of measurement scales imply. If my colleagues and I grade (mark) academic work on a percent scale, giving say 68 or 74, we are assigning marks on what might be called an ordinal scale with 101 allowed integer levels. But we have criteria for marks and mechanisms for cross-checking each other's grades, and at my workplace (and many others) averaging grades is the start of summarizing students' work. $\endgroup$ Commented Feb 20 at 8:24
  • $\begingroup$ It is routine that universities apply data analysis methods condemned in some of their own departments. $\endgroup$ Commented Feb 20 at 8:24
  • $\begingroup$ Indeed. Of course, marking on a "percent scale" is often not really a percent of any particular thing, depending on the format of the test. $\endgroup$ Commented Feb 20 at 12:25
  • $\begingroup$ Agreed. Sometimes 100 means "all correct" and sometimes it means "as good as could be expected", except that it is not awarded often. Equally, 0 doesn't usually mean completely ignorant.... $\endgroup$ Commented Feb 20 at 13:46
