Maximum homogeneity clustering for one-dimensional data

Usage

oneclust(x, k, w = NULL, sort = TRUE)

Arguments

x

Numeric vector, samples to be clustered.

k

Integer, number of clusters.

w

Numeric vector, sample weights (optional). Note that the weights here should be sampling weights (for example, a certain proportion of the population), not frequency weights (for example, number of occurrences).

sort

Logical, whether to sort x (and w) before clustering. Default is TRUE. If FALSE, the data are clustered in the order given.
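The effect of the sort argument can be seen on an ordered sequence. The sketch below is illustrative only: the data are made up, the call is guarded so that it runs only when the oneclust package is installed, and no particular output is claimed. It assumes that with sort = FALSE each cluster is a contiguous run of the input.

```r
# Sketch only: the data are made up for illustration.
set.seed(1)
# An ordered sequence with two level shifts (at positions 31 and 61)
y <- c(rnorm(30, mean = 0), rnorm(30, mean = 3), rnorm(30, mean = 0))

if (requireNamespace("oneclust", quietly = TRUE)) {
  # Keep the given order, so the sequence is split into consecutive segments
  seg <- oneclust::oneclust(y, k = 3, sort = FALSE)
  print(seg$cut)

  # Default behaviour: sort first, so samples are grouped by value instead
  val <- oneclust::oneclust(y, k = 3)
  print(table(val$cluster))
}
```

Keeping the order is the natural choice when position matters, for example when segmenting a series, while the default suits unordered samples.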

Value

A list containing:

  • cluster - cluster id of each sample.

  • cut - indices of the optimal cut points.
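As a sketch of how the returned cluster ids might be used, the snippet below summarizes each cluster in base R. The helper summarize_clusters() is hypothetical and not part of oneclust, and the list res is a hand-made stand-in for a oneclust() result that uses only the documented cluster element.

```r
# summarize_clusters() is a hypothetical helper, not part of oneclust.
summarize_clusters <- function(x, cluster) {
  # Group the samples by their cluster id
  groups <- split(x, cluster)
  data.frame(
    cluster = as.integer(names(groups)),
    n = lengths(groups),
    min = sapply(groups, min),
    max = sapply(groups, max),
    row.names = NULL
  )
}

# Hand-made stand-in for a oneclust() result (assumed shape: one id per sample)
x <- c(0.1, 5.2, 0.3, 5.0, 9.8)
res <- list(cluster = c(1, 2, 1, 2, 3))
summarize_clusters(x, res$cluster)
```

Because the cluster ids line up with the input samples, the same pattern should work with the list returned by oneclust() in place of res.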

References

Fisher, Walter D. 1958. On Grouping for Maximum Homogeneity. Journal of the American Statistical Association 53 (284): 789–98.

Examples

set.seed(42)
x <- sample(c(
  rnorm(50, sd = 0.2),
  rnorm(50, mean = 1, sd = 0.3),
  rnorm(100, mean = -1, sd = 0.25)
))
oneclust(x, 3)
#> $cluster
#>   [1] 3 1 3 2 1 1 1 3 2 3 2 2 3 1 1 1 1 1 2 1 1 1 1 1 2 3 2 2 1 1 1 2 1 1 1 3 1
#>  [38] 1 3 1 3 2 1 1 3 2 3 2 1 1 3 3 1 2 3 3 1 1 1 1 3 3 1 1 1 1 1 3 2 2 2 2 2 1
#>  [75] 1 2 3 2 1 2 1 3 2 3 1 2 3 1 3 1 1 2 1 1 2 3 3 1 2 3 2 3 1 1 2 1 3 1 1 1 1
#> [112] 3 1 1 1 1 1 3 1 2 2 1 1 2 1 1 2 2 2 1 2 1 2 1 3 2 2 1 3 3 2 2 2 1 1 3 1 1
#> [149] 3 1 2 3 2 3 1 3 1 2 1 1 2 3 1 2 2 3 2 1 1 3 3 1 1 1 1 3 1 3 1 3 1 2 3 2 1
#> [186] 3 1 1 1 1 1 1 1 1 1 2 3 3 1 1
#> 
#> $cut
#> [1]   1 101 152
#>
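
For readers curious about the idea behind the method, the following is a minimal base R sketch of a Fisher-style dynamic program. It splits the sorted data into k contiguous groups so that the total within-group sum of squares is smallest. It is an unweighted illustration, not the package's implementation. The function fisher_sketch() is hypothetical, and its return shape (cluster ids plus the start index of each group in the sorted data) is an assumption modelled on the output shown above.

```r
# fisher_sketch() is a hypothetical, unweighted illustration; it is not oneclust().
fisher_sketch <- function(x, k) {
  ord <- order(x)
  xs <- x[ord]
  n <- length(xs)
  stopifnot(k >= 1, k <= n)

  # Prefix sums give the sum of squared deviations of xs[i..j] in O(1)
  s1 <- c(0, cumsum(xs))
  s2 <- c(0, cumsum(xs^2))
  ssq <- function(i, j) {
    s <- s1[j + 1] - s1[i]
    (s2[j + 1] - s2[i]) - s^2 / (j - i + 1)
  }

  # cost[m, j]: smallest total SSQ when xs[1..j] is split into m groups
  # start[m, j]: start index of the last group in that optimal split
  cost <- matrix(Inf, k, n)
  start <- matrix(1L, k, n)
  for (j in 1:n) cost[1, j] <- ssq(1, j)
  if (k > 1) {
    for (m in 2:k) {
      for (j in m:n) {
        for (i in m:j) {
          cand <- cost[m - 1, i - 1] + ssq(i, j)
          if (cand < cost[m, j]) {
            cost[m, j] <- cand
            start[m, j] <- i
          }
        }
      }
    }
  }

  # Backtrack from the last group to recover each group's start index
  cut <- integer(k)
  j <- n
  for (m in k:1) {
    cut[m] <- start[m, j]
    j <- cut[m] - 1L
  }

  # Map group ids from sorted positions back to the original order
  cluster <- integer(n)
  cluster[ord] <- findInterval(seq_len(n), cut)
  list(cluster = cluster, cut = cut)
}

res <- fisher_sketch(c(10, 0.1, 5, 0, 5.1, 10.2, 0.2), 3)
res$cluster
res$cut
```

The triple loop costs on the order of k * n^2 evaluations, which is fine for a sketch but slow for long vectors.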