
Simulate the levels and their sizes in a high-cardinality feature

Usage

sim_postcode_levels(nlevels = 100L, seed = 1001)

Arguments

nlevels

Number of levels (distinct postal codes) to generate.

seed

Random seed, so that repeated calls give the same levels and sizes.

Value

A data frame with one row per level and two columns: size (the size of the level) and postcode (the postal code label).

Note

The code is derived from the example described in the "rare levels" vignette in the vtreat package.

Examples

df_levels <- sim_postcode_levels(nlevels = 500, seed = 42)
head(df_levels)
#>    size postcode
#> 1 15756   z04113
#> 2  2274   z04578
#> 3  5751   z02580
#> 4  7532   z01457
#> 5  5993   z03546
#> 6  3597   z04056
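
For readers without the package installed, the following is a minimal sketch of this kind of simulation, not the package's actual implementation. The function name sketch_postcode_levels, the log-normal size distribution and its parameters, and the 5-digit code range are all assumptions made for illustration; only the output shape (a size column and a "z"-prefixed postcode column) is taken from the example above.

```r
# Hypothetical stand-in for sim_postcode_levels(); every distributional
# choice below is an assumption, not the package's code.
sketch_postcode_levels <- function(nlevels = 100L, seed = 1001) {
  set.seed(seed)

  # Assumed: heavy-tailed (log-normal) sizes, so a few levels are large
  # and many are small; pmax() keeps every size at least 1.
  size <- pmax(1, round(rlnorm(nlevels, meanlog = log(4000), sdlog = 1)))

  # Assumed: labels are "z" plus a zero-padded 5-digit number, drawn
  # without replacement so that each level is unique.
  code <- sample.int(99999L, size = nlevels, replace = FALSE)
  postcode <- paste0("z", formatC(code, width = 5, flag = "0"))

  data.frame(size = size, postcode = postcode, stringsAsFactors = FALSE)
}

df_sketch <- sketch_postcode_levels(nlevels = 500, seed = 42)
head(df_sketch)
```

Because the seed is set inside the function, calling it twice with the same arguments returns identical data frames.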