The goal of the package {elbow} is to implement the Elbow (or knee of a curve) method to detect the inflection point of a concave curve. More information on this method can be found below and on Wikipedia.


Getting started

We can install the package {elbow} from GitHub with:

devtools::install_github("ahasverus/elbow")

library(elbow)


Let’s take a look at the package content:

ls("package:elbow")
##  [1] "elbow"    "profiles"

The package contains two elements:

  • elbow() - the only function of the package, used to detect the inflection point;
  • profiles - a reproducible example dataset.


Reproducible example

First, we will load the profiles dataset and print a summary of its content:

data(profiles)

str(profiles)
##  'data.frame':   31 obs. of  5 variables:
##   $ x                   : int  0 1 2 3 4 5 6 7 8 9 ...
##   $ concave_down_pos_slo: num  0 0.5 0.75 0.875 0.938 ...
##   $ concave_down_neg_slo: num  1 1 1 1 1 ...
##   $ concave_up_pos_slo  : num  9.31e-10 1.86e-09 3.73e-09 7.45e-09 1.49e-08 ...
##   $ concave_up_neg_slo  : num  1 0.5 0.25 0.125 0.0625 ...

head(profiles)
##    x concave_down_pos_slo concave_down_neg_slo concave_up_pos_slo
##  1 0              0.00000                    1       9.313226e-10
##  2 1              0.50000                    1       1.862645e-09
##  3 2              0.75000                    1       3.725290e-09
##  4 3              0.87500                    1       7.450581e-09
##  5 4              0.93750                    1       1.490116e-08
##  6 5              0.96875                    1       2.980232e-08
##    concave_up_neg_slo
##  1            1.00000
##  2            0.50000
##  3            0.25000
##  4            0.12500
##  5            0.06250
##  6            0.03125


The profiles dataset is a data frame with 31 rows and the following five variables:

Variable name          Description
x                      A sequence from 0 to 30 (x-Axis)
concave_down_pos_slo   1st profile - Concave down with positive slope
concave_down_neg_slo   2nd profile - Concave down with negative slope
concave_up_pos_slo     3rd profile - Concave up with positive slope
concave_up_neg_slo     4th profile - Concave up with negative slope

Let’s plot these four concave curves along the x sequence:
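
A minimal base R sketch of such a plot is given here. To stay self-contained it does not load the package: the four curves are rebuilt from formulas that are assumptions, chosen only because they reproduce the values printed by head(profiles) above. The panel letters A to D simply follow the column order, which agrees with the way the text refers to profiles A and D.

```r
# Hypothetical reconstruction of the four profiles. The formulas are
# assumptions that match the values printed by head(profiles); they are
# not taken from the package source.
x <- 0:30
curves <- data.frame(
  concave_down_pos_slo = 1 - 0.5^x,
  concave_down_neg_slo = 1 - 2^(x - 30),
  concave_up_pos_slo   = 2^(x - 30),
  concave_up_neg_slo   = 0.5^x
)

# One panel per profile, lettered A to D in column order
panels <- LETTERS[seq_along(curves)]
op <- par(mfrow = c(2, 2))
for (i in seq_along(curves)) {
  plot(x, curves[[i]], type = "b", xlab = "x", ylab = "y",
       main = paste0(panels[i], " - ", names(curves)[i]))
}
par(op)  # restore the previous graphical parameters
```

With the package installed, curves can be replaced by profiles[ , -1] to plot the real data.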



NB. In cluster analysis or principal component analysis, we frequently encounter profiles A (concave down with a positive slope: a quantity that increases with the number of groups) and D (concave up with a negative slope: a quantity that decreases as the number of groups increases).


The Elbow algorithm

We are going to detect the inflection point of profile A (concave down with a positive slope). In cluster analysis, the x-Axis may represent the number of groups and the y-Axis the explained variance (R²).

The idea behind the Elbow method is to maximize a quantity (the benefits) while keeping the costs (the number of groups) as low as possible. Consequently, the inflection point is the point beyond which the benefit of adding one more group becomes lower than its cost.
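
One way to read this trade-off numerically is sketched below. It is an interpretation, not the package's code: the cost of each step along x is assumed to be constant and equal to the total rise of the curve divided by the number of steps, and the curve y is an assumed stand-in that matches the printed values of concave_down_pos_slo.

```r
# Assumed stand-in for profile A (matches the printed values)
x <- 0:30
y <- 1 - 0.5^x

# Benefit of moving from x[i] to x[i + 1]
gain <- diff(y)

# Assumption: every step has the same cost, the total rise spread evenly
cost <- (max(y) - y[1]) / (length(x) - 1)

# First step whose benefit no longer covers its cost
first_unprofitable <- which(gain < cost)[1]

# The x value just before that step is the last one worth reaching
x[first_unprofitable]
```

Under these assumptions the step from x = 4 to x = 5 is the first one that gains less than it costs, so the last x worth reaching is 4.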

NB. For profile D (concave up with a negative slope), the objective is to minimize the quantity while keeping the costs as low as possible.


How does it work?

Starting from the profile (1), we draw a straight line that increases at a constant rate along the x-Axis until it reaches the maximum value on the y-Axis (2). Then, for each value on the x-Axis, we compute the difference between the two series (3) to obtain a new profile (4), the net benefits. In this case, the inflection point corresponds to the maximum value of this new profile (red dot).
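
These steps can be sketched in a few lines of base R. This is an illustration for profile A only, not the source code of elbow(): the curve y is an assumed stand-in that matches the printed values of concave_down_pos_slo, and the straight line is assumed to start at the first y value.

```r
# (1) Assumed stand-in for profile A (matches the printed values)
x <- 0:30
y <- 1 - 0.5^x

# (2) Straight line rising at a constant rate from the first y value
#     to the maximum y value (starting point is an assumption)
constant <- seq(from = y[1], to = max(y), length.out = length(x))

# (3) and (4) Difference between the two series: the net benefits
benefits <- y - constant

# The selected x is the one where the net benefits are highest
x_selected <- x[which.max(benefits)]
print(x_selected)
```

For this curve the sketch selects x = 4, and its constant and benefits vectors, rounded to three decimals, agree with the columns of the same names in the ipoint$data output.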


Usage of elbow()

This algorithm is implemented in the function elbow::elbow(), which takes two arguments:

  • data - a two-column data frame (x and y, in that order);
  • plot - a logical. If TRUE (the default), the curves are plotted.


Let’s apply this function to detect the inflection point of the profile A.

dopo <- profiles[ , c("x", "concave_down_pos_slo")]

ipoint <- elbow(data = dopo)


What is the returned object?

class(ipoint)
##  [1] "list"

names(ipoint)
##  [1] "x_selected" "data"

print(ipoint)
##  $x_selected
##  [1] 4
##  
##  $data
##      x concave_down_pos_slo constant benefits
##  1   0            0.0000000    0.000    0.000
##  2   1            0.5000000    0.033    0.467
##  3   2            0.7500000    0.067    0.683
##  4   3            0.8750000    0.100    0.775
##  5   4            0.9375000    0.133    0.804
##  6   5            0.9687500    0.167    0.802
##  7   6            0.9843750    0.200    0.784
##  8   7            0.9921875    0.233    0.759
##  9   8            0.9960938    0.267    0.729
##  10  9            0.9980469    0.300    0.698
##  11 10            0.9990234    0.333    0.666
##  12 11            0.9995117    0.367    0.633
##  13 12            0.9997559    0.400    0.600
##  14 13            0.9998779    0.433    0.567
##  15 14            0.9999390    0.467    0.533
##  16 15            0.9999695    0.500    0.500
##  17 16            0.9999847    0.533    0.467
##  18 17            0.9999924    0.567    0.433
##  19 18            0.9999962    0.600    0.400
##  20 19            0.9999981    0.633    0.367
##  21 20            0.9999990    0.667    0.333
##  22 21            0.9999995    0.700    0.300
##  23 22            0.9999998    0.733    0.267
##  24 23            0.9999999    0.767    0.233
##  25 24            0.9999999    0.800    0.200
##  26 25            1.0000000    0.833    0.167
##  27 26            1.0000000    0.867    0.133
##  28 27            1.0000000    0.900    0.100
##  29 28            1.0000000    0.933    0.067
##  30 29            1.0000000    0.967    0.033
##  31 30            1.0000000    1.000    0.000

The element ipoint$data is returned so that the graphic can be reproduced. The element we are interested in is ipoint$x_selected, which gives the coordinate on the x-Axis (not the row position in the data frame) corresponding to the inflection point.

ipoint$x_selected
##  [1] 4

ipoint[["data"]][ipoint[["data"]][ , "x"] == ipoint[["x_selected"]], 1:2]
##    x concave_down_pos_slo
##  5 4               0.9375


What about the other profiles?

profs <- colnames(profiles)[-1]

par(mfrow = c(2, 2))

for (prof in profs) {
  elbow(profiles[ , c("x", prof)])
}


Code of Conduct

Please note that the elbow project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.


Last updated: 2020/04/27