Title: | Kantorovich Distance Between Probability Measures |
---|---|
Description: | Computes the Kantorovich distance between two probability measures on a finite set. The Kantorovich distance is also known as the Monge-Kantorovich distance or the first Wasserstein distance. |
Authors: | Stéphane Laurent |
Maintainer: | Stéphane Laurent <[email protected]> |
License: | GPL-3 |
Version: | 3.2.0 |
Built: | 2024-10-31 21:13:05 UTC |
Source: | https://github.com/stla/kantorovich |
Computes the Kantorovich distance between two probability measures on a finite set, also known as the earth mover's distance. The Kantorovich distance is not a "unique" distance: it is defined relative to a given distance on the two finite sets (which are generally equal). Note that the default distance is the 0-1 distance, and with this choice the Kantorovich computation is totally useless (see the vignette). Computing the Kantorovich distance is a linear programming problem, and several methods are provided in the package. In particular, an exact method is available when both the probability weights and the distances are rational numbers. A benchmark suggests that the faster methods are those using the 'CVXR' package.
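The remark about the 0-1 distance can be checked by hand: with that distance, the Kantorovich distance is known to coincide with the total variation distance, which has a closed form and needs no linear programming. The following base R sketch does not call the package, and its variable names are illustrative only. It evaluates that closed form for two measures on a three-point set.

```r
# Two probability measures on the same three-point set
mu <- c(1/7, 2/7, 4/7)
nu <- c(1/4, 1/4, 1/2)

# Total variation distance, written in two equivalent ways
tv_overlap <- 1 - sum(pmin(mu, nu))   # one minus the common mass
tv_l1      <- sum(abs(mu - nu)) / 2   # half of the L1 distance

# Both expressions equal 3/28 for these measures
c(tv_overlap, tv_l1)
```

With any other distance on the set, no such shortcut is available in general, which is where the linear programming methods become useful.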
To learn more, start with the vignettes: browseVignettes(package="kantorovich").
If you encounter a bug, or if you have a suggestion to improve the package, please file an issue on the GitHub repo https://github.com/stla/kantorovich.
Package: | kantorovich |
Type: | Package |
Version: | 3.1.0 |
Date: | 2023-08-22 |
License: | GPL-3 |
Stéphane Laurent
Compute the distances at the extreme joinings.
edistances(mu, nu, dist = NULL, ...)
mu |
(row margins) probability measure in numeric or bigq/character mode |
nu |
(column margins) probability measure in numeric or bigq/character mode |
dist |
function or matrix, the distance to be minimized on average.
If |
... |
arguments passed to |
A list with two components: the extreme joinings in a list and the distances in a vector.
This function, called by kantorovich, is mainly intended for internal use.
Return extreme joinings between mu and nu.
ejoinings(mu, nu, zeros = FALSE)
mu |
(row margins) probability measure in numeric or bigq/character mode |
nu |
(column margins) probability measure in numeric or bigq/character mode |
zeros |
logical; in case when |
A list containing the extreme joinings (matrices).
mu <- nu <- c(0.5, 0.5)
ejoinings(mu, nu)
# use exact arithmetic
library(gmp)
mu <- nu <- as.bigq(c(0.5,0.5))
ejoinings(mu, nu)
# different lengths example
mu <- setNames(as.bigq(c(1,2,4), 7), c("a", "b", "c"))
nu <- setNames(as.bigq(c(3,1), 4), c("b", "c"))
ejoinings(mu, nu)
Compute the Kantorovich distance between two probability measures on a finite set.
kantorovich(mu, nu, dist = NULL, details = FALSE, ...)
mu |
(row margins) probability measure in numeric or bigq/character mode |
nu |
(column margins) probability measure in numeric or bigq/character mode |
dist |
function or matrix, the distance to be minimized on average;
if |
details |
prints the joinings achieving the Kantorovich distance and
returns them in the |
... |
arguments passed to |
The function first computes all the extreme joinings of mu and nu, then evaluates the average distance for each of them, and finally returns the minimal one.
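The procedure described above can be followed by hand in the smallest case. On a two-point set, a joining with margins mu and nu is determined by its top-left entry alone, so the extreme joinings are obtained at the two ends of that entry's admissible interval. The base R sketch below is an illustration under that assumption and not the package's implementation; the helper extreme_joinings_2x2 is hypothetical and not part of the package.

```r
# Hypothetical helper: extreme joinings of two measures on a two-point set.
# Rows carry the margin mu, columns carry the margin nu.
extreme_joinings_2x2 <- function(mu, nu) {
  lo <- max(0, mu[1] + nu[1] - 1)   # smallest admissible top-left entry
  hi <- min(mu[1], nu[1])           # largest admissible top-left entry
  lapply(unique(c(lo, hi)), function(t) {
    # entries given column by column: J[1,1], J[2,1], J[1,2], J[2,2]
    matrix(c(t, nu[1] - t, mu[1] - t, 1 - mu[1] - nu[1] + t), nrow = 2)
  })
}

mu <- c(0.5, 0.5)
nu <- c(0.25, 0.75)
D  <- 1 - diag(2)                   # 0-1 distance as a matrix

joinings <- extreme_joinings_2x2(mu, nu)
# average distance under each extreme joining
avg  <- vapply(joinings, function(J) sum(J * D), numeric(1))
# the smallest average distance is the Kantorovich distance
kant <- min(avg)
kant
```

Because the average distance is linear in the joining, its minimum over all joinings is attained at an extreme one, which is why inspecting only the extreme joinings is enough.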
The Kantorovich distance between mu and nu.
mu <- c(1/7, 2/7, 4/7)
nu <- c(1/4, 1/4, 1/2)
kantorovich(mu, nu)
library(gmp)
mu <- as.bigq(c(1,2,4), 7)
nu <- as.bigq(c(1,1,1), c(4,4,2))
kantorovich(mu, nu)
mu <- c("1/7", "2/7", "4/7")
nu <- c("1/4", "1/4", "1/2")
kantorovich(mu, nu, details=TRUE)
Kantorovich distance using the CVXR package
kantorovich_CVX(mu, nu, dist, solution = FALSE, stop_if_fail = TRUE, solver = "ECOS", ...)
mu |
(row margins) probability measure in numeric mode |
nu |
(column margins) probability measure in numeric mode |
dist |
matrix defining the distance to be minimized on average |
solution |
logical; if |
stop_if_fail |
logical; if |
solver |
the |
... |
other arguments passed to |
x <- c(1.5, 2, -3)
mu <- c(1/7, 2/7, 4/7)
y <- c(4, 3.5, 0, -2)
nu <- c(1/4, 1/4, 1/4, 1/4)
M <- outer(x, y, FUN = function(x, y) abs(x - y))
kantorovich_CVX(mu, nu, dist = M)
Kantorovich distance using the Rglpk package
kantorovich_glpk(mu, nu, dist, solution = FALSE, stop_if_fail = TRUE, ...)
mu |
(row margins) probability measure in numeric mode |
nu |
(column margins) probability measure in numeric mode |
dist |
matrix defining the distance to be minimized on average |
solution |
logical; if |
stop_if_fail |
logical; if |
... |
arguments passed to |
x <- c(1.5, 2, -3)
mu <- c(1/7, 2/7, 4/7)
y <- c(4, 3.5, 0, -2)
nu <- c(1/4, 1/4, 1/4, 1/4)
M <- outer(x, y, FUN = function(x, y) abs(x - y))
kantorovich_glpk(mu, nu, dist = M)
Kantorovich distance using the lpSolve package
kantorovich_lp(mu, nu, dist, solution = FALSE, lp.object = FALSE, ...)
mu |
(row margins) probability measure in numeric mode |
nu |
(column margins) probability measure in numeric mode |
dist |
matrix defining the distance to be minimized on average |
solution |
logical, to use only if |
lp.object |
logical, if |
... |
arguments passed to |
x <- c(1.5, 2, -3)
mu <- c(1/7, 2/7, 4/7)
y <- c(4, 3.5, 0, -2)
nu <- c(1/4, 1/4, 1/4, 1/4)
M <- outer(x, y, FUN = function(x, y) abs(x - y))
kantorovich_lp(mu, nu, dist = M)
Kantorovich distance using the ompr package
kantorovich_ompr(mu, nu, dist, solution = FALSE, stop_if_fail = TRUE)
mu |
(row margins) probability measure in numeric mode |
nu |
(column margins) probability measure in numeric mode |
dist |
matrix defining the distance to be minimized on average |
solution |
logical; if |
stop_if_fail |
logical; if |
The problem is solved with the glpk solver.
x <- c(1.5, 2, -3)
mu <- c(1/7, 2/7, 4/7)
y <- c(-4, 3.5, 0)
nu <- c(1/4, 1/4, 1/2)
M <- outer(x, y, FUN = function(x, y) abs(x - y))
kantorovich_ompr(mu, nu, dist = M)
Names for bigq vectors
## S3 method for class 'bigq'
names(x)
x |
a |
the names of x