--- title: "The 'gfiExtremes' package" output: rmarkdown::html_vignette: css: vignette.css vignette: > %\VignetteIndexEntry{The 'gfiExtremes' package} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} editor_options: chunk_output_type: console --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 6, fig.height = 5 ) ``` The 'gfiExtremes' package offers two main functions: `gfigpd1` and `gfigpd2`. Each of them generates simulations from the fiducial distribution of a model involving the generalized Pareto distribution. The difference is that the threshold $\mu$ of the generalized Pareto distribution is assumed to be known by `gfigpd1`, whereas `gfigpd2` estimates this threshold. The algorithms are implemented in C++. They are translated from the original R code written by the authors of the reference paper. ### Note The package allows to run some MCMC chains in parallel. In the examples below, as well as in the examples of the package documentation, I set `nthreads = 2L` because of CRAN restrictions: CRAN does not allow more than two R processes in parallel and then it would not accept the package if a higher number of threads were set. ```{r setup, message=TRUE} library(gfiExtremes) ``` ## The model with a known threshold When the threshold $\mu$ is known, it is assumed that the data are independent realizations of a random variable $X$ which follows a generalized Pareto distribution $GP(\mu,\gamma,\sigma)$ conditionally to $X \geqslant \mu$. On the event $X < \mu$, no distributional assumption is made. Then the algorithm performed by the `gfigpd1` function produces some simulations of the fiducial distributions of $\gamma$, $\sigma$ and of the $100\beta\%$-quantiles of $X$ at the requested values of $\beta$. These are MCMC chains. For example, assume that $X$ follows the $GP(\mu,\gamma,\sigma)$ distribution (so $X < \mu$ never happens). ```{r gfigpd1} set.seed(666L) X <- rgpareto(200L, mu = 10, gamma = 0.3, sigma = 2) gf1 <- gfigpd1( X, beta = c(0.99, 0.995, 0.999), threshold = 10, iter = 10000L, nchains = 4L, nthreads = 2L ) ``` The output of `gfigpd1` is a R object ready for analysis with the 'coda' package. In particular, it has a `summary` method. ```{r summary_gfigpd1} summary(gf1) # compare with the true quantiles: qgpareto(c(0.99, 0.995, 0.999), mu = 10, gamma = 0.3, sigma = 2) ``` Every parameter is very well estimated by the median of its fiducial distribution. We can get the shortest fiducial confidence intervals with the 'coda' function `HPDinterval`: ```{r hpd_gfigpd1} HPDinterval(joinMCMCchains(gf1)) ``` ## The model with an unknown threshold When the threshold $\mu$ is unknown, it is also assumed that the data are independent realizations of a random variable $X$ which follows a generalized Pareto distribution $GP(\mu,\gamma,\sigma)$ conditionally to $X \geqslant \mu$, but there are additional assumptions. These additional assumptions have no meaningful interpretation but this is not important in order to estimate the quantiles of $X$: it is possible that the parameters $\gamma$ and $\sigma$ cannot be estimated (it is always possible if $X$ strictly follows the unrealistic assumptions of the model) but $\mu$ can always be estimated and the fiducial distributions of the quantiles are available. Let's assume for example that $X$ follows a log-normal distribution: ```{r gfigpd2} set.seed(666L) X <- rlnorm(400L, meanlog = 1) gf2 <- gfigpd2( X, beta = c(0.99, 0.995, 0.999), iter = 10000L, burnin = 10000L, nchains = 4L, nthreads = 2L ) summary(gf2) # compare with the true quantiles: qlnorm(c(0.99, 0.995, 0.999), meanlog = 1) ``` As you can see, the inference on the quantiles is good.