Computing the Rao's distance between negative binomial distributions. Application to Exploratory Data Analysis
Preprint, Working Paper. Year: 2019

Claude Manté

Abstract

The statistical analysis of counts of living organisms provides information about the collective behavior of species (schooling, habitat preference, etc.), possibly depending on their biological characteristics (growth rate, reproductive power, survival rate, etc.). This task can be carried out in a non-parametric setting, but parametric distributions, such as the negative binomial (NB) distributions studied here, are also very useful for modeling population abundance. Nevertheless, the parametric approach is ill-suited from an exploratory point of view, because the visual distance between parameters is irrelevant. In contrast, by considering the Riemannian manifold $NB(D_R)$ of NB distributions equipped with the Rao metric $D_R$, one can compute intrinsic distances between species, which can be considered absolute. Unfortunately, computing this distance requires solving a second-order nonlinear differential equation, whose solution cannot always be found in an acceptable length of time with sufficient precision. While Manté and Kidé [1] proposed numerical remedies to this problem, we propose a geometrical one, based on Poisson approximation. It consists in replacing A and/or B, the distributions whose distance is sought, by "equivalent" better-suited distribution(s) before computing the distance, insofar as possible. The proposed method is illustrated by displaying distributions of counts of marine species: these counts having been fitted by NB distributions, we compute the distance table $\Delta$ between species and represent $\Delta$ through multidimensional scaling (MDS).

Keywords: Poisson approximation, Multidimensional scaling

Notations

Consider a Riemannian manifold $M$ and a parametric curve $\alpha : [a, b] \to M$. Its first derivative will be denoted $\dot{\alpha}$. A geodesic curve $\gamma$ connecting two points $p$ and $q$ of $M$ will be denoted p q, and p s ⊕ s q will denote the broken geodesic [2] connecting $p$ to $q$ with a stopover at $s$.
We will also consider, for any $\theta \in M$, the local norm $\|V\|_g(\theta)$ associated with the metric $g$ on the tangent space $T_\theta M$:

$$\forall V \in T_\theta M, \quad \|V\|_g(\theta) := \sqrt{V^t \cdot g(\theta) \cdot V}. \tag{1}$$

The length of a curve $\alpha$ traced on $M$ will be denoted $\Lambda(\alpha)$. A parametric probability distribution $L_i$ will be identified with its coordinates with respect to some chosen parametrization; for instance, we will write $L_i \equiv (\phi_i, \mu_i)$ for some negative binomial distribution. In addition, $\mathbb{R}_+^* := \,]0, +\infty[$, and $\|M\|_F$ will denote the Frobenius norm of the matrix $M$; logical propositions will be combined using the classical connectors $\vee$ (or) and $\wedge$ (and).
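To make the local norm (1) and the curve length $\Lambda(\alpha)$ concrete, here is a minimal numerical sketch. It rests on assumptions that are not part of the excerpt above: the length is taken to be the usual Riemannian one, $\Lambda(\alpha) = \int_a^b \|\dot{\alpha}(t)\|_g(\alpha(t))\,dt$, and the example uses the one-parameter Poisson family (Fisher information $1/\lambda$) instead of the NB manifold, because its Rao distance has the well-known closed form $2\,|\sqrt{\lambda_1} - \sqrt{\lambda_2}|$. The function names `local_norm` and `curve_length` are hypothetical and are not the author's code.

```python
import numpy as np


def local_norm(V, g_theta):
    """Local norm of a tangent vector V for the metric matrix g(theta), as in (1)."""
    V = np.atleast_1d(np.asarray(V, dtype=float))
    G = np.atleast_2d(np.asarray(g_theta, dtype=float))
    return float(np.sqrt(V @ G @ V))


def curve_length(alpha, metric, a, b, n=2001):
    """Approximate the length of the curve alpha on [a, b].

    Assumption: length = integral of the local norm of the velocity,
    evaluated with finite differences and the trapezoidal rule.
    """
    t = np.linspace(a, b, n)
    pts = np.array([np.atleast_1d(alpha(ti)) for ti in t])  # shape (n, d)
    vel = np.gradient(pts, t, axis=0)                       # finite-difference velocity
    speeds = np.array([local_norm(v, metric(p)) for p, v in zip(pts, vel)])
    return float(np.sum(0.5 * (speeds[1:] + speeds[:-1]) * np.diff(t)))


# Illustration on the Poisson family (not the NB manifold): g(lambda) = 1 / lambda.
def poisson_metric(p):
    return np.array([[1.0 / p[0]]])


def path(t):
    # Moves from lambda = 1 to lambda = 4 as t goes from 0 to 1.
    return np.array([1.0 + 3.0 * t])


length = curve_length(path, poisson_metric, 0.0, 1.0)
closed_form = 2.0 * abs(np.sqrt(4.0) - np.sqrt(1.0))
print(length, closed_form)  # the two values should agree closely
```

On a one-dimensional manifold every monotone path is a reparametrized geodesic, which is why the numerical length can be checked against the closed-form distance here. For NB distributions no such shortcut is available, since the geodesic itself has to be obtained by solving the differential equation mentioned in the abstract.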
Main file
JMVA_MantePartIMain.pdf (676.8 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02130199, version 1 (15-05-2019)

Identifiers

  • HAL Id: hal-02130199, version 1

Cite

Claude Manté. Computing the Rao's distance between negative binomial distributions. Application to Exploratory Data Analysis. 2019. ⟨hal-02130199⟩
101 views
168 downloads
