Computing the Rao's distance between negative binomial distributions. Application to Exploratory Data Analysis

Claude Manté

Preprint, Working Paper. Year: 2019

Role: Author
PersonId : 19852
IdHAL : claude-mante
ORCID : 0000-0002-7268-9789

Institut méditerranéen d'océanologie

Abstract

The statistical analysis of counts of living organisms provides information about the collective behavior of species (schooling, habitat preference, etc.), possibly depending on their biological characteristics (growth rate, reproductive power, survival rate, etc.). This task can be carried out in a non-parametric setting, but parametric distributions, such as the negative binomial (NB) distributions studied here, are also very useful for modeling population abundance. Nevertheless, the parametric approach is ill-suited from an exploratory point of view, because the visual distance between parameters is irrelevant. By contrast, considering the Riemannian manifold $NB(D_R)$ of NB distributions equipped with the Rao metric $D_R$, one can compute intrinsic distances between species, which can be considered absolute. Unfortunately, computing this distance requires solving a second-order nonlinear differential equation, whose solution cannot always be found in an acceptable length of time with enough precision. While Manté and Kidé [1] proposed numerical remedies to these problems, we propose a geometrical one, based on Poisson approximation. It consists in replacing the distributions A and/or B to be compared by "equivalent", better-suited distribution(s) before computing the distance, insofar as possible. The proposed method is illustrated by displaying distributions of counts of marine species: these counts having been fitted by NB distributions, we compute the distance table $\Delta$ between species and represent $\Delta$ through multidimensional scaling (MDS).

Keywords: Poisson approximation, Multidimensional scaling

Notations

Consider a Riemannian manifold $M$ and a parametric curve $\alpha : [a, b] \to M$. Its first derivative will be denoted $\dot{\alpha}$. A geodesic curve $\gamma$ connecting two points $p$ and $q$ of $M$ will be denoted $p \curvearrowright q$, and $p \curvearrowright s \oplus s \curvearrowright q$ will denote the broken geodesic [2] connecting $p$ to $q$ with a stopover at $s$.

We will also consider, for any $\theta \in M$, the local norm $\|V\|_g(\theta)$ associated with the metric $g$ on the tangent space $T_\theta M$:

$$\forall V \in T_\theta M, \quad \|V\|_g(\theta) := \sqrt{V^t . g(\theta) . V}. \tag{1}$$

The length of a curve $\alpha$ traced on $M$ will be denoted $\Lambda(\alpha)$. A parametric probability distribution $\mathcal{L}_i$ will be identified with its coordinates with respect to some chosen parametrization; for instance, we will write $\mathcal{L}_i \equiv (\phi_i, \mu_i)$ for some negative binomial distribution. In addition, $\mathbb{R}_+^* := \, ]0, +\infty[$, and $\|M\|_F$ will denote the Frobenius norm of the matrix $M$; logical propositions will be combined by using the classical connectors $\vee$ (or) and $\wedge$ (and).
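The local norm and the curve length $\Lambda(\alpha)$ introduced in the notations can be illustrated numerically. The sketch below is only a minimal illustration under stated assumptions: it takes the local norm to be the square root of the quadratic form $V^t . g(\theta) . V$, approximates $\Lambda(\alpha) = \int_a^b \|\dot{\alpha}(t)\|_g(\alpha(t))\,dt$ by a midpoint rule, and uses the Poincaré half-plane metric as a stand-in for $g$, because its lengths are known in closed form. It does not implement the Fisher information metric of the NB family, and the names `local_norm`, `curve_length` and `half_plane_metric` are hypothetical helpers, not taken from the paper.

```python
import numpy as np


def local_norm(g, theta, v):
    """Local norm ||v||_g(theta) = sqrt(v^t . g(theta) . v)."""
    metric = np.asarray(g(theta), dtype=float)
    v = np.asarray(v, dtype=float)
    return float(np.sqrt(v @ metric @ v))


def curve_length(g, alpha, a, b, n=2000):
    """Approximate the length of alpha on [a, b] under the metric g.

    Each small chord alpha(t_{k+1}) - alpha(t_k) stands for alpha'(t) dt,
    and its local norm is evaluated at the chord midpoint.
    """
    t = np.linspace(a, b, n + 1)
    points = np.array([alpha(ti) for ti in t])
    total = 0.0
    for k in range(n):
        midpoint = 0.5 * (points[k] + points[k + 1])
        chord = points[k + 1] - points[k]
        total += local_norm(g, midpoint, chord)
    return total


def half_plane_metric(theta):
    """Stand-in metric (assumption): Poincare half-plane, g(x, y) = I / y^2."""
    _, y = theta
    return np.eye(2) / y**2


def vertical(t):
    """Vertical curve t -> (0, exp(t)); its speed is 1 in this metric."""
    return np.array([0.0, np.exp(t)])


# On [0, 1] the exact hyperbolic length of this curve is 1.
length = curve_length(half_plane_metric, vertical, 0.0, 1.0)
print(length)
```

Replacing `half_plane_metric` with a function returning the metric matrix of another family at $\theta$ would reuse the same two helpers unchanged; computing a geodesic distance additionally requires finding the length-minimizing curve, which this sketch does not attempt.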

Keywords

Riemannian manifold, Negative binomial, Geodesics, Cut locus

Domains

Statistics [math.ST]

Main file

JMVA_MantePartIMain.pdf (676.8 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-02130199

Submitted on: Wednesday, 15 May 2019, 16:13:46

Last modified on: Thursday, 28 March 2024, 16:33:58

Dates and versions

hal-02130199 , version 1 (15-05-2019)

Identifiers

HAL Id: hal-02130199, version 1

Cite

Claude Manté. Computing the Rao's distance between negative binomial distributions. Application to Exploratory Data Analysis. 2019. ⟨hal-02130199⟩

Collections

IRD INSU UNIV-TLN CNRS UNIV-AMU MIO OSU-INSTITUT-PYTHEAS
