23º SINAPE - Simpósio Nacional de Probabilidade e Estatística

Paper Details


Title

AUGMENTATION OF SAMPLES FROM CORRELATED BIVARIATE NORMAL DISTRIBUTION VIA ARCHETYPES

Abstract

Archetypes can be defined as extreme elements that represent a sample or population well. The multivariate technique called Archetypal Analysis (AA) finds and selects archetypes, which are convex combinations of the data. AA can be applied in several areas, and the archetypes can serve different purposes, including increasing the sample size. When a data set is incomplete or smaller than the size required to keep the error of the inference procedure small, one option is to augment the sample. Data augmentation consists of introducing unobserved data or latent variables. Multivariate sample augmentation should take into account the probability distributions of the variables and the correlation between them. The aim of this work was to evaluate the augmentation of multivariate data through its archetypes. Starting from a bivariate sample of normally distributed random variables, the sample was augmented using three archetype-based algorithms (A1, A2 and A3), a gold standard and a control. A simulation study was carried out with six scenarios, combining three correlations between the variables (0.2, 0.5 and 0.9) with two proportions of increase (10% and 30% of n). The three algorithms gave similar results: they allowed the sample size to be increased by 10% without changing the probability distribution or the estimates of its parameters. Therefore, it is possible to augment a bivariate normal sample via archetypes.
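
The simulation design described above can be illustrated with a minimal Python sketch. It is not the authors' code and does not implement A1, A2 or A3, which the abstract does not specify. It only enumerates the six scenarios, draws a correlated bivariate normal sample, and shows what a convex combination of data points looks like. The sample size `N = 100`, the seed, and all function names are assumptions made for illustration.

```python
import itertools
import math
import random


def simulate_bivariate_normal(n, rho, rng):
    """Draw n pairs from a standard bivariate normal with correlation rho."""
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # y = rho*z1 + sqrt(1 - rho^2)*z2 has unit variance and Corr(x, y) = rho.
        pairs.append((z1, rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))
    return pairs


def convex_combination(points, weights):
    """Weighted average of 2-D points; weights are non-negative and sum to 1."""
    if any(w < 0 for w in weights) or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    x = sum(w * p[0] for w, p in zip(weights, points))
    y = sum(w * p[1] for w, p in zip(weights, points))
    return (x, y)


def sample_correlation(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(p[0] for p in pairs) / n
    my = sum(p[1] for p in pairs) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pairs)
    sxx = sum((p[0] - mx) ** 2 for p in pairs)
    syy = sum((p[1] - my) ** 2 for p in pairs)
    return sxy / math.sqrt(sxx * syy)


# Scenario grid from the abstract: 3 correlations x 2 proportions = 6 scenarios.
CORRELATIONS = (0.2, 0.5, 0.9)
PROPORTIONS = (0.10, 0.30)
SCENARIOS = list(itertools.product(CORRELATIONS, PROPORTIONS))

N = 100  # hypothetical sample size; the abstract does not state n
rng = random.Random(0)  # arbitrary seed, for reproducibility only

for rho, prop in SCENARIOS:
    sample = simulate_bivariate_normal(N, rho, rng)
    n_new = round(prop * N)  # number of points an augmentation step would add
    r_hat = sample_correlation(sample)
    print(f"rho={rho}, add {n_new} points, sample correlation={r_hat:.2f}")
```

Comparing the estimated correlation (and the means and variances) before and after augmentation is one way to check whether an augmentation procedure preserves the parameter estimates, which is the criterion the abstract reports.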

Keywords

Archetypal Analysis, Augmented data, Multivariate statistics, Simulation.

Area

Computational Statistics

Authors

Pórtya Piscitelli Cavalcanti, Carlos Tadeu dos Santos Dias, Eric Batista Ferreira