Cohort effect on discriminant rate: the case of greater flamingo (Phoenicopterus roseus) chicks sexed with morphological characters

We used a large dataset of greater flamingo chicks banded and measured at Camargue, France, to verify the applicability of discriminant function analysis (DFA) to sex this species. Males and females sexed genetically differed significantly in all of the morphological characters measured (body mass, tarsus and wing length), with males being significantly larger than females. Although the discriminant rate varied substantially from one year to another, we found that it increased with the sample size of genetically sexed individuals. Our results suggest that DFA does not provide an efficient tool to sex greater flamingo chicks, as these relationships are highly variable across years: the sex of a large number of individuals must be determined genetically every year to calibrate the DFA, which still provides an overall low accuracy in sex determination. Indeed, conditions during the breeding season can vary between years and can be considered proximate causes affecting the correct discriminant rate. Like previous studies, we recommend caution in dealing with discriminant equations computed from small datasets, and our simulation suggests that 325 genetically sexed individuals are needed to obtain 80 % correctly classified greater flamingo chicks.


Introduction
The ability to correctly sex marked birds is crucial to most behavioral or ecological studies (Greenwood, 1980; Andersson, 1994; Short and Balaban, 1994; Childress and Bennun, 2002; Barbraud et al., 2003) and for the management and conservation of species (Zavalaga and Paredes, 1997; Fernandez-Juricic et al., 2009). However, in species where the plumage is similar in both sexes, determining the sex of immature and adult individuals from external morphological characters alone is often difficult (Cuthill et al., 1999).
Sexing based on morphological measurements using multivariate statistical approaches (e.g., discriminant function analysis, DFA) is often a reasonable choice for quick and inexpensive but efficient sex identification in field studies on bird species presenting monomorphic plumage (Dechaume-Moncharmont et al., 2011). Indeed, the existence of a significant difference (even very small) between morphological measurements of males and females allows the discrimination between the sexes (Wilson, 1999; Donohue and Dufty, 2006; Moreno et al., 2007; Murphy, 2007; Ottvall and Gunnarsson, 2007; Cardoni et al., 2009). Diverse discriminant and logistic regression functions based on morphological measurements have been developed in many bird studies designed for widespread use by field researchers (Bosch, 1996; Balbontin et al., 2001; Bertellotti et al., 2002; Delvin et al., 2004; Figuerola et al., 2006; Hallgrimsson et al., 2011). However, different methods can be used to assess the reliability of these functions by examining the proportion of correctly classified males and females, such as resubstitution (Zwarts et al., 1996; Copello et al., 2006), jackknife (O'Dwyer et al., 2006; Thorogood et al., 2009; Herring et al., 2008, 2010) or sample-splitting methods (Setiawan et al., 2004; Meissner, 2005; Frey et al., 2008). Dechaume-Moncharmont et al. (2011) showed that the choice of the validation method may have a strong effect on the estimated discriminant rate, particularly for small sample sizes, and recommended the use of the jackknife method.
In this study, we used a large dataset (4013 birds measured across 11 years) of greater flamingo (Phoenicopterus roseus) chicks ringed and measured in the Camargue (southern France) and afterward sexed with molecular techniques (Bertault et al., 1999; Balkız et al., 2007). Greater flamingos lay a single egg and have a wide distribution, ranging from west Africa eastward throughout the Mediterranean to southwest and south Asia, and throughout sub-Saharan Africa (Johnson and Cézilly, 2007). Both sexes are similar in plumage but are sexually dimorphic in size when adults (Johnson and Cézilly, 2007). Although immature and adult male greater flamingos are significantly larger and heavier than females (Cramp and Simmons, 1977; Johnson et al., 1993), there remains a risk of error in sex determination based on size or weight due to the overlap of larger (and heavier) females with smaller (and lighter) males. This overlap is possibly wider in chicks, yet tarsus length of males appeared longer than that of females of exactly the same age in unfledged chicks from captivity (Studer-Thiersch, 1986).
The purposes of our study were to (1) examine whether DFA relying on morphological measurements of greater flamingo chicks predicts sex with a sufficient degree of accuracy, (2) test whether a DFA formula calibrated on one of the 11 cohorts can be applied to the other cohorts and (3) evaluate through simulations the minimal sample size of chicks sexed by molecular methods needed to generate a DFA with a reliable discriminant rate.

Material and methods
As part of a long-term study on the reproductive biology of a colony of greater flamingos in Camargue, southern France (43°25′ N, 4°38′ E), a proportion of chicks were captured at the end of each breeding season (end of July to early August) and banded with a metal ring and a unique combination of plastic bands that allowed recognition of individuals. For each chick, only three external measurements were made (body weight, tarsus length and wing length; Childress et al., 2005) to avoid prolonged capture stress. Thus, between 1995 and 2008 (except for 2001, 2002 and 2007), a total of 4013 flamingo chicks were captured. A total of 1828 males and 2187 females were genetically sexed through PCR (polymerase chain reaction) amplification of the CHD-Z and CHD-W genes, using DNA extracted from blood or feather samples (Bertault et al., 1999; Balkız et al., 2007).

Discriminant analysis functions
For each cohort, we applied Box's M test to check the assumption of homogeneous variance-covariance matrices between the sexes; we used linear DFA when the assumption held and quadratic DFA when it was violated (Stevens, 1992). Analyses were conducted with the lda or qda function from the MASS package (Venables and Ripley, 2002) in R (version 3.6.1; R Development Core Team, 2019). To estimate the proportion of individuals with correctly classified sex (discriminant rate), we used the jackknife (leave-one-out) method (Manly, 1994): the sex of an individual is predicted from the DFA calculated after that individual has been removed from the dataset. This procedure is repeated until a sex is assigned to each individual (Tabachnick and Fidell, 2000). Then, we examined whether the DFA for a given year could provide a reliable discriminant rate when applied to other cohorts. Finally, we used Student's t test and Cohen's d effect size to quantify the difference between male and female chicks in each external measurement.
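The jackknife procedure described above can be sketched in a few lines. The example below is a minimal illustration in Python using only the standard library; it is not the R/MASS code used in the study. It implements a two-class linear discriminant with a pooled covariance matrix and, as an assumption of this sketch, equal prior probabilities for the two sexes. The function names and the simulated measurements are hypothetical.

```python
import random

def mean_vec(rows):
    """Column means of a list of measurement rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def pooled_cov(groups):
    """Pooled within-group covariance matrix (the linear DFA assumption)."""
    p = len(groups[0][0])
    s = [[0.0] * p for _ in range(p)]
    dof = 0
    for rows in groups:
        m = mean_vec(rows)
        for r in rows:
            d = [x - mu for x, mu in zip(r, m)]
            for i in range(p):
                for j in range(p):
                    s[i][j] += d[i] * d[j]
        dof += len(rows) - 1
    return [[v / dof for v in row] for row in s]

def solve(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_lda(males, females):
    """Return (weights, cutoff); a score above the cutoff is classified as male."""
    mm, mf = mean_vec(males), mean_vec(females)
    w = solve(pooled_cov([males, females]), [a - b for a, b in zip(mm, mf)])
    midpoint = [(a + b) / 2 for a, b in zip(mm, mf)]  # equal priors (assumption)
    return w, sum(wi * xi for wi, xi in zip(w, midpoint))

def predict(model, row):
    w, cutoff = model
    return "M" if sum(wi * xi for wi, xi in zip(w, row)) > cutoff else "F"

def jackknife_rate(rows, sexes):
    """Leave-one-out discriminant rate: refit without bird i, then predict bird i."""
    correct = 0
    for i in range(len(rows)):
        train = [(r, s) for j, (r, s) in enumerate(zip(rows, sexes)) if j != i]
        model = fit_lda([r for r, s in train if s == "M"],
                        [r for r, s in train if s == "F"])
        correct += predict(model, rows[i]) == sexes[i]
    return correct / len(rows)

random.seed(1)
# Hypothetical measurements (body mass in g, tarsus and wing length in mm);
# these are simulated values, not the study's data.
males = [[random.gauss(2600, 300), random.gauss(215, 15), random.gauss(330, 20)]
         for _ in range(40)]
females = [[random.gauss(2300, 300), random.gauss(200, 15), random.gauss(315, 20)]
           for _ in range(40)]
rate = jackknife_rate(males + females, ["M"] * 40 + ["F"] * 40)
print(f"jackknife discriminant rate: {rate:.3f}")
```

Because each bird is classified by a function that never saw it, the resulting rate is not inflated in the way a resubstitution estimate can be, which matters most for small samples.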

Effect of sample size
We simulated the effect of sample size on the discriminant rate using the largest dataset available, i.e., 2006. We defined 133 different sample sizes regularly spaced every five individuals and ranging from 25 to 685 individuals. For each sample size, we simulated 500 different datasets by randomly sampling individuals (Table 1) and used the jackknife method to estimate the discriminant rate of each random dataset. We then calculated the mean discriminant rate across the random datasets for a given sample size.
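A resampling loop of this kind might look like the following Python sketch (standard library only). To keep it short, it replaces the multivariate DFA with a univariate nearest-class-mean rule on a single measurement, which is a simplification and not the study's method. The population is simulated, and the names `loo_rate` and `simulate` are hypothetical. The size grid reproduces the one described in the text.

```python
import random
from statistics import mean

def loo_rate(sample):
    """Leave-one-out discriminant rate of a nearest-class-mean rule.

    sample: list of (measurement, sex) pairs with sex "M" or "F".
    """
    total = {"M": 0.0, "F": 0.0}
    count = {"M": 0, "F": 0}
    for value, sex in sample:
        total[sex] += value
        count[sex] += 1
    if min(count.values()) < 2:
        raise ValueError("need at least two birds of each sex")
    correct = 0
    for value, sex in sample:
        # Class means recomputed without the held-out bird.
        means = {}
        for s in ("M", "F"):
            if s == sex:
                means[s] = (total[s] - value) / (count[s] - 1)
            else:
                means[s] = total[s] / count[s]
        predicted = min(means, key=lambda s: abs(value - means[s]))
        correct += predicted == sex
    return correct / len(sample)

def simulate(population, sizes, n_rep, rng):
    """Min, mean and max discriminant rate over n_rep random subsamples per size."""
    summary = {}
    for n in sizes:
        rates = [loo_rate(rng.sample(population, n)) for _ in range(n_rep)]
        summary[n] = (min(rates), mean(rates), max(rates))
    return summary

# Grid described in the text: every five individuals from 25 to 685.
study_sizes = list(range(25, 686, 5))

rng = random.Random(42)
# Hypothetical cohort of 700 chicks (tarsus length in mm); not the study's data.
population = ([(rng.gauss(215, 15), "M") for _ in range(350)]
              + [(rng.gauss(200, 15), "F") for _ in range(350)])

# Reduced run for speed: three sizes and 50 replicates instead of 133 and 500.
for n, (low, avg, high) in simulate(population, [25, 100, 400], 50, rng).items():
    print(f"n={n:3d}  min={low:.3f}  mean={avg:.3f}  max={high:.3f}")
```

Reporting the minimum and maximum alongside the mean shows how widely the rate can vary by chance at a given sample size, which is the quantity of interest when choosing how many chicks to sex genetically.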

Results
Sex determination
The discriminant rate of greater flamingo chicks using DFA ranged from 67.1 % to 86.5 % and varied substantially across cohorts (Table 1). Females were always better discriminated than males (Table 1). Applying one particular annual DFA to the other years revealed major differences in the discriminant rate. For instance, applying the DFA of 2006 (the largest dataset) to the different cohorts led to large differences in the rate of correct sex determination (from 52.4 % in 2000 to 86.5 % in 2006; Fig. 2), and the all-year function had the lowest discriminant rate (Table 1).
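One way to see why a function calibrated on one cohort can fail on another is a toy example in which all chicks of a second cohort are slightly smaller. The Python sketch below uses a single measurement and a midpoint cutoff instead of the full DFA, and assumes that the larger bird is classified as male. All values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean

def calibrate_cutoff(cohort):
    """Midpoint between the female and male means of one measurement."""
    males = [v for v, s in cohort if s == "M"]
    females = [v for v, s in cohort if s == "F"]
    return (mean(males) + mean(females)) / 2

def rates(cohort, cutoff):
    """Share of correctly classified birds: overall, males, females."""
    hits = {"M": 0, "F": 0}
    totals = {"M": 0, "F": 0}
    for value, sex in cohort:
        predicted = "M" if value > cutoff else "F"
        totals[sex] += 1
        hits[sex] += predicted == sex
    overall = (hits["M"] + hits["F"]) / len(cohort)
    return overall, hits["M"] / totals["M"], hits["F"] / totals["F"]

# Hypothetical tarsus lengths (mm). cohort_b is cohort_a shifted down by 10 mm,
# standing in for a year with poorer growing conditions.
cohort_a = ([(v, "F") for v in (190, 195, 200, 205)]
            + [(v, "M") for v in (210, 215, 220, 225)])
cohort_b = [(v - 10, s) for v, s in cohort_a]

cutoff = calibrate_cutoff(cohort_a)
print(cutoff)                   # 207.5
print(rates(cohort_a, cutoff))  # (1.0, 1.0, 1.0)
print(rates(cohort_b, cutoff))  # (0.75, 0.5, 1.0)
```

In this toy case the cutoff calibrated on the first cohort misclassifies the smaller males of the second cohort as females, so the overall rate drops while females remain well classified. Whether the same mechanism explains the pattern in the flamingo data is not established by this sketch.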

Sample size effect
The discriminant rate of greater flamingo chicks increased with sample size. Indeed, the lowest discriminant rates obtained in simulations with samples of 50, 200 and 325 individuals were 58.0 %, 76.6 % and 80.0 %, respectively (Fig. 3).

Discussion
In greater flamingos, as in many other birds, adult males are larger than females. However, sexing flamingo chicks remains difficult because flamingos continue to grow after fledging, and captures generally include individuals of different ages. Sexual dimorphism occurs in flamingo chicks, with females of 1.5 to 2.5 months already being smaller than males of the same age (Studer-Thiersch, 1986; but see Bertault et al., 2000). Our results demonstrate that the sex of greater flamingo chicks can be determined based on individual morphology using DFA and a subsample of molecularly sexed individuals. This method was also used for adult greater flamingos (Richter and Bourne, 1990) and other birds, where morphometric criteria may discriminate between males and females (Childress et al., 2005; Alarcos et al., 2007; Hurley et al., 2007; Ackerman et al., 2008; Herring et al., 2008; Lislevand et al., 2009; Herring et al., 2010).
However, we found that the reliability of one particular yearly discriminant function was reduced when attempting to apply it across cohorts. This result is consistent with the study by Evans et al. (1993) on laughing gulls (Larus atricilla), who found a significant difference when DFA was applied to different years and localities. We also found that females were better discriminated than males in all years. The lower discriminant rate for males could result from an asymmetric size distribution in males, with an overrepresentation of smaller males. The discriminant rate of greater flamingo chicks varied from 67.1 % to 86.5 % and was lower than in other waterbird studies (e.g., Childress et al., 2005). Furthermore, our simulation showed that for small samples, there is a wide variance in the proportion of misclassified birds. Indeed, a high or low discriminant rate (i.e., a low or high proportion of misclassified birds) can be obtained by chance; therefore, the use of a formula constructed from a small dataset is problematic in greater flamingos and should be avoided. Like previous studies, we recommend extreme caution when sexing birds based on DFA, particularly when discriminant equations are derived from small datasets (Brennan et al., 1991; Shealer and Cleary, 2007; Isaksson et al., 2008; Dechaume-Moncharmont et al., 2011). Our results show that sexing flamingo chicks based on DFA requires a large dataset and repeated sampling to include natural yearly variations in body condition during the breeding season. We therefore conclude that DFA is not an effective method to sex flamingo chicks.
Data availability. The data that support the findings of this study are available from the Tour du Valat greater flamingo team (bechet@tourduvalat.org), upon reasonable request.
Author contributions. AbB designed and conducted the study under the supervision of AB, RP and BS. AbB designed the statistical analyses and wrote the text. AB, RP and BS supervised and commented on the text. RN, MG and FXDM commented on the text.
Competing interests. The authors declare that they have no conflict of interest.

Acknowledgements. We are most grateful to Ruben Heleno, Carolina Hospitaleche and the anonymous reviewer for their valuable comments. We are grateful to Luc Hoffmann and the Tour du Valat Foundation for material support.
Financial support. This work was financed by the MAVA foundation and the Algerian Ministère de l'Enseignement Supérieur et de la Recherche Scientifique (MESRS/DGRSTD). Abdennour Boucheker benefited from a PhD thesis grant from the MAVA foundation.
Review statement. This paper was edited by Ruben Heleno and reviewed by Carolina Hospitaleche and one anonymous referee.