Generalized gametic relationships for flexible analyses of parent-of-origin effects

Abstract A class of epigenetic inheritance patterns known as genomic imprinting allows alleles to influence the phenotype in a parent-of-origin-specific manner. Various pedigree-based parent-of-origin analyses of quantitative traits have attempted to determine the share of genetic variance that is attributable to imprinted loci. In general, these methods require four random gametic effects per pedigree member to account for all possible types of imprinting in a mixed model. As a result, the system of equations may become excessively large to solve using all available data. If only the offspring have records, which is frequently the case for complex pedigrees, only two averaged gametic effects (transmitting abilities) per parent are required (reduced model). However, the parents may have records in some cases. Therefore, in this study, we explain how employing single gametic effects solely for informative individuals (i.e., phenotyped individuals), and only average gametic effects otherwise, significantly reduces the complexity compared with classical gametic models. A generalized gametic relationship matrix is the covariance of this mixture of effects. The matrix can also make the reduced model much more flexible by including observations from parents. Worked examples are present to illustrate the theory and a realistic body mass data set in mice is used to demonstrate its utility. We show how to set up the inverse of the generalized gametic relationship matrix directly from a pedigree. An open-source program is used to implement the rules. The application of the same principles to phased marker data leads to a genomic version of the generalized gametic relationships.


Introduction
Epigenetic marks of parent-of-origin may lead to the functional inequivalence of parental alleles with respect to gene expression. This pattern is known as genomic imprinting. According to previous reviews (e.g., Lawson et al. 2013), the phenotypic signature of genomic imprinting at a single locus appears as a trait mean difference between reciprocal heterozygotes AB and BA (the first allele is paternal). The broader term parent-of-origin effects (POEs) is often used to emphasize that other underlying mechanisms such as maternal genetic effects could contribute to the observed inequivalence (Hager et al. 2008). For complex traits, there is evidence that variants of nonimprinted genes can generate substantial POEs by interacting with imprinted loci (Mott et al. 2014), despite the small number known for the latter. An imprinting variance component can serve as a type of joint phenotypic signature of multiple imprinted loci that could potentially affect a quantitative trait. This component summarizes the squared trait differences between reciprocal heterozygotes at all imprinted loci as a single quantity and it can be estimated by analyzing large pedigreed populations with suitable mixed models. Recent examples of this type of study in livestock genetics by Neugebauer et al. (2010aNeugebauer et al. ( , 2010b, Tier and Meyer (2012), Blunk et al. (2017aBlunk et al. ( , 2017b, Okamoto et al. (2019), and Inoue et al. (2020) all considered carcass traits in cattle. Furthermore, Blunk et al. (2020) investigated type 1 diabetes and rheumatoid arthritis in the field of human genetic epidemiology.
Some of the studies mentioned above used simplified relationships, including only sire and maternal grandsire information, mainly due to partially missing pedigree information (Okamoto et al. 2019) or to ensure that the number of equations was small when analyzing large volumes of data (Blunk et al. 2017a). In other studies, the mixed model employed a classical gametic relationship matrix Blunk et al. 2020) derived from complete pedigree information for several tens to hundreds of thousands of individuals. The gametic relationship model is the most costly variant in terms of the number of equations for a given pedigree, but it allows imprinting analysis to be conducted in every case where mixed model analysis of pedigreed data is possible.
Shortly after its discovery, researchers recognized that the gametic relationship matrix (Smith and Allaire 1985;Schaeffer et al. 1989) is useful for isolating fractions of the genetic variance in quantitative traits caused by genomically imprinted loci. In the early stages of pedigree-based imprinting analysis, animal models were augmented by an additional vector of paternal (alternatively, maternal) gametic effects, which was usually modeled as uncorrelated with any other effect. In standard practice, the vector's variance was represented as the product of a gametic relationship matrix and a variance component caused by polymorphisms at loci with only paternal (maternal) gene activity. These models were pioneered by De Vries et al. (1994) and used for more than a decade. However, they can account only for a single type of classical imprinting, where either the maternal or paternal alleles are fully silenced. A proposal by Hill and Keightly (1988) to consider both types of imprinting simultaneously was not implemented in any pedigree-based analyses of empirical data. Furthermore, there was uncertainty regarding the methods used to account for the effects of partially imprinted loci, where both alleles are active, but at different strengths depending on their parental origins.
A model for parent-of-origin analysis was subsequently developed (Neugebauer et al. 2010a(Neugebauer et al. , 2010b and it is comprehensive in the sense that it accounts for all types of imprinting, i.e., full or partial, and maternal or paternal (Blunk and Reinsch 2014). This so-called reduced imprinting model relates observations from nonparents (final progeny, e.g., animals used for meat) to the transmitting abilities (half of the breeding values) of their parents. There are two correlated genetic effects per parent comprising the transmitting ability as sire and transmitting ability as dam, which reflect an animal's genetic effect on its offspring under a paternal or maternal imprinting pattern. These two genetic effects are different in the presence of genomically imprinted loci. The variance in these differences is referred to as the imprinting variance because it summarizes the contributions from all types of possible imprinted loci. A numerator relationship matrix is necessary for parents only because the final progeny with observations, but without offspring, do not appear in the underlying pedigree and the resulting relationship matrix. The null hypothesis comprising the absence of polymorphic imprinted loci with an effect on the trait under investigation (i.e., a zero imprinting variance) can be tested with a restricted maximum likelihood (REML) ratio test (RLRT).
Alternatively, a comprehensive gametic model can be used to estimate the same set of genetic covariances, including the imprinting variance Meyer and Tier 2012), where one must estimate four gametic effects per individual, i.e., two as sire and two as dam, and the relationships include the final progeny with observations. The gametic model allows for records from parents, which is an advantage compared with the reduced imprinting model. Furthermore, it is possible to extend this model with maternal genetic effects.
According to the principle applied, this maternal genetic component of the variance is a particular challenge because it is difficult to separate from the imprinting variance. Okamoto et al. (2019) showed that the resulting imprinting variance estimated with a model variant that uses information only for the sire and maternal grandsire (Blunk et al. 2017a;Okamoto et al. 2019) can also be interpreted as maternal genetic variance. Similarly, for the reduced imprinting model, it is possible to show that the imprinting variance and maternal genetic variance cannot be disentangled when both are present, and thus it is only possible to infer a composite component of variance (for a theoretical derivation, see Appendix A4).
The utilization of measured genotypes in genomic best linear unbiased prediction (gBLUP) models that include imprinting effects was outlined by Nishio and Satoh (2015). The first of the two variants of the proposed model (GBLUP-I1) contains an imprinting effect at all markers, which is modeled as independent of the actions of all markers as un-imprinted Mendelian loci. An additive genetic effect summarizes the latter per marker. The second model (GBLUP-I2) considers a paternal and maternal gametic effect with zero mutual correlation. Although not considered by Nishio and Satoh (2015), it is clearly possible to change this into a comprehensive model by abandoning the assumption of a zero correlation and by replacing pedigree-derived gametic relationships with a genomic counterpart of equal size and structure. In cases where not all pedigreed individuals are genotyped, this would then allow combined analysis of the genotyped and un-genotyped individuals in a single-step approach (Legarra et al. 2009;Aguilar et al. 2010;Christensen and Lund 2010). By contrast, it is not possible to easily extend the first model (GBLUP-I1) to a pedigree-derived counterpart.
The disadvantage of the gametic model is the large number of equations (Smith and Allaire 1985) used to represent the random genetic effects, particularly when estimating the variance components. A pedigree with a size of approximately half a million is a technical barrier for REML estimation in animal models using the currently available software packages (Shor et al. 2019). With a gametic parent-of-origin model, only a quarter of individuals require the same number of equations. Therefore, the question arises whether an option is available where models retain the flexibility of the gametic model while allowing for a considerably smaller number of equations for random genetic effects, and it must be as close as possible to the reduced imprinting model.
As a solution, we propose a generalization of the gametic model with a much smaller redefined vector of genetic effects obtained by linear transformation of the original gametic effects. We refer to the corresponding relationship matrix as a generalized gametic relationship matrix and present rules for its rapid inversion from the pedigree. As a result, the size of the gametic model becomes more manageable while retaining all of its advantages. We also show how the same type of transformation can be applied to measured genotypes to obtain conformable genomic and pedigree-derived versions of the new relationship matrix. In the following, we theoretically derive the generalization of the gametic model. Worked examples (available in the Supplementary material) are provided as illustrations for all of the models described. Finally, pedigreed mouse data are employed to demonstrate the utility of the generalized gametic model and to allow for conclusions regarding the influence of POEs on body mass.

Materials and methods
Theory Generalized gametic relationships: In gametic models under Mendelian inheritance, each individual i is represented by the additive genetic effects of its paternal gamete g i;1 and maternal gamete g i;2 (Schaeffer et al. 1989). These effects are usually arranged in a pairwise manner in a vector g of length 2t, which is twice the number t of individuals in the pedigree. The equation for a phenotypic observation y i of individual i is: where l i ¼ x 0 i b is a place-holder for any combination of explanatory variables in vector x 0 i with fixed effects b, and the residual e i . Thus, the gametic model splits the additive genetic value (breeding value) b i of individual i into paternally derived and maternally derived parts The basic idea of reducing the equations in gametic models by a considerable number involves replacing the two gametic effects of a subset of u individuals by their pairwise average: which is known as the transmitting ability (half the breeding value) of individual i. All gametic effects in the vector g can be arranged such that the gametic effects of all u individuals precede the gametic effects of those v that are bound in order to retain their distinct gametic effects for imprinting analysis, as explained in the next paragraph. The corresponding subdivision of g is: The subvectors g u and g v have lengths of 2u and 2v, respectively. The covariances of all gametic effects in g are the elements of the 2t Â 2t gametic relationship matrix G (Schaeffer et al. 1989), which can be partitioned into sections that correspond to the relationships between the gametic effects in g u and g v .
The required average gametic effects can be obtained by a linear transformation, which is defined by a matrix K 0 such that: This operation replaces the gametic effects of all individuals in g u by their transmitting abilities in a u . The upper-left partition K 0 u of the transformation matrix K 0 has dimensions of u Â 2u, and it is defined as the Kronecker product of a u Â u identity matrix I u and a row vector with two elements equal to 1 2 : Furthermore, K 0 comprises a 2v Â 2v identity matrix I v and two null matrices, 0 1 and 0 2 , with dimensions of u Â 2v and 2v Â 2u, respectively.
The covariance matrix of the transformed vector of gametic effects a then becomes: which is called a generalized gametic relationship matrix in the following. In the context of imprinting analyses, a natural choice involves retaining the gametic effects of all individuals with their own phenotypes in vector g v and representing all their ancestors without records by their transmitting abilities as a u . The subdivisions of G are then: The upper-left part 1 2 A u is equal to the coancestry matrix (half the numerator relationship matrix) of all ancestors without their own records, while G vv reflects the relationships between the gametic effects of all individuals with their own observations. Finally, S uv contains the covariances between the transmitting abilities and gametic effects. We consider a small example involving four individuals (IDs). There are three transmitting abilities for individuals 1, 2, and 3, with corresponding pairwise elements of 1 2 in the transformation matrix K 0 and two gametic effects, where the elements in K 0 are one. The resulting generalized gametic relationship matrix G has dimensions of 5 Â 5: Generalized gametic relationships in a gametic model: Based on the descriptions above, the equation for an observation y i in a model that uses generalized gametic relationships and considers POEs remains the same as that in the classical gametic model: where the superscripts s and d indicate the paternal (g s i;1 ) and maternal (g d i;1 ) expression patterns of the candidate's gametes of paternal and maternal descent (indicated by subscripts 1 and 2), respectively. Gametes of the same parental descent, but opposite mode of expression (g d i;1 and g s i;2 ), are not directly related to any observation, but they can be estimated through their covariances with other effects. It is possible to average each pair of gametic effects with the same expression pattern from any individual without records, thereby obtaining the transmission abilities as sire A mixed model that considers POEs and uses the generalized relationship matrix then becomes: where Y is a vector of observations, b comprises the fixed effects, and X is the corresponding incidence matrix. The assumed covariance of random effects is: This generalized gametic model contains the size-reduced transformed gametic effect vectors a s and a d , which are the counterparts of the full-size gametic effect vectors g s and g d , respectively, and each has a length of 2t. Consequently, the model uses the corresponding relationship matrix G instead of the classical gametic relationships of G: The vector of genetic effects under a paternal (maternal) mode of expression a s (a d ) for genetic effects has an associated gametic variance component r 2 s (r 2 d ) and the covariance between the expression patterns is r sd . Thus, the imprinting variance (Neugebauer et al. 2010a(Neugebauer et al. , 2010b) is: which is equivalent to the variance in the differences between gametic effects under alternative modes of expression. Furthermore, the incidence matrices Z s and Z d link observations to the random gametic effects in a s and a d , respectively, whereas no observation is linked to any of the transmitting abilities in the latter vectors. As a result, any incidence matrix Z a ¼ ½ 0 u Z v that links observations to gametic effects in the generalized vector of genetic effects a 0 ¼ ½ a 0 u g 0 v can be considered a converted incidence matrix Z g ¼ ½ 0 2u Z v from a classical gametic model that links the observations to the gametic effects This transformation retains all columns in the partition Z v , i.e., one per gametic effect of individuals with phenotypes, and the number of null columns in 0 u of Z a collapses to half of that of 0 2u in Z g . In the same manner, both incidence matrices Z s and Z d from the previous model equation are converted versions of their counterparts in the classical gametic model, which forms the basis for the proof of equivalence for the classical and generalized gametic models involving G (see Appendix A1). For a worked example based on a small data set analyzed with the generalized gametic model, see part 1 of the Supplementary material.

Reduced gametic model:
The reduced imprinting model initially described by Neugebauer et al. (2010aNeugebauer et al. ( , 2010b relates each observation from a final progeny i to the transmitting abilities as sire a s si and dam a d di for the parents si (sire of i) and di (dam of i), respectively. For a single observation y i , we have the equation: where the residual r i is a sum of the Mendelian sampling effects of both parents (m si and m di ) and the measurement noise (e i ). The latter is identical to the residual of the gametic model. Thus, we have: Its variance is a function of the inbreeding coefficients F si and F di for the parents of i: After rewriting the transmitting abilities of the parents as the averages of the respective gametic effects, i.e., a s si ¼ 1 2 ðg s 1;si þ g s 2;si Þ and a d di ¼ 1 2 ðg d 1;di þ g d 2;di Þ, we obtain an equation in terms of gametic effects: Then, the covariance of the gametic effects is: where the relationship matrix G of the gametic effects that define the transmitting abilities involved includes only the parents and their ancestors. The advantage of this reduced gametic model compared with the previously published version that uses only transmitting abilities and their relationship matrix 1 2 A is that it allows us to easily integrate observations from parents by linking them to the respective gametic effects. Hence, for observations of any parent i, the observation equation becomes: Part 1 of Supplementary A presents a worked example based on a small data set analyzed with the reduced gametic model.

Generalized reduced gametic model:
The disadvantage of the reduced gametic model is that it employs twice the number of equations compared with a version obtained using 1 2 A. However, for all individuals without their own records, it is possible to reduce the number of equations for random genetic effects by representing the individuals through their transmitting abilities (average gametic effects), while retaining separate gametic effects for all parents with phenotypes, i.e., the vectors of gametic effects g s and g d are replaced by appropriately transformed counterparts a s and a d , respectively. Consequently, the covariance matrix of random genetic effects in a parsimonious generalized reduced gametic model that allows for parents with phenotypes is: Furthermore, we need a diagonal matrix W of weights equal to w i ¼ 1 for observations from parents to which observation equation (3) applies, and: for the final progeny, where parents without their own records are represented by the transmitting abilities or both parents have a record and are represented by gametic effects [the equations are (1) and (2), respectively]. The same weight applies to mixed types of representations that arise in cases where one parent of the final progeny has a record whereas the other does not. The corresponding equations for observations y i of the final progeny are: and For the generalized reduced gametic model, a detailed worked example based on a small data set is also presented in the Supplementary Part S1.

General model for parent-of-origin analyses:
A general comprehensive model for parent-of-origin analyses is based on the generalized gametic relationship matrix. Special cases of the generalized gametic relationship matrix G are the classical gametic relationship matrix G ¼ G in the gametic model and G ¼ 1 2 A in the reduced imprinting model. Correspondingly, the matrix W of weights can be an identity matrix that fits the classical gametic model, or a matrix of weights that differ from one like those in the reduced model for records of the final progeny. A general model can be specified for parent-of-origin analyses containing these two basic types of comprehensive imprinting models as well as models with any combination of gametic effects and transmitting abilities that can be obtained using our transformation matrix K 0 . In matrix notation, the general model is: where e is a vector of residuals, i.e., e i ¼ e i for records from individuals represented by two gametic effects or e i ¼ r i for observations from final progeny linked to the genetic effects of their parents. The respective weights are: : Random genetic effects and residuals have the assumed covariance: The resulting mixed model equations are: The general model allows for any combination of observation equations (1) to (5) to provide a large degree of flexibility in parent-of-origin analyses. One may choose a model variant to minimize the number of equations for random genetic effects by using as many reduced observation equations as possible, but at the cost of recomputing the weights when estimating the components of variance. Alternatively, one may avoid the repeated recomputation of weights by representing all individuals with an observation using gametic effects. The underlying reason for this flexibility is that for the given data (observations, fixed effects, and pedigree), each possible general imprinting model has as an equivalent unique classical gametic model (as shown in Appendices A2 and A3). Consequently, any two general models that share the same equivalent classical model are also equivalent, and can replace each other, especially when estimating the components of variance.
Direct inversion of the generalized gametic relationship matrix: Setting up the inverse generalized gametic relationship matrix is crucial for any large-scale application. One can derive rules for direct inversion by factoring the inverse G À1 into inverses of a matrix T 0 and a diagonal matrix D of inverse Mendelian sampling variances: This is a known principle based on the direct inversion of the numerator relationship matrix (Henderson 1976;Quaas 1976) and the classical gametic relationship matrix (Schaeffer et al. 1989). The matrix ðT 0 Þ À1 is lower triangular, as shown in Supplementary Figure S2.1. The underlying pedigree for this example (see Supplementary Part S2) comprises 12 individuals. Single transmitting abilities represent nine individuals and a pair of two gametic effects denotes each of the remaining three. The last column of the pedigree file indicates these two alternative types of representations by values of one and two, respectively. Consequently, the dimensions of the inverse of the example are 15 Â 15 and each of the 15 rows of ðT 0 Þ À1 pertains to one of these effects. However, for the direct inversion, it is necessary to assess the types of genetic effects for all individuals and also for their parents. An individual's transmitting ability may be derived from two unknown parents (a-00) or a single unknown parent, where the known parent may be represented by a transmitting ability (a-0a, a-a0) or two gametic effects (a-0gg, a-gg0). Two known parents may be represented as any combination of transmitting abilities or gametic effects (a-aa, a-agg, a-gga, a-gggg). Similarly, a gametic effect may be derived from an unknown parent (g-0), or a known parent denoted by either a transmitting ability or two gametic effects (g-a, g-gg). These 12 cases need to be distinguished for the direct inversion of the generalized gametic relationship matrix. Each of these cases appears at least once in the example pedigree. The last column of Supplementary Figure S2.1 indicates the respective case for each effect that relates to a particular row of the lower-triangular matrix. It should be noted that the six cases comprising a-0gg, a-gg0, a-agg, a-gga, a-gggg, and g-a are specific to the generalized gametic relationship matrix because they do not appear in the direct inversion of the numerator relationship matrix (involving only a-00, a-0a, a-a0, and a-aa) or the classical gametic relationship matrix, where only g-0 and g-gg need to be distinguished.
The Mendelian sampling variances that define the diagonal elements of D are different for the transmitting abilities and gametic effects. Furthermore, they depend on the occurrence of unknown parents and the inbreeding coefficients of the known ones. In particular, we have F known parent when an individual with a transmitting ability in the matrix has only one known parent, or F sire and F dam in the case of full parentage information. For gametes, we need to account for the inbreeding coefficient F parent of the known parent from which a gamete is derived. Accordingly, the 12 cases (a-00, a-0a, . . ., g-00) are grouped into the following five classes with distinct formulae for the inverse Mendelian sampling variance d: a À 00 d ¼ 2 a À 0a; a À a0; a À 0gg; a À gg0 a À aa; a À agg; a À gga; a À gggg d ¼ 2 1 It is possible to construct the inverse generalized gametic relationship matrix from the pedigree in a step-by-step manner for any arbitrary order of genetic effects. In each step, a matrix contribution U i is added for genetic effect i to a matrix comprising an inverse G À1 iÀ1 that already includes the preceding i À 1 effects and zeroes: where 0 is a column vector of i À 1 zeros and: is the contribution made for each genetic effect i. The row vector u 0 i comprises all zeros, except for the elements that correspond to the genetic effects of the respective parent(s). The i-th element (equal to unity) at least is a nonzero in this vector. If present, all other nonzero elements are negative, with values of either À 1 2 or À 1 4 . Thus, the number of nonzero entries varies from one to five, which can be derived from the rows in the example triangular matrix ðT 0 Þ À1 in Supplementary Figure S2.1. Table 1 summarizes the nonzero coefficients in u 0 i and their indices for all 12 possible cases. The nonzero elements of the resulting matrix U i ¼ u i u 0 i d i correspond to the (scaled) cross-products of the elements of the nonzero vector, and their coordinates in the matrix are the respective combinations of indices.
Transforming measured genotypes in a generalized genomic gametic relationship matrix: Parent-of-origin analyses may also use genomic relationships, or combined genomic and pedigree relationships. However, a specific feature of this approach is that ordinary marker genotypes (AA, AB, BB) are not sufficient. Instead the parental origins of the marker alleles at each locus must be inferred (Lawson et al. 2013, and references therein) and summarized as ordered genotypes of AA, AB, BA, and BB, where the first allele is paternally derived, but this is not always possible for all members of a genealogy. In this case, the principles described above are beneficial for integrating ordered and unordered genomic information into a single genomic version of the generalized gametic relationship matrix. Let us assume that all tindividuals are genotyped with p markers and all genotypes phased into 2t haplotypes. Information regarding the number (zero or one at each locus before centering) of minor alleles for all marker loci on each haplotype can be summarized in a 2t Â p matrix C, which is meancentered column by column. For this matrix, each individual i contributes two p-row vectors c 0 i1 and c 0 i2 , with centered allele indicators for its first and second haplotypes. Matrix C can then be split into two submatrices C v and C u : For imprinting analyses, at least all u individuals with phenotypes need to have paternal and maternal haplotypes identified in C u . Thus, we must add at least one preceding generation without records but with genotypes. All haplotypes in matrix partition C v are unordered in the case of a single generation. If the additional v genotyped individuals contain more than a single successive generation, then only part of their genotypes may qualify as ordered where the exceptions come from the founders.
A genomic gametic relationship matrix can be derived from C: where s is a scaling factor, s ¼ P p j ð1 À p j Þ, and p j is the allele frequency at marker j.
In all cases where the parental origin of the two haplotypes can be traced back, the first haplotype of each individual is assumed to be paternal and the second maternal (c 0 i1 ¼ c 0 ip and c 0 i2 ¼ c 0 im ); otherwise, the ordering of haplotypes is arbitrary. We apply the concept of generalization described above. One can define a transformation matrix K 0 such that for all individuals i with unordered genomic information, the two row vectors c 0 i1 and c 0 i2 are replaced by their averages: The vector c i does not depend on the order or the parental origin of the haplotypes of an individual: , an individual's paternal gamete is assumed. For a maternal gamete, the indices of the genetic effects of the respective effects of the known parent are indexed in the same manner as for a sire.
i.e., c i is also the vector of average centered paternal and maternal allele indicators. Consequently, a generalized genomic gametic relationship matrix can be defined as: where K 0 is defined as above. The partition C u C 0 u of G g can be used to determine only the ordered genomic information of all individuals with phenotypes, and thus, it is sufficient to estimate the components of genetic variance in a parent-of-origin analysis. It is possible to estimate all of the respective gametic effects of these individuals. The entire matrix G g represents the gametic effects (as sire and dam) for all individuals, including those with no phenotypes. By design, the generalized variant G g is also appropriate for parent-of-origin analyses and there are no other requirements for K 0 in the same manner as the pedigree-derived counterpart. Thus, the general model for parent-of-origin analyses is also applicable to genomic relationships provided that the marker haplotypes of individuals with observations are ordered.

Example data in mice
The mouse line DUKs was maintained for 153 generations as a control for a long-term growth selection experiment (Bü nger et al. 1998) without any intentional selection conducted at the Leibniz-Institute for Farm Animal Biology. Young DUKs mice were weaned at 21 days of age, and up to five males and four females from each litter were then further reared in two separate cages. At an age of 42 days, the body mass (BM42) was measured for two randomly selected males (BM42 varied from 9.33 g to 41.73 g, with an average of 29.38 g). Randomly selected males and females from about 50% of all litters formed each next generation. This percentage varied from generation to generation due to random fluctuations in the pregnancy rate. Males with a record for BM42 generally did not become a sire, but there were some exceptions when excessively few other males were left in a litter. Therefore, the vast majority (97.67%) of all 13,077 observations were obtained from final progeny. The pedigree size, including all animals with phenotypes and their ancestors was 28,150 animals. Founders were assigned to generation zero. Inbreeding increased up to an average inbreeding coefficient of 0.62 in the last generation, with an average inbreeding coefficient of 0.40 over all generations. In total, 110 generations were included in data analyses as fixed effects (because body weight was not recorded in every generation) as well as 6648 uncorrelated random litter effects. The genetic part of the model was alternatively covered by: (1) breeding values in an animal model (AM); (2) a classical gametic model (ICM); (3) a generalized gametic model (IGM); and (4) a reduced version (IRM) of the IGM. The IGM comprised gametic effects only for animals with phenotypes, and transmitting abilities for all others. For the IRM, the underlying pedigree did not include all phenotyped animals without offspring (12,772 in number), thereby leaving only phenotyped parents (305 animals) and their ancestors (15,378 animals). The imprinting models ICM, IGM, and IRM considered genomic imprinting by design, and MCM, MGM, and MRM, respectively, represented the null hypothesis of purely Mendelian inheritance. Furthermore, additional maternal gametic effects augmented the gametic models ICM and IGM to separate them from possible imprinting effects. Inverse relationships of all types were computed using our own Fortran program and REML estimates of variance components were obtained with the software packages ASReml version 4.1 (Gilmour et al. 2015) and Echidna version 1.32 (Gilmour 2018). The R-packages "pedigree" version 1.4 (Coster 2013) and "readxl" version 1.3.1 (Wickham et al. 2019) in R version 4.0.0 were used to prepare the data. All data, command files, and output files are stored in the RADAR repository. The significance of the imprinting variance was tested by comparing the REML log-likelihood of each imprinting model (ICM, IGM, and IRM) with the REML log-likelihood outcome of the corresponding Mendelian model (MCM, MGM, and MRM). An approximate RLRT with two degrees of freedom was performed (Neugebauer et al. 2010a(Neugebauer et al. , 2010b. The existence of maternal genetic variance was tested by comparing the REML log-likelihood of an AM without maternal effect with the REML log-likelihood of an AM with maternal effect. Detailed descriptions of the data, methods, and results are presented in the Supplementary Part S3.

Software and data availability
A collection of six worked toy examples is provided (Part S1 of the Supplementary material provided at https://doi.org/10.25387/g3. 14046479) to illustrate the various gametic models with generalized gametic relationships. For each example, the R-code is explicitly presented and used to solve the corresponding mixed model equations.
The same examples are part of the detailed Guide to Practical Implementation. We also demonstrate in detail how to implement the various mixed models with generalized gametic relationships using the ASReml package. The Guide to Practical Implementation is available via the RADAR repository (https://www. doi.org/10.22000/284) and it also includes the source code of a program for directly setting up the inverse of the generalized gametic relationship matrix based on a pedigree file, a detailed program description, and example input and output files.

Results
When nonimprinted Mendelian inheritance was assumed in our mouse example (see Table 2, upper-left part), the classical gametic model (MCM) used 56,300 gametic effects (100%) and the equivalent animal model with a numerator relationship (AM) employed exactly half of that number (28,150; 50%). The generalized gametic model with two gametic effects per mouse with record and a single transmitting ability for all others (MGM) was almost exactly in between with 41,227 equations (73%). The number of equations decreased to only 15,683, i.e., 28% of the benchmark, when all final progeny with trait values were represented through the transmitting abilities of their parents (MRM). The corresponding numbers of lower triangle nonzero elements in the inverse were 184,046 (100%) for the MCM and 92,023 (50%) for the AM. The MGM had 103,125 nonzero elements in the lower triangle, which was close to the AM (56%). Only half of that amount was needed by the model with reduced observation equations (MRM), with only 51,372 nonzero elements (28%). When the model allowed for POEs, the absolute number of equations doubled for each model variant (Table 2, upper-right part), and this also applied to the number of saved equations, but their share was the same as that with simple Mendelian inheritance.
It should be noted that the REML log-likelihood outcomes and the genetic parameters were equal for the Mendelian models and for the imprinting models. The reduced model versions (MRM, IRM) yielded the same REML log-likelihood as the gametic and generalized gametic versions, but with only a single iteration and identical (co-)variance parameters. The results differed when analyses with adapted weightings were repeated until the REML log-likelihood outcomes stabilized (Table 2).
Excluding maternal effects, the imprinting analyses yielded significant imprinting variances with an RLRT of 46.06 (P ¼ 9.96 Â 10 À11 with DF ¼ 2) for the ICM and IGM, and an RLRT of 38.74 (P ¼ 3.87 Â 10 À9 with DF ¼ 2) for the IRM. Using the ICM and IGM, POEs explained 31.40% (69.40%) of the total genetic variance and 17.52% (65.68%) of the phenotypic variance (Figure 1). By applying the IRM, POEs explained 39.10% (68.30%) of the total genetic variance (Table 2). Heritability estimates ranged from 55.80% (63.30%; ICM and IGM) to 66.60% (62.50%; IRM). All of the estimated genetic parameters are shown in Table 2. Estimates of the variance and covariance components are summarized in Supplementary Table S3.2. Supplementary Figure S3.3 shows that there was only slight genetic progress in BM42 with about 5 g in 146 generations.
Exceptionally slow convergence was observed when both maternal and imprinting effects were included in the model. The log-likelihood improved little compared with the Mendelian model that included a maternal genetic effect (Table 2), so the imprinting variance was considered to be not significant (RLRT ¼ 2.64; P ¼ 0.45 with DF ¼ 3). By contrast, the maternal genetic variance was found to be significant compared with an imprinting model without maternal genetic effects (RLRT ¼ 25.76; P ¼ 1.07 Â 10 À5 with DF ¼ 3). The direct heritability (19.60%; 60.20%) and relative litter variance (18.80%; 60.90%) decreased further when all types of POEs were considered, and an increase in relative maternal variance (13.30%; 60.20%) was observed (Table 2; Figure 1).

Discussion
The outlined generalization introduces elements of the reduced imprinting model into the gametic model, and vice versa, to obtain increased flexibility and substantial reductions in terms of the number of equations. The latter is especially important for estimating the components of variance (Shor et al. 2019). The matrix G contains two limiting cases that set the boundaries for the ratio of equations that can possibly be eliminated. The first is the Table 2 Logarithmic values of the REML (LogL), overall number of random genetic effects (no. equa.), and total number of nonzero elements (nonzero) in the lower triangle of the inverted variance-covariance matrix of random genetic effects  Figure 1 Phenotypic variance of body mass measured in the DUKs mouse line partitioned into the residual variance (gray), additive genetic variance (purple), litter variance (yellow), maternal genetic variance (green), Mendelian variance (blue), and imprinting variance (red). The Mendelian variance was derived by subtracting the imprinting variance from the (direct) additive genetic variance. The variance components were estimated using a model assuming pure Mendelian inheritance (Men), a model assuming Mendelian inheritance and maternal genetic effects [Men (mat)], a model assuming the existence of genomic imprinting but excluding maternal genetic effects (Imp), and a full model that also includes maternal genetic effects [Imp (mat)]. Error bars indicate standard errors. For the Imp (mat) model, the standard errors could not be estimated for all components. classical gametic relationship matrix itself (dimensions of 2t Â 2t) when K 0 is an identity matrix. The other limiting case is Therefore, the reduction in the number of equations for genetic effects can range from 0% to 50% compared with a classical gametic model. However, the actual reduction depends on the specific details of each data set. Furthermore, a specific fraction of individuals with records (i.e., with phenotypes) might have not reproduced at all or not yet reproduced at the time of data recall, i.e., they appear as final progeny, which allows them to be represented by reduced observational equations rather than having their own gametic effects in the model. This may result in a reduction of even more than 50% of the full set of gametic equations. In general, sex-specific traits such as the litter size, number of eggs, or milk yield allow all males to be represented by their transmitting abilities. Thus, the resulting number of equations is considerably smaller compared with a trait commonly recorded in both sexes. The family structure also has an effect, where it is possible to save more equations with smaller groups of paternal offspring if sires without their phenotype have transmitting abilities as genetic effects. Furthermore, in imprinting analyses, it is reasonable to add a high ratio of ancestors without phenotypes to better reflect inbreeding and the relationships between genetic effects as sire and dam. Thus, we can minimize the burden of additional equations by estimating the transmitting abilities of these ancestors rather than their gametic effects.
In certain cases, we could refrain from using reduced observational equations, which have the advantage that no weights are required that depend on currently undetermined components of the variance. This approach may be beneficial in Bayesian approaches that employ Markov Chain Monte Carlo methods, where the values of the components of variance change from iteration to iteration. By exploiting the flexibility of the generalized approach, weights become obsolete by representing all individuals with phenotypes by two gametic effects, regardless of whether they are final progeny. This helps to offset the computational burden due to repeated reweighting. In addition, we can integrate individuals without observations by single equations. In previous REML estimations of the components of variance with reduced imprinting models (Neugebauer et al. 2010a(Neugebauer et al. , 2010bBlunk et al. 2017aBlunk et al. , 2017b, equal weights were used at the start and later recalculated repeatedly until final convergence was reached. For the mouse example data, this led to an REML loglikelihood that was slightly below the actual maximum (Table 2)  Excluding this difficulty finding the maximum, all of the REML log-likelihood values were identical (Table 2) if we assumed the same underlying genetic model, irrespective of the equivalent representations employed for the gametic relationships. The same was clearly true for all of the estimated variance components (Supplementary Table S3.2). Thus, our mouse example data analyses technically reproduced all of the theoretically derived model equivalences.
Among the DUKs mouse data, none of the dams had a trait value, which can hinder the estimation of variance components in the presence of maternal genetic effects (Heydarpour et al. 2008), and it probably explains the observed slow convergence rate for the full model with three genetic effects. An imprinting model variant with reduced observation equations (IRM) including maternal genetic effects was not applied to the mouse data set because the reduced imprinting model cannot separate the imprinting variance from maternal genetic variance components when both are present, as mentioned in the introduction and proven in Appendix A4. A solution to this issue involves avoiding reduced equations and explicitly representing individuals with phenotypes based on their gametic effects in a model that also includes maternal genetic effects. In principle, it should then be possible to separate the gametic variances as sire and dam from the maternal genetic variance (Appendix A5). However, in practice, limitations in the amount and structure of the data may hinder this approach, as also observed in a human genetic epidemiological study (Blunk et al. 2020) and for Mendelian models (Heydarpour et al. 2008). Similar to maternal effects models, other types of imprinting models may also include more than a single genetic effect as sire and dam per individual, e.g., random regression models or multitrait models. However, they are all hindered by a large number of gametic equations, and thus they benefit even more from generalized relationships.
No significant imprinting effects were found to affect the variations in body mass in the DUKs mouse line, although imprinting is known to have important functions in stem cells, neuronal differentiation, and growth (Plasschaert and Bartolomei 2014). By contrast, maternal effects seemed to play an important role, which was expected because maternal genetic effects on body mass traits in mice have been known for a long time (Hanrahan and Eisen 1973). When maternal effects were neglected in the analysis, direct heritability seemed to be overestimated, as shown in Figure 1 for the Mendelian and the imprinting models. Furthermore, when maternal effects were not considered, the imprinting variance was inflated and the estimates were reduced when maternal effects were included in the model (Figure 1). The litter variance seemed to be largely unaffected by the inclusion of POEs and it explained about a quarter of the phenotypic variance (21-27%).
In applications where all v individuals with phenotypes plus at least one preceding generation have measured genotypes and the variance components need be estimated, it is sufficient to include only the subset of these v individuals with their genomic covariance G gvv . If the genetic effects of the u founders as sire and dam are of interest, then either G g or G g is selected. An example is an F 2 line-cross experiment with phenotypes recorded only in the F 2 generation, where the genotypes of the F 1 and P 0 generations are only required for phasing and determining the line origins of the markers.
To assess genomic imprinting effects, a two-step approach was developed for the reduced imprinting model. In the first step, imprinting effects (differences between transmitting abilities as sires and dams) must be estimated for parents in pedigree-based analyses. After de-regressing these estimates and using the corresponding reliabilities, they can be employed as dependent variables in a genome-wide association study (Blunk et al. 2019) where the marker genotypes of the parents can be unordered. Further investigations are required to determine whether we can extend this two-step approach to generalized gametic relationships, which would require de-regressing the differences between genetic effects of all types (both gametic effects and transmitting abilities as sire and dam) in a similar manner, before they are subjected to a genome scan.
Animal breeders frequently combine large pedigrees comprising smaller cohorts of genotyped individuals. Certain individuals are then the first to be genotyped in their genealogy and one can trace their pedigree further back. In contrast to their own descendants, it is not possible to order the haplotypes of these candidates, and thus, it is uncertain whether the first of two unordered marker haplotypes matches the paternal gametic effect in a pedigree-derived gametic relationship matrix or the maternal one. Consequently, a combined relationship matrix cannot be constructed that is suitable for parent-of-origin analyses. We can solve this problem by collapsing gametic effects into transmitting abilities in the genomic relationships as well as in the pedigree-derived relationships. The generalized pedigree relationships for all animals can then be combined with their matching generalized genomic counterparts G g for the genotyped cohort in a manner that facilitates integration of unordered genomic information. The available theory (Legarra et al. 2009;Christensen and Lund 2010;Aguilar et al. 2010) can be used to combine pedigreederived relationships (G) and genomic relationships (G g ) into a joint matrix, at least in the many cases where candidates with unordered genotypes have no records, such as dairy bulls.
In conclusion, the generalized gametic relationship matrix provides the necessary flexibility to adapt imprinting analyses to specific computational and analytical requirements in many situations by using tailored versions of the general imprinting model. The most important features of this method are the effective estimation of the imprinting variance in REML and Bayesian approaches in case where the parents have records, the inclusion of maternal genetic effects, and genomic relationships that integrate ordered and unordered genomic information. Overall, these new options are expected to stimulate systematic research into the importance of POEs for the genetic variation in quantitative traits in farm animals and other species.
The first term can be rewritten as Finally, In the generalized case, the variance of observations is In the same manner, We then obtain: From Q g ¼ Q c , it follows that VarðyÞ is the same in both models and they are equivalent.