This function fits genetic competition models using asreml::asreml().
asr(
prep.out,
fixed,
random = ~1,
spatial = TRUE,
cor = TRUE,
lrtest = FALSE,
...
)
prep.out: A comprep object.
fixed: A formula object specifying the fixed terms in the model, with the
response on the left of a ~ operator, and the terms,
separated by + operators, on the right. If data is given,
all names used in all formulae should appear in the data frame. A
model with the intercept as the only fixed effect can be specified
as ~ 1; there must be at least one fixed effect
specified. If the response (y) evaluates to a matrix then a
factor trait with levels dimnames(y)[[2]] is added to
the model frame, and must be explicitly included in the model
formulae (default = NULL).
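The requirement that every name used in the formulae appears in the data frame can be checked before fitting. The following is a minimal base R sketch; the data frame `dat` and its values are hypothetical stand-ins and are not part of the package.

```r
# Hypothetical data frame, used only to illustrate the check
dat <- data.frame(
  yield = c(10.2, 11.5, 9.8, 12.1),
  matur = factor(c("early", "late", "early", "late")),
  rep   = factor(c(1, 1, 2, 2))
)

fixed <- yield ~ matur + rep

# all.vars() returns every variable name used in the formula, response included
vars_used <- all.vars(fixed)

# Names used in the formula that are absent from the data frame
missing_vars <- setdiff(vars_used, names(dat))
missing_vars
```

An empty `missing_vars` means that all names in the formula are present in the data.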
random: The function fits the genetic competition effects in the random term by
default. To include further random effects, pass a formula object specifying
them to random; otherwise, leave random = ~1 (default). This argument has
the same general characteristics as fixed, but there can be no left side to
the ~ operator. Variance structures imposed on random terms are specified using
special model functions. See asreml::asreml() for details.
spatial: A logical value. If TRUE (default), fits a spatial-genetic competition model.
cor: A logical value. If TRUE (default), fits a model considering the correlation
between direct and indirect genetic effects.
lrtest: A logical value. If TRUE, performs a likelihood ratio test to verify
the significance of the direct and indirect genetic effects. Defaults to FALSE.
...: Arguments passed on to asreml::asreml
sparse: A formula object, specifying the fixed effects for which the full
variance-covariance matrix is not required. This argument has the
same general characteristics as fixed, but there can be no
left side to the ~ expression. Wald statistics are not
available for sparse fixed terms in order to reduce the computing
load (default = ~NULL).
G.param: Either:
A list object derived from the
random formula, holding initial parameter estimates and
boundary constraints for each term, or
A character string naming a comma-delimited file with a
header line and three columns for the variance component name,
initial value and constraint code, respectively. This file can be
created using the start.values = TRUE argument; the internal list
object is then generated from the contents of this file.
On termination, G.param is updated with the final
random component estimates.
R.param: Either:
A list object derived from the random formula, holding
initial parameter estimates and boundary constraints, or
A character string naming a comma-delimited file with a
header line and three columns for the variance component name,
initial value and constraint code, respectively. This file can be
created using the start.values = TRUE argument; the internal list
object is then generated from the contents of this file.
On termination, R.param is updated with the final
residual component estimates.
na.action: A call to na.method() specifying the action to be taken when
missing values are encountered in the response (y) or
explanatory variables (x for factors and/or variates). The function definition for
na.method is:
function(y = c("include", "omit", "fail"),
x = c("fail", "include", "omit"))
The default action is to include (and estimate) missing values in
the response ("include"), and to raise an error ("fail") if there are
missing values in any of the explanatory variables.
subset: A logical vector identifying which subset of the rows of data
should be used in the fit. All observations are included by
default.
weights: A character string or name identifying the column of data to
use as weights in the fit.
predict: A list object specifying the classifying factors and related
options when forming predictions from the model. This list would
normally be the value returned by a call to the method
predict for asreml objects.
vcm: A matrix defining relationships among variance parameters. The
matrix has a row for each original variance parameter and a column
for each new parameter. The default is the identity matrix, that
is, no action. See function vcm.lm for further information
and an example.
vcc: Equality constraints between variance parameters; a two-column
numeric matrix with a dimnames attribute. The first column
defines the grouping structure of equated components, that
is, components within an equality group are given the same
numeric index, and the second column contains the scaling
coefficients. The dimnames()[[1]] attribute must match the
component names in the asreml parameter vector; see
start.values.
The parameters are scaled relative to the first parameter in its group, so the scaling of the first parameter in each group is one.
For example, the following vcc matrix:
| 1 | 1 |
| 2 | 1 |
| 2 | 2 |
| 3 | 1 |
is equivalent to the vcm matrix:
| 1 | 0 | 0 |
| 0 | 1 | 0 |
| 0 | 2 | 0 |
| 0 | 0 | 1 |
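The equivalence between the two matrices above can be reproduced in base R. This is a minimal sketch of the mapping as described here, not asreml's own code; the helper `vcc_to_vcm()` and the parameter names `vp1` to `vp4` are hypothetical.

```r
# Hypothetical helper: expand a two-column vcc matrix (group index, scaling
# coefficient) into the equivalent vcm matrix with one column per group
vcc_to_vcm <- function(vcc) {
  groups <- vcc[, 1]
  scales <- vcc[, 2]
  group_ids <- unique(groups)
  vcm <- matrix(0, nrow = nrow(vcc), ncol = length(group_ids),
                dimnames = list(rownames(vcc), NULL))
  # Row i receives its scaling coefficient in the column of its group
  vcm[cbind(seq_len(nrow(vcc)), match(groups, group_ids))] <- scales
  vcm
}

# The vcc matrix from the example above; row names are placeholders
vcc <- matrix(c(1, 2, 2, 3,
                1, 1, 2, 1),
              ncol = 2,
              dimnames = list(c("vp1", "vp2", "vp3", "vp4"),
                              c("group", "scale")))

vcm <- vcc_to_vcm(vcc)
vcm
```

Each row of `vcm` corresponds to an original variance parameter and each column to a new parameter, so the second and third parameters share a column with coefficients 1 and 2.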
family: A list of functions and expressions for defining the link and variance functions.
Supported families are: gaussian, inverse Gaussian,
binomial, negative binomial, poisson and
Gamma. Family objects are generated
from the asreml family distributions which
prefix the usual function names with "asr_"; for example
asr_gaussian(), asr_binomial(), etc. In addition to
the link argument, these functions take an additional
dispersion argument and a total argument where
relevant; for example:
asr_binomial(dispersion = 1.0, total = counts).
The default for asr_gaussian() is dispersion = NA,
which implies that asreml will estimate the dispersion
parameter, otherwise the scale is fixed at the nominated value.
asmv: A character string or name specifying the column in the data that
identifies the traits in a multivariate analysis. If not
NULL, asmv implies that the data for a multivariate
analysis is set up as though it were for a univariate analysis with
the response in a single variate (default = NULL).
mbf: A named list specifying sets of covariates to be included with one
or more mbf() model functions. Each component of the list
must in turn contain components named key and cov,
where cov is a character string naming the data frame
holding the covariates, and key is a character vector of
length two naming the columns in data and cov,
respectively, used to match corresponding records in the two data
frames. The default is an empty list.
equate.levels: A character vector of factor names whose levels are to be
equated. For example, if factor A has levels a, b, c, d and
factor B has levels a, b, c, e, the effect of
equate.levels(A, B) is that both A and B have
five levels, with as.numeric(A) = 1, 2, 3, 4 and
as.numeric(B) = 1, 2, 3, 5. This may be necessary if
using the and() model function to overlay columns of the
model's design matrix in forming a compound term. The default is a
zero-length character vector.
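The effect described for equate.levels can be mimicked outside asreml by giving both factors the union of their levels. This is a base R sketch of the described outcome only; it makes no claim about how asreml implements it internally.

```r
# Factors from the example above
A <- factor(c("a", "b", "c", "d"))
B <- factor(c("a", "b", "c", "e"))

# Shared level set: the union of both factors' levels, in order of appearance
shared <- union(levels(A), levels(B))

A_eq <- factor(A, levels = shared)
B_eq <- factor(B, levels = shared)

nlevels(A_eq)     # both factors now have five levels
as.numeric(A_eq)  # 1 2 3 4
as.numeric(B_eq)  # 1 2 3 5
```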
start.values: If TRUE, asreml exits prior to the fitting process
and returns a list of length three: the G.param and
R.param lists, and a data frame (containing variance
parameter names, initial values and boundary constraints). Initial
values or constraints can then be set in the list or data frame
objects (default = FALSE).
If this is a character string, then a file of that name is created and the
data frame object containing initial parameter values is written
out in comma-separated form. This file can be edited externally and
subsequently specified in the G.param or R.param
arguments.
knot.points: A named list where each component is a vector of user-supplied knot
points for a particular spline term; the component name is the
object of the spl() model function.
pwr.points: A named list with each component containing a vector of distances
to be used in a one-dimensional power model. The component
names must correspond to the object arguments of the power
function model terms.
wald: A named list with three components: denDF, ssType
and Ftest.
denDF: A character string from the options: "none",
"numeric", "algebraic", and "default" specifying
the calculation of approximate denominator degrees of freedom. The
option "none" is to suppress the computations. Algebraic
computations are not feasible in large analyses; use
"default" to automatically choose numeric or algebraic
computations depending on problem size. The denominator degrees of
freedom are calculated according to Kenward and Roger (1997)
for terms in the fixed model formula (default = "none").
ssType: Either "incremental" for incremental sums of
squares or "conditional" for F-tests that
respect both structural and intrinsic marginality (default = "incremental").
Ftest: A one-sided formula of the form ~ test_term |
background_terms specifying a conditional Wald test of the
contribution of test_term conditional on those fixed terms
listed in background_terms, and those in the
random and sparse model formulae.
prune: A named list with each component generated from a call to
Subset(). The argument prune, in conjunction with Subset
and the model function sbs(), forms a new factor from an
existing one by selecting a subset of its levels. The function
Subset is defined as:
function(f, x)
where f is the name of an existing factor and x is a
character or numeric vector of levels to select. The name of the
list component is the new factor that may appear in the model
formulae as the argument to the sbs() model function. For
example,
prune = list(A = Subset(Site, c(2, 3)))
creates a new factor A by selecting the second and third
levels of Site, and would be included in the model as:
sbs(A), for example by using idv(sbs(A)) as part of a random term.
While the actions of prune can be duplicated
outside asreml, sbs() is necessary if the
asreml method predict() is to be used.
combine: A named list with each component generated from a call to
Levels(). The argument combine, in conjunction with Levels
and the model function gpf(), forms a new factor from an
existing one by merging a subset of its levels. The function
Levels is defined as:
function(f, x)
where f is the name of an existing factor and x is a
vector of length length(levels(f)) defining the levels
of f to merge. The name of the list component is the new
factor that may appear in the model formulae as the argument to the
gpf() model function. For example, if Site has levels
"1", "2" and "3",
combine = list(A = Levels(Site, c("1", "2", "1")))
creates a new factor A with levels "1" and "2"
by merging levels "1" and "3" of Site, and
would be included in the model as gpf(A). While the actions
of combine can be duplicated outside asreml,
gpf() is necessary if the asreml method
predict() is to be used.
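The merge described for combine can be duplicated outside asreml with a lookup vector indexed by level. This is a base R sketch of that idea with hypothetical data; it does not call Levels() or gpf().

```r
# Hypothetical Site factor with levels "1", "2" and "3"
Site <- factor(c("1", "2", "3", "3", "1", "2"))

# One entry per level of Site, in level order: level "3" is mapped onto "1"
merge_map <- c("1", "2", "1")

# as.integer(Site) gives the level index of each record
A <- factor(merge_map[as.integer(Site)])

levels(A)
table(Site, A)
```

The cross-tabulation shows levels "1" and "3" of Site falling into level "1" of the new factor.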
uid: A named list with each component generated from a call to
Units(). The argument uid, in conjunction with Units and
the model function uni(), forms a new factor by selecting a
subset of records from an existing one. The function Units is
defined as:
function(f, n = 0)
where f is the name of an existing factor and n is a
character or numeric scalar that determines which records are
selected. The default, n = 0, forms a factor with a level for
each record where f is non-zero (strictly, f != 0).
Otherwise, a factor with a level for each record in
data where f has the value n is formed. For example,
uid = list(A = Units(group, 1))
creates a new factor A with levels from
row.names(data) where group = 1, and would be
included in the model as: uni(A). While the actions of
uid can be duplicated outside asreml, uni() is
necessary if the asreml method predict() is to be
used.
mef: A named list linking a relationship matrix (or its inverse) as
specified in the vm() special function with the original
matrix of subject x regressor (typically molecular
marker) scores. If this is not an empty list, mef
flags the computation of the regressor (marker) effects from
the subject effects. For example,
mef = list(MM = snp.mat)
links the relationship matrix MM to the original marker
scores found in the file snp.mat.
The mef list would typically be set from a call to the asreml
meff() method.
last: A named list restricting the order in which equations are solved in the sparse
partition for the nominated model terms. Each component of the list is
named by a model term and contains a scalar \(n\) specifying that
the first \(n\) levels of the term will be solved after all others in the
sparse set. It is intended for use when there are multiple fixed terms
in the sparse equations so that asreml will be consistent in
which effects are identified as singular. A maximum of three factor/level
pairs can be specified.
model.frame: If TRUE, the model frame (a data.table
object with additional attributes derived from the model
specification) is included in the returned object (default = TRUE).
The model frame is required by the asreml summary,
plot, resid and fitted methods.
In large analyses, the model frame is likely to be a large
object. If model.frame is a character string, the model
frame is saved in a file as an RDS object by a call to
saveRDS(), and named by the supplied string with the
extension .RDS. If the model frame is not included in the
returned asreml object, this RDS file is searched for
by the methods noted previously.
An object of class asreml containing the results of the fitted linear model.
Generic methods such as plot(), predict() and summary() return
various derived results of the fit. The methods resid(), coef() and fitted()
extract some of its components. See asreml::asreml.object() for details of the
components of the returned list.
A general genetic competition linear mixed model can be represented by:
$$\mathbf{y} = \mathbf{X} \boldsymbol{\beta} + \mathbf{Z}_g \mathbf{g} + \mathbf{Z}_c \mathbf{c} + \mathbf{Z}_p \mathbf{p} + \boldsymbol{\varepsilon}$$
where \(\mathbf{y}\) is the vector of phenotypic records, \(\boldsymbol{\beta}\)
is the vector of fixed effects, \(\mathbf{g}\) is the vector of direct genetic
effects (DGE), \(\mathbf{c}\) is the vector of indirect genetic effects (IGE),
\(\mathbf{p}\) is the vector of other random effects,
and \(\boldsymbol{\varepsilon}\) is the vector of errors.
\(\mathbf{X}\) is the incidence matrix of the fixed effects, \(\mathbf{Z}_g\)
is the DGE incidence matrix, \(\mathbf{Z}_c\) is the IGE incidence matrix (the genetic competition matrix),
and \(\mathbf{Z}_p\) is the design matrix of
other random effects. The dimensions of \(\mathbf{Z}_c\) are the same
as \(\mathbf{Z}_g\). If spatial = TRUE, \(\boldsymbol{\varepsilon}\) is a vector of spatially correlated errors, distributed as
\(\boldsymbol{\varepsilon} \sim N\{\mathbf{0}, \sigma^2_\varepsilon[\mathbf{AR1}(\rho_C) \otimes \mathbf{AR1}(\rho_R)]\}\),
where \(\sigma^2_\varepsilon\) is the spatially correlated residual variance,
\(\mathbf{AR1}(\rho_C)\) and \(\mathbf{AR1}(\rho_R)\) are the first-order autoregressive
correlation matrices in the column and row directions, and \(\otimes\) is the
Kronecker product. If cor = TRUE, the function will fit a model in which
\(\mathbf{g}\) and \(\mathbf{c}\) are correlated outcomes of the decomposition of the
genotypic effects. Both follow a zero-mean Gaussian distribution, with
covariance given by:
$$\boldsymbol{\Sigma}_g = \begin{bmatrix}\sigma_{\text{g}}^2 & \sigma_{\text{gc}}\\\sigma_{\text{gc}} & \sigma_{\text{c}}^2\end{bmatrix}\otimes \mathbf{I}_V$$
where \(\sigma_{\text{g}}^2\) is the DGE variance, \(\sigma_{\text{c}}^2\) is the IGE variance, and \(\sigma_{\text{gc}}\) is the covariance between DGE and IGE.
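The two covariance structures above can be built explicitly with Kronecker products. The following base R sketch uses hypothetical dimensions and parameter values chosen only for illustration, and it treats \(V\) as the dimension of the identity matrix without assuming more about it; it shows the algebra, not how asreml stores these matrices.

```r
# First-order autoregressive correlation matrix: entry (i, j) is rho^|i - j|
ar1 <- function(n, rho) {
  rho^abs(outer(seq_len(n), seq_len(n), "-"))
}

# Hypothetical field layout and residual parameters
n_col <- 3
n_row <- 2
rho_c <- 0.5
rho_r <- 0.3
sigma2_e <- 2

# Residual covariance: sigma2_e * [AR1(rho_C) (x) AR1(rho_R)]
R <- sigma2_e * kronecker(ar1(n_col, rho_c), ar1(n_row, rho_r))

# Hypothetical DGE/IGE variance components
sigma2_g <- 1.0
sigma2_c <- 0.25
sigma_gc <- -0.3
V <- 4

# Covariance of the stacked vector (g, c): 2 x 2 matrix (x) I_V
Sigma <- matrix(c(sigma2_g, sigma_gc,
                  sigma_gc, sigma2_c), nrow = 2)
G <- kronecker(Sigma, diag(V))

dim(R)  # (n_col * n_row) square
dim(G)  # (2 * V) square
```

With this ordering of the Kronecker product, rows vary fastest within columns in `R`, and the off-diagonal block of `G` carries the DGE-IGE covariance on its diagonal.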
The likelihood ratio test is performed using a model without the correlation between DGE and IGE.
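As a rough check, the LRT statistics printed in the example output of this page can be recomputed by hand from the final log-likelihoods of the last three fits. The sketch below rests on assumptions inferred from those numbers rather than taken from the package source: the fit ending at -77.06730 is the reference model without the DGE-IGE correlation, the fits ending at -97.87936 and -84.55990 drop DGE and IGE respectively, and the p-value is half the upper tail of a chi-squared distribution with one degree of freedom.

```r
# Final log-likelihoods copied from the example output on this page
ll_ref    <- -77.06730  # assumed: DGE and IGE fitted, no correlation
ll_no_dge <- -97.87936  # assumed: model without DGE
ll_no_ige <- -84.55990  # assumed: model without IGE

# Likelihood ratio statistic: twice the gain in log-likelihood
lr <- 2 * (ll_ref - c(DGE = ll_no_dge, IGE = ll_no_ige))

# Assumed p-value: halved upper tail of chi-squared(1), as is common when
# testing a variance component on the boundary of the parameter space
p <- 0.5 * pchisq(lr, df = 1, lower.tail = FALSE)

data.frame(effect = names(lr), LR = unname(lr), p = unname(p))
```

Under these assumptions the recomputed values agree with the printed LRT table to rounding error.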
# \donttest{
library(gencomp)
comps <- prepcrop(data = potato, gen = "gen", row = "row", col = "col",
                  plt = NULL, effs = c("rep", "matur"), trait = "yield",
                  direction = "row", verbose = TRUE)
#> Running through the grid --> ==========================================================================|100%
mod <- asr(prep.out = comps, fixed = yield ~ matur + rep,
           random = ~1, spatial = FALSE, cor = TRUE, lrtest = TRUE)
#> ASReml Version 4.2 21/01/2025 18:21:17
#> LogLik Sigma2 DF wall
#> 1 -93.05792 2.950998 74 18:21:17 ( 1 restrained)
#> 2 -85.00098 1.930523 74 18:21:17 ( 1 restrained)
#> 3 -78.29023 1.322943 74 18:21:17
#> 4 -75.33534 1.113123 74 18:21:17
#> 5 -74.42077 1.018644 74 18:21:17
#> 6 -74.33504 0.9905424 74 18:21:17
#> 7 -74.33169 0.9874756 74 18:21:17
#> ====> Starting likelihood ratio tests
#> ASReml Version 4.2 21/01/2025 18:21:17
#> LogLik Sigma2 DF wall
#> 1 -92.62951 2.913895 74 18:21:17
#> 2 -85.48394 1.954790 74 18:21:17
#> 3 -80.02454 1.331554 74 18:21:17
#> 4 -77.70891 1.058111 74 18:21:17
#> 5 -77.09890 0.9462433 74 18:21:17
#> 6 -77.06740 0.9251240 74 18:21:17
#> 7 -77.06730 0.9236241 74 18:21:17
#> ASReml Version 4.2 21/01/2025 18:21:17
#> LogLik Sigma2 DF wall
#> 1 -97.99999 3.616653 74 18:21:17
#> 2 -97.92777 3.668674 74 18:21:17
#> 3 -97.88110 3.784983 74 18:21:17
#> 4 -97.87936 3.764715 74 18:21:17
#> ASReml Version 4.2 21/01/2025 18:21:17
#> LogLik Sigma2 DF wall
#> 1 -93.69185 3.406970 74 18:21:17
#> 2 -89.60816 2.803129 74 18:21:17
#> 3 -86.30688 2.311391 74 18:21:17
#> 4 -84.89770 2.039909 74 18:21:17
#> 5 -84.57068 1.906676 74 18:21:17
#> 6 -84.55990 1.882292 74 18:21:17
#> ====> LRT results:
#> effect LR-statistic Pr(Chisq)
#> 1 DGE 41.62410 5.531053e-11
#> 2 IGE 14.98519 5.417914e-05
# }