Title: | Forensic Pedigree Analysis and Relatedness Inference |
---|---|
Description: | Forensic applications of pedigree analysis, including likelihood ratios for relationship testing, general relatedness inference, marker simulation, and power analysis. 'forrel' is part of the 'pedsuite', a collection of packages for pedigree analysis, further described in the book 'Pedigree Analysis in R' (Vigeland, 2021, ISBN:9780128244302). Several functions deal specifically with power analysis in missing person cases, implementing methods described in Vigeland et al. (2020) <doi:10.1016/j.fsigen.2020.102376>. Data import from the 'Familias' software (Egeland et al. (2000) <doi:10.1016/S0379-0738(00)00147-X>) is supported through the 'pedFamilias' package. |
Authors: | Magnus Dehli Vigeland [aut, cre]
|
Maintainer: | Magnus Dehli Vigeland <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.7.1.9000 |
Built: | 2025-02-11 18:28:08 UTC |
Source: | https://github.com/magnusdv/forrel |
The checkPairwise()
function provides a convenient way to check for
pedigree errors, given the available marker data. The function calls
ibdEstimate()
to estimate IBD coefficients for all pairs of typed pedigree
members, and uses the estimates to test for potential errors. By default, the
results are shown in a colour-coded plot (based on ribd::ibdTriangle()
)
where unlikely relationships are easy to spot.
checkPairwise( x, ids = typedMembers(x), includeInbred = FALSE, acrossComps = TRUE, plotType = c("base", "ggplot2", "plotly", "none"), GLRthreshold = 1000, pvalThreshold = NULL, nsim = 0, seed = NULL, plot = TRUE, verbose = TRUE, excludeInbred = NULL, ... ) plotCP( cpRes = NULL, plotType = c("base", "ggplot2", "plotly"), labels = FALSE, errtxt = "Potential error", seed = NULL, ... )
checkPairwise( x, ids = typedMembers(x), includeInbred = FALSE, acrossComps = TRUE, plotType = c("base", "ggplot2", "plotly", "none"), GLRthreshold = 1000, pvalThreshold = NULL, nsim = 0, seed = NULL, plot = TRUE, verbose = TRUE, excludeInbred = NULL, ... ) plotCP( cpRes = NULL, plotType = c("base", "ggplot2", "plotly"), labels = FALSE, errtxt = "Potential error", seed = NULL, ... )
x |
A |
ids |
A vector of ID labels; the individuals to include in the check.
Default: All typed members of |
includeInbred |
A logical, by default FALSE, indicating if inbred individuals should be excluded from the analysis. |
acrossComps |
A logical indicating if pairs of individuals in different components should be considered. Default: TRUE. |
plotType |
Either "base" (default), "ggplot2", "plotly" or "none". Abbreviations are allowed. |
GLRthreshold |
A positive number, by default 1000. Threshold for the generalised likelihood ratio (see Details). Scores exceeding this are flagged as potential errors in the output table and encircled in the plot. |
pvalThreshold |
A positive number, or NULL (default). If given, this is
used instead of |
nsim |
A nonnegative number; the number of simulations used to estimate p-values. If 0 (default), this step is skipped. |
seed |
An integer seed for the random number generator (optional, and
only relevant if |
plot |
Deprecated. To suppress the triangle plot, use |
verbose |
A logical. |
excludeInbred |
Deprecated; renamed to ´includeInbred´. |
... |
Further parameters passed on to |
cpRes |
A data frame: the output from |
labels |
A logical (default: FALSE). If TRUE, labels are included in the IBD triangle plot. |
errtxt |
A character string to use for the error legend. |
To identify potential pedigree errors, the function calculates the
generalised likelihood ratio (GLR) of each pairwise relationship.
This compares the likelihood of the estimated coefficients with that of the
coefficients implied by the pedigree. By default, relationships whose GLR
exceed 1000 are flagged as errors and shown with a circle in the plot.
Alternatively, if arguments nsim
and pvalThreshold
are supplied, the
p-value of each score is estimated by simulation, and used as threshold for
calling errors.
By default, inbred individuals are excluded from the analysis, since pairwise
relationships involving inbred individuals have undefined kappa coefficients
(and therefore no position in the triangle). In some cases it may still be
informative to include their estimates; set includeInbred = TRUE
to
enforce this.
If plotType
is "none" or "base": A data frame containing both the
estimated and pedigree-based IBD coefficients for each pair of typed
individuals. The last columns (GLR
, pval
and err
) contain test
results using the GLR scores to identify potential pedigree errors.
If plotType
is "ggplot2" or "plotly", the plot objects are returned.
### Example with realistic data x = avuncularPed() |> profileSim(markers = NorwegianFrequencies, seed = 1729) checkPairwise(x) ### Create an error: sample swap 1 <-> 3 als = getAlleles(x) als[c(1,3), ] = als[c(3,1), ] y = setAlleles(x, alleles = als) checkPairwise(y) # Using p-values instead of GLR nsim = 10 # increase! checkPairwise(y, nsim = nsim, pvalThreshold = 0.05) # Plot can be done separately res = checkPairwise(y, nsim = nsim, pvalThreshold = 0.05, plotType = "none") plotCP(res, plotType = "base", errtxt = "Not good!") # Combined plot of pedigree and check results dev.new(height = 5, width = 8, noRStudioGD = TRUE) layout(rbind(1:2), widths = 2:3) plot(y, margins = 2, title = "Swapped 1 - 3") plotCP(res, labels = TRUE)
### Example with realistic data x = avuncularPed() |> profileSim(markers = NorwegianFrequencies, seed = 1729) checkPairwise(x) ### Create an error: sample swap 1 <-> 3 als = getAlleles(x) als[c(1,3), ] = als[c(3,1), ] y = setAlleles(x, alleles = als) checkPairwise(y) # Using p-values instead of GLR nsim = 10 # increase! checkPairwise(y, nsim = nsim, pvalThreshold = 0.05) # Plot can be done separately res = checkPairwise(y, nsim = nsim, pvalThreshold = 0.05, plotType = "none") plotCP(res, plotType = "base", errtxt = "Not good!") # Combined plot of pedigree and check results dev.new(height = 5, width = 8, noRStudioGD = TRUE) layout(rbind(1:2), widths = 2:3) plot(y, margins = 2, title = "Swapped 1 - 3") plotCP(res, labels = TRUE)
Computes the power (of a single marker, or for a collection of markers) of excluding a claimed relationship, given the true relationship.
exclusionPower( claimPed, truePed, ids, markers = NULL, source = "claim", disableMutations = NA, exactMaxL = Inf, nsim = 1000, seed = NULL, alleles = NULL, afreq = NULL, knownGenotypes = NULL, Xchrom = FALSE, plot = FALSE, plotMarkers = NULL, verbose = TRUE )
exclusionPower( claimPed, truePed, ids, markers = NULL, source = "claim", disableMutations = NA, exactMaxL = Inf, nsim = 1000, seed = NULL, alleles = NULL, afreq = NULL, knownGenotypes = NULL, Xchrom = FALSE, plot = FALSE, plotMarkers = NULL, verbose = TRUE )
claimPed |
A |
truePed |
A |
ids |
Individuals available for genotyping. |
markers |
A vector indicating the names or indices of markers attached
to the source pedigree. If NULL (default), then all markers attached to the
source pedigree are used. If |
source |
Either "claim" (default) or "true", deciding which pedigree is used as source for marker data. |
disableMutations |
This parameter determines how mutation models are treated. Possible values are as follows:
|
exactMaxL |
A positive integer, or |
nsim |
A positive integer; the number of simulations used for markers
whose number of alleles exceeds |
seed |
An integer seed for the random number generator (optional). |
alleles , afreq , Xchrom
|
If these are given, they are used (together with
|
knownGenotypes |
A list of triplets |
plot |
Either a logical or the character "plotOnly". If the latter, a plot is drawn, but no further computations are done. |
plotMarkers |
A vector of marker names or indices whose genotypes are to be included in the plot. |
verbose |
A logical. |
This function implements the formula for exclusion power as defined and discussed in (Egeland et al., 2014).
It should be noted that claimPed
and truePed
may be any (lists of)
pedigrees, as long as they both contain the individuals specified by ids
.
In particular, either alternative may have inbred founders (with the same or
different coefficients), but this must be set individually for each.
If plot = "plotOnly"
, the function returns NULL after producing the
plot.
Otherwise, the function returns an EPresult
object, which is essentially
a list with the following entries:
EPperMarker
: A numeric vector containing the exclusion power of each
marker. If the known genotypes of a marker are incompatible with the true
pedigree, the corresponding entry is NA
.
EPtotal
: The total exclusion power, computed as 1 - prod(1 - EPperMarker, na.rm = TRUE)
.
expectedMismatch
: The expected number of markers giving exclusion,
computed as sum(EPperMarker, na.rm = TRUE)
.
distribMismatch
: The probability distribution of the number of markers
giving exclusion. This is given as a numeric vector of length n+1
, where
n
is the number of nonzero elements of EPperMarker
. The vector has
names 0:n
.
time
: The total computation time.
params
: A list containing the (processed) parameters ids
, markers
and disableMutations
.
Magnus Dehli Vigeland
T. Egeland, N. Pinto and M.D. Vigeland, A general approach to power calculation for relationship testing. Forensic Science International: Genetics 9 (2014): 186-190. doi:10.1016/j.fsigen.2013.05.001
############################################ ### A standard case paternity case: ### Compute the power of exclusion when the claimed father is in fact ### unrelated to the child. ############################################ # Claim: 'AF' is the father of 'CH' claim = nuclearPed(father = "AF", children = "CH") # Attach two (empty) markers claim = claim |> addMarker(alleles = 1:2) |> addMarker(alleles = 1:3) # Truth: 'AF' and 'CH' are unrelated true = singletons(c("AF", "CH")) # EP when both are available for genotyping exclusionPower(claim, true, ids = c("AF", "CH")) # EP when the child is typed; homozygous 1/1 at both markers claim2 = claim |> setGenotype(marker = 1:2, id = "CH", geno = "1/1") exclusionPower(claim2, true, ids = "AF") ############################################ ### Two females claim to be mother and daughter, but are in reality sisters. ### We compute the power of various markers to reject the claim. ############################################ ids = c("A", "B") claim = nuclearPed(father = "NN", mother = "A", children = "B", sex = 2) true = nuclearPed(children = ids, sex = 2) # SNP with MAF = 0.1: PE1 = exclusionPower(claimPed = claim, truePed = true, ids = ids, alleles = 1:2, afreq = c(0.9, 0.1)) stopifnot(round(PE1$EPtotal, 5) == 0.00405) # Tetra-allelic marker with one major allele: PE2 = exclusionPower(claimPed = claim, truePed = true, ids = ids, alleles = 1:4, afreq = c(0.7, 0.1, 0.1, 0.1)) stopifnot(round(PE2$EPtotal, 5) == 0.03090) ### How does the power change if the true pedigree is inbred? trueLOOP = halfSibPed(sex2 = 2) |> addChildren(4, 5, ids = ids) # SNP with MAF = 0.1: PE3 = exclusionPower(claimPed = claim, truePed = trueLOOP, ids = ids, alleles = 1:2, afreq = c(0.9, 0.1)) # Power almost doubled compared with PE1 stopifnot(round(PE3$EPtotal, 5) == 0.00765)
############################################ ### A standard case paternity case: ### Compute the power of exclusion when the claimed father is in fact ### unrelated to the child. ############################################ # Claim: 'AF' is the father of 'CH' claim = nuclearPed(father = "AF", children = "CH") # Attach two (empty) markers claim = claim |> addMarker(alleles = 1:2) |> addMarker(alleles = 1:3) # Truth: 'AF' and 'CH' are unrelated true = singletons(c("AF", "CH")) # EP when both are available for genotyping exclusionPower(claim, true, ids = c("AF", "CH")) # EP when the child is typed; homozygous 1/1 at both markers claim2 = claim |> setGenotype(marker = 1:2, id = "CH", geno = "1/1") exclusionPower(claim2, true, ids = "AF") ############################################ ### Two females claim to be mother and daughter, but are in reality sisters. ### We compute the power of various markers to reject the claim. ############################################ ids = c("A", "B") claim = nuclearPed(father = "NN", mother = "A", children = "B", sex = 2) true = nuclearPed(children = ids, sex = 2) # SNP with MAF = 0.1: PE1 = exclusionPower(claimPed = claim, truePed = true, ids = ids, alleles = 1:2, afreq = c(0.9, 0.1)) stopifnot(round(PE1$EPtotal, 5) == 0.00405) # Tetra-allelic marker with one major allele: PE2 = exclusionPower(claimPed = claim, truePed = true, ids = ids, alleles = 1:4, afreq = c(0.7, 0.1, 0.1, 0.1)) stopifnot(round(PE2$EPtotal, 5) == 0.03090) ### How does the power change if the true pedigree is inbred? trueLOOP = halfSibPed(sex2 = 2) |> addChildren(4, 5, ids = ids) # SNP with MAF = 0.1: PE3 = exclusionPower(claimPed = claim, truePed = trueLOOP, ids = ids, alleles = 1:2, afreq = c(0.9, 0.1)) # Power almost doubled compared with PE1 stopifnot(round(PE3$EPtotal, 5) == 0.00765)
This function computes the expected LR for a single marker, in a kinship test
comparing two hypothesised relationships between a set of individuals. The
true relationship may differ from both hypotheses. Some individuals may
already be genotyped, while others are available for typing. The
implementation uses oneMarkerDistribution()
to find the joint genotype
distribution for the available individuals, conditional on the known data, in
each pedigree.
expectedLR(numeratorPed, denominatorPed, truePed = numeratorPed, ids, marker)
expectedLR(numeratorPed, denominatorPed, truePed = numeratorPed, ids, marker)
numeratorPed |
A |
denominatorPed |
A |
truePed |
A |
ids |
A vector of ID labels corresponding to untyped pedigree members. (These must be members of all three input pedigrees). |
marker |
either a marker object compatible with |
A positive number.
#--------- # Curious example showing that ELR may decrease # by typing additional reference individuals #--------- # Numerator ped numPed = nuclearPed(father = "fa", mother = "mo", child = "ch") # Denominator ped: fa, mo, ch are unrelated. (Hack!) denomPed = halfSibPed() |> relabel(old = 1:3, new = c("mo", "fa", "ch")) # Scenario 1: Only mother is typed; genotype 1/2 p = 0.9 m1 = marker(numPed, mo = "1/2", afreq = c("1" = p, "2" = 1-p)) expectedLR(numPed, denomPed, ids = "ch", marker = m1) 1/(8*p*(1-p)) + 1/2 # exact formula # Scenario 2: Include father, with genotype 1/1 m2 = m1 genotype(m2, id = "fa") = "1/1" expectedLR(numPed, denomPed, ids = "ch", marker = m2) 1/(8*p*(1-p)) + 1/(4*p^2) # exact formula
#--------- # Curious example showing that ELR may decrease # by typing additional reference individuals #--------- # Numerator ped numPed = nuclearPed(father = "fa", mother = "mo", child = "ch") # Denominator ped: fa, mo, ch are unrelated. (Hack!) denomPed = halfSibPed() |> relabel(old = 1:3, new = c("mo", "fa", "ch")) # Scenario 1: Only mother is typed; genotype 1/2 p = 0.9 m1 = marker(numPed, mo = "1/2", afreq = c("1" = p, "2" = 1-p)) expectedLR(numPed, denomPed, ids = "ch", marker = m1) 1/(8*p*(1-p)) + 1/2 # exact formula # Scenario 2: Include father, with genotype 1/1 m2 = m1 genotype(m2, id = "fa") = "1/1" expectedLR(numPed, denomPed, ids = "ch", marker = m2) 1/(8*p*(1-p)) + 1/(4*p^2) # exact formula
Functions for reading .fam files associated with the Familias software for forensic kinship computations.
Deprecated These functions have been moved to a separate package,
pedFamilias
, and will be removed from forrel
in a future version.
readFam(...)
readFam(...)
... |
Arguments passed on to the respective |
Find markers for which the genotypes of a candidate individual is incompatible with a pedigree
findExclusions(x, id, candidate, removeMut = TRUE)
findExclusions(x, id, candidate, removeMut = TRUE)
x |
A |
id |
A character of length 1; the name of an untyped member of |
candidate |
A singleton pedigree, with genotypes for the same markers as |
removeMut |
A logical. If TRUE (default), all mutations models are stripped. |
A character vector containing the names of incompatible markers.
db = NorwegianFrequencies[1:5] # Pedigree with 3 siblings; simulate data for first two x = nuclearPed(3) |> profileSim(ids = 3:4, markers = db, seed = 1) # Simulate random person poi = singleton("POI") |> profileSim(markers = db, seed = 1) # Identify incompatible markers findExclusions(x, id = 5, candidate = poi) # D21S11 # Inspect plotPedList(list(x, poi), marker = "D21S11", hatched = typedMembers)
db = NorwegianFrequencies[1:5] # Pedigree with 3 siblings; simulate data for first two x = nuclearPed(3) |> profileSim(ids = 3:4, markers = db, seed = 1) # Simulate random person poi = singleton("POI") |> profileSim(markers = db, seed = 1) # Identify incompatible markers findExclusions(x, id = 5, candidate = poi) # D21S11 # Inspect plotPedList(list(x, poi), marker = "D21S11", hatched = typedMembers)
A data frame describing (a subset of) the FORCE panel of SNPs designed for applications in forensic genetics (Tillmar et al., 2021). The subset included here are the SNPs recommended for kinship analysis. As the original publication did not include allele frequencies, these were downloaded from Ensembl via the biomaRt package. 15 markers were removed as frequency information could not be retrieved.
FORCE
FORCE
A data frame with 3915 rows and 6 columns:
CHROM
: Chromosome (1-22)
MARKER
: Marker name (rs number)
MB
: Physical position in megabases (build GRCh38)
A1
: First allele
A2
: Second allele
FREQ1
: Allele frequency of A1
To attach the FORCE markers to a pedigree, use pedtools::setSNPs()
(see
Examples).
Tillmar et al. The FORCE Panel: An All-in-One SNP Marker Set for Confirming Investigative Genetic Genealogy Leads and for General Forensic Applications. Genes. (2021)
x = setSNPs(nuclearPed(), snpData = FORCE) summary(x) getMap(x, markers = 1:5) getFreqDatabase(x, markers = 1:5)
x = setSNPs(nuclearPed(), snpData = FORCE) summary(x) getMap(x, markers = 1:5) getFreqDatabase(x, markers = 1:5)
This function produces (parametric or nonparametric) bootstrap estimates of
the IBD coefficients between two individuals; either the three
-coefficients or the nine condensed identity coefficients
(see
ibdEstimate()
).
ibdBootstrap( x = NULL, ids = NULL, param = NULL, kappa = NULL, delta = NULL, N, method = "parametric", freqList = NULL, plot = TRUE, seed = NULL )
ibdBootstrap( x = NULL, ids = NULL, param = NULL, kappa = NULL, delta = NULL, N, method = "parametric", freqList = NULL, plot = TRUE, seed = NULL )
x |
A |
ids |
A pair of ID labels. |
param |
Either NULL (default), "kappa" or "delta". (See below.) |
kappa , delta
|
Probability vectors of length 3 (kappa) or 9 (delta).
Exactly one of |
N |
The number of simulations. |
method |
Either "parametric" (default) or "nonparametric". Abbreviations are allowed. see Details for more information about each method. |
freqList |
A list of probability vectors: The allele frequencies for each marker. |
plot |
A logical, only relevant for bootstraps of kappa. If TRUE, the bootstrap estimates are plotted in the IBD triangle. |
seed |
An integer seed for the random number generator (optional). |
The parameter method
controls how bootstrap estimates are obtained in each
replication:
"parametric": new profiles for two individuals are simulated from the input coefficients, followed by a re-estimation of the coefficients.
"nonparametric": the original markers are sampled with replacement, before the coefficients are re-estimated.
Note that the pedigree itself does not affect the output of this function;
the role of x
is simply to carry the marker data.
A data frame with N
rows containing the bootstrap estimates. The
last column, dist
, gives the Euclidean distance to the original
coefficients (either specified by the user or estimated from the data),
viewed as a point in R^3 (kappa) or R^9 (delta).
# Frequency list of 15 standard STR markers freqList = NorwegianFrequencies[1:15] # Number of bootstrap simulations (increase!) N = 5 # Bootstrap estimates for kappa of full siblings boot1 = ibdBootstrap(kappa = c(0.25, .5, .25), N = N, freqList = freqList) boot1 # Mean deviation mean(boot1$dist) # Same, but with the 9 identity coefficients. delta = c(0, 0, 0, 0, 0, 0, .25, .5, .25) boot2 = ibdBootstrap(delta = delta, N = N, freqList = freqList) # Mean deviation mean(boot2$dist) #### Non-parametric bootstrap. # Requires `x` and `ids` to be provided x = nuclearPed(2) x = markerSim(x, ids = 3:4, N = 50, alleles = 1:10, seed = 123) bootNP = ibdBootstrap(x, ids = 3:4, param = "kappa", method = "non", N = N) # Parametric bootstrap can also be done with this syntax bootP = ibdBootstrap(x, ids = 3:4, param = "kappa", method = "par", N = N)
# Frequency list of 15 standard STR markers freqList = NorwegianFrequencies[1:15] # Number of bootstrap simulations (increase!) N = 5 # Bootstrap estimates for kappa of full siblings boot1 = ibdBootstrap(kappa = c(0.25, .5, .25), N = N, freqList = freqList) boot1 # Mean deviation mean(boot1$dist) # Same, but with the 9 identity coefficients. delta = c(0, 0, 0, 0, 0, 0, .25, .5, .25) boot2 = ibdBootstrap(delta = delta, N = N, freqList = freqList) # Mean deviation mean(boot2$dist) #### Non-parametric bootstrap. # Requires `x` and `ids` to be provided x = nuclearPed(2) x = markerSim(x, ids = 3:4, N = 50, alleles = 1:10, seed = 123) bootNP = ibdBootstrap(x, ids = 3:4, param = "kappa", method = "non", N = N) # Parametric bootstrap can also be done with this syntax bootP = ibdBootstrap(x, ids = 3:4, param = "kappa", method = "par", N = N)
Estimate the IBD coefficients or the condensed identity coefficients
between a pair (or several pairs)
of pedigree members, using maximum likelihood methods. Estimates of
may be visualised with
showInTriangle()
.
ibdEstimate( x, ids = typedMembers(x), param = c("kappa", "delta"), acrossComps = TRUE, markers = NULL, start = NULL, tol = sqrt(.Machine$double.eps), beta = 0.5, sigma = 0.5, contourPlot = FALSE, levels = NULL, maxval = FALSE, verbose = TRUE )
ibdEstimate( x, ids = typedMembers(x), param = c("kappa", "delta"), acrossComps = TRUE, markers = NULL, start = NULL, tol = sqrt(.Machine$double.eps), beta = 0.5, sigma = 0.5, contourPlot = FALSE, levels = NULL, maxval = FALSE, verbose = TRUE )
x |
A |
ids |
Either a vector with ID labels, or a data frame/matrix with two
columns, each row indicating a pair of individuals. The entries are coerced
to characters, and must match uniquely against the ID labels of |
param |
Either "kappa" (default) or "delta"; indicating which set of coefficients should be estimated. |
acrossComps |
A logical indicating if pairs of individuals in different components should be included. Default: TRUE. |
markers |
A vector with names or indices of markers attached to x, indicating which markers to include. By default, all markers are used. |
start |
A probability vector (i.e., with nonnegative entries and sum 1)
of length 3 (if |
tol , beta , sigma
|
Control parameters for the optimisation routine; can usually be left untouched. |
contourPlot |
A logical. If TRUE, contours of the log-likelihood function are plotted overlaying the IBD triangle. |
levels |
(Only relevant if |
maxval |
A logical. If TRUE, the maximum log-likelihood value is included in the output. Default: FALSE |
verbose |
A logical. |
It should be noted that this procedure estimates the realised identity coefficients of each pair, i.e., the actual fractions of the autosomes in each IBD state. These may deviate substantially from the theoretical pedigree coefficients.
Maximum likelihood estimation of relatedness coefficients originates with
Thompson (1975). Optimisation of is done in the
-plane and restricted to the triangle defined by
Optimisation of is done in unit simplex of
, using the
first 8 coefficients.
The implementation optimises the log-likelihood using a projected gradient descent algorithm, combined with a version of Armijo line search.
When param = "kappa"
, the output may be fed directly to showInTriangle()
for visualisation.
An object of class ibdEst
, which is basically a data frame with
either 6 columns (if param = "kappa"
) or 12 columns (if param = "delta"
). The first three columns are id1
(label of first individual),
id2
(label of second individual) and N
(the number of markers with no
missing alleles). The remaining columns contain the coefficient estimates.
If maxval = T
, a column named maxloglik
is added at the end.
Magnus Dehli Vigeland
E. A. Thompson (1975). The estimation of pairwise relationships. Annals of Human Genetics 39.
E. A. Thompson (2000). Statistical Inference from Genetic Data on Pedigrees. NSF-CBMS Regional Conference Series in Probability and Statistics. Volume 6.
### Example 1: Siblings # Create pedigree and simulate 100 markers x = nuclearPed(2) |> markerSim(N = 100, alleles = 1:4, seed = 123) x # Estimate kappa (expectation: (0.25, 0.5, 0.25) k = ibdEstimate(x, ids = 3:4) k # Visualise estimate showInTriangle(k, labels = TRUE) # Contour plot of the log-likelihood function ibdEstimate(x, ids = 3:4, contourPlot = TRUE) ### Example 2: Full sib mating y = fullSibMating(1) |> markerSim(ids = 5:6, N = 1000, alleles = 1:10, seed = 123) # Estimate the condensed identity coefficients ibdEstimate(y, param = "delta") # Exact coefficient by `ribd`: ribd::condensedIdentity(y, 5:6, simplify = FALSE)
### Example 1: Siblings # Create pedigree and simulate 100 markers x = nuclearPed(2) |> markerSim(N = 100, alleles = 1:4, seed = 123) x # Estimate kappa (expectation: (0.25, 0.5, 0.25) k = ibdEstimate(x, ids = 3:4) k # Visualise estimate showInTriangle(k, labels = TRUE) # Contour plot of the log-likelihood function ibdEstimate(x, ids = 3:4, contourPlot = TRUE) ### Example 2: Full sib mating y = fullSibMating(1) |> markerSim(ids = 5:6, N = 1000, alleles = 1:10, seed = 123) # Estimate the condensed identity coefficients ibdEstimate(y, param = "delta") # Exact coefficient by `ribd`: ribd::condensedIdentity(y, 5:6, simplify = FALSE)
Given genotype data from two individuals, computes the log-likelihood of a
single set of IBD coefficients, either kappa
= or the Jacquard coefficients
delta
= .
The
ibdLoglikFUN
version returns an efficient function for computing such
likelihoods, suitable for optimisations such as in ibdEstimate()
.
ibdLoglik(x = NULL, ids = NULL, kappa = NULL, delta = NULL) ibdLoglikFUN(x, ids, input = c("kappa", "kappa02", "delta"))
ibdLoglik(x = NULL, ids = NULL, kappa = NULL, delta = NULL) ibdLoglikFUN(x, ids, input = c("kappa", "kappa02", "delta"))
x |
A |
ids |
A vector of ID labels. |
kappa |
A probability vector of length 3. |
delta |
A probability vector of length 9. |
input |
Either "kappa", "kappa02" or "delta". See Value. |
ibdLoglik()
returns a single number; the total log-likelihood over
all markers included.
ibdLoglikFUN()
returns a function for computing such log-likelihoods. The
function takes a single input vector p
, whose interpretation depends on
the input
parameter:
"kappa": p
is expected to be a set of kappa coefficients
.
"kappa02": p
should be a vector of length 2 containing the coefficients
and
. This is sometimes a convenient shortcut
when working in the IBD triangle.
"delta": Expects p
to be a set of condensed Jacquard coefficients
.
# Siblings typed with 10 markers x = nuclearPed(2) |> markerSim(N = 10, alleles = 1:4) # Calculate log-likelihood at a single point k = c(0.25, 0.5, 0.25) ibdLoglik(x, ids = 3:4, kappa = k) # Or first get a function, and then apply it llFun = ibdLoglikFUN(x, ids = 3:4, input = "kappa") llFun(k)
# Siblings typed with 10 markers x = nuclearPed(2) |> markerSim(N = 10, alleles = 1:4) # Calculate log-likelihood at a single point k = c(0.25, 0.5, 0.25) ibdLoglik(x, ids = 3:4, kappa = k) # Or first get a function, and then apply it llFun = ibdLoglikFUN(x, ids = 3:4, input = "kappa") llFun(k)
This function computes likelihood ratios (LRs) for a list of pedigrees. One of the pedigrees (the last one, by default) is designated as 'reference', to be used in the denominator in all LR calculations. To ensure that all pedigrees use the same data set, one of the pedigrees may be chosen as 'source', from which data is transferred to all the other pedigrees.
kinshipLR( ..., ref = NULL, source = NULL, markers = NULL, linkageMap = NULL, keepMerlin = NULL, verbose = FALSE )
kinshipLR( ..., ref = NULL, source = NULL, markers = NULL, linkageMap = NULL, keepMerlin = NULL, verbose = FALSE )
... |
Pedigree alternatives. Each argument should be either a single
It is also possible to pass a single |
ref |
An index or name indicating which of the input pedigrees should be used as "reference pedigree", i.e., used in the denominator of each LR. If NULL (the default), the last pedigree is used as reference. |
source |
An index or name designating one of the input pedigrees as
source for marker data. If given, marker data is transferred from this to
all the other pedigrees (replacing any existing markers). The default
action ( |
markers |
A vector of marker names or indices indicating which markers should be included. If NULL (the default) all markers are used. |
linkageMap |
If this is non-NULL, the markers are interpreted as being linked, and likelihoods will be computed by an external call to MERLIN. The supplied object should be either:
|
keepMerlin |
Either NULL (default) or the path to an existing folder. If given, MERLIN files are stored here, typically for debugging purposes. |
verbose |
A logical. |
By default, all markers are assumed to be unlinked. To accommodate linkage, a
genetic map may be supplied with the argument linkageMap
. This requires the
software MERLIN to be installed.
A LRresult
object, which is essentially a list with entries
LRtotal
: A vector of length L
, where L
is the number of input
pedigrees. The i'th entry is the total LR (i.e., the product over all
markers) comparing pedigree i
to the reference pedigree. The entry
corresponding to the reference will always be 1.
LRperMarker
: A numerical matrix, where the i'th column contains the
marker-wise LR values comparing pedigree i
to the reference. The product
of all entries in a column should equal the corresponding entry in
LRtotal
.
likelihoodsPerMarker
: A numerical matrix of the same dimensions as
LRperMarker
, but where the entries are likelihood of each pedigree for
each marker.
time
: Elapsed time
Magnus Dehli Vigeland and Thore Egeland
LRpower()
, pedprobr::likelihoodMerlin()
### Example 1: Full vs half sibs # Simulate 5 markers for a pair of full sibs ids = c("A", "B") sibs = nuclearPed(children = ids) sibs = simpleSim(sibs, N = 5, alleles = 1:4, ids = ids, seed = 123) # Create two alternative hypotheses halfsibs = relabel(halfSibPed(), old = 4:5, new = ids) unrel = singletons(c("A", "B")) # Compute LRs. By default, the last ped is used as reference kinshipLR(sibs, halfsibs, unrel) # Input pedigrees can be named, reflected in the output kinshipLR(S = sibs, H = halfsibs, U = unrel) # Select non-default reference (by index or name) kinshipLR(S = sibs, H = halfsibs, U = unrel, ref = "H") # Alternative syntax: List input peds = list(S = sibs, H = halfsibs, U = unrel) kinshipLR(peds, ref = "H", source = "S", verbose = TRUE) # Detailed results res = kinshipLR(peds) res$LRperMarker res$likelihoodsPerMarker ### Example 2: Separating grandparent/halfsib/uncle-nephew # Requires ibdsim2 and MERLIN if(requireNamespace("ibdsim2", quietly = TRUE) && pedprobr::checkMerlin()) { # Load recombination map map = ibdsim2::loadMap("decode19", uniform = TRUE) # unif for speed # Define pedigrees ids = c("A", "B") H = relabel(halfSibPed(), old = c(4,5), new = ids) U = relabel(avuncularPed(), old = c(3,6), new = ids) G = relabel(linearPed(2), old = c(1,5), new = ids) # Attach FORCE panel of SNPs to G G = setSNPs(G, FORCE[1:10, ]) # use all for better results # Simulate recombination pattern in G ibd = ibdsim2::ibdsim(G, N = 1, ids = ids, map = map) # Simulate genotypes conditional on pattern G = ibdsim2::profileSimIBD(G, ibdpattern = ibd) # Compute LR (genotypes are automatically transferred to H and U) kinshipLR(H, U, G, linkageMap = map) }
### Example 1: Full vs half sibs # Simulate 5 markers for a pair of full sibs ids = c("A", "B") sibs = nuclearPed(children = ids) sibs = simpleSim(sibs, N = 5, alleles = 1:4, ids = ids, seed = 123) # Create two alternative hypotheses halfsibs = relabel(halfSibPed(), old = 4:5, new = ids) unrel = singletons(c("A", "B")) # Compute LRs. By default, the last ped is used as reference kinshipLR(sibs, halfsibs, unrel) # Input pedigrees can be named, reflected in the output kinshipLR(S = sibs, H = halfsibs, U = unrel) # Select non-default reference (by index or name) kinshipLR(S = sibs, H = halfsibs, U = unrel, ref = "H") # Alternative syntax: List input peds = list(S = sibs, H = halfsibs, U = unrel) kinshipLR(peds, ref = "H", source = "S", verbose = TRUE) # Detailed results res = kinshipLR(peds) res$LRperMarker res$likelihoodsPerMarker ### Example 2: Separating grandparent/halfsib/uncle-nephew # Requires ibdsim2 and MERLIN if(requireNamespace("ibdsim2", quietly = TRUE) && pedprobr::checkMerlin()) { # Load recombination map map = ibdsim2::loadMap("decode19", uniform = TRUE) # unif for speed # Define pedigrees ids = c("A", "B") H = relabel(halfSibPed(), old = c(4,5), new = ids) U = relabel(avuncularPed(), old = c(3,6), new = ids) G = relabel(linearPed(2), old = c(1,5), new = ids) # Attach FORCE panel of SNPs to G G = setSNPs(G, FORCE[1:10, ]) # use all for better results # Simulate recombination pattern in G ibd = ibdsim2::ibdsim(G, N = 1, ids = ids, map = map) # Simulate genotypes conditional on pattern G = ibdsim2::profileSimIBD(G, ibdpattern = ibd) # Compute LR (genotypes are automatically transferred to H and U) kinshipLR(H, U, G, linkageMap = map) }
This function uses simulations to estimate the likelihood ratio (LR) distribution in a given kinship testing scenario. In the most general setting, three pedigrees are involved: the two pedigrees being compared, and the true relationship (which may differ from the other two). A subset of individuals are available for genotyping. Some individuals may already be genotyped; all simulations are then conditional on these.
LRpower( numeratorPed, denominatorPed, truePed = numeratorPed, ids, markers = NULL, source = "true", nsim = 1, threshold = NULL, disableMutations = NA, alleles = NULL, afreq = NULL, Xchrom = FALSE, knownGenotypes = NULL, plot = FALSE, plotMarkers = NULL, seed = NULL, verbose = TRUE )
LRpower( numeratorPed, denominatorPed, truePed = numeratorPed, ids, markers = NULL, source = "true", nsim = 1, threshold = NULL, disableMutations = NA, alleles = NULL, afreq = NULL, Xchrom = FALSE, knownGenotypes = NULL, plot = FALSE, plotMarkers = NULL, seed = NULL, verbose = TRUE )
numeratorPed , denominatorPed
|
|
truePed |
A |
ids |
Individuals available for genotyping. |
markers |
A vector indicating the names or indices of markers attached
to the source pedigree. If NULL (default), then all markers attached to the
source pedigree are used. If |
source |
Either "true" (default), "numerator" or "denominator", indicating which pedigree is used as source for marker data. |
nsim |
A positive integer: the number of simulations. |
threshold |
A numeric vector with one or more positive numbers used as LR thresholds. |
disableMutations |
Not implemented yet. |
alleles , afreq , Xchrom
|
If these are given, they are used (together with
|
knownGenotypes |
A list of triplets |
plot |
Either a logical or the character "plotOnly". If the latter, a plot is drawn, but no further computations are done. |
plotMarkers |
A vector of marker names or indices whose genotypes are to be included in the plot. |
seed |
An integer seed for the random number generator (optional). |
verbose |
A logical. |
A LRpowerResult
object, which is essentially a list with the
following entries:
LRperSim
: A numeric vector of length nsim
containing the total LR for
each simulation.
meanLRperMarker
: The mean LR per marker, over all simulations.
meanLR
: The mean total LR over all simulations.
meanLogLR
: The mean total log10(LR)
over all simulations.
IP
: A named numeric of the same length as threshold
. For each element
of threshold
, the fraction of simulations resulting in a LR exceeding the
given number.
time
: The total computation time.
params
: A list containing the input parameters missing
, markers
,
nsim
, threshold
and disableMutations
# Paternity LR of siblings ids = c("A", "B") truth = nuclearPed(children = ids) claim = nuclearPed(fa = "A", mo = "NN", children = "B") unrel = singletons(ids) # Simulation parameters nsim = 10 # increase! thresh = 1 # Simulation 1: als = 1:5 afr = runif(5) afr = afr/sum(afr) pow1 = LRpower(claim, unrel, truth, ids = ids, nsim = nsim, threshold = thresh, alleles = als, afreq = afr, seed = 123) pow1 # Simulation 2: Same, but using an attached marker truth = addMarker(truth, alleles = als, afreq = afr) pow2 = LRpower(claim, unrel, truth, ids = ids, nsim = nsim, threshold = thresh, markers = 1, seed = 123) stopifnot(identical(pow1$LRperSim, pow2$LRperSim)) # True pedigree has inbred founders truth2 = setFounderInbreeding(truth, value = 0.5) pow3 = LRpower(claim, unrel, truth2, ids = ids, nsim = nsim, threshold = thresh, markers = 1, seed = 123) # plot = TRUE pow3
# Paternity LR of siblings ids = c("A", "B") truth = nuclearPed(children = ids) claim = nuclearPed(fa = "A", mo = "NN", children = "B") unrel = singletons(ids) # Simulation parameters nsim = 10 # increase! thresh = 1 # Simulation 1: als = 1:5 afr = runif(5) afr = afr/sum(afr) pow1 = LRpower(claim, unrel, truth, ids = ids, nsim = nsim, threshold = thresh, alleles = als, afreq = afr, seed = 123) pow1 # Simulation 2: Same, but using an attached marker truth = addMarker(truth, alleles = als, afreq = afr) pow2 = LRpower(claim, unrel, truth, ids = ids, nsim = nsim, threshold = thresh, markers = 1, seed = 123) stopifnot(identical(pow1$LRperSim, pow2$LRperSim)) # True pedigree has inbred founders truth2 = setFounderInbreeding(truth, value = 0.5) pow3 = LRpower(claim, unrel, truth2, ids = ids, nsim = nsim, threshold = thresh, markers = 1, seed = 123) # plot = TRUE pow3
Simulates marker genotypes conditional on the pedigree structure and known
genotypes. Note: This function simulates independent realisations at a single
locus. Equivalently, it can be thought of as independent simulations of
identical, unlinked markers. For simulations of a set of markers, see
profileSim()
.
markerSim( x, N = 1, ids = NULL, alleles = NULL, afreq = NULL, mutmod = NULL, rate = NULL, partialmarker = NULL, loopBreakers = NULL, seed = NULL, verbose = TRUE )
markerSim( x, N = 1, ids = NULL, alleles = NULL, afreq = NULL, mutmod = NULL, rate = NULL, partialmarker = NULL, loopBreakers = NULL, seed = NULL, verbose = TRUE )
x |
A |
N |
A positive integer: the number of (independent) markers to be simulated. |
ids |
A vector containing ID labels of those pedigree members whose genotypes should be simulated. By default, all individuals are included. |
alleles |
(Only if
|
afreq |
(Only if |
mutmod , rate
|
Arguments specifying a mutation model, passed on to
|
partialmarker |
Either NULL (resulting in unconditional simulation), a
marker object (on which the simulation should be conditioned) or the name
(or index) of a marker attached to |
loopBreakers |
A numeric containing IDs of individuals to be used as
loop breakers. Relevant only if the pedigree has loops, and only if
|
seed |
An integer seed for the random number generator (optional). |
verbose |
A logical. |
This implements (with various time savers) the algorithm used in SLINK of the
LINKAGE/FASTLINK suite. If partialmarker
is NULL, genotypes are simulated
by simple gene dropping, using simpleSim()
.
A ped
object equal to x
except its MARKERS
entry, which
consists of the N
simulated markers.
Magnus Dehli Vigeland
G. M. Lathrop, J.-M. Lalouel, C. Julier, and J. Ott, Strategies for Multilocus Analysis in Humans, PNAS 81(1984), pp. 3443-3446.
x = nuclearPed(2) # Unconditional simulation markerSim(x, N = 2, alleles = 1:3) # Conditional on one child being homozygous 1/1 x = addMarker(x, "3" = "1/1", alleles = 1:3) markerSim(x, N = 2, partialmarker = 1) markerSim(x, N = 1, ids = 4, partialmarker = 1, verbose = FALSE)
x = nuclearPed(2) # Unconditional simulation markerSim(x, N = 2, alleles = 1:3) # Conditional on one child being homozygous 1/1 x = addMarker(x, "3" = "1/1", alleles = 1:3) markerSim(x, N = 2, partialmarker = 1) markerSim(x, N = 1, ids = 4, partialmarker = 1, verbose = FALSE)
This function simulates genotypes for two individuals given their IBD distribution, for N identical markers.
markerSimParametric( kappa = NULL, delta = NULL, states = NULL, N = 1, alleles = NULL, afreq = NULL, seed = NULL, returnValue = c("singletons", "alleles", "genotypes", "internal") )
markerSimParametric( kappa = NULL, delta = NULL, states = NULL, N = 1, alleles = NULL, afreq = NULL, seed = NULL, returnValue = c("singletons", "alleles", "genotypes", "internal") )
kappa |
A probability vector of length 3, giving a set of realised kappa coefficients (between two noninbred individuals). |
delta |
A probability vector of length 9, giving a set of condensed identity coefficients (Jacquard coefficients). |
states |
An integer vector of length |
N |
A positive integer: the number of independent markers to be simulated. |
alleles |
A vector with allele labels. If NULL, the following are tried in order:
|
afreq |
A numeric vector with allele frequencies, possibly named with allele labels. |
seed |
An integer seed for the random number generator (optional). |
returnValue |
Either "singleton" (default) or "alleles". (see Value). |
Exactly one of kappa
, delta
and states
must be given; the other two
should remain NULL.
If states
is given, it explicitly determines the condensed identity state
at each marker. The states are described by integers 1-9, using the tradition
order introduced by Jacquard.
If kappa
is given, the states are generated by the command states = sample(9:7, size = N, replace = TRUE, prob = kappa)
. (Note that identity
states 9, 8, 7 correspond to IBD status 0, 1, 2, respectively.)
If delta
is given, the states are generated by the command states = sample(1:9, size = N, replace = TRUE, prob = delta)
.
The output depends on the value of the returnValue
parameter:
"singletons": a list of two singletons with the simulated marker data attached.
"alleles": a list of four vectors of length N
, named a
, b
, c
and
d
. These contain the simulated alleles, where a/b and c/d are the
genotypes of the to individuals.
"genotypes": a list of two vectors of length N
, containing the
simulated genotypes. Identical to paste(a, b, sep = "/")
and paste(c, d, sep = "/")
, where a
, b
, c
, d
are the vectors returned when
returnValue == "alleles"
.
"internal": similar to "alleles", but using the index integer of each allele. (This option is mostly for internal use.)
# MZ twins markerSimParametric(kappa = c(0,0,1), N = 5, alleles = 1:10) # Equal distribution of states 1 and 2 markerSimParametric(delta = c(.5,.5,0,0,0,0,0,0,0), N = 5, alleles = 1:10) # Force a specific sequence of states markerSimParametric(states = c(1,2,7,8,9), N = 5, alleles = 1:10)
# MZ twins markerSimParametric(kappa = c(0,0,1), N = 5, alleles = 1:10) # Equal distribution of states 1 and 2 markerSimParametric(delta = c(.5,.5,0,0,0,0,0,0,0), N = 5, alleles = 1:10) # Force a specific sequence of states markerSimParametric(states = c(1,2,7,8,9), N = 5, alleles = 1:10)
This is a special case of exclusionPower()
for use in missing person cases.
The function computes the probability that a random person is genetically
incompatible with the typed relatives of the missing person.
missingPersonEP( reference, missing, markers = NULL, disableMutations = NA, verbose = TRUE )
missingPersonEP( reference, missing, markers = NULL, disableMutations = NA, verbose = TRUE )
reference |
A |
missing |
The ID label of the missing pedigree member. |
markers |
A vector indicating the names or indices of markers attached
to the source pedigree. If NULL (default), then all markers attached to the
source pedigree are used. If |
disableMutations |
This parameter determines how mutation models are treated. Possible values are as follows:
|
verbose |
A logical. |
This function is identical to randomPersonEP()
, but with different argument
names. This makes it consistent with missingPersonIP()
and the other
'missing person' functions.
The EPresult
object returned by exclusionPower()
.
randomPersonEP()
, exclusionPower()
# Four siblings; the fourth is missing x = nuclearPed(4) # Remaining sibs typed with 4 triallelic markers x = markerSim(x, N = 4, ids = 3:5, alleles = 1:3, seed = 577, verbose = FALSE) # Add marker with inconsistency in reference genotypes # (by default this is ignored by `missingPersonEP()`) x = addMarker(x, "3" = "1/1", "4" = "2/2", "5" = "3/3") # Compute exclusion power statistics missingPersonEP(x, missing = 6)
# Four siblings; the fourth is missing x = nuclearPed(4) # Remaining sibs typed with 4 triallelic markers x = markerSim(x, N = 4, ids = 3:5, alleles = 1:3, seed = 577, verbose = FALSE) # Add marker with inconsistency in reference genotypes # (by default this is ignored by `missingPersonEP()`) x = addMarker(x, "3" = "1/1", "4" = "2/2", "5" = "3/3") # Compute exclusion power statistics missingPersonEP(x, missing = 6)
This function simulates the LR distribution for the true missing person in a reference family. The output contains both the total and marker-wise LR of each simulation, as well as various summary statistics. If a specific LR threshold is given, the inclusion power is computed as the probability that LR exceeds the threshold.
missingPersonIP( reference, missing, markers, nsim = 1, threshold = NULL, disableMutations = NA, seed = NULL, verbose = TRUE )
missingPersonIP( reference, missing, markers, nsim = 1, threshold = NULL, disableMutations = NA, seed = NULL, verbose = TRUE )
reference |
A |
missing |
The ID label of the missing pedigree member. |
markers |
A vector indicating the names or indices of markers attached
to the source pedigree. If NULL (default), then all markers attached to the
source pedigree are used. If |
nsim |
A positive integer: the number of simulations |
threshold |
A numeric vector with one or more positive numbers used as the likelihood ratio thresholds for inclusion |
disableMutations |
This parameter determines how mutation models are treated. Possible values are as follows:
|
seed |
An integer seed for the random number generator (optional). |
verbose |
A logical. |
A mpIP
object, which is essentially a list with the following
entries:
LRperSim
: A numeric vector of length nsim
containing the total LR for
each simulation.
meanLRperMarker
: The mean LR per marker, over all simulations.
meanLR
: The mean total LR over all simulations.
meanLogLR
: The mean total log10(LR)
over all simulations.
IP
: A named numeric of the same length as threshold
. For each element
of threshold
, the fraction of simulations resulting in a LR exceeding the
given number.
time
: The total computation time.
params
: A list containing the input parameters missing
, markers
,
nsim
, threshold
and disableMutations
# Four siblings; the fourth is missing x = nuclearPed(4) # Remaining sibs typed with 5 triallelic markers x = markerSim(x, N = 5, ids = 3:5, alleles = 1:3, seed = 123, verbose = FALSE) # Compute inclusion power statistics ip = missingPersonIP(x, missing = 6, nsim = 5, threshold = c(10, 100)) ip # LRs from each simulation ip$LRperSim
# Four siblings; the fourth is missing x = nuclearPed(4) # Remaining sibs typed with 5 triallelic markers x = markerSim(x, N = 5, ids = 3:5, alleles = 1:3, seed = 123, verbose = FALSE) # Compute inclusion power statistics ip = missingPersonIP(x, missing = 6, nsim = 5, threshold = c(10, 100)) ip # LRs from each simulation ip$LRperSim
This is a wrapper function for kinshipLR()
for the special case of missing
person identification. A person of interest (POI) is matched against a
reference dataset containing genotypes of relatives of the missing person.
missingPersonLR(reference, missing, poi = NULL, verbose = TRUE, ...)
missingPersonLR(reference, missing, poi = NULL, verbose = TRUE, ...)
reference |
A |
missing |
The ID label of the missing member of |
poi |
A |
verbose |
A logical. |
... |
Optional parameters to be passed on to |
Note that this function accepts two forms of input:
With poi
a typed singleton. This is the typical use case, when you want
to compute the LR for some person of interest.
With poi = NULL
, but missing
being genotyped. The data for missing
is then extracted as a singleton POI. This is especially useful in simulation
procedures, e.g., for simulating the LR distribution of the true missing
person.
See Examples for illustrations of both cases.
The LRresult
object returned by kinshipLR()
, but without the
trivial H2:H2
comparison.
#------------------------------------------------ # Example: Identification of a missing grandchild #------------------------------------------------ # Database with 5 STR markers (increase to make more realistic) db = NorwegianFrequencies[1:5] # Pedigree with missing person (MP); grandmother is genotyped x = linearPed(2) |> relabel(old = 5, new = "MP") |> profileSim(markers = db, ids = "2", seed = 123) ### Scenario 1: Unrelated POI -------------------- # Generate random unrelated profile poi = singleton("POI") |> profileSim(markers = db, seed = 1234) # Compute LR lr = missingPersonLR(x, missing = "MP", poi = poi) lr lr$LRperMarker ### Scenario 2: POI is the missing person -------- # A small simulation example # Simulate profiles for MP conditional on the grandmother N = 10 y = profileSim(x, N = N, ids = "MP", seed = 12345) # Compute LRs for each sim LRsims = lapply(y, missingPersonLR, missing = "MP", verbose = FALSE) # Plot distribution LRtotal = sapply(LRsims, function(a) a$LRtotal) plot(density(LRtotal)) # LRs for each marker LRperMarker = sapply(LRsims, function(a) a$LRperMarker) LRperMarker # Overlaying marker-wise density plots (requires tidyverse) # library(tidyverse) # t(LRperMarker) |> as_tibble() |> pivot_longer(everything()) |> # ggplot() + geom_density(aes(value, fill = name), alpha = 0.6)
#------------------------------------------------ # Example: Identification of a missing grandchild #------------------------------------------------ # Database with 5 STR markers (increase to make more realistic) db = NorwegianFrequencies[1:5] # Pedigree with missing person (MP); grandmother is genotyped x = linearPed(2) |> relabel(old = 5, new = "MP") |> profileSim(markers = db, ids = "2", seed = 123) ### Scenario 1: Unrelated POI -------------------- # Generate random unrelated profile poi = singleton("POI") |> profileSim(markers = db, seed = 1234) # Compute LR lr = missingPersonLR(x, missing = "MP", poi = poi) lr lr$LRperMarker ### Scenario 2: POI is the missing person -------- # A small simulation example # Simulate profiles for MP conditional on the grandmother N = 10 y = profileSim(x, N = N, ids = "MP", seed = 12345) # Compute LRs for each sim LRsims = lapply(y, missingPersonLR, missing = "MP", verbose = FALSE) # Plot distribution LRtotal = sapply(LRsims, function(a) a$LRtotal) plot(density(LRtotal)) # LRs for each marker LRperMarker = sapply(LRsims, function(a) a$LRperMarker) LRperMarker # Overlaying marker-wise density plots (requires tidyverse) # library(tidyverse) # t(LRperMarker) |> as_tibble() |> pivot_longer(everything()) |> # ggplot() + geom_density(aes(value, fill = name), alpha = 0.6)
Visualises the competing hypotheses of a family reunion case. A plot with two panels is generated. The left panel shows a pedigree in which the person of interest (POI) is identical to the missing person (MP). The right panel shows the situation where these two are unrelated. See Details for further explanations.
missingPersonPlot( reference, missing, labs = labels(reference), marker = NULL, hatched = typedMembers(reference), MP.label = "MP", POI.label = "POI", MP.col = "#FF9999", POI.col = "lightgreen", POI.sex = getSex(reference, missing), POI.hatched = NULL, titles = c(expression(H[1] * ": POI = MP"), expression(H[2] * ": POI unrelated")), width = NULL, cex = 1.2, ... )
missingPersonPlot( reference, missing, labs = labels(reference), marker = NULL, hatched = typedMembers(reference), MP.label = "MP", POI.label = "POI", MP.col = "#FF9999", POI.col = "lightgreen", POI.sex = getSex(reference, missing), POI.hatched = NULL, titles = c(expression(H[1] * ": POI = MP"), expression(H[2] * ": POI unrelated")), width = NULL, cex = 1.2, ... )
reference |
A |
missing |
The ID label of the missing pedigree member. |
labs |
A character vector with labels for the pedigree members. See
|
marker |
Optional vector of marker indices to be included in the plot. |
hatched |
A vector of ID labels indicating who should appear with hatched symbols in the plot. By default, all typed members. |
MP.label , POI.label
|
Custom labels of the missing person and the POI. Default: "MP" and "POI". |
MP.col , POI.col
|
Fill colours for MP and POI. |
POI.sex |
The sex of POI. This defaults to that of the missing person, but may be set explicitly. This is particularly useful when the missing person has unknown sex. |
POI.hatched |
Deprecated (ignored). |
titles |
A character of length 2, with subtitles for the two frames. |
width |
A positive number controlling the width of the plot. More specifically this number is the relative width of the reference pedigree, compared to a singleton. |
cex |
Expansion factor for pedigree symbols and font size. |
... |
Extra parameters passed on to |
A standard family reunification case involves the following ingredients:
A reference family with a single missing person ("MP").
Some of the family members have been genotyped
A person of interest ("POI") is to be matched against the reference family
After genotyping of POI, the genetic evidence is typically assessed by computing the likelihood ratio of the following hypotheses:
H1: POI is MP
H2: POI is unrelated to the family
The goal of this function is to illustrate the above hypotheses, using labels, colours and shading to visualise the different aspects of the situation.
This function cannot handle cases with more complicated hypotheses (e.g.
multiple missing persons, or where H2 specifies a different relationship).
However, as it is basically a wrapper of pedtools::plotPedList()
, an
interested user should be able to extend the source code to such cases
without too much trouble.
None
x = nuclearPed(father = "fa", mother = "mo", children = c("b1", "b2")) # Default plot missingPersonPlot(x, missing = "b2") # Open in separate window; explore various options missingPersonPlot(x, missing = "b2", hatched = "b1", deceased = c("fa", "mo"), cex = 1.5, # larger symbols and labels (see ?par()) cex.main = 1.3, # larger frame titles (see ?par()) dev.width = 7, # device width (see ?plotPedList()) dev.height = 3 # device height (see ?plotPedList()) )
x = nuclearPed(father = "fa", mother = "mo", children = c("b1", "b2")) # Default plot missingPersonPlot(x, missing = "b2") # Open in separate window; explore various options missingPersonPlot(x, missing = "b2", hatched = "b1", deceased = c("fa", "mo"), cex = 1.5, # larger symbols and labels (see ?par()) cex.main = 1.3, # larger frame titles (see ?par()) dev.width = 7, # device width (see ?plotPedList()) dev.height = 3 # device height (see ?plotPedList()) )
Estimate the exclusion/inclusion power for various selections of available individuals.
MPPsims( reference, missing = "MP", selections, ep = TRUE, ip = TRUE, addBaseline = TRUE, nProfiles = 1, lrSims = 1, thresholdIP = NULL, disableMutations = NA, numCores = 1, seed = NULL, verbose = TRUE )
MPPsims( reference, missing = "MP", selections, ep = TRUE, ip = TRUE, addBaseline = TRUE, nProfiles = 1, lrSims = 1, thresholdIP = NULL, disableMutations = NA, numCores = 1, seed = NULL, verbose = TRUE )
reference |
A connected |
missing |
The ID label of the missing pedigree member. |
selections |
A list of pedigree member subsets. In the special case that
all subsets consist of a single individual, |
ep |
A logical: Estimate the exclusion power? (Default: TRUE) |
ip |
A logical: Estimate the inclusion power? (Default: TRUE) |
addBaseline |
A logical. If TRUE (default) an empty selection, named
"Baseline", is added as the first element of |
nProfiles |
The number of profile simulations for each selection. |
lrSims , thresholdIP
|
Parameters passed onto |
disableMutations |
This parameter determines how mutation models are treated. Possible values are as follows:
|
numCores |
The number of cores used for parallelisation, by default 1. |
seed |
An integer seed for the random number generator (optional). |
verbose |
A logical. |
An object of class "MPPsim", which is basically a list with one entry
for each element of selections
. Each entry has elements ep
and ip
,
each of which is a list of length nProfiles
.
The output object has various attributes reflecting the input. Note that
reference
and selection
may differ slightly from the original input,
since they may be modified during the function run. (For instance, a
"Baseline" entry is added to selection
if addBaseline
is TRUE.) The
crucial point is that the output attributes correspond exactly to the
output data.
reference
(always a list, of the same length as the selections
attribute
selections
nProfiles
,lrSims
,thresholdIP
,seed
(as in the input)
totalTime
(the total time used)
x = nuclearPed(fa = "Gf", mo = "Gm", children = c("Uncle", "Mother"), sex = 1:2) x = addChildren(x, fa = "Father", mo = "Mother", nch = 3, sex = c(1,2,1), id = c("S1", "S2", "MP")) x = addSon(x, "Father", id = "HS") # Brother S1 is already genotyped with a marker with 4 alleles x = addMarker(x, S1 = "1/2", alleles = 1:4) # Alternatives for additional genotyping sel = list("Father", "S2", "HS", c("Gm", "Uncle")) plot(x, marker = 1, hatched = sel) # Simulate simData = MPPsims(x, selections = sel, nProfiles = 2, lrSims = 2) # Power plot powerPlot(simData, type = 3) ### With mutations # Add inconsistent marker x = addMarker(x, S1 = "1/2", Father = "3/3", alleles = 1:4) # Set mutation models for both mutmod(x, 1:2) = list("equal", rate = 0.1) # By default mutations are disabled for consistent markers MPPsims(x, selections = "Father", addBaseline = FALSE) # Don't disable anything MPPsims(x, selections = "Father", addBaseline = FALSE, disableMutations = FALSE) # Disable all mutation models. SHOULD GIVE ERROR FOR SECOND MARKER # MPPsims(x, selections = "Father", addBaseline = FALSE, # disableMutations = TRUE)
x = nuclearPed(fa = "Gf", mo = "Gm", children = c("Uncle", "Mother"), sex = 1:2) x = addChildren(x, fa = "Father", mo = "Mother", nch = 3, sex = c(1,2,1), id = c("S1", "S2", "MP")) x = addSon(x, "Father", id = "HS") # Brother S1 is already genotyped with a marker with 4 alleles x = addMarker(x, S1 = "1/2", alleles = 1:4) # Alternatives for additional genotyping sel = list("Father", "S2", "HS", c("Gm", "Uncle")) plot(x, marker = 1, hatched = sel) # Simulate simData = MPPsims(x, selections = sel, nProfiles = 2, lrSims = 2) # Power plot powerPlot(simData, type = 3) ### With mutations # Add inconsistent marker x = addMarker(x, S1 = "1/2", Father = "3/3", alleles = 1:4) # Set mutation models for both mutmod(x, 1:2) = list("equal", rate = 0.1) # By default mutations are disabled for consistent markers MPPsims(x, selections = "Father", addBaseline = FALSE) # Don't disable anything MPPsims(x, selections = "Father", addBaseline = FALSE, disableMutations = FALSE) # Disable all mutation models. SHOULD GIVE ERROR FOR SECOND MARKER # MPPsims(x, selections = "Father", addBaseline = FALSE, # disableMutations = TRUE)
A database of Norwegian allele frequencies for 35 STR markers.
NorwegianFrequencies
NorwegianFrequencies
A list of length 35. Each entry is a numerical vector summing to 1, named with allele labels.
The following markers are included:
D3S1358
: 12 alleles
TH01
: 10 alleles
D21S11
: 26 alleles
D18S51
: 23 alleles
PENTA_E
: 21 alleles
D5S818
: 9 alleles
D13S317
: 9 alleles
D7S820
: 19 alleles
D16S539
: 9 alleles
CSF1PO
: 11 alleles
PENTA_D
: 24 alleles
VWA
: 12 alleles
D8S1179
: 12 alleles
TPOX
: 9 alleles
FGA
: 25 alleles
D19S433
: 17 alleles
D2S1338
: 13 alleles
D10S1248
: 9 alleles
D1S1656
: 17 alleles
D22S1045
: 9 alleles
D2S441
: 13 alleles
D12S391
: 23 alleles
SE33
: 55 alleles
D7S1517
: 11 alleles
D3S1744
: 8 alleles
D2S1360
: 10 alleles
D6S474
: 6 alleles
D4S2366
: 7 alleles
D8S1132
: 12 alleles
D5S2500
: 8 alleles
D21S2055
: 18 alleles
D10S2325
: 10 alleles
D17S906
: 78 alleles
APOAI1
: 41 alleles
D11S554
: 51 alleles
Dupuy et al. (2013): Frequency data for 35 autosomal STR markers in a Norwegian, an East African, an East Asian and Middle Asian population and simulation of adequate database size. Forensic Science International: Genetics Supplement Series, Volume 4 (1).
This function offers four different visualisations of exclusion/inclusion
powers, particularly for missing person cases. Output from MPPsims()
may be
fed directly as input to this function. The actual plotting is done with
ggplot2
.
powerPlot( ep, ip = NULL, type = 1, majorpoints = TRUE, minorpoints = TRUE, ellipse = FALSE, col = NULL, labs = NULL, jitter = FALSE, alpha = 1, stroke = 1.5, shape = "circle", size = 1, hline = NULL, vline = NULL, xlim = NULL, ylim = NULL, xlab = NULL, ylab = NULL )
powerPlot( ep, ip = NULL, type = 1, majorpoints = TRUE, minorpoints = TRUE, ellipse = FALSE, col = NULL, labs = NULL, jitter = FALSE, alpha = 1, stroke = 1.5, shape = "circle", size = 1, hline = NULL, vline = NULL, xlim = NULL, ylim = NULL, xlab = NULL, ylab = NULL )
ep , ip
|
Lists of equal length, with outputs from one or more runs of
|
type |
Plot type; either 1, 2, 3 or 4. |
majorpoints |
A logical indicating whether "major" points should be drawn (see Details). |
minorpoints |
A logical indicating whether "minor" points should be drawn (see Details). |
ellipse |
A logical. If TRUE, data ellipsis are drawn for each group containing more than 1 element. NB: This fails with a warning if all points in a group fall on a line. |
col |
A colour vector, recycle to match the top level length of |
labs |
A character of the same length as |
jitter |
A logical (default: FALSE). If TRUE, a small jitter is added to the major points. |
alpha |
Transparency for minor points (see Details). |
stroke |
Border width for major points (see Details). |
shape |
Either "circle", "square", "diamond", "triangleUp" or "triangleDown", determining the shapes of both minor and major points. |
size |
Point size. |
hline , vline
|
Single numbers indicating positions for horizontal/vertical "threshold" lines. If NULL (default), no lines are drawn. |
xlim , ylim
|
Axis limits; automatically chosen if NULL. |
xlab , ylab
|
Axis labels; automatically chosen if NULL. |
The plot types are as follows:
type = 1
: x = Exclusion power; y = Inclusion power
type = 2
: x = Exclusion odds ratio; y = Inclusion odds ratio
type = 3
: x = Expected number of exclusions; y = average log(LR)
type = 4
: x = Exclusion power; y = average LR
In the most general case ep
(and similarly for ip
) can be a list of lists
of EPresult
objects. We refer to the inner lists as "groups". A group may
consist of a single output, or several (typically many simulations of the
same situation). Points within the same group are always drawn with the same
colour and shape.
When plotting several groups, two sets of points are drawn by default:
Major points: Group means.
Minor points: Individual points in groups with more than one element.
The parameters majorpoints
and minorpoints
control which of the above
points are included.
A ggplot2
plot object.
MPPsims()
, missingPersonEP()
, missingPersonEP()
### Example 1: Comparing the power of 3 reference families ### # Frequencies for 2 STR markers db = NorwegianFrequencies[1:2] # Increase! # Define pedigrees and simulate data PAR = nuclearPed(1, child = "MP") |> profileSim(markers = db, ids = 1) SIB = nuclearPed(2) |> relabel(old = 4, new = "MP") |> profileSim(markers = db, ids = 3) GRA = linearPed(2) |> relabel(old = 5, new = "MP") |> profileSim(markers = db, ids = 1) # Collect in list and plot peds = list(PAR = PAR, SIB = SIB, GRA = GRA) plotPedList(peds, marker = 1, hatched = typedMembers, frames = FALSE, col = list(red = "MP")) # Compute exclusion/inclusion powers: ep = lapply(peds, function(y) missingPersonEP(y, missing = "MP", verbose = FALSE)) ip = lapply(peds, function(y) # increase nsim! missingPersonIP(y, missing = "MP", nsim = 5, threshold = 10, verbose = FALSE)) # Plot powerPlot(ep, ip, size = 2) # Different plot type, not dependent of `threshold` powerPlot(ep, ip, size = 2, type = 3) ### Example 2: Exploring powers for different sets of available relatives # Create trio pedigree ref = nuclearPed(father = "fa", mother = "mo", child = "MP") # Add empty marker with 5 alleles ref = addMarker(ref, alleles = 1:5) # Alternatives for genotyping sel = list("fa", c("fa", "mo")) # Simulate power for each selection simData = MPPsims(ref, selections = sel, nProfiles = 3, lrSims = 5, thresholdIP = 2, seed = 123, numCores = 1) # Power plot 1: EP vs IP powerPlot(simData, type = 1) powerPlot(simData, type = 1, minorpoints = FALSE, hline = 0.8) # Change shape, and modify legend order powerPlot(simData[3:1], type = 1, shape = c("ci", "sq", "di")) # Zoom in, and add threshold lines powerPlot(simData, type = 1, xlim = c(0.2, 1), ylim = c(0.5, 1), hline = 0.8, vline = 0.8) # Power plot 3: Expected number of exclusions vs E[log LR] powerPlot(simData, type = 3) # With horizontal/vertical lines powerPlot(simData, type = 3, hline = log10(2), vline = 1) # Plot 4: Illustrating the general inequality ELR > 1/(1-EP) powerPlot(simData, type = 4)
### Example 1: Comparing the power of 3 reference families ### # Frequencies for 2 STR markers db = NorwegianFrequencies[1:2] # Increase! # Define pedigrees and simulate data PAR = nuclearPed(1, child = "MP") |> profileSim(markers = db, ids = 1) SIB = nuclearPed(2) |> relabel(old = 4, new = "MP") |> profileSim(markers = db, ids = 3) GRA = linearPed(2) |> relabel(old = 5, new = "MP") |> profileSim(markers = db, ids = 1) # Collect in list and plot peds = list(PAR = PAR, SIB = SIB, GRA = GRA) plotPedList(peds, marker = 1, hatched = typedMembers, frames = FALSE, col = list(red = "MP")) # Compute exclusion/inclusion powers: ep = lapply(peds, function(y) missingPersonEP(y, missing = "MP", verbose = FALSE)) ip = lapply(peds, function(y) # increase nsim! missingPersonIP(y, missing = "MP", nsim = 5, threshold = 10, verbose = FALSE)) # Plot powerPlot(ep, ip, size = 2) # Different plot type, not dependent of `threshold` powerPlot(ep, ip, size = 2, type = 3) ### Example 2: Exploring powers for different sets of available relatives # Create trio pedigree ref = nuclearPed(father = "fa", mother = "mo", child = "MP") # Add empty marker with 5 alleles ref = addMarker(ref, alleles = 1:5) # Alternatives for genotyping sel = list("fa", c("fa", "mo")) # Simulate power for each selection simData = MPPsims(ref, selections = sel, nProfiles = 3, lrSims = 5, thresholdIP = 2, seed = 123, numCores = 1) # Power plot 1: EP vs IP powerPlot(simData, type = 1) powerPlot(simData, type = 1, minorpoints = FALSE, hline = 0.8) # Change shape, and modify legend order powerPlot(simData[3:1], type = 1, shape = c("ci", "sq", "di")) # Zoom in, and add threshold lines powerPlot(simData, type = 1, xlim = c(0.2, 1), ylim = c(0.5, 1), hline = 0.8, vline = 0.8) # Power plot 3: Expected number of exclusions vs E[log LR] powerPlot(simData, type = 3) # With horizontal/vertical lines powerPlot(simData, type = 3, hline = log10(2), vline = 1) # Plot 4: Illustrating the general inequality ELR > 1/(1-EP) powerPlot(simData, type = 4)
Simulation of DNA profiles for specified pedigree members. Some pedigree
members may already be genotyped; in that case the simulation is conditional
on these. The main work of this function is done by markerSim()
.
profileSim( x, N = 1, ids = NULL, markers = NULL, seed = NULL, numCores = 1, simplify1 = TRUE, verbose = TRUE, ... )
profileSim( x, N = 1, ids = NULL, markers = NULL, seed = NULL, numCores = 1, simplify1 = TRUE, verbose = TRUE, ... )
x |
A |
N |
The number of complete simulations to be performed. |
ids |
A character (or coercible to character) with ID labels indicating whose genotypes should be simulated. |
markers |
Either a vector indicating a subset of markers attached to
|
seed |
An integer seed for the random number generator (optional). |
numCores |
The number of cores to be used. The default is 1, i.e., no parallelisation. |
simplify1 |
A logical, by default TRUE, removing the outer list layer
when |
verbose |
A logical, by default TRUE. |
... |
Further arguments passed on to |
A list of N
objects similar to x
, but with simulated genotypes.
Any previously attached markers are replaced by the simulated profiles. If
the indicated markers contained genotypes for some pedigree members, these
are still present in the simulated profiles.
If N = 1
and simplify1 = TRUE
, the outer list layer is removed, i.e.,
profileSim(..., N = 1, simplify1 = T)
is equivalent to profileSim(..., N = 1, simplify1 = F)[[1]]
. This is usually the desired object in
interactive use, and works well with piping.
When using profileSim()
in other functions, it is recommended to add
simplify1 = FALSE
to safeguard against issues with N = 1
.
# Example pedigree with two brothers x = nuclearPed(children = c("B1", "B2")) ### Simulate profiles using built-in freq database profileSim(x, markers = NorwegianFrequencies[1:3]) ### Conditioning on known genotypes for one brother # Attach two SNP markers with genotypes for B1 y = x |> addMarker(B1 = "1/2", alleles = 1:2) |> addMarker(B1 = "1", alleles = 1:2, chrom = "X") # Simulate 2 profiles of B2 conditional on the above profileSim(y, N = 2, ids = "B2", seed = 123)
# Example pedigree with two brothers x = nuclearPed(children = c("B1", "B2")) ### Simulate profiles using built-in freq database profileSim(x, markers = NorwegianFrequencies[1:3]) ### Conditioning on known genotypes for one brother # Attach two SNP markers with genotypes for B1 y = x |> addMarker(B1 = "1/2", alleles = 1:2) |> addMarker(B1 = "1", alleles = 1:2, chrom = "X") # Simulate 2 profiles of B2 conditional on the above profileSim(y, N = 2, ids = "B2", seed = 123)
This function generalises markerSimParametric()
in the same way that
profileSim()
generalises markerSim()
.
profileSimParametric( kappa = NULL, delta = NULL, states = NULL, N = 1, freqList = NULL, seed = NULL, returnValue = c("singletons", "alleles", "genotypes", "internal") )
profileSimParametric( kappa = NULL, delta = NULL, states = NULL, N = 1, freqList = NULL, seed = NULL, returnValue = c("singletons", "alleles", "genotypes", "internal") )
kappa |
A probability vector of length 3, giving a set of realised kappa coefficients (between two noninbred individuals). |
delta |
A probability vector of length 9, giving a set of condensed identity coefficients (Jacquard coefficients). |
states |
An integer vector of length |
N |
A positive integer: the number of complete profiles to be simulated |
freqList |
A list of numeric vectors. Each vector is the allele frequencies of a marker. |
seed |
An integer seed for the random number generator (optional). |
returnValue |
Either "singleton" (default) or "alleles". (see Value). |
A list of length N
, whose entries are determined by returnValue
,
as explained in markerSimParametric()
.
# A single profile with 9 markers, each with forced identity state profileSimParametric(states = 1:9, freqList = NorwegianFrequencies[1:9])
# A single profile with 9 markers, each with forced identity state profileSimParametric(states = 1:9, freqList = NorwegianFrequencies[1:9])
A thin wrapper around kinshipLR()
for the common scenario of testing a pair
of individuals for paternity and/or sibship, against being unrelated.
quickLR(x, ids = typedMembers(x), test = c("pat", "sib", "half"))
quickLR(x, ids = typedMembers(x), test = c("pat", "sib", "half"))
x |
A ped object or a list of such. |
ids |
A vector of two typed members of |
test |
The hypotheses to be tested (against 'unrelatedness'). Allowed values are "pat" (=paternity), "sib" (=full siblings), and "half" (=half siblings). By default, all three are included. |
A (slightly simplified) LRresult
object, as described in
kinshipLR()
.
# Simulate 100 markers for half siblings x = halfSibPed() |> markerSim(N = 100, ids = 4:5, alleles = 1:3, seed = 1) # Test paternity, full sib, half sib quickLR(x)
# Simulate 100 markers for half siblings x = halfSibPed() |> markerSim(N = 100, ids = 4:5, alleles = 1:3, seed = 1) # Test paternity, full sib, half sib quickLR(x)
This is a special case of exclusionPower()
, computing the power to exclude
a random person as a given pedigree member. More specifically, the function
computes the probability of observing, in an individual unrelated to the
family individual, a genotype incompatible with the typed family members.
randomPersonEP(x, id, markers = NULL, disableMutations = NA, verbose = TRUE)
randomPersonEP(x, id, markers = NULL, disableMutations = NA, verbose = TRUE)
x |
A |
id |
The ID label of a single pedigree member. |
markers |
A vector indicating the names or indices of markers attached
to the source pedigree. If NULL (default), then all markers attached to the
source pedigree are used. If |
disableMutations |
This parameter determines how mutation models are treated. Possible values are as follows:
|
verbose |
A logical. |
The EPresult
object returned by exclusionPower()
.
# Four siblings: x = nuclearPed(4) # First 3 sibs typed with 4 triallelic markers x = markerSim(x, N = 4, ids = 3:5, alleles = 1:3, seed = 577, verbose = FALSE) # Probability that a random man is excluded as the fourth sibling randomPersonEP(x, id = 6)
# Four siblings: x = nuclearPed(4) # First 3 sibs typed with 4 triallelic markers x = markerSim(x, N = 4, ids = 3:5, alleles = 1:3, seed = 577, verbose = FALSE) # Probability that a random man is excluded as the fourth sibling randomPersonEP(x, id = 6)
This function is re-exported from the ribd
package. For documentation see
ribd::showInTriangle()
.
Unconditional simulation of unlinked markers
simpleSim( x, N, alleles, afreq, ids, Xchrom = FALSE, mutmod = NULL, seed = NULL, verbose = TRUE )
simpleSim( x, N, alleles, afreq, ids, Xchrom = FALSE, mutmod = NULL, seed = NULL, verbose = TRUE )
x |
A |
N |
A positive integer: the number of markers to be simulated. |
alleles |
A vector with allele labels. |
afreq |
A numeric vector of allele frequencies. If missing, the alleles are assumed to be equifrequent. |
ids |
A vector containing ID labels of those pedigree members whose genotypes should be simulated. |
Xchrom |
A logical: X linked markers or not? |
mutmod |
A list of mutation matrices named 'female' and 'male'. |
seed |
An integer seed for the random number generator (optional). |
verbose |
A logical. |
Simple genotype simulation, performed by first distributing alleles randomly
to all founders, followed by Mendelian gene dropping down throughout the
pedigree (i.e., for each non-founder a random allele is selected from each of
the parents). Finally, genotypes of individuals not included in ids
are
removed.
A ped
object equal to x
except its MARKERS
entry, which
consists of the N
simulated markers.
x = nuclearPed(1) simpleSim(x, N = 3, afreq = c(0.5, 0.5)) y = cousinPed(1, child = TRUE) simpleSim(y, N = 3, alleles = LETTERS[1:10])
x = nuclearPed(1) simpleSim(x, N = 3, afreq = c(0.5, 0.5)) y = cousinPed(1, child = TRUE) simpleSim(y, N = 3, alleles = LETTERS[1:10])