This document is a tutorial for the Pedixplorer package, with examples of creating Pedigree objects, computing kinship matrices, and using other Pedigree utilities.
The Pedixplorer package is an updated version of the kinship2 package, featuring a change in maintainer and a move from CRAN to Bioconductor for continued development and support.
It contains the routines to handle family data with a Pedigree object. The initial purpose was to create correlation structures that describe family relationships such as kinship and identity-by-descent, which can be used to model family data in mixed effects models, such as in the coxme function. It also includes tools for Pedigree drawing and filtering, which are focused on producing compact layouts without intervention. Recent additions include utilities to trim the Pedigree object with various criteria, and kinship for the X chromosome.
Supplementary vignettes cover the Pedigree object, alignment, kinship, and plotting in more detail:
vignette("pedigree_object", package = "Pedixplorer")
vignette("pedigree_alignment", package = "Pedixplorer")
vignette("pedigree_kinship", package = "Pedixplorer")
vignette("pedigree_plot", package = "Pedixplorer")
The \(Pedixplorer\) package is available on Bioconductor and can be installed with the following command:
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
BiocManager::install("Pedixplorer")
The package can then be loaded with the following command:
library(Pedixplorer)
The \(Pedigree\) object is a list of dataframes that describe the family structure. It contains the following components, each documented on its own help page:
+ the \(Ped\) object: see help(Ped)
+ the \(Rel\) object: see help(Rel)
+ the \(Scales\) object: see help(Scales)
+ the \(Hints\) object: see help(Hints)
Two datasets are provided within the \(Pedixplorer\) package:
+ minnbreast: 17 families from a breast cancer study
+ sampleped: two sample pedigrees, with 41 and 14 subjects
This vignette uses the two pedigrees in \(sampleped\). For more information on these datasets, see help(minnbreast) and help(sampleped).
First, we load \(sampleped\) and look at some of the values in the dataset, and create a \(Pedigree\) object using the Pedigree() function. This function automatically detects the necessary columns in the dataframe. If necessary, you can modify the column names with cols_ren. To create a \(Pedigree\) object with multiple families, the ped_df dataframe just needs a family column. When this is the case, the famid column will be pasted to the id of each individual, separated by an underscore, to create a unique id for each individual in the \(Pedigree\) object.
data("sampleped")
print(sampleped[1:10, ])
## famid id dadid momid sex affection avail
## 1 1 101 <NA> <NA> 1 0 0
## 2 1 102 <NA> <NA> 2 1 0
## 3 1 103 135 136 1 1 0
## 4 1 104 <NA> <NA> 2 0 0
## 5 1 105 <NA> <NA> 1 NA 0
## 6 1 106 <NA> <NA> 2 NA 0
## 7 1 107 <NA> <NA> 1 1 0
## 8 1 108 <NA> <NA> 2 0 0
## 9 1 109 101 102 2 0 1
## 10 1 110 103 104 1 1 1
ped <- Pedigree(sampleped[c(3, 4, 10, 35, 36), ])
print(ped)
## Pedigree object with:
## Ped object with 5 individuals and 12 metadata columns:
## id dadid momid sex famid steril status
## <character> <character> <character> <c("ordered", "factor")> <character> <logical> <logical>
## 1_103 1_103 1_135 1_136 male 1 <NA> <NA>
## 1_104 1_104 <NA> <NA> female 1 <NA> <NA>
## 1_110 1_110 1_103 1_104 male 1 <NA> <NA>
## 1_135 1_135 <NA> <NA> male 1 <NA> <NA>
## 1_136 1_136 <NA> <NA> female 1 <NA> <NA>
## avail affected useful kin isinf num_child_tot num_child_dir num_child_ind |
## <logical> <logical> <logical> <numeric> <logical> <numeric> <numeric> <numeric> |
## 1_103 FALSE TRUE <NA> <NA> <NA> 1 1 0 |
## 1_104 FALSE FALSE <NA> <NA> <NA> 1 1 0 |
## 1_110 TRUE TRUE <NA> <NA> <NA> 0 0 0 |
## 1_135 FALSE <NA> <NA> <NA> <NA> 1 1 0 |
## 1_136 FALSE <NA> <NA> <NA> <NA> 1 1 0 |
## family indId fatherId motherId gender affection available error
## <character> <character> <character> <character> <integer> <integer> <integer> <character>
## 1_103 1 103 135 136 1 1 0 <NA>
## 1_104 1 104 <NA> <NA> 2 0 0 <NA>
## 1_110 1 110 103 104 1 1 1 <NA>
## 1_135 1 135 <NA> <NA> 1 <NA> 0 <NA>
## 1_136 1 136 <NA> <NA> 2 <NA> 0 <NA>
## sterilisation vitalStatus affection_mods avail_mods
## <logical> <logical> <numeric> <numeric>
## 1_103 <NA> <NA> 1 0
## 1_104 <NA> <NA> 0 0
## 1_110 <NA> <NA> 1 1
## 1_135 <NA> <NA> NA 0
## 1_136 <NA> <NA> NA 0
## Rel object with 0 relationshipswith 0 MZ twin, 0 DZ twin, 0 UZ twin, 0 Spouse:
## id1 id2 code famid
## <character> <character> <c("ordered", "factor")> <character>
For more information on the Pedigree() function, see help(Pedigree).
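The way a family id is combined with an individual id can be sketched in base R. This is only an illustration of the naming scheme described above, not the package's internal code; the data frame `toy` and its values are hypothetical.

```r
# Toy data frame in which the same id appears in two families (hypothetical values)
toy <- data.frame(
    famid = c(1, 1, 2, 2),
    id = c(101, 102, 101, 102)
)

# The raw ids collide across families
raw_collision <- anyDuplicated(toy$id) > 0

# Prefixing each id with its family id, separated by an underscore,
# gives one unique label per individual
toy$unique_id <- paste(toy$famid, toy$id, sep = "_")
toy$unique_id
```

This is why the ids printed from a multi-family \(Pedigree\) object look like 1_103 rather than 103.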
The \(Pedigree\) object can be subset to individual pedigrees by their family id. The \(Pedigree\) object has a print, summary and plot method, which we show below. The print method prints the \(Ped\) and \(Rel\) object of the pedigree. The summary method prints a short summary of the pedigree. Finally the plot method displays the pedigree.
ped <- Pedigree(sampleped)
print(famid(ped))
## [1] "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1"
## [25] "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "1" "2" "2" "2" "2" "2" "2" "2"
## [49] "2" "2" "2" "2" "2" "2" "2"
ped1 <- ped[famid(ped) == "1"]
summary(ped1)
## Pedigree object with
## [1] "Ped object with 41 individuals and 12 metadata columns"
## [1] "Rel object with 0 relationshipswith 0 MZ twin, 0 DZ twin, 0 UZ twin, 0 Spouse"
plot(ped1)
You can add a title and a legend to the plot with the following command:
plot(ped1, title = "Pedigree 1", legend = TRUE, leg_loc = c(5, 15, 4.5, 5))
To “break” the pedigree, we can change a sex value so that it no longer matches the parental role (in this example, we change \(203\) from male to female, even though \(203\) is a father). To do this, we first subset \(sampleped\) to family 2 as \(datped2\), select the row whose id is \(203\), and overwrite its sex value, replacing the correct value of 1 (male) with the incorrect value of 2 (female).
To further break the pedigree, we can delete a subject who seems irrelevant to it (in this example, we delete \(209\) because he is a married-in father). To do this, we index \(datped2\) with -which() to locate and drop the specified subject (in this case, \(209\)), and reassign the result to \(datped2\).
datped2 <- sampleped[sampleped$famid == 2, ]
datped2[datped2$id %in% 203, "sex"] <- 2
datped2 <- datped2[-which(datped2$id %in% 209), ]
An error occurs when the Pedigree() function notices that id \(203\) is not coded as male (1) but is a father, or that a parent id is missing from the id column; the message below reports the deleted father \(209\). To correct this, we simply employ the fix_parents() function to adjust the sex value to match either momid or dadid. fix_parents() will also add back any deleted subjects, further fixing the Pedigree.
tryout <- try({
    ped2 <- Pedigree(datped2)
})
## Error in validObject(.Object) :
## invalid class "Ped" object: dadid values '2_209' should be in '2_201', '2_202', '2_203', '2_204', '2_205'...
fixped2 <- with(datped2, fix_parents(id, dadid, momid, sex))
fixped2
## id momid dadid sex
## 1 201 <NA> <NA> 1
## 2 202 <NA> <NA> 2
## 3 203 <NA> <NA> 1
## 4 204 202 201 2
## 5 205 202 201 1
## 6 206 202 201 2
## 7 207 202 201 2
## 8 208 202 201 2
## 9 210 204 203 1
## 10 211 204 203 1
## 11 212 208 209 2
## 12 213 208 209 1
## 13 214 208 209 1
## 14 209 <NA> <NA> 1
ped2 <- Pedigree(fixped2)
plot(ped2)
If the fix is straightforward (changing one sex value based on either being a mother or father), fix_parents() will resolve the issue. If the issue is more complicated, say if \(203\) is coded to be both a father and a mother, fix_parents() will not know which one is correct and therefore the issue will not be resolved.
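Before calling fix_parents(), it can help to check for this ambiguous case by hand. The base R sketch below uses hypothetical toy vectors and is not part of the package; it simply lists ids that appear in both the father and the mother columns.

```r
# Toy parent columns (hypothetical): 203 is listed as a father twice
# and, by mistake, also once as a mother
dadid <- c(NA, NA, NA, 201, 203, 203)
momid <- c(NA, NA, NA, 202, 204, 203)

# Ids used in both roles cannot be fixed by changing a single sex value
both_roles <- intersect(dadid[!is.na(dadid)], momid[!is.na(momid)])
both_roles
```

Any id returned here needs a manual correction of dadid or momid before the Pedigree is built.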
A common use for pedigrees is to build a matrix of kinship coefficients that can be used in mixed effects models. A kinship coefficient is the probability that two alleles, one drawn at random from each of two people at a given locus, are identical by descent (IBD), assuming all founder alleles are independent. For example, we each have two alleles per autosomal marker, so sampling two alleles with replacement from our own DNA has only a \(p=0.50\) probability of getting the same allele twice.
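This definition leads to the standard recursion for kinship coefficients, written out below with a few worked values. The notation is introduced here for illustration only and is not taken from the package documentation: \(\phi_{ij}\) is the kinship between subjects \(i\) and \(j\), \(f_i\) and \(m_i\) are the father and mother of \(i\), and \(p\) and \(q\) are two unrelated, non-inbred founders with children \(c\) and \(d\).

```latex
\begin{aligned}
\phi_{ii} &= \tfrac{1}{2}\left(1 + \phi_{f_i m_i}\right), \\
\phi_{ij} &= \tfrac{1}{2}\left(\phi_{f_i j} + \phi_{m_i j}\right)
  \quad \text{for } i \neq j,\ i \text{ not an ancestor of } j, \\[4pt]
\text{founder } p\text{:}\quad \phi_{pp} &= \tfrac{1}{2}(1 + 0) = 0.5, \\
\text{parent and child:}\quad \phi_{cp} &= \tfrac{1}{2}\left(\phi_{pp} + \phi_{qp}\right)
  = \tfrac{1}{2}(0.5 + 0) = 0.25, \\
\text{full siblings:}\quad \phi_{cd} &= \tfrac{1}{2}\left(\phi_{pd} + \phi_{qd}\right)
  = \tfrac{1}{2}(0.25 + 0.25) = 0.25.
\end{aligned}
```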
We use kinship() to calculate the kinship matrix for \(ped2\). The result is a special symmetric matrix class from the Matrix R package, which is stored efficiently to avoid repeating elements.
kin2 <- kinship(ped2)
kin2[1:9, 1:9]
## 9 x 9 sparse Matrix of class "dsCMatrix"
## 201 202 203 204 205 206 207 208 210
## 201 0.500 . . 0.25 0.250 0.250 0.250 0.250 0.125
## 202 . 0.500 . 0.25 0.250 0.250 0.250 0.250 0.125
## 203 . . 0.50 . . . . . 0.250
## 204 0.250 0.250 . 0.50 0.250 0.250 0.250 0.250 0.250
## 205 0.250 0.250 . 0.25 0.500 0.250 0.250 0.250 0.125
## 206 0.250 0.250 . 0.25 0.250 0.500 0.250 0.250 0.125
## 207 0.250 0.250 . 0.25 0.250 0.250 0.500 0.250 0.125
## 208 0.250 0.250 . 0.25 0.250 0.250 0.250 0.500 0.125
## 210 0.125 0.125 0.25 0.25 0.125 0.125 0.125 0.125 0.500
For family 2, note that the row and column names match the ids in the figure below, that each subject's kinship coefficient with themselves is \(0.50\), that siblings have \(0.25\) (e.g. \(204-205\)), and that pedigree marry-ins share alleles IBD only with their children, with coefficient \(0.25\) (e.g. \(203-210\)). The plot can be used to verify other kinship coefficients.
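These values can be reproduced by hand on a small example. The base R sketch below is a minimal textbook recursion, not the algorithm used by kinship(); the function `toy_kinship()` and the pedigree `toy_ped` are hypothetical, and the sketch assumes that parents are listed before their children.

```r
# Toy pedigree: founders A and B, their children C and D,
# a married-in founder E, and F, the child of E and D.
toy_ped <- data.frame(
    id = c("A", "B", "C", "D", "E", "F"),
    dad = c(NA, NA, "A", "A", NA, "E"),
    mom = c(NA, NA, "B", "B", NA, "D"),
    stringsAsFactors = FALSE
)

toy_kinship <- function(ped) {
    n <- nrow(ped)
    # An extra last row and column of zeros stands in for an unknown parent
    k <- matrix(0, n + 1, n + 1)
    dad <- match(ped$dad, ped$id, nomatch = n + 1)
    mom <- match(ped$mom, ped$id, nomatch = n + 1)
    for (i in seq_len(n)) {
        # Kinship with everyone listed earlier: average over the two parents of i
        for (j in seq_len(i - 1)) {
            k[i, j] <- k[j, i] <- (k[dad[i], j] + k[mom[i], j]) / 2
        }
        # Self-kinship: one half of (1 + kinship between the parents of i)
        k[i, i] <- (1 + k[dad[i], mom[i]]) / 2
    }
    k <- k[seq_len(n), seq_len(n), drop = FALSE]
    dimnames(k) <- list(ped$id, ped$id)
    k
}

kin_toy <- toy_kinship(toy_ped)
kin_toy
```

In this toy matrix the siblings C and D, and the parent-child pairs, sit at 0.25, while the grandchild F has 0.125 with its grandparents A and B, matching the pattern seen for subject 210 above.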
The kinship() function also works on a \(Pedigree\) object with multiple families. We show how to create the kinship matrix, then show snapshots of it for the two families, where the row and column names are the ids of the subjects.
ped <- Pedigree(sampleped)
kin_all <- kinship(ped)
kin_all[1:9, 1:9]
## 9 x 9 sparse Matrix of class "dsCMatrix"
## 1_101 1_102 1_103 1_104 1_105 1_106 1_107 1_108 1_109
## 1_101 0.50 . . . . . . . 0.25
## 1_102 . 0.50 . . . . . . 0.25
## 1_103 . . 0.5 . . . . . .
## 1_104 . . . 0.5 . . . . .
## 1_105 . . . . 0.5 . . . .
## 1_106 . . . . . 0.5 . . .
## 1_107 . . . . . . 0.5 . .
## 1_108 . . . . . . . 0.5 .
## 1_109 0.25 0.25 . . . . . . 0.50
kin_all[40:43, 40:43]
## 4 x 4 sparse Matrix of class "dsCMatrix"
## 1_140 1_141 2_201 2_202
## 1_140 0.50 0.25 . .
## 1_141 0.25 0.50 . .
## 2_201 . . 0.5 .
## 2_202 . . . 0.5
kin_all[42:46, 42:46]
## 5 x 5 sparse Matrix of class "dsCMatrix"
## 2_201 2_202 2_203 2_204 2_205
## 2_201 0.50 . . 0.25 0.25
## 2_202 . 0.50 . 0.25 0.25
## 2_203 . . 0.5 . .
## 2_204 0.25 0.25 . 0.50 0.25
## 2_205 0.25 0.25 . 0.25 0.50
Specifying twin relationships in a Pedigree object with multiple families is complicated by the fact that the user must specify the family id to which id1 and id2 belong. As shown below, the relation matrix requires the family id in the last column, with the column names given below, for the monozygotic twins to show up correctly in the plots and kinship matrices. We specify monozygosity for subjects \(206\) and \(207\) in \(ped2\), and for subjects \(125\) and \(126\) in \(ped1\). We check the result by looking at the kinship matrix for these pairs, which are correctly at \(0.5\).
reltwins <- as.data.frame(rbind(c(206, 207, 1, 2), c(125, 126, 1, 1)))
colnames(reltwins) <- c("indId1", "indId2", "code", "family")
ped <- Pedigree(sampleped, reltwins)
kin_all <- kinship(ped)
kin_all[24:27, 24:27]
## 4 x 4 sparse Matrix of class "dsCMatrix"
## 1_124 1_125 1_126 1_127
## 1_124 0.5000 0.0625 0.0625 0.0625
## 1_125 0.0625 0.5000 0.5000 0.1250
## 1_126 0.0625 0.5000 0.5000 0.1250
## 1_127 0.0625 0.1250 0.1250 0.5000
kin_all[46:50, 46:50]
## 5 x 5 sparse Matrix of class "dsCMatrix"
## 2_205 2_206 2_207 2_208 2_209
## 2_205 0.50 0.25 0.25 0.25 .
## 2_206 0.25 0.50 0.50 0.25 .
## 2_207 0.25 0.50 0.50 0.25 .
## 2_208 0.25 0.25 0.25 0.50 .
## 2_209 . . . . 0.5
Note that subject \(113\) is not in \(ped1\) because they are a marry-in without children in the \(Pedigree\). Subject \(113\) is in their own \(Pedigree\) of size 1 in the \(kin_all\) matrix at index \(41\). We later show how to handle such marry-ins for plotting.
We use \(ped2\) from \(sampleped\) to sequentially add optional information to the \(Pedigree\) object.
The example below shows how to specify a \(status\) indicator, such as vital status. The \(sampleped\) data does not include such an indicator, so we create one to indicate that the first generation of \(ped2\), subjects \(201\) and \(202\) (the first two rows), are deceased. The \(status\) indicator is used to cross out the individuals in the Pedigree plot.
df2 <- sampleped[sampleped$famid == 2, ]
names(df2)
## [1] "famid" "id" "dadid" "momid" "sex" "affection" "avail"
df2$status <- c(1, 1, rep(0, 12))
ped2 <- Pedigree(df2)
summary(status(ped(ped2)))
## Mode FALSE TRUE
## logical 12 2
plot(ped2)
Here we show how to use the \(label\) argument in the plot method to add additional information under each subject. In the example below, we add names to the existing plot by adding a new column to the \(elementMetadata\) of the \(Ped\) object of the \(Pedigree\).
As space permits, more lines and characters per line can be added, using the \n character to indicate a new line.
mcols(ped2)$Names <- c(
    "John\nDalton", "Linda", "Jack", "Rachel", "Joe", "Deb",
    "Lucy", "Ken", "Barb", "Mike", "Matt",
    "Mindy", "Mark", "Marie\nCurie"
)
plot(ped2, label = "Names")
We show how to specify affected status with a single indicator and with multiple indicators. First, we use the affection indicator from \(sampleped\), which contains \(0\)/\(1\) values with \(NA\) for missing, and let it indicate blue eyes. Next, we create a vector as an indicator for baldness and add it as a second filling scale for the plot with generate_colors(add_to_scale = TRUE). The plot shape for each subject is therefore divided into two equal parts, shaded differently to indicate the two affected indicators.
mcols(ped2)$bald <- as.factor(c(0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1))
ped2 <- generate_colors(ped2, col_aff = "bald", add_to_scale = TRUE)
plot(ped2, legend = TRUE)
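Special pedigree relationships can be specified in a matrix as the \(relation\) argument. There are 4 relationships that can be specified by numeric codes:
+ 1 = monozygotic (MZ) twin
+ 2 = dizygotic (DZ) twin
+ 3 = twin of unknown zygosity (UZ)
+ 4 = spouse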
The spouse relationship can indicate a marry-in when a couple does not have children together.
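The printed \(Rel\) objects in this vignette show the code as an ordered factor with the labels MZ twin, DZ twin, UZ twin and Spouse. The base R sketch below shows how numeric codes could be turned into such labels. It is an illustration, not the package's own conversion, and the position of DZ twin as code 2 is assumed from the order of the labels in the printed output.

```r
# Labels in the order they appear in the printed Rel summary
code_labels <- c("MZ twin", "DZ twin", "UZ twin", "Spouse")

# Hypothetical numeric codes for three relationships
codes <- c(1, 3, 4)

# Map the codes to an ordered factor with readable labels
rel_labels <- factor(codes, levels = 1:4, labels = code_labels, ordered = TRUE)
rel_labels
```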
Below, we create a matrix of relationships for monozygotic and unknown-zygosity twins in the most recent generation of \(ped2\). The twin relationships are both represented with diverging lines from a single point. The monozygotic twins have an additional line connecting the diverging lines, while the other twins have a question mark to indicate unknown zygosity.
## create twin relationships
rel_df <- data.frame(
    indId1 = c("210", "212"),
    indId2 = c("211", "213"),
    code = c(1, 3),
    family = c("2", "2")
)
rel(ped2) <- upd_famid_id(with(rel_df, Rel(indId1, indId2, code, family)))
plot(ped2)
Another special relationship is inbreeding. Inbreeding of founders implies the founders’ parents are related (the maternal and paternal genes descended from a single ancestral gene). One thing we can do is add more people to the pedigree to show this inbreeding.
To show that a pair of founders (subjects \(201\) and \(202\)) are related, we must show that their parents are siblings. To do this, we create subjects \(197\) and \(198\) to be the parents of \(201\) and also create subjects \(199\) and \(200\) to be the parents of \(202\). To make subjects \(198\) and \(199\) siblings, we give them the same parents, creating subjects \(195\) and \(196\). This makes subjects \(201\) and \(202\) first cousins, so their union is consanguineous and their children are inbred.
indid <- 195:202
dadid <- c(NA, NA, NA, 196, 196, NA, 197, 199)
momid <- c(NA, NA, NA, 195, 195, NA, 198, 200)
sex <- c(2, 1, 1, 2, 1, 2, 1, 2)
ped3 <- data.frame(
    id = indid, dadid = dadid,
    momid = momid, sex = sex
)
ped4df <- rbind.data.frame(df2[-c(1, 2), 2:5], ped3)
ped4 <- Pedigree(ped4df)
plot(ped4)
A spouse relationship without children can also be specified with the \(rel_df\) argument by setting the code value to \(Spouse\) or \(4\). If we take \(ped2\) from earlier and add a new spouse relationship between individuals \(212\) and \(211\), we get the following plot.
## create spouse relationship
rel_df2 <- data.frame(
    id1 = "211",
    id2 = "212",
    code = 4,
    famid = "2"
)
new_rel <- c(rel(ped2), with(rel_df2, Rel(id1, id2, code, famid)))
rel(ped2) <- upd_famid_id(new_rel)
plot(ped2)
The plot method attempts to adhere to many standards in pedigree plotting, as presented by Bennett et al. 2008.
To show some other tricks with pedigree plotting, we use \(ped1\) from \(sampleped\), which has 41 subjects in 4 generations, including a generation with double first cousins. After the first marriage of \(114\), they remarried subject \(113\) without children between them. If we do not specify the marriage with the \(relation\) argument, the plot method excludes subject \(113\) from the plot. The basic plot of \(ped1\) is shown in the figure below.
df1 <- sampleped[sampleped$famid == 1, ]
relate1 <- data.frame(
    id1 = 113,
    id2 = 114,
    code = 4,
    famid = 1
)
ped1 <- Pedigree(df1, relate1)
plot(ped1)
The plot method does a decent job of aligning subjects given the order of the subjects when the Pedigree object is made, and sometimes has to make two copies of a subject. If we change the order of the subjects when creating the Pedigree, we can help the plot method reduce the need to duplicate subjects: the plot below no longer has subject \(110\) duplicated.
df1reord <- df1[c(35:41, 1:34), ]
ped1reord <- Pedigree(df1reord, relate1)
plot(ped1reord)
You can modify the colors of each modality used for the filling as well as for the border by modifying the \(Scales\) data.frame, as follows:
scales(ped1)
## An object of class "Scales"
## Slot "fill":
## order column_values column_mods mods labels affected fill density angle
## 1 1 affection affection_mods 0 Healthy <= to 0.5 FALSE white NA NA
## 2 1 affection affection_mods 1 Affected > to 0.5 TRUE red NA NA
## 3 1 affection affection_mods NA <NA> NA grey NA NA
##
## Slot "border":
## column_values column_mods mods labels border
## 1 avail avail_mods NA NA grey
## 2 avail avail_mods 1 Available green
## 3 avail avail_mods 0 Non Available black
fill(ped1)$fill <- c("green", "blue", "purple")
fill(ped1)$density <- c(30, 15, NA)
fill(ped1)$angle <- c(45, 0, NA)
border(ped1)$border <- c("red", "black", "orange")
plot(ped1, legend = TRUE)
The main features of a Pedigree object are vectors with an element for each subject. It is sometimes useful to extract these vectors from the Pedigree object into a \(data.frame\) with basic information that can be used to construct a new \(Pedigree\) object. This is possible with the as.data.frame() method, as shown below.
dfped2 <- as.data.frame(ped(ped2))
dfped2
Large pedigrees can be a bottleneck for programs that run calculations on them. The Pedixplorer package contains some routines to identify which subjects to remove. We show how a subject (e.g. subject 210) can be removed from \(ped2\), and how the Pedigree object changes, by verifying that the relation dataframe no longer has the twin relationship between subjects 210 and 211, as indicated by \(id1\) and \(id2\).
ped2_rm210 <- ped2[-10]
rel(ped2_rm210)
## Rel object with 2 relationshipswith 0 MZ twin, 0 DZ twin, 1 UZ twin, 1 Spouse:
## id1 id2 code famid
## <character> <character> <c("ordered", "factor")> <character>
## 1 2_212 2_213 UZ twin 2
## 2 2_211 2_212 Spouse 2
rel(ped2)
## Rel object with 3 relationshipswith 1 MZ twin, 0 DZ twin, 1 UZ twin, 1 Spouse:
## id1 id2 code famid
## <character> <character> <factor> <character>
## 1 2_210 2_211 MZ twin 2
## 2 2_212 2_213 UZ twin 2
## 3 2_211 2_212 Spouse 2
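The effect on the relationships can be mimicked in base R: a relationship survives only if both of its members are still present. The sketch below copies the three relationships printed above into a plain data frame; it is an illustration of the outcome, not the package's subsetting code, and the names `rel_toy`, `remaining_ids` and `rel_kept` are hypothetical.

```r
# The three relationships of ped2, copied into a plain data frame
rel_toy <- data.frame(
    id1 = c("2_210", "2_212", "2_211"),
    id2 = c("2_211", "2_213", "2_212"),
    code = c("MZ twin", "UZ twin", "Spouse"),
    stringsAsFactors = FALSE
)

# Ids of family 2 that remain once subject 2_210 is dropped
remaining_ids <- setdiff(paste("2", 201:214, sep = "_"), "2_210")

# Keep a relationship only when both members are still in the pedigree
rel_kept <- rel_toy[
    rel_toy$id1 %in% remaining_ids & rel_toy$id2 %in% remaining_ids,
]
rel_kept
```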
The steps above also work with the ids of the subjects themselves. We provide subset(), which trims subjects from a Pedigree by their \(id\) or another argument. Below is an example of removing subject 210, as done above; we then further trim the Pedigree by a vector of subject ids. We check the trimming by looking at the \(id\) vector and the \(relation\) matrix.
ped2_trim210 <- subset(ped2, "2_210", keep = FALSE)
id(ped(ped2_trim210))
## [1] "2_201" "2_202" "2_203" "2_204" "2_205" "2_206" "2_207" "2_208" "2_209" "2_211" "2_212" "2_213"
## [13] "2_214"
rel(ped2_trim210)
## Rel object with 2 relationshipswith 0 MZ twin, 0 DZ twin, 1 UZ twin, 1 Spouse:
## id1 id2 code famid
## <character> <character> <c("ordered", "factor")> <character>
## 1 2_212 2_213 UZ twin 2
## 2 2_211 2_212 Spouse 2
ped2_trim_more <- subset(ped2_trim210, c("2_212", "2_214"), keep = FALSE)
id(ped(ped2_trim_more))
## [1] "2_201" "2_202" "2_203" "2_204" "2_205" "2_206" "2_207" "2_208" "2_209" "2_211" "2_213"
rel(ped2_trim_more)
## Rel object with 0 relationshipswith 0 MZ twin, 0 DZ twin, 0 UZ twin, 0 Spouse:
## id1 id2 code famid
## <character> <character> <c("ordered", "factor")> <character>
An additional function in Pedixplorer is shrink(), which shrinks a Pedigree to a specified bit size while maintaining the maximal amount of information for genetic linkage and association studies. Using an indicator for availability and affected status, it removes subjects in this order:
+ unavailable with no available descendants
+ available and are not parents
+ available who have missing affected status
+ available who are unaffected
+ available who are affected
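The bit size is not defined in this vignette. A common definition, which we assume here, is twice the number of non-founders minus the number of founders. The base R helper below is a hypothetical sketch of that definition, not the package's implementation.

```r
# Assumed definition: bit size = 2 * (number of non-founders) - (number of founders)
toy_bit_size <- function(dadid, momid) {
    # A founder has neither parent recorded
    founder <- is.na(dadid) & is.na(momid)
    2 * sum(!founder) - sum(founder)
}

# Nuclear family: two founders (rows 1 and 2) and their three children
nuclear_bits <- toy_bit_size(
    dadid = c(NA, NA, 1, 1, 1),
    momid = c(NA, NA, 2, 2, 2)
)
nuclear_bits
```

Under this definition, removing a non-founder lowers the bit size by 2, whereas removing a founder alone would raise it by 1.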
We show how to shrink Pedigree 1 to bit size \(30\), which happens to be the bit size after removing only the unavailable subjects. We show how to extract the shrunken Pedigree object from the \(shrink\) result, and plot it.
set.seed(200)
shrink1_b30 <- shrink(ped1, max_bits = 30)
print(shrink1_b30[c(2:8)])
## $id_trim
## [1] "1_101" "1_102" "1_107" "1_108" "1_111" "1_113" "1_121" "1_122" "1_123" "1_131" "1_132" "1_134"
## [13] "1_139"
##
## $id_lst
## $id_lst$unavail
## [1] "1_101" "1_102" "1_107" "1_108" "1_111" "1_113" "1_121" "1_122" "1_123" "1_131" "1_132" "1_134"
## [13] "1_139"
##
##
## $bit_size
## [1] 46 29
##
## $avail
## [1] FALSE FALSE FALSE FALSE TRUE TRUE FALSE FALSE FALSE TRUE FALSE TRUE TRUE FALSE TRUE TRUE
## [17] TRUE TRUE TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE TRUE TRUE
##
## $pedSizeOriginal
## [1] 41
##
## $pedSizeIntermed
## [1] 28
##
## $pedSizeFinal
## [1] 28
plot(shrink1_b30$pedObj)
Now shrink Pedigree 1 to bit size \(25\), which requires removing subjects who are informative. If there is a tie between multiple subjects about who to remove, the method randomly chooses one of them. With this seed setting, the method removes subjects \(140\) then \(141\).
set.seed(10)
shrink1_b25 <- shrink(ped1, max_bits = 25)
print(shrink1_b25[c(2:8)])
## $id_trim
## [1] "1_101" "1_102" "1_107" "1_108" "1_111" "1_113" "1_121" "1_122" "1_123" "1_131" "1_132" "1_134"
## [13] "1_139" "1_140" "1_141"
##
## $id_lst
## $id_lst$unavail
## [1] "1_101" "1_102" "1_107" "1_108" "1_111" "1_113" "1_121" "1_122" "1_123" "1_131" "1_132" "1_134"
## [13] "1_139"
##
## $id_lst$affect
## [1] "1_140" "1_141"
##
##
## $bit_size
## [1] 46 29 27 23
##
## $avail
## [1] FALSE FALSE FALSE FALSE TRUE TRUE FALSE FALSE FALSE TRUE FALSE TRUE TRUE FALSE TRUE TRUE
## [17] TRUE TRUE TRUE TRUE TRUE TRUE
##
## $pedSizeOriginal
## [1] 41
##
## $pedSizeIntermed
## [1] 28
##
## $pedSizeFinal
## [1] 22
plot(shrink1_b25$pedObj)
sessionInfo()
## R version 4.4.0 beta (2024-04-15 r86425)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.4 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.19-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
## [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
## [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] Pedixplorer_1.0.0 BiocStyle_2.32.0
##
## loaded via a namespace (and not attached):
## [1] sass_0.4.9 utf8_1.2.4 generics_0.1.3 tidyr_1.3.1
## [5] stringi_1.8.3 lattice_0.22-6 digest_0.6.35 magrittr_2.0.3
## [9] evaluate_0.23 grid_4.4.0 bookdown_0.39 fastmap_1.1.1
## [13] plyr_1.8.9 jsonlite_1.8.8 Matrix_1.7-0 brio_1.1.5
## [17] tinytex_0.50 BiocManager_1.30.22 purrr_1.0.2 fansi_1.0.6
## [21] scales_1.3.0 jquerylib_0.1.4 cli_3.6.2 rlang_1.1.3
## [25] munsell_0.5.1 withr_3.0.0 cachem_1.0.8 yaml_2.3.8
## [29] tools_4.4.0 dplyr_1.1.4 colorspace_2.1-0 ggplot2_3.5.1
## [33] BiocGenerics_0.50.0 vctrs_0.6.5 R6_2.5.1 magick_2.8.3
## [37] stats4_4.4.0 lifecycle_1.0.4 stringr_1.5.1 S4Vectors_0.42.0
## [41] pkgconfig_2.0.3 pillar_1.9.0 bslib_0.7.0 gtable_0.3.5
## [45] glue_1.7.0 Rcpp_1.0.12 xfun_0.43 tibble_3.2.1
## [49] tidyselect_1.2.1 highr_0.10 knitr_1.46 htmltools_0.5.8.1
## [53] rmarkdown_2.26 testthat_3.2.1.1 compiler_4.4.0 quadprog_1.5-8