# Overview

This package provides the data used in the package Harman. It contains three datasets. The data and its processing is described below. For usage of the data for batch correction analyses please refer to the Harman vignette.

The HarmanData package is available from Bioconductor (HarmanData) or Github (HarmanData).

Overview of the datasets included in HarmanData:

object description
IMR90 cell-line data examining whether exposing mammalian cells to nitric oxide stabilizes mRNAs
NPM mouse data testing the skin penetration of metal oxide nanoparticles following topical application of sunscreens
OLF human olfactory stem cell line data on response to ZnO nanoparticle exposure

## Usage

## load package
library(HarmanData)
data(IMR90)
data(NPM)
data(OLF)
head(olf.data)
##         c1      c2      c3      c4      c5      c6      c7      c8      c9
## p1 5.05866 4.58076 5.58438 2.90481 5.39752 4.24041 2.46891 5.34241 2.86128
## p2 4.23886 4.08143 3.21386 3.53045 4.18741 3.70027 3.05552 4.62957 4.09687
## p3 3.66121 2.79664 4.13699 2.86271 3.17795 2.92988 3.05603 3.42135 2.70507
## p4 8.61399 9.09654 9.16841 9.10928 8.94949 8.70754 8.75558 9.31429 8.63934
## p5 2.84004 2.66609 3.03612 3.26561 3.22945 3.32247 3.05079 3.02775 3.18419
## p6 3.12234 3.05058 3.85761 3.07707 3.67759 3.72965 3.43910 3.15980 2.40544
##        c10     c11     c12     c13     c14     c15     c16     c17     c18
## p1 3.03601 4.16908 2.58603 3.14912 2.80076 5.84228 4.51905 3.63710 5.40139
## p2 3.59235 3.61548 3.87856 3.28656 4.12426 4.75150 4.44983 4.59084 4.87332
## p3 3.02844 3.11758 2.78865 3.82057 2.98588 4.34385 3.04461 3.47405 2.96032
## p4 8.67534 8.68344 9.06311 8.57974 9.01710 8.80506 8.82719 9.17436 8.72230
## p5 3.39400 3.27891 2.20645 3.26020 3.10178 3.72065 3.11529 3.75199 3.43958
## p6 3.33047 2.81670 3.34086 2.61524 2.38151 3.06240 4.51134 3.69132 3.08967
##        c19     c20     c21     c22     c23     c24     c25     c26     c27
## p1 4.61271 5.07018 5.11652 3.11507 3.63363 4.00769 4.57562 4.34872 4.81612
## p2 4.24876 4.43532 5.59789 2.64399 3.54669 3.26678 3.31107 4.03487 3.60095
## p3 3.31022 2.89191 3.84660 3.24867 2.56718 2.99377 3.18528 2.99099 3.14193
## p4 9.14509 9.27561 9.17592 8.79834 8.92382 9.42164 9.07371 8.91382 8.76901
## p5 3.41088 3.42294 3.77549 3.47989 2.79997 2.41906 2.64609 3.14506 3.39344
## p6 5.20479 4.35899 3.80101 2.95787 2.82707 2.76646 2.89902 3.28622 4.27340
##        c28
## p1 2.64101
## p2 4.10095
## p3 2.74735
## p4 8.30663
## p5 3.82901
## p6 3.21685
dim(olf.data)
## [1] 33297    28
table(olf.info)
##          Batch
## Treatment 1 2 3 4
##         1 1 1 1 1
##         2 1 1 1 1
##         3 1 1 1 1
##         4 1 1 1 1
##         5 1 1 1 1
##         6 1 1 1 1
##         7 1 1 1 1

## Data structure

All datasets in the package are represented in two data.frame’s. One containing the data, the other containing information on the phenotype and batch structure.

### IMR90

data.frame description
imr90.data Affymetrix HG-U133A Arrays with 22,223 probesets (rows) and 12 biological samples (columns).
imr90.info A description of the samples, with two columns, treatment and batch.

Data used in the batch effect correction paper of Johnson, Li and Rabinovich. The data are from a cell-line experimental designed to reveal whether exposing mammalian cells to nitric oxide (NO) stabilizes mRNAs. The data comprises one treatment, one control and 2 time points (0 h and 7.5 h), resulting in 4 distinct (2 treatment x 2 time points) experimental conditions. There were 3 batches and a total of 12 samples, with each batch consisting of 1 replicate from each of the experimental conditions. Affymetrix HG-U133A Arrays were normalised and background adjusted as a whole using the RMA procedure in MATLAB.

### NPM

data.frame description
npm.data Affymetrix MoGene 1.0 ST array data, with 35,512 probesets (rows) and 24 biological samples (columns).
npm.info A description of the samples, with two columns, treatment and batch.

An experiment to test skin penetration of metal oxide nanoparticles following topical application of sunscreens. The data comprises three treatment groups plus a control group, with six replicates in each group, making a total of 24 Affymetrix MoGene 1.0 ST arrays. There were a total of three processing batches of eight arrays, each consisting of 2 replicates per group. Arrays were normalised and background adjusted as a whole using the RMA procedure in MATLAB.

### OLF

data.frame description
olf.data has 33,297 probesets (rows) and 28 biological samples (columns).
olf.info A description of the samples, with two columns, treatment and batch.

An experiment to gauge the response of human olfactory neurosphere-derived (hONS) cells established from adult donors to ZnO nanoparticles. The data comprises six treatment groups plus a control group, each consisting of four replicates, giving a total number of 28 Affymetrix HuGene 1.0 ST arrays. The arrays were broken up into four processing batches of seven arrays each, consisting of one replicate from each of the groups. Arrays were normalised and background adjusted as a whole using the RMA procedure in MATLAB.

## References

1. Johnson et al. Biostatistics (2007). doi: 10.1093/biostatistics/kxj037.
2. Osmond-McLeod et al. Nanotoxicology (2014). doi: 10.3109/17435390.2013.855832.
3. Osmond-McLeod et al. Part Fibre Toxicol. (2013). doi: 10.1186/1743-8977-10-54.