| Type: | Package |
| Title: | Detection of Statistically Significant Combinations of SNPs in Association Mapping |
| Version: | 0.6.1 |
| Description: | A significant pattern mining-based toolbox for region-based genome-wide association studies and higher-order epistasis analyses, implementing the methods described in Llinares-López et al. (2017) <doi:10.1093/bioinformatics/btx071>. |
| Depends: | R (≥ 3.0.2) |
| Imports: | methods, Rcpp |
| LinkingTo: | Rcpp |
| Encoding: | UTF-8 |
| LazyData: | true |
| License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
| NeedsCompilation: | yes |
| RoxygenNote: | 6.0.1 |
| SystemRequirements: | C++11 |
| Suggests: | testthat, knitr, rmarkdown |
| Author: | Felipe Llinares-López [aut, cph], Laetitia Papaxanthos [aut, cph], Damian Roqueiro [aut, cph], Matthew Baker [ctr], Mikołaj Rybiński [ctr], Uwe Schmitt [ctr], Dean Bodenham [aut, cre, cph], Karsten Borgwardt [aut, fnd, cph] |
| Maintainer: | Dean Bodenham <deanbodenhambsse@gmail.com> |
| VignetteBuilder: | knitr |
| Packaged: | 2020-05-04 18:15:36 UTC; dean |
| Repository: | CRAN |
| Date/Publication: | 2020-05-05 18:10:02 UTC |
Constructor for CASMAP class object.
Description
Constructor for CASMAP class object.
Details
Constructor for CASMAP class object, which needs the mode
parameter to be set by the user. Please see the examples.
Fields
mode |
Either 'regionGWAS' or 'higherOrderEpistasis'. |
alpha |
A numeric value setting the Family-wise Error Rate (FWER). Must be strictly between 0 and 1. Default value is 0.05. |
max_comb_size |
A numeric specifying the maximum length of combinations. For example, if set to 4, then only combinations of size between 1 and 4 (inclusive) will be considered. To consider combinations of arbitrary (maximal) length, use value 0, which is the default value. |
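To give a sense of what max_comb_size controls, the sketch below counts how many candidate combinations exist for a given number of features. It is illustrative only: the helper names count_itemsets and count_intervals are hypothetical and not part of CASMAP, and it assumes that combinations are arbitrary feature subsets in 'higherOrderEpistasis' mode and contiguous intervals in 'regionGWAS' mode.

```r
# Hypothetical helpers (not part of CASMAP): size of the search space
# for p binary features and a given max_comb_size.
# As in the field description, max_comb_size = 0 means "no limit".

count_itemsets <- function(p, max_comb_size = 0) {
  k <- if (max_comb_size == 0) p else min(max_comb_size, p)
  # number of feature subsets of size 1..k
  sum(choose(p, seq_len(k)))
}

count_intervals <- function(p, max_comb_size = 0) {
  k <- if (max_comb_size == 0) p else min(max_comb_size, p)
  # an interval of length l can start at p - l + 1 positions
  sum(p - seq_len(k) + 1)
}

count_itemsets(1000, 4)   # subsets of size 1 to 4 among 1000 features
count_intervals(1000, 4)  # contiguous intervals of length 1 to 4
```

The subset count grows much faster than the interval count, which is one reason a small max_comb_size can be a sensible starting point for the epistasis mode.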
Base method, for both modes
readFiles |
Read the data, label and possibly covariates files. Parameters are genotype_file for the data, phenotype_file for the labels and (optional) covariate_file for the covariates. The option plink_file_root is not supported in the current version, but will be supported in future versions. |
setMode |
Can set/change the mode, but note that any data files will need to be read in again using the readFiles command. |
setTargetFWER |
Can set/change the Family-wise Error Rate (FWER). Takes a numeric parameter alpha, strictly between 0 and 1. |
execute |
Once the data files have been read, can execute the algorithm. Please note that, depending on the size of the data files, this could take a long time. |
getSummary |
Returns a data frame with a summary of the results from the execution, but not any significant regions/itemsets. See getSignificantRegions, getSignificantInteractions, and getSignificantClusterRepresentatives. |
writeSummary |
Directly write the information from getSummary to file. |
regionGWAS Methods
getSignificantRegions |
Returns a data frame with the significant regions. Only valid when mode='regionGWAS'. |
getSignificantClusterRepresentatives |
Returns a data frame with the representatives of the significant clusters. This will be a subset of the regions returned from getSignificantRegions. Only valid when mode='regionGWAS'. |
writeSignificantRegions |
Writes the data from getSignificantRegions to file, which must be specified in the parameter path. Only valid when mode='regionGWAS'. |
writeSignificantClusterRepresentatives |
Writes the data from getSignificantClusterRepresentatives to file, which must be specified in the parameter path. Only valid when mode='regionGWAS'. |
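As a rough illustration of the two write methods, the sketch below runs the regionGWAS example data and writes both result tables to a directory. The function name write_region_results and the output file names are hypothetical. The code is guarded so that nothing runs unless CASMAP is installed and the function is called.

```r
# Sketch only: write_region_results and the file names are made up for
# illustration; the CASMAP methods used are those listed above.
write_region_results <- function(out_dir = tempdir()) {
  if (!requireNamespace("CASMAP", quietly = TRUE)) {
    message("CASMAP is not installed; skipping.")
    return(invisible(NULL))
  }
  obj <- CASMAP::CASMAP(mode = "regionGWAS")
  obj$readFiles(genotype_file  = CASMAP::getExampleDataFilename(),
                phenotype_file = CASMAP::getExampleLabelsFilename())
  obj$execute()  # may take some time on larger data

  regions_path  <- file.path(out_dir, "significant_regions.txt")
  clusters_path <- file.path(out_dir, "cluster_representatives.txt")
  obj$writeSignificantRegions(path = regions_path)
  obj$writeSignificantClusterRepresentatives(path = clusters_path)

  # return the paths so the caller can inspect the files
  invisible(c(regions = regions_path, clusters = clusters_path))
}
```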
higherOrderEpistasis Methods
getSignificantInteractions |
Returns a data frame with the significant interactions. Only valid when mode='higherOrderEpistasis'. |
writeSignificantInteractions |
Writes the data from getSignificantInteractions to file, which must be specified in the parameter path. Only valid when mode='higherOrderEpistasis'. |
References
A. Terada, M. Okada-Hatakeyama, K. Tsuda and J. Sese, Statistical significance of combinatorial regulations, Proceedings of the National Academy of Sciences (2013) 110 (32): 12996-13001.
F. Llinares-Lopez, D. G. Grimm, D. Bodenham, U. Gieraths, M. Sugiyama, B. Rowan and K. Borgwardt, Genome-wide detection of intervals of genetic heterogeneity associated with complex traits, ISMB 2015, Bioinformatics (2015) 31 (12): i240-i249.
L. Papaxanthos, F. Llinares-Lopez, D. Bodenham and K. Borgwardt, Finding significant combinations of features in the presence of categorical covariates, Advances in Neural Information Processing Systems 29 (NIPS 2016), 2271-2279.
F. Llinares-Lopez, L. Papaxanthos, D. Bodenham, D. Roqueiro and K. Borgwardt, Genome-wide genetic heterogeneity discovery with categorical covariates, Bioinformatics (2017) 33 (12): 1820-1828.
Examples
## An example using the "regionGWAS" mode
fastcmh <- CASMAP(mode="regionGWAS") # initialise object
datafile <- getExampleDataFilename() # file name of example data
labelsfile <- getExampleLabelsFilename() # file name of example labels
covfile <- getExampleCovariatesFilename() # file name of example covariates
# read the data, labels and covariate files
fastcmh$readFiles(genotype_file=datafile,
                  phenotype_file=labelsfile,
                  covariate_file=covfile)
# execute the algorithm (this may take some time)
fastcmh$execute()
# get the summary results
summary_results <- fastcmh$getSummary()
# get the significant regions
sig_regions <- fastcmh$getSignificantRegions()
# get the clustered representatives for the significant regions
sig_cluster_rep <- fastcmh$getSignificantClusterRepresentatives()
## Another example of regionGWAS
fais <- CASMAP(mode="regionGWAS") # initialise object
# read the data and labels, but no covariates
fais$readFiles(genotype_file=datafile,
               phenotype_file=labelsfile)
## Another example, doing higher order epistasis search
facs <- CASMAP(mode="higherOrderEpistasis") # initialise object
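The higher-order epistasis example stops after initialisation. The sketch below shows how the remaining steps might look, following the same pattern as the regionGWAS example. It rests on two assumptions that are not confirmed here: that the constructor accepts the fields alpha and max_comb_size as named arguments, and that the regionGWAS example files can also be read in this mode. The function name run_epistasis_example is hypothetical.

```r
# Sketch only: guarded so that nothing runs unless CASMAP is installed
# and the function is called explicitly.
run_epistasis_example <- function(alpha = 0.05, max_comb_size = 2) {
  if (!requireNamespace("CASMAP", quietly = TRUE)) {
    message("CASMAP is not installed; skipping.")
    return(invisible(NULL))
  }
  # Assumption: the fields can be set through the constructor
  facs <- CASMAP::CASMAP(mode = "higherOrderEpistasis",
                         alpha = alpha,
                         max_comb_size = max_comb_size)
  # Assumption: the example files are usable in this mode as well
  facs$readFiles(genotype_file  = CASMAP::getExampleDataFilename(),
                 phenotype_file = CASMAP::getExampleLabelsFilename())
  facs$execute()  # may take some time
  list(summary      = facs$getSummary(),
       interactions = facs$getSignificantInteractions())
}
```

A small max_comb_size is used as the default here only to keep the search short; setting it to 0 would consider combinations of arbitrary length.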
Global variables environment
Description
An environment to store a few global variables. Internal.
Usage
CASMAPenv
Format
An object of class environment of length 3.
Approximate fast significant interval search
Description
Class for approximate significant intervals search with Tarone correction for bounding intermediate FWERs.
Internal class for search for significant regions
Description
Please use the CASMAP constructor.
Fast significant interval search with categorical covariates
Description
Internal class, please use CASMAP constructor.
Significant itemsets search with categorical covariates
Description
Internal class, please use CASMAP constructor.
Check if a variable is boolean or not
Description
Checks if a variable is boolean, if not throws an error, otherwise returns boolean.
Usage
checkIsBoolean(var, name)
Arguments
var |
The variable to be checked (if boolean). |
name |
The name of the variable to appear in any error message. |
Value
If var is neither boolean nor NA, throws an error.
If var is NA, returns FALSE. Otherwise returns the
boolean value of var.
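The behaviour described above can be sketched in a few lines of base R. This is not the package's implementation; the name check_is_boolean_sketch and the wording of the error message are assumptions made for illustration.

```r
# Hypothetical stand-in for the documented behaviour of checkIsBoolean
check_is_boolean_sketch <- function(var, name) {
  # a single NA is mapped to FALSE
  if (length(var) == 1 && is.na(var)) {
    return(FALSE)
  }
  # anything that is not a single logical value is rejected
  if (!is.logical(var) || length(var) != 1) {
    stop("'", name, "' must be TRUE or FALSE.", call. = FALSE)
  }
  var
}

check_is_boolean_sketch(TRUE, "verbose")  # TRUE
check_is_boolean_sketch(NA, "verbose")    # FALSE
```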
Get the path to the example covariates file for regionGWAS mode
Description
Path to CASMAP_example_covariates_1.txt in inst/extdata.
The covariates categories for the data set
CASMAP_example_data_1.txt, the path to which is given by
getExampleDataFilename.
Usage
getExampleCovariatesFilename()
Format
A single column vector of 100 labels, each of which
is 0 or 1 (same format as labels file).
Details
Path to the file containing the covariates, for reading in to
CASMAP object using the readFiles function.
See Also
getExampleDataFilename,
getExampleLabelsFilename
Examples
covfile <- getExampleCovariatesFilename()
Get the path to the example data file for regionGWAS mode
Description
Path to CASMAP_example_data_1.txt in inst/extdata.
A dataset containing binary samples for the regionGWAS method.
There are accompanying labels and covariates datasets.
Usage
getExampleDataFilename()
Format
A matrix of 0s and 1s, with 1000 rows (features)
and 100 columns
(samples). In other words, each column is a sample, and each sample
has 1000 binary features.
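The layout described above (rows are features, columns are samples) can be reproduced on a small scale as follows. This is a sketch under one loud assumption: it writes whitespace-separated values without headers, and the delimiter actually expected by readFiles is not stated here. The sizes and variable names are arbitrary.

```r
set.seed(1)
n_features <- 5   # rows
n_samples  <- 4   # columns

# binary data matrix: one row per feature, one column per sample
x <- matrix(rbinom(n_features * n_samples, size = 1, prob = 0.5),
            nrow = n_features, ncol = n_samples)
# one binary label per sample, stored as a single column
y <- rbinom(n_samples, size = 1, prob = 0.5)

data_path   <- tempfile(fileext = ".txt")
labels_path <- tempfile(fileext = ".txt")
# Assumption: plain whitespace-separated text, no row or column names
write.table(x, data_path, row.names = FALSE, col.names = FALSE)
write.table(y, labels_path, row.names = FALSE, col.names = FALSE)

# read the matrix back to confirm the layout survived the round trip
x_back <- as.matrix(read.table(data_path))
dim(x_back)
```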
Details
Path to the file containing the data, for reading in to
CASMAP object using the readFiles function.
Note that the significant region is [99, 102].
See Also
getExampleLabelsFilename,
getExampleCovariatesFilename
Examples
datafile <- getExampleDataFilename()
Get the path to the example labels file for regionGWAS mode
Description
Path to CASMAP_example_labels_1.txt in inst/extdata.
A dataset containing the binary labels for the data in the file
CASMAP_example_data_1.txt, the path to which is given by
getExampleDataFilename.
Usage
getExampleLabelsFilename()
Format
A single column of 100 labels, each of which is either 0
or 1.
Details
Path to the file containing the labels, for reading in to
CASMAP object using the readFiles function.
See Also
getExampleDataFilename,
getExampleCovariatesFilename
Examples
labelsfile <- getExampleLabelsFilename()
Get the path to the example significant intervals file
Description
Path to CASMAP_example_covariates_1.txt in
inst/extdata.
Usage
getExampleSignificantRegionsFilename()
Examples
sigregfile <- getExampleSignificantRegionsFilename()
Gets the higherOrderEpistasis string
Description
A getter for the global higherOrderEpistasis value, a string
for the mode parameter.
Usage
getHigherOrderEpistasisString()
Gets the minModeLength
Description
A getter for the global minModeLength value, the minimum number of
characters required in the mode parameter (should be 3).
Usage
getMinModeLength()
Get the function name
Description
Uses match.call and as.character.
Usage
getParentFunctionName()
Gets the regionGWAS string
Description
A getter for the global regionGWAS value, a string
for the mode parameter.
Usage
getRegionGWASString()
Checks if substring is part of higherOrderEpistasis
Description
Using grep to search through vector of strings
Usage
isHigherOrderEpistasisString(x)
Arguments
x |
The string which will be compared to 'higherOrderEpistasis' |
Details
Uses grep to search for exact match.
Value
TRUE if the string is a substring of 'higherOrderEpistasis',
otherwise returns FALSE.
A method to check value is numeric and in open interval
Description
Checks if a value is numeric and strictly between two other values.
Usage
isInOpenInterval(x, lower = 0, upper = 1)
Arguments
x |
Value to be checked. Needs to be numeric. |
lower |
Lower bound. Default value is 0. |
upper |
Upper bound. Default value is 1. |
Value
If numeric, and strictly greater than lower and
strictly smaller than upper, then return TRUE.
Else return FALSE.
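A minimal base R sketch of this check is given below. It is not the package's code; the name is_in_open_interval_sketch is hypothetical, and the handling of NA and of vectors longer than one (both rejected here) is an assumption.

```r
# Hypothetical stand-in for the documented behaviour of isInOpenInterval
is_in_open_interval_sketch <- function(x, lower = 0, upper = 1) {
  is.numeric(x) && length(x) == 1 && !is.na(x) &&
    x > lower && x < upper
}

is_in_open_interval_sketch(0.05)   # TRUE
is_in_open_interval_sketch(1)      # FALSE, the bounds are excluded
is_in_open_interval_sketch("0.5")  # FALSE, not numeric
```

A check of this kind matches the requirement that alpha be strictly between 0 and 1.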
Checks if substring is part of regionGWAS
Description
Using grepl to compare strings, ignoring case.
Usage
isRegionGWASString(x)
Arguments
x |
The string which will be compared to 'regionGWAS' |
Details
Uses grepl to search for exact match. Case will be ignored.
Value
TRUE if the string is a substring of 'regionGWAS',
otherwise returns FALSE.
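Case-insensitive substring matching of this kind can be sketched with grepl as follows. The function name is_mode_substring_sketch is hypothetical, and lower-casing both strings before a fixed match is one possible approach, not necessarily the one the package uses.

```r
# Hypothetical helper: is x a substring of target, ignoring case?
is_mode_substring_sketch <- function(x, target = "regionGWAS") {
  is.character(x) && length(x) == 1 && !is.na(x) &&
    # fixed = TRUE treats x literally rather than as a regular expression
    grepl(tolower(x), tolower(target), fixed = TRUE)
}

is_mode_substring_sketch("region")     # TRUE
is_mode_substring_sketch("REGIONgwas") # TRUE, case is ignored
is_mode_substring_sketch("epistasis")  # FALSE
```

The same helper applies to the other mode by passing target = "higherOrderEpistasis".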
Internal function
Description
Internal function
Usage
lib_delete_search_chi(inst)
Internal function
Description
Internal function
Usage
lib_delete_search_e(inst)
Internal function
Description
Internal function
Usage
lib_delete_search_facs(inst)
Internal function
Description
Internal function
Usage
lib_delete_search_fastcmh(inst)
Internal function
Description
Internal function
Usage
lib_execute_int(inst, alpha, l_max)
Internal function
Description
Internal function
Usage
lib_execute_iset(inst, alpha, l_max)
Internal function
Description
Internal function
Usage
lib_filter_intervals_write_to_file(inst, output_file)
Internal function
Description
Internal function
Usage
lib_get_filtered_intervals(inst)
Internal function
Description
Internal function
Usage
lib_get_result_facs(inst)
Internal function
Description
Internal function
Usage
lib_get_result_fais(inst)
Internal function
Description
Internal function
Usage
lib_get_result_int(inst)
Internal function
Description
Internal function
Usage
lib_get_result_iset(inst)
Internal function
Description
Internal function
Usage
lib_get_significant_intervals(inst)
Internal function
Description
Internal function
Usage
lib_get_significant_itemsets(inst)
Internal function
Description
Internal function
Usage
lib_new_search_chi()
Internal function
Description
Internal function
Usage
lib_new_search_e()
Internal function
Description
Internal function
Usage
lib_new_search_facs()
Internal function
Description
Internal function
Usage
lib_new_search_fastcmh()
Internal function
Description
Internal function
Usage
lib_profiler_write_to_file(inst, output_file)
Internal function
Description
Internal function
Usage
lib_pvals_significant_ints_write_to_file(inst, output_file)
Internal function
Description
Internal function
Usage
lib_pvals_significant_isets_write_to_file(inst, output_file)
Internal function
Description
Internal function
Usage
lib_pvals_testable_ints_write_to_file(inst, output_file)
Internal function
Description
Internal function
Usage
lib_pvals_testable_isets_write_to_file(inst, output_file)
Internal function
Description
Internal function
Usage
lib_read_covariates_file_facs(inst, cov_filename)
Internal function
Description
Internal function
Usage
lib_read_covariates_file_fastcmh(inst, cov_filename)
Internal function
Description
Internal function
Usage
lib_read_eth_files(inst, x_filename, y_filename, encoding)
Internal function
Description
Internal function
Usage
lib_read_eth_files_with_cov_facs(inst, x_filename, y_filename, covfilename,
encoding)
Internal function
Description
Internal function
Usage
lib_read_eth_files_with_cov_fastcmh(inst, x_filename, y_filename, covfilename,
encoding)
Internal function
Description
Internal function
Usage
lib_read_plink_files(inst, base_filename, encoding)
Internal function
Description
Internal function
Usage
lib_read_plink_files_with_cov_facs(inst, base_filename, covfilename, encoding)
Internal function
Description
Internal function
Usage
lib_read_plink_files_with_cov_fastcmh(inst, base_filename, covfilename,
encoding)
Internal function
Description
Internal function
Usage
lib_summary_write_to_file_facs(inst, output_file)
Internal function
Description
Internal function
Usage
lib_summary_write_to_file_fais(inst, output_file)
Internal function
Description
Internal function
Usage
lib_summary_write_to_file_fastcmh(inst, output_file)
Internal function
Description
Internal function
Usage
lib_write_eth_files_int(inst, x_filename, y_filename)
Internal function
Description
Internal function
Usage
lib_write_eth_files_iset(inst, x_filename, y_filename)
Internal function
Description
Internal function
Usage
lib_write_eth_files_with_cov_facs(inst, x_filename, y_filename, covfilename)
Internal function
Description
Internal function
Usage
lib_write_eth_files_with_cov_fastcmh(inst, x_filename, y_filename, covfilename)
Internal class
Description
An internal class.
Internal class
Description
Internal class
Internal class
Description
An internal class
Internal class
Description
An internal class.
Internal class
Description
Internal class
Error message for mode
Description
Return the appropriate error message for incorrect mode input
Usage
modeErrorMessage()
Error message for mode, if too short
Description
Return the appropriate error message for a mode input that is too short
Usage
modeLengthErrorMessage()
Checks mode string is long enough
Description
Checks mode string is at least minimum length
Usage
modeNeedsMoreChars(mode)
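The length check can be sketched as below. The function name, the default of 3 (taken from the minModeLength description) and the return convention (TRUE when the string is too short) are assumptions, since the return value is not documented here.

```r
# Hypothetical stand-in: TRUE when the mode string is shorter than the
# minimum length, FALSE otherwise
mode_needs_more_chars_sketch <- function(mode, min_length = 3) {
  nchar(mode) < min_length
}

mode_needs_more_chars_sketch("re")      # TRUE
mode_needs_more_chars_sketch("region")  # FALSE
```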