# geoCancerDiagnosticDatasetsRetriever

GEO Cancer Diagnostic Datasets Retriever is a bioinformatics tool for cancer diagnostic dataset retrieval from the GEO website.

## Summary

Gene Expression Omnibus (GEO) Cancer Diagnostic Datasets Retriever is a bioinformatics tool for cancer diagnostic dataset retrieval from the GEO database. It requires a GEO DataSets input file, downloaded from the GEO database, that lists all GSE dataset entries for a specific cancer (for example, myelodysplastic syndrome). The tool applies keyword filters to each GSE dataset entry listed in that input file. The first diagnostic text filter looks for diagnostic keywords (for example, "diagnosis" or "health") that clinical science researchers use in the title and abstract of an entry. A dataset flagged by the first filter is then examined by a second diagnostic text filter for diagnostic keywords in the "Overall design" section of the GSE dataset. If the second filter finds a keyword, the tool outputs the GSE code of the likely diagnostic dataset. If it does not, a more intensive filtering stage follows: the tool runs an R script (healthyControlsPresentInputParams.r) that searches the .SOFT file of the dataset for the desired keywords and determines whether it is a likely diagnostic dataset.
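
The two text filters described above can be illustrated with a minimal Python sketch. It is not the tool's actual code: the dictionary field names (`title`, `summary`, `overall_design`), the function names, the accession labels, and the plain substring matching are all assumptions made for illustration. Only the two example keywords come from the description above.

```python
# Illustrative sketch only; not the implementation used by the tool.
# "diagnosis" and "health" are the example keywords from the summary;
# every field name and function name below is hypothetical.
DIAGNOSTIC_KEYWORDS = ("diagnosis", "health")


def has_keyword(text, keywords=DIAGNOSTIC_KEYWORDS):
    """Return True if any keyword occurs in text (case-insensitive substring match)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)


def classify_entry(entry):
    """Apply both text filters to one GSE entry given as a dict.

    Returns "not_flagged", "likely_diagnostic", or "needs_soft_check".
    """
    # First filter: title and abstract.
    title_and_abstract = entry.get("title", "") + " " + entry.get("summary", "")
    if not has_keyword(title_and_abstract):
        return "not_flagged"
    # Second filter: the "Overall design" section.
    if has_keyword(entry.get("overall_design", "")):
        return "likely_diagnostic"
    # Neither confirmed nor rejected: defer to the .SOFT file stage.
    return "needs_soft_check"


# Made-up entries; the accession labels are placeholders, not real GSE codes.
entries = [
    {"accession": "GSE_EXAMPLE_1",
     "title": "Expression profiling for diagnosis of a blood disorder",
     "summary": "Bone marrow samples were profiled.",
     "overall_design": "Patients compared with healthy controls"},
    {"accession": "GSE_EXAMPLE_2",
     "title": "Drug response in cell lines",
     "summary": "Treated versus untreated cells.",
     "overall_design": "Three replicates per condition"},
    {"accession": "GSE_EXAMPLE_3",
     "title": "Diagnosis-related markers",
     "summary": "Marker discovery study.",
     "overall_design": "Time course of treated samples"},
]

results = {entry["accession"]: classify_entry(entry) for entry in entries}
```

In this sketch, an entry labelled `needs_soft_check` corresponds to a dataset that would be passed on to the more intensive .SOFT file stage.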

## Installation

geoCancerDiagnosticDatasetsRetriever can be used on any Linux or macOS machine. To run the program, you need to have the following programs installed on your computer:

Help information can be read by typing the following command:

```diff
geoCancerDiagnosticDatasetsRetriever -h
```

This command will print the following instructions:

```diff
Usage: geoCancerDiagnosticDatasetsRetriever -h

Mandatory arguments:
  CANCER_TYPE           type of the cancer as query search term
  PLATFORM_CODES        list of GPL platform codes

Optional arguments:
  -h                    show help message and exit
```

## Copyright and License

Copyright 2021 by Abbas Alameer (Kuwait University) and Davide Chicco (University of Toronto)

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License, version 2 (GPLv2).

## Contact

geoCancerDiagnosticDatasetsRetriever was developed by:
Abbas Alameer (Kuwait University) and Davide Chicco (University of Toronto)
For information, please contact Abbas Alameer at abbas.alameer(AT)ku.edu.kw or Davide Chicco at davidechicco(AT)davidechicco.it