Dear Editor,

Although multiple clinical studies have correlated microRNA (miR) expression profiles with clinical outcome in cancer patients,1 most data sets in the public domain have complex interfaces that are difficult for non-specialists to use and that require advanced informatics and statistical skills. Here, we present MIRUMIR, a tool that performs survival analyses and draws Kaplan–Meier (KM) plots for a submitted miR across several available data sets, which together cover more than 800 patients. A robust statistical procedure is implemented to account for multiple testing. MIRUMIR is incorporated into BioProfiling.de, an analytical portal for high-throughput cell biology,2 and is freely available at http://www.bioprofiling.de/MIRUMIR.

Similar tools were developed recently to assess the relationship between the expression levels of various genes and clinical outcome.3 One of them, KM plots, provides meta-analyses by combining multiple gene expression data sets. MIRUMIR is the first tool specifically designed for miRs. Unlike KM plots, MIRUMIR screens publicly available miR expression data sets one by one, implementing robust statistical procedures to adjust P-values for multiple testing. Data sets were downloaded from the Gene Expression Omnibus (GEO) repository.4 To be selected, a data set must be clinical (derived from patients) and have at least 50 profiled samples annotated with survival information. Currently, MIRUMIR comprises nine data sets covering breast, prostate and ovarian cancers; we plan to expand this collection shortly.

MIRUMIR exploits rank information from expression data sets. For each available data set, samples are grouped with respect to the expression rank of the user-specified miR. The ‘low expression’ and ‘high expression’ groups are those in which the expression rank of the miR is below or above the average expression rank across the data set, respectively. This separation of patients into ‘low’ and ‘high’ groups, along with survival information, is then used to test for statistical differences in survival outcome. The R statistical package is used to perform survival analyses5 and to draw KM plots; see Supplementary Material for practical MIRUMIR analyses of several example miRs. Two sources of multiple testing should be accounted for. First, the miR hypothesis is tested in each available data set, so the number of tested hypotheses is at least the number of screened expression data sets. Second, for some data sets the miR can be mapped to several probes, which means that several hypotheses are tested for one data set. A false discovery rate6 control procedure is implemented to automatically adjust P-values for multiple testing.
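The procedure described above can be illustrated with a minimal sketch. It is written in Python for illustration only (MIRUMIR itself relies on R) and is not the MIRUMIR source code. All function names are hypothetical. Three points are our assumptions, not statements from the text: the comparison between groups is done with a two-group log-rank test, a sample whose rank equals the average rank is placed in the ‘high’ group, and the false discovery rate adjustment follows the Benjamini–Hochberg step-up procedure.

```python
from math import erfc, sqrt


def split_by_rank(expression):
    """Label samples 'low' or 'high' by comparing each expression rank
    with the average rank across the data set.
    Assumption: a rank equal to the average is labelled 'high'."""
    order = sorted(range(len(expression)), key=lambda i: expression[i])
    ranks = [0] * len(expression)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    mean_rank = sum(ranks) / len(ranks)
    return ["low" if r < mean_rank else "high" for r in ranks]


def logrank_p(times, events, groups):
    """Two-group log-rank test (assumed test; events: 1 = death, 0 = censored).
    Returns the P-value from a chi-square distribution with 1 degree of freedom."""
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [g for ti, g in zip(times, groups) if ti >= t]
        n = len(at_risk)
        n_low = at_risk.count("low")
        d = sum(1 for ti, e in zip(times, events) if e and ti == t)
        d_low = sum(
            1 for ti, e, g in zip(times, events, groups)
            if e and ti == t and g == "low"
        )
        # Observed minus expected deaths in the 'low' group at time t.
        obs_minus_exp += d_low - d * n_low / n
        if n > 1:
            variance += d * (n_low / n) * (1 - n_low / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of chi-square with 1 degree of freedom.
    return erfc(sqrt(chi2 / 2))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P-values (assumed FDR procedure),
    returned in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # rank of this P-value in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def screen(data_sets):
    """One hypothesis per (data set, probe) pair; adjust all P-values together.
    data_sets: iterable of (expression, times, events) tuples (toy layout)."""
    raw = [
        logrank_p(times, events, split_by_rank(expression))
        for expression, times, events in data_sets
    ]
    return benjamini_hochberg(raw)
```

In this sketch each probe of each data set contributes one entry to the list passed to `screen`, so both sources of multiple testing are handled by a single adjustment.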

In summary, MIRUMIR addresses the need of biomedical researchers to estimate the power of a miR to serve as a potential biomarker for predicting the survival of cancer patients. MIRUMIR provides such analyses based on several publicly available clinical miR data sets annotated with patient survival information. To our knowledge, this is the first simple tool of its kind and, as shown by several examples (see Supplementary Material), MIRUMIR is well suited to fast validation of the clinical relevance of a selected miR.