PubMed Text Data Mining Automation for Biological Validation on Lists of Genes and Pathways
Abstract — Prognostic cancer markers help oncologists identify abnormal cancer cells in a collected sample. Such a marker can serve as an indicator of disease outcome and can inform cancer treatment and drug discovery. Identifying cancer markers can also improve cancer patients’ survival rates by supporting treatment decision-making. Candidate markers can be assessed either by testing every gene or pathway manually in the wet lab or by automated text mining. Text mining techniques can uncover hidden information and gather new knowledge from many existing sources. However, querying for relevant text and extracting the important information from it is a challenging task. PubMed text data mining is one application that helps explore potential cancer markers, as the number of scientific articles in PubMed is steadily increasing. It also allows biologists to concentrate on a small set of identified genes or pathways. PubMed identifiers (PMIDs) are obtained as evidence of the relationship between diseases and genes (or pathways), and this evidence serves as biological validation. In this way, the technique can reveal biological relationships between diseases and genes or pathways. Therefore, PubMed text data mining automation was developed to link to the relevant websites, saving time compared with searching manually.
Keywords — PubMed, text data mining, biological validation, cancer markers, diseases, genes, pathways.
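As an illustration of the idea summarized in the abstract, the following is a minimal sketch of how PMIDs linking a disease to a gene or pathway might be retrieved automatically. It is not the implementation described in this paper. It assumes the public NCBI E-utilities ESearch endpoint and its JSON response layout (an "esearchresult" object containing an "idlist"); the function names, the quoted "disease AND term" query form, and the default result limit are hypothetical choices made only for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities ESearch endpoint (assumed here; the paper does not name an API).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query_url(disease, term, retmax=20):
    """Build an ESearch URL for articles mentioning both the disease and the gene/pathway."""
    # Hypothetical query form: both phrases must appear in the record.
    query = f'"{disease}" AND "{term}"'
    params = {"db": "pubmed", "term": query, "retmode": "json", "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


def extract_pmids(response_text):
    """Return the list of PMIDs (as strings) from an ESearch JSON response."""
    result = json.loads(response_text).get("esearchresult", {})
    return result.get("idlist", [])


def fetch_pmids(disease, term, retmax=20):
    """Query PubMed over the network and return the matching PMIDs."""
    with urlopen(build_query_url(disease, term, retmax), timeout=30) as resp:
        return extract_pmids(resp.read().decode("utf-8"))


def validate_list(disease, terms):
    """Map each gene or pathway name to the PMIDs found as supporting evidence."""
    return {term: fetch_pmids(disease, term) for term in terms}
```

Calling validate_list("breast cancer", ["BRCA1", "TP53"]) would return a dictionary from each gene name to a list of PMIDs; an empty list would indicate that this simple query found no supporting article. A real pipeline would also need to respect the service's request rate limits and handle network errors, which this sketch omits.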