TY - GEN
T1 - The impact of directionality in predications on text mining
AU - Leroy, Gondy
AU - Fiszman, Marcelo
AU - Rindflesch, Thomas C.
PY - 2008
Y1 - 2008
AB - The number of publications in biomedicine is increasing enormously each year. To help researchers digest the information in these documents, text mining tools are being developed that present co-occurrence relations between concepts. Statistical measures are used to mine interesting subsets of relations. We demonstrate how directionality of these relations affects interestingness. Support and confidence, simple data mining statistics, are used as proxies for interestingness metrics. We first built a test bed of 126,404 directional relations extracted from biomedical abstracts, which we represent as graphs containing a central starting concept and 2 rings of associated relations. We manipulated directionality in four ways and randomly selected 100 starting concepts as a test sample for each graph type. Finally, we calculated the number of relations and their support and confidence. Variation in directionality significantly affected the number of relations as well as the support and confidence of the four graph types.
UR - http://www.scopus.com/inward/record.url?scp=51549108313&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=51549108313&partnerID=8YFLogxK
U2 - 10.1109/HICSS.2008.443
DO - 10.1109/HICSS.2008.443
M3 - Conference contribution
AN - SCOPUS:51549108313
SN - 0769530753
SN - 9780769530758
T3 - Proceedings of the Annual Hawaii International Conference on System Sciences
BT - Proceedings of the 41st Annual Hawaii International Conference on System Sciences 2008, HICSS
T2 - 41st Annual Hawaii International Conference on System Sciences 2008, HICSS
Y2 - 7 January 2008 through 10 January 2008
ER -