TY - JOUR
T1 - Internet Categorization and Search
T2 - A Self-Organizing Approach
AU - Chen, Hsinchun
AU - Schuffels, Chris
AU - Orwig, Richard
N1 - Funding Information:
This project is supported by a Research Initiation Award grant awarded by the Division of Information, Robotics, and Intelligent Systems, NSF (“Building a Concept Space for an Electronic Community System,” PI: H. Chen, 1992–1994, IRI9211418), a National Collaboratory grant awarded by NSF (“Systems Technology for Building a National Collaboratory”, PI: B. Schatz, 1990–1994), a Digital Library Initiative grant awarded by NSF/ARPA/NASA (“Building the Interspace: Digital Library Infrastructure for a University Engineering Community,” PIs: B. Schatz, H. Chen, et al., 1994–1998, IRI9411318), and an NSF/CISE grant (“Concept-based Categorization and Search on Internet: A Machine Learning, Parallel Computing Approach,” PI: H. Chen, 1995–1995, IRI9525790).
PY - 1996/3
Y1 - 1996/3
AB - The problems of information overload and vocabulary differences have become more pressing with the emergence of increasingly popular Internet services. The main information retrieval mechanisms provided by the prevailing Internet WWW software are based on either keyword search (e.g., the Lycos server at CMU, the Yahoo server at Stanford) or hypertext browsing (e.g., Mosaic and Netscape). This research aims to provide an alternative concept-based categorization and search capability for WWW servers based on selected machine learning algorithms. Our proposed approach, which is grounded on automatic textual analysis of Internet documents (homepages), attempts to address the Internet search problem by first categorizing the content of Internet documents. We report results of our recent testing of a multilayered neural network clustering algorithm employing the Kohonen self-organizing feature map to categorize (classify) Internet homepages according to their content. The category hierarchies created could serve to partition the vast Internet services into subject-specific categories and databases and improve Internet keyword searching and/or browsing.
UR - http://www.scopus.com/inward/record.url?scp=0030104572&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0030104572&partnerID=8YFLogxK
U2 - 10.1006/jvci.1996.0008
DO - 10.1006/jvci.1996.0008
M3 - Article
AN - SCOPUS:0030104572
SN - 1047-3203
VL - 7
SP - 88
EP - 102
JO - Journal of Visual Communication and Image Representation
JF - Journal of Visual Communication and Image Representation
IS - 1
ER -