Embedding User Behavioral Aspect in TF-IDF Like Representation

Ligaj Pradhan, Chengcui Zhang, Steven Bethard, Xin Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Term Frequency - Inverse Document Frequency (TF-IDF) computes a weight for each word in a document. The weight increases in proportion to the number of times the word appears in that document and is counterbalanced by how often the word occurs across the collection of documents. TF-IDF is the state of the art for computing relevancy scores between documents. However, it is based on statistical learning alone and does not directly capture the conceptual content of the text or the behavioral aspects of the writer. Hence, in this work we show how relatively low-dimensional user behavioral vectors, extracted from the same text as the TF-IDF vectors, can be used to improve the performance of TF-IDF. We extract User-Concerns embedded in user reviews and append them to TF-IDF vectors to train a deep rating prediction model. Our experiments show that adding such conceptual knowledge to TF-IDF vectors can significantly enhance their performance while adding very little complexity.
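The idea in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the TF-IDF variant (length-normalized term frequency times log(N / df)), the toy reviews, and the keyword-based `concern_vector` helper are all assumptions made for the example. The keywords list topic modeling, so the actual User-Concern extraction is presumably more involved than a keyword lookup, and the rating prediction model itself is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of tokenized documents.

    Assumed variant: tf = count / document length, idf = log(N / df).
    Other TF-IDF weightings exist; the paper may use a different one.
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([
            (counts[tok] / len(doc)) * math.log(n_docs / df[tok])
            for tok in vocab
        ])
    return vocab, vectors


def concern_vector(doc, concerns):
    """Hypothetical stand-in for User-Concern extraction.

    Returns, per concern, the fraction of tokens in the review that
    belong to that concern's keyword set.
    """
    return [sum(tok in kws for tok in doc) / len(doc) for kws in concerns.values()]


# Toy data, invented for illustration only.
reviews = [
    "the food was great but the service was slow".split(),
    "cheap price and great food".split(),
    "slow service and high price".split(),
]
concerns = {
    "food": {"food"},
    "service": {"service", "slow"},
    "price": {"price", "cheap", "high"},
}

vocab, tfidf = tfidf_vectors(reviews)

# Append the low-dimensional concern vector to each TF-IDF vector.
enriched = [t + concern_vector(doc, concerns) for t, doc in zip(tfidf, reviews)]
```

Each enriched vector has `len(vocab) + len(concerns)` dimensions, so the behavioral features add only a handful of columns to the input of whatever rating predictor is trained on top.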

Original language: English (US)
Title of host publication: Proceedings - IEEE 1st Conference on Multimedia Information Processing and Retrieval, MIPR 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 262-267
Number of pages: 6
ISBN (Electronic): 9781538618578
State: Published - Jun 26 2018
Externally published: Yes
Event: 1st IEEE Conference on Multimedia Information Processing and Retrieval, MIPR 2018 - Miami, United States
Duration: Apr 10 2018 to Apr 12 2018

Publication series

Name: Proceedings - IEEE 1st Conference on Multimedia Information Processing and Retrieval, MIPR 2018

Conference

Conference: 1st IEEE Conference on Multimedia Information Processing and Retrieval, MIPR 2018
Country/Territory: United States
City: Miami
Period: 4/10/18 to 4/12/18

Keywords

  • TF-IDF
  • rating prediction
  • topic modeling
  • user behavior
  • user concerns

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Signal Processing
  • Media Technology
