Integrating Text Matching and Product Substitutability

Data collection

We use the Amazon product data published by McAuley et al. at SIGIR 2015. You can obtain the data by following the provided instructions.

We use the review and metadata files for the Pet_Supplies, Sports_and_Outdoors, Toys_and_Games and Electronics categories (the full versions, not the 5-core subsets). For more information, see the CIKM 2016 paper where the evaluation sets were first introduced.

| | Pet Supplies | Sports & Outdoors | Toys & Games | Electronics |
| --- | --- | --- | --- | --- |
| Product lists | product_list | product_list | product_list | product_list |
| Product substitute relations | substitutes | substitutes | substitutes | substitutes |
| Topics | topics | topics | topics | topics |
| Relevance | qrel_test, qrel_validation | qrel_test, qrel_validation | qrel_test, qrel_validation | qrel_test, qrel_validation |

Usage

To replicate the experiments of the CIKM 2018 paper on integrating text matching and product substitutability, first build an Indri index in which every product is represented by the union of its description and its reviews, as described in the experimental setup section of that paper. Then follow the tutorial for the 2018 TOIS paper on neural vector spaces, which can be found here.
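
As a rough illustration of the "union of description and reviews" representation, the sketch below merges metadata and review records per product and wraps each result in TREC text markup, one of the formats Indri can index. It is a minimal sketch and not the authors' preprocessing: the field names `asin`, `description` and `reviewText` are assumed to match the Amazon data files, the function names are hypothetical, and the exact document construction used in the paper may differ (for example, in how products without a description are handled).

```python
from collections import defaultdict


def build_product_documents(metadata, reviews):
    """Map each product ID to its description followed by its review texts.

    Both arguments are iterables of dicts. The keys "asin", "description"
    and "reviewText" are assumptions about the Amazon data layout.
    """
    parts = defaultdict(list)
    for item in metadata:
        description = item.get("description")
        if description:
            parts[item["asin"]].append(description)
    for review in reviews:
        text = review.get("reviewText")
        if text:
            parts[review["asin"]].append(text)
    # Join the collected pieces into one text per product.
    return {asin: "\n".join(texts) for asin, texts in parts.items()}


def to_trectext(doc_id, text):
    """Wrap one document in TREC text markup (hypothetical helper)."""
    return f"<DOC>\n<DOCNO>{doc_id}</DOCNO>\n<TEXT>\n{text}\n</TEXT>\n</DOC>\n"


if __name__ == "__main__":
    # Toy records for illustration only; not taken from the dataset.
    metadata = [{"asin": "A1", "description": "Dog leash"}]
    reviews = [
        {"asin": "A1", "reviewText": "Sturdy."},
        {"asin": "A2", "reviewText": "Fun toy."},
    ]
    for asin, text in build_product_documents(metadata, reviews).items():
        print(to_trectext(asin, text), end="")
```

Writing the generated documents to a file and pointing the Indri index builder at it is left out here, since those settings depend on the local setup.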

The substitution relations (or, in fact, any document/document similarity relations) can be passed to cuNVSMTrainModel as a second positional argument that follows the path to the Indri index. The --entity_similarity_weight flag controls the mixture weight between text and substitutability signals.
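
The argument order described above can be sketched as a small helper that assembles the command line. Only the binary name `cuNVSMTrainModel`, the positional order (index path, then similarity relations) and the `--entity_similarity_weight` flag come from the text; the paths, the weight value, the `--flag=value` syntax and the helper name are assumptions for illustration, and any other options the tutorial requires are omitted.

```python
def build_train_command(index_path, similarity_path, similarity_weight):
    """Assemble an argument list for cuNVSMTrainModel (hypothetical helper).

    The similarity relations file is the second positional argument and
    follows the Indri index path. The --flag=value form is an assumption.
    """
    return [
        "cuNVSMTrainModel",
        index_path,
        similarity_path,
        f"--entity_similarity_weight={similarity_weight}",
    ]


if __name__ == "__main__":
    # Hypothetical paths and weight; replace them with your own.
    command = build_train_command("/path/to/index", "/path/to/substitutes", 0.5)
    print(" ".join(command))
    # To launch training, pass the list to subprocess.run(command, check=True).
```

Building the command as a list rather than a single string avoids shell quoting issues when a path contains spaces.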

Citation

If you use cuNVSM to produce results for your scientific publication, please refer to our TOIS and CIKM 2018 papers:

@article{VanGysel2018nvsm,
  title={Neural Vector Spaces for Unsupervised Information Retrieval},
  author={Van Gysel, Christophe and de Rijke, Maarten and Kanoulas, Evangelos},
  publisher={ACM},
  journal={TOIS},
  year={2018},
}

@inproceedings{VanGysel2018substitutability,
  title={Mix ’n Match: Integrating Text Matching and Product Substitutability within Product Search},
  author={Van Gysel, Christophe and de Rijke, Maarten and Kanoulas, Evangelos},
  booktitle={CIKM},
  volume={2018},
  year={2018},
  organization={ACM}
}