SMN Digest, vol. 7 no. 2 (2021)

Travel Package Recommendation System Using Topic Modeling Approach

Marianne P. Vitug

Discipline: Artificial Intelligence

 

Abstract:

The goal of this study is to build a recommender system that incorporates two important factors in designing a travel package: the inherent attributes of tourist destinations and their distance from other tourist destinations. The primary data were Tripadvisor reviews collected with the R package rvest. Thirty-seven current tourist destinations were included in the data gathering, with 50 reviews collected for each location. Distance matrices based on geodesic distance and on driving distance from the Google API were built for this project and used as a penalizing factor in ranking the final set of recommendations. The geodesic distances between the tourist destinations were computed with the R libraries ggmap and lmap. For the driving distances, registration with Google Cloud Platform’s Distance Matrix API was required to obtain the key used in the R code through the gmapdistance library. The Distance Matrix API returns the travel distance along the recommended route for a supplied matrix of origins (start points) and destinations (end points). Because the data had characteristics that made them unsuitable for the algorithms commonly used in recommender systems, topic modeling was used as an alternative method of extracting the intrinsic features of both the tourists and the locations. Latent Dirichlet Allocation (LDA) and Bidirectional Encoder Representations from Transformers (BERT) were used for topic modeling. The results were evaluated through a combination of visual inspection of the top N words per topic and intrinsic evaluation metrics of topic interpretability. In addition, the final list of recommendations was sent to Ark Travel’s President and the head of local tour operations for evaluation, and it met their requirements. The model produced recommendations that were deemed acceptable based on these criteria.
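As a rough illustration of the distance-matrix step described above, the sketch below builds a pairwise matrix of straight-line distances between destinations. It is not the study's R code: it uses the haversine (great-circle) formula on a spherical Earth as a simple stand-in for a true geodesic calculation, and the destination names and coordinates are hypothetical placeholders rather than the study's data.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; spherical approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def distance_matrix(coords):
    """Return a nested dict: matrix[origin][destination] -> distance in km."""
    names = list(coords)
    return {o: {d: haversine_km(coords[o], coords[d]) for d in names}
            for o in names}


# Hypothetical destinations with illustrative coordinates (assumption).
coords = {
    "site_a": (14.5995, 120.9842),
    "site_b": (10.3157, 123.8854),
    "site_c": (16.4023, 120.5960),
}
matrix = distance_matrix(coords)

for origin, row in matrix.items():
    print(origin, {dest: round(km, 1) for dest, km in row.items()})
```

A driving-distance matrix would have the same shape, with each entry taken from a routing service instead of the formula, so either matrix could feed the same ranking step.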
The recommender system can help not only the main stakeholders (the travelers and the travel agency) but also business owners in less popular or newly emerging tourist destinations, since these destinations can also be recommended as long as they are part of the dataset.
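The abstract says only that distance acts as a penalizing factor on the recommendations, so the following is a minimal sketch of one way such a ranking could work, not the study's actual scoring rule. It assumes that each traveler and destination is represented by a vector of topic weights, that the match is measured with cosine similarity, and that the penalty is linear in distance. The function names, the `penalty_per_km` parameter, and all of the data are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length topic-weight vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_destinations(traveler_topics, destination_topics, distances_km,
                      anchor, penalty_per_km=0.001):
    """Rank destinations by topic similarity minus a linear distance penalty.

    `anchor` is the destination already in the package; distances are
    measured from it. The linear penalty is an assumption for illustration.
    """
    scored = []
    for name, topics in destination_topics.items():
        if name == anchor:
            continue  # do not recommend the starting destination itself
        similarity = cosine_similarity(traveler_topics, topics)
        score = similarity - penalty_per_km * distances_km[anchor][name]
        scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical topic distributions over three topics (assumption).
traveler = [0.7, 0.2, 0.1]
destinations = {
    "beach_resort": [0.8, 0.1, 0.1],
    "heritage_town": [0.1, 0.8, 0.1],
    "hidden_cove": [0.7, 0.2, 0.1],
}
# Hypothetical distances in km from the anchor destination.
distances = {"beach_resort": {"heritage_town": 40.0, "hidden_cove": 300.0}}

mild = rank_destinations(traveler, destinations, distances, "beach_resort")
strict = rank_destinations(traveler, destinations, distances, "beach_resort",
                           penalty_per_km=0.003)
print([name for name, _ in mild])
print([name for name, _ in strict])
```

In this sketch the score depends only on topic fit and distance, not on popularity, which is consistent with the point that a lesser-known destination can be recommended once it is in the dataset. Raising `penalty_per_km` shifts the ranking toward nearby destinations.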


