Discovering in-depth tourist behaviour and demand using social media data in Bonaire island
Tourism has brought economic prosperity to Bonaire, but it has also had a detrimental effect on the island's natural ecosystem. Studying tourist behaviour is a useful precautionary step that can help stakeholders manage tourism on Bonaire better. This internship research applied machine learning to social media data to study tourist behaviour and to estimate future tourist demand.
To study tourist behaviour, this internship research cleaned 13,706 geotagged Flickr records from 2003 to 2019 and converted them into keywords. The cleaned keywords were then weighted using TF-IDF (Term Frequency-Inverse Document Frequency) and clustered by keyword similarity with DBSCAN (Density-Based Spatial Clustering of Applications with Noise). The most and least relevant keywords in each cluster, with relevance measured relative to the keywords in all clusters, then determined the tourist activities or interests that the cluster represents. This research found nine clusters that are meaningful and useful for interpreting tourist behaviour on Bonaire.
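The weighting and clustering steps can be illustrated with a minimal pure-Python sketch. It is not the implementation used in this research: the toy keyword lists, the smoothed IDF formula, the cosine distance, and the `eps` and `min_samples` values are all assumptions chosen for illustration, and the helper names `tfidf`, `dbscan` and `top_terms` are hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one L2-normalised {keyword: weight} dict per keyword list."""
    n = len(docs)
    # Document frequency: in how many records each keyword appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {t: (c / len(doc)) * (math.log(n / df[t]) + 1.0) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors

def cosine_distance(a, b):
    # Vectors are unit length, so the dot product is the cosine similarity.
    return 1.0 - sum(w * b.get(t, 0.0) for t, w in a.items())

def dbscan(vectors, eps, min_samples):
    """Label each vector with a cluster id; -1 marks noise."""
    n = len(vectors)
    neighbours = [
        [j for j in range(n) if cosine_distance(vectors[i], vectors[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisional noise, may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # j is a core point, keep expanding
    return labels

def top_terms(vectors, labels, cluster, k=3):
    """Keywords with the highest summed TF-IDF weight inside one cluster."""
    totals = Counter()
    for vec, label in zip(vectors, labels):
        if label == cluster:
            totals.update(vec)
    return [term for term, _ in totals.most_common(k)]

# Toy keyword lists standing in for cleaned Flickr records (invented data).
docs = [
    ["dive", "reef", "coral", "fish"],
    ["dive", "reef", "turtle"],
    ["reef", "coral", "dive"],
    ["flamingo", "bird", "saltpan"],
    ["flamingo", "bird", "lagoon"],
    ["bird", "flamingo", "saltpan"],
    ["sunset"],
]
vectors = tfidf(docs)
labels = dbscan(vectors, eps=0.7, min_samples=2)
```

On this toy input the diving records and the bird-watching records form two separate clusters, and the single `sunset` record is labelled as noise. Calling `top_terms(vectors, labels, 0)` then gives the keywords that would be used to interpret the first cluster.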
For tourism demand, this internship research forecast time series of tourist arrivals using both Flickr data and CBS (Centraal Bureau voor de Statistiek) data. Although the absolute numbers were unrealistic, the Flickr data could show which continent tourists came from in each season (Winter, Spring, Summer, Autumn) from 2015 to the end of 2021. The CBS data, by contrast, could not show which continent tourists came from, but could show in which seasons they arrived from 2012 to the end of 2021, with realistic numbers based on official government data.
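The seasonal breakdown described above can be sketched as a simple aggregation. This is only an illustration, not the forecasting method used in this research: the records are invented, the month-to-season mapping (meteorological, northern-hemisphere seasons) is an assumption, and the function names `season_of` and `arrivals_by_season` are hypothetical.

```python
from collections import Counter
from datetime import date

# Assumed meteorological seasons; the report does not state its definition.
SEASONS = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Autumn", 10: "Autumn", 11: "Autumn",
}

def season_of(day):
    return SEASONS[day.month]

def arrivals_by_season(records):
    """Count records per (year, season, continent)."""
    counts = Counter()
    for day, continent in records:
        # December is counted towards the winter of the following year.
        year = day.year + 1 if day.month == 12 else day.year
        counts[(year, season_of(day), continent)] += 1
    return counts

# Invented (date, continent of origin) pairs standing in for Flickr records.
records = [
    (date(2015, 1, 10), "Europe"),
    (date(2015, 2, 3), "Europe"),
    (date(2015, 7, 21), "North America"),
    (date(2015, 12, 28), "Europe"),
]
counts = arrivals_by_season(records)
```

A series built this way per continent could then be passed to any time-series forecasting model. For a source without origin information, such as the CBS data, the continent key would simply be dropped.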
This research could give the stakeholders of Bonaire insight into managing tourism by studying tourist behaviour with free social media data and automated machine learning methods.