Destination classification

Excursion destinations and sights are assigned to classes on the basis of their descriptive texts so that they can be suggested to tourists as suitable travel destinations.
Development of a first classification algorithm within 2 weeks
Classification accuracy of up to 81 %
Identification of the top 20 influencing factors (words) per class

Challenge

  • An automobile club faces the challenge of automatically classifying destinations for tourists based on descriptions.
  • In the first step, 12,000 different destinations are to be classified into 12 different classes (e.g. opening hours).

Solution

  • Focus on 800 destinations manually labelled by the department.
  • Preparation of the description texts to bring them into a machine-readable format for a machine learning model (tokenisation, lemmatisation, punctuation removal, word dictionary, ...) using suitable Python packages from the field of natural language processing.
  • Use of an algorithm to classify the destinations, followed by an evaluation of the results.
  • Comparison of the machine learning model's results with a simple heuristic ("strategic guessing").
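
The steps above can be illustrated with a minimal sketch. It is not the project's actual script: the case study names neither the algorithm nor the packages used, so a small multinomial Naive Bayes classifier written with the Python standard library stands in here, and "strategic guessing" is assumed to mean always predicting the most frequent class. The toy data, class names, stop-word list and function names are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy data; the project itself used about 800 manually labelled texts.
TRAIN = [
    ("Medieval castle with guided tours and a museum", "culture"),
    ("Old town museum showing regional history", "culture"),
    ("Historic castle ruins above the river", "culture"),
    ("Hiking trail through the forest to a waterfall", "nature"),
    ("Lake with swimming area and forest walks", "nature"),
]
TEST = [
    ("Castle museum with history exhibition", "culture"),
    ("Forest trail around the lake", "nature"),
]

# Assumed stop-word list, for illustration only.
STOPWORDS = {"a", "and", "the", "with", "to", "through", "of"}


def tokenize(text):
    """Lowercase, strip punctuation and drop stop words (no lemmatisation here)."""
    return [t for t in re.findall(r"[a-zäöüß]+", text.lower()) if t not in STOPWORDS]


def train_naive_bayes(samples):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    for text, label in samples:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-probability (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        denom = sum(word_counts[label].values()) + len(vocab)
        score = math.log(n / total)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(predictions, samples):
    """Share of predictions that match the true labels."""
    hits = sum(p == label for p, (_, label) in zip(predictions, samples))
    return hits / len(samples)


model = train_naive_bayes(TRAIN)
model_acc = accuracy([predict(model, t) for t, _ in TEST], TEST)

# Baseline: always guess the most frequent training class.
majority = model[0].most_common(1)[0][0]
baseline_acc = accuracy([majority] * len(TEST), TEST)

print(f"model: {model_acc:.2f}, baseline: {baseline_acc:.2f}")
```

Comparing against a baseline like this shows whether the model learns anything beyond the class distribution: with 12 unevenly sized classes, always guessing the largest class can already look respectable on accuracy alone.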

Result

  • A first algorithm for classifying destinations into 12 different classes (Python script) was developed.
  • Descriptive analyses that make the destinations more transparent for tourists, plus further analyses for the client's department (Jupyter Notebook).
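
One way to arrive at the top influencing words per class listed among the key results is to rank words by how much more often they occur in one class than in all the others. The case study does not say how these factors were identified, so the smoothed log-odds score below is an assumption, and the sample data and the function name `top_words_per_class` are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical labelled descriptions, for illustration only.
SAMPLES = [
    ("castle museum castle tours", "culture"),
    ("museum history castle", "culture"),
    ("forest lake forest trail", "nature"),
    ("lake forest museum", "nature"),
]


def top_words_per_class(samples, n=20):
    """Rank words by smoothed log-odds of occurring in a class vs. all other classes."""
    per_class = {}
    for text, label in samples:
        per_class.setdefault(label, Counter()).update(re.findall(r"\w+", text.lower()))
    overall = sum(per_class.values(), Counter())
    vocab_size = len(overall)
    result = {}
    for label, counts in per_class.items():
        in_total = sum(counts.values())
        out_total = sum(overall.values()) - in_total
        scores = {}
        for word in overall:
            # Add-one smoothing keeps the ratio finite for words absent on one side.
            p_in = (counts[word] + 1) / (in_total + vocab_size)
            p_out = (overall[word] - counts[word] + 1) / (out_total + vocab_size)
            scores[word] = math.log(p_in / p_out)
        # Highest score first; ties are broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[label] = ranked[:n]
    return result


top = top_words_per_class(SAMPLES, n=2)
print(top)
```

With `n=20` the same function would return twenty words per class. A list like this can also be shown to the department as a plausibility check on what the classifier has picked up.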
