Classification of tourism data

On the basis of descriptive texts, excursion destinations and sights are to be divided into different classes in order to suggest them to tourists as suitable travel destinations.
Development of a first classification algorithm within 2 weeks
Classification accuracy of up to 81 %
Identification of the top 20 influencing factors (words) per class

Challenge

  • An automobile club faces the challenge of automatically classifying destinations for tourists based on descriptions.
  • In the first step, 12,000 different destinations are to be classified into 12 different classes (e.g. opening hours).

Solution

  • Focus on 800 destinations manually labelled by the department.
  • Preparation of the description texts to bring them into a format that a machine learning model can process (tokenisation, lemmatisation, punctuation handling, word dictionary, ...) using suitable Python packages from the field of natural language processing.
  • Use of an algorithm to classify the destinations and evaluation of the results.
  • Comparison of the results from the machine learning model with a simple heuristic ("strategic guessing").
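
The steps listed above can be illustrated with a minimal sketch in plain Python. It is not the project's actual implementation: the source does not name the algorithm or the packages used, so the Naive Bayes classifier, the function names (`tokenise`, `train_naive_bayes`, `predict`, `accuracy`) and the toy data are all assumptions made for illustration. Lemmatisation is left out, and "strategic guessing" is interpreted here as always predicting the most frequent class.

```python
import math
import re
from collections import Counter, defaultdict

def tokenise(text):
    """Lower-case the text and keep only ASCII letter runs (drops punctuation and digits)."""
    return re.findall(r"[a-z]+", text.lower())

def train_naive_bayes(texts, labels):
    """Collect per-class word counts, class frequencies and the word dictionary."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for text, label in zip(texts, labels):
        word_counts[label].update(tokenise(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab

def predict(model, text):
    """Return the class with the highest Laplace-smoothed log-probability."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for token in tokenise(text):
            if token in vocab:  # words unseen during training are ignored
                score += math.log(
                    (word_counts[label][token] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(predicted, actual):
    """Share of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Toy data, invented for illustration only (not from the project).
train_texts = [
    "Medieval castle with guided tours of the old tower.",
    "Ruined castle and fortress above the river.",
    "Hiking trail through the forest to a waterfall.",
    "Nature trail around the lake, forest walk.",
    "Forest hiking path with mountain views.",
]
train_labels = ["castle", "castle", "nature", "nature", "nature"]
test_texts = ["Old fortress tower with castle museum.", "Waterfall trail in the forest."]
test_labels = ["castle", "nature"]

model = train_naive_bayes(train_texts, train_labels)
model_preds = [predict(model, text) for text in test_texts]

# "Strategic guessing" baseline: always predict the most frequent training class.
majority_class = Counter(train_labels).most_common(1)[0][0]
baseline_preds = [majority_class] * len(test_texts)

model_acc = accuracy(model_preds, test_labels)
baseline_acc = accuracy(baseline_preds, test_labels)
print(f"model accuracy: {model_acc:.2f}, baseline accuracy: {baseline_acc:.2f}")
```

Comparing against the majority-class baseline shows whether the model has learned anything beyond the class distribution. An accuracy figure only becomes meaningful once it clearly exceeds what guessing would achieve.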

Result

  • An algorithm for classifying destinations into 12 different classes (Python script) was developed.
  • Descriptive analyses to make the destinations transparent for tourists, and analyses for the client's department (Jupyter Notebook).
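
One possible way to obtain the most influential words per class is sketched below. The source does not say how these factors were identified, so the scoring used here (a smoothed log-ratio of a word's frequency inside a class versus in all other classes), the function name `top_words_per_class` and the toy data are assumptions, not the project's method.

```python
import math
import re
from collections import Counter, defaultdict

def top_words_per_class(texts, labels, k=20):
    """Rank words per class by a smoothed log-ratio of in-class vs. out-of-class frequency."""
    counts = defaultdict(Counter)
    for text, label in zip(texts, labels):
        counts[label].update(re.findall(r"[a-z]+", text.lower()))

    # Overall word counts across all classes.
    total = Counter()
    for class_counter in counts.values():
        total.update(class_counter)
    vocab_size = len(total)
    n_total = sum(total.values())

    result = {}
    for label, in_class in counts.items():
        n_in = sum(in_class.values())
        n_out = n_total - n_in
        scores = {}
        for word, overall in total.items():
            # Add-one smoothing avoids division by zero for class-exclusive words.
            p_in = (in_class[word] + 1) / (n_in + vocab_size)
            p_out = (overall - in_class[word] + 1) / (n_out + vocab_size)
            scores[word] = math.log(p_in / p_out)
        # Highest score first; ties are broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[label] = ranked[:k]
    return result

# Toy data, invented for illustration only.
texts = [
    "Castle tower and castle walls.",
    "Old castle ruin.",
    "Forest trail and lake.",
    "Forest walk by the lake.",
]
labels = ["castle", "castle", "nature", "nature"]

top = top_words_per_class(texts, labels, k=2)
for label, words in top.items():
    print(label, words)
```

With `k=20` the same function would return the 20 highest-scoring words per class. Such a list helps the department check whether the model relies on plausible vocabulary.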


Challenge

An automotive company would like to visualise various market-specific data in order to create a competitive analysis for the US market.

Solution

An interactive and flexible application is implemented, including various maps with two different views.

Result

Relevant markets are identified, analysed and visualised. The dealer or the respective sales department can compare the direct competition with their own product and visualise the relevant data.