36 commits
b6eebc4
Add files via upload
ChVlachakis Dec 8, 2025
39c7419
christos branch Monday commit
Dec 8, 2025
3cfcf2a
safina branch monday
safinajahan-byte Dec 8, 2025
7585904
safina branch monday
safinajahan-byte Dec 8, 2025
98dcca8
HoaiThuong Branch Monday commit
hthuong2111 Dec 8, 2025
08b3c91
Merge pull request #1 from ChVlachakis/hoaithuong
ChVlachakis Dec 9, 2025
30e2c92
Merge branch 'main' into safina
ChVlachakis Dec 9, 2025
493a9f5
Merge pull request #2 from ChVlachakis/safina
ChVlachakis Dec 9, 2025
d7ef575
Merge main into christos
Dec 9, 2025
5132158
Merge pull request #3 from ChVlachakis/christos
ChVlachakis Dec 9, 2025
3521598
Remove unused untitled notebooks
Dec 9, 2025
006eddc
Remove unused username notebooks
Dec 9, 2025
4d7fd7a
thuong clean
hthuong2111 Dec 9, 2025
f931582
thuong clean accommodation
hthuong2111 Dec 9, 2025
4316367
Merge pull request #5 from ChVlachakis/hoaithuong
ChVlachakis Dec 9, 2025
4d016ed
Add cleaning and analysis notebooks for travel trips
Dec 9, 2025
f36dc11
Add cleaning and analysis notebooks for travel trips
Dec 9, 2025
8322af3
Merge pull request #6 from ChVlachakis/christos-day2
ChVlachakis Dec 9, 2025
1561fc4
Add analysis for travel trip HoaiThuong
hthuong2111 Dec 9, 2025
c95369b
christos analysis day2
Dec 9, 2025
81afa7b
Merge pull request #8 from ChVlachakis/christos-day2
ChVlachakis Dec 10, 2025
7076073
Merge pull request #7 from ChVlachakis/hoaithuong_day2
ChVlachakis Dec 10, 2025
02e9766
first_project_update
safinajahan-byte Dec 10, 2025
162bea9
update
safinajahan-byte Dec 10, 2025
707754d
Fixed - Add analysis for travel trip HoaiThuong
hthuong2111 Dec 10, 2025
0bc00dc
Merge pull request #9 from ChVlachakis/hoaithuong_day3
ChVlachakis Dec 10, 2025
89af549
Add compressed Hogwarts slides for evaluation
Dec 10, 2025
9d879bd
Merge pull request #10 from ChVlachakis/christos_day3
ChVlachakis Dec 11, 2025
0cff737
Final presentation and final jupyter notebook added
Dec 11, 2025
3a55a1d
Merge pull request #11 from ChVlachakis/christos_day3
ChVlachakis Dec 11, 2025
798baed
Update README with project description
Dec 11, 2025
a386c84
Merge pull request #12 from ChVlachakis/christos_day3
ChVlachakis Dec 11, 2025
c41aa09
Delete obsolete PNG from notebooks
Dec 11, 2025
69614de
Merge pull request #13 from ChVlachakis/christos_day3
ChVlachakis Dec 11, 2025
0f5e924
Final to present slides uploaded
Dec 12, 2025
51875c9
Merge pull request #14 from ChVlachakis/christos_day3
ChVlachakis Dec 12, 2025
49 changes: 35 additions & 14 deletions README.md
@@ -1,5 +1,19 @@
# Project overview
...
Objective:
Turning a “blind” travel business policy into a data-driven one.

Our problem statement (draft - subject to change):
Many travel agencies / tourism service providers struggle to use their historical booking/trip data effectively. Without structured insight into when, where, and for how long travelers go, companies lack evidence to:
optimize marketing and promotions by season / region,
forecast demand to negotiate with accommodation or transport providers,
tailor travel packages to traveler behavior (e.g. trip duration, popular destinations), or
detect under-served / emerging travel destinations that might deserve new offerings.

Business Use-Cases:
Demand forecasting & capacity planning: anticipate high-season destinations and months.
Marketing & promotions optimization: align campaigns with seasons, demographics, destinations.
New product/offer development: design destination-specific travel packages (city-break, long trip, destination-bundle) based on observed traveler behavior.
Cost & supplier negotiation leverage: if you know high-traffic destinations ahead of time, you can negotiate better deals with hotels, transport providers, or local vendors. This is similar to what firms in “business travel analytics” aim for.

# Installation

@@ -55,23 +69,30 @@ If you're a Windows user type:
uv pip install -r requirements.txt
```

# Questions
...
# Hypotheses
1. Seasonality patterns differ between Europe and Asia.
2. Different traveler demographics (age, nationality) show different destination and trip duration preferences.
3. Travelers choose different accommodation types depending on trip duration or on their age.
4. Transportation type preferences differ by traveler age.
5. Accommodation and transportation prices in high-demand continents follow seasonality patterns.

# Dataset
...
https://www.kaggle.com/datasets/rkiattisak/traveler-trip-data

## Main dataset issues

- ...
- ...
- ...

## Solutions for the dataset issues
...
## Main dataset issues & Solutions provided
- Dataset had no major issues
- Some columns needed basic cleaning, especially the numerical ones, which contained currency symbols, text, and commas.
- We had to parse and format the date columns so that we could derive new columns specifying the month, for use in the seasonality analysis.
- The "destination" column combined city and country in one value, so we split it into two new columns, one for each. However, the country was missing for almost half of the rows, so we imported a library to look up those values. Before that, we had to reformat and replace many of the city values so that the library could recognize them.
- We also created a Boolean column called "is_intercontinental_trip", which records whether the traveler's nationality and their destination fall within the same continent.
- The dataset is not big enough to draw confident conclusions about individual cities, countries, or traveler nationalities. Therefore, for each of these three columns we created a new column that groups its values by continent.
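
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code from the project notebooks: the function names, the ISO date format, and the two small lookup dictionaries are assumptions made for the example (the project used an external library for the country lookup, which is replaced here by a hypothetical hand-written mapping). It also assumes that nationalities have already been mapped to a home country, and that "is_intercontinental_trip" is `True` when the two continents differ.

```python
import re
from datetime import datetime

# Hypothetical lookup tables, for illustration only; real data needs many more entries.
CITY_TO_COUNTRY = {"Paris": "France", "Bali": "Indonesia", "Tokyo": "Japan"}
COUNTRY_TO_CONTINENT = {
    "France": "Europe",
    "Germany": "Europe",
    "Indonesia": "Asia",
    "Japan": "Asia",
    "USA": "North America",
}


def clean_amount(raw):
    """Drop commas, then return the first number found, or None if there is none."""
    match = re.search(r"\d+(?:\.\d+)?", str(raw).replace(",", ""))
    return float(match.group()) if match else None


def trip_month(date_text, fmt="%Y-%m-%d"):
    """Return the month number (1-12) of a date string; the format is an assumption."""
    return datetime.strptime(date_text, fmt).month


def split_destination(destination):
    """Split 'City, Country' into (city, country), looking up a missing country."""
    city, _, country = (part.strip() for part in destination.partition(","))
    return city, country or CITY_TO_COUNTRY.get(city)


def is_intercontinental(home_country, destination_country):
    """True if the two countries are on different continents, None if either is unknown."""
    home = COUNTRY_TO_CONTINENT.get(home_country)
    dest = COUNTRY_TO_CONTINENT.get(destination_country)
    if home is None or dest is None:
        return None
    return home != dest


if __name__ == "__main__":
    print(clean_amount("$1,200 USD"))
    print(trip_month("2023-05-01"))
    print(split_destination("Bali"))
    print(is_intercontinental("USA", "Japan"))
```

Returning `None` for values that cannot be parsed or looked up keeps the gaps visible, so they can be counted and handled explicitly instead of being silently dropped.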

# Conclusions
...
- The dataset is limited, so we need to be careful with the conclusions we draw.
- Businesses in Europe need to adjust their pricing from May through September in order to achieve a more balanced demand. For businesses in Asia, those months are March, April, and September.
- For months of low demand, run marketing campaigns and promotions that highlight the low prices travelers can benefit from. This way, businesses can generate demand and revenue even in months with historically low demand.
- To better address the needs of Student-Early Professionals, create discount packages for short-term trips in Asia, using hostels as the primary accommodation. For Young Professionals, create long-trip offerings focusing on Asia and Oceania. For Young Parents, focus on weekend getaways in Europe by train.
- Follow the demand curves: buy less of the equipment needed during low-demand periods, and buy well in advance, using economies-of-scale discounts, when high demand is expected.

# Next steps
...
- We would like to find more observations of travel trip data, especially for older generations, in order to reach more statistically significant research outcomes and to curate new products for a larger subset of traveler demographics.