2021
Public Transportation
Challenge provided by PSE

Model of integrated transport for senior citizens

Understanding senior citizens' needs to improve the public transport system.

Rising life expectancy and falling fertility are the two leading causes shaping the age structure of the global population, according to the UN World Population Ageing 2020 Highlights. In recent decades, the elderly have steadily grown as a share of the total population, with an estimated 727 million people aged 65 or over worldwide.

With increasing age, certain tasks, such as driving, become harder, so the elderly rely increasingly on the public transportation system. This share of the population has very specific needs and interests: seniors tend to travel close to home or within a range accessible by public transport, and they tend to avoid rush hours. A dense offer of stops is therefore crucial.

Goal

Understand mobility patterns of senior citizens, focusing on providing better public transport conditions and accessibility to their points of interest.

United Nations SDG 

GOAL 11: Sustainable Cities and Communities

  • Target 11.2: Provide access to safe, affordable, accessible, and sustainable transport systems for all.

Datasets

The following datasets were provided to the participants:

  • Traffic Intensity Model - the daily average number of senior citizens traveling on road network links between April 2019 and March 2020, provided by PSE.
  • Road segments that are part of the different bus routes, provided by PSE.

Data

In addition to the provided datasets, the teams used further data: purchasing power (which can be related to the cost and quality of life), the criminality rate (which, depending on the type of crime, can deter people from using public transport), the dependency index of seniors (the greater the dependency, the smaller the ability to use public transport), and weather data (precipitation discourages bus use).

One team used public data about the train routes to complement the provided dataset of bus routes.

Another team suggested extending the provided data with the destination county and the reason for each trip, to identify whether the activity being performed could be a key factor. For example, if the elderly in one county used public transport to reach a hospital in another county, then offering more frequent service or even creating new lines to that hospital could be a good decision. The same team suggested increasing the granularity of the Traffic Intensity Model to an hourly time scale to detect peak hours, and adding the coordinates of senior-abundant residential zones to guide where to increase the number of lines and/or stops. Finer granularity at the city level would also enable a much more detailed analysis, since the analysis showed that behavior differs across cities.

Methods and Techniques

Since this challenge had a more descriptive goal, all teams focused on an extensive data analysis. This analysis assessed the correlation between several variables and the average number of senior bus users for different cities in Portugal.
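A correlation analysis of this kind can be sketched as follows. This is a minimal illustration, not the teams' code: the field names (`bus_links`, `purchasing_power`, `senior_users`) and all numbers are hypothetical, and Pearson's coefficient is assumed as the correlation measure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlations_with_target(records, target, features):
    """Correlate each feature column with the target column."""
    ys = [r[target] for r in records]
    return {f: pearson([r[f] for r in records], ys) for f in features}


# Hypothetical per-city records (made-up numbers, not challenge data).
cities = [
    {"bus_links": 120, "purchasing_power": 95.0, "senior_users": 410.0},
    {"bus_links": 300, "purchasing_power": 130.0, "senior_users": 980.0},
    {"bus_links": 80, "purchasing_power": 88.0, "senior_users": 260.0},
    {"bus_links": 210, "purchasing_power": 101.0, "senior_users": 700.0},
]

result = correlations_with_target(
    cities, "senior_users", ["bus_links", "purchasing_power"]
)
for feature, r in result.items():
    print(f"{feature}: r = {r:.2f}")
```

A coefficient close to 1 or -1 suggests a strong linear relationship with senior bus usage, although it does not establish causation.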

One team tried identifying clusters in the data, using K-Means and Agglomerative Clustering, with five dependent variables by district of origin and the average number of senior users, but could not identify any relevant clusters. That same team tried building a predictive model using Linear Regression, but it yielded very low accuracy.
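To make the clustering step concrete, here is a minimal K-Means sketch (Lloyd's algorithm) in plain Python. It is not the team's implementation: the two features, the numbers, and the choice to seed centroids with the first k rows are assumptions made for illustration (libraries typically use smarter seeding). Agglomerative Clustering is not shown.

```python
from math import dist
from statistics import fmean, pstdev


def standardize(rows):
    """Z-score each column so that no single variable dominates the distances."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    if any(s == 0 for s in stds):
        raise ValueError("cannot standardize a constant column")
    return [
        tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in rows
    ]


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm; the first k points seed the centroids (a simplification)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(fmean(col) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break  # converged: centroids stopped moving
        centroids = updated
    return labels, centroids


# Hypothetical districts: (purchasing power index, average daily senior users).
districts = [
    (80.0, 100.0), (120.0, 900.0), (82.0, 120.0),
    (118.0, 880.0), (78.0, 90.0), (122.0, 910.0),
]

labels, centroids = kmeans(standardize(districts), k=2)
print(labels)
```

On real data the groups are rarely this well separated, which is consistent with the team finding no relevant clusters.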

Another team focused extensively on Graph Network analysis to represent mobility between counties. Their analysis considered the population density of a county, the connectivity between counties in terms of public transportation, and the average usage by senior citizens.
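One possible way to represent such a mobility graph is a weighted directed graph whose nodes are counties and whose edge weights are average daily senior trips. The sketch below is an assumption about the representation, not the team's code: the county names "A", "B", "C" and all flow values are made up. Self-loops carry trips that stay within a county, so the same structure also supports comparing intra-county with inter-county mobility.

```python
from collections import defaultdict

# Hypothetical flows: (origin county, destination county) -> average daily trips.
flows = {
    ("A", "A"): 900.0,
    ("A", "B"): 120.0,
    ("B", "B"): 200.0,
    ("B", "A"): 260.0,
    ("C", "C"): 150.0,
    ("C", "A"): 90.0,
}


def build_graph(flows):
    """Adjacency mapping: origin -> {destination: weight}."""
    graph = defaultdict(dict)
    for (origin, destination), weight in flows.items():
        graph[origin][destination] = graph[origin].get(destination, 0.0) + weight
    return dict(graph)


def intra_inter(graph):
    """Per county: (trips staying inside the county, trips leaving it)."""
    summary = {}
    for origin, edges in graph.items():
        intra = edges.get(origin, 0.0)
        inter = sum(w for dest, w in edges.items() if dest != origin)
        summary[origin] = (intra, inter)
    return summary


graph = build_graph(flows)
summary = intra_inter(graph)

# Counties where more seniors travel out of the county than within it.
outward_counties = [c for c, (intra, inter) in summary.items() if inter > intra]
print(summary)
print(outward_counties)
```

Node attributes such as population density could be stored in a separate mapping keyed by county and combined with these edge weights.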

One team approached the problem by trying to identify the best possible location for bus stops using GridSearch, considering their distance from points of interest typically associated with the elderly population. The team demonstrated the approach for healthcare centers, but stated that it could be scaled to a more encompassing set of points of interest, provided that the data was available.
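One reading of this approach is an exhaustive search over a regular grid of candidate locations, scoring each by its distance to the points of interest. The sketch below follows that reading and is not the team's implementation: it assumes planar coordinates in meters, uses mean straight-line distance as the objective, and the healthcare center positions are hypothetical.

```python
from math import hypot


def grid_candidates(x_min, y_min, x_max, y_max, step):
    """Regular grid of candidate stop locations inside a bounding box."""
    if step <= 0:
        raise ValueError("step must be positive")
    nx = int((x_max - x_min) / step) + 1
    ny = int((y_max - y_min) / step) + 1
    return [
        (x_min + i * step, y_min + j * step)
        for i in range(nx)
        for j in range(ny)
    ]


def mean_distance(candidate, pois):
    """Mean straight-line distance from a candidate to all points of interest."""
    cx, cy = candidate
    return sum(hypot(cx - px, cy - py) for px, py in pois) / len(pois)


def best_stop(candidates, pois):
    """Candidate with the smallest mean distance to the points of interest."""
    return min(candidates, key=lambda c: mean_distance(c, pois))


# Hypothetical healthcare centers, in meters on a local planar grid.
healthcare_centers = [(0, 0), (400, 0), (0, 400), (400, 400)]

candidates = grid_candidates(0, 0, 400, 400, step=100)
stop = best_stop(candidates, healthcare_centers)
print(stop, round(mean_distance(stop, healthcare_centers), 1))
```

Walking distance along the street network would be a more realistic objective than straight-line distance, and a smaller step trades computation time for precision.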

Main Insights from Data

Most teams focused on identifying the variables that influence mobility and the use of buses by the elderly and, in some cases, on creating a model based on those variables that predicts the number of seniors who use buses daily. Unsurprisingly, the most influential variable seems to be the number of links and routes that exist: a more robust network generally leads to more usage. However, in Lisbon, purchasing power, senior independence, and the number of crimes are also influential.

One team found that, on average, intra-county mobility is greater than inter-county mobility (only one county was an exception to this rule), indicating that people tend to move more within their county than to another county.

Specifically, in Lisbon, the public transportation network covers the senior mobility location hotspots extremely well, as seen in Figure 1. Most of these locations are, in fact, within 400 meters of a form of public transport, with a few exceptions in the mountain of Monsanto. However, not all counties benefit from such a strong network, and one team described in detail the advantages and disadvantages of the public transportation network of each county around the city of Lisbon.
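A coverage check like the 400-meter rule can be sketched with great-circle distances between hotspots and stops. This is an illustrative sketch only: the coordinates are hypothetical, the haversine formula with a spherical Earth radius of 6,371 km is an assumed distance measure, and the function names are not from the teams' code.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def coverage_share(hotspots, stops, radius_m=400.0):
    """Fraction of hotspots with at least one stop within radius_m."""
    if not hotspots:
        raise ValueError("need at least one hotspot")
    covered = sum(
        1
        for h in hotspots
        if any(haversine_m(*h, *s) <= radius_m for s in stops)
    )
    return covered / len(hotspots)


# Hypothetical coordinates (lat, lon), not taken from the challenge data.
stops = [(38.7400, -9.1500)]
hotspots = [
    (38.7400, -9.1500),  # at the stop itself
    (38.7420, -9.1500),  # about 220 m north of the stop
    (38.7500, -9.1500),  # about 1.1 km north of the stop
]

share = coverage_share(hotspots, stops)
print(f"{share:.0%} of hotspots are within 400 m of a stop")
```

For large datasets, a spatial index would avoid comparing every hotspot with every stop.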

Figure 1 - Map showing the senior mobility for the city of Lisbon (in blue) against the bus network offer (in red). 

World Data League - a competition for data scientists
World Data League @Copyright 2022