Models

epr.DensityEPR([name, rho, gamma, beta, ...])

Density-EPR model.

epr.SpatialEPR([name, rho, gamma, beta, ...])

Spatial-EPR model.

epr.Ditras(diary_generator[, name, rho, gamma])

Ditras modelling framework.

markov_diary_generator.MarkovDiaryGenerator([name])

Markov Diary Learner and Generator.

gravity.Gravity([deterrence_func_type, ...])

Gravity model.

radiation.Radiation([name])

Radiation model.

geosim.GeoSim([name, rho, gamma, alpha, ...])

GeoSim model.

sts_epr.STS_epr([name, rho, gamma, alpha])

STS-EPR model.

EPR

class skmob.models.epr.DensityEPR(name='Density EPR model', rho=0.6, gamma=0.21, beta=0.8, tau=17, min_wait_time_minutes=20)

Density-EPR model.

The d-EPR model of individual human mobility consists of the following mechanisms [PSRPGB2015] [PSR2016]:

Waiting time choice. The waiting time \(\Delta t\) between two movements of the agent is chosen randomly from the distribution \(P(\Delta t) \sim \Delta t^{-1-\beta} \exp(-\Delta t/ \tau)\). Parameters \(\beta\) and \(\tau\) correspond to arguments beta and tau of the constructor, respectively.

Action selection. With probability \(P_{new}=\rho S^{-\gamma}\), where \(S\) is the number of distinct locations previously visited by the agent, the agent visits a new location (Exploration phase), otherwise it returns to a previously visited location (Return phase). Parameters \(\rho\) and \(\gamma\) correspond to arguments rho and gamma of the constructor, respectively.

Exploration phase. If the agent that is currently in location \(i\) explores a new location, then the new location \(j \neq i\) is selected according to the gravity model with probability \(p_{ij} = \frac{1}{N} \frac{n_i n_j}{r_{ij}^2}\), where \(n_{i (j)}\) is the relevance of location \(i (j)\), that is, the probability that the population visits it, \(r_{ij}\) is the geographic distance between \(i\) and \(j\), and \(N = \sum_{i, j \neq i} \frac{n_i n_j}{r_{ij}^2}\) is a normalization constant. The number of distinct locations visited, \(S\), is increased by 1.

Return phase. If the individual returns to a previously visited location, such a location \(i\) is chosen with probability proportional to the number of times the agent visited \(i\), i.e., \(\Pi_i = f_i\), where \(f_i\) is the visitation frequency of location \(i\).
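The Action selection, Exploration and Return mechanisms above can be sketched in plain Python. This is a minimal illustration, not the skmob implementation: the function name `depr_step`, the `visits` dictionary and the `distance` callable are hypothetical, and the sketch assumes that the agent's starting location is already counted in `visits` (so \(S \geq 1\)) and that a return may select the current location.

```python
import random


def depr_step(current, visits, relevance, distance, rho=0.6, gamma=0.21, rng=random):
    """Choose the next location of one agent (illustrative d-EPR step).

    current   -- index of the agent's current location
    visits    -- dict {location index: visit count}, non-empty, updated in place
    relevance -- list with the relevance n_j of every location
    distance  -- function (i, j) -> geographic distance r_ij > 0 for i != j
    """
    s = len(visits)  # S: number of distinct locations visited so far
    p_new = rho * s ** (-gamma)  # probability of exploring a new location
    unvisited = [j for j in range(len(relevance)) if j not in visits and j != current]

    if unvisited and rng.random() < p_new:
        # Exploration: gravity weights n_j / r_ij^2 (n_i is constant for a fixed origin).
        weights = [relevance[j] / distance(current, j) ** 2 for j in unvisited]
        nxt = rng.choices(unvisited, weights=weights, k=1)[0]
    else:
        # Return: probability proportional to the visitation frequency f_i.
        candidates = list(visits)
        weights = [visits[j] for j in candidates]
        nxt = rng.choices(candidates, weights=weights, k=1)[0]

    visits[nxt] = visits.get(nxt, 0) + 1
    return nxt
```

Because \(P_{new}\) decreases as \(S\) grows, repeated calls explore frequently at first and then mostly return to already visited locations.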

Parameters
  • name (str, optional) – the name of the instantiation of the d-EPR model. The default value is “Density EPR model”.

  • rho (float, optional) – it corresponds to the parameter \(\rho \in (0, 1]\) in the Action selection mechanism \(P_{new} = \rho S^{-\gamma}\) and controls the agent’s tendency to explore a new location during the next move versus returning to a previously visited location. The default value is \(\rho = 0.6\) [SKWB2010].

  • gamma (float, optional) – it corresponds to the parameter \(\gamma\) (\(\gamma \geq 0\)) in the Action selection mechanism \(P_{new} = \rho S^{-\gamma}\) and controls the agent’s tendency to explore a new location during the next move versus returning to a previously visited location. The default value is \(\gamma=0.21\) [SKWB2010].

  • beta (float, optional) – it corresponds to the parameter \(\beta\) of the waiting time distribution in the Waiting time choice mechanism. The default value is \(\beta=0.8\) [SKWB2010].

  • tau (int, optional) – it corresponds to the parameter \(\tau\) of the waiting time distribution in the Waiting time choice mechanism. The default value is \(\tau = 17\), expressed in hours [SKWB2010].

  • min_wait_time_minutes (int, optional) – minimum waiting time between two movements, in minutes. The default is 20.

Variables
  • name (str) – the name of the instantiation of the model.

  • rho (float) – the input parameter \(\rho\).

  • gamma (float) – the input parameter \(\gamma\).

  • beta (float) – the input parameter \(\beta\).

  • tau (int) – the input parameter \(\tau\).

  • min_wait_time_minutes (int) – the input parameter min_wait_time_minutes.

Examples

>>> import skmob
>>> import pandas as pd
>>> import geopandas as gpd
>>> from skmob.models.epr import DensityEPR
>>> url = skmob.utils.constants.NY_COUNTIES_2011
>>> tessellation = gpd.read_file(url)
>>> start_time = pd.to_datetime('2019/01/01 08:00:00')
>>> end_time = pd.to_datetime('2019/01/14 08:00:00')
>>> depr = DensityEPR()
>>> tdf = depr.generate(start_time, end_time, tessellation, relevance_column='population', n_agents=100, show_progress=True)
>>> print(tdf.head())
   uid                   datetime        lat        lng
0    1 2019-01-01 08:00:00.000000  42.780819 -76.823724
1    1 2019-01-01 09:45:58.388540  42.728060 -77.775510
2    1 2019-01-01 10:16:09.406408  42.780819 -76.823724
3    1 2019-01-01 17:13:39.999037  42.852827 -77.299810
4    1 2019-01-01 19:24:27.353379  42.728060 -77.775510
>>> print(tdf.parameters)
{'model': {'class': <function DensityEPR.__init__ at 0x7f548a49cf28>, 'generate': {'start_date': Timestamp('2019-01-01 08:00:00'), 'end_date': Timestamp('2019-01-14 08:00:00'), 'gravity_singly': {}, 'n_agents': 100, 'relevance_column': 'population', 'random_state': None, 'show_progress': True}}}

References

PSRPGB2015

Pappalardo, L., Simini, F., Rinzivillo, S., Pedreschi, D., Giannotti, F. & Barabasi, A. L. (2015) Returners and Explorers dichotomy in human mobility. Nature Communications 6, https://www.nature.com/articles/ncomms9166

PSR2016

Pappalardo, L., Simini, F. & Rinzivillo, S. (2016) Human Mobility Modelling: exploration and preferential return meet the gravity model. Procedia Computer Science 83, https://www.sciencedirect.com/science/article/pii/S1877050916302216

SKWB2010

Song, C., Koren, T., Wang, P. & Barabasi, A.L. (2010) Modelling the scaling properties of human mobility. Nature Physics 6, 818-823, https://www.nature.com/articles/nphys1760

See also

EPR, SpatialEPR, Ditras

generate(start_date, end_date, spatial_tessellation, gravity_singly={}, n_agents=1, starting_locations=None, od_matrix=None, relevance_column='relevance', random_state=None, log_file=None, show_progress=False)

Start the simulation of a set of agents at time start_date till time end_date.

Parameters
  • start_date (datetime) – the starting date of the simulation, in “YYYY/mm/dd HH:MM:SS” format.

  • end_date (datetime) – the ending date of the simulation, in “YYYY/mm/dd HH:MM:SS” format.

  • spatial_tessellation (geopandas GeoDataFrame) – the spatial tessellation, i.e., a division of the territory in locations.

  • gravity_singly ({} or Gravity, optional) – the gravity model (singly constrained) to use when generating the probability to move between two locations (note, used by DensityEPR). The default is “{}”.

  • n_agents (int, optional) – the number of agents to generate. The default is 1.

  • relevance_column (str, optional) – the name of the column in spatial_tessellation to use as relevance variable. The default is “relevance”.

  • starting_locations (list or None, optional) – a list of integers, each identifying the location from which to start the simulation of each agent. Note that, if starting_locations is not None, its length must be equal to the value of n_agents, i.e., you must specify one starting location per agent. The default is None.

  • od_matrix (numpy array or None, optional) – the origin destination matrix to use for deciding the movements of the agent (element [i,j] is the probability of one trip from location with tessellation index i to j, normalized by origin location). If None, it is computed “on the fly” during the simulation. The default is None.

  • random_state (int or None, optional) – if int, it is the seed used by the random number generator; if None, the random number generator is the RandomState instance used by np.random and random.random. The default is None.

  • log_file (str or None, optional) – the name of the file where to write a log of the execution of the model. The logfile will contain all decisions (returns or explorations) made by the model. The default is None.

  • show_progress (boolean, optional) – if True, show a progress bar. The default is False.

Returns

the synthetic trajectories generated by the model

Return type

TrajDataFrame
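To make the meaning of od_matrix concrete ("element [i,j] is the probability of one trip from location i to j, normalized by origin location"), the following sketch builds such a matrix from gravity weights \(n_i n_j / r_{ij}^2\). It is an illustration under stated assumptions, not the code skmob runs: `gravity_od_matrix`, `relevance` and `positions` are hypothetical names, and the result is a nested list that would still have to be converted to a numpy array before being passed as od_matrix.

```python
def gravity_od_matrix(relevance, distance):
    """Return a row-normalized matrix of move probabilities.

    relevance -- list with the relevance n_i of every location
    distance  -- function (i, j) -> geographic distance r_ij > 0 for i != j
    """
    n = len(relevance)
    matrix = []
    for i in range(n):
        # Unnormalized gravity weights n_i * n_j / r_ij^2; self-moves get weight 0.
        row = [
            0.0 if i == j else relevance[i] * relevance[j] / distance(i, j) ** 2
            for j in range(n)
        ]
        total = sum(row)
        # Normalize by origin so that every row sums to 1.
        matrix.append([w / total for w in row] if total > 0 else row)
    return matrix


# Three hypothetical locations on a line, one distance unit apart.
relevance = [100.0, 50.0, 10.0]
positions = [0.0, 1.0, 2.0]
od = gravity_od_matrix(relevance, lambda i, j: abs(positions[i] - positions[j]))
```

Each row of `od` is a probability distribution over destinations for one origin, so the origin relevance \(n_i\) cancels out in the normalization.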

class skmob.models.epr.Ditras(diary_generator, name='Ditras model', rho=0.3, gamma=0.21)

Ditras modelling framework.

DITRAS (DIary-based TRAjectory Simulator) is a modelling framework for simulating the spatio-temporal patterns of human mobility [PS2018]. DITRAS consists of two phases:

Mobility Diary Generation. In the first phase, DITRAS generates a mobility diary which captures the temporal patterns of human mobility.

Trajectory Generation. In the second phase, DITRAS transforms the mobility diary into a mobility trajectory which captures the spatial patterns of human movements.

https://raw.githubusercontent.com/jonpappalord/DITRAS/master/DITRAS_schema.png

Outline of the DITRAS framework. DITRAS combines two probabilistic models: a diary generator (e.g., \(MD(t)\)) and a trajectory generator (e.g., d-EPR). The diary generator produces a mobility diary \(D\). The mobility diary \(D\) is the input of the trajectory generator together with a weighted spatial tessellation of the territory \(L\). From \(D\) and \(L\) the trajectory generator produces a synthetic mobility trajectory \(S\).
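The two-phase outline above can be illustrated with a toy conversion of a mobility diary into a trajectory. This is only a sketch: `diary_to_trajectory`, `home` and `choose_location` are hypothetical names, and the sketch assumes that abstract location 0 stands for the agent's home (most frequent) location while any other entry asks the trajectory generator to pick a physical location.

```python
def diary_to_trajectory(diary, home, choose_location):
    """Turn a mobility diary into a list of (timestamp, location) pairs.

    diary           -- list of (timestamp, abstract_location) pairs
    home            -- physical location used for abstract location 0 (assumption)
    choose_location -- function (current location) -> next physical location,
                       standing in for a trajectory generator such as d-EPR
    """
    trajectory = []
    current = home
    for timestamp, abstract_location in diary:
        if abstract_location == 0:
            current = home  # routine: the agent is at its home location
        else:
            current = choose_location(current)  # the generator picks a location
        trajectory.append((timestamp, current))
    return trajectory


# Hypothetical diary and a deterministic stand-in for the trajectory generator.
diary = [("08:00", 0), ("19:00", 1), ("20:00", 0), ("17:00", 1), ("18:00", 2)]
trajectory = diary_to_trajectory(diary, home=7, choose_location=lambda cur: cur + 1)
```

Separating the two phases in this way lets the temporal pattern (the diary) and the spatial choice (the location chooser) be swapped independently.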

Parameters
  • diary_generator (MarkovDiaryGenerator) – the diary generator to use for generating the diary.

  • name (str, optional) – the name of the instantiation of the Ditras model. The default value is “Ditras model”.

  • rho (float, optional) – it corresponds to the parameter \(\rho \in (0, 1]\) in the Action selection mechanism of the DensityEPR model \(P_{new} = \rho S^{-\gamma}\) and controls the agent’s tendency to explore a new location during the next move versus returning to a previously visited location. The default value is \(\rho = 0.3\).

  • gamma (float, optional) – it corresponds to the parameter \(\gamma\) (\(\gamma \geq 0\)) in the Action selection mechanism of the DensityEPR model \(P_{new} = \rho S^{-\gamma}\) and controls the agent’s tendency to explore a new location during the next move versus returning to a previously visited location. The default value is \(\gamma=0.21\) [SKWB2010].

Variables
  • diary_generator (MarkovDiaryGenerator) – the diary generator to use for generating the diary [PS2018].

  • name (str) – the name of the instantiation of the model.

  • rho (float) – the input parameter \(\rho\).

  • gamma (float) – the input parameter \(\gamma\).

Examples

>>> import skmob
>>> import pandas as pd
>>> import geopandas as gpd
>>> from skmob.models.epr import Ditras
>>> from skmob.models.markov_diary_generator import MarkovDiaryGenerator
>>> from skmob.preprocessing import filtering, compression, detection, clustering
>>>
>>> # load and preprocess data to train the MarkovDiaryGenerator
>>> url = skmob.utils.constants.GEOLIFE_SAMPLE
>>> df = pd.read_csv(url, sep=',', compression='gzip')
>>> tdf = skmob.TrajDataFrame(df, latitude='lat', longitude='lon', user_id='user', datetime='datetime')
>>> ctdf = compression.compress(tdf)
>>> stdf = detection.stay_locations(ctdf)
>>> cstdf = clustering.cluster(stdf)
>>>
>>> # instantiate and train the MarkovDiaryGenerator
>>> mdg = MarkovDiaryGenerator()
>>> mdg.fit(cstdf, 2, lid='cluster')
>>>
>>> # set start time, end time and tessellation for the simulation
>>> start_time = pd.to_datetime('2019/01/01 08:00:00')
>>> end_time = pd.to_datetime('2019/01/14 08:00:00')
>>> tessellation = gpd.GeoDataFrame.from_file("data/NY_counties_2011.geojson")
>>>
>>> # instantiate the model
>>> ditras = Ditras(mdg)
>>>
>>> # run the model
>>> ditras_tdf = ditras.generate(start_time, end_time, tessellation, relevance_column='population',
                n_agents=3, od_matrix=None, show_progress=True)
>>> print(ditras_tdf.head())
   uid            datetime        lat        lng
0    1 2019-01-01 08:00:00  43.382528 -78.230656
1    1 2019-01-02 03:00:00  43.309133 -77.680414
2    1 2019-01-02 23:00:00  43.382528 -78.230656
3    1 2019-01-03 10:00:00  43.382528 -78.230656
4    1 2019-01-03 21:00:00  43.309133 -77.680414
>>> print(ditras_tdf.parameters)
{'model': {'class': <function Ditras.__init__ at 0x7f0cf0b7e158>, 'generate': {'start_date': Timestamp('2019-01-01 08:00:00'), 'end_date': Timestamp('2019-01-14 08:00:00'), 'gravity_singly': {}, 'n_agents': 3, 'relevance_column': 'population', 'random_state': None, 'show_progress': True}}}

References

PS2018

Pappalardo, L. & Simini, F. (2018) Data-driven generation of spatio-temporal routines in human mobility. Data Mining and Knowledge Discovery 32, 787-829, https://link.springer.com/article/10.1007/s10618-017-0548-4

See also

DensityEPR, MarkovDiaryGenerator

generate(start_date, end_date, spatial_tessellation, gravity_singly={}, n_agents=1, starting_locations=None, od_matrix=None, relevance_column='relevance', random_state=None, log_file=None, show_progress=False)

Start the simulation of a set of agents at time start_date till time end_date.

Parameters
  • start_date (datetime) – the starting date of the simulation, in “YYYY/mm/dd HH:MM:SS” format.

  • end_date (datetime) – the ending date of the simulation, in “YYYY/mm/dd HH:MM:SS” format.

  • spatial_tessellation (geopandas GeoDataFrame) – the spatial tessellation, i.e., a division of the territory in locations.

  • gravity_singly ({} or Gravity, optional) – the (singly constrained) gravity model to use when generating the probability to move between two locations. The default is “{}”.

  • n_agents (int, optional) – the number of agents to generate. The default is 1.

  • starting_locations (list or None, optional) – a list of integers, each identifying the location from which to start the simulation of each agent. Note that, if starting_locations is not None, its length must be equal to the value of n_agents, i.e., you must specify one starting location per agent. The default is None.

  • od_matrix (numpy array or None, optional) – the origin destination matrix to use for deciding the movements of the agent (element [i,j] is the probability of one trip from location with tessellation index i to j, normalized by origin location). If None, it is computed “on the fly” during the simulation. The default is None.

  • relevance_column (str, optional) – the name of the column in spatial_tessellation to use as relevance variable. The default is “relevance”.

  • random_state (int or None, optional) – if int, it is the seed used by the random number generator; if None, the random number generator is the RandomState instance used by np.random and random.random. The default is None.

  • log_file (str or None, optional) – the name of the file where to write a log of the execution of the model. The logfile will contain all decisions (returns or explorations) made by the model. The default is None.

  • show_progress (boolean, optional) – if True, show a progress bar. The default is False.

Returns

the synthetic trajectories generated by the model

Return type

TrajDataFrame

class skmob.models.epr.SpatialEPR(name='Spatial EPR model', rho=0.6, gamma=0.21, beta=0.8, tau=17, min_wait_time_minutes=20)

Spatial-EPR model.

The s-EPR model of individual human mobility consists of the following mechanisms [PSRPGB2015] [PSR2016] [SKWB2010]:

Waiting time choice. The waiting time \(\Delta t\) between two movements of the agent is chosen randomly from the distribution \(P(\Delta t) \sim \Delta t^{-1-\beta} \exp(-\Delta t/ \tau)\). Parameters \(\beta\) and \(\tau\) correspond to arguments beta and tau of the constructor, respectively.

Action selection. With probability \(P_{new}=\rho S^{-\gamma}\), where \(S\) is the number of distinct locations previously visited by the agent, the agent visits a new location (Exploration phase), otherwise it returns to a previously visited location (Return phase). Parameters \(\rho\) and \(\gamma\) correspond to arguments rho and gamma of the constructor, respectively.

Exploration phase. If the agent that is currently in location \(i\) explores a new location, then the new location \(j \neq i\) is selected according to its distance from the current location, with weight \(p_{ij} = \frac{1}{r_{ij}^2}\), where \(r_{ij}\) is the geographic distance between \(i\) and \(j\). The number of distinct locations visited, \(S\), is increased by 1.

Return phase. If the individual returns to a previously visited location, such a location \(i\) is chosen with probability proportional to the number of times the agent visited \(i\), i.e., \(\Pi_i = f_i\), where \(f_i\) is the visitation frequency of location \(i\).

https://media.springernature.com/full/springer-static/image/art%3A10.1038%2Fnphys1760/MediaObjects/41567_2010_Article_BFnphys1760_Fig2_HTML.jpg?as=webp

Starting at time \(t\) from the configuration shown in the left panel, indicating that the user previously visited \(S=4\) locations with frequency \(f_i\) that is proportional to the size of the circles drawn at each location, at time \(t + \Delta t\) (with \(\Delta t\) drawn from the fat-tailed distribution \(P(\Delta t)\)) the user can either visit a new location at distance \(\Delta r\) from his/her present location with probability \(P_{new}=\rho S^{-\gamma}\), or return to a previously visited location with probability \(P_{ret}=1-\rho S^{-\gamma}\), where the next location will be chosen with probability \(\Pi_i=f_i\) (preferential return; lower panel). Figure from [SKWB2010].
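The Waiting time choice mechanism can be illustrated with a small rejection sampler for \(P(\Delta t) \sim \Delta t^{-1-\beta} \exp(-\Delta t/\tau)\). This is a sketch under stated assumptions rather than the sampler used by skmob: the function name `sample_waiting_time` is hypothetical, times are expressed in hours, and the distribution is truncated from below at the minimum waiting time (20 minutes by default) so that it can be normalized.

```python
import math
import random


def sample_waiting_time(beta=0.8, tau=17.0, min_wait_hours=20 / 60, rng=random):
    """Draw a waiting time in hours by rejection sampling.

    Proposal: a Pareto (pure power-law) variable with density ~ dt^(-1-beta)
    on [min_wait_hours, inf). A proposal dt is accepted with probability
    exp(-(dt - min_wait_hours) / tau), which adds the exponential cutoff.
    """
    while True:
        u = rng.random()  # uniform in [0, 1)
        dt = min_wait_hours * (1.0 - u) ** (-1.0 / beta)  # inverse-CDF Pareto draw
        if rng.random() < math.exp(-(dt - min_wait_hours) / tau):
            return dt
```

With the default \(\tau = 17\) hours, very long proposals are almost always rejected, which reproduces the exponential cutoff of the distribution.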

Parameters
  • name (str, optional) – the name of the instantiation of the s-EPR model. The default value is “Spatial EPR model”.

  • rho (float, optional) – it corresponds to the parameter \(\rho \in (0, 1]\) in the Action selection mechanism \(P_{new} = \rho S^{-\gamma}\) and controls the agent’s tendency to explore a new location during the next move versus returning to a previously visited location. The default value is \(\rho = 0.6\) [SKWB2010].

  • gamma (float, optional) – it corresponds to the parameter \(\gamma\) (\(\gamma \geq 0\)) in the Action selection mechanism \(P_{new} = \rho S^{-\gamma}\) and controls the agent’s tendency to explore a new location during the next move versus returning to a previously visited location. The default value is \(\gamma=0.21\) [SKWB2010].

  • beta (float, optional) – it corresponds to the parameter \(\beta\) of the waiting time distribution in the Waiting time choice mechanism. The default value is \(\beta=0.8\) [SKWB2010].

  • tau (int, optional) – it corresponds to the parameter \(\tau\) of the waiting time distribution in the Waiting time choice mechanism. The default value is \(\tau = 17\), expressed in hours [SKWB2010].

  • min_wait_time_minutes (int, optional) – minimum waiting time between two movements, in minutes. The default is 20.

Variables
  • name (str) – the name of the instantiation of the model.

  • rho (float) – the input parameter \(\rho\).

  • gamma (float) – the input parameter \(\gamma\).

  • beta (float) – the input parameter \(\beta\).

  • tau (int) – the input parameter \(\tau\).

  • min_wait_time_minutes (int) – the input parameter min_wait_time_minutes.

Examples

>>> import skmob
>>> import pandas as pd
>>> import geopandas as gpd
>>> from skmob.models.epr import SpatialEPR
>>> url = skmob.utils.constants.NY_COUNTIES_2011
>>> tessellation = gpd.read_file(url)
>>> start_time = pd.to_datetime('2019/01/01 08:00:00')
>>> end_time = pd.to_datetime('2019/01/14 08:00:00')
>>> sepr = SpatialEPR()
>>> tdf = sepr.generate(start_time, end_time, tessellation, n_agents=100, show_progress=True)
>>> print(tdf.head())
   uid                   datetime        lat        lng
0    1 2019-01-01 08:00:00.000000  42.267915 -77.383591
1    1 2019-01-01 13:06:13.973868  42.633510 -77.105324
2    1 2019-01-01 14:17:41.188668  42.452018 -76.473618
3    1 2019-01-01 14:49:30.896248  42.633510 -77.105324
4    1 2019-01-01 15:10:54.133150  43.382528 -78.230656
>>> print(tdf.parameters)
{'model': {'class': <function SpatialEPR.__init__ at 0x7f548a49e048>, 'generate': {'start_date': Timestamp('2019-01-01 08:00:00'), 'end_date': Timestamp('2019-01-14 08:00:00'), 'gravity_singly': {}, 'n_agents': 100, 'relevance_column': None, 'random_state': None, 'show_progress': True}}}

See also

EPR, DensityEPR, Ditras

generate(start_date, end_date, spatial_tessellation, gravity_singly={}, n_agents=1, starting_locations=None, od_matrix=None, random_state=None, log_file=None, show_progress=False)

Start the simulation of a set of agents at time start_date till time end_date.

Parameters
  • start_date (datetime) – the starting date of the simulation, in “YYYY/mm/dd HH:MM:SS” format.

  • end_date (datetime) – the ending date of the simulation, in “YYYY/mm/dd HH:MM:SS” format.

  • spatial_tessellation (geopandas GeoDataFrame) – the spatial tessellation, i.e., a division of the territory in locations.

  • gravity_singly ({} or Gravity, optional) – the gravity model (singly constrained) to use when generating the probability to move between two locations (note, used by DensityEPR). The default is “{}”.

  • n_agents (int, optional) – the number of agents to generate. The default is 1.

  • starting_locations (list or None, optional) – a list of integers, each identifying the location from which to start the simulation of each agent. Note that, if starting_locations is not None, its length must be equal to the value of n_agents, i.e., you must specify one starting location per agent. The default is None.

  • od_matrix (numpy array or None, optional) – the origin destination matrix to use for deciding the movements of the agent (element [i,j] is the probability of one trip from location with tessellation index i to j, normalized by origin location). If None, it is computed “on the fly” during the simulation. The default is None.

  • random_state (int or None, optional) – if int, it is the seed used by the random number generator; if None, the random number generator is the RandomState instance used by np.random and random.random. The default is None.

  • log_file (str or None, optional) – the name of the file where to write a log of the execution of the model. The logfile will contain all decisions (returns or explorations) made by the model. The default is None.

  • show_progress (boolean, optional) – if True, show a progress bar. The default is False.

Returns

the synthetic trajectories generated by the model

Return type

TrajDataFrame

Markov Diary Generator

class skmob.models.markov_diary_generator.MarkovDiaryGenerator(name='Markov diary')

Markov Diary Learner and Generator.

A diary generator \(G\) produces a mobility diary, \(D(t)\), containing the sequence of trips made by an agent during a time period divided in time slots of \(t\) seconds. For example, \(G(3600)\) and \(G(60)\) produce mobility diaries with temporal resolutions of one hour and one minute, respectively [PS2018].

A Mobility Diary Learner (MDL) is a data-driven algorithm to compute a mobility diary \(MD\) from the mobility trajectories of a set of real individuals. We use a Markov model to describe the probability that an individual follows her routine and visits a typical location at the usual time, or she breaks the routine and visits another location. First, MDL translates mobility trajectory data of real individuals into abstract mobility trajectories. Second, it uses the obtained abstract trajectory data to compute the transition probabilities of the Markov model \(MD(t)\) [PS2018].
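The idea of learning transition probabilities from abstract trajectories and then generating a diary can be sketched with a first-order Markov chain. This is a deliberately simplified illustration, not the MarkovDiaryGenerator algorithm: the states here are bare abstract locations (the time-of-day component is left out), and `fit_transitions` and `generate_diary` are hypothetical names.

```python
import random
from collections import Counter, defaultdict


def fit_transitions(sequences):
    """Estimate transition probabilities from sequences of abstract locations."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1  # count each observed transition a -> b
    return {
        a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
        for a, nxt in counts.items()
    }


def generate_diary(chain, start, length, rng=random):
    """Generate a sequence of abstract locations of at most `length` entries."""
    diary = [start]
    while len(diary) < length:
        nxt = chain.get(diary[-1])
        if not nxt:
            break  # no outgoing transition was observed for this state
        states, probs = zip(*nxt.items())
        diary.append(rng.choices(states, weights=probs, k=1)[0])
    return diary


# One hypothetical abstract trajectory: the agent keeps coming back to location 0.
chain = fit_transitions([[0, 1, 0, 2, 0]])
diary = generate_diary(chain, start=0, length=10, rng=random.Random(0))
```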

Parameters

name (str, optional) – name of the instantiation of the class. The default is “Markov diary”.

Variables
  • name (str) – name of the instantiation of the class.

  • markov_chain (dict) – the trained markov chain.

  • time_slot_length (str) – length of the time slot (1h).

Examples

>>> import skmob
>>> import pandas as pd
>>> import geopandas as gpd
>>> from skmob.models.epr import Ditras
>>> from skmob.models.markov_diary_generator import MarkovDiaryGenerator
>>> from skmob.preprocessing import filtering, compression, detection, clustering
>>> url = skmob.utils.constants.GEOLIFE_SAMPLE
>>>
>>> df = pd.read_csv(url, sep=',', compression='gzip')
>>> tdf = skmob.TrajDataFrame(df, latitude='lat', longitude='lon', user_id='user', datetime='datetime')
>>>
>>> ctdf = compression.compress(tdf)
>>> stdf = detection.stops(ctdf)
>>> cstdf = clustering.cluster(stdf)
>>>
>>> mdg = MarkovDiaryGenerator()
>>> mdg.fit(cstdf, 2, lid='cluster')
>>>
>>> start_time = pd.to_datetime('2019/01/01 08:00:00')
>>> diary = mdg.generate(100, start_time)
>>> print(diary)
             datetime  abstract_location
0 2019-01-01 08:00:00                  0
1 2019-01-02 19:00:00                  1
2 2019-01-02 20:00:00                  0
3 2019-01-03 17:00:00                  1
4 2019-01-03 18:00:00                  2
5 2019-01-04 08:00:00                  0
6 2019-01-05 03:00:00                  1

References

PS2018

Pappalardo, L. & Simini, F. (2018) Data-driven generation of spatio-temporal routines in human mobility. Data Mining and Knowledge Discovery 32, 787-829, https://link.springer.com/article/10.1007/s10618-017-0548-4

See also

Ditras

fit(traj, n_individuals, lid='location')

Train the markov mobility diary from real trajectories.

Parameters
  • traj (TrajDataFrame) – the trajectories of the individuals.

  • n_individuals (int) – the number of individuals in the TrajDataFrame to consider.

  • lid (string, optional) – the name of the column containing the identifier of the location. The default is “location”.

generate(diary_length, start_date, random_state=None)

Start the generation of the mobility diary.

Parameters
  • diary_length (int) – the length of the diary in hours.

  • start_date (datetime) – the starting date of the generation.

Returns

the generated mobility diary.

Return type

pandas DataFrame

Gravity

class skmob.models.gravity.Gravity(deterrence_func_type='power_law', deterrence_func_args=[-2.0], origin_exp=1.0, destination_exp=1.0, gravity_type='singly constrained', name='Gravity model')

Gravity model.

The Gravity model of human migration. In its original formulation, the probability \(T_{ij}\) of moving from a location \(i\) to location \(j\) is defined as [Z1946]:

\[T_{ij} \propto \frac{P_i P_j}{r_{ij}}\]

where \(P_i\) and \(P_j\) are the population of location \(i\) and \(j\) and \(r_{ij}\) is the distance between \(i\) and \(j\). The basic assumptions of this model are that the number of trips leaving location \(i\) is proportional to its population \(P_i\), the attractivity of location \(j\) is also proportional to \(P_j\), and finally, that there is a cost effect in terms of distance traveled. These ideas can be generalized assuming a relation of the type [BBGJLLMRST2018]:

\[T_{ij} = K m_i m_j f(r_{ij})\]

where \(K\) is a constant, the masses \(m_i\) and \(m_j\) relate to the number of trips leaving location \(i\) or the ones attracted by location \(j\), and \(f(r_{ij})\), called the deterrence function, is a decreasing function of distance. The deterrence function \(f(r_{ij})\) is commonly modeled with a power-law or an exponential form.
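The generalized form \(T_{ij} = K m_i m_j f(r_{ij})\) and the two common deterrence functions can be written down directly. The sketch below is illustrative only: the function names and the `scale` parameter of the exponential form are hypothetical and do not claim to match the argument conventions of deterrence_func_args.

```python
import math


def power_law_deterrence(r, exponent=-2.0):
    """Power-law deterrence f(r) = r^exponent (decreasing for exponent < 0)."""
    return r ** exponent


def exponential_deterrence(r, scale=1.0):
    """Exponential deterrence f(r) = exp(-r / scale); `scale` is a hypothetical length scale."""
    return math.exp(-r / scale)


def gravity_flow(m_i, m_j, r_ij, deterrence=power_law_deterrence, K=1.0):
    """Unconstrained gravity flow T_ij = K * m_i * m_j * f(r_ij)."""
    return K * m_i * m_j * deterrence(r_ij)
```

For example, `gravity_flow(10, 20, 2.0)` evaluates \(10 \cdot 20 \cdot 2^{-2} = 50\).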

Constrained gravity models. Some of the limitations of the gravity model can be resolved via constrained versions. For example, one may hold the number of people originating from a location \(i\) to be a known quantity \(O_i\), and the gravity model is then used to estimate the destination, constituting a so-called singly constrained gravity model of the form:

\[T_{ij} = K_i O_i m_j f(r_{ij}) = O_i \frac{m_j f(r_{ij})}{\sum_k m_k f(r_{ik})}.\]

In this formulation, the proportionality constants \(K_i\) depend on the location of the origin and its distance to the other places considered. We can fix also the total number of travelers arriving at a destination \(j\) as \(D_j = \sum_i T_{ij}\), leading to a doubly-constrained gravity model. For each origin-destination pair, the flow is calculated as:

\[T_{ij} = K_i O_i L_j D_j f(r_{ij})\]

where there are now two flavors of proportionality constants

\[K_i = \frac{1}{\sum_j L_j D_j f(r_{ij})}, \quad L_j = \frac{1}{\sum_i K_i O_i f(r_{ij})}.\]

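Since \(K_i\) and \(L_j\) are defined in terms of each other, they are usually found by iterating the two equations until they stop changing. The sketch below shows this balancing procedure in plain Python; it is an illustration only (the Gravity class documented here offers the “singly constrained” and “globally constrained” types), `doubly_constrained_flows` is a hypothetical name, and it assumes that the total outflow equals the total inflow.

```python
def doubly_constrained_flows(O, D, f, n_iter=100):
    """Compute T_ij = K_i * O_i * L_j * D_j * f_ij by iterative balancing.

    O -- list of total outflows O_i, one per origin
    D -- list of total inflows D_j, one per destination (sum(D) == sum(O))
    f -- matrix (list of lists) of deterrence values f(r_ij) > 0
    """
    n_o, n_d = len(O), len(D)
    K = [1.0] * n_o
    L = [1.0] * n_d
    for _ in range(n_iter):
        # Update K_i from the current L_j, then L_j from the updated K_i.
        K = [1.0 / sum(L[j] * D[j] * f[i][j] for j in range(n_d)) for i in range(n_o)]
        L = [1.0 / sum(K[i] * O[i] * f[i][j] for i in range(n_o)) for j in range(n_d)]
    return [
        [K[i] * O[i] * L[j] * D[j] * f[i][j] for j in range(n_d)]
        for i in range(n_o)
    ]


# Two hypothetical origins and destinations with matching totals (300 trips).
flows = doubly_constrained_flows(
    O=[100.0, 200.0],
    D=[150.0, 150.0],
    f=[[1.0, 0.5], [0.25, 1.0]],
)
```

After convergence, the rows of `flows` sum to the outflows \(O_i\) and the columns sum to the inflows \(D_j\).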
Parameters
  • deterrence_func_type (str, optional) – the deterrence function to use. The possible deterrence functions are “power_law” and “exponential”. The default is “power_law”.

  • deterrence_func_args (list, optional) – the arguments of the deterrence function. The default is [-2.0].

  • origin_exp (float, optional) – the exponent of the origin’s relevance (only relevant to the globally constrained model). The default is 1.0.

  • destination_exp (float, optional) – the exponent of the destination’s relevance. The default is 1.0.

  • gravity_type (str, optional) – the type of gravity model. Possible values are “singly constrained” and “globally constrained”. The default is “singly constrained”.

  • name (str, optional) – the name of the instantiation of the Gravity model. The default is “Gravity model”.

Variables
  • deterrence_func_type (str) – the deterrence function to use. The possible deterrence functions are “power_law” and “exponential”.

  • deterrence_func_args (list) – the arguments of the deterrence function.

  • origin_exp (float) – the exponent of the origin’s relevance (only relevant to the globally constrained model).

  • destination_exp (float) – the exponent of the destination’s relevance.

  • gravity_type (str) – the type of gravity model. Possible values are “singly constrained” and “globally constrained”.

  • name (str) – the name of the instantiation of the Gravity model.

Examples

>>> import skmob
>>> from skmob.utils import utils, constants
>>> import pandas as pd
>>> import geopandas as gpd
>>> import numpy as np
>>> from skmob.models.gravity import Gravity
>>> # load a spatial tessellation
>>> url_tess = skmob.utils.constants.NY_COUNTIES_2011
>>> tessellation = gpd.read_file(url_tess).rename(columns={'tile_id': 'tile_ID'})
>>> print(tessellation.head())
  tile_ID  population                                           geometry
0   36019       81716  POLYGON ((-74.006668 44.886017, -74.027389 44....
1   36101       99145  POLYGON ((-77.099754 42.274215, -77.0996569999...
2   36107       50872  POLYGON ((-76.25014899999999 42.296676, -76.24...
3   36059     1346176  POLYGON ((-73.707662 40.727831, -73.700272 40....
4   36011       79693  POLYGON ((-76.279067 42.785866, -76.2753479999...
>>> # load real flows into a FlowDataFrame
>>> fdf = skmob.FlowDataFrame.from_file(skmob.utils.constants.NY_FLOWS_2011,
                                        tessellation=tessellation,
                                        tile_id='tile_ID',
                                        sep=",")
>>> print(fdf.head())
     flow origin destination
0  121606  36001       36001
1       5  36001       36005
2      29  36001       36007
3      11  36001       36017
4      30  36001       36019
>>> # compute the total outflows from each location of the tessellation (excluding self loops)
>>> tot_outflows = fdf[fdf['origin'] != fdf['destination']].groupby(by='origin', axis=0)[['flow']].sum().fillna(0)
>>> tessellation = tessellation.merge(tot_outflows, left_on='tile_ID', right_on='origin').rename(columns={'flow': constants.TOT_OUTFLOW})
>>> print(tessellation.head())
  tile_ID  population                                           geometry
0   36019       81716  POLYGON ((-74.006668 44.886017, -74.027389 44....
1   36101       99145  POLYGON ((-77.099754 42.274215, -77.0996569999...
2   36107       50872  POLYGON ((-76.25014899999999 42.296676, -76.24...
3   36059     1346176  POLYGON ((-73.707662 40.727831, -73.700272 40....
4   36011       79693  POLYGON ((-76.279067 42.785866, -76.2753479999...
   tot_outflow
0        29981
1         5319
2       295916
3         8665
4         8871
>>> # instantiate a singly constrained Gravity model
>>> gravity_singly = Gravity(gravity_type='singly constrained')
>>> print(gravity_singly)
Gravity(name="Gravity model", deterrence_func_type="power_law", deterrence_func_args=[-2.0], origin_exp=1.0, destination_exp=1.0, gravity_type="singly constrained")
>>> np.random.seed(0)
>>> synth_fdf = gravity_singly.generate(tessellation,
                                   tile_id_column='tile_ID',
                                   tot_outflows_column='tot_outflow',
                                   relevance_column= 'population',
                                   out_format='flows')
>>> print(synth_fdf.head())
  origin destination  flow
0  36019       36101   101
1  36019       36107    66
2  36019       36059  1041
3  36019       36011   151
4  36019       36123    33
>>> # fit the parameters of the Gravity model from real fluxes
>>> gravity_singly_fitted = Gravity(gravity_type='singly constrained')
>>> print(gravity_singly_fitted)
Gravity(name="Gravity model", deterrence_func_type="power_law", deterrence_func_args=[-2.0], origin_exp=1.0, destination_exp=1.0, gravity_type="singly constrained")
>>> gravity_singly_fitted.fit(fdf, relevance_column='population')
>>> print(gravity_singly_fitted)
Gravity(name="Gravity model", deterrence_func_type="power_law", deterrence_func_args=[-1.9947152031914186], origin_exp=1.0, destination_exp=0.6471759552223144, gravity_type="singly constrained")
>>> np.random.seed(0)
>>> synth_fdf_fitted = gravity_singly_fitted.generate(tessellation,
                                                        tile_id_column='tile_ID',
                                                        tot_outflows_column='tot_outflow',
                                                        relevance_column= 'population',
                                                        out_format='flows')
>>> print(synth_fdf_fitted.head())
  origin destination  flow
0  36019       36101   102
1  36019       36107    66
2  36019       36059  1044
3  36019       36011   152
4  36019       36123    33

References

Z1946

Zipf, G. K. (1946) The P1 P2/D hypothesis: on the intercity movement of persons. American Sociological Review 11(6), 677-686, https://www.jstor.org/stable/2087063?seq=1#metadata_info_tab_contents

W1971

Wilson, A. G. (1971) A family of spatial interaction models, and associated developments. Environment and Planning A 3(1), 1-32, https://econpapers.repec.org/article/pioenvira/v_3a3_3ay_3a1971_3ai_3a1_3ap_3a1-32.htm

BBGJLLMRST2018

Barbosa, H., Barthelemy, M., Ghoshal, G., James, C. R., Lenormand, M., Louail, T., Menezes, R., Ramasco, J. J., Simini, F. & Tomasini, M. (2018) Human mobility: Models and applications. Physics Reports 734, 1-74, https://www.sciencedirect.com/science/article/abs/pii/S037015731830022X

See also

Radiation

fit(flow_df, relevance_column='relevance')

Fit the parameters of the Gravity model to the flows provided in input, using a Generalized Linear Model (GLM) with a Poisson regression [FM1982].

Parameters
  • flow_df (FlowDataFrame) – the real flows on the spatial tessellation.

  • relevance_column (str, optional) – the column in the spatial tessellation with the relevance of the location. The default is constants.RELEVANCE.
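As a rough illustration of the Poisson-regression idea behind fit (not the library's actual implementation), the sketch below estimates the exponent of a power-law deterrence function with iteratively reweighted least squares. The helper name fit_poisson_power_law, the single-covariate setup (distance only, no relevance or origin terms), and the toy data are all assumptions made for this example.

```python
import math

def fit_poisson_power_law(distances, flows, n_iter=20):
    """Fit log E[flow] = a + b * log(distance) by Poisson IRLS (hypothetical helper)."""
    x = [math.log(d) for d in distances]
    # Start from the observed flows; clip at 0.5 so that log() is defined.
    mu = [max(f, 0.5) for f in flows]
    a = b = 0.0
    for _ in range(n_iter):
        # Working response and weights of a Poisson GLM with log link.
        z = [math.log(m) + (f - m) / m for f, m in zip(flows, mu)]
        w = mu
        sw = sum(w)
        sx = sum(wi * xi for wi, xi in zip(w, x))
        sz = sum(wi * zi for wi, zi in zip(w, z))
        sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        sxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        # Weighted least squares for the intercept a and the slope b.
        b = (sw * sxz - sx * sz) / (sw * sxx - sx * sx)
        a = (sz - b * sx) / sw
        mu = [math.exp(a + b * xi) for xi in x]
    return a, b

# Toy data generated exactly from flow = 1000 * distance^-2 (no noise).
toy_distances = [1.0, 2.0, 4.0, 8.0, 16.0]
toy_flows = [1000.0 * d ** -2 for d in toy_distances]
intercept, exponent = fit_poisson_power_law(toy_distances, toy_flows)
print(round(exponent, 3))
```

On real flows the estimate would also depend on the relevance terms and on the constraint type, which this sketch leaves out.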

References

FM1982

Flowerdew, R. & Murray, A. (1982) A method of fitting the gravity model based on the Poisson distribution. Journal of regional science 22(2), 191-202, https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1467-9787.1982.tb00744.x

generate(spatial_tessellation, tile_id_column='tile_ID', tot_outflows_column='tot_outflow', relevance_column='relevance', out_format='flows')

Start the simulation of the Gravity model.

Parameters
  • spatial_tessellation (GeoDataFrame) – the spatial tessellation on which to run the simulation.

  • tile_id_column (str, optional) – the column in spatial_tessellation of the location identifier. The default is constants.TILE_ID.

  • tot_outflows_column (str, optional) – the column in spatial_tessellation with the outflow of the location. The default is constants.TOT_OUTFLOW.

  • relevance_column (str, optional) – the column in spatial_tessellation with the relevance of the location. The default is constants.RELEVANCE.

  • out_format (str, optional) – the format of the generated flows. Possible values are “flows” (average flow between two locations), “flows_sample” (random sample of flows), and “probabilities” (probability of a unit flow between two locations). The default is “flows”.

Returns

the flows generated by the Gravity model.

Return type

FlowDataFrame
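The three out_format values can be read as three views of the same unit-flow probabilities. The sketch below is a minimal illustration of that relationship for a single origin; the function flows_from_probabilities, the toy probabilities, and the choice of independent per-trip draws for "flows_sample" are assumptions, not the library's code.

```python
import random
from collections import Counter

def flows_from_probabilities(tot_outflow, probs, out_format="flows", rng=None):
    """probs maps destination -> probability of a unit flow (hypothetical helper)."""
    if out_format == "probabilities":
        return dict(probs)
    if out_format == "flows":
        # Average flow: total outflow times the unit-flow probability.
        return {dest: tot_outflow * p for dest, p in probs.items()}
    if out_format == "flows_sample":
        # Random sample: draw the destination of each trip independently.
        rng = rng or random.Random()
        dests = list(probs)
        draws = rng.choices(dests, weights=[probs[d] for d in dests], k=tot_outflow)
        counts = Counter(draws)
        return {dest: counts.get(dest, 0) for dest in dests}
    raise ValueError(f"unknown out_format: {out_format!r}")

# Toy probabilities for one origin (made-up values).
toy_probs = {"36101": 0.5, "36107": 0.3, "36059": 0.2}
avg = flows_from_probabilities(1000, toy_probs, "flows")
sample = flows_from_probabilities(1000, toy_probs, "flows_sample", random.Random(0))
```

The sampled flows always add up to the origin's total outflow, while the average flows are generally non-integer.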

Radiation

class skmob.models.radiation.Radiation(name='Radiation model')

Radiation model.

The radiation model for human migration. The radiation model assumes that the choice of a traveler’s destination consists of two steps. First, each opportunity in every location is assigned a fitness represented by a number \(z\), chosen from some distribution \(P(z)\) whose value represents the quality of the opportunity for the traveler. Second, the traveler ranks all opportunities according to their distances from the origin location and chooses the closest opportunity with a fitness higher than the traveler’s fitness threshold, which is another random number extracted from the fitness distribution \(P(z)\). As a result, the average number of travelers from location \(i\) to location \(j\) takes the form [SGMB2012]:

\[T_{ij} = O_i \frac{1}{1 - \frac{m_i}{M}}\frac{m_i m_j}{(m_i + s_{ij})(m_i + m_j + s_{ij})}.\]

The destination of the \(O_i\) trips originating in \(i\) is sampled from a distribution of probabilities that a trip originating in \(i\) ends in location \(j\). This probability depends on the number of opportunities at the origin \(m_i\), at the destination \(m_j\) and the number of opportunities \(s_{ij}\) in a circle of radius \(r_{ij}\) centered in \(i\) (excluding the source and destination). This conditional probability needs to be normalized so that the probability that a trip originating in the region of interest ends in this region is equal to 1. In case of a finite system it is possible to show that this is equal to \(1 - \frac{m_i}{M}\) where \(M=\sum_i m_i\) is the total number of opportunities. In the original version of the radiation model, the number of opportunities is approximated by the population, but the total inflows \(D_j\) to each destination can also be used.
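The formula for \(T_{ij}\) can be checked on a toy system. The sketch below is a minimal illustration that assumes planar coordinates with Euclidean distances and no ties in distance; the function radiation_flows and the four toy locations are made up for this example and are not part of the library.

```python
import math

def radiation_flows(locations, origin):
    """locations maps id -> (x, y, opportunities m, total outflow O)."""
    xi, yi, m_i, o_i = locations[origin]
    total = sum(m for _, _, m, _ in locations.values())  # M
    dist = {k: math.hypot(x - xi, y - yi)
            for k, (x, y, _, _) in locations.items() if k != origin}
    flows = {}
    for j, r_ij in dist.items():
        m_j = locations[j][2]
        # s_ij: opportunities strictly closer to the origin than j,
        # excluding the origin and the destination themselves.
        s_ij = sum(locations[k][2] for k, r_ik in dist.items()
                   if k != j and r_ik < r_ij)
        p_ij = (m_i * m_j) / ((m_i + s_ij) * (m_i + m_j + s_ij))
        # Finite-size normalization 1 / (1 - m_i / M).
        flows[j] = o_i * p_ij / (1.0 - m_i / total)
    return flows

toy = {
    "A": (0.0, 0.0, 100.0, 50.0),
    "B": (1.0, 0.0, 200.0, 80.0),
    "C": (0.0, 3.0, 50.0, 20.0),
    "D": (5.0, 5.0, 400.0, 90.0),
}
flows_from_a = radiation_flows(toy, "A")
```

With the normalization factor, the flows leaving A add up to its total outflow \(O_i\), which is the property the normalization is meant to guarantee.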

https://media.springernature.com/full/springer-static/image/art%3A10.1038%2Fnature10856/MediaObjects/41586_2012_Article_BFnature10856_Fig1_HTML.jpg?as=webp

(a) To demonstrate the limitations of the gravity law we highlight two pairs of counties, one in Utah (UT) and the other in Alabama (AL), with similar origin (\(m\), blue) and destination (\(n\), green) populations and comparable distance \(r\) between them (see bottom left table). The US census 2000 reports a flux that is an order of magnitude greater between the Utah counties, a difference correctly captured by the radiation model (b, c). (b) The definition of the radiation model: an individual (for example, living in Saratoga County, New York) applies for jobs in all counties and collects potential employment offers. The number of job opportunities in each county (\(j\)) is \(n_j / n_{jobs}\), chosen to be proportional to the resident population \(n_j\). Each offer’s attractiveness (benefit) is represented by a random variable with distribution \(P(z)\), the numbers placed in each county representing the best offer among the \(n_j / n_{jobs}\) trials in that area. Each county is marked in green (red) if its best offer is better (lower) than the best offer in the home county (here \(z = 10\)). (c) An individual accepts the closest job that offers better benefits than his home county. In the shown configuration the individual will commute to Oneida County, New York, the closest county whose benefit \(z = 13\) exceeds the home county benefit \(z = 10\). This process is repeated for each potential commuter, choosing new benefit variables \(z\) in each case. Figure from [SGMB2012].

Parameters

name (str, optional) – the name of the instantiation of the radiation model. The default is ‘Radiation model’.

Variables

name (str) – the name of the instantiation of the model.

Examples

>>> import skmob
>>> from skmob.utils import utils, constants
>>> import pandas as pd
>>> import geopandas as gpd
>>> import numpy as np
>>> from skmob.models import Radiation
>>> # load a spatial tessellation
>>> url_tess = skmob.utils.constants.NY_COUNTIES_2011
>>> tessellation = gpd.read_file(url_tess).rename(columns={'tile_id': 'tile_ID'})
>>> print(tessellation.head())
  tile_ID  population                                           geometry
0   36019       81716  POLYGON ((-74.006668 44.886017, -74.027389 44....
1   36101       99145  POLYGON ((-77.099754 42.274215, -77.0996569999...
2   36107       50872  POLYGON ((-76.25014899999999 42.296676, -76.24...
3   36059     1346176  POLYGON ((-73.707662 40.727831, -73.700272 40....
4   36011       79693  POLYGON ((-76.279067 42.785866, -76.2753479999...
>>> # load real flows into a FlowDataFrame
>>> fdf = skmob.FlowDataFrame.from_file(skmob.utils.constants.NY_FLOWS_2011,
                                        tessellation=tessellation,
                                        tile_id='tile_ID',
                                        sep=",")
>>> print(fdf.head())
     flow origin destination
0  121606  36001       36001
1       5  36001       36005
2      29  36001       36007
3      11  36001       36017
4      30  36001       36019
>>> # compute the total outflows from each location of the tessellation (excluding self loops)
>>> tot_outflows = fdf[fdf['origin'] != fdf['destination']].groupby(by='origin', axis=0)[['flow']].sum().fillna(0)
>>> tessellation = tessellation.merge(tot_outflows, left_on='tile_ID', right_on='origin').rename(columns={'flow': constants.TOT_OUTFLOW})
>>> print(tessellation.head())
  tile_ID  population                                           geometry
0   36019       81716  POLYGON ((-74.006668 44.886017, -74.027389 44....
1   36101       99145  POLYGON ((-77.099754 42.274215, -77.0996569999...
2   36107       50872  POLYGON ((-76.25014899999999 42.296676, -76.24...
3   36059     1346176  POLYGON ((-73.707662 40.727831, -73.700272 40....
4   36011       79693  POLYGON ((-76.279067 42.785866, -76.2753479999...
   tot_outflow
0        29981
1         5319
2       295916
3         8665
4         8871
>>> np.random.seed(0)
>>> radiation = Radiation()
>>> rad_flows = radiation.generate(tessellation, tile_id_column='tile_ID',  tot_outflows_column='tot_outflow', relevance_column='population', out_format='flows_sample')
>>> print(rad_flows.head())
  origin destination   flow
0  36019       36033  11648
1  36019       36031   4232
2  36019       36089   5598
3  36019       36113   1596
4  36019       36041    117

References

SGMB2012

Simini, F., González, M. C., Maritan, A. & Barabasi, A.-L. (2012) A universal model for mobility and migration patterns. Nature 484(7392), 96-100, https://www.nature.com/articles/nature10856

MSJB2013

Masucci, A. P., Serras, J., Johansson, A., & Batty, M. (2013). Gravity versus radiation models: On the importance of scale and heterogeneity in commuting flows. Physical Review E, 88(2), 022812.

generate(spatial_tessellation, tile_id_column='tile_ID', tot_outflows_column='tot_outflow', relevance_column='relevance', out_format='flows')

Start the simulation of the Radiation model.

Parameters
  • spatial_tessellation (GeoDataFrame) – the spatial tessellation on which to perform the simulation.

  • tile_id_column (str, optional) – the column in spatial_tessellation of the location identifier. The default is constants.TILE_ID.

  • tot_outflows_column (str, optional) – the column in spatial_tessellation with the outflow of the location. The default is constants.TOT_OUTFLOW.

  • relevance_column (str, optional) – the column in spatial_tessellation with the relevance of the location. The default is constants.RELEVANCE.

  • out_format (str, optional) – the format of the generated flows. Possible values are: “flows” (average flow between two locations), “flows_sample” (random sample of flows), and “probabilities” (probability of a unit flow between two locations). The default is “flows”.

Returns

the flows generated by the Radiation model.

Return type

FlowDataFrame

GeoSim

class skmob.models.geosim.GeoSim(name='GeoSim', rho=0.6, gamma=0.21, alpha=0.2, beta=0.8, tau=17, min_wait_time_hours=1)

GeoSim model.

The GeoSim model of individual human mobility consists of the following mechanisms [THSG2015]:

Waiting time choice. The waiting time \(\Delta t\) between two movements of the agent is chosen randomly from the distribution \(P(\Delta t) \sim \Delta t^{-1 -\beta} \exp(-\Delta t/ \tau)\). Parameters \(\beta\) and \(\tau\) correspond to arguments beta and tau of the constructor, respectively.
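One way to draw from such a distribution is rejection sampling: propose from a pure power law and accept according to the exponential cutoff. The sketch below is only an illustration; the function sample_waiting_time and the lower bound min_wait (needed to make the power law normalizable) are assumptions, and the library may sample differently.

```python
import math
import random

def sample_waiting_time(beta=0.8, tau=17.0, min_wait=1.0, rng=random):
    """Draw dt (hours) with density ~ dt^(-1-beta) * exp(-dt/tau) for dt >= min_wait."""
    while True:
        u = 1.0 - rng.random()              # uniform in (0, 1]
        dt = min_wait * u ** (-1.0 / beta)  # Pareto proposal ~ dt^(-1-beta)
        # Accept with probability exp(-(dt - min_wait) / tau) <= 1,
        # which imposes the exponential cutoff on the proposal.
        if rng.random() < math.exp(-(dt - min_wait) / tau):
            return dt

rng = random.Random(42)
waits = [sample_waiting_time(rng=rng) for _ in range(1000)]
```

Every sample is at least min_wait, and very long waits become increasingly rare as they exceed \(\tau\).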

Action selection. With probability \(P_{exp}=\rho S^{-\gamma}\), where \(S\) is the number of distinct locations previously visited by the agent, the agent visits a new location (Exploration); otherwise, with complementary probability \(P_{ret}=1-P_{exp}\), it returns to a previously visited location (Return). The agent then determines whether its choice of location is affected by the other agents: with probability \(\alpha\), the agent’s social contacts influence its movement (Social); with complementary probability \(1-\alpha\), the agent’s choice is not influenced by the other agents (Individual).

Parameters \(\rho\), \(\gamma\), and \(\alpha\) correspond to arguments rho, gamma, and alpha of the constructor, respectively.
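The two random decisions of the Action selection mechanism can be sketched as follows. This is a minimal illustration, not the library's code: the function names are hypothetical, and the sketch assumes the agent has already visited at least one location (\(S \geq 1\)).

```python
import random

def exploration_probability(n_visited, rho=0.6, gamma=0.21):
    """P_exp = rho * S^(-gamma) for S = n_visited >= 1 distinct locations."""
    return rho * n_visited ** (-gamma)

def select_action(n_visited, rho=0.6, gamma=0.21, alpha=0.2, rng=random):
    """Return the (spatial, social) mechanisms chosen for the next move."""
    p_exp = exploration_probability(n_visited, rho, gamma)
    spatial = "exploration" if rng.random() < p_exp else "return"
    social = "social" if rng.random() < alpha else "individual"
    return spatial, social

# Example: an agent that has visited 10 distinct locations.
action = select_action(10, rng=random.Random(1))
```

Because \(P_{exp}\) decreases with \(S\), agents explore less and return more as their set of visited locations grows.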

After selecting the spatial mechanism (Exploration or Return) and the social mechanism (Individual or Social), the agent decides which location will be the destination of its next displacement during the Location selection phase. For an agent \(a\), we denote the sets containing the indices of the locations \(a\) can explore or return to as \(exp_{a}\) and \(ret_{a}\), respectively.

Individual Exploration. If the agent \(a\) is currently in location \(i\) and explores a new location without the influence of its social contacts, then the new location \(j \neq i\) is an unvisited location for the agent (\(j \in exp_{a}\)) and it is selected uniformly at random, i.e., with probability \(\frac{1}{|exp_{a}|}\). The number of distinct locations visited, \(S\), is increased by 1.

Social Exploration. If the agent \(a\) is currently in location \(i\), and explores a new location with the influence of a social contact, it first selects a social contact \(c\) with probability \(p(c) \propto mob_{sim}(a,c)\) [THSG2015]. At this point, the agent \(a\) explores an unvisited location for agent \(a\) that was visited by agent \(c\), i.e., the location \(j \neq i\) is selected from set \(A = exp_a \cap ret_c\); the probability \(p(j)\) for a location \(j \in A\), to be selected is proportional to \(\Pi_j = f_j\), where \(f_j\) is the visitation frequency of location \(j\) for the agent \(c\). The number of distinct locations visited, \(S\), is increased by 1.

Individual Return. If the agent \(a\), currently at location \(i\), returns to a previously visited location \(j \in ret_a\), it is chosen with probability proportional to the number of times the agent visited \(j\), i.e., \(\Pi_j = f_j\), where \(f_j\) is the visitation frequency of location \(j\).

Social Return. If the agent \(a\) is currently in location \(i\), and returns to a previously visited location with the influence of a social contact, it first selects a social contact \(c\) with probability \(p(c) \propto mob_{sim}(a,c)\) [THSG2015]. At this point, the agent \(a\) returns to a previously visited location for agent \(a\) that was visited by agent \(c\) too, i.e., the location \(j \neq i\) is selected from set \(A = ret_a \cap ret_c\); the probability \(p(j)\) for a location \(j \in A\), to be selected is proportional to \(\Pi_j = f_j\), where \(f_j\) is the visitation frequency of location \(j\) for the agent \(c\).
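The four location-selection mechanisms differ only in the candidate set and in the weights assigned to each candidate. The sketch below makes this explicit; it is an illustration under stated assumptions, not the library's implementation. The function location_probabilities, the dictionary-based visit counts, and the exclusion of the current location in every case are choices made for this example.

```python
def location_probabilities(spatial, social, visits_a, all_locations,
                           current, visits_c=None):
    """Return {location: probability} for the next move of agent a.

    visits_a and visits_c map location -> number of visits (f_j)
    for agent a and for the selected contact c, respectively.
    """
    ret_a = set(visits_a)
    exp_a = set(all_locations) - ret_a
    if social == "individual":
        if spatial == "exploration":
            weights = {j: 1.0 for j in exp_a}                 # uniform over exp_a
        else:
            weights = {j: float(visits_a[j]) for j in ret_a}  # proportional to f_j
    else:
        ret_c = set(visits_c)
        feasible = (exp_a if spatial == "exploration" else ret_a) & ret_c
        weights = {j: float(visits_c[j]) for j in feasible}   # c's frequencies
    weights.pop(current, None)  # assumption: never select the current location
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()} if total > 0 else {}

visits_a = {"home": 10, "work": 6}
visits_c = {"work": 3, "gym": 4, "park": 1}
locations = ["home", "work", "gym", "park", "mall"]
social_exp = location_probabilities("exploration", "social",
                                    visits_a, locations, "home", visits_c)
indiv_exp = location_probabilities("exploration", "individual",
                                   visits_a, locations, "home")
```

An empty result means that no location is feasible for the chosen mechanisms, which is the situation the action-correction phase is meant to handle.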

Parameters
  • name (str, optional) – the name of the instantiation of the GeoSim model. The default value is “GeoSim”.

  • rho (float, optional) – it corresponds to the parameter \(\rho \in (0, 1]\) in the Action selection mechanism \(P_{exp} = \rho S^{-\gamma}\) and controls the agent’s tendency to explore a new location during the next move versus returning to a previously visited location. The default value is \(\rho = 0.6\) [SKWB2010].

  • gamma (float, optional) – it corresponds to the parameter \(\gamma\) (\(\gamma \geq 0\)) in the Action selection mechanism \(P_{exp} = \rho S^{-\gamma}\) and controls the agent’s tendency to explore a new location during the next move versus returning to a previously visited location. The default value is \(\gamma=0.21\) [SKWB2010].

  • alpha (float, optional) – it corresponds to the parameter \(\alpha\) in the Action selection mechanism and controls the influence of the social contacts for an agent during its location selection phase. The default value is \(\alpha=0.2\) [THSG2015].

  • beta (float, optional) – it corresponds to the parameter \(\beta\) of the waiting time distribution in the Waiting time choice mechanism. The default value is \(\beta=0.8\) [SKWB2010].

  • tau (int, optional) – it corresponds to the parameter \(\tau\) of the waiting time distribution in the Waiting time choice mechanism. The default value is \(\tau = 17\), expressed in hours [SKWB2010].

  • min_wait_time_hours (int, optional) – the minimum waiting time between two movements, in hours. The default is 1.

Variables
  • name (str) – the name of the instantiation of the model.

  • rho (float) – the input parameter \(\rho\).

  • gamma (float) – the input parameter \(\gamma\).

  • alpha (float) – the input parameter \(\alpha\).

  • beta (float) – the input parameter \(\beta\).

  • tau (int) – the input parameter \(\tau\).

  • min_wait_time_hours (int) – the input parameter min_wait_time_hours.

References

PSRPGB2015

Pappalardo, L., Simini, F., Rinzivillo, S., Pedreschi, D., Giannotti, F. & Barabasi, A. L. (2015) Returners and Explorers dichotomy in human mobility. Nature Communications 6, https://www.nature.com/articles/ncomms9166

PSR2016

Pappalardo, L., Simini, F. & Rinzivillo, S. (2016) Human Mobility Modelling: exploration and preferential return meet the gravity model. Procedia Computer Science 83, https://www.sciencedirect.com/science/article/pii/S1877050916302216

SKWB2010

Song, C., Koren, T., Wang, P. & Barabasi, A.L. (2010) Modelling the scaling properties of human mobility. Nature Physics 6, 818-823, https://www.nature.com/articles/nphys1760

THSG2015

Toole, Jameson & Herrera-Yague, Carlos & Schneider, Christian & Gonzalez, Marta C. (2015). Coupling Human Mobility and Social Ties. Journal of the Royal Society, Interface / the Royal Society. 12. 10.1098/rsif.2014.1128.

See also

EPR, SpatialEPR, Ditras

action_correction(agent, choice)

The implementation of the action-correction phase, executed by an agent if the location selection phase does not allow a movement to any location.

cosine_similarity(x, y)

Cosine Similarity (x,y) = <x,y>/(||x||*||y||)
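The formula above can be written out in a few lines. The sketch below is a standalone illustration rather than the method's source; in the GeoSim setting, x and y would plausibly be the location-visit vectors of two agents, and returning 0.0 for an all-zero vector is an assumption made here to avoid a division by zero.

```python
import math

def cosine_similarity(x, y):
    """<x, y> / (||x|| * ||y||); 0.0 if either vector is all zeros (assumption)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return dot / (norm_x * norm_y)

# Visit counts of two agents over the same four locations (toy values).
sim = cosine_similarity([3, 0, 1, 0], [1, 0, 2, 5])
```

Two agents with proportional visit vectors have similarity 1, while agents with no visited location in common have similarity 0.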

generate(start_date, end_date, spatial_tessellation, social_graph='random', n_agents=500, dt_update_mobSim=168, indipendency_window=0.5, random_state=None, log_file=None, verbose=0, show_progress=False)

Start the simulation of a set of agents at time start_date till time end_date.

Parameters
  • start_date (datetime) – the starting date of the simulation, in “YYYY/mm/dd HH:MM:SS” format.

  • end_date (datetime) – the ending date of the simulation, in “YYYY/mm/dd HH:MM:SS” format.

  • spatial_tessellation (pandas DataFrame or geopandas GeoDataFrame) – the spatial tessellation, i.e., a division of the territory in locations.

  • social_graph (“random” or an edge list) – the social graph describing the sociality of the agents. The default is “random”.

  • n_agents (int, optional) – the number of agents to generate. If social_graph is “random”, n_agents are initialized and connected, otherwise the number of agents is inferred from the edge list. The default is 500.

  • random_state (int or None, optional) – if int, it is the seed used by the random number generator; if None, the random number generator is the RandomState instance used by np.random and random.random. The default is None.

  • dt_update_mobSim (float, optional) – the time interval (in hours) that specifies how often to update the weights of the social graph. The default is 24*7=168 (one week).

  • indipendency_window (float, optional) – the time window (in hours) that must elapse before an agent’s movements can affect the movements of other agents in the simulation. The default is 0.5.

  • log_file (str or None, optional) – the name of the file where to write a log of the execution of the model. The logfile will contain all decisions made by the model. The default is None.

  • verbose (int, optional) – the verbosity level of the model relative to the standard output. If verbose is equal to 2, the initialization info and the decisions made by the model are printed; if verbose is equal to 1, only the initialization info is reported. The default is 0.

  • show_progress (boolean, optional) – if True, show a progress bar. The default is False.

Returns

the synthetic trajectories generated by the model

Return type

TrajDataFrame

make_individual_exploration_action(agent)

The agent A selects uniformly at random an UNVISITED location (i.e., in exp(A))

make_individual_return_action(agent)

The agent A makes a preferential choice selecting a VISITED location (i.e., in ret(A)) with probability proportional to the number of visits to that location.

make_social_action(agent, mode)

The agent A makes a social choice in the following way:

1. The agent A selects a social contact C with probability proportional to the mobility similarity between them

2. The candidate location to visit or explore is selected from the set composed of the locations visited by C (ret(C)), that are feasible according to A’s action:

  • exploration: exp(A) intersect ret(C)

  • return: ret(A) intersect ret(C)

3. The agent A selects one of the feasible locations (if any) with a probability proportional to C’s visitation frequency
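The three steps can be sketched as two weighted random draws. This is an illustration only: the function make_social_choice and its arguments are hypothetical, the similarities are assumed to be precomputed, and returning None when nothing is feasible stands in for whatever fallback the model applies.

```python
import random

def make_social_choice(similarity, contact_visits, feasible, rng=random):
    """Pick a location for agent A through a social contact, or None.

    similarity: contact -> mobility similarity with A.
    contact_visits: contact -> {location: number of visits}.
    feasible: exp(A) for an exploration, ret(A) for a return.
    """
    contacts = [c for c, s in similarity.items() if s > 0]
    if not contacts:
        return None
    # Step 1: choose a contact C with probability proportional to the similarity.
    contact = rng.choices(contacts, weights=[similarity[c] for c in contacts], k=1)[0]
    # Step 2: keep the locations visited by C that are feasible for A.
    visits_c = contact_visits[contact]
    candidates = [loc for loc in visits_c if loc in feasible]
    if not candidates:
        return None
    # Step 3: choose a location with probability proportional to C's visits.
    return rng.choices(candidates, weights=[visits_c[loc] for loc in candidates], k=1)[0]

toy_similarity = {"c1": 1.0, "c2": 0.0}
toy_visits = {"c1": {"gym": 4, "work": 2}, "c2": {"park": 1}}
choice = make_social_choice(toy_similarity, toy_visits, {"gym"}, random.Random(0))
```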

STS-EPR

class skmob.models.sts_epr.STS_epr(name='STS-EPR', rho=0.6, gamma=0.21, alpha=0.2)

STS-EPR model.

The STS-EPR (Spatial, Temporal and Social EPR) model of individual human mobility consists of the following mechanisms [CRP2020]:

Action selection. With probability \(P_{exp}=\rho S^{-\gamma}\), where \(S\) is the number of distinct locations previously visited by the agent, the agent visits a new location (Exploration); otherwise, with complementary probability \(P_{ret}=1-P_{exp}\), it returns to a previously visited location (Return). The agent then determines whether its choice of location is affected by the other agents: with probability \(\alpha\), the agent’s social contacts influence its movement (Social); with complementary probability \(1-\alpha\), the agent’s choice is not influenced by the other agents (Individual).

Parameters \(\rho\), \(\gamma\), and \(\alpha\) correspond to arguments rho, gamma, and alpha of the constructor, respectively.

After selecting the spatial mechanism (Exploration or Return) and the social mechanism (Individual or Social), the agent decides which location will be the destination of its next displacement during the Location selection phase. For an agent \(a\), we denote the sets containing the indices of the locations \(a\) can explore or return to as \(exp_{a}\) and \(ret_{a}\), respectively.

Individual Exploration. If the agent \(a\) is currently in location \(i\) and explores a new location without the influence of its social contacts, then the new location \(j \neq i\) is an unvisited location for the agent (\(j \in exp_{a}\)) and it is selected according to the gravity model with probability proportional to \(p_{ij} = \frac{r_i r_j}{dist_{ij}^2}\), where \(r_{i (j)}\) is the location’s relevance, that is, the probability of a population to visit location \(i(j)\), and \(dist_{ij}\) is the geographic distance between \(i\) and \(j\). The number of distinct locations visited, \(S\), is increased by 1.
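The gravity-based choice among unvisited locations can be sketched as below. This is an illustrative sketch, not the model's code: the function names, the use of the haversine distance in kilometres, and the toy coordinates and relevances are assumptions, and the locations are assumed to have distinct coordinates.

```python
import math

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lng) points."""
    lat1, lng1, lat2, lng2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def gravity_exploration_probabilities(current, unvisited, coords, relevance):
    """Normalize r_i * r_j / dist_ij^2 over the unvisited locations j != i."""
    weights = {}
    for j in unvisited:
        if j == current:
            continue
        d = haversine_km(coords[current], coords[j])
        weights[j] = relevance[current] * relevance[j] / d ** 2
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

toy_coords = {"a": (43.0, 10.0), "b": (43.1, 10.0), "c": (44.0, 10.0), "d": (43.0, 10.2)}
toy_relevance = {"a": 1.0, "b": 2.0, "c": 2.0, "d": 5.0}
# The agent is in "a" and has already visited "d".
probs = gravity_exploration_probabilities("a", ["b", "c"], toy_coords, toy_relevance)
```

With equal relevance, the nearer location b is far more likely to be chosen than c, since the weight decays with the square of the distance.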

Social Exploration. If the agent \(a\) is currently in location \(i\), and explores a new location with the influence of a social contact, it first selects a social contact \(c\) with probability \(p(c) \propto mob_{sim}(a,c)\) [THSG2015]. At this point, the agent \(a\) explores an unvisited location for agent \(a\) that was visited by agent \(c\), i.e., the location \(j \neq i\) is selected from set \(A = exp_a \cap ret_c\); the probability \(p(j)\) for a location \(j \in A\), to be selected is proportional to \(\Pi_j = f_j\), where \(f_j\) is the visitation frequency of location \(j\) for the agent \(c\). The number of distinct locations visited, \(S\), is increased by 1.

Individual Return. If the agent \(a\), currently at location \(i\), returns to a previously visited location \(j \in ret_a\), it is chosen with probability proportional to the number of times the agent visited \(j\), i.e., \(\Pi_j = f_j\), where \(f_j\) is the visitation frequency of location \(j\).

Social Return. If the agent \(a\) is currently in location \(i\), and returns to a previously visited location with the influence of a social contact, it first selects a social contact \(c\) with probability \(p(c) \propto mob_{sim}(a,c)\) [THSG2015]. At this point, the agent \(a\) returns to a previously visited location for agent \(a\) that was visited by agent \(c\) too, i.e., the location \(j \neq i\) is selected from set \(A = ret_a \cap ret_c\); the probability \(p(j)\) for a location \(j \in A\), to be selected is proportional to \(\Pi_j = f_j\), where \(f_j\) is the visitation frequency of location \(j\) for the agent \(c\).

Parameters
  • name (str, optional) – the name of the instantiation of the STS-EPR model. The default value is “STS-EPR”.

  • rho (float, optional) – it corresponds to the parameter \(\rho \in (0, 1]\) in the Action selection mechanism \(P_{exp} = \rho S^{-\gamma}\) and controls the agent’s tendency to explore a new location during the next move versus returning to a previously visited location. The default value is \(\rho = 0.6\) [SKWB2010].

  • gamma (float, optional) – it corresponds to the parameter \(\gamma\) (\(\gamma \geq 0\)) in the Action selection mechanism \(P_{exp} = \rho S^{-\gamma}\) and controls the agent’s tendency to explore a new location during the next move versus returning to a previously visited location. The default value is \(\gamma=0.21\) [SKWB2010].

  • alpha (float, optional) – it corresponds to the parameter \(\alpha\) in the Action selection mechanism and controls the influence of the social contacts for an agent during its location selection phase. The default value is \(\alpha=0.2\) [THSG2015].

Variables
  • name (str) – the name of the instantiation of the model.

  • rho (float) – the input parameter \(\rho\).

  • gamma (float) – the input parameter \(\gamma\).

  • alpha (float) – the input parameter \(\alpha\).

References

PSRPGB2015

Pappalardo, L., Simini, F., Rinzivillo, S., Pedreschi, D., Giannotti, F. & Barabasi, A. L. (2015) Returners and Explorers dichotomy in human mobility. Nature Communications 6, https://www.nature.com/articles/ncomms9166

PSR2016

Pappalardo, L., Simini, F. & Rinzivillo, S. (2016) Human Mobility Modelling: exploration and preferential return meet the gravity model. Procedia Computer Science 83, https://www.sciencedirect.com/science/article/pii/S1877050916302216

SKWB2010

Song, C., Koren, T., Wang, P. & Barabasi, A.L. (2010) Modelling the scaling properties of human mobility. Nature Physics 6, 818-823, https://www.nature.com/articles/nphys1760

THSG2015

Toole, Jameson & Herrera-Yague, Carlos & Schneider, Christian & Gonzalez, Marta C. (2015). Coupling Human Mobility and Social Ties. Journal of the Royal Society, Interface / the Royal Society. 12. 10.1098/rsif.2014.1128.

CRP2020

Cornacchia, Giuliano & Rossetti, Giulio & Pappalardo, Luca. (2020). Modelling Human Mobility considering Spatial, Temporal and Social Dimensions.

PS2018

Pappalardo, L. & Simini, F. (2018) Data-driven generation of spatio-temporal routines in human mobility. Data Mining and Knowledge Discovery 32, 787-829, https://link.springer.com/article/10.1007/s10618-017-0548-4

See also

EPR, SpatialEPR, Ditras

action_correction_diary(agent, choice)

The implementation of the action-correction phase, executed by an agent if the location selection phase does not allow a movement to any location.

cosine_similarity(x, y)

Cosine Similarity (x,y) = <x,y>/(||x||*||y||)

generate(start_date, end_date, spatial_tessellation, diary_generator, social_graph='random', n_agents=500, rsl=False, distance_matrix=None, relevance_column=None, min_relevance=0.1, dt_update_mobSim=168, indipendency_window=0.5, random_state=None, log_file=None, verbose=0, show_progress=False)

Start the simulation of a set of agents at time start_date till time end_date.

Parameters
  • start_date (datetime) – the starting date of the simulation, in “YYYY/mm/dd HH:MM:SS” format.

  • end_date (datetime) – the ending date of the simulation, in “YYYY/mm/dd HH:MM:SS” format.

  • spatial_tessellation (pandas DataFrame or geopandas GeoDataFrame) – the spatial tessellation, i.e., a division of the territory in locations.

  • diary_generator (MarkovDiaryGenerator) – the diary generator to use for generating the mobility diary [PS2018].

  • social_graph (“random” or an edge list) – the social graph describing the sociality of the agents. The default is “random”.

  • n_agents (int, optional) – the number of agents to generate. If social_graph is “random”, n_agents are initialized and connected, otherwise the number of agents is inferred from the edge list. The default is 500.

  • rsl (bool, optional) – if True, the probability \(p(i)\) for an agent of being assigned to a starting physical location \(i\) is proportional to the relevance of location \(i\); otherwise, if False, it is selected uniformly at random. The default is False.

  • distance_matrix (numpy array or None, optional) – the origin destination matrix to use for deciding the movements of the agent. If None, it is computed “on the fly” during the simulation. The default is None.

  • relevance_column (str, optional) – the name of the column in spatial_tessellation to use as relevance variable. The default is “relevance”.

  • min_relevance (float, optional) – the value to which a null relevance is mapped. The default is 0.1.

  • random_state (int or None, optional) – if int, it is the seed used by the random number generator; if None, the random number generator is the RandomState instance used by np.random and random.random. The default is None.

  • dt_update_mobSim (float, optional) – the time interval (in hours) that specifies how often to update the weights of the social graph. The default is 24*7=168 (one week).

  • indipendency_window (float, optional) – the time window (in hours) that must elapse before an agent’s movements can affect the movements of other agents in the simulation. The default is 0.5.

  • log_file (str or None, optional) – the name of the file where to write a log of the execution of the model. The logfile will contain all decisions made by the model. The default is None.

  • verbose (int, optional) – the verbosity level of the model relative to the standard output. If verbose is equal to 2, the initialization info and the decisions made by the model are printed; if verbose is equal to 1, only the initialization info is reported. The default is 0.

  • show_progress (boolean, optional) – if True, show a progress bar. The default is False.

Returns

the synthetic trajectories generated by the model

Return type

TrajDataFrame

make_individual_exploration_action(agent)

The agent A, currently at location i, selects an UNVISITED location j (i.e., in exp(A)) with probability proportional to (r_i * r_j) / d_ij^2

make_individual_return_action(agent)

The agent A makes a preferential choice selecting a VISITED location (i.e., in ret(A)) with probability proportional to the number of visits to that location.

make_social_action(agent, mode)

The agent A makes a social choice in the following way:

1. The agent A selects a social contact C with probability proportional to the mobility similarity between them

2. The candidate location to visit or explore is selected from the set composed of the locations visited by C (ret(C)), that are feasible according to A’s action:

  • exploration: exp(A) intersect ret(C)

  • return: ret(A) intersect ret(C)

3. The agent A selects one of the feasible locations (if any) with a probability proportional to C’s visitation frequency