Collective measures

`random_location_entropy`(traj[, show_progress])	Random location entropy.
`uncorrelated_location_entropy`(traj[, ...])	Temporal-uncorrelated entropy.
`mean_square_displacement`(traj[, days, ...])	Mean Square Displacement.
`visits_per_location`(traj)	Visits per location.
`homes_per_location`(traj[, start_night, ...])	Homes per location.
`visits_per_time_unit`(traj[, time_unit])	Visits per time unit.

skmob.measures.collective.homes_per_location(traj, start_night='22:00', end_night='07:00')

Homes per location.

Compute the number of homes in each location, i.e., how many individuals have their home location there. The number of homes in a location \(j\) is computed as [PRS2016]:

\[N_{homes}(j) = |\{h_u | h_u = j, u \in U \}|\]

where \(h_u\) indicates the home location of an individual \(u\) and \(U\) is the set of individuals.
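As an illustration of the formula above, the count can be sketched in plain Python. This is a minimal sketch, not the library's implementation: the `homes` mapping and its user ids are hypothetical, and it assumes each individual's home location has already been determined.

```python
from collections import Counter

# Hypothetical home assignments: user id -> (lat, lng) of the home location h_u.
homes = {
    "u1": (39.74, -104.98),
    "u2": (39.74, -104.98),
    "u3": (37.77, -122.42),
}

# N_homes(j) is the number of individuals u whose home h_u equals location j.
n_homes = Counter(homes.values())

# List locations from the most to the least populated.
for location, count in n_homes.most_common():
    print(location, count)
```

Every individual contributes exactly one home, so the counts always sum to the number of individuals.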

Parameters

traj (TrajDataFrame) – the trajectories of the individuals.
start_night (str, optional) – the starting time of the night (format HH:MM). The default is ‘22:00’.
end_night (str, optional) – the ending time for the night (format HH:MM). The default is ‘07:00’.

Returns

the number of homes per location.

Return type

pandas DataFrame

Examples

>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.collective import homes_per_location
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user').sort_values(by='datetime')
>>> hl_df = homes_per_location(tdf).sort_values(by='n_homes', ascending=False)
>>> print(hl_df.head())
         lat         lng  n_homes
0  39.739154 -104.984703       15
1  37.584103 -122.366083        6
2  40.014986 -105.270546        5
3  37.580304 -122.343679        5
4  37.774929 -122.419415        4

References

PRS2016: Pappalardo, L., Rinzivillo, S. & Simini, F. (2016) Human Mobility Modelling: exploration and preferential return meet the gravity model. Procedia Computer Science 83, 934-939, http://dx.doi.org/10.1016/j.procs.2016.04.188

skmob.measures.collective.mean_square_displacement(traj, days=0, hours=1, minutes=0, show_progress=True)

Mean Square Displacement.

Compute the mean square displacement across the individuals in a TrajDataFrame. The mean squared displacement is a measure of the deviation of the position of an object with respect to a reference position over time [BHG2006] [SKWB2010]. It is defined as:

\[MSD = \langle |r(t) - r(0)|^2 \rangle = \frac{1}{N} \sum_{i = 1}^N |r^{(i)}(t) - r^{(i)}(0)|^2\]

where \(N\) is the number of individuals to be averaged, vector \(r^{(i)}(0)\) is the reference position of the \(i\)-th individual, and vector \(r^{(i)}(t)\) is the position of the \(i\)-th individual at time \(t\) [FS2002].
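The averaging in this definition can be sketched in a few lines of Python. This is a hedged illustration only: the helper `mean_square_displacement_sketch` is hypothetical, and it assumes planar coordinates with Euclidean distance for simplicity, which is not necessarily how the library measures the distance between latitude/longitude pairs.

```python
def mean_square_displacement_sketch(start_positions, end_positions):
    """Average the squared distance between r(0) and r(t) over individuals.

    Assumes planar (x, y) coordinates; this is an illustrative simplification.
    """
    squared = [
        (x1 - x0) ** 2 + (y1 - y0) ** 2
        for (x0, y0), (x1, y1) in zip(start_positions, end_positions)
    ]
    return sum(squared) / len(squared)


# Reference positions r(0) and positions r(t) for three hypothetical individuals.
r0 = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
rt = [(3.0, 4.0), (1.0, 2.0), (2.0, 0.0)]

# Squared displacements are 25, 1 and 0, so the mean is 26 / 3.
msd = mean_square_displacement_sketch(r0, rt)
print(msd)
```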

Parameters

traj (TrajDataFrame) – the trajectories of the individuals.
days (int, optional) – the days component of the time elapsed since the starting time. The default is 0.
hours (int, optional) – the hours component of the time elapsed since the starting time, added to days. The default is 1.
minutes (int, optional) – the minutes component of the time elapsed since the starting time, added to days and hours. The default is 0.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.

Returns

the mean square displacement.

Return type

float

Warning

The input TrajDataFrame must be sorted in ascending order by datetime.

Examples

>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.collective import mean_square_displacement
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user').sort_values(by='datetime')
>>> msd = mean_square_displacement(tdf, days=0, hours=1, minutes=0)
>>> print(msd)
534672.3361996822

References

FS2002: Frenkel, D. & Smit, B. (2002) Understanding molecular simulation: From algorithms to applications. Academic Press, 196 (2nd Ed.), https://www.sciencedirect.com/book/9780122673511/understanding-molecular-simulation.
BHG2006: Brockmann, D., Hufnagel, L. & Geisel, T. (2006) The scaling laws of human travel. Nature 439, 462-465, https://www.nature.com/articles/nature04292
SKWB2010: Song, C., Koren, T., Wang, P. & Barabasi, A.L. (2010) Modelling the scaling properties of human mobility. Nature Physics 6, 818-823, https://www.nature.com/articles/nphys1760

skmob.measures.collective.random_location_entropy(traj, show_progress=True)

Random location entropy.

Compute the random location entropy of the locations in a TrajDataFrame. The random location entropy of a location \(j\) captures the degree of predictability of \(j\) if each individual visits it with equal probability, and it is defined as:

\[LE_{rand}(j) = \log_2(N_j)\]

where \(N_j\) is the number of distinct individuals that visited location \(j\).
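The definition can be illustrated with a short pure-Python sketch. It is not the library's implementation: the `visits` list and its user and location labels are hypothetical, and locations are identified by a label rather than by a latitude/longitude pair.

```python
import math
from collections import defaultdict

# Hypothetical visits as (user, location) pairs; repeated visits are allowed.
visits = [
    ("u1", "A"), ("u2", "A"), ("u1", "A"), ("u3", "A"), ("u4", "A"),
    ("u1", "B"),
]

# Collect the set of distinct individuals that visited each location.
visitors = defaultdict(set)
for user, location in visits:
    visitors[location].add(user)

# LE_rand(j) = log2(N_j), where N_j is the number of distinct visitors of j.
random_entropy = {
    location: math.log2(len(users)) for location, users in visitors.items()
}
print(random_entropy)
```

Only the number of distinct visitors matters here, so the repeated visit of `u1` to `A` does not change the result, and a location with a single visitor has entropy 0.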

Parameters

traj (TrajDataFrame) – the trajectories of the individuals.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.

Returns

the random location entropy of the locations.

Return type

pandas DataFrame

Examples

>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.collective import random_location_entropy
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000,
...                  names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> rle_df = random_location_entropy(tdf, show_progress=True).sort_values(by='random_location_entropy', ascending=False)
>>> print(rle_df.head())
             lat         lng  random_location_entropy
10286  39.739154 -104.984703                 6.129283
49      0.000000    0.000000                 5.643856
5991   37.774929 -122.419415                 5.523562
12504  39.878664 -104.682105                 5.491853
5377   37.615223 -122.389979                 5.247928

See also

uncorrelated_location_entropy

skmob.measures.collective.uncorrelated_location_entropy(traj, normalize=False, show_progress=True)

Temporal-uncorrelated entropy.

Compute the temporal-uncorrelated location entropy of the locations in a TrajDataFrame. The temporal-uncorrelated location entropy \(LE_{unc}(j)\) of a location \(j\) is the entropy of the historical distribution of visits to \(j\) across the individuals who visited it. Formally, it is defined as [CML2011]:

\[LE_{unc}(j) = -\sum_{u=1}^{N_j} p_u \log_2(p_u)\]

where \(N_j\) is the number of distinct individuals that visited \(j\) and \(p_u\) is the historical probability that a visit to location \(j\) is by individual \(u\).
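To make the definition concrete, the entropy of a single location can be sketched in plain Python from the list of individuals who visited it. This is an illustrative sketch, not the library's code: the helper `uncorrelated_entropy_sketch` and the sample data are hypothetical, and each probability is estimated as an individual's share of the visits to the location.

```python
import math
from collections import Counter


def uncorrelated_entropy_sketch(visit_users, normalize=False):
    """Entropy of the distribution of visits to one location across individuals.

    visit_users lists the user id of every visit to the location.
    """
    counts = Counter(visit_users)
    total = sum(counts.values())
    # Probability that a visit is by a given individual: that individual's share of visits.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    # Optional normalization by log2(N_j); skipped when N_j == 1 to avoid dividing by 0.
    if normalize and len(counts) > 1:
        entropy /= math.log2(len(counts))
    return entropy


# Four visits to a location j: two by u1, one each by u2 and u3.
visits_to_j = ["u1", "u1", "u2", "u3"]
le_unc = uncorrelated_entropy_sketch(visits_to_j)
le_unc_norm = uncorrelated_entropy_sketch(visits_to_j, normalize=True)
print(le_unc, le_unc_norm)
```

With shares 0.5, 0.25 and 0.25 the unnormalized entropy is 1.5 bits. It reaches its maximum, \(\log_2(N_j)\), when all individuals contribute the same number of visits, which is why the normalized value lies between 0 and 1.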

Parameters

traj (TrajDataFrame) – the trajectories of the individuals.
normalize (boolean, optional) – if True, normalize the location entropy by dividing by \(\log_2(N_j)\), where \(N_j\) is the number of distinct individuals that visited location \(j\). The default is False.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.

Returns

the temporal-uncorrelated location entropies of the locations.

Return type

pandas DataFrame

Examples

>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.collective import uncorrelated_location_entropy
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000,
...                  names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> ule_df = uncorrelated_location_entropy(tdf, show_progress=True).sort_values(by='uncorrelated_location_entropy', ascending=False)
>>> print(ule_df.head())
             lat         lng  uncorrelated_location_entropy
12504  39.878664 -104.682105                       3.415713
5377   37.615223 -122.389979                       3.176950
10286  39.739154 -104.984703                       3.118656
12435  39.861656 -104.673177                       2.918413
12361  39.848233 -104.675031                       2.899175

References

CML2011: Cho, E., Myers, S. A. & Leskovec, J. (2011) Friendship and mobility: user movement in location-based social networks. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, 1082-1090, https://dl.acm.org/citation.cfm?id=2020579