Individual measures
Radius of gyration.
k-radius of gyration.
Random entropy.
Uncorrelated entropy.
Real entropy.
Jump lengths.
Maximum distance.
Distance straight line.
Waiting times.
Number of distinct locations.
Home location.
Maximum distance from home.
Number of visits.
Location frequency.
Individual Mobility Network.
Recency rank.
Frequency rank.
- skmob.measures.individual.distance_straight_line(traj, show_progress=True)
Distance straight line.
Compute the straight-line distance (in kilometers) travelled by a set of individuals in a TrajDataFrame. The straight-line distance \(d_{SL}\) travelled by an individual \(u\) is computed as the sum of the distances between consecutive points in \(u\)’s trajectory:
\[d_{SL} = \sum_{j=2}^{n_u} dist(r_{j-1}, r_j)\]where \(n_u\) is the number of points recorded for \(u\), \(r_{j-1}\) and \(r_j\) are two consecutive points, described as a \((latitude, longitude)\) pair, in \(u\)’s time-ordered trajectory, and \(dist\) is the geographic distance between the two points [WTDED2015].
- Parameters
traj (TrajDataFrame) – the trajectories of the individuals.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.
- Returns
the straight line distance traveled by the individuals. Note that \(NaN\) indicates that an individual visited just one location and hence distance is not defined.
- Return type
pandas DataFrame
Warning
The input TrajDataFrame must be sorted in ascending order by datetime.
Examples
>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.individual import distance_straight_line
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> dsl_df = distance_straight_line(tdf)
>>> print(dsl_df.head())
   uid  distance_straight_line
0    0           374530.954882
1    1           774346.816009
2    2            88710.682464
3    3           470986.771764
4    4           214623.524252
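To make the formula for \(d_{SL}\) concrete, here is a minimal pure-Python sketch for a single individual. It assumes the haversine formula with a mean Earth radius of 6371 km as the geographic distance \(dist\); the names `haversine_km` and `distance_straight_line_sketch` are hypothetical and are not part of skmob.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius for this sketch


def haversine_km(p, q):
    """Great-circle distance in km between two (latitude, longitude) pairs."""
    lat1, lng1, lat2, lng2 = map(radians, (p[0], p[1], q[0], q[1]))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lng2 - lng1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def distance_straight_line_sketch(points):
    """Sum the distances between consecutive points of a time-ordered trajectory.

    Returns NaN when fewer than two points are available.
    """
    if len(points) < 2:
        return float("nan")
    return sum(haversine_km(points[j - 1], points[j]) for j in range(1, len(points)))


# Three points along the equator, one degree of longitude apart.
trajectory = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
total_km = distance_straight_line_sketch(trajectory)
```

The input must already be sorted by time, which mirrors the warning above.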
- skmob.measures.individual.frequency_rank(traj, show_progress=True)
Frequency rank.
Compute the frequency rank of the locations of a set of individuals in a TrajDataFrame. The frequency rank \(K_f(r_i)\) of a location \(r_i\) of an individual \(u\) is \(K_f(r_i) = 1\) if location \(r_i\) is the most visited location, it is \(K_f(r_i) = 2\) if \(r_i\) is the second-most visited location, and so on [BDEM2015].
- Parameters
traj (TrajDataFrame) – the trajectories of the individuals.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.
- Returns
the frequency rank for each location of the individuals.
- Return type
pandas DataFrame
Examples
>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.individual import frequency_rank
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> fr_df = frequency_rank(tdf)
>>> print(fr_df.head())
             lat         lng  frequency_rank
uid
0   0  39.762146 -104.982480               1
    1  39.891077 -105.068532               2
    2  39.739154 -104.984703               3
    3  39.891586 -105.068463               4
    4  39.827022 -105.143191               5
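The ranking rule can be illustrated for one individual with the standard library alone. This is a sketch under the assumption that ties are broken by first appearance; `frequency_rank_sketch` is a hypothetical name, not a skmob function.

```python
from collections import Counter


def frequency_rank_sketch(points):
    """Map each location to its frequency rank (1 = most visited)."""
    counts = Counter(points)
    # most_common() orders by decreasing count; equal counts keep first-seen order.
    return {loc: rank for rank, (loc, _) in enumerate(counts.most_common(), start=1)}


visits = [(1.0, 1.0), (2.0, 2.0), (1.0, 1.0), (3.0, 3.0), (1.0, 1.0), (2.0, 2.0)]
ranks = frequency_rank_sketch(visits)
```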
- skmob.measures.individual.home_location(traj, start_night='22:00', end_night='07:00', show_progress=True)
Home location.
Compute the home location of a set of individuals in a TrajDataFrame. The home location \(h(u)\) of an individual \(u\) is defined as the location \(u\) visits the most during nighttime [CBTDHVSB2012] [PSO2012]:
\[h(u) = \arg\max_{i} |\{r_i | t(r_i) \in [t_{startnight}, t_{endnight}] \}|\] where \(r_i\) is a location visited by \(u\), \(t(r_i)\) is the time when \(u\) visited \(r_i\), and \(t_{startnight}\) and \(t_{endnight}\) indicate the times when nighttime starts and ends, respectively.
- Parameters
traj (TrajDataFrame) – the trajectories of the individuals.
start_night (str, optional) – the starting time of the night (format HH:MM). The default is ‘22:00’.
end_night (str, optional) – the ending time for the night (format HH:MM). The default is ‘07:00’.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.
- Returns
the home location, as a \((latitude, longitude)\) pair, of the individuals.
- Return type
pandas DataFrame
Examples
>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.individual import home_location
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> hl_df = home_location(tdf)
>>> print(hl_df.head())
   uid        lat         lng
0    0  39.891077 -105.068532
1    1  37.630490 -122.411084
2    2  39.739154 -104.984703
3    3  37.748170 -122.459192
4    4  60.180171   24.949728
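The definition of \(h(u)\) can be sketched for one individual as follows. The night window wraps past midnight with the default bounds, which the helper handles explicitly. The names `is_night` and `home_location_sketch` are hypothetical, and returning None when there are no nighttime visits is a choice of this sketch; the library may behave differently in that case.

```python
from collections import Counter
from datetime import datetime, time


def is_night(clock, start_night, end_night):
    """True if a clock time lies in the night window, which may wrap past midnight."""
    if start_night <= end_night:
        return start_night <= clock <= end_night
    return clock >= start_night or clock <= end_night


def home_location_sketch(visits, start_night=time(22, 0), end_night=time(7, 0)):
    """Return the location with the most nighttime visits, or None if there are none.

    visits is a list of (datetime, (latitude, longitude)) pairs.
    """
    night = Counter(loc for when, loc in visits if is_night(when.time(), start_night, end_night))
    return night.most_common(1)[0][0] if night else None


visits = [
    (datetime(2020, 1, 1, 23, 0), (1.0, 1.0)),
    (datetime(2020, 1, 2, 3, 0), (1.0, 1.0)),
    (datetime(2020, 1, 2, 6, 30), (3.0, 3.0)),
    (datetime(2020, 1, 2, 12, 0), (2.0, 2.0)),
    (datetime(2020, 1, 2, 13, 0), (2.0, 2.0)),
    (datetime(2020, 1, 2, 14, 0), (2.0, 2.0)),
]
home = home_location_sketch(visits)
```

Although (2.0, 2.0) is the most visited location overall, only nighttime visits count, so the sketch selects (1.0, 1.0).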
References
- CBTDHVSB2012
Csáji, B. C., Browet, A., Traag, V. A., Delvenne, J.-C., Huens, E., Van Dooren, P., Smoreda, Z. & Blondel, V. D. (2012) Exploring the Mobility of Mobile Phone Users. Physica A: Statistical Mechanics and its Applications 392(6), 1459-1473, https://www.sciencedirect.com/science/article/pii/S0378437112010059
- PSO2012
Phithakkitnukoon, S., Smoreda, Z. & Olivier, P. (2012) Socio-geography of human mobility: A study using longitudinal mobile phone data. PLOS ONE 7(6): e39253. https://doi.org/10.1371/journal.pone.0039253
- skmob.measures.individual.individual_mobility_network(traj, self_loops=False, show_progress=True)
Individual Mobility Network.
Compute the individual mobility network of a set of individuals in a TrajDataFrame. An Individual Mobility Network (IMN) of an individual \(u\) is a directed graph \(G_u=(V,E)\), where \(V\) is the set of nodes and \(E\) is the set of edges. Nodes indicate locations visited by \(u\), and edges indicate trips between two locations by \(u\). The following function is defined on the edges:
\[\omega: E \rightarrow \mathbb{N}\]which returns the weight of an edge, i.e., the number of travels performed by \(u\) on that edge [RGNPPG2014] [BL2012] [SQBB2010].
- Parameters
traj (TrajDataFrame) – the trajectories of the individuals.
self_loops (boolean, optional) – if True, also include self-loops. The default is False.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.
- Returns
the individual mobility network of each individual.
- Return type
pandas DataFrame
Warning
The input TrajDataFrame must be sorted in ascending order by datetime.
Examples
>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.individual import individual_mobility_network
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> imn_df = individual_mobility_network(tdf)
>>> print(imn_df.head())
   uid  lat_origin  lng_origin   lat_dest    lng_dest n_trips
0    0   37.774929 -122.419415  37.600747 -122.382376       1
1    0   37.600747 -122.382376  37.615223 -122.389979       1
2    0   37.600747 -122.382376  37.580304 -122.343679       1
3    0   37.615223 -122.389979  39.878664 -104.682105       1
4    0   37.615223 -122.389979  37.580304 -122.343679       1
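The edge weights \(\omega\) amount to counting consecutive (origin, destination) pairs in the time-ordered trajectory. A minimal sketch for one individual follows; `individual_mobility_network_sketch` is a hypothetical name, and the short string labels stand in for \((latitude, longitude)\) pairs.

```python
from collections import Counter


def individual_mobility_network_sketch(points, self_loops=False):
    """Count trips between consecutive locations of a time-ordered trajectory.

    Returns a Counter mapping (origin, destination) to the number of trips.
    """
    edges = Counter()
    for origin, dest in zip(points, points[1:]):
        if origin == dest and not self_loops:
            continue  # skip consecutive visits to the same location
        edges[(origin, dest)] += 1
    return edges


trajectory = ["A", "B", "A", "B", "B", "C"]
imn = individual_mobility_network_sketch(trajectory)
imn_with_loops = individual_mobility_network_sketch(trajectory, self_loops=True)
```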
References
- RGNPPG2014
Rinzivillo, S., Gabrielli, L., Nanni, M., Pappalardo, L., Pedreschi, D. & Giannotti, F. (2014) The purpose of motion: Learning activities from Individual Mobility Networks. Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics, 312-318, https://ieeexplore.ieee.org/document/7058090
- BL2012
Bagrow, J. P. & Lin, Y.-R. (2012) Mesoscopic Structure and Social Aspects of Human Mobility. PLOS ONE 7(5): e37676. https://doi.org/10.1371/journal.pone.0037676
- skmob.measures.individual.jump_lengths(traj, show_progress=True, merge=False)
Jump lengths.
Compute the jump lengths (in kilometers) of a set of individuals in a TrajDataFrame. A jump length (or trip distance) \(\Delta r\) is defined as the geographic distance between two consecutive points visited by an individual \(u\):
\[\Delta r = dist(r_i, r_{i + 1})\] where \(r_i\) and \(r_{i + 1}\) are two consecutive points, described as a \((latitude, longitude)\) pair, in the time-ordered trajectory of \(u\), and \(dist\) is the geographic distance between the two points [BHG2006] [GHB2008] [PRQPG2013].
- Parameters
traj (TrajDataFrame) – the trajectories of the individuals.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.
merge (boolean, optional) – if True, merge the individuals’ lists into one list. The default is False.
- Returns
the jump lengths for each individual, where \(NaN\) indicates that an individual visited just one location and hence distance is not defined; or a list with all jumps together if merge is True.
- Return type
pandas DataFrame or list
Warning
The input TrajDataFrame must be sorted in ascending order by datetime.
Examples
>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.individual import jump_lengths
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> jl_df = jump_lengths(tdf)
>>> print(jl_df.head())
   uid                                       jump_lengths
0    0  [19.640467328877936, 0.0, 0.0, 1.7434311010381...
1    1  [6.505330424378251, 46.75436600375988, 53.9284...
2    2  [0.0, 0.0, 0.0, 0.0, 3.6410097195943507, 0.0, ...
3    3  [3861.2706300798827, 4.061631313492122, 5.9163...
4    4  [15511.92758595804, 0.0, 15511.92758595804, 1....
>>> jl_list = jump_lengths(tdf, merge=True)
>>> print(jl_list[:10])  # print the first ten elements in the list
[19.640467328877936, 0.0, 0.0, 1.743431101038163, 1553.5011134765616, 0.0, 30.14517724008101, 0.0, 2.563647571198179, 1.9309489380903868]
References
- BHG2006
Brockmann, D., Hufnagel, L. & Geisel, T. (2006) The scaling laws of human travel. Nature 439, 462-465, https://www.nature.com/articles/nature04292
- skmob.measures.individual.k_radius_of_gyration(traj, k=2, show_progress=True)
k-radius of gyration.
Compute the k-radii of gyration (in kilometers) of a set of individuals in a TrajDataFrame. The k-radius of gyration of an individual \(u\) is defined as [PSRPGB2015]:
\[r_g^{(k)}(u) = \sqrt{\frac{1}{n_u^{(k)}} \sum_{i=1}^{n_u^{(k)}} dist(r_i(u), r_{cm}^{(k)}(u))^2}\] where \(r_i(u)\) represents the \(n_u^{(k)}\) positions recorded for \(u\) on their k most frequent locations, \(r_{cm}^{(k)}(u)\) is the center of mass of \(u\)’s trajectory considering the visits to the k most frequent locations only, and \(dist\) is the geographic distance between two points. In mobility analysis, the k-radius of gyration indicates the characteristic distance travelled by that individual as induced by their k most frequent locations.
- Parameters
traj (TrajDataFrame) – the trajectories of the individuals.
k (int, optional) – the number of most frequent locations to consider. The default is 2. The possible range of values is \([2, +\infty)\).
show_progress (boolean, optional) – if True, show a progress bar. The default is True.
- Returns
the k-radii of gyration of the individuals.
- Return type
pandas DataFrame
Examples
>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.individual import k_radius_of_gyration
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> krg_df = k_radius_of_gyration(tdf, k=3)
>>> print(krg_df.head())
   uid  3k_radius_of_gyration
0    0               7.730516
1    1               3.620671
2    2               6.366549
3    3              10.543072
4    4            3910.808802
References
- PSRPGB2015
Pappalardo, L., Simini, F., Rinzivillo, S., Pedreschi, D., Giannotti, F. & Barabasi, A. L. (2015) Returners and Explorers dichotomy in human mobility. Nature Communications 6, https://www.nature.com/articles/ncomms9166
- skmob.measures.individual.location_frequency(traj, normalize=True, as_ranks=False, show_progress=True, location_columns=['lat', 'lng'])
Location frequency.
Compute the visitation frequency of each location, for a set of individuals in a TrajDataFrame. Given an individual \(u\), the visitation frequency of a location \(r_i\) is the number of visits to that location by \(u\). The visitation frequency \(f(r_i)\) of location \(r_i\) is also defined in the literature as the probability of visiting location \(r_i\) by \(u\) [SKWB2010] [PF2018]:
\[f(r_i) = \frac{n(r_i)}{n_u}\]where \(n(r_i)\) is the number of visits to location \(r_i\) by \(u\), and \(n_u\) is the total number of data points in \(u\)’s trajectory.
- Parameters
traj (TrajDataFrame) – the trajectories of the individuals.
normalize (boolean, optional) – if True, the number of visits to a location by an individual is computed as probability, i.e., divided by the individual’s total number of visits. The default is True.
as_ranks (boolean, optional) – if True, return a list where element \(i\) indicates the average visitation frequency of the \(i\)-th most frequent location. The default is False.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.
location_columns (list, optional) – the name of the column(s) indicating the location. The default is [constants.LATITUDE, constants.LONGITUDE].
- Returns
the location frequency for each location for each individual, or the ranks list for each individual.
- Return type
pandas DataFrame or list
Examples
>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.individual import location_frequency
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> lf_df = location_frequency(tdf, normalize=False).reset_index()
>>> print(lf_df.head())
   uid        lat         lng  location_frequency
0    0  39.762146 -104.982480                 214
1    0  39.891077 -105.068532                 137
2    0  39.739154 -104.984703                 126
3    0  39.891586 -105.068463                  72
4    0  39.827022 -105.143191                  53
>>> lf_df = location_frequency(tdf, normalize=True).reset_index()  # frequencies as probabilities
>>> print(lf_df.head())
   uid        lat         lng  location_frequency
0    0  39.762146 -104.982480            0.101953
1    0  39.891077 -105.068532            0.065269
2    0  39.739154 -104.984703            0.060029
3    0  39.891586 -105.068463            0.034302
4    0  39.827022 -105.143191            0.025250
>>> ranks = location_frequency(tdf, as_ranks=True)  # as rank list
>>> print(ranks[:10])
[0.26774954912290716, 0.12699129836809203, 0.07090642778490935, 0.04627646190564675, 0.03657120208870922, 0.029353331229094993, 0.025050267239164755, 0.020284764933447663, 0.018437443393907686, 0.01656729815097415]
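The relation \(f(r_i) = n(r_i) / n_u\) can be checked by hand for one individual. The following sketch uses only the standard library; `location_frequency_sketch` is a hypothetical name and does not reproduce the `as_ranks` option.

```python
from collections import Counter


def location_frequency_sketch(points, normalize=True):
    """Visit counts per location, optionally divided by the total number of points."""
    counts = Counter(points)
    total = len(points)
    return {loc: (n / total if normalize else n) for loc, n in counts.most_common()}


visits = [(1.0, 1.0), (2.0, 2.0), (1.0, 1.0), (3.0, 3.0)]
raw = location_frequency_sketch(visits, normalize=False)
probabilities = location_frequency_sketch(visits, normalize=True)
```

With normalization, the values of one individual sum to 1, which is what allows them to be read as probabilities.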
See also
visits_per_location
- skmob.measures.individual.max_distance_from_home(traj, start_night='22:00', end_night='07:00', show_progress=True)
Maximum distance from home.
Compute the maximum distance (in kilometers) traveled from their home location by a set of individuals in a TrajDataFrame. The maximum distance from home \(dh_{max}(u)\) of an individual \(u\) is defined as [CM2015]:
\[dh_{max}(u) = \max\limits_{1 \leq i \leq n_u} dist(r_i, h(u))\] where \(n_u\) is the number of points recorded for \(u\), \(r_i\) is a location visited by \(u\) described as a \((latitude, longitude)\) pair, \(h(u)\) is the home location of \(u\), and \(dist\) is the geographic distance between two points.
- Parameters
traj (TrajDataFrame) – the trajectories of the individuals.
start_night (str, optional) – the starting time of the night (format HH:MM). The default is ‘22:00’.
end_night (str, optional) – the ending time for the night (format HH:MM). The default is ‘07:00’.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.
- Returns
the maximum distance from home of the individuals.
- Return type
pandas DataFrame
Examples
>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.individual import max_distance_from_home
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> dh_max = max_distance_from_home(tdf)
>>> print(dh_max.head())
   uid  max_distance_from_home
0    0            11286.942949
1    1            12800.547682
2    2            11282.748348
3    3            12799.754644
4    4            15512.788707
References
- CM2015
Canzian, L. & Musolesi, M. (2015) Trajectories of depression: unobtrusive monitoring of depressive states by means of smartphone mobility traces analysis. Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing, 1293-1304, https://dl.acm.org/citation.cfm?id=2805845
- skmob.measures.individual.maximum_distance(traj, show_progress=True)
Maximum distance.
Compute the maximum distance (in kilometers) traveled by a set of individuals in a TrajDataFrame. The maximum distance \(d_{max}\) travelled by an individual \(u\) is defined as:
\[d_{max} = \max\limits_{1 \leq i \lt j \leq n_u} dist(r_i, r_j)\] where \(n_u\) is the number of points recorded for \(u\), \(r_i\) and \(r_j\) are two points, each described as a \((latitude, longitude)\) pair, in \(u\)’s time-ordered trajectory, and \(dist\) is the geographic distance between the two points [WTDED2015] [LBH2012].
- Parameters
traj (TrajDataFrame) – the trajectories of the individuals.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.
- Returns
the maximum traveled distance for each individual. Note that \(NaN\) indicates that an individual visited just one location and so the maximum distance is not defined.
- Return type
pandas DataFrame
Examples
>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.individual import maximum_distance
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> md_df = maximum_distance(tdf)
>>> print(md_df.head())
   uid  maximum_distance
0    0      11294.436420
1    1      12804.895064
2    2      11286.745660
3    3      12803.259219
4    4      15511.927586
References
- WTDED2015
Williams, N. E., Thomas, T. A., Dunbar, M., Eagle, N. & Dobra, A. (2015) Measures of Human Mobility Using Mobile Phone Records Enhanced with GIS Data. PLOS ONE 10(7): e0133630. https://doi.org/10.1371/journal.pone.0133630
- LBH2012
Lu, X., Bengtsson, L. & Holme, P. (2012) Predictability of population displacement after the 2010 Haiti earthquake. Proceedings of the National Academy of Sciences 109(29), 11576-11581, https://doi.org/10.1073/pnas.1203882109
- skmob.measures.individual.number_of_locations(traj, show_progress=True)
Number of distinct locations.
Compute the number of distinct locations visited by a set of individuals in a TrajDataFrame [GHB2008].
- Parameters
traj (TrajDataFrame) – the trajectories of the individuals.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.
- Returns
the number of distinct locations visited by the individuals.
- Return type
pandas DataFrame
Examples
>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.individual import number_of_locations
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> nl_df = number_of_locations(tdf)
>>> print(nl_df.head())
   uid  number_of_locations
0    0                  542
1    1                   97
2    2                  460
3    3                  614
4    4                  216
- skmob.measures.individual.number_of_visits(traj, show_progress=True)
Number of visits.
Compute the number of visits (i.e., data points) for each individual in a TrajDataFrame.
- Parameters
traj (TrajDataFrame) – the trajectories of the individuals.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.
- Returns
the number of visits (data points) for each individual.
- Return type
pandas DataFrame
Examples
>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.individual import number_of_visits
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> num_v_df = number_of_visits(tdf)
>>> print(num_v_df.head())
   uid  number_of_visits
0    0              2099
1    1              1210
2    2              2100
3    3              1807
4    4               779
- skmob.measures.individual.radius_of_gyration(traj, show_progress=True)
Radius of gyration.
Compute the radii of gyration (in kilometers) of a set of individuals in a TrajDataFrame. The radius of gyration of an individual \(u\) is defined as [GHB2008] [PRQPG2013]:
\[r_g(u) = \sqrt{ \frac{1}{n_u} \sum_{i=1}^{n_u} dist(r_i(u), r_{cm}(u))^2}\] where \(r_i(u)\) represents the \(n_u\) positions recorded for \(u\), \(r_{cm}(u)\) is the center of mass of \(u\)’s trajectory, and \(dist\) is the geographic distance between two points. In mobility analysis, the radius of gyration indicates the characteristic distance travelled by \(u\).
- Parameters
traj (TrajDataFrame) – the trajectories of the individuals.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.
- Returns
the radius of gyration of each individual.
- Return type
pandas DataFrame
Examples
>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.individual import radius_of_gyration
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> rg_df = radius_of_gyration(tdf)
>>> print(rg_df.head())
   uid  radius_of_gyration
0    0         1564.436792
1    1         2467.773523
2    2         1439.649774
3    3         1752.604191
4    4         5380.503250
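As an illustration of the definition of \(r_g(u)\), the sketch below computes the radius of gyration for one individual. It rests on two assumptions that are not stated in the text above: the center of mass is taken as the arithmetic mean of latitudes and longitudes, and \(dist\) is the haversine distance with a mean Earth radius of 6371 km. The names `haversine_km` and `radius_of_gyration_sketch` are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius for this sketch


def haversine_km(p, q):
    """Great-circle distance in km between two (latitude, longitude) pairs."""
    lat1, lng1, lat2, lng2 = map(radians, (p[0], p[1], q[0], q[1]))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lng2 - lng1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def radius_of_gyration_sketch(points):
    """Root mean square distance of the points from their center of mass."""
    n = len(points)
    # Center of mass as the mean latitude and mean longitude (a simplification).
    center = (sum(lat for lat, _ in points) / n, sum(lng for _, lng in points) / n)
    return sqrt(sum(haversine_km(p, center) ** 2 for p in points) / n)


# Two equally visited points on the equator, two degrees of longitude apart.
rg_km = radius_of_gyration_sketch([(0.0, 0.0), (0.0, 2.0)])
rg_static = radius_of_gyration_sketch([(45.0, 9.0)] * 5)
```

An individual who never moves has a radius of gyration of zero, while the two-point example yields the distance from either point to the midpoint.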
References
- GHB2008
González, M. C., Hidalgo, C. A. & Barabási, A. L. (2008) Understanding individual human mobility patterns. Nature 453, 779-782, https://www.nature.com/articles/nature06958
- PRQPG2013
Pappalardo, L., Rinzivillo, S., Qu, Z., Pedreschi, D. & Giannotti, F. (2013) Understanding the patterns of car travel. European Physical Journal Special Topics 215(1), 61-73, https://link.springer.com/article/10.1140%2Fepjst%2Fe2013-01715-5
- skmob.measures.individual.random_entropy(traj, show_progress=True)
Random entropy.
Compute the random entropy of a set of individuals in a TrajDataFrame. The random entropy of an individual \(u\) is defined as [EP2009] [SQBB2010]:
\[E_{rand}(u) = \log_2(N_u)\] where \(N_u\) is the number of distinct locations visited by \(u\), capturing the degree of predictability of \(u\)’s whereabouts if each location is visited with equal probability.
- Parameters
traj (TrajDataFrame) – the trajectories of the individuals.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.
- Returns
the random entropy of the individuals.
- Return type
pandas DataFrame
Examples
>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.individual import random_entropy
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> re_df = random_entropy(tdf)
>>> print(re_df.head())
   uid  random_entropy
0    0        9.082149
1    1        6.599913
2    2        8.845490
3    3        9.262095
4    4        7.754888
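Because \(E_{rand}(u)\) depends only on the number of distinct locations, it can be sketched in a few lines for one individual. `random_entropy_sketch` is a hypothetical name, not a skmob function.

```python
from math import log2


def random_entropy_sketch(points):
    """log2 of the number of distinct locations in a trajectory."""
    return log2(len(set(points)))


# Five visits spread over four distinct locations.
visits = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0), (1.0, 1.0)]
entropy = random_entropy_sketch(visits)
```

Repeated visits do not change the value: only the count of distinct locations matters.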
References
- EP2009
Eagle, N. & Pentland, A. S. (2009) Eigenbehaviors: identifying structure in routine. Behavioral Ecology and Sociobiology 63(7), 1057-1066, https://link.springer.com/article/10.1007/s00265-009-0830-6
- SQBB2010
Song, C., Qu, Z., Blumm, N. & Barabási, A. L. (2010) Limits of Predictability in Human Mobility. Science 327(5968), 1018-1021, https://science.sciencemag.org/content/327/5968/1018
- skmob.measures.individual.real_entropy(traj, show_progress=True)
Real entropy.
Compute the real entropy of a set of individuals in a TrajDataFrame. The real entropy of an individual \(u\) is defined as [SQBB2010]:
\[E(u) = - \sum_{T'_u}P(T'_u)\log_2[P(T'_u)]\] where \(P(T'_u)\) is the probability of finding a particular time-ordered subsequence \(T'_u\) in the trajectory \(T_u\). The real entropy hence depends not only on the frequency of visitation, but also on the order in which the nodes were visited and the time spent at each location, thus capturing the full spatiotemporal order present in \(u\)’s mobility patterns.
- Parameters
traj (TrajDataFrame) – the trajectories of the individuals.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.
- Returns
the real entropy of the individuals.
- Return type
pandas DataFrame
Warning
The input TrajDataFrame must be sorted in ascending order by datetime. Note that the computation of this measure is, by construction, slow.
Examples
>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.individual import real_entropy
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> re_df = real_entropy(tdf[tdf.uid < 50])  # computed on a subset of individuals
>>> print(re_df.head())
   uid  real_entropy
0    0      4.906479
1    1      2.207224
2    2      4.467225
3    3      4.782442
4    4      3.585371
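Evaluating the sum over all subsequences directly is impractical, so the quantity is usually approximated. The sketch below uses a Lempel-Ziv-style estimator of the kind described in [SQBB2010]: for each position it finds the length of the shortest subsequence starting there that has not appeared earlier. Whether skmob uses exactly this estimator is an assumption, and `contains` and `real_entropy_estimate` are hypothetical names.

```python
from math import log2


def contains(seq, sub):
    """True if the list sub occurs as a contiguous run inside the list seq."""
    m = len(sub)
    return any(seq[s:s + m] == sub for s in range(len(seq) - m + 1))


def real_entropy_estimate(sequence):
    """Lempel-Ziv-style entropy estimate of a time-ordered sequence of locations."""
    n = len(sequence)
    total = 0
    for i in range(n):
        # Length of the shortest run starting at i that is absent from sequence[:i].
        k = 1
        while i + k <= n and contains(sequence[:i], sequence[i:i + k]):
            k += 1
        total += k
    return n * log2(n) / total


routine = ["home", "work"] * 4      # highly regular movement
explorer = list("ABCDEFGH")         # a new location at every step
routine_estimate = real_entropy_estimate(routine)
explorer_estimate = real_entropy_estimate(explorer)
```

The nested substring search makes the sketch cubic in the worst case, which is consistent with the warning that this measure is slow to compute. On short sequences the estimate is biased, so only the comparison between the two trajectories is meaningful here.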
- skmob.measures.individual.recency_rank(traj, show_progress=True)
Recency rank.
Compute the recency rank of the locations of a set of individuals in a TrajDataFrame. The recency rank \(K_s(r_i)\) of a location \(r_i\) of an individual \(u\) is \(K_s(r_i) = 1\) if location \(r_i\) is the last visited location, it is \(K_s(r_i) = 2\) if \(r_i\) is the second-last visited location, and so on [BDEM2015].
- Parameters
traj (TrajDataFrame) – the trajectories of the individuals.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.
- Returns
the recency rank for each location of the individuals.
- Return type
pandas DataFrame
Warning
The input TrajDataFrame must be sorted in ascending order by datetime.
Examples
>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.individual import recency_rank
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> rr_df = recency_rank(tdf)
>>> print(rr_df.head())
             lat         lng  recency_rank
uid
0   0  39.891383 -105.070814             1
    1  39.891077 -105.068532             2
    2  39.750469 -104.999073             3
    3  39.752713 -104.996337             4
    4  39.752508 -104.996637             5
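For one individual, the recency rank can be sketched by recording the position of the last visit to each location and sorting in reverse. `recency_rank_sketch` is a hypothetical name, and the string labels stand in for \((latitude, longitude)\) pairs.

```python
def recency_rank_sketch(points):
    """Map each location to its recency rank (1 = most recently visited)."""
    last_seen = {}
    for position, loc in enumerate(points):
        last_seen[loc] = position  # later visits overwrite earlier ones
    ordered = sorted(last_seen, key=last_seen.get, reverse=True)
    return {loc: rank for rank, loc in enumerate(ordered, start=1)}


trajectory = ["A", "B", "A", "C", "B"]  # time-ordered visits
ranks = recency_rank_sketch(trajectory)
```

The input must be sorted by time, as the warning above requires; otherwise the positions do not reflect recency.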
References
- BDEM2015
Barbosa, H., de Lima-Neto, F. B., Evsukoff, A., Menezes, R. (2015) The effect of recency to human mobility, EPJ Data Science 4(21), https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-015-0059-8
- skmob.measures.individual.uncorrelated_entropy(traj, normalize=False, show_progress=True)
Uncorrelated entropy.
Compute the temporal-uncorrelated entropy of a set of individuals in a TrajDataFrame. The temporal-uncorrelated entropy of an individual \(u\) is defined as [EP2009] [SQBB2010] [PVGSPG2016]:
\[E_{unc}(u) = - \sum_{j=1}^{N_u} p_u(j) log_2 p_u(j)\]where \(N_u\) is the number of distinct locations visited by \(u\) and \(p_u(j)\) is the historical probability that a location \(j\) was visited by \(u\). The temporal-uncorrelated entropy characterizes the heterogeneity of \(u\)’s visitation patterns.
- Parameters
traj (TrajDataFrame) – the trajectories of the individuals.
normalize (boolean, optional) – if True, normalize the entropy in the range \([0, 1]\) by dividing by \(\log_2(N_u)\), where \(N_u\) is the number of distinct locations visited by individual \(u\). The default is False.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.
- Returns
the temporal-uncorrelated entropy of the individuals.
- Return type
pandas DataFrame
Examples
>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.individual import uncorrelated_entropy
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> ue_df = uncorrelated_entropy(tdf, normalize=True)
>>> print(ue_df.head())
   uid  norm_uncorrelated_entropy
0    0                   0.819430
1    1                   0.552972
2    2                   0.764304
3    3                   0.794553
4    4                   0.756421
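The formula for \(E_{unc}(u)\) is the Shannon entropy of the visitation frequencies, which the following sketch computes for one individual. `uncorrelated_entropy_sketch` is a hypothetical name, and returning 0.0 for the normalized entropy of a single-location trajectory (where \(\log_2(N_u) = 0\)) is a choice of this sketch.

```python
from collections import Counter
from math import log2


def uncorrelated_entropy_sketch(points, normalize=False):
    """Shannon entropy of the visitation frequencies, optionally scaled to [0, 1]."""
    n = len(points)
    probs = [count / n for count in Counter(points).values()]
    entropy = -sum(p * log2(p) for p in probs)
    if normalize:
        n_distinct = len(probs)
        # log2(1) is zero, so a single-location trajectory is mapped to 0.0 here.
        return entropy / log2(n_distinct) if n_distinct > 1 else 0.0
    return entropy


visits = ["A", "A", "B", "C"]  # visitation probabilities 0.5, 0.25, 0.25
entropy = uncorrelated_entropy_sketch(visits)
normalized = uncorrelated_entropy_sketch(visits, normalize=True)
uniform = uncorrelated_entropy_sketch(["A", "B"], normalize=True)
```

The normalized value reaches 1 only when all distinct locations are visited equally often, in which case the uncorrelated entropy equals the random entropy.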
References
- PVGSPG2016
Pappalardo, L., Vanhoof, M., Gabrielli, L., Smoreda, Z., Pedreschi, D. & Giannotti, F. (2016) An analytical framework to nowcast well-being using mobile phone data. International Journal of Data Science and Analytics 2(75), 75-92, https://link.springer.com/article/10.1007/s41060-016-0013-2
- skmob.measures.individual.waiting_times(traj, show_progress=True, merge=False)
Waiting times.
Compute the waiting times (in seconds) between the movements of each individual in a TrajDataFrame. A waiting time (or inter-time) by an individual \(u\) is defined as the time between two consecutive points in \(u\)’s trajectory:
\[\Delta t = |t(r_i) - t(r_{i + 1})|\]where \(r_i\) and \(r_{i + 1}\) are two consecutive points, described as a \((latitude, longitude)\) pair, in the time-ordered trajectory of \(u\), and \(t(r)\) indicates the time when \(u\) visits point \(r\) [SKWB2010] [PF2018].
- Parameters
traj (TrajDataFrame) – the trajectories of the individuals.
show_progress (boolean, optional) – if True, show a progress bar. The default is True.
merge (boolean, optional) – if True, merge the individuals’ lists into one list. The default is False.
- Returns
the list of waiting times for each individual, where \(NaN\) indicates that an individual visited just one location and hence waiting time is not defined; or a list with all waiting times together if merge is True.
- Return type
pandas DataFrame or list
Warning
The input TrajDataFrame must be sorted in ascending order by datetime.
Examples
>>> import pandas as pd
>>> import skmob
>>> from skmob.measures.individual import waiting_times
>>> url = "https://snap.stanford.edu/data/loc-brightkite_totalCheckins.txt.gz"
>>> df = pd.read_csv(url, sep='\t', header=0, nrows=100000, names=['user', 'check-in_time', 'latitude', 'longitude', 'location id'])
>>> tdf = skmob.TrajDataFrame(df, latitude='latitude', longitude='longitude', datetime='check-in_time', user_id='user')
>>> wt_df = waiting_times(tdf)
>>> print(wt_df.head())
   uid                                      waiting_times
0    0  [2358.0, 136.0, 303.0, 1836.0, 14869.0, 517.0,...
1    1  [43460.0, 34353.0, 8347.0, 40694.0, 281.0, 16....
2    2  [293.0, 308.0, 228.0, 402.0, 16086.0, 665.0, 9...
3    3  [10200079.0, 30864.0, 54415.0, 2135.0, 63.0, 1...
4    4  [82845.0, 56.0, 415156.0, 1372.0, 23.0, 42679....
>>> wl_list = waiting_times(tdf, merge=True)
>>> print(wl_list[:10])
[2358.0, 136.0, 303.0, 1836.0, 14869.0, 517.0, 8995.0, 41306.0, 949.0, 11782.0]
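The definition of \(\Delta t\) reduces to differencing consecutive timestamps. A minimal sketch for one individual follows; `waiting_times_sketch` is a hypothetical name, and returning an empty list for a single-point trajectory is a simplification of this sketch rather than the documented NaN behaviour.

```python
from datetime import datetime


def waiting_times_sketch(timestamps):
    """Seconds elapsed between consecutive timestamps of a time-ordered trajectory."""
    return [abs((later - earlier).total_seconds())
            for earlier, later in zip(timestamps, timestamps[1:])]


check_ins = [
    datetime(2020, 1, 1, 8, 0, 0),
    datetime(2020, 1, 1, 8, 39, 18),
    datetime(2020, 1, 1, 8, 41, 34),
]
waits = waiting_times_sketch(check_ins)
```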
References
- SKWB2010
Song, C., Koren, T., Wang, P. & Barabasi, A.L. (2010) Modelling the scaling properties of human mobility. Nature Physics 6, 818-823, https://www.nature.com/articles/nphys1760
- PF2018
Pappalardo, L. & Simini, F. (2018) Data-driven generation of spatio-temporal routines in human mobility. Data Mining and Knowledge Discovery 32, 787-829, https://link.springer.com/article/10.1007/s10618-017-0548-4