Code
import math
import geopandas
import matplotlib.pyplot as plt
import networkx
import igraph
import osmnx
from shapely.geometry import Point
= True
osmnx.settings.log_console = True osmnx.settings.use_cache
This started out as the notebook developed by Geoff Boeing available here although by now it has been very heavily modified!
I was lucky enough to co-advise Geoff as a PhD student at Berkeley, and when he announced that he was going to build osmnx
as part of his project—only a part, mind!—I did the proper thing and expressed concern that he was being overly ambitious. PhD advising is mostly about steering students towards the possible, and reining in their wilder ideas… at least that is the common wisdom. So much for that: Geoff built it and the rest is (niche) history.
[If I have one criticism of osmnx
it is that the code gets updated so often that function names are subject to rapid change. You may find that some of what follows doesn’t work in more recent versions than 1.6.0.]
Anyway… as usual we need some libraries:
What osmnx
excels at is pulling (generally messy) street network data and cleaning it up for analysis. Here’s how it works:
The network needs to be connected for any of the following to work reliably, so we extract the largest component and throw away everything else. I can’t vouch for this being the best way to accomplish this, but it seems to work.
This line will project the data to an appropriate UTM projection, and then we can make an appropriately UTM projected geopandas.GeoDataFrame
.
Next we make a convex hull of the nodes, and determine its centroid, then use this to find the node nearest the centroid. We also make a simple GeoSeries
so we can check what we’ve done makes sense.
And we can make a sanity check plot
The data in OSM networks is fairly ad hoc in this regard, so the below has been assembled by trial and error. There may be better tools available for cleaning up network data attributes.
We assume a default_speed
of 30 km/h expressed in metres per minute (i.e. 500!). Where we find maximum speed information in the data we use that instead.
We assume a delay at intersections—this might not make total sense, but slows things down on major arterials.
speed_units_per_metre
is the maxspeed attribute from OSM and for most jurisdictions is km, so 0.001 per metre.
Some hackery below to convert appropriately if maxspeed
is missing or is a list of some kind or is expressed in mph.
# add an edge attribute for time in minutes required to traverse each edge
for data in G.edges.values():
len_seg = data["length"]
data["time"] = len_seg / default_speed
if "maxspeed" in data:
max_speed = data["maxspeed"]
if type(max_speed) is list:
max_speed = max_speed[0]
if ";" in max_speed:
max_speed = max_speed.split(";")[0]
if "mph" in max_speed:
max_speed = max_speed.split()[0]
units_per_metre = 0.0006213712
if max_speed.isnumeric():
max_speed = float(max_speed)
data["time"] = len_seg * units_per_metre / max_speed * 60
data["time"] += intersection_delay
Now we can determine network travel times from our centre node to every other node. While assembling this notebook I encountered a problem with the networkx
function single_source_path_length()
function, which can be unreliable with floating point edge weights (in this cast the time
attribute). As a workaround I am using igraph
instead.
We also need the bearings of every node from the centre node.
# Assume that nodes have x and y attributes and these are in a projected
# coordinate system such that simple trigonometry will give bearing
def bearing(G, n0, n1):
N0 = G.nodes[n0]
N1 = G.nodes[n1]
return n1, \
math.atan2((N1["y"] - N0["y"]), (N1["x"] - N0["x"]))
node_bearings = dict(
[bearing(G, centre_node, n) for n in node_travel_times.keys()])
and… a function to return \(x\) and \(y\) offsets from the centre node, given their distance and bearing.
Keep in mind distances are now travel time based, while bearings remain ‘true’, so that less accessible locations are pushed farther from centre, and more accessible ones pulled nearer in the direction of the bearing. Note that all angles here are in radians.
import pandas
df_nodes = pandas.DataFrame(data = {"osmid": list(node_travel_times.keys())})
df_nodes["tx"] = [node_dxdys[i][0] for i in df_nodes.osmid]
df_nodes["ty"] = [node_dxdys[i][1] for i in df_nodes.osmid]
df_nodes["x"] = [G.nodes[i]["x"] for i in df_nodes.osmid]
df_nodes["y"] = [G.nodes[i]["y"] for i in df_nodes.osmid]
df_nodes["time"] = [node_travel_times[i] for i in df_nodes.osmid]
df_nodes.head()
osmid | tx | ty | x | y | time | |
---|---|---|---|---|---|---|
0 | 164707756 | -10.095840 | 0.516131 | 248112.235570 | 3.813112e+06 | 10.109025 |
1 | 164707812 | -10.891007 | -0.676010 | 247856.180584 | 3.812756e+06 | 10.911967 |
2 | 165442802 | -10.533059 | 0.734952 | 248000.588151 | 3.813176e+06 | 10.558668 |
3 | 1467958591 | -9.136781 | -0.608773 | 248676.344687 | 3.812796e+06 | 9.157039 |
4 | 165337776 | -11.392333 | -0.672562 | 247771.128942 | 3.812761e+06 | 11.412169 |
Write this out to disk and read it back in. This means you can start here next time if you prefer…
GeoDataFrame
s and mapsThis is pretty simple now we have all the data.
n_gdf_xy = geopandas.GeoDataFrame(
data = df_nodes[["time"]],
geometry = geopandas.GeoSeries(
[Point(p[0], p[1]) for p in zip(df_nodes.x, df_nodes.y)]))
n_gdf_xy.set_crs(CRS)
n_gdf_txty = geopandas.GeoDataFrame(
data = df_nodes[["time"]],
geometry = geopandas.GeoSeries(
[Point(p[0], p[1]) for p in zip(df_nodes.tx, df_nodes.ty)]))
And make two maps of the nodes.
Text(0.5, 1.0, 'Santa Barbara in drivetime space')
The geodetic one is easy—just use the osmnx
plotting function
First we need to make a new graph with the time based positions as the node positions
Now… for reasons unclear to me, simply plotting this using osmnx
will not work—presumably (yet again!) something to do with projections. Instead we need to make a matplotlib
LineCollection
and plot that… so here goes…
# define the plot limits
node_xs = [float(n["x"]) for n in G_time.nodes.values()]
node_ys = [float(n["y"]) for n in G_time.nodes.values()]
W, S, E, N = (min(node_xs) - 1, min(node_ys) - 1,
max(node_xs) + 1, max(node_ys) + 1)
bbox_aspect = (N - S) / (E - W)
fig_h = 12
fig_w = 12 / bbox_aspect
# create the figure and axis
fig, ax = plt.subplots(figsize = (fig_w, fig_h))
ax.set_xlim(W, E)
ax.set_ylim(S, N)
ax.set_aspect("equal")
lines = []
for u, v, data in G_time.edges.keys():
lines.append([(G_time.nodes[u]["x"], G_time.nodes[u]["y"]),
(G_time.nodes[v]["x"], G_time.nodes[v]["y"])])
from matplotlib.collections import LineCollection
ax.add_collection(LineCollection(lines, colors = "k", linewidth = .5))
plt.axis("off")
(-24.923310674353207,
27.844793379718062,
-19.572596660049093,
21.215954647646313)
© 2023-24 David O’Sullivan