Following my map of London’s green and blue infrastructure, I have been working on some analysis of the land uses.
I was inspired and encouraged to try this by Liliana’s interesting work called “imagining all of Southwark”. Lili and Ari have managed to get the council to release lots of data on properties and car parking, and they are producing analysis of this data by postal code area and by street. They haven’t managed to get anything on land uses, so I thought, why not produce this with OpenStreetMap data?
A few evenings later, here is the result shared on Google docs (direct link) covering the eight postal code areas that between them cover most of the borough (SE1, SE5, SE15, SE16, SE17, SE21, SE22, SE24):
[googleapps domain="docs" dir="spreadsheet/pub" query="key=0Ago4j1EdZaOFdHlYTE9CNU15VUxGdDBLeVpPV0gxbFE&output=html&widget=true" width="500" height="300" /]
The “summary” worksheet shows the total land area, in hectares (1 ha = 10,000 m²), for various types of land coverage. I have also calculated the percentage of each postal code area that each land use represents, which gives an interesting insight into the differences between the areas.
Some of the land uses overlap; for example, miscellaneous bits of green space are often mapped on top of residential areas. So the numbers aren’t supposed to add up to anything like 100%.
The spreadsheet also contains a worksheet for each postal code area. Each one contains a dump of all the objects in OpenStreetMap in that postal code area, and this is the raw data the summary worksheet uses to get the totals.
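As a rough illustration of how the summary totals follow from the raw dump, here is a minimal Python sketch. The object records, the category names and the total area of the postal code area are all made up for the example; none of them are taken from the spreadsheet.

```python
from collections import defaultdict

M2_PER_HECTARE = 10_000

def summarise(objects, total_area_m2):
    """Sum object areas by land use; return {land_use: (hectares, percent)}."""
    totals_m2 = defaultdict(float)
    for land_use, area_m2 in objects:
        totals_m2[land_use] += area_m2
    return {
        land_use: (area / M2_PER_HECTARE, 100 * area / total_area_m2)
        for land_use, area in totals_m2.items()
    }

# Hypothetical records: (land use, area in square metres)
objects = [
    ("residential", 1_200_000),
    ("residential", 300_000),
    ("park", 250_000),
    ("park", 50_000),
]
summary = summarise(objects, total_area_m2=5_000_000)
for land_use, (hectares, percent) in sorted(summary.items()):
    print(f"{land_use}: {hectares:.1f} ha ({percent:.1f}%)")
```

Each object is counted in full under its own land use, so where objects overlap the percentages can add up to more or less than 100%.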
You should use this data with a large spoonful of salt. Here are the significant flaws I have noticed:
Postal code areas are approximate. For example, the boundary between SE15 and SE22 should run between Peckham Rye Common (SE15) and Peckham Rye Park (SE22), but in my data both the park and the common show up in both postal codes because the boundary isn’t quite right. Read down to my method to see why. The errors are pretty tiny in most places (plus or minus a few metres along the boundary) and probably cancel out for big land uses like residential, but they probably introduce significant errors for parks, where the boundaries go awry by 20-30m in places. Sadly there aren’t any accurate open data polygons I can use.
Data is missing because OpenStreetMap contributors haven’t mapped it. Of course the easy solution here is to get more of it mapped and up to date! My estimate of the different types is as follows:
Data is also sometimes missing because of flaws in the Geofabrik shapefiles, not all of which I have corrected. For example, I noticed they were missing commons so I manually added those in, but I may have missed other land uses. One major omission, a shame given the interest in them, is the humble sports pitch/playing field.
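To get a feel for how much a misplaced postal code boundary matters, a back-of-envelope calculation helps: a boundary that is off by some width along some length moves roughly width × length of land from one area to its neighbour. The figures in this small Python sketch are purely illustrative assumptions, not measurements from my data.

```python
def strip_area_ha(offset_m, length_m):
    """Approximate area (hectares) of a strip offset_m wide and length_m long."""
    return offset_m * length_m / 10_000

# Hypothetical cases: a 25 m error along a 400 m park edge,
# and a 3 m error along a 2 km residential boundary.
park_error = strip_area_ha(25, 400)
residential_error = strip_area_ha(3, 2_000)
print(park_error, residential_error)
```

With these made-up numbers the park strip is the larger of the two, and it would also be a much bigger share of a small park than the residential strip would be of a large residential total.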
After a lot of experimentation – I’ve never been trained to use GIS tools – I worked out this method. If you know of an easier way I’d love to hear about it.
For reference, some of the totals in the summary work off more than one land use type so here are the categories and the corresponding OpenStreetMap tags:
One obvious improvement would be to get more data in. Perhaps this first analysis will encourage people to help out with that? I have also emailed Geofabrik about the flaws I have discovered in their shapefiles, so I hope those get fixed.
Another thought is to produce the stats by council ward. But given that there are far more wards, I’d like to find a quicker way of producing the stats for each ward (step three above) first.
It would also be interesting to do it by town/suburb, for example comparing Peckham to East Dulwich. But we don’t have any meaningful boundaries for those natural areas. It would be really interesting to do a mass version of “this isn’t fucking Dalston” for a whole borough, using the Voronoi polygons method to infer areas from surveys at thousands of locations around the borough. One day…
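The Voronoi idea can be sketched without any GIS software, because a location falls inside the Voronoi polygon of whichever survey point is nearest to it. Here is a minimal Python sketch on that basis. The coordinates and answers are invented, and it assumes flat x/y coordinates in metres rather than latitude and longitude.

```python
import math

def infer_area(location, surveys):
    """Return the area name given by the survey point nearest to location.

    Equivalent to asking which Voronoi polygon the location falls in.
    """
    nearest = min(surveys, key=lambda s: math.dist(location, s[0]))
    return nearest[1]

# Hypothetical survey answers: ((x, y) in metres, area name given)
surveys = [
    ((0, 0), "Peckham"),
    ((100, 50), "Peckham"),
    ((900, 800), "East Dulwich"),
    ((1000, 1000), "East Dulwich"),
]
print(infer_area((200, 100), surveys))
print(infer_area((800, 900), surveys))
```

Drawing the actual polygons for a map would need a proper geometry library, but this is enough to label any point in the borough with the nearest surveyed answer.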