One of the downsides of forming polygon from line segments in this way is that it implicitly assumes all lines have a vertex at the intersection of another line.
For example, something like this would fail ("o" represents a vertex):
o o o
\ \/
\ /\
\ / \ o
\ / \ /
/ /
/ \ / \
o \ / \
\/ \
/\ o
/ \
o \
o
You can handle this in a couple of ways:1) Intersect the lines before-hand, ensuring that you wind up with a vertex at every intersection. (In shapely's case, you'd call `shapely.ops.cascaded_union(lines)` before applying `polygonize`.)
2) Cut a bounding polygon with the lines, rather than forming polygons from individual line segments.
The second method has the advantage of ensuring that everything inside of the bounding polygon is split up into smaller, block-level polygons. Usually this is what you want, though for something like city blocks, you'd need to filter out non-urban areas (e.g. water).
Including intersections wouldn't significantly change the size of the dataset, but that type of clean-up is best done as a pre-analysis step. There's no good reason to do it in the "raw" data.
Furthermore, it's dependent on the projection/etc that you do it in. The intersection of two lines in lat/long space is not at the same point as the intersection of two lines in another lat/long space with a different datum, and it's not the same as the intersection in a projected coordinate system.
Ideally, you'd only include the intersection if you've actually measured it at that location. It's best to leave the observations as observations and try not to alter them too much.
At any rate, by the time you're analyzing the data, you've already made the type of assumptions about cartesian vs. spherical vs. real space that I'm referring to above. Therefore, it's usually a good idea to defer cleanup that's specific to a particular operation (this one is) until you're doing that operation.
The original post was trying to do it for Riga, Latvia, so they can't use US TIGER files.
I've always thought of city blocks as an anti-pattern in urban layout e.g. exploring New York on foot for the first time I tried 'using my nose' to find interesting places but had the problem that I got zero incremental information until the next intersection at which point the large block size meant it was too far to turn back.
I prefer to live in cities for which the concept of a 'block' has less relevance, and I'd caution against trying to 'blockify' cities worldwide.
When serving the tiles over http, the only tiles themselves are served, not the entire sqlite database.
This was actually pioneered in the era of raster tiles - pngs/jpgs in the sqlite db. Vector tiles just "inherited" that.
Proof might be an interactive map that shows the city block of a selected area in NYC.
Is there a visual proof in the walkthrough I'm not seeing?
I followed my instructions in the post (except I had to use encoding='iso-8859-1' instead of utf-8)
I cropped the map a little.
Generated file https://gist.github.com/pdonald/223e42b491eb42ba99b8e10ef1ae...
Interactive and clickable http://bl.ocks.org/anonymous/raw/d62bfd592234ed71d24bfd6f264...
You can also go to http://geojson.io and load the generated file for more interactivity (select, delete, add your own polygons, etc.) or open it in QGIS.
In the Results section I had a screenshot of the generated polygons in both QGIS and geojson.io (polygons are separated by thick black lines and their area is colored gray).
Basically for me when I click the interactive map I get a response, which is a pinned blank text-box with a x to close it. Thing is that I'm unable see that a city block is highlighted via a click; the text box simple appears where I click, unless it's outside the targeted area; for example, in the river.
Looking at the raw JSON it does appear the block are extracted.
So, putting aside "seeing" it, appears you're confirming it extracts single blocks as objects and isn't doing something else like if two blocks are attached, they're not sharing the path that's between them.
thx
Generated file (cropped) https://gist.github.com/pdonald/dfeee5213c8c5379f7b421f9ad5f...