I have a shapefile with ~350,000 entries, which is too big to handle. I'd like to split it into many smaller files, such that each file has about 10,000 entries (this means there'd be ~350 shapefiles). I don't care which entries end up together (i.e. there's no requirement that all entries with a certain attribute be in the same file, and no requirement that all entries in the same geographic area be together). The 10,000 is not a hard limit; approximately 10k is fine. I want to do this on the command line on Linux (so probably ogr2ogr or similar?). How can I do this?
- You need to understand that drawing 350 shapefiles takes roughly 350 times longer than drawing one shapefile, so your plan may be counterproductive. Using a more efficient data format that has been sorted for optimal draw performance may be a better solution. – Vince, commented May 31, 2016 at 11:05
1 Answer
Use GDAL's ogr2ogr with the SQLite SQL dialect (http://www.gdal.org/ogr_sql_sqlite.html) and use LIMIT and OFFSET to select each batch of 10,000 features:
ogr2ogr -f "ESRI Shapefile" -dialect sqlite -sql "select * from my_shape limit 10000 offset 0" batch_1.shp my_shape.shp
ogr2ogr -f "ESRI Shapefile" -dialect sqlite -sql "select * from my_shape limit 10000 offset 10000" batch_2.shp my_shape.shp
ogr2ogr -f "ESRI Shapefile" -dialect sqlite -sql "select * from my_shape limit 10000 offset 20000" batch_3.shp my_shape.shp
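Typing every command by hand gets tedious, so the same pattern can be generated in a loop. The following is only a sketch in bash, not a tested recipe for your data: the function name split_commands is made up for illustration, and the total of 350000 is a placeholder that should be replaced with the real feature count (ogrinfo -so -al my_shape.shp reports it as "Feature Count"). It prints the ogr2ogr commands rather than running them, so you can inspect them first.

```shell
#!/usr/bin/env bash
# Sketch: print one ogr2ogr command per batch, following the LIMIT/OFFSET
# pattern above. split_commands is a hypothetical helper name.
# Arguments: source file, layer name, total feature count, batch size.
split_commands() {
  local src=$1 layer=$2 total=$3 batch=$4
  local offset=0 n=1
  while [ "$offset" -lt "$total" ]; do
    printf 'ogr2ogr -f "ESRI Shapefile" -dialect sqlite -sql "select * from %s limit %d offset %d" batch_%d.shp %s\n' \
      "$layer" "$batch" "$offset" "$n" "$src"
    offset=$((offset + batch))   # advance to the start of the next batch
    n=$((n + 1))                 # next output file number
  done
}

# Placeholder values: 350000 is assumed, not measured.
split_commands my_shape.shp my_shape 350000 10000
```

Once the printed commands look right, pipe the output to sh to execute them. The last batch simply contains whatever features remain, which fits the "approximately 10k" requirement.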