I have a shapefile with ~350,000 entries, which is too big to handle. I'd like to split it into many smaller files of about 10,000 entries each (this means there'd be ~350 shapefiles). I don't care which entries end up together: entries with a given attribute don't need to share a file, and neither do entries in the same geographic area. The 10,000 figure is not a hard limit; approximately 10k is fine. I want to do this on the command line on Linux (so probably ogr2ogr or similar?). How can I do this?

    You need to understand that drawing 350 shapefiles takes roughly 350 times longer than drawing one shapefile, so your plan may be counterproductive. Using a more efficient data format that has been sorted for optimal draw performance may be a better solution. Commented May 31, 2016 at 11:05

1 Answer

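Use GDAL's ogr2ogr with the SQLite SQL dialect (http://www.gdal.org/ogr_sql_sqlite.html) and use LIMIT and OFFSET to select one batch of features per call. The table name in the query is the layer name, which for a shapefile is the file name without the .shp extension: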

ogr2ogr -f "ESRI Shapefile" -dialect sqlite -sql "select * from my_shape limit 10000 offset 0" batch_1.shp my_shape.shp
ogr2ogr -f "ESRI Shapefile" -dialect sqlite -sql "select * from my_shape limit 10000 offset 10000" batch_2.shp my_shape.shp
ogr2ogr -f "ESRI Shapefile" -dialect sqlite -sql "select * from my_shape limit 10000 offset 20000" batch_3.shp my_shape.shp
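
Rather than typing each command by hand, the offsets can be generated in a loop. The following is only a minimal sketch: `print_batch_commands` is a hypothetical helper name, the total feature count is passed in by hand, and it assumes the file name contains no spaces or characters that would need quoting in SQL. It prints the commands instead of running them, so they can be checked first.

```shell
#!/bin/sh
# Hypothetical helper: print one ogr2ogr command per batch.
# Arguments: source shapefile, total feature count, batch size.
print_batch_commands() {
    src=$1
    total=$2
    size=$3
    # For a shapefile, the layer name is the file name without .shp.
    layer=$(basename "$src" .shp)
    offset=0
    batch=1
    while [ "$offset" -lt "$total" ]; do
        printf 'ogr2ogr -f "ESRI Shapefile" -dialect sqlite -sql "select * from %s limit %s offset %s" batch_%s.shp %s\n' \
            "$layer" "$size" "$offset" "$batch" "$src"
        offset=$((offset + size))
        batch=$((batch + 1))
    done
}
```

Once the printed commands look right, piping them to a shell runs them, for example `print_batch_commands my_shape.shp 350000 10000 | sh`. The feature count itself should be available from `ogrinfo -so -al my_shape.shp`, which reports a "Feature Count" line.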
