My Objective: I would like to use GDAL to convert a GeoPDF. I want the vector layers as shp files and the raster layers as tif files. I want to do this in a programmatic way.
Edit: In reality, I want to do this with many geospatial PDFs. I'm prototyping the workflow using Python, but it will probably end up being C++. (End Edit)
The Problem: Naturally, the command to convert a vector layer differs from the command to convert a raster layer, and I don't know how to determine programmatically which layers are vector and which are raster.
What I've Tried: First, here is my sample data https://www.terragotech.com/images/pdf/webmap_urbansample.pdf.
gdalinfo webmap_urbansample.pdf -mdd LAYERS gives the layer names:
    ...
    Metadata (LAYERS):
      LAYER_00_NAME=Layers
      LAYER_01_NAME=Layers.BPS_-_Water_Sources
      LAYER_02_NAME=Layers.BPS_-_Facilities
      LAYER_03_NAME=Layers.BPS_-_Buildings
      LAYER_04_NAME=Layers.Sewerage_Man_Holes
      LAYER_05_NAME=Layers.Sewerage_Pump_Stations
      LAYER_06_NAME=Layers.Water_Points
      LAYER_07_NAME=Layers.Roads
      LAYER_08_NAME=Layers.Sewerage_Jump-Ups
      LAYER_09_NAME=Layers.Sewerage_Lines
      LAYER_10_NAME=Layers.Water_Lines
      LAYER_11_NAME=Layers.Cadastral_Boundaries
      LAYER_12_NAME=Layers.Raster_Images
    ...

I can tell from looking at the data which layers are vector and which are raster, but I don't know how to parse this information to decide whether to use ogr2ogr or gdal_translate for the conversion.
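For reference, this is roughly how the layer names can be pulled out of that output in Python. It is only a minimal sketch: the function name parse_layer_names is made up for illustration, and it assumes gdalinfo prints one LAYER_nn_NAME item per line, as it does for this sample.

```python
import re


def parse_layer_names(gdalinfo_text):
    """Return the LAYER_nn_NAME values from `gdalinfo -mdd LAYERS` output.

    Assumes one `LAYER_nn_NAME=value` item per line (hypothetical helper,
    not part of GDAL). Names are returned in ascending index order.
    """
    pattern = re.compile(r"^[ \t]*LAYER_(\d+)_NAME=(.*)$", re.MULTILINE)
    pairs = pattern.findall(gdalinfo_text)
    # Sort numerically on the index so LAYER_10 comes after LAYER_09.
    pairs.sort(key=lambda pair: int(pair[0]))
    return [name.strip() for _, name in pairs]
```

In the Python bindings the same items should also be available without text parsing, as a dict from gdal.Open(path).GetMetadata("LAYERS"). Either way, this only yields names; nothing in it says whether a layer is raster or vector.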
Then I thought I could use ogrinfo and just diff all the layers to deduce which ones are raster, but ogrinfo gives me:
    ...
    1: Cadastral Boundaries (Polygon)
    2: Water Lines (Line String)
    3: Sewerage Lines (Line String)
    4: Sewerage Jump-Ups (Line String)
    5: Roads
    6: Water Points (Point)
    7: Sewerage Pump Stations (Point)
    8: Sewerage Man Holes (Point)
    9: BPS - Buildings (Polygon)
    10: BPS - Facilities (Polygon)
    11: BPS - Water Sources (Point)

So there is no one-to-one correspondence between the two outputs: the names are formatted differently and the layers are listed in a different order.
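The diff idea could be sketched in Python roughly as below. This is a guess based only on the sample above: it assumes the PDF layer name maps to the OGR layer name by dropping everything up to the last dot and turning underscores into spaces, which may not hold for other files. The helper names (normalize, split_layers, ogr_layer_names) are made up for illustration.

```python
def normalize(pdf_layer_name):
    """Map e.g. 'Layers.BPS_-_Water_Sources' to 'BPS - Water Sources'.

    Assumption taken from the sample output only: keep the part after the
    last dot and replace underscores with spaces.
    """
    return pdf_layer_name.rsplit(".", 1)[-1].replace("_", " ")


def split_layers(pdf_layer_names, vector_layer_names):
    """Split PDF layer names into (matched as vector, unmatched)."""
    known_vectors = set(vector_layer_names)
    vector, unmatched = [], []
    for name in pdf_layer_names:
        target = vector if normalize(name) in known_vectors else unmatched
        target.append(name)
    return vector, unmatched


def ogr_layer_names(path):
    """List the vector layer names OGR reports for a file (needs GDAL)."""
    from osgeo import ogr  # imported here so the rest runs without GDAL

    ds = ogr.Open(path)
    if ds is None:
        return []
    return [ds.GetLayer(i).GetName() for i in range(ds.GetLayerCount())]
```

With the sample data, the unmatched list would contain Layers and Layers.Raster_Images, so the leftovers are only candidates: Layers looks like a group node rather than a raster, and a vector layer whose real name contains an underscore or a dot would be misclassified.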
Does anyone know how to have GDAL list the GeoPDF layers and indicate which are raster and which are vector?