HDT keeps big RDF datasets compressed while maintaining efficient search and browse operations.
The implementation has the following dependencies:
- Serd v0.28+: enables importing RDF data in the Turtle and N-Triples serialization formats. This dependency is enabled by default.
- libz: enables loading N-Triples files compressed with GZIP (e.g., file.nt.gz) and gzipped HDTs (file.hdt.gz). This dependency is enabled by default.
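Before building, it can help to confirm that both libraries are visible to the build system. The following is a minimal sketch, not part of hdt-cpp: it assumes pkg-config is installed and that the libraries register under the module names serd-0 and zlib. The check_dep helper is a hypothetical name introduced for illustration.

```shell
#!/bin/sh
# Sketch: report whether the two libraries are visible to pkg-config.
# check_dep is a hypothetical helper; the module names serd-0 and zlib
# are assumptions about how the libraries are registered on your system.

check_dep() {
    mod=$1          # pkg-config module name
    min=${2:-}      # optional minimum version
    if ! command -v pkg-config >/dev/null 2>&1; then
        echo "unknown: pkg-config not installed, cannot check $mod"
        return 2
    fi
    if pkg-config --exists "$mod" &&
       { [ -z "$min" ] || pkg-config --atleast-version="$min" "$mod"; }; then
        echo "ok: $mod"
    else
        echo "missing: $mod${min:+ >= $min}"
        return 1
    fi
}

# Each check prints one status line; a failed check adds a hint.
check_dep serd-0 0.28 || echo "hint: Turtle/N-Triples import via Serd may be unavailable"
check_dep zlib        || echo "hint: gzip support (file.nt.gz, file.hdt.gz) may be unavailable"
```

A "missing" line only means pkg-config could not find the module; a library installed in a non-standard location may still be usable.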
The installation process has the following dependencies:
- autoconf
- libtool

The following commands should install both packages:

    sudo apt-get update
    sudo apt-get install autoconf libtool

To compile and install, run the following commands under the directory hdt-cpp. This will generate the library and tools.
First run the following script to generate all necessary installation files with autotools:

    ./autogen.sh

Then, run:

    ./configure
    make -j2

After building, these are the typical operations that you will perform:
- Convert your RDF data to HDT:

  NB: the input stream is assumed to be valid RDF, so you should validate your data before feeding it into rdf2hdt.

      $ libhdt/tools/rdf2hdt data/test.nt data/test.hdt

- Create only the index of an HDT file:

      $ libhdt/tools/hdtSearch -q 0 data/test.hdt

- Convert an HDT to another RDF serialization format, such as N-Triples:

      $ libhdt/tools/hdt2rdf data/test.hdt data/test.hdtexport.nt

- Open a terminal to search triple patterns within an HDT file:

      $ libhdt/tools/hdtSearch data/test.hdt
      >> ? ? ?
      http://example.org/uri3 http://example.org/predicate3 http://example.org/uri4
      http://example.org/uri3 http://example.org/predicate3 http://example.org/uri5
      http://example.org/uri4 http://example.org/predicate4 http://example.org/uri5
      http://example.org/uri1 http://example.org/predicate1 "literal1"
      http://example.org/uri1 http://example.org/predicate1 "literalA"
      http://example.org/uri1 http://example.org/predicate1 "literalB"
      http://example.org/uri1 http://example.org/predicate1 "literalC"
      http://example.org/uri1 http://example.org/predicate2 http://example.org/uri3
      http://example.org/uri1 http://example.org/predicate2 http://example.org/uriA3
      http://example.org/uri2 http://example.org/predicate1 "literal1"
      9 results shown.
      >> http://example.org/uri3 ? ?
      http://example.org/uri3 http://example.org/predicate3 http://example.org/uri4
      http://example.org/uri3 http://example.org/predicate3 http://example.org/uri5
      2 results shown.
      >> exit

- Extract the Header of an HDT file:

      $ libhdt/tools/hdtInfo data/test.hdt > header.nt

- Replace the Header of an HDT file with a new one. For example, by editing the existing one as extracted using hdtInfo:

      $ libhdt/tools/replaceHeader data/test.hdt data/testOutput.hdt newHeader.nt
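The conversion step can also be applied to several files in one go. The following is a minimal sketch, not part of hdt-cpp: hdt_name, convert_all and the RDF2HDT variable are hypothetical names introduced for illustration. It assumes rdf2hdt takes an input path followed by an output path, as in the rdf2hdt example, and that gzipped N-Triples input works when libz support is enabled.

```shell
#!/bin/sh
# Sketch: convert several N-Triples files to HDT with rdf2hdt.
# RDF2HDT may be overridden in the environment; the default is the
# tool path used in this README, relative to the hdt-cpp directory.
RDF2HDT=${RDF2HDT:-libhdt/tools/rdf2hdt}

# Derive the output path from the input path:
# data/test.nt -> data/test.hdt, file.nt.gz -> file.hdt.
hdt_name() {
    case "$1" in
        *.nt.gz) printf '%s\n' "${1%.nt.gz}.hdt" ;;
        *.nt)    printf '%s\n' "${1%.nt}.hdt" ;;
        *)       printf '%s\n' "$1.hdt" ;;
    esac
}

# Convert each argument; if the tool is not built yet, only report
# what would have been written.
convert_all() {
    for input in "$@"; do
        output=$(hdt_name "$input")
        if [ -x "$RDF2HDT" ]; then
            "$RDF2HDT" "$input" "$output" || echo "failed: $input" >&2
        else
            echo "skip: $RDF2HDT not found; would write $output"
        fi
    done
}

# Convert every file given on the command line, if any.
if [ "$#" -gt 0 ]; then
    convert_all "$@"
fi
```

Saved under a name of your choice, for instance convert_all.sh, it could be run from the hdt-cpp directory as `sh convert_all.sh data/*.nt`. As with a single conversion, the inputs are assumed to be valid RDF.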
Contributions and PRs should be sent to the develop branch, and not to master.
hdt-cpp is free software licensed under the GNU Lesser General Public License. See libhdt/COPYRIGHT.