Python module to read/write BGEN files
At the most simplest, one can extract full variant data with
import bgenpy bgen = bgenpy.Reader("my.bgen") attributes = bgen.attributes() samples = bgen.samples() for info in bgen: print(info[0])If you want to just get a list of variants, then it is faster to read only the variant header information:
import bgenpy bgen = bgen.Reader("my.bgen") bgen.seek_first_variant() try: while True: info = bgen.read_minimal_variant() except StopIteration: passYou can save the offsets to variants:
import bgenpy bgen = bgen.Reader("my.bgen") bgen.seek_first_variant() interesting_variant_offsets = [] try: while True: offset = bgen.offset() info = bgen.read_minimal_variant() if is_interesting(info): interesting_variant_offsets.append(offset) except StopIteration: passAnd later jump to those variants (after re-opening the file):
import bgenpy bgen = bgen.Reader("my.bgen") for offset in interesing_offsets: bgen.seek_to_variant_offset(offset) info = bgen.read_full_variant()Building this module requires the zstd compression library. Then set environment variables:
export ZSTD_INC=/to/where/is/zstd-1.1.0/lib export ZSTD_LIB=/to/where/is/zstd-1.1.0/lib export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ZSTD_LIB