Skip to content

A tensorflow MongoDB connector. The connector opens a native link from Tensorflow to your MongoDB database and hence offers superior performance when data is retrieved from this database.

License

Notifications You must be signed in to change notification settings

svenboesiger/tfmongodb

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

23 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

TFMongoDB

Overview

TFMongoDB is a C++ implemented dataset op for google’s tensorflow that allows you to connect to your MongoDatabase natively. Hence you can access your mongodb stored documents more efficiently.

Currently only MacOS X is supported while Tensorflow >= 1.5 is required.

Install

In order to use the dataset you need to install it with pip:

pip install tfmongodb 

Install from source (Linux)

To use the tfmongodb you first have to install mongoc (http://mongoc.org/libmongoc/current/installing.html) and mongocxx (http://mongocxx.org/mongocxx-v3/installation/) libraries (follow the official manual for your distribution).

Afterwards you clone the repo:

git clone https://github.com/svenboesiger/tfmongodb.git 

Change to the directory:

cd tfmongodb 

Create a virtualenv called "venv_tf", switch to the directory, initialize and install tensorflow:

virtualenv venv_tf cd venv_tf/ source bin/activate pip install tensorflow 

Create the makefile:

cd .. cmake . 

Compile and link the library:

make 

Create the pip package:

cd dist/ ./build_wheel.sh 

Install the wheel:

cd .. cd dist pip install TF<Version> 

Usage

TFMongoDB can be accessed through the MongoDBDataset:

dataset = MongoDBDataset(<database_name>, <collection_name>) 

example:

from tfmongodb import MongoDBDataset from tensorflow.python.framework import ops from tensorflow.python.data.ops import iterator_ops import tensorflow as tf CSV_TYPES = [[""], [""], [0]] def _parse_line(line): fields = tf.decode_csv(line, record_defaults=CSV_TYPES) return fields dataset = MongoDBDataset("eccounting", "creditors") dataset = dataset.map(_parse_line) repeat_dataset2 = dataset.repeat() batch_dataset = repeat_dataset2.batch(20) iterator = iterator_ops.Iterator.from_structure(batch_dataset.output_types) #init_op = iterator.make_initializer(dataset) init_batch_op = iterator.make_initializer(batch_dataset) get_next = iterator.get_next() with tf.Session() as sess: sess.run(init_batch_op, feed_dict={}) for i in range(5): print(sess.run(get_next)) 

About

A tensorflow MongoDB connector. The connector opens a native link from Tensorflow to your MongoDB database and hence offers superior performance when data is retrieved from this database.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •