fluent-plugin-azuresearch is a Fluentd output plugin that writes records to Azure Search.

| fluent-plugin-azuresearch | fluentd | ruby |
|---|---|---|
| >= 0.2.0 | >= v0.14.15 | >= 2.1 |
| < 0.2.0 | >= v0.12.0 | >= 1.9 |

    $ gem install fluent-plugin-azuresearch

To use Microsoft Azure Search, you must first create an Azure Search service in the Azure Portal. You also need an index, the persistent store of documents to which fluent-plugin-azuresearch writes the event stream. See the Azure Search documentation for instructions on both steps.
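
If you prefer to create the index from a script rather than the portal, the following is a minimal sketch of how a "Create Index" REST request could be assembled in Ruby. The helper name `build_create_index_request` is hypothetical, and the `api-version` value is an assumption; use whichever version your service supports. The sketch only builds the request; the commented lines at the end show how it might be sent.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper (not part of this plugin): builds, but does not send,
# a Create Index request for an Azure Search service.
# The default api_version is an assumption.
def build_create_index_request(endpoint, api_key, schema, api_version: '2020-06-30')
  uri = URI.parse("#{endpoint}/indexes?api-version=#{api_version}")
  request = Net::HTTP::Post.new(uri)
  request['Content-Type'] = 'application/json'
  request['api-key'] = api_key
  request.body = JSON.generate(schema)
  [uri, request]
end

# Index schema matching the "messages" index used in the examples of this README.
schema = {
  'name' => 'messages',
  'fields' => [
    { 'name' => 'id', 'type' => 'Edm.String', 'key' => true, 'searchable' => false },
    { 'name' => 'user_name', 'type' => 'Edm.String' },
    { 'name' => 'message', 'type' => 'Edm.String', 'filterable' => false,
      'sortable' => false, 'facetable' => false, 'analyzer' => 'en.lucene' },
    { 'name' => 'created_at', 'type' => 'Edm.DateTimeOffset', 'facetable' => false }
  ]
}

uri, request = build_create_index_request(
  'https://AZURE_SEARCH_ACCOUNT.search.windows.net', 'AZURE_SEARCH_API_KEY', schema
)

# Sending requires a real service name and an admin API key:
# response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
# puts response.code
```

Replace `AZURE_SEARCH_ACCOUNT` and `AZURE_SEARCH_API_KEY` with your own values before sending.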

    <match azuresearch.*>
        @type azuresearch
        @log_level info
        endpoint https://AZURE_SEARCH_ACCOUNT.search.windows.net
        api_key AZURE_SEARCH_API_KEY
        search_index messages
        column_names id,user_name,message,tag,created_at
        key_names postid,user,content,tag,posttime
    </match>

- endpoint (required) - Azure Search service endpoint URI
- api_key (required) - Azure Search API key
- search_index (required) - Azure Search Index name to insert records
- column_names (required) - Column names in the target Azure Search index. Each column needs to be separated by a comma.
- key_names (optional) - Default: nil. Key names in the incoming record to insert. Each key needs to be separated by a comma. ${time} is a placeholder for Time.at(time).strftime("%Y-%m-%dT%H:%M:%SZ"), and ${tag} is a placeholder for the tag. By default, key_names is the same as column_names.

[note] @log_level is a fluentd built-in parameter (optional) that controls verbosity of logging: fatal|error|warn|info|debug|trace (See also Logging of Fluentd)
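
To make the relationship between column_names and key_names concrete, here is a minimal sketch of how one event could be turned into an Azure Search document. It is an illustration under stated assumptions, not the plugin's actual code: `build_document` is a hypothetical helper, and it calls `.utc` before formatting so that the trailing `Z` is accurate regardless of the local time zone, which the expression in the parameter description does not do.

```ruby
# Hypothetical helper (not the plugin's implementation): maps one event to a
# document whose keys are the index column names.
def build_document(tag, time, record, column_names, key_names = nil)
  columns = column_names.split(',')
  # By default, key_names is the same as column_names.
  keys = key_names ? key_names.split(',') : columns
  unless columns.size == keys.size
    raise ArgumentError, 'column_names and key_names must have the same number of entries'
  end

  columns.zip(keys).each_with_object({}) do |(column, key), document|
    document[column] =
      case key
      when '${time}' then Time.at(time).utc.strftime('%Y-%m-%dT%H:%M:%SZ') # .utc is an assumption
      when '${tag}'  then tag
      else record[key]
      end
  end
end

record = { 'postid' => '100', 'user' => 'ladygaga', 'content' => 'post by ladygaga' }
document = build_document(
  'azuresearch.msg', 1454274221, record,
  'id,user_name,message,tag,created_at',
  'postid,user,content,${tag},${time}'
)
puts document
```

Each column takes its value from the key at the same position in key_names, so the two lists must have the same length.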

Suppose you have the following fluent.conf and Azure Search index schema:

fluent.conf

    <match azuresearch.*>
        @type azuresearch
        endpoint https://yoichidemo.search.windows.net
        api_key 2XX3D2456052A9AD21E54CB03C3ABF6A(dummy)
        search_index messages
        column_names id,user_name,message,created_at
    </match>

Azure Search Schema: messages

    {
        "name": "messages",
        "fields": [
            { "name":"id", "type":"Edm.String", "key": true, "searchable": false },
            { "name":"user_name", "type":"Edm.String" },
            { "name":"message", "type":"Edm.String", "filterable":false, "sortable":false, "facetable":false, "analyzer":"en.lucene" },
            { "name":"created_at", "type":"Edm.DateTimeOffset", "facetable":false}
        ]
    }

The plugin will write the event stream out to Azure Search like this:

Input event stream

    { "id": "1", "user_name": "taylorswift13", "message":"post by taylorswift13", "created_at":"2016-01-29T00:00:00Z" },
    { "id": "2", "user_name": "katyperry", "message":"post by katyperry", "created_at":"2016-01-30T00:00:00Z" },
    { "id": "3", "user_name": "ladygaga", "message":"post by ladygaga", "created_at":"2016-01-31T00:00:00Z" }

Search results

    "value": [
        { "@search.score": 1, "id": "1", "user_name": "taylorswift13", "message": "post by taylorswift13", "created_at": "2016-01-29T00:00:00Z" },
        { "@search.score": 1, "id": "2", "user_name": "katyperry", "message": "post by katyperry", "created_at": "2016-01-30T00:00:00Z" },
        { "@search.score": 1, "id": "3", "user_name": "ladygaga", "message": "post by ladygaga", "created_at": "2016-01-31T00:00:00Z" }
    ]

Suppose you have the following fluent.conf and Azure Search index schema:

fluent.conf

    <match azuresearch.*>
        @type azuresearch
        endpoint https://yoichidemo.search.windows.net
        api_key 2XX3D2456052A9AD21E54CB03C3ABF6A(dummy)
        search_index messages
        column_names id,user_name,message,created_at
        key_names postid,user,content,posttime
    </match>

Azure Search Schema: messages

    {
        "name": "messages",
        "fields": [
            { "name":"id", "type":"Edm.String", "key": true, "searchable": false },
            { "name":"user_name", "type":"Edm.String" },
            { "name":"message", "type":"Edm.String", "filterable":false, "sortable":false, "facetable":false, "analyzer":"en.lucene" },
            { "name":"created_at", "type":"Edm.DateTimeOffset", "facetable":false}
        ]
    }

The plugin will write the event stream out to Azure Search like this:

Input event stream

    { "postid": "1", "user": "taylorswift13", "content":"post by taylorswift13", "posttime":"2016-01-29T00:00:00Z" },
    { "postid": "2", "user": "katyperry", "content":"post by katyperry", "posttime":"2016-01-30T00:00:00Z" },
    { "postid": "3", "user": "ladygaga", "content":"post by ladygaga", "posttime":"2016-01-31T00:00:00Z" }

Search results

    "value": [
        { "@search.score": 1, "id": "1", "user_name": "taylorswift13", "message": "post by taylorswift13", "created_at": "2016-01-29T00:00:00Z" },
        { "@search.score": 1, "id": "2", "user_name": "katyperry", "message": "post by katyperry", "created_at": "2016-01-30T00:00:00Z" },
        { "@search.score": 1, "id": "3", "user_name": "ladygaga", "message": "post by ladygaga", "created_at": "2016-01-31T00:00:00Z" }
    ]

fluent.conf

    <match azuresearch.*>
        @type azuresearch
        endpoint https://yoichidemo.search.windows.net
        api_key 2XX3D2456052A9AD21E54CB03C3ABF6A(dummy)
        search_index messages
        column_names id,user_name,message,tag,created_at
        key_names postid,user,content,${tag},${time}
    </match>

Azure Search Schema: messages

    {
        "name": "messages",
        "fields": [
            { "name":"id", "type":"Edm.String", "key": true, "searchable": false },
            { "name":"user_name", "type":"Edm.String" },
            { "name":"message", "type":"Edm.String", "filterable":false, "sortable":false, "facetable":false, "analyzer":"en.lucene" },
            { "name":"tag", "type":"Edm.String" },
            { "name":"created_at", "type":"Edm.DateTimeOffset", "facetable":false}
        ]
    }

The plugin will write the event stream out to Azure Search like this:

Input event stream

    { "postid": "1", "user": "taylorswift13", "content":"post by taylorswift13" },
    { "postid": "2", "user": "katyperry", "content":"post by katyperry" },
    { "postid": "3", "user": "ladygaga", "content":"post by ladygaga" }

Search results

    "value": [
        { "@search.score": 1, "id": "1", "user_name": "taylorswift13", "message": "post by taylorswift13", "tag": "azuresearch.msg", "created_at": "2016-01-31T21:03:41Z" },
        { "@search.score": 1, "id": "2", "user_name": "katyperry", "message": "post by katyperry", "tag": "azuresearch.msg", "created_at": "2016-01-31T21:03:41Z" },
        { "@search.score": 1, "id": "3", "user_name": "ladygaga", "message": "post by ladygaga", "tag": "azuresearch.msg", "created_at": "2016-01-31T21:03:41Z" }
    ]

[note] The value of created_at above is the time when fluentd actually receives the corresponding input event.

    $ git clone https://github.com/yokawasa/fluent-plugin-azuresearch.git
    $ cd fluent-plugin-azuresearch
    # edit CONFIG params of test/plugin/test_azuresearch.rb
    $ vi test/plugin/test_azuresearch.rb
    # run test
    $ rake test
    $ rake build
    $ rake install:local
    # running fluentd with your fluent.conf
    $ fluentd -c fluent.conf -vv &
    # send test input event to test plugin using fluent-cat
    $ echo ' { "postid": "100", "user": "ladygaga", "content":"post by ladygaga"}' | fluent-cat azuresearch.msg

Don't forget that you need a forward input configuration to receive the message from fluent-cat:

    <source>
        @type forward
    </source>

- Input validation for Azure Search - check total size of columns to add
- http://yokawasa.github.io/fluent-plugin-azuresearch
- https://rubygems.org/gems/fluent-plugin-azuresearch
Bug reports and pull requests are welcome on GitHub at https://github.com/yokawasa/fluent-plugin-azuresearch.
| Copyright | Copyright (c) 2016- Yoichi Kawasaki |
| License | Apache License, Version 2.0 |