Get distinct results (filtered results) of Splunk Query based on a results field/string value

Question

I have a Splunk query, something like:

index=myIndex* source="source/path/of/logs/*.log" "Elephant"

This brings up about 2,000 results, which are JSON responses from one of my APIs that include the word "Elephant". This is more or less what I want. However, some of these results have duplicate carId fields, and I only want Splunk to show me the unique search results.

The Splunk results look something like this:

MyApiRequests {"carId":3454353435,"make":"toyota","year":"2015","model":"camry","value":25000.00}

Now, I just want to keep the results whose carId values are unique. I don't want duplicates. Thus, I would expect the original count of 2,000 results to decrease quite a bit.

Can anyone help me formulate my Splunk Query to achieve this?

warren · Accepted Answer · 2021-05-06 20:11:33Z

12

stats will be your friend here.

Consider the following:

index=myIndex* source="source/path/of/logs/*.log" "Elephant" carId=* | stats values(*) as * by carId


5 Comments

PainIsAMaster Over a year ago

Interesting. When I try this, I get 0 results back.

RichG Over a year ago

This answer and @Mads Hansen's presume the carId field is already extracted. If it isn't, neither query will work. The fields can be extracted automatically by specifying either INDEXED_EXTRACTIONS=JSON or KV_MODE=json in props.conf. Otherwise, you can use the spath command in a query. Either way, the JSON must be well-formed. For malformed JSON, you can use rex to extract fields.

warren Over a year ago

@RichG - ennth indicated the field seems to be "available" already

PainIsAMaster Over a year ago

Yes, if you do "fields carId" or "carId=*" as the post stated, it will automatically extract the field "carId" with those values. You can see it in the left sidebar of Splunk, where it is listed as an extracted field. For some reason, I can only get this to work with results in my _raw area that are in key=value format. The only thing I can't figure out now is that stats values() never returns unique values for me, despite everyone saying it returns only unique values.

warren Over a year ago

@ennth - are you sure you have the spelling on the field name correct?
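As a rough sketch of the extraction route RichG describes: assuming the events look like the sample in the question (a MyApiRequests prefix followed by a well-formed JSON object), the JSON part can be captured with rex and parsed with spath before grouping. The field name json is a hypothetical placeholder, and the query has not been run against the asker's data:

```
index=myIndex* source="source/path/of/logs/*.log" "Elephant"
| rex field=_raw "(?<json>\{.*\})"
| spath input=json
| search carId=*
| stats values(*) as * by carId
```

If the extraction works, each output row corresponds to one carId, so the row count is the number of distinct carId values. Replacing the last line with `| stats dc(carId)` should return just that count.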

Mads Hansen · Accepted Answer · 2021-05-06 20:06:34Z

9

You could use dedup

index=myIndex* source="source/path/of/logs/*.log" "Elephant" | dedup carId


2 Comments

PainIsAMaster Over a year ago

Okay, I tried piping the results (there were 2,000) into dedup and I get 0 events back. I expected to get a filtered list of the results. I'm assuming that if I had, say, 5 duplicates, only one of them would have been returned to me. Is this how dedup works?

warren Over a year ago

You can use dedup, but you generally shouldn't. It's a very inefficient operation in Splunk.
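For comparison, a stats-based query that keeps one row per carId might look like the sketch below. It assumes carId is already extracted at search time; lastSeen and lastEvent are hypothetical output names chosen for illustration. It is only roughly comparable to dedup carId, which keeps the first event it encounters for each value:

```
index=myIndex* source="source/path/of/logs/*.log" "Elephant" carId=*
| stats count latest(_time) as lastSeen latest(_raw) as lastEvent by carId
```

Here count shows how many events shared each carId, which also makes the duplicates themselves visible. Whether this actually runs faster than dedup depends on the data and the deployment.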
