CrossFilter/JS newbie here.

This question pretty much describes exactly what I'm trying to do but there doesn't seem to be a solution using CrossFilter:

How to return the number of unique values by category using crossfilter?

I have data like this:

    var va = [
        { date: "2014-10-01", id: "1" },
        { date: "2014-10-02", id: "1" },
        { date: "2014-10-03", id: "1" },
        { date: "2014-10-04", id: "1" },
        { date: "2014-10-05", id: "1" },
        { date: "2014-10-01", id: "2" },
        { date: "2014-10-02", id: "2" },
        { date: "2014-10-03", id: "2" },
        { date: "2014-10-04", id: "1" },
        { date: "2014-10-01", id: "3" },
        { date: "2014-10-02", id: "3" },
        { date: "2014-10-03", id: "1" },
        { date: "2014-10-01", id: "4" },
        { date: "2014-10-02", id: "1" },
        { date: "2014-10-01", id: "5" }
    ];

I am trying to get the number of unique ids per date. I would like to group by date and get a count of the unique ids for each date:

    "2014-10-01" - 5
    "2014-10-02" - 3
    "2014-10-03" - 2
    "2014-10-04" - 1
    "2014-10-05" - 1

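For reference, the expected counts can be cross-checked without Crossfilter. This is only a plain-JavaScript sketch: the names `records` and `uniqueIdsPerDate` are made up for illustration, and the data is copied from the sample above.

```javascript
// Plain-JavaScript cross-check of the expected output; no Crossfilter involved.
// `records` and `uniqueIdsPerDate` are hypothetical names used only here.
const records = [
  { date: "2014-10-01", id: "1" }, { date: "2014-10-02", id: "1" },
  { date: "2014-10-03", id: "1" }, { date: "2014-10-04", id: "1" },
  { date: "2014-10-05", id: "1" }, { date: "2014-10-01", id: "2" },
  { date: "2014-10-02", id: "2" }, { date: "2014-10-03", id: "2" },
  { date: "2014-10-04", id: "1" }, { date: "2014-10-01", id: "3" },
  { date: "2014-10-02", id: "3" }, { date: "2014-10-03", id: "1" },
  { date: "2014-10-01", id: "4" }, { date: "2014-10-02", id: "1" },
  { date: "2014-10-01", id: "5" }
];

// Collect the set of ids seen on each date, then report each set's size.
function uniqueIdsPerDate(rows) {
  const idsByDate = new Map();
  for (const { date, id } of rows) {
    if (!idsByDate.has(date)) {
      idsByDate.set(date, new Set());
    }
    idsByDate.get(date).add(id);
  }
  const counts = {};
  for (const [date, ids] of idsByDate) {
    counts[date] = ids.size;
  }
  return counts;
}

const uniqueCounts = uniqueIdsPerDate(records);
console.log(uniqueCounts);
```

This recomputes everything from scratch, so it does not react to filters the way a Crossfilter group does; it is only meant to confirm the target numbers.
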
Currently, I'm trying to follow the answer given in this question

Crossfilter reduce :: find number of uniques

to do the following:

    //Create a Crossfilter instance
    var ndx = crossfilter(va);

    //Define dimensions
    var date_dim = ndx.dimension(function(d) { return d["date"]; });

    //total number of ids per date
    var num_ids_by_date = date_dim.group();

    //unique number of ids per date
    var num_uniq_ids_by_date = date_dim
        .group()
        .reduce(
            function (p, d) {
                if (d.id in p.ids) {
                } else {
                    p.ids[d.id] = 1;
                }
                return p;
            },
            function (p, d) {
                p.ids[d.id]--;
                if (p.ids[d.id] === 0) {
                    delete p.ids[d.id];
                }
                return p;
            },
            function () {
                return {ids: {}};
            });

When I look in the num_uniq_ids_by_date object and call num_uniq_ids_by_date.reduceCount().top(1), it seems to be the same output as num_ids_by_date.top(1).

So, I still don't seem to be getting what I'm looking for and have been stumped for a while.

Any suggestions? Thanks in advance!

  • Seems like you aren't incrementing the counter on add, which will cause you problems. If you put together a working example, it will be easier to diagnose the issue. You could also use a library like Reductio, which supports this: github.com/esjewett/… (plugging my own library, sorry) Commented Jul 28, 2015 at 21:03
  • Thanks for the response Ethan. The reason I don't increment the counter on add is because I don't entirely care about the amount of each particular id, I just would like the number of unique ids. Also, thanks for the library suggestion, I'll definitely check it out. If possible, I would like to keep it to just the CrossFilter library for now while I'm still learning :) Commented Jul 28, 2015 at 21:51
  • If you don't increment on add, but you decrement on remove (which you're doing), you're going to get into an inconsistent state pretty fast. I didn't see your actual question though. Calling num_uniq_ids_by_date.reduceCount() wipes out all your custom group reducers. Just call num_uniq_ids_by_date.top(1). Commented Jul 29, 2015 at 1:35
  • Oops, that was a mistake on my part - thanks for pointing it out. Thanks for the suggestions! I was actually able to get it. I'll be sure to add my answer. Commented Jul 29, 2015 at 19:36
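
The inconsistent state mentioned in the comments can be reproduced without Crossfilter by calling the question's reducers by hand. This is a rough sketch: the function names and the two sample records are made up for illustration, and only the reducer bodies follow the question's code.

```javascript
// Reducers in the shape used in the question: add never increments an
// existing counter, but remove always decrements it.
function reduceAddNoIncrement(p, d) {
  if (!(d.id in p.ids)) {
    p.ids[d.id] = 1;
  }
  return p;
}

function reduceRemove(p, d) {
  p.ids[d.id]--;
  if (p.ids[d.id] === 0) {
    delete p.ids[d.id];
  }
  return p;
}

function reduceInitial() {
  return { ids: {} };
}

// Two records on the same date that share id "1", as on 2014-10-04.
const first = { date: "2014-10-04", id: "1" };
const second = { date: "2014-10-04", id: "1" };

let state = reduceInitial();
state = reduceAddNoIncrement(state, first);
state = reduceAddNoIncrement(state, second); // counter stays at 1

// Filtering out only the first record drops the counter to 0 and deletes
// the id, even though the second record is still in the group.
state = reduceRemove(state, first);
const trackedAfterFirstRemove = "1" in state.ids; // false

// Filtering out the second record decrements undefined, which stores NaN.
state = reduceRemove(state, second);
const counterAfterSecondRemove = state.ids["1"]; // NaN

console.log(trackedAfterFirstRemove, counterAfterSecondRemove);
```

Incrementing on every add keeps the per-id counter equal to the number of records currently in the group, which is what makes the decrement on remove safe.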

1 Answer

Okay I was able to get it.

What I ended up doing is the following:

    //Create a Crossfilter instance
    var ndx = crossfilter(va);

    //Define dimensions
    var date_dim = ndx.dimension(function(d) { return d["date"]; });

    var num_unique_ids_by_date = date_dim
        .group()
        .reduce(
            function (p, d) {
                if (d.id in p.ids) {
                    p.ids[d.id] += 1;
                } else {
                    p.ids[d.id] = 1;
                    p.id_count++;
                }
                return p;
            },
            function (p, d) {
                p.ids[d.id]--;
                if (p.ids[d.id] === 0) {
                    delete p.ids[d.id];
                    p.id_count--;
                }
                return p;
            },
            function () {
                return {ids: {}, id_count: 0};
            });

This gives me the total number of unique ids per date as well as the number of occurrences of each id.
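
To see what each group's value looks like, the same reducers can be run by hand over the sample data. This is only a sketch: `simulateGroup` and the `reduceAdd`/`reduceRemove`/`reduceInitial` names are made up for illustration and merely stand in for the calls Crossfilter makes as records enter or leave a group.

```javascript
// Same logic as the three functions passed to .reduce() above.
function reduceAdd(p, d) {
  if (d.id in p.ids) {
    p.ids[d.id] += 1;
  } else {
    p.ids[d.id] = 1;
    p.id_count++;
  }
  return p;
}

function reduceRemove(p, d) {
  p.ids[d.id]--;
  if (p.ids[d.id] === 0) {
    delete p.ids[d.id];
    p.id_count--;
  }
  return p;
}

function reduceInitial() {
  return { ids: {}, id_count: 0 };
}

// Hypothetical stand-in for the grouping step: one reduced value per date.
function simulateGroup(rows) {
  const groups = {};
  for (const row of rows) {
    if (!(row.date in groups)) {
      groups[row.date] = reduceInitial();
    }
    reduceAdd(groups[row.date], row);
  }
  return groups;
}

const va = [
  { date: "2014-10-01", id: "1" }, { date: "2014-10-02", id: "1" },
  { date: "2014-10-03", id: "1" }, { date: "2014-10-04", id: "1" },
  { date: "2014-10-05", id: "1" }, { date: "2014-10-01", id: "2" },
  { date: "2014-10-02", id: "2" }, { date: "2014-10-03", id: "2" },
  { date: "2014-10-04", id: "1" }, { date: "2014-10-01", id: "3" },
  { date: "2014-10-02", id: "3" }, { date: "2014-10-03", id: "1" },
  { date: "2014-10-01", id: "4" }, { date: "2014-10-02", id: "1" },
  { date: "2014-10-01", id: "5" }
];

const groups = simulateGroup(va);
const oct2 = groups["2014-10-02"];
console.log(oct2); // ids: { "1": 2, "2": 1, "3": 1 }, id_count: 3

// Removing one of the two id "1" records leaves the unique count alone.
reduceRemove(oct2, { date: "2014-10-02", id: "1" });
const countAfterDuplicateRemoved = oct2.id_count; // 3

// Removing the only id "3" record drops the unique count by one.
reduceRemove(oct2, { date: "2014-10-02", id: "3" });
const countAfterLastId3Removed = oct2.id_count; // 2
```

Keeping `id_count` next to the `ids` map means the chart can read the unique count directly instead of counting object keys on every redraw.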

Then when I want to display this in my bar graph using dc.js, I go ahead and use the following code.

    var minDate = date_dim.bottom(1)[0]["date"];
    var maxDate = date_dim.top(1)[0]["date"];

    var timeChart = dc.barChart("#time-chart");

    timeChart
        .width(1500)
        .height(400)
        .margins({top: 10, right: 50, bottom: 30, left: 50})
        .dimension(date_dim)
        .group(num_unique_ids_by_date)
        .valueAccessor(function (d) { return d.value.id_count; })
        .transitionDuration(500)
        .x(d3.time.scale().domain([minDate, maxDate]))
        .elasticY(true)
        .elasticX(true)
        .xAxisLabel("Year")
        .yAxis();

    dc.renderAll();