added 186 characters in body
roody

I am creating a fake dataset, and would like to essentially disaggregate a sum to create dummy rows that I can populate with random dates.

For example, my df might look like this:

    id   orders skips
    joe       3     0
    mary      2     1
    jack      5     1

What I want to produce is a data.frame or data.table that looks like this, where a successful order is 1 and a skip is 0:

    id   order
    joe      1
    joe      1
    joe      1
    mary     1
    mary     0
    mary     1
    jack     1
    jack     1
    jack     1
    jack     1
    jack     0
    jack     1

ADDITION: Ideally, the 0 values would be randomly mixed/sandwiched between 1 values if possible. This is due to a quirk of what the dataset will be used for in a problem set.
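For what it's worth, here is a minimal base R sketch of the expansion plus shuffle described above. It is only one possible approach, not something taken from an answer: the helper name expand_one and the seed are made up for illustration, and df is just the toy table from the example.

```r
# Toy input copied from the example above.
df <- data.frame(
  id     = c("joe", "mary", "jack"),
  orders = c(3, 2, 5),
  skips  = c(0, 1, 1),
  stringsAsFactors = FALSE
)

set.seed(42)  # arbitrary seed (assumption), only to make the shuffle repeatable

# Hypothetical helper: one id becomes `orders` ones and `skips` zeros, shuffled.
expand_one <- function(id, orders, skips) {
  flags <- c(rep(1L, orders), rep(0L, skips))
  # Shuffle by position so a length-one vector is never misread by sample().
  flags <- flags[sample.int(length(flags))]
  data.frame(id = rep(id, length(flags)), order = flags,
             stringsAsFactors = FALSE)
}

# Apply the helper row by row and stack the pieces into one long table.
pieces <- unname(Map(expand_one, df$id, df$orders, df$skips))
long <- do.call(rbind, pieces)
rownames(long) <- NULL
```

Each id ends up with orders + skips rows, and the 0s land at random positions among the 1s within that id.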

In a perfect world, I'd then assign a random start_date from a given range to each order within id, such that:

    id   order date
    joe      1 1/2/2016
    joe      1 1/3/2016
    joe      1 1/8/2016
    mary     1 1/10/2016
    mary     0 1/3/2016
    mary     1 1/5/2016
    jack     1 1/7/2016
    jack     1 1/2/2016
    jack     1 1/1/2016
    jack     1 1/10/2016
    jack     0 1/12/2016
    jack     1 1/15/2016
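A rough sketch of the date step, again in base R and again only an assumption about how one might do it. The range (1 to 15 January 2016) is guessed from the dates in the example, the names start, end and days are placeholders, and the 0/1 column is left unshuffled here to keep the focus on the dates.

```r
# Toy input copied from the example above.
df <- data.frame(
  id     = c("joe", "mary", "jack"),
  orders = c(3, 2, 5),
  skips  = c(0, 1, 1),
  stringsAsFactors = FALSE
)

# Assumed date range; swap in whatever range is actually needed.
start <- as.Date("2016-01-01")
end   <- as.Date("2016-01-15")
days  <- seq(start, end, by = "day")

set.seed(1)  # arbitrary seed (assumption), only for repeatability

# One row per order or skip: `orders` ones followed by `skips` zeros per id.
long <- data.frame(
  id    = rep(df$id, df$orders + df$skips),
  order = unlist(Map(function(o, s) rep(c(1L, 0L), c(o, s)),
                     df$orders, df$skips)),
  stringsAsFactors = FALSE
)

# Draw one random day per row; replace = TRUE allows repeated dates.
long$date <- sample(days, nrow(long), replace = TRUE)
```

The date column stays a proper Date here; something like format(long$date, "%m/%d/%Y") would turn it into text if a month/day/year display is wanted.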

I initially thought that I could use a combination of dcast and reshape to trick R into making the dataset, e.g. dcast(df, id ~ orders, fun.aggregate = length), but this took me down the wrong path.

But, one must walk before they crawl. Anyone able to help?

added 200 characters in body

I am creating a fake dataset, and would like to essentially disaggregate a sum to create dummy rows that I can populate with random dates.

For example, my df might look like this:

    id   orders skips
    joe       3     0
    mary      2     1
    jack      5     1

What I want to produce is a data.frame or data.table that looks like this, where a successful order is 1 and a skip is 0:

    id   order
    joe      1
    joe      1
    joe      1
    mary     1
    mary     0
    mary     1
    jack     1
    jack     1
    jack     1
    jack     1
    jack     1
    jack     0

In a perfect world, I'd then assign a random start_date from a given range to each order within id, such that:

    id   order date
    joe      1 1/2/2016
    joe      1 1/3/2016
    joe      1 1/8/2016
    mary     1 1/10/2016
    mary     0 1/3/2016
    mary     1 1/5/2016
    jack     1 1/7/2016
    jack     1 1/2/2016
    jack     1 1/1/2016
    jack     1 1/10/2016
    jack     0 1/12/2016
    jack     1 1/15/2016

I initially thought that I could use a combination of dcast and reshape to trick R into making the dataset, e.g. dcast(df, id ~ orders, fun.aggregate = length), but this took me down the wrong path.

But, one must walk before they crawl. Anyone able to help?


Create individual rows based on sum value for fake dataset

I am creating a fake dataset, and would like to essentially disaggregate a sum to create dummy rows that I can populate with random dates.

For example, my df might look like this:

    id   orders skips
    joe       3     0
    mary      2     1
    jack      5     1

What I want to produce is a data.frame or data.table that looks like this, where a successful order is 1 and a skip is 0:

    id   order
    joe      1
    joe      1
    joe      1
    mary     1
    mary     0
    mary     1
    jack     1
    jack     1
    jack     1
    jack     1
    jack     1
    jack     0

In a perfect world, I'd then assign a random start_date from a given range to each order within id, such that:

    id   order date
    joe      1 1/2/2016
    joe      1 1/3/2016
    joe      1 1/8/2016
    mary     1 1/10/2016
    mary     0 1/3/2016
    mary     1 1/5/2016
    jack     1 1/7/2016
    jack     1 1/2/2016
    jack     1 1/1/2016
    jack     1 1/10/2016
    jack     0 1/12/2016
    jack     1 1/15/2016

But, one must walk before they crawl. Anyone able to help?