added 186 characters in body

edited Feb 29, 2016 at 1:24

I am creating a fake dataset, and would like to essentially disaggregate a sum to create dummy rows that I can populate with random dates.

For example, my df might look like this:

    id    orders  skips
    joe        3      0
    mary       2      1
    jack       5      1
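For anyone wanting to reproduce the example, the input could be built roughly as follows. This is only a sketch: the column types (character `id`, numeric counts) are an assumption, since the question shows just the printed values.

```r
# Example input from the question.
# Assumption: id is a character column and the counts are numeric.
df <- data.frame(
  id     = c("joe", "mary", "jack"),
  orders = c(3, 2, 5),
  skips  = c(0, 1, 1),
  stringsAsFactors = FALSE
)

# Each id should end up with orders + skips rows after expansion.
rows_per_id <- df$orders + df$skips
```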

What I want to produce is a data.frame or data.table that looks like this, where a successful order is 1 and a skip is 0:

    id    order
    joe       1
    joe       1
    joe       1
    mary      1
    mary      0
    mary      1
    jack      1
    jack      1
    jack      1
    jack      1
    jack      0
    jack      1

ADDITION: Ideally, the 0 values would be randomly mixed/sandwiched between 1 values if possible. This is due to a quirk of what the dataset will be used for in a problem set.
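One possible way to do this expansion in base R is sketched below. It assumes that `orders` counts the 1 rows and `skips` counts the 0 rows for each id, which matches the example output above. The seed value and the name `expanded` are arbitrary choices for illustration.

```r
set.seed(42)  # arbitrary seed, used only so the shuffle is repeatable

df <- data.frame(
  id     = c("joe", "mary", "jack"),
  orders = c(3, 2, 5),
  skips  = c(0, 1, 1),
  stringsAsFactors = FALSE
)

# One row per event: repeat each id (orders + skips) times.
n_events <- df$orders + df$skips
expanded <- data.frame(
  id = rep(df$id, times = n_events),
  stringsAsFactors = FALSE
)

# For each id, build `orders` ones and `skips` zeros, then shuffle them
# within that id. Indexing with sample.int() avoids the special case where
# sample() treats a single number x as the range 1:x.
expanded$order <- unlist(lapply(seq_len(nrow(df)), function(i) {
  flags <- rep(c(1L, 0L), times = c(df$orders[i], df$skips[i]))
  flags[sample.int(length(flags))]
}))
```

Note that a plain shuffle mixes the 0 values in at random positions; it does not guarantee that every 0 ends up strictly between two 1 values. If that is required, the positions of the zeros would need to be drawn from the interior positions only.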

In a perfect world, I'd then assign a random start_date from a given range to each order within id, such that:

    id    order  date
    joe       1  1/2/2016
    joe       1  1/3/2016
    joe       1  1/8/2016
    mary      1  1/10/2016
    mary      0  1/3/2016
    mary      1  1/5/2016
    jack      1  1/7/2016
    jack      1  1/2/2016
    jack      1  1/1/2016
    jack      1  1/10/2016
    jack      0  1/12/2016
    jack      1  1/15/2016
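The date step could be sketched as below, again in base R. The date range (January 1 to January 15, 2016) is an assumption taken from the values in the example, and the `expanded` table is rebuilt here with placeholder values so the snippet runs on its own.

```r
set.seed(1)  # arbitrary seed for repeatability

# Placeholder table with the same shape as the desired output above.
expanded <- data.frame(
  id    = rep(c("joe", "mary", "jack"), times = c(3, 3, 6)),
  order = c(1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1),
  stringsAsFactors = FALSE
)

# Assumed range of candidate dates; the question does not fix one.
date_pool <- seq(as.Date("2016-01-01"), as.Date("2016-01-15"), by = "day")

# Draw one date per row, with replacement, so an id may get the same
# date more than once.
expanded$date <- sample(date_pool, size = nrow(expanded), replace = TRUE)
```

If dates must be unique within each id, one option would be to sample without replacement separately for each id, provided the pool has at least as many dates as the largest group.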

I initially thought that I could use a combination of dcast and reshape to trick R into making the dataset, e.g. dcast(df, id ~ orders, fun.aggregate = length), but this took me down the wrong path.

But, one must walk before they crawl. Anyone able to help?

added 200 characters in body

edited Feb 28, 2016 at 23:59

roody

asked Feb 28, 2016 at 23:51

roody

Create individual rows based on sum value for fake dataset
