$\begingroup$

I am using Python and I have a sample dataset of this kind:

columns = ['product_id', 'market_value2015', 'market_value2016', 'market_value2017', 'market_value2018', 'market_value2019', 'market_value2020', 'market_value2021', 'retired'], 

where market_value2015...2021 are floats and retired is a Boolean.

The objective is to train a neural network that predicts if a product will be retired or not.

Initially, my idea was to use the float variables from 2015 to 2020 to predict retirement in 2021, treating them as a time series, but I got stuck because I've never seen a dataset like this before. I don't know how to make the neural network understand that the float variables are ordered in time from 2015 to 2020, since they all appear as columns on the same level. I hope this is clear.

$\endgroup$

2 Answers

$\begingroup$

How about reshaping the data into a long format, with one row per product and year, like this?

'product_id' | 'Year' | 'market_value' | 'retired'

01222 | 2015 | 20.0 | Yes

01222 | 2016 | 19.0 | Yes

01222 | 2017 | 18.0 | Yes

01222 | 2018 | 17.0 | Yes

. . . .

. . . .

01222 | 2021 | 12.0 | Yes
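A minimal sketch of how the wide table could be reshaped into this long layout with pandas. The toy values, the three-year range, and the names `wide` and `long_df` are assumptions made for illustration, not part of the original data.

```python
import pandas as pd

# Toy wide-format frame; the values are made up for illustration.
wide = pd.DataFrame({
    "product_id": ["01222", "01333"],
    "market_value2015": [20.0, 5.0],
    "market_value2016": [19.0, 6.0],
    "market_value2017": [18.0, 7.0],
    "retired": [True, False],
})

# Turn every market_value<year> column into its own row.
long_df = wide.melt(
    id_vars=["product_id", "retired"],
    var_name="year",
    value_name="market_value",
)

# Extract the year from the old column name, e.g. "market_value2015" -> 2015.
long_df["year"] = (
    long_df["year"].str.replace("market_value", "", regex=False).astype(int)
)

# Order rows by product and year, and match the column order shown above.
long_df = long_df.sort_values(["product_id", "year"]).reset_index(drop=True)
long_df = long_df[["product_id", "year", "market_value", "retired"]]
print(long_df)
```

Note that `retired` is repeated on every row of a product, just as in the table above.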

Alternatively, a retired product has presumably deteriorated over the years. Assuming a linear pattern of deterioration from 2015 to 2021, each product can be summarized by the slope and intercept of a fitted line:

'product_id' | slope | y-intercept | 'retired'

01222 | 2 | 0 | Yes
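One way such slope and intercept features could be computed is an ordinary least-squares line fit per product, sketched here with `numpy.polyfit`. The market values are invented for illustration, and measuring time as years since 2015 is an assumption (it makes the intercept equal to the fitted 2015 value).

```python
import numpy as np

years = np.arange(2015, 2022)  # 2015 .. 2021

# One row per product, one column per year; values are made up.
values = np.array([
    [20.0, 19.0, 18.0, 17.0, 16.0, 15.0, 14.0],  # steadily declining product
    [5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0],         # steadily growing product
])

# Measure time as years since 2015 (an assumption of this sketch).
t = years - years[0]

# With a 2D y, polyfit fits one line per column, so products go in columns.
slope, intercept = np.polyfit(t, values.T, deg=1)

print(slope)      # one slope per product
print(intercept)  # one intercept per product
```

The two resulting arrays can then be used as ordinary tabular features alongside the `retired` label.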

Furthermore, this is "panel data": a short time series per unique product_id, not a regular time series problem. For that reason, do not feed the product_id column to the neural network (or any other ML model) when using that representation. You don't want it to treat every product separately! Instead, replace it with an integer value if possible, perhaps one derived from some product specifications.

Finally, RNNs will probably not be the most useful choice for such short panel data; CNNs (which can capture longitudinal patterns) or perhaps LSTNet may be a better fit.

$\endgroup$
$\begingroup$

From a model point of view, there is no "time" concept.

What you are trying to do is to use a feature vector $x_t$, for $t$ ranging from 2015 to 2020 (why not 2021 as well?).

This is not a time series forecasting problem, but rather a (binary) time series classification problem. Each example is one time series with a single outcome: retired or not.
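As a rough sketch of this framing, each row of the wide table can be turned into one sequence with one label. The toy values and the names `df`, `X`, and `y` are assumptions for illustration; the `(samples, timesteps, features)` layout is the one Keras-style recurrent and 1D convolutional layers typically expect.

```python
import numpy as np
import pandas as pd

# Input columns: market values for 2015 .. 2020.
value_cols = [f"market_value{year}" for year in range(2015, 2021)]

# Toy data in the question's wide layout; the values are made up.
df = pd.DataFrame(
    [
        [1, 20.0, 19.0, 18.0, 17.0, 16.0, 15.0, 14.0, True],
        [2, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, False],
    ],
    columns=["product_id", *value_cols, "market_value2021", "retired"],
)

# One sequence per product: shape (n_products, 6 timesteps, 1 feature).
X = df[value_cols].to_numpy(dtype="float32")[..., np.newaxis]

# One binary label per sequence: 1.0 = retired, 0.0 = not retired.
y = df["retired"].to_numpy(dtype="float32")

print(X.shape, y.shape)
```

Here product_id is left out of `X`, and the column order alone carries the time ordering.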

$\endgroup$
  • $\begingroup$ So I'm not able to use Recurrent Neural Networks, right? In your opinion which model should I use? @Oscar $\endgroup$ Commented Apr 14, 2022 at 7:45
  • $\begingroup$ You can use an RNN to encode your input time series, but the aim of the model is to predict one element (the label) for each time series, in the same way that you predict the class of an iris flower from each 4-feature vector. For instance, have a look at developer.ibm.com/learningpaths/… $\endgroup$ Commented Apr 15, 2022 at 8:29
  • $\begingroup$ Ok, thanks. So just for clarity, there is no concept of a "sequence of several rows", because each row has a different meaning from the others. But each row is still a time series of 7 features (value_2015 to value_2021), right? $\endgroup$ Commented Apr 18, 2022 at 9:55
  • $\begingroup$ that's correct. $\endgroup$ Commented Apr 20, 2022 at 10:57
  • $\begingroup$ Ok, thanks a lot. However, I may have found a way to transform the dataset that introduces the concept of sequences, and so the concept of time, but I would like to know your opinion first. This is the result of the transformation, with the shapes of the data structures reported: ibb.co/8YJk2Bz . In practice I create sequences of shape (9, 1) with a step of 9 (the entire sequence) each time, to differentiate the products and make each sequence different from the others. The first layer of the network is an LSTM layer with an input shape of (None, 9, 1) in this case. Can it work? $\endgroup$ Commented Apr 20, 2022 at 15:53
