I know that RNNs (with LSTM or GRU cells) are currently one of the most promising options for modelling sequential data, where the ordering of the data matters. However, sometimes there are also categorical variables associated with each data point in the sequence that are relevant for modelling the pattern. For example, consider a time series with daily frequency: one could have a sequence like $343, 365, 222, 34, 675, 546$ corresponding to the days-of-year $1, 2, 3, 4, 5, 6$, and another similar sequence corresponding to the days $45, 46, 47, 48, 49, 50$. For some reason, day $7$ could have a high probability of yielding a value very similar to the previous ones, e.g. $388$, while day $51$ could have a high probability of yielding a value two orders of magnitude larger, e.g. $23467$. Therefore, my question is:
How can we model sequential data together with class variables associated with each data point?
It occurred to me that I could think about this as modelling more than one sequence simultaneously: one sequence being the data points (the "data sequence") and the others being the class labels (the "class sequences"), which for some use cases could be deterministic (as in "Tuesday always comes after Monday" or "day 7 comes after day 6", etc.). I would then feed two (or more) synchronized sequences into the network and try to predict the next time-step value for each, e.g.,
    Input: [[  1,   2,   3,  4,   5,   6],    ->  Output: [[  7],
            [343, 365, 222, 34, 675, 546]]                 [388]]

    Input: [[ 45,  46,  47, 48,  49,  50],    ->  Output: [[ 51],
            [342, 368, 242, 43, 612, 549]]                 [23467]]

Is this the correct approach? How should one model an RNN with several inputs? In particular, what's the input and what's the output? Should one use one-hot encoding with a high-dimensional vector? No, right?
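To make the shapes concrete, here is a minimal sketch of how I imagine the two synchronized sequences being stacked into a single input tensor of shape (batch, timesteps, features); this is just my assumption about the layout, using NumPy and a window length of 6:

    import numpy as np

    # Two training windows, each 6 time steps long.
    # Feature 0: day-of-year (categorical, kept as an integer index for now)
    # Feature 1: observed value
    x = np.array([
        [[ 1, 343], [ 2, 365], [ 3, 222], [ 4,  34], [ 5, 675], [ 6, 546]],
        [[45, 342], [46, 368], [47, 242], [48,  43], [49, 612], [50, 549]],
    ], dtype=np.float32)              # shape: (batch=2, timesteps=6, features=2)

    # Targets: the next day index and the next value
    y = np.array([
        [ 7,   388],
        [51, 23467],
    ], dtype=np.float32)              # shape: (batch=2, 2)

    print(x.shape, y.shape)           # (2, 6, 2) (2, 2)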
Another case where I think this could be applied would be trying to model many correlated "data sequences" simultaneously (e.g. temperature time series for several cities in a country).
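For that multivariate case, my (possibly wrong) understanding is that the layout would stay the same, just with one value column per city along the feature axis; a tiny sketch with made-up numbers:

    import numpy as np

    # Hypothetical: 32 windows of 7 daily temperatures for 3 correlated cities
    x_cities = np.random.randn(32, 7, 3).astype(np.float32)   # (batch, timesteps, n_cities)
    y_cities = np.random.randn(32, 3).astype(np.float32)      # next day's temperature, one per city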
Also, how can we implement something like this using TensorFlow?
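I am not sure this is right, but something along these lines is roughly what I have in mind with tf.keras; the layer sizes, the use of an Embedding for the day-of-year instead of one-hot encoding, and all the names are just my assumptions:

    import tensorflow as tf

    N_DAYS = 366        # vocabulary size of the day-of-year "class" variable
    TIMESTEPS = 6       # length of each input window

    # One input for the categorical day indices, one for the numeric values
    day_in = tf.keras.Input(shape=(TIMESTEPS,), dtype="int32", name="day_of_year")
    val_in = tf.keras.Input(shape=(TIMESTEPS, 1), name="value")

    # Learn a dense representation of the categorical variable instead of one-hot
    day_emb = tf.keras.layers.Embedding(input_dim=N_DAYS + 1, output_dim=8)(day_in)  # (batch, 6, 8)

    # Concatenate the embedding with the numeric value at every time step
    x = tf.keras.layers.Concatenate(axis=-1)([day_emb, val_in])                      # (batch, 6, 9)

    x = tf.keras.layers.LSTM(64)(x)
    next_value = tf.keras.layers.Dense(1, name="next_value")(x)

    model = tf.keras.Model(inputs=[day_in, val_in], outputs=next_value)
    model.compile(optimizer="adam", loss="mse")
    model.summary()

    # Training would then look something like:
    # model.fit([day_windows, value_windows[..., None]], next_values, epochs=10)

Is this the right way to combine the "class sequence" with the "data sequence", or should the categorical input be handled differently?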