Tim J
  • Member for 3 years, 11 months
  • Last seen more than 1 year ago
  • the Netherlands
comment
Will a pre-trained model work in a totally different data domain?
It may also be well worth your while to look at fine-tuning. It lets you combine the strength of a pre-trained network with the domain-knowledge advantage of training a new one. stats.stackexchange.com/questions/331369/…
reviewed
Looks OK
reviewed
Looks OK
comment
How does Scikit learn KNN handle categorical input variables?
No free lunch, so it's hard to give a straight answer. But I would experiment with the 'jaccard' and 'matching' metrics. Both are listed in the docs, so they should work without extra effort.
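To make the suggestion concrete, here is a minimal plain-Python sketch of what the two metrics compute on one-hot encoded categorical rows, plus a tiny majority-vote KNN. The toy data, the `one_hot` helper, and all function names are assumptions for illustration and are not taken from scikit-learn, whose own implementations may differ in edge cases. It is also worth checking the docs for your installed version to confirm that both metric names are accepted.

```python
from collections import Counter

def one_hot(row, categories):
    """Encode a row of categorical values as a flat list of booleans."""
    return [value == cat for value, cats in zip(row, categories) for cat in cats]

def jaccard_distance(u, v):
    """Disagreeing positions divided by positions where either vector is True."""
    either = sum(a or b for a, b in zip(u, v))
    differ = sum(a != b for a, b in zip(u, v))
    return differ / either if either else 0.0

def matching_distance(u, v):
    """Disagreeing positions divided by the total number of positions."""
    return sum(a != b for a, b in zip(u, v)) / len(u)

def knn_predict(train_X, train_y, query, k, distance):
    """Majority vote among the k training rows closest to the query."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (colour, size) -> label
categories = [["red", "green", "blue"], ["small", "large"]]
rows = [("red", "small"), ("red", "large"), ("blue", "large"), ("green", "large")]
labels = ["a", "a", "b", "b"]
train_X = [one_hot(r, categories) for r in rows]
query = one_hot(("red", "small"), categories)

# Try both metrics on the same query and compare the predictions.
for name, metric in [("jaccard", jaccard_distance), ("matching", matching_distance)]:
    print(name, knn_predict(train_X, labels, query, k=3, distance=metric))
```

The two metrics differ only in the denominator: Jaccard ignores positions where both vectors are False, so it is less diluted when the one-hot encoding has many columns. Which one works better depends on the data, which is why experimenting with both is the practical route.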
answered
comment
What is meant by averaging inhibits it in the paper 'Attention is All You Need'?
I’m afraid that’s a completely different question: “Where does averaging occur in single-head attention mechanism X?” As I said, I’m no expert on attention mechanisms, so I cannot answer your follow-up question without some in-depth research. In that case, it is probably more efficient for you to do that research yourself.