A few improvements
gung - Reinstate Monica

What is the relation between a random variable and a sample (or dataset) in machine learning?

I'm having trouble with machine learning vocabulary, especially with the concept of random variables.

Given a sample $X$ (with features $x_1, x_2, \dots, x_n$) that you train your algorithm on (or predict from), what is the random variable? Is it $X$? Or is it any one of its features $x_1, x_2, \dots, x_n$?

Quoting the Deep Learning book (by Ian Goodfellow):

A random variable is a variable that can take on different values randomly. We typically denote the random variable itself with a lowercase letter in plain typeface

Whereas the Wikipedia article defines a random variable as follows:

A random variable is a measurable function from a set of possible outcomes to a measurable space E
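
As a minimal sketch of this definition, a random variable can be written as an ordinary function on outcomes. The two-dice sample space and the names `OMEGA`, `total`, and `distribution` are illustrative assumptions, not taken from either source.

```python
import itertools
from fractions import Fraction

# Sample space: all ordered outcomes of rolling two fair six-sided dice.
# (Hypothetical example chosen only to make the definition concrete.)
OMEGA = list(itertools.product(range(1, 7), repeat=2))


def total(omega):
    """Random variable: maps an outcome (d1, d2) to the sum of the two dice."""
    d1, d2 = omega
    return d1 + d2


def distribution(random_variable, sample_space):
    """Distribution induced by a random variable, assuming every outcome
    in the sample space is equally likely."""
    probs = {}
    weight = Fraction(1, len(sample_space))
    for omega in sample_space:
        value = random_variable(omega)
        probs[value] = probs.get(value, Fraction(0)) + weight
    return probs


dist = distribution(total, OMEGA)
print(dist[7])  # probability that the two dice sum to 7
```

In this sketch the function `total` is the random variable, and a number such as `total((3, 4))` is one of the values it takes.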

Then we also have the definition of a multivariate random variable:

A multivariate random variable is a column vector (or its transpose, which is a row vector) whose components are scalar-valued random variables on the same probability space as each other.
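
The following sketch illustrates one possible reading of this definition: several scalar random variables defined on the same sample space are stacked into a vector, and each row of a dataset is then one realization of that vector. The dice setup and the names `draw_outcome`, `first_die`, `dice_sum`, `dice_max`, and `random_vector` are hypothetical, and treating a dataset row as a realization is an assumption of the sketch rather than something the quoted definitions state.

```python
import random


def draw_outcome(rng):
    """Draw one outcome omega: the result of rolling two fair dice."""
    return (rng.randint(1, 6), rng.randint(1, 6))


# Three scalar random variables, all defined on the same sample space.
def first_die(omega):
    return omega[0]


def dice_sum(omega):
    return omega[0] + omega[1]


def dice_max(omega):
    return max(omega)


def random_vector(omega):
    """Multivariate random variable: stacks the scalar components."""
    return (first_die(omega), dice_sum(omega), dice_max(omega))


rng = random.Random(0)  # fixed seed so the draws are reproducible
# Each row is one realization of the random vector (one observed "sample").
dataset = [random_vector(draw_outcome(rng)) for _ in range(5)]
for row in dataset:
    print(row)
```

Under this reading, `random_vector` plays the role of the multivariate random variable, while `dataset` holds five observed values of it, one feature per component.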

Should a sample in fact be considered a multivariate random variable?

thomas.g