Results tagged with
Search options not deleted user 609

Machine Learning is a subfield of computer science that draws on algorithmic analysis, computational statistics, mathematics, and optimization, among other fields. It is mainly concerned with using data to construct models that have high predictive or forecasting ability. Topics include model building, applications, and theory.

2 votes

Prerequisites for Data Science

So far the answers have focused on learning particular methods. They are fine, but they won't make you a Data Scientist. Being a Data Scientist is not solely or even primarily about having mastery of …
MrMeritology
  • 1,840
3 votes

Does reinforcement learning only work on grid world?

Reinforcement learning does not depend on a grid world. It can be applied to any space of possibilities where there is a "fitness function" that maps between points in the space to a fitness metric. …
MrMeritology
  • 1,840
6 votes
Accepted

What is the term for when a model acts on the thing being modeled and thus changes the concept?

There are three terms from social science that apply to your situation: Reflexivity - refers to circular relationships between cause and effect. In particular, you could use the definition of the t …
MrMeritology
  • 1,840
2 votes

How to model this "un predicatability" problem?

tl;dr: This is not a great candidate for machine learning solutions. It is not exactly true that "the strength of an encryption scheme is measured by the randomness (unpredictability, or entropy) o …
MrMeritology
  • 1,840
31 votes
Accepted

How to generate synthetic dataset using machine learning model learnt with original dataset?

The general approach is to do traditional statistical analysis on your data set to define a multidimensional random process that will generate data with the same statistical characteristics. The virt …
MrMeritology
  • 1,840
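As a rough illustration of the approach this excerpt describes (estimate statistical characteristics from the original data, then draw new samples from a random process that shares them), here is a minimal sketch. It assumes the data is two-dimensional and roughly Gaussian, which is an assumption made for this example and not something the answer states. The function names and the toy "original" data are hypothetical.

```python
import math
import random


def fit_gaussian_2d(rows):
    """Estimate the mean vector and covariance entries of 2-D data."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in rows) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in rows) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def sample_gaussian_2d(mean, cov, n, seed=0):
    """Draw n synthetic rows using a Cholesky factor of the covariance."""
    (mx, my), (sxx, sxy, syy) = mean, cov
    l11 = math.sqrt(sxx)
    l21 = sxy / l11
    l22 = math.sqrt(max(syy - l21 ** 2, 0.0))
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        out.append((mx + l11 * z1, my + l21 * z1 + l22 * z2))
    return out


# Hypothetical "original" data set: y depends linearly on x, plus noise.
rng = random.Random(1)
original = []
for _ in range(500):
    x = rng.gauss(10, 2)
    original.append((x, 3 * x + rng.gauss(0, 1)))

mean, cov = fit_gaussian_2d(original)
synthetic = sample_gaussian_2d(mean, cov, 5000, seed=2)
syn_mean, syn_cov = fit_gaussian_2d(synthetic)
```

Comparing `syn_mean` and `syn_cov` with `mean` and `cov` gives a quick check that the synthetic rows reproduce the first and second moments of the original. Real data that is skewed, multimodal, or categorical would need a different generating process.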
0 votes

Looking for algebras designed to transform time series

The most direct and obvious transformation is from time domain to frequency domain. Possible methods include Fourier transform and wavelet transform. After the transform the signal is represented by …
MrMeritology
  • 1,840
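To make the time-domain to frequency-domain transformation mentioned in this excerpt concrete, the sketch below computes a naive discrete Fourier transform of a toy sine wave using only the Python standard library. The signal, its length, and the helper name `dft` are assumptions chosen for illustration. In practice an FFT routine from a numerical library would normally be used instead of this O(n²) loop.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform: one complex coefficient per frequency bin."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# Toy time series: a sine wave that completes 3 cycles over 64 samples.
n = 64
signal = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]

spectrum = dft(signal)

# Only the first half of the bins is informative for a real-valued signal.
magnitudes = [abs(c) for c in spectrum[: n // 2]]
dominant = max(range(len(magnitudes)), key=magnitudes.__getitem__)
```

After the transform, the series is described by a magnitude per frequency bin instead of a value per time step. Here `dominant` identifies the bin that carries most of the signal's energy.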
4 votes
Accepted

Rough vs Fuzzy vs Granular Computing

"Granularity" refers to the resolution of the variables under analysis. If you are analyzing the height of people, you could use coarse-grained variables that have only a few possible values -- e.g. "abo …
MrMeritology
  • 1,840
5 votes

Unstructured text classification

Topic Modeling would be a very appropriate method for your problem. Topic Models are a form of unsupervised learning/discovery, where a specified (or discovered) number of topics are defined by a lis …
MrMeritology
  • 1,840
2 votes

non query-based document ranking

You could use Topic Modeling as described in this paper: http://faculty.chicagobooth.edu/workshops/orgs-markets/pdf/KaplanSwordWin2014.pdf They performed Topic Modeling on abstracts of patents (limit …
MrMeritology
  • 1,840
1 vote

Distance calculation/vector range significance

I created a scoring system ("Thomas Scoring System") to deal with this problem. If you treat "distance" as a similarity score, this system should work for you. http://exploringpossibilityspace.blogsp …
MrMeritology
  • 1,840
2 votes
Accepted

R Script to generate random dataset in 2d space

None of the algorithms you mention are good with data that has uniform distribution. size <- 20 #length of random number vectors set.seed(1) x <- runif(size) # generate samples …
MrMeritology
  • 1,840
5 votes

Data driven approach to define a churn user

Another approach would be to model "churn" (aka "diminished use of the service, including non-use") as a process and not an event. Years ago in retention marketing this was called a "defection funnel …
MrMeritology
  • 1,840
6 votes
Accepted

Word2Vec for Named Entity Recognition

Instead of "recursive neural nets with back propagation" you might consider the approach used by Frantzi et al. at the National Centre for Text Mining (NaCTeM) at the University of Manchester for Termine (s …
MrMeritology
  • 1,840