Sunday, October 31, 2010

About Data Mining Concept (2)

Data Mining Languages

  • PMML (Predictive Model Markup Language) provides a standard way to represent data mining models so that they can be shared between different statistical applications. PMML is an XML-based language developed by the Data Mining Group (DMG), an independent group composed of many data mining companies.
  • R is a programming language and software environment for statistical computing and graphics. It has become a de facto standard among statisticians for developing statistical software and is widely used for data analysis.
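
To make the idea of sharing a model as XML more concrete, here is a minimal Python sketch that writes a small linear regression model as a PMML-style document and then scores an input by reading the XML back. The element and attribute names (RegressionModel, RegressionTable, NumericPredictor and so on) follow the PMML regression layout as I understand it, but the XML namespace is omitted and the output is not validated against the DMG schema. The function names build_pmml and score, the field names x1 and x2, and the coefficients are all made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_pmml(intercept, coefficients):
    """Serialize a linear regression model as a PMML-style XML string.

    Illustrative only: no XML namespace and no schema validation.
    """
    root = ET.Element("PMML", version="4.4")
    ET.SubElement(root, "Header", description="illustrative sketch")
    data_dict = ET.SubElement(root, "DataDictionary")
    model = ET.SubElement(root, "RegressionModel", functionName="regression")
    schema = ET.SubElement(model, "MiningSchema")
    table = ET.SubElement(model, "RegressionTable", intercept=str(intercept))
    for name, coef in coefficients.items():
        # Declare each input field, then attach its coefficient to the model.
        ET.SubElement(data_dict, "DataField", name=name,
                      optype="continuous", dataType="double")
        ET.SubElement(schema, "MiningField", name=name)
        ET.SubElement(table, "NumericPredictor", name=name,
                      coefficient=str(coef))
    return ET.tostring(root, encoding="unicode")


def score(pmml_text, inputs):
    """Read the XML back and compute intercept + sum(coefficient * input)."""
    root = ET.fromstring(pmml_text)
    table = root.find("./RegressionModel/RegressionTable")
    total = float(table.get("intercept"))
    for predictor in table.findall("NumericPredictor"):
        total += float(predictor.get("coefficient")) * inputs[predictor.get("name")]
    return total


# Hypothetical model: y = 1.0 + 2.0 * x1 - 0.5 * x2
document = build_pmml(1.0, {"x1": 2.0, "x2": -0.5})
prediction = score(document, {"x1": 3.0, "x2": 4.0})
```

The point of the sketch is that the second function needs only the XML text, not the program that trained the model, which is what lets different applications exchange models. In practice a tool's own PMML exporter and importer would do this work.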

Predictive Modeling

Predictive modeling is the process by which a model is created or chosen to best predict the probability of an outcome. In many cases the model is chosen on the basis of detection theory, to estimate the probability of an outcome given a set amount of input data.
A model can use one or more classifiers to determine the probability that a set of data belongs to a particular class, for example whether an email is spam. Here are some modeling techniques:

  • Naive Bayes
  • k-nearest neighbor algorithm
  • Majority classifier
  • Support vector machines
  • Logistic regression
  • Uplift modeling
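
As a rough illustration of two of the techniques listed above, the following Python sketch implements a majority classifier and a k-nearest neighbor probability estimate using only the standard library. The toy data, the function names and the choice of Euclidean distance are assumptions made for this example, and it requires Python 3.8 or later for math.dist.

```python
import math
from collections import Counter


def majority_classifier(labels):
    """Always predict the most common training label, ignoring the inputs."""
    return Counter(labels).most_common(1)[0][0]


def knn_probability(train, query, k=3, positive=1):
    """Estimate P(label == positive) as the share of positive labels
    among the k training points closest to the query."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    hits = sum(1 for _, label in nearest if label == positive)
    return hits / len(nearest)


# Hypothetical training set: (features, label) pairs
train = [
    ((1.0, 1.0), 0),
    ((1.2, 0.8), 0),
    ((0.9, 1.3), 0),
    ((3.0, 3.2), 1),
    ((3.1, 2.9), 1),
]

baseline = majority_classifier([label for _, label in train])
p_near_positives = knn_probability(train, (2.9, 3.0), k=3)
p_near_negatives = knn_probability(train, (1.0, 1.0), k=3)
```

The majority classifier is mostly useful as a baseline: any other model should predict better than always guessing the most common class. The k-nearest neighbor estimate, by contrast, changes with the query point, which is what makes it a predictive model in the sense described above.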

(To be continued)