SMART
Specific, Measurable, Achievable, Realistic, Time-bound
Big Data 3V
Velocity, Volume, Variety
Method
1. Business Understanding (CRISP-DM only)
2. Data Understanding
3. Data selection and preparation (+ transformation) (in SEMMA: Sample, Explore, Modify)
4. Model (KDD: open to all)
5. Evaluate
6. Implement (CRISP-DM only)
Preprocessing
Cleaning, selection, transformation, feature engineering, dimensionality reduction, and web scraping
support
Frequency of the rule (antecedent and consequent together) over all the data
confidence
Frequency of the rule divided by the frequency of the antecedent
lift
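Confidence of the rule divided by the probability of the consequent
That is, confidence of the rule over the support of the consequent: $\text{lift}(A \Rightarrow B) = \text{confidence}(A \Rightarrow B) / \text{support}(B)$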
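The three rule measures (support, confidence, lift) can be checked by hand on a tiny example. The sketch below is illustrative only: the function names and the four-basket toy dataset are assumptions made for this card, not part of the course material.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """confidence(antecedent -> consequent) / support(consequent)."""
    return confidence(transactions, antecedent, consequent) / support(
        transactions, consequent
    )


# Hypothetical toy baskets, made up for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(transactions, {"bread", "milk"})          # 2 of 4 baskets
rule_confidence = confidence(transactions, {"bread"}, {"milk"})  # 0.5 / 0.75
rule_lift = lift(transactions, {"bread"}, {"milk"})              # confidence / 0.75
```

A lift below 1, as for bread -> milk here, suggests the antecedent makes the consequent slightly less likely than its base rate; a lift above 1 suggests a positive association.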
KDD, SEMMA, CRISP-DM
{Search for and discovery of models}, {SAS; focus on exploration and modeling}, {Business-oriented, flexible, iterative}
Data Mining tools
R, SAS, WEKA, Orange, SQL Server Data Mining, KNIME, RapidMiner, Oracle Data Mining, Google Colab with Python
QQ Plot
Quantile-quantile plot, used to check whether data follow a given distribution: plot the quantiles of the data against the same quantiles of the theoretical distribution; if the points fall along the 45° line, the data follow that distribution. The quantile function is $F^{-1}$ (given a probability, it returns a value).
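As a minimal sketch of how the points of a QQ plot can be built, the code below pairs each sorted data value with the matching quantile of a normal distribution. The choice of the normal distribution, the plotting positions $(i - 0.5)/n$, the function name `qq_points`, and the sample values are all assumptions made for illustration; only the Python standard library is used, and the plotting itself is left out.

```python
from statistics import NormalDist, mean, stdev


def qq_points(data):
    """Return (theoretical quantile, sample quantile) pairs for a normal QQ plot."""
    xs = sorted(data)  # sample quantiles are the ordered observations
    n = len(xs)
    # Assumption: compare against a normal distribution fitted to the sample.
    fitted = NormalDist(mean(xs), stdev(xs))
    # Assumption: plotting positions (i - 0.5) / n, one common convention.
    probs = [(i - 0.5) / n for i in range(1, n + 1)]
    # inv_cdf is the quantile function F^{-1}: probability in, value out.
    theoretical = [fitted.inv_cdf(p) for p in probs]
    return list(zip(theoretical, xs))


# Hypothetical sample, made up for illustration.
sample = [2.0, 4.0, 6.0, 8.0, 10.0]
points = qq_points(sample)
```

Plotting `points` as a scatter and comparing it with the line y = x gives the visual check: the closer the points sit to that line, the better the normal distribution describes the sample.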
Stepwise: forward, backward, stepwise (combined)