What is the Power BI feature for assessing columns, and what are its three classifications?
Column quality: valid, error, empty
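To make the three classifications concrete, here is a minimal Python sketch that computes the share of valid, error, and empty entries in one column. It is not Power BI's own implementation: the `column_quality` and `is_number` helpers and the sample `ages` column are hypothetical, and "error" is approximated here as "does not parse as a number".

```python
def column_quality(values, is_valid):
    """Return the percentage of valid, error, and empty entries in a column."""
    counts = {"valid": 0, "error": 0, "empty": 0}
    for v in values:
        if v is None or str(v).strip() == "":
            counts["empty"] += 1      # nothing stored in the cell
        elif is_valid(v):
            counts["valid"] += 1      # value passes the check
        else:
            counts["error"] += 1      # value present but unusable
    total = len(values)
    return {k: round(100 * c / total, 1) for k, c in counts.items()}


def is_number(v):
    """Assumed validity rule for this example: the value parses as a number."""
    try:
        float(v)
        return True
    except (TypeError, ValueError):
        return False


# Hypothetical column with two empty cells and one bad value
ages = ["34", "27", "", "abc", "41", None, "19", "52"]
quality = column_quality(ages, is_number)
print(quality)  # {'valid': 62.5, 'error': 12.5, 'empty': 25.0}
```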
What are the different ways of storing data?
Flat, normalized, star, snowflake. Normalized is well suited to relational databases; dimensional (star, snowflake) is well suited to data analysis.
OLTP vs OLAP
Online transaction processing and online analytical processing. OLAP is geared more toward data analysis.
Types of business analytics?
Descriptive: what happened? (Power BI)
Predictive: what will happen? (KNIME)
Prescriptive: what should we do? (Excel)
Data preprocessing
Consolidation (access and collect data)
Cleaning (handle missing values and errors in the data)
Transformation (aggregate, add calculations)
Reduction (reduce the number of records and the number of attributes)
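The four preprocessing steps above can be sketched on a toy dataset. Everything here is an illustrative assumption: the two hypothetical store exports, the field names, and the choice of mean imputation for the cleaning step.

```python
from statistics import mean

# Hypothetical raw exports from two sources
store_a = [
    {"region": "North", "sales": 120.0, "note": "x"},
    {"region": "South", "sales": None, "note": "y"},   # missing value
]
store_b = [
    {"region": "North", "sales": 80.0, "note": "z"},
    {"region": "South", "sales": 100.0, "note": "w"},
]

# 1) Consolidation: collect the data into one table
records = store_a + store_b

# 2) Cleaning: replace missing sales with the mean of the known values
known = [r["sales"] for r in records if r["sales"] is not None]
fill_value = mean(known)
cleaned = [
    {**r, "sales": r["sales"] if r["sales"] is not None else fill_value}
    for r in records
]

# 3) Transformation: aggregate sales per region
totals = {}
for r in cleaned:
    totals[r["region"]] = totals.get(r["region"], 0.0) + r["sales"]

# 4) Reduction: drop an attribute that the analysis does not need
reduced = [{"region": r["region"], "sales": r["sales"]} for r in cleaned]

print(totals)  # {'North': 200.0, 'South': 200.0}
```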
Types of reports?
Metric management reports
Dashboard reports: graphical presentation on one page
Balanced scorecard reports: financial, customer, business process, and learning and growth indicators
What is data mining?
Data mining is digging through data to find something valuable.
Types of knowledge discovery in data mining?
Associations: occurrences linked to a single event (when you buy chips, you also buy Coke)
Sequences: events linked over time (buy a house => refrigerator, oven)
Classification: recognizes patterns that describe the group to which an item belongs (decision tree)
Clustering: similar to classification, but the groups are not predefined
Forecasting: uses a series of real values to forecast future values
Occurrences linked to a single event. Example: when people buy corn chips, a cola is also bought 65% of the time (85% of the time if there is a promotion).
Technique: association rule mining (market basket analysis)
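A percentage such as "65% of the time" is the confidence of an association rule. The sketch below computes support and confidence for the rule chips => cola on a handful of made-up baskets; the basket contents and the helper names are assumptions for illustration only.

```python
# Hypothetical shopping baskets
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips"},
    {"cola"},
    {"chips", "cola"},
]


def count(itemset, baskets):
    """Number of baskets that contain every item of the itemset."""
    return sum(1 for b in baskets if itemset <= b)


def support(itemset, baskets):
    """Share of all baskets that contain the itemset."""
    return count(itemset, baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Among baskets with the antecedent, the share that also has the consequent."""
    return count(antecedent | consequent, baskets) / count(antecedent, baskets)


rule_support = support({"chips", "cola"}, baskets)
rule_confidence = confidence({"chips"}, {"cola"}, baskets)
print(rule_support, rule_confidence)  # 0.6 0.75
```

Here cola appears in 3 of the 4 baskets that contain chips, so the rule has a confidence of 75%.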
Uses a series of existing values to forecast what other values will be. Example: predicting future sales or demand. Technique: regression
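As a small illustration of forecasting with regression, the sketch below fits a straight line to five months of made-up sales figures by ordinary least squares and extrapolates to month 6. The data and the `fit_line` helper are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monthly sales history
months = [1, 2, 3, 4, 5]
sales = [100, 110, 120, 130, 140]

slope, intercept = fit_line(months, sales)
forecast_month_6 = slope * 6 + intercept
print(slope, intercept, forecast_month_6)  # 10.0 90.0 150.0
```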
What are the data mining methods?
Classification: how to classify objects/entities based on their characteristics. The most frequently used DM method.
Clustering: automated identification of natural groupings of things (e.g., customers). Unsupervised learning. Used for market segmentation, image processing, anomaly detection.
Association analytics
Association rule mining / market basket analysis
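To show what "automated identification of natural groupings" means for the clustering method, here is a minimal one-dimensional k-means sketch in plain Python. The customer spend values, the starting centroids, and the `k_means_1d` helper are assumptions for illustration; KNIME's k-Means node works on multi-dimensional data and is not reproduced here.

```python
def k_means_1d(values, centroids, iterations=10):
    """Very small k-means for one numeric attribute."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each value joins the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: each centroid moves to the mean of its cluster
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical monthly spend per customer; no labels are given (unsupervised)
spend = [10, 12, 11, 95, 100, 105]
centroids, clusters = k_means_1d(spend, centroids=[0.0, 50.0])
print(centroids)  # [11.0, 100.0]
print(clusters)   # [[10, 12, 11], [95, 100, 105]]
```

The two groups (low spenders and high spenders) emerge from the data alone, which is the difference from classification, where the groups are known in advance.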
Types of nodes in KNIME (4)
I/O: CSV Reader / CSV Writer
Manipulation: Column Filter, Missing Value
Analytics/Mining: Decision Tree Learner, k-Means
Analytics/Statistics: linear regression nodes
NLP & NLU
Natural language processing
Natural language understanding: understanding syntax, semantics, and sentiment. NLG: natural language generation
Text mining: knowledge (patterns, …) discovery from … data
Data (…) … data
Information, relationships, trends; unstructured text
Structured
How can we find the best solution?
From large data:
1) Using mathematical modeling => mathematical formulas, linear programming
2) Using analytical formulas
From small data:
1) Using sensitivity analysis
=> What-if analysis, goal seek
2) Using decision analysis
=> Decision tables, decision trees
Improvement methods (find a good-enough solution):
1) Using simulation (trying it out by experimentation)
2) Using heuristics (if-then rules)
What is linear programming?
A mathematical modeling technique for finding the optimal solution to resource allocation problems
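As a worked example, consider a hypothetical resource allocation problem: maximize profit 3x + 5y subject to x <= 4, 2y <= 12, 3x + 2y <= 18, with x and y non-negative. The sketch below solves it by checking every corner of the feasible region, which works for small two-variable problems with a bounded feasible region; it is not how production solvers work, and the `solve_lp_2d` helper is an assumption made for illustration.

```python
from itertools import combinations


def solve_lp_2d(objective, constraints, eps=1e-9):
    """Maximize objective[0]*x + objective[1]*y subject to a*x + b*y <= c.

    Enumerates intersections of constraint boundaries and keeps the best
    feasible one. Assumes a bounded, non-empty feasible region.
    """
    best_point, best_value = None, float("-inf")
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < eps:
            continue  # parallel boundaries never intersect
        # Cramer's rule for the intersection of the two boundary lines
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + eps for a, b, c in constraints):
            value = objective[0] * x + objective[1] * y
            if value > best_value:
                best_point, best_value = (x, y), value
    return best_point, best_value


constraints = [
    (1, 0, 4),    # x <= 4
    (0, 2, 12),   # 2y <= 12
    (3, 2, 18),   # 3x + 2y <= 18
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
]
point, value = solve_lp_2d((3, 5), constraints)
print(point, value)  # (2.0, 6.0) 36.0
```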
How can we improve models? (Find a good-enough solution)
1) Using simulation (trying it out by experimentation)
2) Using heuristics (if-then rules)
What is big data?
Volume (e.g., big data from social media)
Velocity
Variety
Big data technologies: MapReduce (5 steps)
1 Input split
2 Mapping
3 Shuffling
4 Reducing
5 Final Output
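The five steps can be imitated on a single machine with a word count, the usual teaching example. This is only a sketch of the data flow: real MapReduce distributes the splits across many machines, and the function name, the sample text, and the number of splits are assumptions.

```python
from collections import defaultdict


def map_reduce_word_count(text, n_splits=2):
    lines = text.splitlines()

    # 1) Input split: divide the input lines into chunks
    splits = [lines[i::n_splits] for i in range(n_splits)]

    # 2) Mapping: each chunk emits (word, 1) pairs independently
    mapped = [
        [(word.lower(), 1) for line in split for word in line.split()]
        for split in splits
    ]

    # 3) Shuffling: group all emitted values by key (word)
    shuffled = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            shuffled[key].append(value)

    # 4) Reducing: combine the values of each key into one result
    reduced = {key: sum(values) for key, values in shuffled.items()}

    # 5) Final output: collect the results, sorted by word
    return dict(sorted(reduced.items()))


text = "deer bear river\ncar car river\ndeer car bear"
word_counts = map_reduce_word_count(text)
print(word_counts)  # {'bear': 2, 'car': 3, 'deer': 2, 'river': 2}
```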
What-if analysis
Goal seeking
Sensitivity analysis
What-if: you try a range of different scenarios to see what changes.
Goal seeking: you give the model your goal and let it find the best input.
Sensitivity: you change one input to see how much the result moves.
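The three techniques can be compared on one small model. The sketch below uses a hypothetical profit formula with made-up default values; the goal seek is a simple bisection that assumes profit increases with price, which is only one of several ways such a search could be done.

```python
def profit(price, units=1000, unit_cost=6.0, fixed_cost=2000.0):
    """Hypothetical model: profit as a function of the selling price."""
    return (price - unit_cost) * units - fixed_cost


# What-if analysis: try several scenarios and compare the results
scenarios = {"low": 8.0, "base": 10.0, "high": 12.0}
what_if = {name: profit(price) for name, price in scenarios.items()}
print(what_if)  # {'low': 0.0, 'base': 2000.0, 'high': 4000.0}


# Goal seeking: state the target output and search for the input that reaches it
def goal_seek(f, target, low, high, tol=1e-6):
    """Bisection search; assumes f is increasing between low and high."""
    while high - low > tol:
        mid = (low + high) / 2
        if f(mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2


price_for_target = goal_seek(profit, target=3000.0, low=0.0, high=100.0)
print(round(price_for_target, 3))  # 11.0

# Sensitivity analysis: change one input by 1% and measure the relative change
base = profit(10.0)
bumped = profit(10.0 * 1.01)
relative_change = (bumped - base) / base
print(round(relative_change, 4))  # 0.05
```

In this toy model a 1% price increase moves profit by about 5%, so profit is quite sensitive to price.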