Data Availability, Data Accuracy, Data Quality
Data profiling examines the data that is available. It provides statistics such as:
- Column stats: type, unique values, missing values
- Potential keys and foreign keys
- Data quality at the column level: missing values, distinct values, ...
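
The column statistics listed above can be sketched in a few lines of plain Python. This is only an illustration, not a replacement for a real profiling tool: `profile_columns` and the sample `rows` are hypothetical names made up for this note, records are assumed to be dicts with hashable values, and `None` or an empty string is assumed to mean "missing". Foreign-key detection is not covered here.

```python
from collections import Counter


def profile_columns(rows):
    """Return per-column stats for a list of dict records (hypothetical helper)."""
    # Collect column names in first-seen order.
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)

    n = len(rows)
    report = {}
    for name in columns:
        values = [row.get(name) for row in rows]
        # Assumption: None and "" both count as missing.
        present = [v for v in values if v is not None and v != ""]
        types = Counter(type(v).__name__ for v in present)
        distinct = len(set(present))
        report[name] = {
            "type": types.most_common(1)[0][0] if types else None,
            "missing": n - len(present),
            "distinct": distinct,
            # A column with no gaps and no repeats is a potential key.
            "candidate_key": n > 0 and len(present) == n and distinct == n,
        }
    return report


rows = [
    {"id": 1, "country": "DE", "email": "a@example.com"},
    {"id": 2, "country": "DE", "email": None},
    {"id": 3, "country": "FR", "email": "c@example.com"},
]
report = profile_columns(rows)
for column, stats in report.items():
    print(column, stats)
```

In this sample, `id` is flagged as a potential key, while `email` is not because one value is missing.
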

| Rating | Type | Topic |
|---|---|---|
| | 📰 | NextRoll - Making 1M Click Predictions per Second using AWS |

- Probabilistic data structures and algorithms (PDSA) are a family of advanced approaches that are optimized to use fixed or sublinear memory and constant execution time.
- They are often based on hashing and have many other useful features.
- However, they also have disadvantages: they cannot provide exact answers and have some probability of error (which can be controlled).
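
A Bloom filter is one well-known member of this family, and a minimal sketch of it shows the trade-off described above: memory is fixed up front, lookups are based on hashing, and the answer "present" can be wrong with a tunable probability. The class below is an illustration only. The `BloomFilter` name, its parameters, and the use of SHA-256 with double hashing are choices made for this note, not something prescribed by the source.

```python
import hashlib
import math


class BloomFilter:
    """Approximate set membership: no false negatives, tunable false positives."""

    def __init__(self, capacity, error_rate=0.01):
        # Standard sizing formulas: m = -n * ln(p) / (ln 2)^2, k = (m / n) * ln 2
        self.num_bits = max(
            1, math.ceil(-capacity * math.log(error_rate) / (math.log(2) ** 2))
        )
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions from two 64-bit halves of one SHA-256 digest
        # (double hashing); an assumption of this sketch, other schemes work too.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


seen = BloomFilter(capacity=1000, error_rate=0.01)
for i in range(1000):
    seen.add(f"user-{i}")

# Items that were added are always reported as present.
print("user-42" in seen)
# Items that were never added are usually, but not always, reported as absent.
false_positives = sum(f"other-{i}" in seen for i in range(10000))
print(false_positives / 10000)
```

Lowering `error_rate` makes the filter allocate more bits and use more hash functions, which is how the probability of error is controlled at the cost of memory.
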