Article originally posted on LinkedIn.
I cannot overstate the importance of the data model behind any embedded analytics project.
The data model is the pillar of calculations, data concepts and relationships that will make the visualization output understandable and logical for the end user consuming it.
So what is a data model? A data model is the structure that defines the relationships between data elements.
There are database data models, like the ones you build in a Postgres or MySQL database, and analytical data models, used for performing analytical calculations and creating dashboards. An analytical data model usually combines data from different sources: databases, files, APIs, etc.
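To make that concrete, here is a minimal sketch in Python/pandas of an analytical model that unifies a database table with a reference file. The database, file, and column names are hypothetical, purely to illustrate the idea:

```python
import sqlite3  # stand-in for a Postgres/MySQL connection

import pandas as pd

# Source 1: a transactional table from a database (names are hypothetical)
conn = sqlite3.connect("sales.db")
orders = pd.read_sql(
    "SELECT order_id, customer_id, amount, country_code FROM orders", conn
)

# Source 2: a reference file maintained outside the database
# expected columns: country_code, country, continent
countries = pd.read_csv("countries.csv")

# The analytical data model: a unified, denormalized view ready for dashboards
analytical_model = orders.merge(countries, on="country_code", how="left")
print(analytical_model.head())
```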
A well-designed analytical data model should:
- Provide a unified view of data from multiple sources.
- Provide the basis for your data security.
- Provide meaningful names, i.e. semantics, for the data concepts. The data in databases is usually coded and not fit for the visual output the end user expects; the user should be able to navigate the data effortlessly (see the sketch after this list).
- Provide hierarchical structures with the level of granularity needed by the end user who will be consuming the data output / dashboards. For example, a location hierarchy (how you drill down into location) could be either continent, country, city or state, city, postal code.
- Enable efficient data querying and processing, which is critical for performance. This can be achieved in several ways depending on the underlying queries; drop me a comment if you want to learn more.
- Be maintainable in the long run. A clean and clear data model definition allows for efficient long-term management.
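To make the semantics and hierarchy points more tangible, here is a minimal, self-contained sketch (with hypothetical column names and toy rows standing in for the unified model above) showing coded columns renamed into business-friendly terms and a location-style hierarchy aggregated at two levels of granularity:

```python
import pandas as pd

# Hypothetical rows standing in for the unified analytical model built earlier
analytical_model = pd.DataFrame({
    "order_id": [1, 2, 3],
    "amount": [120.0, 80.0, 45.5],
    "country": ["France", "Spain", "Japan"],
    "continent": ["Europe", "Europe", "Asia"],
})

# Semantics: rename coded database columns to names the end user understands
renamed = analytical_model.rename(columns={
    "order_id": "Order ID",
    "amount": "Order Amount",
    "country": "Country",
    "continent": "Continent",
})

# Hierarchy: drill down from continent to country, at the granularity the dashboard needs
by_continent = renamed.groupby("Continent")["Order Amount"].sum()
by_country = renamed.groupby(["Continent", "Country"])["Order Amount"].sum()
print(by_continent)
print(by_country)
```

In a real project this renaming and hierarchy definition would live in the semantic layer of your analytics tool rather than in ad hoc scripts, but the principle is the same.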
In summary, do not take the creation of your data model for granted! Take into account the data in your different sources, the output the end user needs, and how to keep the model maintainable in the long run.
Take care! (Of you and your data model!)