Elevator Pitch
Design a user-friendly, highly available, and scalable data analytics platform that enables data scientists and analysts to extract significant insights from data. The platform leverages a variety of SaaS/PaaS services and consists of distinct layers: load, store, process, serve, and consume.
Description
The goal of the Data Analytics Platform is to facilitate decision-making based on analytical, transactional, and raw data, and to enable data scientists and analysts to extract significant insights from that data. The platform also gives its stakeholders the opportunity to consume and add value to data. It is designed using a variety of SaaS/PaaS cloud services, organized into distinct layers: load, store, process, serve, and consume. These components are made available and integrated through a standard API layer.
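The five layers named above can be pictured as a simple pipeline. The following is only a minimal in-memory sketch under stated assumptions: the `Platform` class, its method names, the `value` field, and the `raw`/`curated` zone names are hypothetical illustrations and do not represent the actual SaaS/PaaS services or APIs of the platform.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional

Record = Dict[str, object]


@dataclass
class Platform:
    """Hypothetical stand-in for the layered platform (not a real service API)."""

    # Store layer: one list of records per storage zone (zone names are assumptions).
    storage: Dict[str, List[Record]] = field(
        default_factory=lambda: {"raw": [], "curated": []}
    )

    def load(self, records: Iterable[Record]) -> List[Record]:
        """Load layer: ingest incoming records (copied so callers keep their own data)."""
        return [dict(r) for r in records]

    def store(self, zone: str, records: List[Record]) -> None:
        """Store layer: append records to the named zone."""
        self.storage[zone].extend(records)

    def process(self) -> None:
        """Process layer: keep only raw records that have a value, then curate them."""
        curated = [r for r in self.storage["raw"] if r.get("value") is not None]
        self.store("curated", curated)

    def serve(self, predicate: Callable[[Record], bool]) -> List[Record]:
        """Serve layer: expose curated records that match a query predicate."""
        return [r for r in self.storage["curated"] if predicate(r)]


def consume(platform: Platform) -> Optional[float]:
    """Consume layer: an analyst-style query computing the mean of served values."""
    rows = platform.serve(lambda r: True)
    if not rows:
        return None
    return sum(float(r["value"]) for r in rows) / len(rows)


def demo() -> Optional[float]:
    """Run the five layers end to end on three illustrative records."""
    platform = Platform()
    incoming = [
        {"id": 1, "value": 10.0},
        {"id": 2, "value": None},  # dropped by the process layer
        {"id": 3, "value": 20.0},
    ]
    platform.store("raw", platform.load(incoming))
    platform.process()
    return consume(platform)
```

Calling `demo()` loads three records, stores them in the raw zone, curates the two that carry a value, and returns their mean. In a real deployment each method would presumably be replaced by a managed cloud service reached through the API layer; the sketch only shows how the layers hand data to one another.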
The Data Analytics Platform enables:
- Stakeholders to take evidence-based decisions based on analytical, transactional and raw data, and to extract value from data;
- Data scientists and analysts to consume and extract value from platform data;
- Data scientists to collaborate and use predictive and prescriptive analytics pipelines that identify metadata, insights and patterns in structured and unstructured data, using analytics engines for big data processing;
- The outcomes of these pipelines to be exposed through human and machine interfaces (web UIs and services), so that end users and systems can search and consume the results of data analytics activities;
- Advanced analytics such as text mining, NLP and AI to be applied to past and future unstructured information, augmenting it with metadata that lends itself to data-driven interrogation;
- Data governance policies and processes to be implemented and enforced.
The Data Analytics Platform supports data analytics for multiple purposes and serves as the foundation on which services are built, while meeting privacy and information security standards. Specifically, this foundation layer offers a number of critical capabilities and building blocks:
- Common data integration layer that ingests structured, semi-structured and unstructured data, supporting both batch and near real-time capabilities;
- Common hot and cold storage layer with segregated areas for raw data, derived data and curated data. This data can also be segregated by subject area/domain, with security policies applied to each area;
- Common data sets (like Substance, Product, Organization or Referential master data) as a building block for analysis across the board;
- Common multi-purpose, multi-format analysis tools for data scientists;
- Common Business Intelligence and collaboration spaces for power users;