In one of our first projects as OrderStack, we had to build a data warehouse and analytics platform. The challenge was integrating it with the client's existing systems and complementing them with a robust way to process incoming data on a per-client basis.
- Data processing engine
- Insightful and flexible reports
- Readily available integrations
- Avoids repetitive tasks and reduces errors
- Enterprise-grade, visually appealing reports
We discussed with the client how the integration with their existing systems would work. We also mapped out all the functionality the client needed during data processing, along with the input data formats and sources.
We outlined a rough structure for the flow of data through the application. We also found that custom report-generation capabilities had to be provided.
We finally locked down all of the features and functionality of the data warehouse that could be foreseen at that point in time. We also realized that the data processing this application would do was in the range of millions of rows of flat data, with both minor and major mutations applied to it. Since the application was going to handle multiple clients, processing and storage volumes were approaching big-data scale, so we chose MongoDB as the database technology.
- We wrote a NodeJS server exposing a REST API so the entire application could be controlled through the UI itself.
- We developed a custom data processing engine that performs ETL on the data inside MongoDB itself, using the power of aggregation pipelines and indexes to provide seamless data transformation.
- We developed a master-slave system for deploying data processing jobs on compute-heavy servers, allowing many jobs to run in parallel.
- We provided a report creation tool based on MongoDB’s aggregation framework.
- For the analytics UI, we added support for many display elements, which helped present aggregated data as charts, graphs, or tables to create more visually appealing reports.
- We developed integrations with the client's existing systems so that the data in the warehouse could be used by their applications as well.
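To illustrate the kind of report the aggregation-based tooling produces, here is a sketch of a pipeline in the shape MongoDB's aggregation framework expects, paired with a tiny in-memory evaluator so it can be run without a database. The field names (`clientId`, `day`, `amount`) are illustrative assumptions, not the real schema, and the evaluator supports only the two stages shown; in the actual system the pipeline runs inside MongoDB itself.

```javascript
// A hypothetical report pipeline: filter to one client's rows,
// then group them by day and sum the amounts.
const reportPipeline = [
  { $match: { clientId: "client-a" } },
  { $group: { _id: "$day", total: { $sum: "$amount" } } },
];

// Minimal in-memory evaluator for just $match and $group/$sum,
// so the pipeline's behavior can be seen without MongoDB.
function runPipeline(rows, pipeline) {
  let out = rows;
  for (const stage of pipeline) {
    if (stage.$match) {
      out = out.filter((r) =>
        Object.entries(stage.$match).every(([k, v]) => r[k] === v)
      );
    } else if (stage.$group) {
      const keyField = stage.$group._id.slice(1); // strip leading "$"
      const sumField = stage.$group.total.$sum.slice(1);
      const groups = new Map();
      for (const r of out) {
        const acc = groups.get(r[keyField]) || { _id: r[keyField], total: 0 };
        acc.total += r[sumField];
        groups.set(r[keyField], acc);
      }
      out = [...groups.values()];
    }
  }
  return out;
}

const rows = [
  { clientId: "client-a", day: "2020-01-01", amount: 10 },
  { clientId: "client-a", day: "2020-01-01", amount: 5 },
  { clientId: "client-b", day: "2020-01-01", amount: 99 },
];
console.log(runPipeline(rows, reportPipeline));
// One group for client-a on 2020-01-01 with total 15
```

Because the report definition is just data (an array of stage objects), report configurations like this can be stored per client and handed directly to MongoDB's aggregation framework.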
We created a test client and exercised the entire system with custom data processing configs, test data, and test users. This entire process lasted three weeks.
We deployed the main REST app on a server with at least 4 GB of RAM alongside a local MongoDB instance, which served as the data warehouse itself. We also provisioned a compute-heavy server with 8 GB of RAM to do all the data processing, thus separating the processing load from the main application.
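The split between the main app and the compute-heavy processing servers can be sketched as a master that hands jobs to the least loaded worker. This is a simplified in-memory illustration with made-up worker names and job labels, not the actual implementation; the real system dispatched jobs over the network to separate servers.

```javascript
// Sketch of the master side of the job system: keep a queue of
// processing jobs and assign each to the least loaded worker, so
// compute-heavy work never runs on the main REST server.
class Master {
  constructor(workerNames) {
    // Track how many jobs each worker is currently running.
    this.load = new Map(workerNames.map((w) => [w, 0]));
  }

  dispatch(job) {
    // Pick the worker with the fewest running jobs.
    let target = null;
    for (const [worker, n] of this.load) {
      if (target === null || n < this.load.get(target)) target = worker;
    }
    this.load.set(target, this.load.get(target) + 1);
    return target;
  }

  complete(worker) {
    // A worker reports a finished job, freeing capacity.
    this.load.set(worker, this.load.get(worker) - 1);
  }
}

const master = new Master(["worker-1", "worker-2"]);
master.dispatch("etl-client-a");    // goes to worker-1
master.dispatch("etl-client-b");    // goes to worker-2
master.complete("worker-1");
master.dispatch("report-client-a"); // worker-1 is free again
```

Keeping the dispatch logic on the master means workers stay stateless: any number of compute-heavy servers can be added without changing the main application.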