Confidential CRM Data Synchronization Platform
Project Details
Project Name:
Confidential CRM Data Synchronization Platform
Tech Use:
Node.js (ESM runtime), Express.js, HubSpot CRM APIs, Axios, ssh2-sftp-client, csv-parser, Archiver, p-queue (concurrency management), dotenv, Google Cloud Run, Node.js file system utilities (fs, path)
Overview
For this confidential project, The Webplant developed an automated data synchronization and ETL platform that connects a legacy system with HubSpot CRM. The organization relied on periodic CSV data exports delivered through an SFTP server, but the process of updating CRM records manually was inefficient and prone to errors.
The objective was to build a fully automated pipeline capable of ingesting CSV files, transforming the data into structured formats, and synchronizing it with HubSpot CRM while maintaining accurate relationships between contacts, companies, deals, and other custom entities.
The resulting system ensures that CRM data remains consistently updated without requiring manual intervention from internal teams.
Problem Statement
- Manual CRM Updates
Data generated by a legacy system was delivered as CSV files, requiring manual imports and updates within HubSpot CRM.
- Lack of Data Synchronization
There was no reliable mechanism to ensure that new records or updates from the external system were consistently reflected in HubSpot.
- Missing Entity Relationships
Critical relationships between records, such as which transaction belonged to a contact, company, or deal, were not reliably maintained.
- Operational Inefficiency
The process of managing, cleaning, and importing data consumed significant operational time and increased the risk of data inconsistencies.
Implementation and Capabilities
Automated ETL Pipeline
A headless Node.js service was built to orchestrate the entire ETL workflow through a single endpoint.
When triggered, the system automatically:
- Connects to an SFTP server
- Retrieves new CSV data files
- Converts the files into structured JSON datasets
- Processes and prepares payloads for HubSpot
This automation eliminates the need for manual data imports.
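As an illustration only, the four steps above might be orchestrated along the following lines. The function name runEtl and every injected step (listRemoteCsv, download, parseCsv, buildPayloads, sync) are hypothetical and are not taken from the project's code. Injecting the steps lets the flow be exercised without a real SFTP server or HubSpot account.

```javascript
// Hypothetical orchestration sketch: every step is passed in as a function,
// so the control flow can be tested with in-memory stand-ins.
async function runEtl({ listRemoteCsv, download, parseCsv, buildPayloads, sync }) {
  const summary = { files: 0, rows: 0, synced: 0 };

  // Step 1: connect to the SFTP server and list the new CSV files.
  const remoteFiles = await listRemoteCsv();

  for (const remotePath of remoteFiles) {
    // Step 2: retrieve the file. Step 3: convert it into structured rows.
    const raw = await download(remotePath);
    const rows = await parseCsv(raw);

    // Step 4: prepare the payloads and hand them to the HubSpot sync step,
    // which is assumed to resolve with the number of records it wrote.
    const payloads = buildPayloads(rows);
    summary.synced += await sync(payloads);

    summary.files += 1;
    summary.rows += rows.length;
  }

  return summary;
}
```

In a real service the injected functions would presumably wrap libraries from the stack listed above, such as ssh2-sftp-client for the download step and csv-parser for parsing.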
CSV Data Normalization
Incoming CSV files are parsed and converted into structured datasets.
Key processing steps include:
- Dataset merging and transformation
- Selection of the latest records based on timestamp patterns
- Standardization of data fields for CRM compatibility
The resulting structured data ensures consistent and reliable CRM updates.
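A minimal sketch of two of these steps is shown below. It assumes the timestamp pattern lives in the file name, in the form dataset_YYYYMMDD.csv. That convention is hypothetical, as are the helper names pickLatestFiles and normalizeRow, and the project's actual rules may differ.

```javascript
// Assumed file name convention: <dataset>_<YYYYMMDD>.csv (hypothetical).
const FILE_PATTERN = /^(.+)_(\d{8})\.csv$/;

// Keep only the most recent file for each dataset.
function pickLatestFiles(fileNames) {
  const latest = new Map(); // dataset -> { stamp, name }
  for (const name of fileNames) {
    const match = FILE_PATTERN.exec(name);
    if (!match) continue; // ignore files that do not follow the convention
    const [, dataset, stamp] = match;
    const current = latest.get(dataset);
    // YYYYMMDD strings sort chronologically, so a string comparison is enough.
    if (!current || stamp > current.stamp) latest.set(dataset, { stamp, name });
  }
  return [...latest.values()].map((entry) => entry.name);
}

// Standardize one parsed CSV row: snake_case keys, trimmed values,
// and empty cells dropped so they do not overwrite CRM data with blanks.
function normalizeRow(row) {
  const out = {};
  for (const [key, value] of Object.entries(row)) {
    const field = key
      .trim()
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, '_')
      .replace(/^_+|_+$/g, '');
    const text = String(value ?? '').trim();
    if (field && text !== '') out[field] = text;
  }
  return out;
}
```

For example, a header such as " First Name " becomes first_name, which is closer to the lowercase, underscore-separated style HubSpot uses for internal property names.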
HubSpot Object Upserts
The service performs batch upserts into HubSpot CRM across multiple objects, including:
- Contacts
- Companies
- Deals
- Custom objects
Records are matched using unique identifiers, preventing duplicate entries and ensuring accurate updates.
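The matching and batching described here could be prepared roughly as follows. This is a sketch under stated assumptions: the helper name buildUpsertBatches and the limit of 100 inputs per batch are assumptions, and the { inputs: [{ idProperty, id, properties }] } body reflects the general shape of HubSpot's batch upsert requests, which should be verified against the current API reference.

```javascript
// Assumed maximum number of inputs per batch request.
const BATCH_LIMIT = 100;

// Group records into batch request bodies keyed on a unique identifier.
function buildUpsertBatches(records, idProperty, batchSize = BATCH_LIMIT) {
  // De-duplicate on the identifier; the last occurrence of an id wins.
  const byId = new Map();
  for (const record of records) {
    const id = record[idProperty];
    if (id === undefined || id === null || id === '') continue; // cannot be matched
    byId.set(String(id), record);
  }

  const inputs = [...byId].map(([id, properties]) => ({ idProperty, id, properties }));

  // Split the inputs into chunks no larger than batchSize.
  const batches = [];
  for (let i = 0; i < inputs.length; i += batchSize) {
    batches.push({ inputs: inputs.slice(i, i + batchSize) });
  }
  return batches;
}
```

Each returned object would then be sent as one request, for instance through Axios, with p-queue limiting how many requests run at once. De-duplicating before batching also avoids sending the same identifier twice in a single request.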
Cross-Object Relationship Mapping
To maintain CRM integrity, the system automatically builds associations between related records.
This includes linking:
- contacts to companies
- deals to contacts and companies
- custom objects to their corresponding CRM entities
In-memory ID maps ensure relationships remain intact during batch processing.
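One way such ID maps might work is sketched below. The helper names, the legacy_id property, and the shape of the upsert results (an id plus a properties object) are assumptions made for illustration. The { from: { id }, to: { id } } pairs are a generic representation that would still need to be adapted to the association endpoint actually used.

```javascript
// Build a lookup from the legacy identifier to the HubSpot record id,
// using the results returned by an upsert call (result shape assumed).
function buildIdMap(results, idProperty) {
  const map = new Map();
  for (const result of results) {
    const legacyId = result.properties?.[idProperty];
    if (legacyId) map.set(String(legacyId), result.id);
  }
  return map;
}

// Translate legacy-side links into HubSpot id pairs. Links whose endpoints
// were never synced are returned separately instead of being dropped silently.
function buildAssociations(links, fromMap, toMap) {
  const inputs = [];
  const unresolved = [];
  for (const link of links) {
    const fromId = fromMap.get(String(link.from));
    const toId = toMap.get(String(link.to));
    if (fromId && toId) {
      inputs.push({ from: { id: fromId }, to: { id: toId } });
    } else {
      unresolved.push(link);
    }
  }
  return { inputs, unresolved };
}
```

Keeping the maps in memory for the duration of a run avoids a lookup request per relationship, and the unresolved list gives a simple way to log links that could not be created.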
SFTP Data Lifecycle Management
After processing the datasets, the system:
- packages the processed files into a timestamped ZIP archive
- uploads the archive back to the SFTP server
- removes processed files from both local and remote directories
This ensures that only new datasets are processed in future runs while maintaining a clear audit trail.
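The naming and cleanup logic might look something like this sketch. The helper names, the processed prefix, and the rule that only .csv files are removed are assumptions. The actual archive would be written with Archiver and transferred with ssh2-sftp-client, which is omitted here.

```javascript
// Build a sortable, filesystem-safe archive name from a date.
// The "processed" prefix is a hypothetical default.
function archiveName(date = new Date(), prefix = 'processed') {
  // toISOString() yields e.g. "2024-01-31T09:45:00.000Z"; drop the
  // milliseconds and the separators that are awkward in file names.
  const stamp = date
    .toISOString()
    .replace(/\.\d{3}Z$/, 'Z')
    .replace(/[-:]/g, '');
  return `${prefix}_${stamp}.zip`;
}

// Decide which remote entries to remove after a successful run:
// processed CSV files go, everything else (including archives) stays.
function planCleanup(remoteNames) {
  const toDelete = remoteNames.filter((name) => name.toLowerCase().endsWith('.csv'));
  const toKeep = remoteNames.filter((name) => !toDelete.includes(name));
  return { toDelete, toKeep };
}
```

Using a UTC timestamp in the name keeps archives in chronological order when listed alphabetically, which supports the audit trail mentioned above.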
Scalable Cloud Deployment
The ETL service is deployed on Google Cloud Run, allowing it to run as a lightweight serverless service that can be triggered on demand.
This architecture provides:
- scalable compute resources
- reliable execution environments
- simplified deployment and maintenance
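For illustration, an on-demand trigger suited to Cloud Run could be wired up roughly as follows. Cloud Run passes the listening port through the PORT environment variable. The handler factory, the runEtl argument, the response bodies, and the guard against overlapping runs are all assumptions and are not described in the project itself.

```javascript
// Read the port Cloud Run provides, falling back to 8080 for local runs.
function resolvePort(env = process.env) {
  const port = Number.parseInt(env.PORT ?? '', 10);
  return Number.isInteger(port) && port > 0 ? port : 8080;
}

// Express-compatible handler factory. `runEtl` is a hypothetical function
// that performs one full pipeline run and resolves with a summary object.
function createTriggerHandler(runEtl) {
  let running = false; // per-instance guard against overlapping runs

  return async function handleTrigger(req, res) {
    if (running) {
      res.status(409).json({ status: 'busy' });
      return;
    }
    running = true;
    try {
      const summary = await runEtl();
      res.status(200).json({ status: 'ok', summary });
    } catch (err) {
      res.status(500).json({ status: 'error', message: err.message });
    } finally {
      running = false;
    }
  };
}
```

With Express, the handler could be mounted on a route of the team's choosing and the app started with app.listen(resolvePort()). The guard only protects a single container instance, so limiting the service to one instance, or using an external lock, would be needed to rule out concurrent runs entirely.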
Impact
- Eliminated manual CRM updates by fully automating the data ingestion process.
- Ensured consistent synchronization between legacy systems and HubSpot CRM.
- Improved data accuracy by maintaining relationships between CRM entities.
- Reduced operational overhead through automated ETL workflows.
- Delivered a scalable architecture capable of handling ongoing data imports.
Subtle Agency Signal
Complex HubSpot CRM integrations delivered through automated data pipelines—helping agencies connect legacy systems with modern marketing and sales platforms efficiently.
