This service processes and imports product data from a large XML file into a PostgreSQL database. It streams the data rather than loading the whole file at once, so large datasets do not exhaust system memory. The service is containerized with Docker, making it easy to deploy and scale.
Features:

- Efficient Data Processing: Uses `lxml.etree.iterparse` for memory-efficient parsing of large XML files (see the sketch after this list).
- Dockerized: Both the application and the PostgreSQL database run in Docker containers.
- Database Integration: Automatically sets up the database schema upon startup.
- Code Quality: Integrates pre-commit hooks for code formatting and linting.
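Under the hood, the combination of streaming parsing and schema setup might look like the following minimal sketch. The `<product>` tag, its `<id>`/`<name>`/`<price>` children, the `products` table, and the `DATABASE_URL` and `XML_FILE_PATH` variables are illustrative assumptions, not this project's actual names:

```python
import os

import psycopg2
from lxml import etree

SCHEMA = """
CREATE TABLE IF NOT EXISTS products (
    id    TEXT PRIMARY KEY,
    name  TEXT,
    price NUMERIC
);
"""

def import_products(xml_path: str) -> None:
    # Connection string and table layout are assumptions for illustration.
    conn = psycopg2.connect(os.environ["DATABASE_URL"])
    with conn, conn.cursor() as cur:
        cur.execute(SCHEMA)  # set up the schema on startup
        # iterparse yields each completed <product> element as it is read,
        # so only one record's subtree is held in memory at a time.
        for _, elem in etree.iterparse(xml_path, tag="product"):
            cur.execute(
                "INSERT INTO products (id, name, price) VALUES (%s, %s, %s)"
                " ON CONFLICT (id) DO NOTHING",
                (elem.findtext("id"), elem.findtext("name"), elem.findtext("price")),
            )
            # Release the processed subtree so memory use stays flat
            # even on multi-gigabyte files.
            elem.clear()
            while elem.getprevious() is not None:
                del elem.getparent()[0]

if __name__ == "__main__":
    import_products(os.environ.get("XML_FILE_PATH", "products.xml"))
```

The `elem.clear()` plus sibling deletion is the standard lxml idiom for keeping `iterparse` memory-flat; without it, the parsed tree still grows with the file.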
Prerequisites:

- Docker
- Docker Compose
- Git (for cloning the repository)
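You can confirm the tools are available before continuing:

```bash
docker --version
docker compose version
git --version
```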
Setup:

- Clone the repository:

  ```bash
  git clone <your-repository-url>
  cd <repository-name>
  ```
- Environment Setup: Copy the example environment file to create your own environment variables file:

  ```bash
  cp .env.example .env
  ```

  Edit the `.env` file to suit your environment settings, such as the database credentials (an illustrative sample appears after these steps).
- Build and Run with Docker Compose: Use Docker Compose to build and run the services defined in the `docker-compose.yml` file:

  ```bash
  docker compose up --build -d
  ```

  This command starts all the required services in detached mode.
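For reference, a typical `.env` for this kind of setup might look like the following. The variable names here are illustrative, so check `.env.example` for the keys the service actually reads:

```
# Illustrative keys only -- see .env.example for the real ones
POSTGRES_USER=postgres
POSTGRES_PASSWORD=change-me
POSTGRES_DB=products
XML_FILE_PATH=/data/products.xml
```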
Once the services are running, the product importer will begin processing the XML file specified in the `.env` file. The data will be inserted into the PostgreSQL database.
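To confirm that rows are arriving, you can query the database from its container. This sketch assumes the database service is named `db` in `docker-compose.yml`, that the standard `POSTGRES_USER`/`POSTGRES_DB` variables are set inside the container, and that the importer writes to a `products` table; adjust the names to match your setup:

```bash
docker compose exec db sh -c \
  'psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -c "SELECT COUNT(*) FROM products;"'
```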
To stop all services, use the following Docker Compose command:

```bash
docker compose down -v
```

Note that the `-v` flag also removes the named volumes, which deletes the imported database data; omit it if you want to keep the data.