1. API Purpose and Scope
• Develop a RESTful API or GraphQL API to provide seamless communication between client and server applications.
• Define clear endpoints for CRUD operations, data processing, and third-party integrations.
• Support both internal systems and external consumers (e.g., mobile apps, web apps, or partners).
2. Authentication and Security
• Implement secure user authentication using JWT (JSON Web Tokens) or OAuth 2.0.
• Add role-based access control (RBAC) to limit endpoint permissions for different user types.
• Enable SSL encryption and API rate limiting to prevent misuse and DDoS attacks.
3. Scalability and Performance
• Ensure the API is optimized for high performance and can scale with increasing workloads.
• Use caching mechanisms (e.g., Redis) to improve response times for frequently accessed data.
4. Database Integration
• Connect the API to a MariaDB database for structured data storage.
• Define database models and relationships for efficient data management.
• Ensure queries are optimized and protected against SQL injections.
5. API Documentation
• Provide comprehensive documentation using Swagger or Postman for testing and onboarding developers.
• Include endpoint details, request/response examples, and error handling guidelines.
6. Logging and Error Handling
• Implement centralized logging to monitor API requests, errors, and performance.
• Set up robust error-handling middleware to return user-friendly error responses.
7. Testing and CI/CD Pipeline
• Write unit and integration tests to ensure endpoint reliability.
• Automate deployment workflows using a CI/CD pipeline for continuous delivery.
1. Technology Stack
• Backend: Node.js with Express.js or Fastify for building the API.
• Database: MariaDB for relational data storage with Sequelize ORM for easy query management.
• Caching: Redis for quick data access and performance optimization.
• Authentication: JWT-based authentication with bcrypt for password hashing.
2. API Structure
• Set up a modular folder structure:
• routes/: Define endpoints for each resource.
• controllers/: Handle business logic.
• models/: Database schemas.
• middlewares/: Authentication, logging, and error handling.
3. Security Implementation
• Add JWT middleware
4. Database Integration
• Use Sequelize to define models and relationships
5. Performance Optimization
• Use Redis for caching frequently requested endpoints
6. API Documentation
• Use Swagger to generate API documentation
Project Summary
For this project, I am looking for a freelance developer to create a python script that downloads data via and API, and then manipulates that data through google sheets.
Detailed Project Requirements
Step 1 – navigating to the mastersheet
First, please navigate to the following google drive location:
In this folder, you will see the file “input-edited-w-dates”. This sheet contains a large dataset with 26 columns.
Create a copy of this file, name the copy as “input” and place the copy in the following google drive location:
Step 2 – downloading and processing bouncebacks
The next step of the script is to create a create a program that interacts with the https://www.mailwizz.com/ API to download data and automate the creation of csv files for this data.
API Documentation > https://api-docs.mailwizz.com/#introduction
Understanding how Mailwizz works
Mailwizz is an email delivery application that is used to create email campaigns. If you are unfamiliar with Mailwizz, please watch the following videos to familiarize yourself with how the application works:
Introduction - https://www.youtube.com/watch?v=jTaYTjevsog
Creating a subscriber list - https://www.youtube.com/watch?v=VSPxmUvdSxI
Creating Campaigns - https://www.youtube.com/watch?v=gol9NaRWqSs&t=488s
Sending Campaigns - https://www.youtube.com/watch?v=aU9_m5M6cmU
You can watch more videos via the official Mailwizz YouTube account if you wish: https://www.youtube.com/c/mailwizz/videos.
Please take a look through the API documentation by following this link https://api-docs.mailwizz.com/?python#introduction.
Please find the login details and API key needed to log in to my MailWizz account and to access the authentication needed to use the API associated with this account in the following google sheet:
For this project, looking at the API documentation, we are most interested in the “Campaigns” and “Campaign Bounces” commands. Mailwizz is used to create email campaigns to contact a large number of emails, however, sometimes there are a number of emails which cannot be contacted. These emails are labelled by Mailwizz as “Bouncing Emails”, as the emails are delivered, but then “bounce” back, so to speak.
For each campaign created an MailWizz, you can use the API to retrieve the bouncing email data. In order to do this, you must first retrieve the Campaign Unique ID, which is then used in the Campaign Bounces>Get All Bounces API command.
For the python program that we are creating, we need it to do the following:
Step 2: Sub-step 1
· Using the last 7 days from the time the program launches as a date range, get the unique IDs of all campaigns with a start date that falls within that date range.
· For each campaign, using the unique ID, run the “Get All Bounces” API command to download all email bounces for each campaign.
· Store all data from all campaigns in a csv file in the program’s working directory called “bounces”.
· Insert a column in this csv file titled “campaignName”, and place the name of the campaign for which each email bounce belongs to.
Step 2: Sub-step 2
Once the “bounces” csv file has been created, it will have 5 columns:
· campaignName
· Bounce type
· Message
· Date added
We need to manipulate this data, reducing it to 3 columns, as seen in the “bounces-edited.csv” file. The manipulations needed are as follows:
For each row seen in “bounces”:
· Take the value in column C and surround it with square brackets/parenthesis – [ ].
· Concatenate this value with the adjacent value in column D.
· Delete all columns except A and B and place this new concatenated value in column C for all rows. Name column C “Reason”.
Step 3 – manipulate the output from step 2
Now, open the “bounces-edited” file. This file has 3 columns. Column B contains emails.
For all emails in “bounces-edited” column B, if the email can be found in the “input” file’s column H, copy the adjacent value from “bounces-edited” column C into the adjacent cell in “input” column N.
In “input”, create a new sheet called “DOMAINS&MESSAGES” and copy all values from column E, G, Y, I and N from the sheet1 tab into columns A, B, C, D and E respectively in the DOMAINS&MESSAGES tab, naming the columns “companyName”, “emailDomain”, “domainFormat”, “status” and “bounceMessage” respectively.
In DOMAINS&MESSAGES, de-duplicate the table rows based on email addresses.
In DOMAINS&MESSAGES, for all cells in column A, that do not have a value in the adjacent column E cell, populate the adjacent column E cell with the value “delivered”.
In DOMAINS&MESSAGES, create a duplicate of column E and place it to the right in column F, and name it “shortBounceMessage”.
In DOMAINS&MESSAGES, for all cells in column F, delete all characters after the first “]” value in the cell, to trim all messages down to [Hard], [Internal] or [Soft] without any remaining text.
Now, create a new tab in the input file called “SUMMARY”. And create a table that displays data from DOMAINS&MESSAGES in the following columns:
A. Each unique company name.
B. Each unique domain.
C. The domain format.
D. Status
E. Total number of [Hard] messages.
F. Total number of [Internal] messages.
G. Total number of [Soft] messages.
H. Total number of Delivered values.
I. Total number of [Hard], [Internal] and [Soft] messages.
J. Total number of Delivered values, [Hard], [Internal] and [Soft] messages.
K. Total number of Delivered values divided by the (sum of all Delivered values, [Hard], [Internal] and [Soft] messages).
And name the columns as such as seen in the mastersheet.xlsx file:
A. CompanyName
B. Domain
C. Domain Format
D. Status
E. [Hard]
F. [Internal]
G. [Soft]
H. Total Delivered
I. Total Bounced
J. Total Attempted
K. EmailDeliverySuccess%
However, for each row that contains a value of 0 in the “status” column in DOMAINS&MESSAGES, ensure that this data is not displayed in the summary tab. Additionally, for each row that contains a value of 0 in the “status” column, if any rows in the SUMMARY tab contain the corresponding “emailDomain” value in the “Domain” column, delete that row in the SUMMARY tab.
Finally, insert a new column to the right of column A in the SUMMARY tab, and title this column as “Registrations”, and assign a value of 0 to all cells in the table.
Part 4 – measuring emailDeliverySuccess% and pausing campaigns accordingly
Once the summary page has been populated with data, the next part of the script will focus on the “EmailDeliverySuccess%”.
For all values in the EmailDeliverySuccess% column, if the value is greater than or equal to 60%, take the “CompanyName” and “domain format” value for that row, and find all rows in the “sheet1” of the “input” file that have a matching value for “companyName”, and a “domain format” value that does not equal the “domain format” value picked up in the SUMMARY tab.
For all rows that meet this criteria, populate the corresponding cells in the “status” column, with a 0.
Part 5 – take all emails with “0” status, and unsubscribe
For each row in the “sheet1” tab that is assigned a value of “0” in the status column, the value in the email column must be copied and pasted in to the “blacklist” file found in the following google drive folder:
For all emails that are added to this file, the corresponding value in the “reason” column in this “blacklist” file, must be set to “obsolete”.
Part 6 – Adding registrations to the Summary tab
Please navigate to the following google drive location:
In this folder, you will see the “registrion-log” sheet. In this sheet, you will see a list of domains in column “E”. Edit the script so that for each domain value in column E, if the domain is also found in domain column in the “input” file SUMMARY tab, increment the value in the SUMMARY tab “registrations” column by 1 for the corresponding row.
If the domain is not found in domain column in the “input” file SUMMARY tab, append the domain in the domain column in the a new row, in the “input” file SUMMARY tab. Then, each time the domain appears in “registration-log”, increment the value in the SUMMARY tab “registrations” column by 1 for the corresponding row.
Also, each time 1 row is processed in the “registration-log”, the corresponding value in column K for that row must be populated with a “1” integer value.
Part 7 – cloud-based cycle, every three days
For this project, we would like you to develop a python script that runs once every 24 hours, 7 days a week, even when the local device is switched off, therefore this may have to be a cloud-based solution.
Project complete