Saturday, 20 July 2024

Developing the RemindMe App: The Journey, the Mistakes, the Lessons, and the Triumphs

RemindMe is an innovative project designed to provide a platform for users to display and manage their favourite sayings, quotes, and images - called reminders - that keep them positively excited and focused. It is a go-to app for users who want to draw inspiration to make every day a perfect day. RemindMe is a user-friendly, web-based reminder application that provides seamless authentication and intuitive management of reminders, including the ability to attach images to reminders for better context.

Our team consisted of two members:

Anozie Innocent Onyekachi: Focused on backend development, API creation, and integration.
Loay Al-Said: Handled the application server, frontend development, user interface design, and image handling integration.

The project timeline spanned about four weeks, during which we collaborated intensively to bring RemindMe to life. RemindMe is designed for anyone who needs a reliable and straightforward way to keep track of reminders that help them focus, stay positive, inspired and motivated.

I focused primarily on the backend aspects of the project, ensuring secure user authentication, developing robust APIs, and implementing data management using SQLAlchemy. Our project helps users create and manage reminders with additional functionality like attaching images to reminders, which adds significant value for users needing visual aids.

The Story Behind RemindMe

Loay had been using notebooks and pens to keep track of his "whys", favourite sayings, daily to-dos, etc. He thought of an app to help him seamlessly do this; that's when the idea of RemindMe was born.

During my early days of learning Software Engineering at ALX, I struggled to keep track of my school assignments and personal projects. I would often forget deadlines, which led to a lot of last-minute stress. So when the time came to choose a partner for the portfolio project and Loay introduced the idea of RemindMe to me, it was easy to say yes. It was a tool I wished I had back then, and one that I hope will help others avoid the same struggles.

Project Accomplishments

RemindMe was built with a strong focus on providing a seamless user experience. We achieved several milestones:

User Authentication: Implemented secure user registration and login using Flask-Login and Flask-JWT. This ensures that user data is protected and only accessible by authorized users.
Reminder Management: Developed a dynamic interface for creating, updating, and deleting reminders. Users can add descriptions, set visibility (public or private), and even attach images to their reminders.
Image Handling: Integrated ImageKit.io for image uploads, allowing users to enhance their reminders with visual content. This feature is particularly useful for users who rely on visual cues.

Technologies Used

Backend: Flask, Flask-Login, Flask-JWT, SQLAlchemy
Frontend: HTML5, CSS3, JavaScript
Database: SQL/MySQL
Image Handling: ImageKit.io

We chose these technologies to ensure a smooth learning curve, since we had been working with them in the foundations phase of our curriculum. Flask provided a flexible framework for API and application server development, while SQLAlchemy facilitated efficient data management. For the frontend, we opted for jQuery, HTML, and CSS to deepen our understanding of the languages and their capabilities.

Key Features

Secure Authentication: Users can register and log in securely, with their sessions managed effectively.
Reminder Management: Users can create, update, and delete reminders with ease, adding detailed descriptions and setting reminders as public or private.
Image Attachments: Users can upload images to their reminders, enhancing the context and visual appeal of each reminder.

The Most Difficult Technical Challenge

One of the most challenging aspects of this project was integrating the image upload functionality. Initially, we faced numerous issues with handling file uploads securely and efficiently. The situation required us to find a reliable solution for storing and retrieving images without compromising user data.

Situation: We needed to implement a feature allowing users to upload images to their reminders, but our initial attempts were filled with security and performance issues.

Task: Our task was to integrate a third-party service, ImageKit.io, to handle image uploads and ensure that the images were securely stored and easily retrievable by our users.

Action: We researched various image handling services and chose ImageKit.io for its robust features and ease of integration. I implemented the backend functionality to handle image uploads, including creating temporary files, uploading them to ImageKit.io, and handling the response to store the image URL in our database. This required thorough testing and debugging to ensure the process was secure and efficient.

Result: After several iterations and extensive testing, we successfully integrated ImageKit.io into our application. Users can now upload images to their reminders, and the images are securely stored and efficiently retrieved, enhancing the overall user experience.
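
The upload flow described in the Action step can be sketched roughly as below. This is a minimal illustration and not the project's actual code: `save_reminder_image`, `ALLOWED_EXTENSIONS`, and the `uploader` callback are hypothetical names, and the callback simply stands in for whatever upload call the ImageKit.io SDK provides, which is deliberately not shown here.

```python
import os
import tempfile
from typing import Callable

# Hypothetical allow-list; the real project may accept other file types.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif"}


def save_reminder_image(filename: str, data: bytes,
                        uploader: Callable[[str], str]) -> str:
    """Write the upload to a temporary file, hand it to the uploader,
    and return the hosted URL to store alongside the reminder."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {ext or 'none'}")

    # delete=False so the file can be reopened by path on every platform.
    tmp = tempfile.NamedTemporaryFile(suffix=ext, delete=False)
    try:
        tmp.write(data)
        tmp.close()
        return uploader(tmp.name)  # stand-in for the third-party upload call
    finally:
        tmp.close()
        os.remove(tmp.name)  # never leave user uploads on local disk
```

The returned URL is what would be saved in the database with the reminder. Passing the uploader in as a function also makes the flow easy to test without touching the network.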

Lessons Learned

This project was a significant learning experience, both technically and personally:

Technical Skills: Deepened my understanding of Flask, SQLAlchemy, and JavaScript. Learned how to integrate third-party services like ImageKit.io effectively.
Problem-Solving: Gained valuable experience in troubleshooting and debugging complex issues, particularly in handling file uploads and securing user data.
Team Collaboration: Improved my ability to work collaboratively with a team, coordinating tasks, and integrating our work seamlessly.

Conclusion

Working on RemindMe has been an incredibly rewarding experience. It has solidified my technical skills and taught me the importance of meticulous planning and collaboration. I am excited to continue developing and refining RemindMe, making it an indispensable tool for anyone needing the services it provides.

About Me

I am Anozie Innocent Onyekachi, a passionate Software Engineer with a keen interest in creating solutions that make everyday tasks easier. You can find more about RemindMe and my other projects on my GitHub. Feel free to connect with me on LinkedIn.

GitHub Link for the Project: RemindMe GitHub
Deployed Project Page: RemindMe Web App
Project’s Landing Page: RemindMe App

Saturday, 22 June 2024

Postmortem of Debugging a Web Outage of iygeal.com, an E-commerce Service/Website

On 18 June 2024, we received a barrage of calls from approximately 500 users complaining that iygeal.com was inaccessible. Our monitoring system had alerted us at almost the same time. The incident lasted about 2 hours before we were able to resolve it.

Issue Summary

Duration of the Outage: June 18, 2024, 14:00 WAT to 16:00 WAT (2 hours)

Impact:

The e-commerce website was completely inaccessible.
Users experienced 404 errors when trying to access any page.
100% of active users were affected.

Root Cause: A misnamed file in the deployment package (class-wp-locale.phpp instead of class-wp-locale.php) caused the application to fail during initialization.

Timeline

14:00 WAT: Issue detected by automated monitoring system alerting about a sudden spike in 404 errors.

14:05 WAT: Incident response team headed by Loay was notified via Slack.
14:10 WAT: Initial investigation by on-call engineer focused on potential web server misconfigurations.
14:15 WAT: Engineer Innocent used tmux to create two terminal instances for simultaneous debugging.
14:20 WAT: Ran curl -sI 127.0.0.1 on one terminal to test the local server response, which showed a 500 Internal Server Error.
14:25 WAT: Attached strace to the Apache process using sudo strace -p <apache_pid> on the second terminal to trace system calls and signals.
14:30 WAT: strace output revealed an attempt to open /var/www/html/wp-includes/class-wp-locale.phpp, which resulted in an ENOENT (No such file or directory) error.
14:40 WAT: Misleading path: Assumed the issue was due to a misconfigured database connection.
15:00 WAT: Escalated to the DevOps team to verify the deployment process.
15:10 WAT: DevOps team confirmed the typo in the filename (.phpp instead of .php).
15:20 WAT: Developed a Puppet script to automate the correction of .phpp to .php in the deployment files.
15:30 WAT: Deployed the Puppet script, which scanned the affected directory and corrected the file extension.
15:40 WAT: Retried the deployment with the corrected files.
16:00 WAT: A status code of 200 indicated the system was fully operational, and users confirmed the site was accessible.

Detailed Root Cause and Resolution

Root Cause

The root cause of the outage was a typo in the deployment package where a critical file was named class-wp-locale.phpp instead of class-wp-locale.php. This incorrect filename caused the application to fail during initialization, leading to 404 errors across the site.

Resolution

The issue was identified using strace to trace system calls and detect the incorrect filename. A Puppet script was then developed and deployed to automate the correction of the typo in the deployment files. Once the deployment was retried with the corrected files, the application started successfully, and the site became accessible to users.

Corrective and Preventative Measures

Improvements:

Implement automated checks to verify the completeness and correctness of deployment packages before deployment.
Enhance monitoring to include checks for critical configuration files and common file naming conventions.
Update the deployment process to include a pre-deployment verification step.
Improve incident response procedures to utilize debugging tools like strace, tmux, and curl more effectively.

To-Do's:

1. Patch Deployment Script: Update the deployment script to include a verification step for critical files and naming conventions.

2. Add Monitoring: Implement file existence monitoring for critical configuration files and extensions.

3. Review Deployment Checklist: Revise the deployment checklist to ensure all necessary files are correctly named and included.

4. Conduct Training: Train the deployment team on the updated process, including the use of debugging tools like strace, tmux, and curl.

5. Post-Mortem Review: Schedule a review meeting to discuss the incident and the new measures with the entire engineering team.

By taking these measures, we aim to prevent similar issues from occurring in the future and improve our overall deployment and monitoring processes.
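
As a rough idea of what the pre-deployment verification step could look like, here is a small Python sketch. It is only an illustration under my own assumptions: `verify_package`, `CRITICAL_FILES`, and `SUSPICIOUS_EXTENSIONS` are hypothetical names, and the list of critical files would need to match the real application.

```python
import os

# Hypothetical list of files the application cannot start without.
CRITICAL_FILES = ["wp-includes/class-wp-locale.php"]
# Extensions that are almost certainly typos of ".php".
SUSPICIOUS_EXTENSIONS = (".phpp", ".pphp")


def verify_package(root: str) -> list[str]:
    """Return a list of human-readable problems found under root."""
    problems = []
    # 1. Every critical file must exist under its exact name.
    for rel_path in CRITICAL_FILES:
        if not os.path.isfile(os.path.join(root, rel_path)):
            problems.append(f"missing critical file: {rel_path}")
    # 2. No file anywhere in the package may carry a typo-like extension.
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(SUSPICIOUS_EXTENSIONS):
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                problems.append(f"suspicious extension: {rel}")
    return problems
```

A deployment script could call `verify_package` on the package directory and refuse to continue if the returned list is not empty.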

Saturday, 25 May 2024

What happens when you type google.com in your browser and press Enter?

Computers like numbers. A lot. To effectively communicate with them, one has to talk numbers. The Internet, to put it in simple terms, is a collection of lots of computers communicating with one another across the world.

For context, content on the internet is actually files saved on other computers in various locations. The computers looking for and trying to access files are called clients, while the ones with the files to serve are aptly called servers. So when you type 'google.com' - a domain name - in your web browser, you are basically trying to access the content available at that location. Whether or not you are able to access this content, and how you access it, depends on a series of processes that happen very fast and that you don't see. But don't worry, I am going to explain them to you.

DNS Lookup

DNS stands for Domain Name System. Picture yourself trying to visit a friend whose house you do not know. You know his name (domain name) but do not know his exact address. Luckily, you have a contact book where you can look it up. DNS represents this contact book in the internet world.

When you type 'google.com' in the browser, the first thing that happens is a DNS lookup. Your browser acts like you and asks the DNS server, "Hey, what's the IP address for google.com?". The IP (Internet Protocol) address is like a unique identifier for every computer on the internet, similar to your friend's exact house address. Remember I told you that computers like numbers? The IP address is a bunch of numbers. In this case, these numbers would represent 'google.com'.

DNS Lookup Processes

Searching Browser Cache: The browser cache is a local storage mechanism used by web browsers to store copies of web resources (such as IP addresses, HTML files, images, stylesheets, and scripts) on the user’s device. So when you type 'google.com' in your browser's address box and press the Enter key, your browser first checks its cache to see if it recently resolved the domain and saved the IP address. If this is the case, it uses this IP address and the search process stops there.
Searching Operating System Cache: Like the browser, the Operating System (OS) has its cache. So if the browser does not find the IP address, it asks the OS to check its recently resolved domain names.
Router Cache: Your router also has cache memory which serves the same purpose. So if neither the browser nor the OS has the IP address, the next in line is the router.
Searching ISP DNS Server: If your router does not know the IP, the query is forwarded to your Internet Service Provider (ISP) to search its DNS server. ISPs normally have their own DNS servers that store recent requests to speed up the process.
Recursive DNS Servers: If the DNS server of your ISP fails to resolve the domain name to an IP address, it performs a recursive DNS query, which means it will query other DNS servers on the internet to find the IP address for 'google.com'. This involves many steps:

Root DNS Servers: The ISP's DNS server first contacts a root DNS server, which directs it to the DNS servers for the top-level domain. The Top-Level Domain (TLD) is the last part of a web address, following the final dot (.). Examples include .com, .org, .edu, etc. In this case, the TLD is .com.
TLD DNS Servers: The root server points the ISP servers to the TLD DNS server responsible for .com domains.
Authoritative DNS Servers: The TLD server then points to the authoritative DNS server that knows the IP address for our domain name 'google.com'.

The Response: Once the authoritative server returns the IP address, the ISP's DNS server passes it back down the chain - to your router, then to your operating system, and finally to your browser. The answer does not travel back through the TLD or root servers, because they only pointed the ISP's server in the right direction. Along the way, the IP address is cached (saved) at each level for future use.
Connection: Now that your browser has found the IP address for 'google.com', it can connect to the web server at that address. I will explain.
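
The lookup order above can be sketched in a few lines of Python. This is a toy model and not how a real resolver is written: the server names and the 192.0.2.10 address (taken from a range reserved for documentation) are made up, and `lookup` and `resolve_via_hierarchy` are names I chose for illustration.

```python
# Toy data: every name server and address here is invented for illustration.
ROOT = {"com": "tld-com"}                        # root knows who serves each TLD
TLD = {"tld-com": {"google.com": "ns-google"}}   # TLD knows the authoritative server
AUTHORITATIVE = {"ns-google": {"google.com": "192.0.2.10"}}


def resolve_via_hierarchy(domain: str) -> str:
    """Walk root -> TLD -> authoritative, as the ISP's DNS server would."""
    tld_server = ROOT[domain.rsplit(".", 1)[-1]]
    auth_server = TLD[tld_server][domain]
    return AUTHORITATIVE[auth_server][domain]


def lookup(domain: str, caches: list[dict]) -> tuple[str, str]:
    """Check each cache in order (browser, OS, router, ISP). On a miss
    everywhere, ask the hierarchy and save the answer in every cache."""
    names = ["browser", "os", "router", "isp"]
    for name, cache in zip(names, caches):
        if domain in cache:
            return cache[domain], name
    ip = resolve_via_hierarchy(domain)
    for cache in caches:
        cache[domain] = ip
    return ip, "hierarchy"
```

Calling `lookup("google.com", [{}, {}, {}, {}])` walks the full hierarchy the first time. A second call with the same cache list is answered straight from the browser cache.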

TCP/IP Connection

Once your browser resolves 'google.com' to an IP address, it needs to establish a connection to the server at that IP address. This is done using the TCP/IP protocol suite, which stands for Transmission Control Protocol/Internet Protocol. Here is a breakdown of how this connection is made:

IP (Internet Protocol): IP is responsible for addressing and routing packets of data so they can travel across networks and arrive at the correct destination. Think of IP here as the system that guides packets of data like letters in an envelope across the internet.
TCP (Transmission Control Protocol): TCP works on top of IP to ensure reliable, ordered, and error-checked delivery of data between applications running on hosts communicating via an IP network. It establishes a connection between the client (your browser) and the server (google.com) before any data is exchanged. This is done through a process called the TCP Three-Way Handshake.

TCP Three-Way Handshake:

SYN (Synchronize) Packet: Your computer (the client) sends a TCP packet to the server with the SYN flag set. This packet asks the server if it is open for a new connection and includes an initial sequence number (a random number used to keep track of the connection).
SYN-ACK (Synchronize-Acknowledge) Packet: The server responds with a SYN-ACK packet. This packet acknowledges receipt of the client's SYN packet and includes the server's own initial sequence number.
ACK (Acknowledge) Packet: Finally, the client sends an ACK packet back to the server, acknowledging receipt of the server's SYN-ACK packet. At this point, the connection is established, and data transfer can begin.
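
To see how the numbers line up, here is a small simulation of the three segments. It does not open a real connection (your operating system does that for you); `three_way_handshake` is just an illustrative name, and the only detail it adds to the description above is that each side acknowledges the other's initial sequence number plus one.

```python
def three_way_handshake(client_isn: int, server_isn: int) -> list[dict]:
    """Return the three segments exchanged to open a TCP connection."""
    wrap = 2 ** 32  # sequence numbers are 32-bit and wrap around

    syn = {"from": "client", "flags": "SYN",
           "seq": client_isn, "ack": None}
    # The server acknowledges the client's sequence number + 1 ...
    syn_ack = {"from": "server", "flags": "SYN-ACK",
               "seq": server_isn, "ack": (client_isn + 1) % wrap}
    # ... and the client does the same for the server's number.
    ack = {"from": "client", "flags": "ACK",
           "seq": (client_isn + 1) % wrap, "ack": (server_isn + 1) % wrap}
    return [syn, syn_ack, ack]


# Example with fixed numbers; real stacks pick these randomly.
segments = three_way_handshake(client_isn=1000, server_isn=5000)
```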

Data Transfer:
With the connection established, your browser can start sending HTTP requests to the server. HTTP (HyperText Transfer Protocol) is the protocol used for transferring web pages on the internet.
Here's a simplified view of the process:

Your browser sends an HTTP request to the server at "google.com".
The server processes the request and sends back an HTTP response, which contains the requested web page data (like HTML, CSS, JavaScript, images, etc.).
This data is broken down into packets and sent over the established TCP connection.
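
Under the hood, an HTTP request and response are just text in an agreed format. The sketch below builds a minimal request and picks apart a response. `build_request` and `parse_response` are names I made up for illustration, and real browsers send many more headers than this.

```python
def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request as it travels over TCP."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # an empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes) -> tuple[int, dict, bytes]:
    """Split a raw HTTP response into status code, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    status = int(status_line.split(" ", 2)[1])  # e.g. "HTTP/1.1 200 OK"
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body
```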

Firewalls and NAT:
During this process, the packets may pass through several firewalls and NAT (Network Address Translation) devices:

Firewalls: These are security devices that monitor and control incoming and outgoing network traffic based on predetermined security rules. They can block or allow traffic to protect the network. For example, during Buhari's presidency in Nigeria, the federal government banned the use of X (then Twitter) in the country. Blocking access to a service in this way is typical of what firewall rules do.
NAT: This technique allows multiple devices on a local network to share a single public IP address for accessing the internet. It modifies the IP address information in packet headers while in transit across a traffic routing device.
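
Here is a toy version of the translation table a NAT device keeps, to show how several devices can share one public address. The `Nat` class, the starting port 40000, and the example addresses are all assumptions for illustration; real devices also track protocols and expire old entries.

```python
import itertools


class Nat:
    """Toy port-address translation for one public IP."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)
        self._out = {}   # (private_ip, private_port) -> public_port
        self._back = {}  # public_port -> (private_ip, private_port)

    def outbound(self, private_ip: str, private_port: int) -> tuple[str, int]:
        """Rewrite the source address of a packet leaving the local network."""
        key = (private_ip, private_port)
        if key not in self._out:
            port = next(self._ports)
            self._out[key] = port
            self._back[port] = key
        return self.public_ip, self._out[key]

    def inbound(self, public_port: int) -> tuple[str, int]:
        """Map a reply arriving on the public IP back to the local device."""
        return self._back[public_port]
```

Two laptops at 192.168.1.10 and 192.168.1.11 would both appear to the outside world as the same public IP, told apart only by the public port the NAT assigned to each.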

SSL/TLS Handshake (if using HTTPS)

I have been talking about HTTP as the protocol through which communication happens on the web. But HTTP is not secure. Most modern websites, including Google, use HTTPS (HyperText Transfer Protocol Secure) to encrypt the data exchanged between your browser and the server. This ensures privacy and data integrity. HTTPS is HTTP layered over SSL/TLS (Secure Sockets Layer / Transport Layer Security).

Here's how the SSL/TLS handshake works:

Client Hello: Your browser (the client) sends a "Client Hello" message to the server. This message includes information like the SSL/TLS version your browser supports, the cipher suites (encryption algorithms) it can use, and a randomly generated number. Remember, computers like numbers.
Server Hello: The server responds with a "Server Hello" message. This message includes the SSL/TLS version and cipher suite chosen by the server, another random number, and the server's digital certificate.
Certificate Verification: Your browser verifies the server's digital certificate. This certificate is issued by a trusted Certificate Authority (CA) and confirms the server's identity. Your browser checks that the certificate is valid and that it trusts the CA that issued it.
Key Exchange: After verifying the certificate, your browser and the server generate a shared secret key. This key will be used to encrypt the data exchanged during the session. The key exchange can be done in different ways, but a common method is using the Diffie-Hellman algorithm.
Finished Messages: Both the client and the server send a "Finished" message to each other, encrypted with the session key. These messages confirm that the handshake is complete and that the encrypted communication can begin.
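
The key exchange step is the least intuitive, so here is the Diffie-Hellman idea with deliberately tiny numbers. This is a classroom sketch only: the values 23 and 5 are far too small to be secure, real TLS uses much larger groups or elliptic curves, and the function names are mine.

```python
import secrets

# Tiny public parameters for readability only.
P = 23  # prime modulus, known to everyone
G = 5   # generator, known to everyone


def public_value(private: int) -> int:
    """What each side sends over the wire."""
    return pow(G, private, P)


def shared_secret(their_public: int, my_private: int) -> int:
    """What each side computes locally and never sends."""
    return pow(their_public, my_private, P)


client_private = secrets.randbelow(P - 2) + 1   # random value in 1..P-2
server_private = secrets.randbelow(P - 2) + 1
client_public = public_value(client_private)    # sent to the server
server_public = public_value(server_private)    # sent to the client

# Each side combines its own private value with the other's public value
# and arrives at the same number without ever transmitting it.
assert shared_secret(server_public, client_private) == \
       shared_secret(client_public, server_private)
```

An eavesdropper sees only `P`, `G`, and the two public values. With realistically large numbers, recovering the shared secret from those is not practical.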

Encrypted Communication:

Once the SSL/TLS handshake is complete, all data sent between your browser and the server is encrypted using the shared secret key. This ensures that anyone intercepting the data cannot read it without the key.

Example of an HTTPS Request-Response:

HTTPS Request: Your browser sends an HTTPS request to "google.com" to retrieve a web page. The request includes details like the URL, headers, and any data your browser needs to send.
HTTPS Response: The server processes the request and sends back an HTTPS response. This response includes the HTML content of the requested web page, which your browser then renders.

This secure connection ensures that any sensitive data, such as login credentials or personal information, is protected from eavesdroppers.

Load Balancer

After your browser establishes a connection and sends an HTTPS request to "google.com," the request reaches Google's data center. Here, a load balancer is the first component to handle your request.

A load balancer is like a traffic cop sitting in front of your web servers and routing client requests across all servers capable of handling those requests in a manner that maximizes speed and capacity utilization while ensuring that no single server is overworked. This is crucial for handling the massive number of requests Google receives every second.

Here is how it works:

Distributes Traffic: When your request reaches Google's data center, the load balancer determines which of the many servers will handle your request. It uses algorithms like round-robin (sending requests to servers in a rotating order) or least connections (sending requests to the server with the fewest active connections) to make this decision.
Health Checks: The load balancer continuously monitors the health of the servers. If one server is down or not performing well, the load balancer stops sending traffic to it until it recovers.
Failover: If a server fails, the load balancer reroutes your request to another available server, ensuring you don’t experience any downtime.

By efficiently distributing the load, the load balancer ensures that Google can handle your request quickly and reliably.
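
The two algorithms and the health check idea fit in a short Python class. This is a simplified sketch, not how Google's load balancers work: the `LoadBalancer` class and its method names are invented for illustration, and health is just a flag that something else would update.

```python
import itertools


class LoadBalancer:
    def __init__(self, servers: list[str]):
        self.servers = servers
        self.healthy = set(servers)             # updated by health checks
        self.active = {s: 0 for s in servers}   # open connections per server
        self._cycle = itertools.cycle(servers)

    def mark(self, server: str, is_healthy: bool) -> None:
        """Record the result of a health check."""
        if is_healthy:
            self.healthy.add(server)
        else:
            self.healthy.discard(server)

    def round_robin(self) -> str:
        """Next healthy server in rotating order."""
        if not self.healthy:
            raise RuntimeError("no healthy servers")
        while True:
            server = next(self._cycle)
            if server in self.healthy:  # skip servers that failed a check
                return server

    def least_connections(self) -> str:
        """Healthy server with the fewest active connections."""
        if not self.healthy:
            raise RuntimeError("no healthy servers")
        candidates = [s for s in self.servers if s in self.healthy]
        return min(candidates, key=lambda s: self.active[s])
```

Skipping unhealthy servers inside both methods is what gives the failover behaviour: a request is simply never handed to a server that failed its last check.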

Web Server

Once the load balancer directs your request to a specific server, the web server takes over. A web server’s primary function is to serve web pages to clients (like your browser) over the internet.

A web server is software or hardware that uses HTTP (HyperText Transfer Protocol) to respond to client requests. When you request "google.com," the web server receives your request and serves the corresponding web pages.

The web server does this in the following ways:

Handling Requests: The web server listens for incoming requests from clients. When it receives a request for "google.com," it processes the request to determine what content to send back.
Serving Static Content: If the requested content is static (like HTML files, images, CSS files, or JavaScript files), the web server retrieves these files from its storage and sends them directly to your browser.
Forwarding Dynamic Requests: If the requested content is dynamic (like search results, user data, etc.), the web server forwards the request to an application server, which handles the business logic and data processing.
Security and Logging: Web servers often handle security (like SSL/TLS encryption) and log requests for monitoring and troubleshooting purposes.
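
The static-versus-dynamic decision can be pictured like this. It is a bare-bones sketch with assumptions of my own: the list of static extensions is arbitrary, `static_files` is a plain dictionary standing in for files on disk, and `app` is any function playing the role of the application server.

```python
import posixpath

# Assumed set of extensions treated as static; real servers are configured.
STATIC_EXTENSIONS = {".html", ".css", ".js", ".png", ".jpg", ".ico"}


def is_static(path: str) -> bool:
    """True when the request path (query string ignored) names a static file."""
    clean = path.split("?", 1)[0]
    return posixpath.splitext(clean)[1].lower() in STATIC_EXTENSIONS


def handle(path: str, static_files: dict, app) -> tuple[int, bytes]:
    """Serve static content directly; forward everything else to the app."""
    if is_static(path):
        body = static_files.get(path.split("?", 1)[0])
        return (200, body) if body is not None else (404, b"Not Found")
    return app(path)  # the application server builds the dynamic response
```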

Application Server

An application server is a software framework that provides an environment where applications can run, regardless of what they do or what they are. It is designed to facilitate the construction and operation of dynamic, data-driven applications.

After the web server receives your request and determines that dynamic content is needed (like search results), it forwards the request to an application server. The application server handles the business logic and processing required to generate dynamic content.

Here is how an application server works:

Processing Requests: The application server receives the request from the web server. For example, when you search for "cats," the application server handles your query.
Executing Business Logic: The application server runs the necessary code to process the request. This might involve executing complex business rules, running algorithms, or performing calculations.
Interacting with Databases: Often, generating dynamic content requires retrieving or storing data in a database. The application server communicates with the database to fetch search results, user data, or other information required to fulfill the request.
Generating Dynamic Content: Once the necessary data is retrieved and processed, the application server generates the dynamic content (like the search results page) and sends it back to the web server.
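
Those four steps can be squeezed into one small function. This is only a sketch of the idea: `search_app` is a made-up name, and `fetch_results` is a placeholder for whatever code actually talks to the database.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit


def search_app(url: str, fetch_results) -> str:
    """Turn a request like /search?q=cats into an HTML results page."""
    # 1. Process the request: pull the search term out of the URL.
    query = parse_qs(urlsplit(url).query).get("q", [""])[0]
    # 2-3. Run the logic and ask the data layer for matching titles.
    results = fetch_results(query) if query else []
    # 4. Generate the dynamic content, escaping anything user-supplied.
    items = "".join(f"<li>{escape(title)}</li>" for title in results)
    return f"<h1>Results for {escape(query)}</h1><ul>{items}</ul>"
```

Escaping the query and the titles matters because both can contain characters like `<` or `&` that would otherwise be interpreted as HTML.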

Database

To generate dynamic content such as search results or user-specific data, the application server often needs to retrieve information from a database. The database is a critical component that stores and organizes data for quick retrieval.

What is a Database?

A database is a structured collection of data that can be easily accessed, managed, and updated. Databases are designed to handle large amounts of information efficiently and support various operations like querying, updating, and managing data.

Here is how it works:

Data Storage: Databases store data in tables that consist of rows and columns, similar to a spreadsheet. Each table holds data related to a specific topic, like users, products, or search indexes.
Querying Data: When the application server needs data, it sends a query to the database. A query is a request for specific information, written in a language like SQL (Structured Query Language). For example, the application server might query the database for all records related to the keyword "cats."
Data Retrieval: The database processes the query and retrieves the relevant data. It uses indexes to quickly locate and return the required information.
Sending Data Back: The retrieved data is sent back to the application server, which then uses it to generate the dynamic content. For instance, the search results for "cats" are fetched from the database and sent to the application server to be formatted and displayed.

For "google.com," Google's databases store massive amounts of data about web pages, user profiles, and search histories. When you search for "cats," the application server queries these databases to find the most relevant web pages and information related to your query.
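
To try the storage, query, and retrieval steps yourself, Python ships with SQLite. The example below uses an in-memory database with a made-up `pages` table; it is nothing like Google's actual storage, but the flow of sending a query and getting rows back is the same idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE pages (id INTEGER PRIMARY KEY, keyword TEXT, title TEXT)"
)
# An index lets the database find matching rows without scanning the table.
conn.execute("CREATE INDEX idx_pages_keyword ON pages (keyword)")
conn.executemany(
    "INSERT INTO pages (keyword, title) VALUES (?, ?)",
    [("cats", "All about cats"), ("dogs", "Dog training"), ("cats", "Cat breeds")],
)


def find_titles(keyword: str) -> list[str]:
    """Parameterised query: the ? placeholder keeps user input out of the SQL."""
    rows = conn.execute(
        "SELECT title FROM pages WHERE keyword = ? ORDER BY id", (keyword,)
    )
    return [title for (title,) in rows]
```

Calling `find_titles("cats")` returns the two cat-related titles in insertion order, which the application server would then format into a page.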

Summary:

DNS Lookup:
- Your browser sends a request to resolve "google.com" to an IP address.
- The request is processed through the browser cache, OS cache, router cache, ISP DNS server, and finally recursive DNS servers if needed.
- The IP address of "google.com" is returned to your browser.
TCP/IP Connection:
- Your browser establishes a connection with the server using the IP address obtained.
- This involves a TCP handshake (SYN, SYN-ACK, ACK) to establish a reliable connection.
SSL/TLS Handshake (if using HTTPS):
- Your browser initiates an SSL/TLS handshake to establish a secure connection.
- The server's certificate is verified, and a secure session key is exchanged.
HTTP Request and Response:
- Your browser sends an HTTP request to the server.
- The server processes the request and sends back an HTTP response with the requested content (e.g., HTML, CSS, JavaScript).
Firewalls:
- Firewalls between your computer and the server inspect and control the incoming and outgoing network traffic based on security rules.
- They ensure that only legitimate traffic reaches its destination.
Load Balancer:
- The load balancer at the data center distributes the incoming request to one of the multiple web servers.
- It uses algorithms to balance the load and ensures no single server is overwhelmed.
Web Server:
- The chosen web server receives the request.
- It serves static content directly if requested, or forwards the request to the application server for dynamic content.
Application Server:
- The application server processes the request for dynamic content.
- It executes business logic and interacts with the database to retrieve the necessary data.
Database:
- The database stores and organizes data needed by the application server.
- It processes queries from the application server and returns the required data.
Rendering the Web Page:
- The application server sends the processed data back to the web server.
- The web server sends the final response to your browser.
- Your browser parses the HTML, CSS, and JavaScript, creates the DOM and CSSOM, constructs the render tree, and paints the content on the screen.

Below is a pictorial representation of these processes.

Thanks for reading :-)

WELCOME TO IYGEAL'S BLOG