Improve deployment infrastructure & experience with 12 factor: Phase 1 Report
July 8, 2023 · 1860 words · 9 min · gsoc
It’s hard to believe that it has already been six weeks since I began my journey as a GSoC student at CircuitVerse.org. Time flies by quickly! Throughout this period, I have gained extensive knowledge and experience by exploring various tools and have significantly enhanced my skills in managing servers and handling applications in production serving a vast user base. It was also a remarkable experience to directly impact a large number of users through my contributions, such as migrating all user avatars and circuits’ images.
I will try to balance this blog between the technical work and my general experience as a GSoC student at CircuitVerse. If you’re interested in more technical details, you can find them in the blog posts I wrote during the coding period, mentioned below.
Project Overview 📋
During the past 6 weeks, my focus at CircuitVerse has been on improving the deployment infrastructure and user experience in line with the 12-factor methodology.
12-factor is a methodology for building and deploying scalable, maintainable software applications that are suitable for deployment on modern cloud platforms. My project focuses on implementing various features in CircuitVerse to make it 12-factor compatible, in keeping with SRE (Site Reliability Engineering) practices.
In the initial phase, my project was more SRE-oriented, and I dedicated my time to finding engineering solutions for challenges that didn’t have straightforward answers. At every stage, I made sure to validate CircuitVerse’s adherence to the 12-factor methodology.
I thoroughly enjoy working on production applications because it has taught me a great deal about the general workflows that need to be followed when introducing changes or adding new features. This experience has taught me how to make modifications with minimal disruption for regular users, with a key focus on maintaining 100% availability.
Now, let’s delve into the details…
Community Bonding Period 🏙 (May 4, 2023 - May 28, 2023)
During the initial days following the announcement of the GSoC results, contributors were assigned separate private channels for their respective projects. After a few days, we had our first meet, which aimed to break the ice and foster a sense of camaraderie among all the GSoC students and mentors. The meet provided an opportunity for informal conversations, where students and mentors engaged in friendly chats.
One week prior to the start of the coding period, I had a detailed call with my mentor Aboobacker and Vedant, where he showed us behind the scenes at CircuitVerse: the server config, general workflows, and monitoring tools. Basically, he gave a comprehensive tour of the server infrastructure. In addition to this, we discussed the timeline of my project and the order of tasks.
Initially, I had planned to prioritize the migration of assets to AWS S3. However, considering the resource-intensive nature of this task and the need for data migration and validation, my mentor advised me to work on certain smaller tasks alongside it.
At the end of the community bonding period, I was invited to join the CircuitVerse organization on GitHub.
For the best part regarding the bonding period go here.
Coding Period ~ Phase 1 💻
Throughout my journey, I have devoted significant time to exploring various codebases, including the Ruby standard library and the Rails codebase, among others. This experience has given me valuable exposure to different design practices and to coding best practices.
As someone who enjoys writing blogs, I made it a priority to document my progress in detail, ensuring that I published a blog post every week, sometimes even more frequently. Some of these posts delve into technical topics that could be beneficial to individuals in the Ruby/Rails community.
Communication is key in GSoC. I had weekly meets with my mentor every Saturday to discuss my project progress. We also had numerous pair programming sessions to roll out my changes to production together.
Week 1 ~ (May 29, 2023 - June 4, 2023)
In the first week of the coding period, I worked on migrating CircuitVerse attachments from local storage (using deprecated tools such as Paperclip & CarrierWave) to object storage on AWS S3 using ActiveStorage.
My approach was to first mirror the attachments, so that all new data is saved in both configurations, (Paperclip & ActiveStorage) and (CarrierWave & ActiveStorage), and then backfill all the previous data.
The main goals of this task were:
- Zero downtime, to ensure uninterrupted service
- No duplicate attachments
- Successful migration of all previous data
Hence, I added configurations to mirror new attachments to both stores, and data migrations to backfill past data.
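The mirror-then-backfill idea can be sketched in plain Ruby. This is only a minimal illustration, not the CircuitVerse code: `Store`, `MirroredUploader`, and `backfill` are hypothetical names, and the in-memory hashes stand in for the old local storage and the new S3-backed storage.

```ruby
# Hypothetical in-memory stand-in for a storage backend.
# In the real app these roles are played by Paperclip/CarrierWave (old)
# and ActiveStorage on S3 (new); nothing here is CircuitVerse code.
class Store
  attr_reader :writes

  def initialize
    @blobs = {}
    @writes = 0
  end

  def put(key, data)
    @blobs[key] = data
    @writes += 1
  end

  def get(key)
    @blobs[key]
  end

  def key?(key)
    @blobs.key?(key)
  end

  def keys
    @blobs.keys
  end
end

# Step 1: mirror. Every new upload is written to both stores, so the
# old configuration keeps working while the new one fills up.
class MirroredUploader
  def initialize(old_store, new_store)
    @old_store = old_store
    @new_store = new_store
  end

  def upload(key, data)
    @old_store.put(key, data)
    @new_store.put(key, data)
  end
end

# Step 2: backfill. Copy everything that predates mirroring, skipping
# keys the new store already has so that no attachment is duplicated.
# Returns the number of attachments copied in this run.
def backfill(old_store, new_store)
  copied = 0
  old_store.keys.each do |key|
    next if new_store.key?(key)

    new_store.put(key, old_store.get(key))
    copied += 1
  end
  copied
end

old_store = Store.new
new_store = Store.new
old_store.put("avatar/1", "legacy-bytes") # uploaded before mirroring began

MirroredUploader.new(old_store, new_store).upload("avatar/2", "new-bytes")

backfill(old_store, new_store) # copies only avatar/1
backfill(old_store, new_store) # safe to re-run: nothing left to copy
```

Because the backfill skips keys that already exist in the new store, it can be re-run after a failure without creating duplicates, and the old store is never modified, which is what keeps the existing configuration unaffected during the migration.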
Week 2 ~ (June 5, 2023 - June 11, 2023)
This week, after receiving initial feedback from my mentor on the AWS S3 migration task, I made the necessary changes. Finally, my modifications were deployed to production after setting up the S3 bucket for attachments.
I had a meeting with my mentor where we reviewed the code together. I also gave a quick demo to show how my changes leave the old configuration unaffected.
Then we ran migrations to migrate the profile_pictures of users at CircuitVerse.
References
- feat: mirror pfp & projects, backfill profile_pictures #3786
- Hitchhiker’s Guide to GSoC: Week 2, Don’t Panic Edition
Week 3 ~ (June 12, 2023 - June 18, 2023)
In Week 3, I completed the Monit configuration for CircuitVerse, following my mentor's advice to work on smaller tasks alongside the AWS migration task.
Monit is a small Open Source utility for managing and monitoring Unix systems. Monit conducts automatic maintenance and repair and can execute meaningful causal actions in error situations.
Here is an example of how Monit sends alerts:
The Monit configuration can be found here.
After thorough discussions with my mentor, we decided to integrate Monit and OpenTelemetry into production only after completing the S3 migration task and setting up a continuous deployment workflow, because these integrations would require additional RAM and CPU resources from the server.
Week 4 ~ (June 19, 2023 - June 25, 2023)
Week 4 was dedicated to exploring distributed tracing. I delved deeper into OpenTelemetry and completed the OTEL configuration for CircuitVerse. Additionally, I wrote a comprehensive technical guide on monitoring a Rails app using distributed tracing with multiple vendors such as Jaeger, New Relic, Prometheus, and more.
Jaeger Dashboard:
Jaeger Single Trace
New Relic Dashboard:
Apart from working on OTEL, my mentor advised me to initialize runbooks for CircuitVerse, covering all the essential workflows and infrastructure details. These runbooks would serve as a resource similar to GitLab’s runbooks. Although we considered having a static site for runbooks like GitLab, due to the small size of our infrastructure team, we decided to document them as markdown files instead.
If you are interested in joining us at CircuitVerse, you can start contributing. Contributions are not limited to the codebase alone; we also encourage students and teachers to contribute to CircuitVerse by creating amazing digital design circuits and sharing them with friends.
Week 4 was one of the most productive weeks during my GSoC journey.
References
- feat: Intialise runbook #3
- feat: migrate image_preview to AWS S3 #3813
- GSoC Week 4: Distributed Tracing and Runbook Development
- The Ultimate Guide to Distributed Tracing: Monitor your Rails app using Opentelemetry, Jaeger and New Relic Agent
Week 5 ~ (June 26, 2023 - July 2, 2023)
During Week 5, I devoted most of my time to mrsk and to serving assets using ActiveStorage, the second step, which completes the migrate-to-object-storage task.
mrsk is a tool that allows you to deploy Docker containers to bare metal servers (or cloud VMs) with zero downtime. mrsk is still in the alpha stage but is already in production at 37signals.
During this time, I connected with my mentor and we had detailed discussions, exchanging ideas and exploring different strategies for deploying CircuitVerse to a production environment using mrsk. One important topic we deliberated on was whether to maintain the existing configuration of having PostgreSQL and Redis on the server itself or to transition them to containers while persisting data using volumes.
Additionally, this week marked the completion of migrating User Circuits’ image_preview to AWS S3. Up until Week 5, in production at circuitverse.org, we had been mirroring attachments to both the old and new configurations. Now that we have successfully backfilled all the data, it is time to serve them using ActiveStorage and remove the deprecated gems, namely Paperclip and CarrierWave.
However, this process caused a few hours of downtime for CircuitVerse because the server hit an OOM (Out of Memory) error. Despite this, we were able to successfully migrate hundreds of thousands of attachments to AWS S3.
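The post does not say what exactly triggered the memory spike or how it was handled. As a general illustration only, one common way to keep a large backfill's memory use bounded is to process records in fixed-size batches, so that only one batch of file data is held at a time. The sketch below is plain Ruby under that assumption; `backfill_in_batches` and the `"image-#{id}"` payloads are hypothetical stand-ins, not CircuitVerse code.

```ruby
# Hypothetical helper: walks record ids in fixed-size batches and hands
# each batch to the block, so only one batch of attachment data needs
# to be held in memory at a time. Returns the number of batches run.
def backfill_in_batches(ids, batch_size:)
  batches = 0
  ids.each_slice(batch_size) do |batch|
    yield batch
    batches += 1
  end
  batches
end

migrated = []
largest_batch = 0

batch_count = backfill_in_batches(1..10, batch_size: 4) do |batch|
  # Stand-in for loading each record's file and uploading it.
  payloads = batch.map { |id| "image-#{id}" }
  largest_batch = [largest_batch, payloads.size].max
  migrated.concat(batch)
end

puts "batches: #{batch_count}, largest batch: #{largest_batch}, migrated: #{migrated.size}"
```

In a Rails app, ActiveRecord's `find_each` and `in_batches` play a similar role by loading rows in batches instead of all at once.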
References
- chore: update rails to 7.0.5.1 (#3842)
- fix: use env[] instead of fetch (#3856)
- feat: Serve assets using active storage #3860
- GSoC Week 5 at CircuitVerse.org
Week 6 ~ (July 3, 2023 - July 9, 2023)
During Week 6, my main focus was on continuing the work related to the mrsk task and updating the Rails API for the Serving assets using ActiveStorage pull request at CircuitVerse.
The mrsk task itself was quite time-consuming, taking around 25-30 minutes for all the steps on my Intel Mac. The process involved creating the Docker image for the Rails app, pushing the image to the repository, pulling it to the server, and then restarting the application. Fortunately, I had previously invested effort in creating a personal deployment of CircuitVerse using mrsk, which helped me set up the basic configuration quickly. To streamline the debugging process, I opted to build the Docker images separately. I would then run these containers on a Docker network and test my changes there. This approach helped me efficiently identify and resolve any issues.
As I continue my work, I will prioritize implementing proper logging, monitoring, and alerting mechanisms for the container-based deployment of CircuitVerse. Additionally, I will optimize resource consumption, such as RAM and CPU, to ensure efficient utilization and enhance the scalability of CircuitVerse. By fine-tuning these aspects, we aim to provide users with an optimized and reliable experience.
These tasks are expected to take the first few weeks of the next phase to make it to production after thorough validations.
After completing these tasks, only the HyperLogLog task remains, as outlined in my GSoC proposal. Additionally, I intend to contribute to diversifying deployment options for CircuitVerse, although it falls outside the scope of GSoC. Regardless, I am committed to making progress in this area.
Special mention to my mentor, Aboobacker, for generously dedicating his valuable time to reviewing my pull requests and for providing guidance whenever I encountered challenges.
References
Commits Merged
Featured Blog Posts
All blogs written during GSoC Period are on my website
Some of my favourite being: