MLH Fellowship Insider: Weeks 7–12

Cassey Shao
4 min read · Sep 17, 2023


This summer, I am doing the Major League Hacking (MLH) Site Reliability Engineering Fellowship, which is run in partnership with Meta. The MLH Fellowship is a 12-week technical program in which fellows complete an educational curriculum, build open source software, and receive mentorship from professional engineers to learn more about the world of site reliability engineering. I want to share my experience as I go through the program to provide more insight for anyone interested in applying!

The Second Half

The second half of the fellowship continued to transform and improve my approach to building software. I learned tools and strategies for CI/CD, monitoring, and troubleshooting. Here are some notes on those!

Continuous Integration and Continuous Deployment (CI/CD)

Continuous Integration is the automation of processes that thoroughly test code changes, and Continuous Deployment is the automation of processes that deploy those changes once the CI requirements are satisfied. CI/CD is useful because it makes software delivery more robust and efficient. Specifically, it reduces the risk of bugs reaching production, reduces manual work, generates extensive logs, and allows for efficient rollback to prevent outages.

In the fellowship, I got to implement CI/CD by setting up GitHub Actions for my website. Here are the workflow files that we used for testing and deploying code changes.

Workflow file for running tests.
Workflow file for deploying changes.

These workflow files ensured that when I created a pull request in my project’s repository, the associated GitHub Actions would run automatically. The “Run Tests” action ran tests to ensure that code updates didn’t break anything, and the “Deploy” action made those changes available on the VPS. This automation relieved me from having to manually SSH into my VPS and pull and test the latest changes, as I did previously, every time I needed to update the site.

GitHub Actions.
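The ordering described above, where tests run first and deployment happens only if they pass, can be sketched in a few lines of Python. This is a minimal illustration of the gating idea, not the actual GitHub Actions workflow files: the function names `run_step` and `run_pipeline` and the placeholder commands are assumptions made for the example.

```python
import subprocess
import sys


def run_step(name, command):
    """Run one pipeline step and report whether it exited successfully."""
    print(f"[{name}] {' '.join(command)}")
    result = subprocess.run(command)
    return result.returncode == 0


def run_pipeline(test_command, deploy_command):
    """Deploy only when the test step passes (CI gates CD)."""
    if not run_step("test", test_command):
        return "tests failed, deploy skipped"
    if not run_step("deploy", deploy_command):
        return "deploy failed"
    return "deployed"


if __name__ == "__main__":
    # Placeholder commands (hypothetical); a real pipeline would call the
    # project's test runner and its deploy script here instead.
    tests = [sys.executable, "-c", "print('running tests')"]
    deploy = [sys.executable, "-c", "print('deploying')"]
    print(run_pipeline(tests, deploy))
```

A hosted CI service adds a lot on top of this, such as triggers, logs, and secrets handling, but the core contract is the same: a failing test step stops the deploy step from running.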

Monitoring

Monitoring entails maintaining a real-time view of the performance of applications and services, such as their CPU and memory usage. Monitoring is valuable because it helps developers find and respond to issues more quickly and plan resource utilization better.

In the fellowship, I was introduced to the Linux command “top” and to Grafana for monitoring my site. The “top” command shows running processes along with performance information, such as how much CPU each process is using. Grafana is a web application in which you can set up dashboards to visualize these metrics. Here is a screenshot of Grafana charts for my portfolio website.

Grafana charts of CPU and memory usage for my website’s mysql, nginx, and myportfolio containers.
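For a feel of the kind of data these tools work with, here is a small Python sketch that takes one snapshot of per-process CPU and memory usage and ranks it, roughly what “top” shows at a glance. It assumes a Linux machine where `ps -eo pid,pcpu,pmem,comm` is available; the names `ProcessUsage`, `parse_ps_output`, `top_by_cpu`, and `snapshot` are made up for this example and are not part of any tool mentioned here.

```python
import subprocess
from typing import NamedTuple


class ProcessUsage(NamedTuple):
    pid: int
    cpu_percent: float
    mem_percent: float
    command: str


def parse_ps_output(text):
    """Parse the output of `ps -eo pid,pcpu,pmem,comm` into ProcessUsage rows."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header line
        parts = line.strip().split(None, 3)
        if len(parts) < 4:
            continue  # ignore lines that do not have all four columns
        pid, cpu, mem, command = parts
        rows.append(ProcessUsage(int(pid), float(cpu), float(mem), command))
    return rows


def top_by_cpu(rows, limit=5):
    """Return the `limit` rows with the highest CPU percentage."""
    return sorted(rows, key=lambda row: row.cpu_percent, reverse=True)[:limit]


def snapshot():
    """Take one snapshot of running processes (assumes Linux with `ps`)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,pmem,comm"],
        capture_output=True, text=True, check=True,
    )
    return parse_ps_output(result.stdout)
```

Calling `top_by_cpu(snapshot())` returns the five busiest processes at that moment. A dashboard tool like Grafana goes further by collecting such numbers continuously and charting them over time.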

Troubleshooting

Troubleshooting is the skill of identifying the source of a software issue and fixing it. It follows a systematic process that includes identifying symptoms, examining information (e.g., reading logs), hypothesizing what the issue is, verifying the hypothesis, planning a solution, and verifying that the problem is resolved.

Troubleshooting skills are important for a production engineer because some issues you encounter will be a first for your team or will not be documented online, for example on Stack Overflow. In those cases, you need a systematic approach to resolve the problem efficiently and robustly.

In the fellowship, I got to practice troubleshooting through an exercise in which a VPS hosting a website had been set up with some issues. To resolve them, I followed the process outlined above: I noticed that the system was running slowly, hypothesized that some programs were using a lot of CPU power, used tools such as the “top” command to see which processes had high CPU usage, and then used other commands such as “find” to identify the programs that were launching these processes and fix the issue (e.g., by killing processes that didn’t need to run).
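The “verify the hypothesis, then fix” part of that process can be sketched in Python. This is an illustration under stated assumptions, not the commands used in the exercise: the allowlist `EXPECTED`, the 50% threshold, and the function names are hypothetical, and the sketch defaults to a dry run so that nothing is actually stopped.

```python
import os
import signal

# Hypothetical allowlist of services that are expected to be busy.
EXPECTED = {"mysqld", "nginx"}


def find_suspects(processes, cpu_threshold=50.0, expected=EXPECTED):
    """Return (pid, cpu_percent, command) tuples that are over the CPU
    threshold and are not on the list of expected services."""
    return [
        proc for proc in processes
        if proc[1] >= cpu_threshold and proc[2] not in expected
    ]


def stop_processes(suspects, dry_run=True):
    """Send SIGTERM to each suspect, or only report it when dry_run is True."""
    handled = []
    for pid, cpu_percent, command in suspects:
        if dry_run:
            print(f"would stop {command} (pid {pid}, {cpu_percent}% CPU)")
        else:
            os.kill(pid, signal.SIGTERM)  # ask the process to exit cleanly
        handled.append(pid)
    return handled
```

Keeping the dry run as the default matches the systematic approach: confirm which processes are the real culprits before stopping anything, then check afterwards that the system is responsive again.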

Final Thoughts

Last Friday was the fellowship graduation ceremony, and it was bittersweet. In 12 weeks, I learned new tools that are used in industry, felt inspired at speaker events about production engineering, grew through feedback on my approach to problem-solving, and became energized by discussions with other fellows, now friends, who are passionate about building good software. As I continue to pursue a career in technology, these learnings and experiences will better equip me to take on challenges and push myself. Thank you to Major League Hacking for being this special part of my coding journey.

Certificate of completion.
