It's great when you can show that bad code slipped through unnoticed all the way from the developer's Docker environment to perf, and spike the release before it goes to prod. Popular Linux apps (Nginx/Apache/Redis/*SQL/etc.), tuned properly, don't just die; bad code kills them. Get a way to quickly roll back bad deployments, control the deployment process, test code before it becomes a problem (Jenkins, SonarQube, etc.), create test environments for the devs that are identical to perf and prod (containers are often a good option here), and make sure something like Bugsnag with deep error reporting is enabled.

DevOps pays more than a plain Linux admin role and has a better long-term job outlook. These "everything prod is your responsibility forever" jobs are often a great way to move from Linux admin to DevOps. I would say a lot of people move from this position into DevOps because they automated the infrastructure and resiliency, and now they need to prevent SQL / memory / security / CPU impacting events.

Set up quality monitoring and tune it to alarm only on real issues, after your scripts have failed. Nagios is one option, but SaaS offerings are often the least work, the least stress, and the most accurate: Pingdom, UptimeRobot, your cloud provider's option, etc.

Be proactive and defensive about what code / software you run. You are the final gate, which means a bad developer's code that gets pushed out on a Thursday or Friday before they leave on vacation becomes your problem.

Load balance / make redundant anything you can. Create automated failovers / restarts that notify you so you can take action the next morning. Make as many sleep-losing / life-interrupting events as possible try 1 or 5 things to repair themselves before you have to.

In addition, specifically regarding the 24/7 on-call: add your snippets to the app and add your SSH keys, so you can get status/reboot/restart/reload/break-fix in 2 clicks from your phone.
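The "try 1 or 5 things to repair itself before you have to" idea, combined with alarming only after your scripts have failed, can be sketched roughly as below. This is a minimal illustration, not a drop-in tool: `SelfHealer`, `run_once`, and the `check` / `repairs` / `notify` callables are all hypothetical names made up for this example. In a real setup the repair actions would wrap whatever you already use (a service reload, a restart, a failover script), and `notify` would call your monitoring or paging service.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SelfHealer:
    """Hypothetical helper: try cheap repairs in order, page only if all fail."""

    # Returns True when the service is healthy (e.g. an HTTP or port check).
    check: Callable[[], bool]
    # Ordered (name, action) pairs, cheapest / least disruptive first.
    repairs: List[Tuple[str, Callable[[], None]]]
    # notify(message, urgent): urgent=False can wait until morning,
    # urgent=True is the "wake a human" path.
    notify: Callable[[str, bool], None]

    def run_once(self) -> str:
        if self.check():
            return "healthy"

        tried: List[str] = []
        for name, action in self.repairs:
            try:
                action()
                tried.append(name)
            except Exception as exc:  # a failed repair should not stop the rest
                tried.append(f"{name} (raised {exc!r})")
                continue
            if self.check():
                # Fixed itself: send a low-priority note for the next morning.
                self.notify(f"recovered after: {', '.join(tried)}", False)
                return "recovered"

        # Every scripted repair failed: this is a real issue, so alarm now.
        self.notify(f"still down after: {', '.join(tried)}", True)
        return "escalated"


if __name__ == "__main__":
    # Fake service for demonstration only; a restart "fixes" it.
    state = {"up": False}

    def fake_restart() -> None:
        state["up"] = True

    healer = SelfHealer(
        check=lambda: state["up"],
        repairs=[("reload", lambda: None), ("restart", fake_restart)],
        notify=lambda msg, urgent: print("PAGE" if urgent else "note", msg),
    )
    print(healer.run_once())
```

Run from cron or a timer, something like this keeps the pager quiet for anything a reload or restart can fix, while still leaving a trail you can review the next morning.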