With this practical book, you’ll discover how to catch complications in your distributed system before they develop into costly problems. Based on his extensive experience in systems ops at large technology companies, author Slawek Ligus describes an effective data-driven approach for monitoring and alerting that enables you to maintain high availability and deliver a high quality of service.
Learn methods for measuring state changes and data flow in your system, and set up alerts to help you recover quickly from problems when they do arise. If you’re a system operator waging the daily battle to provide the best performance at the lowest cost, this book is for you.
Book year: 2012
Book pages: 166
Book language: en
File size: 5.84 MB
File type: pdf
Published: 09 February 2019 - 21:43