Experience Monitoring at Scale
(Yahoo! : 5/07-2/09)

Yahoo invests a lot of resources into making sure that each of its properties is available around the clock. To assist in that task, a centralized, black-box service was created as part of dev tools to help everyone from senior management to service engineers monitor and understand the health of properties.

On the backend, the service consists of the data store, a metrics collector, aggregation tools, and the configuration store (database-driven.) On the front end, there's dashboarding, custom reports, and a self-service configuration tool.

Results

  • built and maintained web tools for a Nagios-based experience management solution checking 10,000+ URLs worldwide daily generating 63M measurements per month
  • reduced workload of system engineers by creating (from scratch) a web-based, MySQL-driven, MVC-architected, self-service configuration tool for creation of and management of Nagios checks
  • led SCRUM-influenced development and improved the quality of the team's SE process by standardizing on championing the use of Catalyst (an MVC framework in Perl.) Improvements included shortened dev cycles, the introduction of TDD, improved performance, better documentation
  • created snappy, responsive interfaces using custom JavaScript along with YUI in conjunction with JSON-serving REST web services (Perl.) Also achieved performance gains through page-weight optimization
  • reduced development costs through the use of VMWare virtual machines for testing, building, and deploying as part of continuous integration. Implemented a packaged solution for automated regression testing using Firefox, Selenium, X, WWW::Mechanize

Technologies

Contributions

anthony at bluxomelabs dot com