Josh Wills on Building Resilient Data Engineering and Machine Learning Products at Slack
The InfoQ Podcast - A podcast by InfoQ
Josh Wills, a software engineer working on data engineering problems at Slack, discusses the Slack data architecture and how its teams build and observe their pipelines. Along with color commentary on topics such as the move from IC to manager (and back), Josh shares recommendations, tips, tools, and lessons that Slack engineering teams learned while building products like Slack Search. The podcast covers machine learning, observability, data engineering, and general practices for building highly resilient software.

Why listen to this podcast:
- Slack has a philosophy of building only what they need. They have a "don’t reinvent the wheel" mindset.
- Slack was originally a PHP monolith. Today, it is largely Hack-lang, HHVM, and several Java and Go binaries. On the data side, application logs are in Thrift (there is a plan to migrate to protobuf). Events are processed through a Kafka cluster that handles hundreds of thousands of events per second. Everything is kept in S3 with a large Hive metastore. EMR is spun up on demand. Presto, Airflow, Slack, Snowflake (business analytics), and Quiver (a key-value store) are all used.
- ML worked best for Slack when it was used to help people answer questions. Techniques like Learning to Rank (LTR) became the most effective use of ML for Slack.
- You can get pretty far with rules. Use machine learning when that’s all that’s left.
- When Slack started applying observability to its data pipelines, a key lesson was to focus on structured data, tracing, and high-cardinality events. This let them use the tools they were already familiar with (ELK, Prometheus, Grafana) and go deep into understanding what’s happening in their systems.

More on this: Quickly scan our curated show notes on InfoQ: https://bit.ly/2PsVA4q

You can also subscribe to the InfoQ newsletter to receive weekly updates on the hottest topics in professional software development: bit.ly/24x3IVq

Subscribe: www.youtube.com/infoq
Like InfoQ on Facebook: bit.ly/2jmlyG8
Follow on Twitter: twitter.com/InfoQ
Follow on LinkedIn: www.linkedin.com/company/infoq
Check the landing page on InfoQ: https://bit.ly/2PsVA4q