Jet Tech Big Data Meetup

Thu, February 7 at 00:00

Time zone: Paris (GMT+01:00) HQ


Please make sure to RSVP on our Splashthat page. This will ensure you are on our guest list for check-in on the day of the event.

### ☆ SCHEDULE ☆

6:00–6:30 p.m. // Check-In + Networking

First things first: Make sure to RSVP here on Splash, so our building's security team has your name on their list. When you arrive day-of, come through the street-side entrance, sign in with security, take the elevator to the 8th floor, and check in with us. Afterward, grab food and drinks and network with other visitors before we begin our first talk.

6:30–6:50 p.m. // Evolution of the Data Lake at Jet: Journey So Far and the Road Ahead with Qian Chen

Jet started its data lake journey in 2016. It was fortunate to be Azure native at the start of the journey. In the last 2+ years, our data lake has gone through significant evolution and benefited from the introduction of several new capabilities and technologies. And we are not done yet! In this session, we will discuss the evolution so far, lessons learned, and the road ahead.

6:50–7:10 p.m. // Building End-to-End Observability for Jet's Data Lake with Sander Hartlage

An enterprise data lake has dozens of infrastructure components and hundreds of data pipelines. At Jet, to run reliable data lake operations, we have built an end-to-end observability solution using the Influx TICK stack. In this session, we will share the component-driven design of our monitoring and alerting solution and discuss lessons learned along our development journey.

7:10–7:30 p.m. // Building Data Processing and ETL Pipelines Using Spark 301: Data Source V2, Structured Streaming, Integrating with RDBMS with Kevin Jerrard

Jet makes extensive use of many of the advanced capabilities of Apache Spark. In this session, we will discuss how some of the newer capabilities can be leveraged to make your data pipelines much simpler, more robust, and more responsive to change. We will also discuss some of the common challenges in integrating a distributed processing framework such as Spark with the SMP architecture of an RDBMS engine such as SQL Server.

7:30–7:50 p.m. // Enabling Data Science on Jet's Data Lake with Praful Kava

Successful data science at scale requires tight collaboration between Data Science and Big Data Engineering teams. In this session, the Flash team (Big Data Engineering @Jet) will talk about the components that you need to get right in order to make data science at scale successful. This segment will also include a quick demo of a cloud-hosted notebook solution powered by big data technologies such as Spark.

### ☆ SPEAKER BIOS ☆

Qian Chen: Qian is a data engineer at Jet working on both the RDBMS and big data stacks. Prior to joining Jet, he worked in the finance industry. In his free time, he is trying to pick up acoustic guitar.

Sander Hartlage: Sander is a data engineer at Jet. He drives Jet's data lake infrastructure and ensures that it keeps up with cutting-edge big data technologies. Sander has built data products for multiple high-velocity technology startups in the New York area for around ten years. Sander enjoys coffee and cats.

Kevin Jerrard: Kevin is a software engineer working on one of the many data platforms at Jet. After studying Information Science at Cornell University, he worked with a handful of companies in finance and the big data space. In his free time, he enjoys obscure non-fiction and playing tourist in New York museums.

Praful Kava: Praful leads Jet's data lake platform. Over the past 20 years, Praful has designed and built application, data, and ML solutions and high-performing engineering teams at companies like IBM, EMC, Dell, CapGemini, and McKinsey. Away from work, Praful enjoys keeping up with his 6-year-old son.

Source: HQ
