Dear Bytewax Community,
As we continue our reflection on 2024, we're excited to share the third chapter of our year in review.
Things Built with Bytewax
One of the most exciting aspects of open source is seeing what people create with your tools. This year, the Bytewax community has truly impressed us with their creativity and ingenuity. The "Used By" feature on GitHub gives us a glimpse into the public repositories that integrate Bytewax, and the results are incredible.
Here are some standout projects that made us pause and say, "Wow!":
Anomaly Detection for Air Quality: A real-time system that detects anomalies in live air quality data streams, built with Redpanda and Bytewax to provide instant, actionable insights.
Real-Time News Search Engine: This project takes real-time news ingestion to the next level, combining Bytewax, Kafka, Upstash, and LangChain to enable semantic search for breaking news.
Finance Feature Engineering: A 100% Python-based pipeline that processes live trade data from Coinbase, transforms it into OHLC features with Bytewax, and stores it in Hopsworks for analysis.
Other notable mentions include Resumify (AI-generated tailored resumes), GutenbergV2 (a repository evaluator), LinkedIn RAG (embeddings for LinkedIn posts) and Spanda.ai (real-time embeddings with Weaviate's Verba for human-centered AI solutions). Bytewax is powering everything from clean transportation to next-gen search engines.
Have you built something cool with Bytewax? Share it with us!
Bytewax Cheatsheets
This year, Bytewax introduced not just one, but three impactful cheatsheets to elevate your data engineering workflows. Here's how each of them helps you master distributed dataflows and real-time stream processing with Bytewax:
1️⃣ Bytewax Cheatsheet: a concise yet powerful guide to help users quickly understand and implement Bytewax in their workflows.
Dataflows as DAGs: Learn how Bytewax uses Directed Acyclic Graphs to transform raw data into actionable insights.
Connectors Galore: From Kafka to files, Bytewax makes data ingestion and output seamless.
Python Power: By leveraging Python's ecosystem, you can integrate libraries like NumPy or Shapely for advanced transformations.
2️⃣ Bytewax Operator Cheatsheet: For developers working on distributed dataflows, this cheatsheet dives deep into operators, both stateless and stateful.
Stateless transformations like map and filter for basic data processing.
Stateful transformations to maintain information across events for tasks like counting or aggregating data.
Operator Highlights: Includes practical examples for real-world use, from enriching data with external sources to managing complex operations with Bytewax's powerful stateful_batch operator.
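To make the stateless/stateful distinction concrete, here is a minimal sketch in plain Python. It deliberately does not use Bytewax's API; the event shape, function names, and sample data are all hypothetical and only illustrate the idea the cheatsheet describes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape for illustration: (key, value)
Event = Tuple[str, int]


def stateless_step(events: Iterable[Event]) -> Iterator[Event]:
    """Filter, then map: every event is handled on its own, nothing is remembered."""
    for key, value in events:
        if value >= 0:            # filter: drop negative readings
            yield key, value * 2  # map: transform the value


def stateful_count(events: Iterable[Event]) -> Iterator[Tuple[str, int]]:
    """Running count per key: the state carries over from one event to the next."""
    counts: Dict[str, int] = defaultdict(int)
    for key, _value in events:
        counts[key] += 1
        yield key, counts[key]


events = [("a", 1), ("b", -5), ("a", 3), ("b", 2)]
doubled = list(stateless_step(events))
running = list(stateful_count(events))
```

The stateless step could run on any worker in any order, while the running count needs all events for a given key to reach the same piece of state. That is the reason stateful operators work on keyed streams.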
3️⃣ Bytewax Windowing Cheatsheet: Windowing is essential for working with continuous data streams. This cheatsheet explains how to divide unbounded streams into manageable "windows" for computation.
Tumbling Windows: Fixed-size, non-overlapping intervals for real-time analytics.
Sliding Windows: Overlapping windows for calculating moving averages or trends.
Session Windows: Dynamically sized windows based on activity, perfect for tracking bursts of user interaction.
Core Concepts: Covers watermarks, clocks, and handling late or out-of-order data for accurate processing.
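The three window types above can be sketched in a few lines of plain Python. This is an illustration of the concepts only, not Bytewax's windowing API: timestamps are plain integers, windows are aligned to zero, and the function names are hypothetical. It also sidesteps watermarks and late data by sorting the input up front, which a real streaming system cannot do.

```python
from typing import List


def tumbling_window(ts: int, length: int) -> int:
    """Start of the single fixed-size window [start, start + length) containing ts."""
    return ts - ts % length


def sliding_windows(ts: int, length: int, step: int) -> List[int]:
    """Starts of every overlapping window [start, start + length) containing ts."""
    starts = []
    start = ts - ts % step  # latest window start at or before ts
    while start > ts - length:
        starts.append(start)
        start -= step
    return sorted(starts)


def session_windows(timestamps: List[int], gap: int) -> List[List[int]]:
    """Group timestamps into sessions; a pause longer than `gap` starts a new one."""
    sessions: List[List[int]] = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


tumbling = tumbling_window(12, length=10)
sliding = sliding_windows(12, length=10, step=5)
sessions = session_windows([1, 2, 3, 10, 11, 30], gap=5)
```

Here an event at time 12 falls into exactly one tumbling window (starting at 10) but into two sliding windows (starting at 5 and 10), while the session grouping depends entirely on the gaps between events.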
Podcasts: Conversations That Matter
Some of our favorite moments from 2024 came from sharing our story on podcasts:
1️⃣ Redis Podcast: Zander Matheson discusses how Bytewax simplifies real-time processing for developers and scales with ease.
2️⃣ Hopsworks Interview: Laura Funderburk highlights Bytewax's contributions to real-time RAG workflows and embedding advancements.
3️⃣ AI Chronicles: Zander breaks down the challenges of RAG and Bytewax's role in powering real-time AI solutions.
Workshops and Meetups: Advancing Real-Time Innovation
OSS4AI Meetup: Hosted by Yujian Tang, this event spotlighted Laura Funderburk's presentation on real-time RAG pipelines with Bytewax. The Bytewax team's dedication shone through, with Oli Makhasoeva attending during her maternity leave to support the talk.
Supercharge Slackbots with RAG in real-time, a workshop by Softlandia: With 480+ registrants from over 10 countries, this workshop, led by Henrik Nyman and Mikko Lehtimäki, provided deep dives into stateful dataflows, fault tolerance, and real-time deployment. Engaged questions elevated the sessions beyond expectations.
Women Seattle Meetup: Focused on women in AI, this event featured inspiring talks, including one from Bytewax's Oli Makhasoeva. It was a powerful mix of networking and thought leadership.
These events reflected Bytewax's mission: advancing innovation through collaboration.
Looking Ahead
As we wrap up 2024, one thing is clear: this year wasn't just about building a better Bytewax. It was about building a stronger community. Together, we've pushed the boundaries of what's possible with real-time data processing, and we're not stopping anytime soon.
What's next? You'll have to wait for the final part of our year in review to find out. But here's a hint: it's big.
For now, thank you for being part of our journey. Bytewax wouldn't be Bytewax without you.