Latest Posts
-
Optimizing Inference for Image Generation Models: Memory Tricks and Quantization
Let's explore how I was able to run an image generation model, FLUX.1 Dev, with only 20% of its required total VRAM. Through quantization and memory optimization techniques, I'll show you practical strategies that make high-quality image generation accessible on consumer GPUs, complete with performance benchmarks and real-world examples. -
Agentic and workflows example implementations
Practical examples of agentic and workflow-based AI patterns, with code and design decisions inspired by Anthropic’s research. Learn how to implement direct LLM calls, prompt chaining, and routing workflows—while keeping your systems simple and maintainable. -
Vibe check for Claude Sonnet 3.7
Anthropic’s Claude 3.7 Sonnet introduces extended thinking, visible thought process, and impressive benchmarks. Here’s my first impressions and why this model feels like a big step forward for practical AI. -
What is the deal with Agentic AI systems?
Are agentic AI systems always the answer? In this post, I explore why simplicity often beats complexity, when agent-based architectures make sense, and why you should reach for agents only when truly needed.