Cut 90-min podcasts to a 3-min weekly digest
Save hours. Get smarter in 3 minutes every Monday — free.
Stop scrubbing through 90-minute episodes. We distill the "aha!" moments from the best shows and deliver them to you in 3 minutes.
No spam, unsubscribe anytime. Podcasts are handpicked by our team.

Latest Highlights

That's right. What stands out is that the leaders of the world's biggest technology companies are effectively saying in unison that they don't care about the cost of building towards AGI or superintelligence, or wherever they believe this technology's final destination will be. Mark Zuckerberg, for example, told Alex Heath last week that he doesn't worry about overinvesting in AI; if anything, he's concerned about underinvesting. Other companies have expressed similar sentiments.

When we started, Quinn and I worked on the main branch with no code reviews, just pushing changes. It felt like a personal project. As experienced engineers, we each owned our work: if you broke the AI, you fixed it. Shipping at this pace means making roughly 15 judgment calls a day, constantly switching between a "duct tape, personal project, move fast" mode and a "Google-style" mode. That demands expertise, and freedom from the mindset of always doing things the Google way, which assumes product-market fit and scaling up. Every company I've worked for operated on the assumption that once a product exists, it should be engineered for scale. AMP's core assumption, however, is that even if something scales, new technology can emerge and shift everything, and we have to be ready for that. Our development mode reflects this. The AMP core team is small, around eight people. We still don't do formal code reviews, we push directly to main, and we ship 15 times a day. In a fast-moving environment, that speed, combined with fast feedback loops and extensive dogfooding (using the product to build the product), outperforms many established processes; the dogfooding is what lets us get away with it.

I agree with your initial point: we had a very broad definition. Evals encompass a wide spectrum of methods for measuring application quality, and unit tests are one such method. If your AI assistant has non-negotiable functionality, unit tests can verify it. But since AI assistants perform open-ended tasks, you also need to measure performance on vaguer, more ambiguous things, like responding to novel user requests or adapting to new data distributions. For example, a cohort of users you never anticipated might show up, and you have to accommodate them; evals can surface these new cohorts through regular analysis of your data. Evals can also track metrics over time, such as positive user feedback, and these basic, non-AI-specific metrics feed back into the product improvement cycle. Ultimately, unit tests are a small piece of this much larger puzzle.
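To make the two layers from that excerpt concrete, here is a minimal Python sketch: a unit test pinning a non-negotiable behavior, plus a basic metric tracked over time. The `assistant_reply` stub, the refund-policy check, and the feedback log are hypothetical illustrations, not anything described in the episode.

```python
from collections import defaultdict
from datetime import date

def assistant_reply(prompt: str) -> str:
    """Stand-in for a real model call; replace with your assistant."""
    if "refund" in prompt.lower():
        return "You can request a refund within 30 days of purchase."
    return "I'm not sure about that."

# Layer 1: a unit test for a non-negotiable behavior.
def test_refund_reply_states_window() -> None:
    reply = assistant_reply("What is your refund policy?")
    assert "30 days" in reply, "refund window must always be stated"

# Layer 2: a simple, non-AI-specific metric tracked over time
# (here, the daily positive-feedback rate).
feedback_log: dict[date, list[bool]] = defaultdict(list)

def record_feedback(day: date, positive: bool) -> None:
    feedback_log[day].append(positive)

def positive_rate(day: date) -> float:
    votes = feedback_log[day]
    return sum(votes) / len(votes) if votes else 0.0

if __name__ == "__main__":
    test_refund_reply_states_window()
    record_feedback(date(2025, 1, 6), True)
    record_feedback(date(2025, 1, 6), False)
    print(f"positive-feedback rate: {positive_rate(date(2025, 1, 6)):.0%}")
```

Running a check like this on every change catches regressions in the non-negotiables, while the tracked metric feeds the longer product improvement loop the speaker describes.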
See What You'll Get
A sample of our weekly curated podcast highlights
© 2025 Podmark. All rights reserved.
How it works
Discover the best podcast insights, curated by AI and refined by humans. Here's how Podmark transforms every episode into bite-sized brilliance.