Microsoft is previewing a redesigned Power Query interface in Power BI Desktop with updated visuals and workflows. If you spend significant time cleaning, transforming, or combining data before analysis, this refresh could streamline your editing experience and reduce clicks in your daily ETL steps—though you'll want to test it in the preview to see if it actually saves time versus your current process.
Authored by: Sara Lammini Rodriguez - Product Manager II, and Miguel Escobar - Senior Product Manager
This is a May 2026 Power BI release covering Copilot/AI improvements, reporting and modeling enhancements, and new data connectors. If you regularly use Copilot for exploration or spend time formatting reports and connecting new data sources, you'll likely find faster workflows—but you'll want to check the full release notes to see which specific features affect your typical tasks.
Author: Katie Murray, Senior Program Manager - Power BI continues to evolve with updates that make it easier to explore data, generate insights, and build more polished reports. This month’s release brings improvements across Copilot and AI experiences, reporting and modeling enhancements, new data connectivity flows, …
Outbound Access Protection now applies to semantic models, letting you restrict what external destinations your Power BI models can connect to—blocking everything by default and only allowing approved connections. For analysts, this means your organization can enforce stricter data security policies on semantic models, which could affect which data sources or APIs you can query from, potentially requiring your admin team to explicitly approve new connections you need for your work.
Author: Kay Unkroth, Principl Program Manager - Outbound Access Protection (OAP) is a workspace-level network security and governance feature that blocks outbound traffic from a workspace by default and lets you allow only the destinations you explicitly trust. With this preview, you can now extend OAP to semantic mode…
This is a governance framework for deploying and managing voice AI agents in production environments, covering security, compliance, and operational best practices across their full lifecycle. If you're building customer-facing analytics tools or automating data-driven conversations (like Q&A bots over your datasets), it's worth understanding the governance requirements and anti-patterns—especially around data access, audit trails, and handling sensitive information that voice agents might expose.
In this article Why real-time voice agents require a different governance lens Why real-time voice agents raise the stakes A governance framework for the full agent lifecycle Platform capabilities that support agent governance Security, privacy, and compliance for customer-facing agents Five anti-patterns that derail p…
April 2026 brings layout flexibility improvements, expanded mobile Copilot features, and several preview capabilities across Power BI's reporting and modeling tools. For your daily work, better layout controls mean faster report tweaking, while stronger mobile Copilot support lets you iterate and troubleshoot on-the-go instead of being tied to desktop—though you'll want to check the preview docs to see which features are production-ready for your environment.
Welcome to the April Power BI update! Power BI’s April 2026 update is here, bringing continued improvements across Copilot and AI, reporting, visuals, and modeling. This release includes more flexibility when working with layouts and visuals, expanded Copilot experiences—especially on mobile—and several preview feature…
Copilot Studio added governance controls for AI agents, improved workflow automation, and the ability to embed business applications directly into agents. For data analysts, this means you can now build self-service agent interfaces that connect to your BI tools and data platforms with better oversight, potentially reducing ad-hoc data requests and letting non-technical stakeholders access insights more independently.
In this article Build and scale agents with better visibility and control Expand workflows into intelligent, governed automation systems Bring business apps directly into your agents What else is new and improved in Copilot Studio Stay up to date on all things Copilot Studio As organizations scale their use of AI agent…
The episode argues that Power BI bookmarks—a feature for saving and switching between report states—should be reconsidered in favor of newer alternatives that likely offer better performance or user experience. If you're currently using bookmarks for navigation or report state management, this suggests evaluating whether Microsoft has released better tools to replace this functionality in your reports.
In Episode 526 of Explicit Measures, Mike Carlo and Tommy Puglia unpack the latest Power BI and Microsoft Fabric topics from the show. You’ll get a quick read on the episode’s biggest ideas, why they matter, and where to dig deeper in the full conversation.
Microsoft has released a preview REST API that lets you execute DAX queries programmatically against Power BI datasets without opening the desktop application. This matters because you can now automate data pulls, integrate Power BI queries into Python scripts or other workflows, and build custom solutions that query your semantic models on demand—eliminating manual exports and reducing friction between Power BI and your other analytical tools.
This episode covers recent Power BI and Microsoft Fabric feature updates discussed by the hosts. Without access to the full episode content, the excerpt doesn't specify which features are new, so check the full conversation to see if any changes affect your current dashboards, data modeling, or refresh schedules.
In Episode 525 of Explicit Measures, Mike Carlo and Tommy Puglia unpack the latest Power BI and Microsoft Fabric topics from the show. You’ll get a quick read on the episode’s biggest ideas, why they matter, and where to dig deeper in the full conversation.
Slicers can't directly filter measures because measures are calculations, not data values—but the article explains workarounds to achieve the filtering effect you need. This matters because you'll likely hit this limitation when building dashboards, and knowing the proper techniques (like using measure branches or filter context) will save you time troubleshooting why your slicer isn't working as expected.
A slicer cannot filter a measure: let’s analyze this common request by explaining how to use a slicer to filter a measure, after discussing the real meaning of using a measure with a slicer.
Slicers in Power BI fundamentally filter table columns, not measures—a distinction that trips up newcomers trying to dynamically filter aggregations. Understanding this difference matters for your reports because it clarifies when you need to restructure your data model or use alternative techniques like measure branching instead of expecting a slicer to do something it's architecturally designed not to do.
A slicer cannot filter a measure. In this article, we analyze this common request by explaining how to use a slicer to filter a measure, after discussing the real meaning of using a measure with a slicer. A very common request by Power BI newbies is, “How can I use a slicer to filter a measure rather than a regular mod…
This episode highlights Power BI and Fabric features that exist but aren't widely used, covering practical capabilities that could streamline your reporting work. If you're building dashboards and doing standard analysis, there's likely at least one overlooked feature here that could save you time on data modeling, visualization, or refresh optimization—worth a 20-minute listen to see what you're missing in your daily toolkit.
In Episode 524 of Explicit Measures, Mike Carlo and Tommy Puglia unpack the latest Power BI and Microsoft Fabric topics from the show. You’ll get a quick read on the episode’s biggest ideas, why they matter, and where to dig deeper in the full conversation.
This episode recaps recent developments across Power BI and Microsoft Fabric from the hosts' conversation. Since the excerpt doesn't specify which features or changes were discussed, you'd need to listen to the full episode to understand what's actually new and how it might affect your reporting workflows or data pipeline work.
In Episode 523 of Explicit Measures, Mike Carlo and Tommy Puglia unpack the latest Power BI and Microsoft Fabric topics from the show. You’ll get a quick read on the episode’s biggest ideas, why they matter, and where to dig deeper in the full conversation.
Microsoft Copilot Studio now supports real-time voice agents, letting you build AI assistants that can handle customer support conversations through voice instead of just text. For data analysts, this means you'll likely need to design new data pipelines and monitoring dashboards to track voice interaction metrics, compliance logs, and agent performance data that weren't part of text-only workflows.
Customers expect support that resolves issues quickly, delivers consistent answers, and works seamlessly across channels. For organizations, this creates a familiar tension: how do you deliver high‑quality service at scale without losing control over cost, compliance, or experience? That’s why we’re excited to announce…
This episode covers strategies for developing AI capabilities within Microsoft Fabric to help analysts work more effectively with modern data tools. Since Fabric increasingly integrates AI features into everyday tasks like data preparation and analysis, building these skills now means you'll be able to automate repetitive work and leverage AI-assisted features rather than learning them reactively when they become mandatory in your workflow.
In Episode 522 of Explicit Measures, Mike Carlo and Tommy Puglia unpack the latest Power BI and Microsoft Fabric topics from the show. You’ll get a quick read on the episode’s biggest ideas, why they matter, and where to dig deeper in the full conversation.
Databricks Genie is a tool that helps automate the creation of personalized customer experiences by connecting retail data to AI-driven recommendations without requiring manual query writing. For analysts, this means faster iteration on segmentation and targeting logic—you can ask questions about customer behavior in plain language and get results without building SQL from scratch, though you'd still own validation and refinement of the outputs.
USE CASECustomer Intelligence & Loyalty OptimizationRetail personalization has come...
Databricks has added an "Agent mode" to its AI/BI Genie tool that can handle more complex analytical questions than the standard chat interface, moving beyond simple queries to investigative "why" questions. For analysts, this means you could potentially spend less time manually digging through dashboards and writing queries to explain unexpected patterns—Genie Agent can do some of that reasoning work for you, though you'd still need to validate its conclusions.
See how AI/BI Genie Agent can answer much more complex questions than standard chat. Dashboards, Genie Chat, and Genie Agent all come together in AI/BI to form a comprehensive analytics suite.
Genie (Microsoft Fabric's natural language query tool) is getting attention for enterprise rollouts, but the article argues its real bottleneck isn't the chat interface—it's the semantic layer underneath that needs to be properly designed first. For your workflow, this means that before pushing Genie to your business users, you'll need to invest time ensuring your semantic models are clean, well-documented, and aligned with how the organization actually asks questions, or you'll end up supporting countless confused queries and inconsistent results.
I've been setting up Genie spaces for clients, and the conversations always start in the same place. Someone from the business sees the demo and asks how quickly we can roll it out across the organisation.
Databricks now offers SQL-based alerting to monitor data quality and KPI thresholds automatically instead of requiring manual checks. For your daily work, this means you can set up notifications when metrics drift or data freshness issues occur—reducing time spent on repetitive manual monitoring and letting you focus on analysis instead of babysitting dashboards.
In many organizations, data monitoring is still a manual, repetitive routine: open...
Databricks released Agent Bricks, a framework for building specialized AI agents that automate manufacturing workflows like planning, forecasting, and quality control—cutting planning cycles from days to minutes. If you're building analytics pipelines or forecasting models for manufacturing clients, this means you'll likely be asked to integrate agentic AI into your existing Databricks workflows rather than just serving static dashboards or reports.
Manufacturing moves too fast for generic AI. In this episode, see how specialized AI agents built on Agent Bricks help manufacturers improve planning, forecasting, logistics, and quality at scale. The impact: Planning & inventory: Manufacturers cut planning time from days to minutes and use real-time data from 18,000+ …
Consumers are already using AI chatbots like ChatGPT to research and compare insurance products in real time, shifting purchasing decisions away from traditional distribution channels. For data analysts, this means your organization's customer acquisition models, competitive positioning dashboards, and channel attribution analysis likely need urgent updates—the traffic patterns and conversion funnels you've been tracking are probably shifting faster than your current refresh cycles can detect.
A consumer is asking ChatGPT which home insurance to buy, where to get the cheapest car insurance... right now. Not next year. Not when the technology matures. Now.
A Prompt Registry is a centralized system for storing, versioning, and managing AI prompts—similar to how feature stores organize ML features. If you're building AI applications across your organization, this helps prevent duplicate work, track prompt changes, and maintain consistency instead of having prompts scattered across notebooks and Slack messages.
Introduction A few years back, feature stores became the standard way to bring order to machine learning features by centralising, governing and tracking them. Now we are facing the same challenge with prompts. They multiply quickly, get tweaked without context and become difficult to manage.
Databricks has documented how to use Spark's real-time streaming mode alongside Lakebase (their lakehouse metadata layer) to detect fraudulent transactions as they happen, rather than in batch windows. If you're building fraud models in Databricks, this shows you a concrete pattern for moving detection from hourly/daily jobs to sub-second latency—which matters because it lets you block or flag transactions before they complete, instead of catching fraud after the fact.
Card fraud operates in seconds. A stolen credit card number can fuel dozens of purchases...
Databricks added Unity Catalog support for governing AI agents, letting you apply consistent access controls, audit trails, and permission management across large numbers of agents the same way you would for data assets. If you're building or managing multiple AI agents in Databricks, this means you can now enforce who can use which agents and track their actions without setting up separate governance systems for each one.
A year ago, your organization had a dozen AI agents. Today, there are thousands.Every...
Databricks is partnering with Virtue Foundation, a nonprofit connecting medical volunteers to health services globally, likely providing data infrastructure or analytics support for their operations. For most practicing analysts, this is a partnership announcement rather than a product update—it doesn't change your Power BI, Fabric, or Databricks tooling, but it's worth noting if you work in healthcare or nonprofits where similar use cases might inform your own data strategy.
IntroductionVirtue Foundation is a nonprofit focused on global health delivery and...
This article announces Balaji J as the Databricks Community Champion for May 2026, recognizing their contributions to helping other users in the community. For practicing analysts, it's worth knowing who the active experts are in your tools' communities—you might follow their answers to common questions or learn from how they troubleshoot problems.
Our Community Champion Program celebrates members who consistently contribute their expertise, support fellow practitioners, and help shape a stronger and more collaborative Databricks Community. Every month, we recognize individuals whose passion for learning and willingness to share knowledge create a meaningful impa…
Someone built a working prototype showing how to deliver analytics to multiple customers using Databricks infrastructure, using Databricks' own partner framework and a real reference example (Firefly Analytics) as a guide. If you're building or planning a multi-tenant analytics platform on Databricks, this prototype shows you a tested architectural pattern rather than starting from scratch—which cuts down design time and reduces the risk of architectural mistakes.
Recently, I've been working with a customer to flesh out what Built-On Databricks could look like for them. We used the Databricks Partner Well Architected Framework (PWAF) and the Firefly Analytics example use case as reference, and built a working prototype.
I don't have the actual article excerpt to summarize — it appears the content didn't come through in your message. Could you paste the article excerpt or key details from the Databricks post? Once I have that, I can give you the 2–3 sentence breakdown of what's new and whether it affects your daily workflow.
Unity Catalog now offers APIs that let external tools (like Python scripts or third-party applications) access unstructured data files stored in Volumes using temporary, permission-based credentials instead of requiring you to manually set up cloud IAM roles. For your daily work, this means you can more easily integrate notebooks, ML pipelines, or external tools with your data lake without wrestling with access management—the system handles credential scoping automatically based on UC permissions you've already defined.
Learn how Unity Catalog Volumes and new credential-vending APIs let external tools securely access unstructured data with temporary, scoped credentials tied to UC permissions, eliminating manual IAM management. Govern tables, models, features, and unstructured data consistently across clouds and engines.
An enterprise improved their PII redaction system to handle 5 million documents in 17 days instead of 100+, moving from local processing to an Azure-based scaled solution. For analysts working with sensitive data, this means compliance redaction tasks that previously blocked projects for months can now complete in weeks, letting you move protected datasets into Fabric or Databricks pipelines without lengthy delays or manual workarounds.
Introduction When the first version of our PII redaction system went live, we could process roughly 1,500 insurance documents per hour on a local development machine. Impressive enough for a proof of concept. But when your client drops this on you: "We have 5 million documents that need compliance redaction. How long w…
The article covers using Python's itertools module to generate time-series features (like lags, rolling statistics, or aggregations) more efficiently than typical pandas or manual loops. If you're building ML models on time-stamped data in Python, this could speed up your feature engineering and reduce memory overhead, which matters when you're working with large datasets that don't fit comfortably in memory or when you need to iterate quickly on feature experiments.
Learn how to use Python itertools to build efficient and scalable time series features.
When you deploy AI agents for tasks like data processing or reporting, costs and resource use can spiral quickly unless you plan carefully—this article walks through using optimization techniques and data science methods to allocate agent skills and budgets efficiently. If you're building agent-based automation in Power BI, Fabric, or Python workflows, understanding how to frame these problems (which agents do what, who gets assigned where, what's your spending cap) directly affects whether your solution stays cost-effective or becomes a budget drain.
AI agents can quickly become expensive without a clear strategy for planning, skill coverage, and budgets. This article shows how to use operations research and data science to optimize AI agent cost and resource allocation. You will learn how to frame common agent problems—skill coverage, project assignment, and budge…
Mimesis is a Python library that generates fake but realistic data to replace sensitive information in production datasets. If you work with real customer or business data in development or testing, this tool lets you create safe anonymized copies for analysis and model training without exposing actual PII—useful when you need production-realistic data but can't use the real thing in non-production environments.
Learn how to utilize Python's Mimesis library for anonymizing sensitive production data, based on a step-by-step example to try yourself.
SageMaker AI endpoints now accept OpenAI-compatible API calls, meaning you can swap in a SageMaker endpoint URL without rewriting code that uses OpenAI's SDK, LangChain, or similar tools. For analysts building LLM-powered features or notebooks, this eliminates friction around authentication and client libraries—you can test models on SageMaker with minimal code changes, making it easier to experiment with or migrate between cloud providers without disrupting your workflow.
Today, Amazon SageMaker AI introduces OpenAI-compatible API support for real-time inference endpoints. If you use the OpenAI SDK, LangChain, or Strands Agents, you can now invoke models on SageMaker AI by changing only your endpoint URL. You don’t need a custom client, a SigV4 wrapper, or code rewrites. Overview With t…
AWS has released multimodal evaluators in Strands Evals that use AI models to judge whether image-based AI outputs are actually correct—for example, whether a generated caption accurately describes an image or whether extracted numbers from documents match reality. If you're building or validating models that work with images, documents, or charts, this gives you a programmatic way to automatically check quality instead of manually reviewing outputs or relying on text-only checks that can't see what the model is looking at.
If you’re building visual shopping, image or document understanding, or chart analysis, you need a way to verify whether your model’s response is actually grounded in the source image. A text-only evaluator cannot tell you whether a caption faithfully describes an image, whether an extracted invoice total matches the d…
AWS added streaming inference support to SageMaker for real-time voice applications, letting you send audio continuously and get transcriptions back on the same connection instead of waiting for batch responses. For most Power BI or Python analysts this won't affect your day-to-day work unless you're building voice-enabled dashboards or contact center analytics—but if you are, this removes a technical bottleneck that previously forced you to choose between latency and infrastructure complexity.
Voice agents, live captioning, contact center analytics, and accessibility tools all depend on real-time speech-to-text, where your application streams audio in and receives transcription back simultaneously over a single persistent connection. Traditional request-response inference falls short here because transcripti…
This article introduces Lean, a programming language designed around mathematical proof and formal verification. For data analysts, this is mostly academic interest—Lean doesn't integrate with Power BI, Fabric, Databricks, or standard Python ML workflows, so it won't change your day-to-day work unless you're specifically interested in formally verifying complex statistical logic or mathematical correctness in your analyses.
The syntax and semantics of mathematics The post Introduction to Lean for Programmers appeared first on Towards Data Science .
Proxy-Pointer RAG is a technique for cleaning up messy knowledge graphs by matching duplicate entities and relationships before feeding data into retrieval-augmented generation (RAG) systems. If you're building RAG pipelines on top of fragmented or poorly deduplicated reference data—common in enterprise environments—this approach could reduce hallucinations and improve answer quality without manual data cleanup.
A scalable semantic localization layer for entity and relationship reconciliation The post Proxy-Pointer RAG: Solving Entity and Relationship Sprawl in Large Knowledge Graphs appeared first on Towards Data Science .
Kimi WebBridge is a browser extension that lets AI agents automate web interactions—opening pages, filling forms, clicking buttons, and extracting data—without leaving your browser. For data analysts, this could streamline repetitive tasks like pulling data from web portals or APIs, though it's worth testing whether it integrates smoothly with your existing Power BI or Python workflows before relying on it for production work.
AI agents are evolving from answering questions to taking actions inside browsers. They can now open pages, click buttons, fill forms, extract data, and automate multi step workflows across websites. Moonshot AI’s Kimi WebBridge brings this capability to Chrome and Edge, allowing local AI agents to safely interact with…
Amazon Nova Sonic is a new voice AI model designed to handle real-time audio processing with lower latency and support for multi-agent coordination and tool integration. For data analysts, this mainly matters if your organization is building voice interfaces for data queries or dashboards—otherwise it's primarily relevant to ML engineers and backend teams building voice applications rather than someone working directly in Power BI, Fabric, or Python analytics.
Design patterns for scalable voice agents matter for organizations that need to deliver fast, natural, and reliable voice experiences. Many teams face challenges like high latency, managing real-time audio, and coordinating multiple agents in complex workflows. In this post, you’ll learn how to use Amazon Nova Sonic , …
Amazon Bedrock's Kiro CLI now retains conversational history and user preferences across multiple sessions instead of resetting after each one. For data analysts, this means you can reference earlier questions, context about your data schema, or analytical approaches from previous days without re-explaining your setup—similar to how a persistent chatbot remembers your work style rather than treating each session as brand new.
Agentic IDEs that forget what you told them in previous sessions aren’t very helpful. You work on your large codebase with complex business requirements for days or weeks. However, your IDE only remembers you during your current session and can’t recall your conversational history, preferences derived from the conversa…
AWS has published a benchmarking framework for comparing SQL processing engines on their platform, helping you evaluate which tool (Athena, Redshift, EMR, etc.) performs best for your specific workload patterns. This matters because choosing the wrong engine can mean slower queries and higher costs—this framework gives you concrete comparison criteria to test before committing to a solution for your analytics pipeline.
Selecting the right SQL processing solution for large-scale data analytics is a critical decision for organizations. As data volumes grow exponentially, the technology landscape has evolved to offer diverse options for processing and analyzing this information efficiently. This post presents a systematic framework for …
Amazon EMR on EC2 now supports generating large volumes of synthetic test data to replace production data in testing environments, avoiding the compliance and security risks of using real customer information. For analysts working with sensitive data in regulated industries, this means you can validate pipelines, test transformations, and troubleshoot without needing to set up separate anonymization processes or worry about accidentally exposing PII in development and QA environments.
As you scale your data systems, you face a challenge: how to test thoroughly without putting customer data at risk. Using production data for testing can expose sensitive customer information to unauthorized access or breaches. For customers in regulated industries like finance and healthcare, this risk isn’t only a co…
Amazon released new Redshift RG instances built on cheaper AWS Graviton processors that run 2.2–2.4x faster than the previous RA3 generation while costing 30% less per compute unit. If you're running Redshift warehouses or querying data lakes through Redshift, you could significantly reduce your infrastructure costs and speed up query times without changing your code or workflows.
On May 12, 2026, we announced the general availability of Amazon Redshift RG instances , powered by AWS Graviton processors. RG instances are up to 2.2x as fast for data warehouse workloads and up to 2.4x as fast for data lake workloads, all at 30% lower price per vCPU compared to RA3 instances. RG instances support al…
Google has developed ERA, an AI system that helps researchers generate and test hypotheses by analyzing large datasets and scientific literature to suggest experimental directions. For data analysts, this signals growing integration of AI-assisted discovery into research workflows—potentially affecting how you scope exploratory analysis, validate findings, and communicate insights to domain experts, though it's not yet a tool you'd use directly in Power BI or Python.
Ramp used OpenAI's Codex with GPT-5.5 to automate code review feedback, cutting review time from hours to minutes. For data analysts, this means AI-assisted code review could speed up your Pull Request cycles—useful if you're sharing Python scripts or DAX formulas with teammates—though the concrete impact depends on whether your organization adopts similar tooling.
How Ramp engineers use Codex with GPT-5.5 to review code and ship improvements, allowing them to get substantive feedback in minutes instead of hours.
SpaceX is building large-scale compute infrastructure (COLOSSUS II) to train AI models like Grok 5 while also selling spare capacity to other AI companies like Anthropic. For most data analysts working with BI and analytics tools, this doesn't directly affect your daily workflow—it's primarily relevant if your organization is considering alternative cloud compute providers or planning large-scale ML infrastructure investments.
We have the ability to use compute resources to support our proprietary AI applications (such as Grok 5, which is currently being trained at COLOSSUS II), while also providing access to select compute capacity to third-party customers. For example, in May 2026, we entered into Cloud Services Agreements with Anthropic P…
GitHub Copilot sessions can now be accessed outside your local machine, letting you pick up multiple concurrent coding tasks (like refactoring, debugging, and building features) from different devices. For data analysts, this means you can start a Python script or dbt transformation on your laptop, then continue or monitor it from a tablet or secondary machine without losing context or having to restart your work.
The best GitHub Copilot workflows don’t happen one–thing–at–a time. You might have an agent refactoring a module in VS Code, another debugging tests in the CLI, and a third scaffolding a new feature in the background. Managing all of that used to only be possible from your desk. The moment you stepped away from your la…
GitHub is testing an AI agent that automatically identifies and fixes accessibility issues in code. For data analysts, this could reduce manual code review work and help ensure your Python scripts, Power BI custom visuals, or Databricks notebooks meet accessibility standards without extra effort during development.
It is an understatement to say agents have become a popular way of working with code. GitHub has adopted agent-based code creation and editing for many of its initiatives, including piloting an agent to help with our commitment to accessibility . GitHub is currently piloting an experimental general-purpose accessibilit…
This article describes building a GitHub CLI extension in Go using GitHub Copilot, demonstrated through a fun project that generates procedurally created roguelike dungeons. For data analysts, it's not directly relevant to daily Power BI, Fabric, or Python ML work unless you're interested in how AI coding assistants can help you build custom CLI tools or automate repository-based workflows.
I got nerd-sniped into the GitHub Copilot CLI Challenge and made a questionable decision: I turned my codebase into a roguelike dungeon. It started with a simple prompt: Build a GitHub CLI extension in Go that takes the current repository and turns it into a playable roguelike dungeon, with dungeons generated with BSP …
Copilot Studio added governance controls for AI agents, improved workflow automation, and the ability to embed business applications directly into agents. For data analysts, this means you can now build self-service agent interfaces that connect to your BI tools and data platforms with better oversight, potentially reducing ad-hoc data requests and letting non-technical stakeholders access insights more independently.
In this article Build and scale agents with better visibility and control Expand workflows into intelligent, governed automation systems Bring business apps directly into your agents What else is new and improved in Copilot Studio Stay up to date on all things Copilot Studio As organizations scale their use of AI agent…
GitHub is making their AI agent workflows use fewer tokens (and thus cost less) when they automatically run maintenance tasks on code repositories. For data analysts, this matters if you use GitHub for version control with automated CI jobs—lower token costs mean you can run more frequent code quality checks and automated data pipeline maintenance without watching your bill spike.
GitHub Agentic Workflows is like a team of street sweepers that clean up little messes in your repo. These teams significantly improve repo hygiene and quality, but as with all agentic work, cost is a growing concern for developers. And because CI jobs like agentic workflows are automatically scheduled and triggered, c…
AI-generated code is becoming common in pull requests, but it often contains hidden technical debt and redundancy that passes review because it looks clean on the surface. As a data analyst, this matters because approving AI-generated ETL, transformation, or analysis code without scrutiny could leave you maintaining bloated, inefficient pipelines that are harder to debug and scale—particularly risky when you're working across Power BI, Fabric, or Databricks where performance directly impacts your reports and queries.
You’ve probably already approved one without realizing it. The tests passed. The code was clean. You merged it. But it was agent-generated—and that ease of approval is exactly the problem. A January 2026 study, “More Code, Less Reuse” , found that agent-generated code introduces more redundancy and more technical debt …
Microsoft is open-sourcing the Azure Integrated HSM, a hardware security module that encrypts and protects sensitive data at the infrastructure level. For data analysts working with regulated or sensitive datasets in Azure environments, this means better visibility into how your data encryption actually works and potentially stronger compliance assurance—though the day-to-day impact on Power BI or Fabric workflows depends on your organization's security policies and whether they adopt this for your specific workloads.
As cloud workloads become more agentic and AI systems increasingly handle mission‑critical data, trust must be engineered into the infrastructure at every layer. At Microsoft, security is designed into the foundation of our cloud infrastructure, from silicon to services. With the Azure Integrated Hardware Security Modu…
Microsoft Copilot Studio now supports real-time voice agents, letting you build AI assistants that can handle customer support conversations through voice instead of just text. For data analysts, this means you'll likely need to design new data pipelines and monitoring dashboards to track voice interaction metrics, compliance logs, and agent performance data that weren't part of text-only workflows.
Customers expect support that resolves issues quickly, delivers consistent answers, and works seamlessly across channels. For organizations, this creates a familiar tension: how do you deliver high‑quality service at scale without losing control over cost, compliance, or experience? That’s why we’re excited to announce…
This is a roundup of Google's 2026 announcements including Gemini Omni (likely an upgraded AI model), Antigravity, and Universal Cart, among 100 total updates. Without specifics on which tools integrate with your BI/analytics stack (Power BI, Fabric, Databricks, Python), it's unclear which announcements affect your daily workflow—you'd need to dig into whether any Google services you depend on got material upgrades or new API capabilities.
This year at Google I/O 2026, we announced Gemini Omni, Google Antigravity, Universal Cart and so much more. Here are the highlights.
OpenAI's AI model solved a longstanding mathematical problem in discrete geometry by disproving an 80-year-old conjecture about unit distances. For data analysts, this demonstrates AI's emerging capability to solve novel mathematical problems, which could eventually inform how algorithmic optimization and computational geometry are applied to complex data modeling—though this particular breakthrough doesn't directly change your day-to-day work with BI tools or Python analytics right now.
An OpenAI model solved the 80-year-old unit distance problem, disproving a major conjecture in discrete geometry and marking a milestone in AI-driven mathematics.
Mike Veerman built an interactive tool that lets you visually see what different LLM token speeds actually feel like in practice—so when a vendor says "30 tokens/second," you can watch it play out instead of guessing. If you're evaluating LLMs for reports, chatbots, or real-time analytics features, this helps you judge whether a model's speed will feel snappy or sluggish to your end users before you commit to it.
How fast is 10 tokens per second really? Neat little HTML app by Mike Veerman ( source code here ) which simulates LLM token output speeds from 5/second to 800/second. Useful if you see a model advertised as "30 tokens/second" and want to get a feel for what that actually looks like. Via Hacker News Tags: ai , generati…
# Summary
Simon Willison reviews Google I/O announcements but focuses only on features actually available to use now, rather than vaporware roadmap items—a practical stance since many "coming soon" products never ship or change significantly before launch. For data analysts, this means if you're considering Google's new tools, you might want to wait for his deeper dives on production-ready features rather than getting excited about announcements that may not materialize or may differ from initial promises.
It's hard to find much to write about Google I/O this year because I have a policy of not writing about anything that I can't try out myself, and a lot of the big announcements are "coming soon". I actually prefer to write about things that are in general availability, because I've had instances in the past where the p…
Google is adding tools to track and display the creation and editing history of web content. For data analysts, this could matter when you're evaluating source credibility or auditing how datasets and documentation have changed over time, though the excerpt doesn't specify whether these tools apply to data-specific platforms you'd use daily like Power BI or Databricks.
We're expanding our tools to help you understand how content was created and edited across the web.
DataRobot's certification track for analysts using the platform's AutoML, time-series, and predictive workflows. Validates day-to-day analyst skills like data prep, model selection, and explainability without writing ML code from scratch.
Builds and configures custom copilots and AI agents in Microsoft Copilot Studio. Directly relevant for Power BI and Fabric analysts: the same copilot patterns appear in Power BI's natural-language Q&A and in Copilot in Fabric, so the cert hardens the AI-prompting and grounding skills you already use day-to-day.
Anthropic Courses — Prompt Engineering & AI Fundamentals
Anthropic's official self-paced courses covering prompt engineering, AI fluency for everyday work, and building practical apps with Claude. Not a formal certification, but the closest thing to an authoritative learning path for analysts moving into AI-assisted workflows.
Agentic AI moved from demo to production this month — Databricks shipped Genie as a real data agent, Power BI opened a Data Analysis Expressions (DAX) Representational State Transfer Application Programming Interface (REST API) for programmatic access, and Anthropic raised Claude Code limits while patching a sandbox escape. The takeaway: analysts now need to think about semantic layers, governance, and code-agent security as core skills, not side projects.
Per-section highlights
⚡ Power BI & Fabric
Execute DAX Queries REST API is in preview — you can now hit semantic models programmatically from Python, Fabric pipelines, or any external app without building a report or embedding the model.
Composite semantic models mixing Direct Lake and import tables hit public preview, so you can keep large fact tables in Direct Lake for speed and import smaller dimensions for transformation flexibility — no more all-or-nothing storage mode decisions.
DAX User-Defined Functions (UDFs) now support typed parameters (MEASUREREF, COLUMNREF, TABLEREF, CALENDARREF), making reusable DAX libraries across models actually safe to share.
Modern visual tooltips went generally available across Desktop, web, mobile, Teams, and embedded — one consistent hover experience everywhere reports are consumed.
A centralized semantic model settings pane (preview) and the April feature summary land more Copilot mobile features — worth a scan before your next model tune-up.
◈ Databricks
Genie, Databricks' data agent for natural-language questions over your lakehouse, got a major push as a production-ready answer to ad-hoc SQL requests — but Advancing Analytics' counterpoint argues the bottleneck is your semantic layer, not the chat UI.
Catalog Commits is generally available, aligning Delta with open catalogs so external engines can read and write Unity Catalog (UC) managed tables consistently.
External engines can now run Data Manipulation Language (DML) — insert/update/delete — directly against UC managed Delta tables, removing a big friction point for non-Databricks pipelines writing into governed storage.
New 5XL Structured Query Language (SQL) warehouse size for workloads where 4XL hits its limit on heavy Extract-Transform-Load (ETL) and complex analytics.
Semantic caching pattern using Databricks Lakebase plus pgvector — practical guide to cutting Large Language Model (LLM) costs when users ask the same question fifteen different ways.
⚗ ML & AI Tools
Towards Data Science makes the case that batch-vs-stream is the wrong framing — the real question is "when does the answer need to be true?" — useful gut-check before architecting your next Fabric or Databricks pipeline.
A practitioner argues LLM meeting summarizers skip the identification step (what does the data actually support?), the same way bad regressions skip exploratory data analysis — treat AI summaries as drafts, not truth.
Prompt compression techniques for agentic loops — direct cost relief if you're running multi-step agents that burn tokens on context every iteration.
Refresher pieces on PySpark fundamentals and modern Python type annotations — worth bookmarking if you're onboarding teammates or hardening notebook code into production scripts.
🤖 Agentic AI & LLMs
Anthropic raised Claude Code usage limits (credited to a new SpaceX/Colossus compute deal) — fewer rate-limit interruptions when iterating on Python, SQL, or DAX.
Claude Code CVE-2026-39861: sandbox escape via symlink. Patch now if you're running Claude Code against sensitive local data or production credentials.
OpenAI launched DeployCo, a dedicated enterprise-deployment arm, plus published its internal Codex security playbook (sandboxing, approvals, network policy, agent telemetry) — a reasonable checklist to bring to your own platform team.
Anthropic published Natural Language Autoencoders (NLAs), which translate Claude's internal activations into readable text — early but real progress on auditing why a model answered the way it did.
Shopify's internal coding agent "River" works entirely in public Slack channels — a governance pattern worth stealing if your team is rolling out coding agents.
Cross-cutting themes
The semantic layer is the new battleground
Three stories pointed at the same thing this month. Databricks shipped Genie as a frontier data agent, but Advancing Analytics' "Genie is a Semantic Layer Problem, Not a Chat Problem" cut straight to the point: natural-language querying only works when the underlying model is clean. Meanwhile Power BI opened the DAX REST API, made composite Direct Lake plus import models a real option, and added typed parameters to DAX UDFs — all investments in making semantic models more programmable and reusable. The pattern: whoever owns the well-modeled, well-governed semantic layer owns the AI experience on top of it, and that's increasingly the analyst's job.
Agentic AI is now a production concern, not a demo
Genie went from preview to a real Databricks data agent, Agent Bricks showed up in a healthcare pilot-to-production series, supply chain workloads moved "from dashboards to agents" in a BrickTalk, and Copilot Studio added real-time voice agents. On the developer side, Claude Code raised limits, OpenAI published how it runs Codex safely, and the Claude Code CVE reminded everyone that code-running agents need real sandboxing. The shift for analysts: you're no longer evaluating whether to use agents — you're being asked how to govern them, cost-control them (see semantic caching, prompt compression), and validate their outputs.
Open catalogs and external engines blur the platform lines
Catalog Commits going generally available and external engines getting full DML on Unity Catalog managed tables means your Python jobs, Fabric pipelines, and non-Databricks tooling can now write directly into governed Delta tables. Combined with Power BI's DAX REST API and composite Direct Lake models, the practical effect is that the wall between "Databricks work" and "Fabric work" keeps thinning. Analysts who can stitch pipelines across both — and keep governance intact through UC — are increasingly the ones in demand.
The single most useful framing this month for anyone being asked to "roll out Genie" or any natural-language BI tool. Read this before your next stakeholder meeting on AI-driven analytics.
This unlocks a real architectural choice — keep big facts in Direct Lake, import smaller dimensions — that can cut refresh times and complexity. Worth the full read so you understand the tradeoffs before mixing modes.
A genuinely new integration surface. If you script in Python or orchestrate in Fabric, this is the first credible way to treat your semantic model as a calculation API for external apps.
Concrete cost-control pattern with code. If you've shipped any LLM-powered feature, this approach to deduplicating semantically-similar requests will pay for itself fast.
Pairs well with the semantic caching piece. Multi-step agents balloon token spend through repeated context — this gives you a practical lever to pull.
What to watch next month
Databricks Data + AI Summit (June 15–18, San Francisco)Expect Genie, Agent Bricks, Catalog Commits, and 5XL warehouses to get their full reveal. If you're planning Q3 architecture, this sets the roadmap.
Microsoft Build 2026 (June 2–3, San Francisco)Likely landing zone for DAX REST API general availability, composite Direct Lake model GA, and the next round of Copilot-in-Fabric features.
Snowflake Summit 26 (June 1–4, San Francisco)Worth watching for Snowflake's response to Catalog Commits and Genie — open catalog interoperability is now a competitive front.
Claude Code security posture after CVE-2026-39861If your team adopted Claude Code in the last quarter, expect security to ask hard questions. Get ahead of it: patch, document sandboxing, and review what files the agent can actually reach.
Power BI preview features moving to general availabilityDAX REST API, composite Direct Lake models, and the semantic model settings pane are all in preview. Track the monthly feature summaries — GA timing affects whether you can use them in production reports.
Signal
Cutting through the noise in modern analytics — AI-curated intelligence on Power BI, Databricks, ML/AI, and Agentic AI, built for practicing analysts.
How it works
AI-curated daily across Power BI, Databricks, ML/AI, and Agentic AI
AI summaries rewritten for analysts — what changed and why it matters to your day-to-day
Monthly executive briefs that pull the cross-cutting themes together
By the numbers
303Articles processed
303AI summaries
1Briefs generated
71Sources monitored
DailyUpdates
Tech stack
Claude AIFirebase HostingPythonYouTube Data APIGitHub Actions
About the creator
Built by Matt Valladarez. Signal exists because the analytics ecosystem moves faster than any one practitioner can track — this is the digest I wanted to read every morning, automated end-to-end so the curation never slips.
💡 Question of the Day
This video can't be embedded here. Watch it directly on YouTube instead.