April 2026 brings layout flexibility improvements, expanded mobile Copilot features, and several preview capabilities across Power BI's reporting and modeling tools. For your daily work, better layout controls mean faster report tweaking, while stronger mobile Copilot support lets you iterate and troubleshoot on-the-go instead of being tied to desktop—though you'll want to check the preview docs to see which features are production-ready for your environment.
Welcome to the April Power BI update! Power BI’s April 2026 update is here, bringing continued improvements across Copilot and AI, reporting, visuals, and modeling. This release includes more flexibility when working with layouts and visuals, expanded Copilot experiences—especially on mobile—and several preview feature…
Copilot Studio added governance controls for AI agents, improved workflow automation, and the ability to embed business applications directly into agents. For data analysts, this means you can now build self-service agent interfaces that connect to your BI tools and data platforms with better oversight, potentially reducing ad-hoc data requests and letting non-technical stakeholders access insights more independently.
In this article Build and scale agents with better visibility and control Expand workflows into intelligent, governed automation systems Bring business apps directly into your agents What else is new and improved in Copilot Studio Stay up to date on all things Copilot Studio As organizations scale their use of AI agent…
The episode argues that Power BI bookmarks—a feature for saving and switching between report states—should be reconsidered in favor of newer alternatives that likely offer better performance or user experience. If you're currently using bookmarks for navigation or report state management, this suggests evaluating whether Microsoft has released better tools to replace this functionality in your reports.
In Episode 526 of Explicit Measures, Mike Carlo and Tommy Puglia unpack the latest Power BI and Microsoft Fabric topics from the show. You’ll get a quick read on the episode’s biggest ideas, why they matter, and where to dig deeper in the full conversation.
Microsoft has released a preview REST API that lets you execute DAX queries programmatically against Power BI datasets without opening the desktop application. This matters because you can now automate data pulls, integrate Power BI queries into Python scripts or other workflows, and build custom solutions that query your semantic models on demand—eliminating manual exports and reducing friction between Power BI and your other analytical tools.
# Summary
Microsoft is adding a new settings pane in Power BI's semantic model interface (currently in preview), giving you centralized controls for model configuration without jumping between different menus. This could speed up your setup work by consolidating commonly-adjusted parameters in one place, though the concrete impact depends on which specific settings are included—worth checking once it's available in your environment.
This episode covers recent Power BI and Microsoft Fabric feature updates discussed by the hosts. Without access to the full episode content, the excerpt doesn't specify which features are new, so check the full conversation to see if any changes affect your current dashboards, data modeling, or refresh schedules.
In Episode 525 of Explicit Measures, Mike Carlo and Tommy Puglia unpack the latest Power BI and Microsoft Fabric topics from the show. You’ll get a quick read on the episode’s biggest ideas, why they matter, and where to dig deeper in the full conversation.
Slicers can't directly filter measures because measures are calculations, not data values—but the article explains workarounds to achieve the filtering effect you need. This matters because you'll likely hit this limitation when building dashboards, and knowing the proper techniques (like using measure branches or filter context) will save you time troubleshooting why your slicer isn't working as expected.
A slicer cannot filter a measure: let’s analyze this common request by explaining how to use a slicer to filter a measure, after discussing the real meaning of using a measure with a slicer.
Slicers in Power BI fundamentally filter table columns, not measures—a distinction that trips up newcomers trying to dynamically filter aggregations. Understanding this difference matters for your reports because it clarifies when you need to restructure your data model or use alternative techniques like measure branching instead of expecting a slicer to do something it's architecturally designed not to do.
A slicer cannot filter a measure. In this article, we analyze this common request by explaining how to use a slicer to filter a measure, after discussing the real meaning of using a measure with a slicer. A very common request by Power BI newbies is, “How can I use a slicer to filter a measure rather than a regular mod…
This episode highlights Power BI and Fabric features that exist but aren't widely used, covering practical capabilities that could streamline your reporting work. If you're building dashboards and doing standard analysis, there's likely at least one overlooked feature here that could save you time on data modeling, visualization, or refresh optimization—worth a 20-minute listen to see what you're missing in your daily toolkit.
In Episode 524 of Explicit Measures, Mike Carlo and Tommy Puglia unpack the latest Power BI and Microsoft Fabric topics from the show. You’ll get a quick read on the episode’s biggest ideas, why they matter, and where to dig deeper in the full conversation.
You can now mix Direct Lake and import tables in the same semantic model, giving you flexibility to use fast direct queries on some tables while importing others for performance or compatibility reasons. This matters because you're no longer locked into one storage mode per model—you can optimize each table individually based on your refresh cadence, query patterns, and data freshness needs instead of rebuilding models to accommodate different requirements.
Getting your data job done just got easier with composite semantic models , mixing Direct Lake tables with import tables, now available in public preview. Direct Lake on OneLake table storage mode already could mix tables from other Fabric data sources, such as lakehouses, warehouses, SQL databases in Fabric, and mirro…
Power BI has rolled out updated visual tooltips across all its platforms (Desktop, web, mobile, Teams, and embedded reports), replacing the older tooltip design. For analysts sharing reports with stakeholders, this means tooltips now look and behave consistently everywhere your reports appear, reducing confusion when people view the same dashboard on different devices or in different contexts.
Power BI’s latest update introduces an enhancement to how users interact with reports with the general availability of modern visual tooltips . All Power BI reports—from Power BI Desktop to Power BI reports in the web, in the mobile app, in Teams, and embedded in any website—now use the updated visual tooltips, making …
This episode recaps recent developments across Power BI and Microsoft Fabric from the hosts' conversation. Since the excerpt doesn't specify which features or changes were discussed, you'd need to listen to the full episode to understand what's actually new and how it might affect your reporting workflows or data pipeline work.
In Episode 523 of Explicit Measures, Mike Carlo and Tommy Puglia unpack the latest Power BI and Microsoft Fabric topics from the show. You’ll get a quick read on the episode’s biggest ideas, why they matter, and where to dig deeper in the full conversation.
Microsoft Copilot Studio now supports real-time voice agents, letting you build AI assistants that can handle customer support conversations through voice instead of just text. For data analysts, this means you'll likely need to design new data pipelines and monitoring dashboards to track voice interaction metrics, compliance logs, and agent performance data that weren't part of text-only workflows.
Customers expect support that resolves issues quickly, delivers consistent answers, and works seamlessly across channels. For organizations, this creates a familiar tension: how do you deliver high‑quality service at scale without losing control over cost, compliance, or experience? That’s why we’re excited to announce…
This episode covers strategies for developing AI capabilities within Microsoft Fabric to help analysts work more effectively with modern data tools. Since Fabric increasingly integrates AI features into everyday tasks like data preparation and analysis, building these skills now means you'll be able to automate repetitive work and leverage AI-assisted features rather than learning them reactively when they become mandatory in your workflow.
In Episode 522 of Explicit Measures, Mike Carlo and Tommy Puglia unpack the latest Power BI and Microsoft Fabric topics from the show. You’ll get a quick read on the episode’s biggest ideas, why they matter, and where to dig deeper in the full conversation.
DAX now lets you specify parameter types in user-defined functions using MEASUREREF, COLUMNREF, TABLEREF, and CALENDARREF—essentially declaring what kind of object each parameter should accept. This matters because it prevents runtime errors from passing the wrong object type into a function and makes your DAX code more self-documenting, so you (or colleagues) won't accidentally pass a table where a column was expected.
Learn how to specify the parameter types in DAX user-defined functions using MEASUREREF, COLUMNREF, TABLEREF, and CALENDARREF.
Databricks has added an "Agent mode" to its AI/BI Genie tool that can handle more complex analytical questions than the standard chat interface, moving beyond simple queries to investigative "why" questions. For analysts, this means you could potentially spend less time manually digging through dashboards and writing queries to explain unexpected patterns—Genie Agent can do some of that reasoning work for you, though you'd still need to validate its conclusions.
See how AI/BI Genie Agent can answer much more complex questions than standard chat. Dashboards, Genie Chat, and Genie Agent all come together in AI/BI to form a comprehensive analytics suite.
Genie (Microsoft Fabric's natural language query tool) is getting attention for enterprise rollouts, but the article argues its real bottleneck isn't the chat interface—it's the semantic layer underneath that needs to be properly designed first. For your workflow, this means that before pushing Genie to your business users, you'll need to invest time ensuring your semantic models are clean, well-documented, and aligned with how the organization actually asks questions, or you'll end up supporting countless confused queries and inconsistent results.
I've been setting up Genie spaces for clients, and the conversations always start in the same place. Someone from the business sees the demo and asks how quickly we can roll it out across the organisation.
Databricks now offers 5XL SQL Warehouses—a larger compute tier above the existing 4XL option—for workloads that consistently hit performance or timeout limits. If you're running ETL jobs or analytical queries that max out your current warehouse capacity and cause SLA misses, this gives you a path to scale up without redesigning your queries or data model.
A practical guide to evaluating 5XL SQL Warehouses for intensive ETL and complex analytical queries that push 4XL to its limits.
Databricks released Agent Bricks, a framework for building specialized AI agents that automate manufacturing workflows like planning, forecasting, and quality control—cutting planning cycles from days to minutes. If you're building analytics pipelines or forecasting models for manufacturing clients, this means you'll likely be asked to integrate agentic AI into your existing Databricks workflows rather than just serving static dashboards or reports.
Manufacturing moves too fast for generic AI. In this episode, see how specialized AI agents built on Agent Bricks help manufacturers improve planning, forecasting, logistics, and quality at scale. The impact: Planning & inventory: Manufacturers cut planning time from days to minutes and use real-time data from 18,000+ …
Consumers are already using AI chatbots like ChatGPT to research and compare insurance products in real time, shifting purchasing decisions away from traditional distribution channels. For data analysts, this means your organization's customer acquisition models, competitive positioning dashboards, and channel attribution analysis likely need urgent updates—the traffic patterns and conversion funnels you've been tracking are probably shifting faster than your current refresh cycles can detect.
A consumer is asking ChatGPT which home insurance to buy, where to get the cheapest car insurance... right now. Not next year. Not when the technology matures. Now.
A Prompt Registry is a centralized system for storing, versioning, and managing AI prompts—similar to how feature stores organize ML features. If you're building AI applications across your organization, this helps prevent duplicate work, track prompt changes, and maintain consistency instead of having prompts scattered across notebooks and Slack messages.
Introduction A few years back, feature stores became the standard way to bring order to machine learning features by centralising, governing and tracking them. Now we are facing the same challenge with prompts. They multiply quickly, get tweaked without context and become difficult to manage.
The excerpt provided doesn't specify what the "new trick" actually is—it only establishes that Databricks Feature Store exists as a centralized tool for cataloging and sharing ML features. Without details on the actual update, I can't tell you whether it changes your daily work or how.
Databricks Feature Store has been around for a while now, giving data scientists a central place to catalogue, share, and reuse well-crafted features for machine learning models. It’s not the first feature store on the market, but it’s become a key part of Databricks’ AI and ML ecosystem.
This is a tutorial video walking through workspace setup in Databricks, covering what a workspace is and the steps to create one, with information on free tier access. If you're new to Databricks or setting up your first environment, this gives you the practical steps to get running; if you're already working in Databricks, it won't change your daily workflow.
To use Databricks, you need a workspace. In this video, I explain what a Databricks workspace is and how to create one. I also tell you how you can use Databricks for free. Join my Patreon Community https://www.patreon.com/bePatron?u=63260756 Slides https://github.com/bcafferky/shared/blob/master/MasterDatabricks_2nd/L…
Databricks has released a structured learning program designed to help analysts develop analytics engineering skills—things like data transformation, modeling, and pipeline building on their platform. If you're working in Databricks and want a clear path to level up beyond basic querying (or you're evaluating whether to invest time there), this gives you a documented curriculum instead of piecing together tutorials yourself.
Today, we are launching the new Databricks Analytics Engineer Learning Pathway. This...
I don't have the actual article excerpt to summarize — it appears the content didn't come through in your message. Could you paste the article excerpt or key details from the Databricks post? Once I have that, I can give you the 2–3 sentence breakdown of what's new and whether it affects your daily workflow.
Unity Catalog now offers APIs that let external tools (like Python scripts or third-party applications) access unstructured data files stored in Volumes using temporary, permission-based credentials instead of requiring you to manually set up cloud IAM roles. For your daily work, this means you can more easily integrate notebooks, ML pipelines, or external tools with your data lake without wrestling with access management—the system handles credential scoping automatically based on UC permissions you've already defined.
Learn how Unity Catalog Volumes and new credential-vending APIs let external tools securely access unstructured data with temporary, scoped credentials tied to UC permissions, eliminating manual IAM management. Govern tables, models, features, and unstructured data consistently across clouds and engines.
Databricks has opened up Unity Catalog's APIs so external tools and systems can directly read and write data governed by Unity Catalog, rather than being locked into the Databricks ecosystem. If you work across multiple platforms (Power BI, Python, Fabric), this means your metadata, permissions, and data lineage can stay consistent across tools instead of creating separate governance layers for each one.
Unity Catalog was designed for the open lakehouse. Previously, data teams were stuck...
The article argues that healthcare organizations should consolidate clinical data in a lakehouse architecture rather than treating storage as the primary bottleneck. For analysts, this likely means your clinical workflows could shift toward unified data models and real-time query capabilities instead of managing fragmented data sources—potentially reducing ETL complexity and giving you faster access to integrated datasets for operational reporting.
The clinical data problem is not a storage problem. Most organizations already have...
Databricks has made three data governance features generally available in Unity Catalog: attribute-based access control (ABAC) for row-level filtering, column masking, governed tags, and data classification. For your daily work, this means you can now enforce fine-grained access policies at scale—so different users automatically see different rows and columns based on their attributes—without having to build custom filtering logic into every query or report.
Scale data protection with automated governance in Unity CatalogAs data estates grow...
An enterprise improved their PII redaction system to handle 5 million documents in 17 days instead of 100+, moving from local processing to an Azure-based scaled solution. For analysts working with sensitive data, this means compliance redaction tasks that previously blocked projects for months can now complete in weeks, letting you move protected datasets into Fabric or Databricks pipelines without lengthy delays or manual workarounds.
Introduction When the first version of our PII redaction system went live, we could process roughly 1,500 insurance documents per hour on a local development machine. Impressive enough for a proof of concept. But when your client drops this on you: "We have 5 million documents that need compliance redaction. How long w…
Polars is a Python dataframe library built as a faster, more memory-efficient alternative to pandas. If you're currently using pandas for data cleaning and transformation work, Polars could meaningfully speed up your workflows—especially when handling large datasets—though it would require learning a slightly different API and potentially rewriting existing scripts.
Introduction It’s been a while since I’ve posted anything on the blog. One of the primary reasons for the hiatus is that I have been using python and pandas but not to do anything very new or different. In order to shake things up and hopefully get back into the blog a bit, I’m going to write about polars . This articl…
This article outlines practical trade-off decisions you face once an ML model moves to production—choices that aren't covered in standard training but directly impact model performance, cost, and reliability. For your daily work, it likely covers decisions like accuracy vs. latency, retraining frequency vs. drift monitoring, or infrastructure costs vs. prediction speed that you'll encounter when moving from notebooks to production dashboards and reporting in Power BI or Fabric.
The production trade-offs that only appear once your model is live. The post Six Choices Every AI Engineer Has to Make (and Nobody Teaches) appeared first on Towards Data Science .
Pandas remains a practical choice for most real-world data work—it handles the vast majority of use cases efficiently despite newer alternatives like Polars gaining attention. If you're already using Pandas in your Python workflows, there's no urgent need to switch unless you're consistently hitting performance walls with multi-billion-row datasets.
Billions of rows might be the exception, but for everything else, Pandas is still a highly reliable tool. The post Pandas Isn’t Going Anywhere: Why It’s Still My Go-To for Data Wrangling appeared first on Towards Data Science .
This is a practical guide on how to transform raw data into risk categories for credit scoring models. If you build credit risk or classification models, it covers the data preparation and feature engineering steps you'd actually use—bucketing continuous variables, handling missing values, and encoding categorical features in ways that improve model performance and interpretability.
A practical guide to categorization in credit scoring The post From Raw Data to Risk Classes appeared first on Towards Data Science .
The author refactored a large codebase using CodeSpeak, an AI-native development approach, to see how automation could handle code generation and maintenance at scale. For analysts building production pipelines in Python or managing complex repositories, this shows a practical path for offloading boilerplate and repetitive code tasks to AI—potentially speeding up feature development and reducing manual upkeep, though you'd want to evaluate the quality guardrails and testing overhead on your own projects.
What happened when I migrated a 10K+ line project into an AI-native workflow The post I Let CodeSpeak Take Over My Repository appeared first on Towards Data Science .
This is a beginner's tutorial on exploratory data analysis (EDA) techniques using Pandas, Matplotlib, and Seaborn, demonstrated on the Titanic dataset. If you work primarily in Power BI or Fabric, this won't change your workflow, but if you use Python for ad-hoc analysis or data profiling before loading into your main tools, it's a useful refresher on structuring EDA code and visualization patterns.
A beginner's tutorial on exploratory data analysis using Pandas, Matplolib, and Seaborn The post Exploring Patterns of Survival from the Titanic Dataset appeared first on Towards Data Science .
Prompt compression reduces the number of tokens sent to LLMs in agentic workflows, lowering API costs when agents repeatedly call external services. If you're building AI-assisted analytics tools or automating analysis tasks with agents, implementing this technique could meaningfully cut your LLM spend without sacrificing output quality.
Agentic loops in production can be synonymous with high costs, especially when it comes to both LLM and external application usage via APIs, where billing is often closely related to token usage.
Pandas' groupby operation has a subtle behavior that can lead to errors in your code, similar to copy-paste mistakes in Excel but harder to spot. Understanding this issue matters because groupby is a core operation in data transformation workflows, and catching these mistakes early prevents silent data quality problems that could propagate into your analyses or reports.
Introduction One of the reasons I like using pandas instead of Excel for data analysis is that it is easier to avoid certain types of copy-paste Excel errors. As great as pandas is, there is still plenty of opportunity to make errors with pandas code. This article discusses a subtle issue with pandas groupby code that …
When you generate multiple Excel reports from Jupyter Notebooks, tracking which notebook created which file becomes messy at scale—this article explains how to embed metadata directly into Excel document properties to solve that problem. For analysts managing dozens of ad-hoc reports, this saves time auditing your work and helps you quickly locate the source code behind any output your stakeholders are viewing.
Introduction When doing analysis with Jupyter Notebooks, you will frequently find yourself generating ad-hoc Excel reports to distribute to your end-users. After time, you might end up with dozens (or hundreds) of notebooks and it can be challenging to remember which notebook generated which Excel report. I have starte…
Text cleaning in pandas can be a performance bottleneck at scale, and this post covers multiple techniques to handle it efficiently. For your workflow, this matters because poorly optimized text cleaning (regex operations, string replacements, case conversions) can slow down your data prep pipeline significantly—especially if you're working with large datasets before moving them into Power BI or Fabric for analysis.
Introduction It’s no secret that data cleaning is a large portion of the data analysis process. When using pandas, there are multiple techniques for cleaning text fields to prepare for further analysis. As data sets grow large, it is important to find efficient methods that perform in a reasonable time and are maintain…
This case study demonstrates how to use Pandas to programmatically generate Excel files and send them via Outlook, automating what's typically done manually. For data analysts, this is relevant if you're regularly pushing reports or data exports to stakeholders—a Python script handling file creation and email distribution could eliminate repetitive manual steps in your reporting workflow.
Introduction I enjoy hearing from readers that have used concepts from this blog to solve their own problems. It always amazes me when I see examples where only a few lines of python code can solve a real business problem and save organizations a lot of time and money. I am also impressed when people figure out how to …
marimo pair is an AI coding agent that helps with data science tasks like data wrangling and exploratory research, essentially acting as a pair programmer within your data science workflow. For analysts working in Python, this could speed up repetitive data cleaning and transformation work, though the excerpt doesn't detail whether it integrates with Power BI or Fabric environments or how it compares to existing tools you might already use.
How do you add agent skills to your data science workflow? How can a coding agent assist with data wrangling and research? This week on the show, Trevor Manz from marimo joins us to discuss marimo pair. [ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple …
Altair is a Python visualization library that lets you build charts with a declarative syntax—you specify what data and visual encoding you want, rather than manually configuring axes, scales, and figure objects like you would in Matplotlib. If you spend time wrestling with boilerplate plotting code before actually exploring your data, Altair could let you go from dataset to chart faster, though it's worth testing whether its abstraction level and interactivity features fit your existing Power BI or notebook workflows.
There’s a moment many data analysts know well: you have a new dataset and a clear question, and you open a notebook only to find yourself writing boilerplate axis and figure setup before you’ve even looked at the data. Matplotlib gives you fine-grained control, but that control comes with a cost. Altair takes a complet…
This podcast episode covers automating exploratory data analysis (EDA) in Python and Python comprehensions, addressing practical techniques for quickly understanding new datasets and sharing findings with teammates. For analysts working with Python, this could streamline your initial data investigation phase—potentially reducing manual steps when you're profiling new sources before visualization or modeling work in Power BI or Fabric.
How do you quickly get an understanding of what's inside a new set of data? How can you share an exploratory data analysis with your team? Christopher Trudeau is back on the show this week with another batch of PyCoder's Weekly articles and projects. [ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Py…
This is a podcast episode about Real Python's internal editorial process—how they decide which topics to cover, review submissions, and ensure quality in their tutorials. It's not directly relevant to your daily analytics work unless you're actively writing or publishing documentation; it's more useful if you want to improve your own technical writing or understand what goes into creating the learning resources you use.
What goes into creating the tutorials you read at Real Python? What are the steps in the editorial process, and who are the people behind the scenes? This week on the show, Real Python team members Martin Breuss, Brenda Weleschuk, and Philipp Acsany join us to discuss topic curation, review stages, and quality assuranc…
Simon Willison built a QR code generator tool with Claude's help that can encode text, URLs, and WiFi credentials into scannable QR codes. For data analysts, this is a lightweight utility that could streamline sharing reports, dashboards, or documentation links—you could generate a QR code to embed in presentations or documents rather than pasting long URLs, though it's more of a nice-to-have productivity tool than something that changes core analytical work.
Tool: QR code generator Claude helped me build this tool for creating QR codes, for both text/URLs and for connecting to WiFi networks. Tags: vibe-coding , tools , generative-ai , ai , llms
Anthropic has developed Natural Language Autoencoders (NLAs) that convert Claude's internal numerical representations into human-readable text, making the model's reasoning process interpretable. For data analysts, this could improve debugging and validation of AI-assisted analysis by letting you see *how* Claude arrived at conclusions rather than just accepting outputs, though the practical integration into your Power BI or Python workflows isn't addressed in this excerpt.
AI models like Claude talk in words but think in numbers. These numbers, called activations, encode Claude’s thoughts, but not in a language we can read. We are introducing Natural Language Autoencoders, or NLAs, which translate AI models’ activations into readable text. NLAs have already helped us improve how we test …
Project Glasswing is a consortium of major tech companies (AWS, Microsoft, Google, Apple, etc.) partnering to improve security in widely-used open-source software. For data analysts, this matters because it could reduce vulnerabilities in the Python libraries, databases, and cloud tools you rely on daily—potentially meaning fewer security patches disrupting your workflows and less risk that your data pipelines get compromised.
Project Glasswing is a new initiative that brings together Amazon Web Services, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks in an effort to secure the world’s most critical software. We formed Project Glasswing because of capabil…
Anthropic's research found that Claude (and similar AI models) develop internal representations of emotions from training data, which then influence how the model responds to users—essentially acting as a form of personality rather than true emotion. For data analysts using Claude or similar AI assistants in their workflows, this means the model's behavior is somewhat predictable and role-consistent, but also that you should be aware the assistant isn't reasoning from ground truth about emotions; it's pattern-matching from text, which could affect reliability in sensitive analytical contexts where emotional framing matters.
AI models sometimes act like they have emotions—why? We studied one of our recent models and found that it draws on emotion concepts learned from text to inhabit its role as Claude, the AI assistant. These representations influence its behavior the way emotions might influence a human. And that has real consequences, a…
Anthropic released Claude Opus 4.6, an updated AI model that can reason through complex tasks more carefully, maintain focus over longer conversations, and work more independently with fewer clarifications needed. For data analysts, this could mean faster iteration on exploratory analysis, less time refining prompts to get usable SQL or Python code, and better handling of multi-step workflows like data validation or report generation—though you'd want to test it against your current tools (Power BI, Databricks, Python) to see if the autonomy improvement actually reduces your back-and-forth in practice.
Our smartest model got an upgrade. Claude Opus 4.6 plans more carefully, stays on task longer, and works more autonomously, so you can do more with less back-and-forth. Read more: https://anthropic.com/news/claude-opus-4-6
Claude was used by NASA's Jet Propulsion Laboratory to plan the Perseverance rover's route on Mars, marking the first AI-planned drive on another planet. For data analysts, this demonstrates Claude's capability to work with complex spatial data and generate executable plans from unstructured requirements—potentially relevant if you're exploring AI assistants for route optimization, logistics planning, or converting natural language specifications into actionable analytical workflows.
On December 8, the Perseverance rover safely trundled across the surface of Mars. This was the first AI-planned drive on another planet. And it was planned by Claude. Engineers at NASA Jet Propulsion Laboratory used Claude to plot out the route for Perseverance to navigate an approximately four-hundred-meter path on th…
This is a foundational overview of how large language models work technically—what they are, their trajectory, and security implications—rather than a product announcement or feature release. For a working analyst, it's useful background if you're evaluating whether to integrate LLM tools into your pipeline (like using them for code generation, documentation, or exploratory analysis) or if your organization is considering deploying them, but it won't directly change how you currently use Power BI, Fabric, Databricks, or Python.
This is a 1 hour general-audience introduction to Large Language Models: the core technical component behind systems like ChatGPT, Claude, and Bard. What they are, where they are headed, comparisons and analogies to present-day operating systems, and some of the security-related challenges of this new computing paradig…
OpenAI and Dell are making Codex (an AI coding tool) available for companies to run on their own servers and hybrid setups instead of only in the cloud. For data analysts, this means your organization could potentially use AI-assisted code generation for Python, SQL, and Power BI scripts without sending your data or queries to OpenAI's servers—a big deal if you work with sensitive data or have strict compliance requirements.
OpenAI and Dell partner to bring Codex to hybrid and on-premise environments, helping enterprises deploy AI coding agents securely across data and workflows.
NVIDIA Cosmos Predict 2.5 now supports parameter-efficient fine-tuning via LoRA/DoRA, letting you customize a video generation model without retraining the entire network. For data analysts, this is mostly relevant if you're building computer vision pipelines or working on multimodal ML projects in Databricks/Python environments—otherwise it's specialized enough that it won't affect your typical Power BI or analytics workflows.
PaddleOCR 3.5 adds a Transformers-based backend option alongside its existing CNN models, giving you more accurate text extraction from images and documents—especially for handwriting and complex layouts. If you're building data pipelines that ingest scanned documents, receipts, or form data, this update means you can now choose a more powerful model without rewriting your OCR integration, potentially reducing the manual data entry or post-processing steps that currently slow down your workflows.
I don't have access to the article excerpt you mentioned—it looks like the text got cut off after "Article excerpt:".
Could you paste the actual content from the Hugging Face blog post? Once I can see what the Open Agent Leaderboard covers, I'll give you a concise summary of what's new and whether it affects your workflow.
The NHS closed access to their open source code repositories after security vulnerabilities were found, and the UK's Government Digital Service has now commented on this decision. This matters for your workflow if you use NHS-published open source tools or libraries for data work—you may lose access to code you depend on, and it signals potential instability in government-backed data tooling you might rely on.
GDS weighs in on the NHS's decision to retreat from Open Source Terence Eden continues his coverage of the NHS' poorly considered decision to close down access to their open source repositories in response to vulnerabilities reported to them as part of Project Glasswing . Now the Government Digital Service have joined …
OpenAI partnered with Malta to subsidize ChatGPT Plus access and provide AI training to the country's citizens. For data analysts, this is primarily a policy/regional news item with minimal immediate workflow impact—it doesn't introduce new tools, APIs, or features that change how you work in Power BI, Fabric, Databricks, or Python, though broader AI literacy in your organization could eventually affect how stakeholders request analyses.
OpenAI and Malta partner to expand AI access, offering ChatGPT Plus and training to help citizens build practical AI skills and use AI responsibly.
A new plugin adds spending controls to Datasette's LLM integration, letting you set daily or per-user cost limits on AI model queries. If you're using Datasette to explore data with AI assistance, this prevents unexpected bills from runaway queries and gives you budget guardrails—useful if you're piloting LLM features in your organization but need cost accountability.
Release: datasette-llm-limits 0.1a0 This plugin works in conjunction with datasette-llm and datasette-llm-accountant to let you configure a per-user (or global) spending limit for LLM usage inside of Datasette. Configuration looks something like this: plugins : datasette-llm-limits : limits : per-user-daily : scope : a…
Databricks has integrated OpenAI's GPT-5.5 model into its enterprise agent workflows, leveraging improved performance on document Q&A tasks. For data analysts, this means potentially faster, more accurate natural language queries against your data and automated report generation—though the practical impact depends on how your organization chooses to implement agents on top of your existing Databricks infrastructure.
Databricks uses GPT-5.5 for enterprise agent workflows after the model set a new state of the art on the OfficeQA Pro benchmark.
DataRobot's certification track for analysts using the platform's AutoML, time-series, and predictive workflows. Validates day-to-day analyst skills like data prep, model selection, and explainability without writing ML code from scratch.
Builds and configures custom copilots and AI agents in Microsoft Copilot Studio. Directly relevant for Power BI and Fabric analysts: the same copilot patterns appear in Power BI's natural-language Q&A and in Copilot in Fabric, so the cert hardens the AI-prompting and grounding skills you already use day-to-day.
Anthropic Courses — Prompt Engineering & AI Fundamentals
Anthropic's official self-paced courses covering prompt engineering, AI fluency for everyday work, and building practical apps with Claude. Not a formal certification, but the closest thing to an authoritative learning path for analysts moving into AI-assisted workflows.
Agentic AI moved from demo to production this month — Databricks shipped Genie as a real data agent, Power BI opened a Data Analysis Expressions (DAX) Representational State Transfer Application Programming Interface (REST API) for programmatic access, and Anthropic raised Claude Code limits while patching a sandbox escape. The takeaway: analysts now need to think about semantic layers, governance, and code-agent security as core skills, not side projects.
Per-section highlights
⚡ Power BI & Fabric
Execute DAX Queries REST API is in preview — you can now hit semantic models programmatically from Python, Fabric pipelines, or any external app without building a report or embedding the model.
Composite semantic models mixing Direct Lake and import tables hit public preview, so you can keep large fact tables in Direct Lake for speed and import smaller dimensions for transformation flexibility — no more all-or-nothing storage mode decisions.
DAX User-Defined Functions (UDFs) now support typed parameters (MEASUREREF, COLUMNREF, TABLEREF, CALENDARREF), making reusable DAX libraries across models actually safe to share.
Modern visual tooltips went generally available across Desktop, web, mobile, Teams, and embedded — one consistent hover experience everywhere reports are consumed.
A centralized semantic model settings pane (preview) and the April feature summary land more Copilot mobile features — worth a scan before your next model tune-up.
◈ Databricks
Genie, Databricks' data agent for natural-language questions over your lakehouse, got a major push as a production-ready answer to ad-hoc SQL requests — but Advancing Analytics' counterpoint argues the bottleneck is your semantic layer, not the chat UI.
Catalog Commits is generally available, aligning Delta with open catalogs so external engines can read and write Unity Catalog (UC) managed tables consistently.
External engines can now run Data Manipulation Language (DML) — insert/update/delete — directly against UC managed Delta tables, removing a big friction point for non-Databricks pipelines writing into governed storage.
New 5XL Structured Query Language (SQL) warehouse size for workloads where 4XL hits its limit on heavy Extract-Transform-Load (ETL) and complex analytics.
Semantic caching pattern using Databricks Lakebase plus pgvector — practical guide to cutting Large Language Model (LLM) costs when users ask the same question fifteen different ways.
⚗ ML & AI Tools
Towards Data Science makes the case that batch-vs-stream is the wrong framing — the real question is "when does the answer need to be true?" — useful gut-check before architecting your next Fabric or Databricks pipeline.
A practitioner argues LLM meeting summarizers skip the identification step (what does the data actually support?), the same way bad regressions skip exploratory data analysis — treat AI summaries as drafts, not truth.
Prompt compression techniques for agentic loops — direct cost relief if you're running multi-step agents that burn tokens on context every iteration.
Refresher pieces on PySpark fundamentals and modern Python type annotations — worth bookmarking if you're onboarding teammates or hardening notebook code into production scripts.
🤖 Agentic AI & LLMs
Anthropic raised Claude Code usage limits (credited to a new SpaceX/Colossus compute deal) — fewer rate-limit interruptions when iterating on Python, SQL, or DAX.
Claude Code CVE-2026-39861: sandbox escape via symlink. Patch now if you're running Claude Code against sensitive local data or production credentials.
OpenAI launched DeployCo, a dedicated enterprise-deployment arm, plus published its internal Codex security playbook (sandboxing, approvals, network policy, agent telemetry) — a reasonable checklist to bring to your own platform team.
Anthropic published Natural Language Autoencoders (NLAs), which translate Claude's internal activations into readable text — early but real progress on auditing why a model answered the way it did.
Shopify's internal coding agent "River" works entirely in public Slack channels — a governance pattern worth stealing if your team is rolling out coding agents.
Cross-cutting themes
The semantic layer is the new battleground
Three stories pointed at the same thing this month. Databricks shipped Genie as a frontier data agent, but Advancing Analytics' "Genie is a Semantic Layer Problem, Not a Chat Problem" cut straight to the point: natural-language querying only works when the underlying model is clean. Meanwhile Power BI opened the DAX REST API, made composite Direct Lake plus import models a real option, and added typed parameters to DAX UDFs — all investments in making semantic models more programmable and reusable. The pattern: whoever owns the well-modeled, well-governed semantic layer owns the AI experience on top of it, and that's increasingly the analyst's job.
Agentic AI is now a production concern, not a demo
Genie went from preview to a real Databricks data agent, Agent Bricks showed up in a healthcare pilot-to-production series, supply chain workloads moved "from dashboards to agents" in a BrickTalk, and Copilot Studio added real-time voice agents. On the developer side, Claude Code raised limits, OpenAI published how it runs Codex safely, and the Claude Code CVE reminded everyone that code-running agents need real sandboxing. The shift for analysts: you're no longer evaluating whether to use agents — you're being asked how to govern them, cost-control them (see semantic caching, prompt compression), and validate their outputs.
Open catalogs and external engines blur the platform lines
Catalog Commits going generally available and external engines getting full DML on Unity Catalog managed tables means your Python jobs, Fabric pipelines, and non-Databricks tooling can now write directly into governed Delta tables. Combined with Power BI's DAX REST API and composite Direct Lake models, the practical effect is that the wall between "Databricks work" and "Fabric work" keeps thinning. Analysts who can stitch pipelines across both — and keep governance intact through UC — are increasingly the ones in demand.
The single most useful framing this month for anyone being asked to "roll out Genie" or any natural-language BI tool. Read this before your next stakeholder meeting on AI-driven analytics.
This unlocks a real architectural choice — keep big facts in Direct Lake, import smaller dimensions — that can cut refresh times and complexity. Worth the full read so you understand the tradeoffs before mixing modes.
A genuinely new integration surface. If you script in Python or orchestrate in Fabric, this is the first credible way to treat your semantic model as a calculation API for external apps.
Concrete cost-control pattern with code. If you've shipped any LLM-powered feature, this approach to deduplicating semantically-similar requests will pay for itself fast.
Pairs well with the semantic caching piece. Multi-step agents balloon token spend through repeated context — this gives you a practical lever to pull.
What to watch next month
Databricks Data + AI Summit (June 15–18, San Francisco)Expect Genie, Agent Bricks, Catalog Commits, and 5XL warehouses to get their full reveal. If you're planning Q3 architecture, this sets the roadmap.
Microsoft Build 2026 (June 2–3, San Francisco)Likely landing zone for DAX REST API general availability, composite Direct Lake model GA, and the next round of Copilot-in-Fabric features.
Snowflake Summit 26 (June 1–4, San Francisco)Worth watching for Snowflake's response to Catalog Commits and Genie — open catalog interoperability is now a competitive front.
Claude Code security posture after CVE-2026-39861If your team adopted Claude Code in the last quarter, expect security to ask hard questions. Get ahead of it: patch, document sandboxing, and review what files the agent can actually reach.
Power BI preview features moving to general availabilityDAX REST API, composite Direct Lake models, and the semantic model settings pane are all in preview. Track the monthly feature summaries — GA timing affects whether you can use them in production reports.
Signal
Cutting through the noise in modern analytics — AI-curated intelligence on Power BI, Databricks, ML/AI, and Agentic AI, built for practicing analysts.
How it works
AI-curated daily across Power BI, Databricks, ML/AI, and Agentic AI
AI summaries rewritten for analysts — what changed and why it matters to your day-to-day
Monthly executive briefs that pull the cross-cutting themes together
Tech stack
Claude AIFirebase HostingPythonYouTube Data APIGitHub Actions
About the creator
Built by Matt Valladarez. Signal exists because the analytics ecosystem moves faster than any one practitioner can track — this is the digest I wanted to read every morning, automated end-to-end so the curation never slips.
💡 Question of the Day
This video can't be embedded here. Watch it directly on YouTube instead.