May 2026 Brief 5-minute read on what mattered this month →
Power BI & Microsoft Fabric 15 posts
▶ Featured Videos
Power BI Update - April 2026

Power BI Update - March 2026

Power BI Update - February 2026

Power BI Update - January 2026

Microsoft Power BI

New Power Query experience in Power BI Desktop (Preview)

✨ What this means for analysts

Microsoft is previewing a redesigned Power Query interface in Power BI Desktop with updated visuals and workflows. If you spend significant time cleaning, transforming, or combining data before analysis, this refresh could streamline your editing experience and reduce clicks in your daily ETL steps—though you'll want to test it in the preview to see if it actually saves time versus your current process.

Authored by: Sara Lammini Rodriguez - Product Manager II, and Miguel Escobar - Senior Product Manager

Read full article →
Microsoft Power BI

Power BI May 2026 Feature Summary

✨ What this means for analysts

This is a May 2026 Power BI release covering Copilot/AI improvements, reporting and modeling enhancements, and new data connectors. If you regularly use Copilot for exploration or spend time formatting reports and connecting new data sources, you'll likely find faster workflows—but you'll want to check the full release notes to see which specific features affect your typical tasks.

Author: Katie Murray, Senior Program Manager - Power BI continues to evolve with updates that make it easier to explore data, generate insights, and build more polished reports. This month’s release brings improvements across Copilot and AI experiences, reporting and modeling enhancements, new data connectivity flows, …

Read full article →
Microsoft Power BI

Outbound Access Protection for semantic models (Preview)

✨ What this means for analysts

Outbound Access Protection now applies to semantic models, letting you restrict what external destinations your Power BI models can connect to—blocking everything by default and only allowing approved connections. For analysts, this means your organization can enforce stricter data security policies on semantic models, which could affect which data sources or APIs you can query from, potentially requiring your admin team to explicitly approve new connections you need for your work.

Author: Kay Unkroth, Principal Program Manager - Outbound Access Protection (OAP) is a workspace-level network security and governance feature that blocks outbound traffic from a workspace by default and lets you allow only the destinations you explicitly trust. With this preview, you can now extend OAP to semantic mode…

Read full article →
Copilot Studio

The in-depth guide to managing real-time voice agents at scale

✨ What this means for analysts

This is a governance framework for deploying and managing voice AI agents in production environments, covering security, compliance, and operational best practices across their full lifecycle. If you're building customer-facing analytics tools or automating data-driven conversations (like Q&A bots over your datasets), it's worth understanding the governance requirements and anti-patterns—especially around data access, audit trails, and handling sensitive information that voice agents might expose.

In this article: Why real-time voice agents require a different governance lens; Why real-time voice agents raise the stakes; A governance framework for the full agent lifecycle; Platform capabilities that support agent governance; Security, privacy, and compliance for customer-facing agents; Five anti-patterns that derail p…

Read full article →
Microsoft Power BI

Power BI April 2026 Feature Summary

✨ What this means for analysts

April 2026 brings layout flexibility improvements, expanded mobile Copilot features, and several preview capabilities across Power BI's reporting and modeling tools. For your daily work, better layout controls mean faster report tweaking, while stronger mobile Copilot support lets you iterate and troubleshoot on-the-go instead of being tied to desktop—though you'll want to check the preview docs to see which features are production-ready for your environment.

Welcome to the April Power BI update! Power BI’s April 2026 update is here, bringing continued improvements across Copilot and AI, reporting, visuals, and modeling. This release includes more flexibility when working with layouts and visuals, expanded Copilot experiences—especially on mobile—and several preview feature…

Read full article →
Copilot Studio

New and improved: Agent governance, intelligent workflows, and connected app experiences

✨ What this means for analysts

Copilot Studio added governance controls for AI agents, improved workflow automation, and the ability to embed business applications directly into agents. For data analysts, this means you can now build self-service agent interfaces that connect to your BI tools and data platforms with better oversight, potentially reducing ad-hoc data requests and letting non-technical stakeholders access insights more independently.

In this article: Build and scale agents with better visibility and control; Expand workflows into intelligent, governed automation systems; Bring business apps directly into your agents; What else is new and improved in Copilot Studio; Stay up to date on all things Copilot Studio. As organizations scale their use of AI agent…

Read full article →
PowerBI.tips
Podcast, Power BI

Stop Using Bookmarks - Ep.526 - Power BI tips

✨ What this means for analysts

The episode argues that Power BI bookmarks—a feature for saving and switching between report states—should be reconsidered in favor of newer alternatives that likely offer better performance or user experience. If you're currently using bookmarks for navigation or report state management, this suggests evaluating whether Microsoft has released better tools to replace this functionality in your reports.

In Episode 526 of Explicit Measures, Mike Carlo and Tommy Puglia unpack the latest Power BI and Microsoft Fabric topics from the show. You’ll get a quick read on the episode’s biggest ideas, why they matter, and where to dig deeper in the full conversation.

Read full article →
Microsoft Power BI

Execute DAX Queries REST API (Preview)

✨ What this means for analysts

Microsoft has released a preview REST API that lets you execute DAX queries programmatically against Power BI datasets without opening the desktop application. This matters because you can now automate data pulls, integrate Power BI queries into Python scripts or other workflows, and build custom solutions that query your semantic models on demand—eliminating manual exports and reducing friction between Power BI and your other analytical tools.

Author: Kay Unkroth - Principal Program Manager
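
As a rough illustration of what "querying a semantic model from a Python script" could look like, here is a minimal sketch that only builds the HTTP request and does not send it. It assumes the URL and body shape of the existing `executeQueries` endpoint in the Power BI REST API; the preview API described in the article may use a different path or payload, and the dataset ID, query, and token below are hypothetical placeholders.

```python
import json
import urllib.request

# Assumption: URL shape of the existing executeQueries endpoint; the preview
# API described in the article may use a different path or body.
API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def build_dax_request(dataset_id: str, dax_query: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries one DAX query."""
    body = {
        "queries": [{"query": dax_query}],
        "serializerSettings": {"includeNulls": True},
    }
    return urllib.request.Request(
        url=f"{API_ROOT}/datasets/{dataset_id}/executeQueries",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical dataset ID, query, and token, for illustration only.
request = build_dax_request(
    dataset_id="00000000-0000-0000-0000-000000000000",
    dax_query="EVALUATE TOPN(10, 'Sales')",
    access_token="<token>",
)
```

Passing `request` to `urllib.request.urlopen` with a valid Microsoft Entra token would send it; keeping the build step separate makes the payload easy to inspect before anything touches a production model.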

Read full article →
PowerBI.tips
Podcast, Power BI

Less Guessing? More Building! - Ep.525 - Power BI tips

✨ What this means for analysts

This episode covers recent Power BI and Microsoft Fabric topics discussed by the hosts. The excerpt doesn't say which features are covered, so check the full conversation to see whether any changes affect your current dashboards, data modeling, or refresh schedules.

In Episode 525 of Explicit Measures, Mike Carlo and Tommy Puglia unpack the latest Power BI and Microsoft Fabric topics from the show. You’ll get a quick read on the episode’s biggest ideas, why they matter, and where to dig deeper in the full conversation.

Read full article →
SQLBI
DAX

Filtering measures through slicers

✨ What this means for analysts

Slicers can't directly filter measures because measures are calculations, not data values—but the article explains workarounds to achieve the filtering effect you need. This matters because you'll likely hit this limitation when building dashboards, and knowing the proper techniques (like using measure branches or filter context) will save you time troubleshooting why your slicer isn't working as expected.

A slicer cannot filter a measure: let’s analyze this common request by explaining how to use a slicer to filter a measure, after discussing the real meaning of using a measure with a slicer.

Read full article →
SQLBI
DAX, Power BI

Filtering measures through slicers

✨ What this means for analysts

Slicers in Power BI fundamentally filter table columns, not measures—a distinction that trips up newcomers trying to dynamically filter aggregations. Understanding this difference matters for your reports because it clarifies when you need to restructure your data model or use alternative techniques like measure branching instead of expecting a slicer to do something it's architecturally designed not to do.

A slicer cannot filter a measure. In this article, we analyze this common request by explaining how to use a slicer to filter a measure, after discussing the real meaning of using a measure with a slicer. A very common request by Power BI newbies is, “How can I use a slicer to filter a measure rather than a regular mod…

Read full article →
PowerBI.tips
Podcast, Power BI

Fabric's Most Underrated Features - Ep.524 - Power BI tips

✨ What this means for analysts

This episode highlights Power BI and Fabric features that exist but aren't widely used, covering practical capabilities that could streamline your reporting work. If you're building dashboards and doing standard analysis, there's likely at least one overlooked feature here that could save you time on data modeling, visualization, or refresh optimization—worth a 20-minute listen to see what you're missing in your daily toolkit.

In Episode 524 of Explicit Measures, Mike Carlo and Tommy Puglia unpack the latest Power BI and Microsoft Fabric topics from the show. You’ll get a quick read on the episode’s biggest ideas, why they matter, and where to dig deeper in the full conversation.

Read full article →
PowerBI.tips
Podcast, Power BI

Databricks, Fabric, Development and Vibe - Ep.523 - Power BI tips

✨ What this means for analysts

This episode recaps recent developments across Power BI and Microsoft Fabric from the hosts' conversation. Since the excerpt doesn't specify which features or changes were discussed, you'd need to listen to the full episode to understand what's actually new and how it might affect your reporting workflows or data pipeline work.

In Episode 523 of Explicit Measures, Mike Carlo and Tommy Puglia unpack the latest Power BI and Microsoft Fabric topics from the show. You’ll get a quick read on the episode’s biggest ideas, why they matter, and where to dig deeper in the full conversation.

Read full article →
Copilot Studio

Extend AI voice support: Introducing real-time voice agents in Microsoft Copilot Studio

✨ What this means for analysts

Microsoft Copilot Studio now supports real-time voice agents, letting you build AI assistants that can handle customer support conversations through voice instead of just text. For data analysts, this means you'll likely need to design new data pipelines and monitoring dashboards to track voice interaction metrics, compliance logs, and agent performance data that weren't part of text-only workflows.

Customers expect support that resolves issues quickly, delivers consistent answers, and works seamlessly across channels. For organizations, this creates a familiar tension: how do you deliver high‑quality service at scale without losing control over cost, compliance, or experience? That’s why we’re excited to announce…

Read full article →
PowerBI.tips
Podcast, Power BI

Improving Your AI Skills for Fabric - Ep.522 - Power BI tips

✨ What this means for analysts

This episode covers strategies for developing AI capabilities within Microsoft Fabric to help analysts work more effectively with modern data tools. Since Fabric increasingly integrates AI features into everyday tasks like data preparation and analysis, building these skills now means you'll be able to automate repetitive work and leverage AI-assisted features rather than learning them reactively when they become mandatory in your workflow.

In Episode 522 of Explicit Measures, Mike Carlo and Tommy Puglia unpack the latest Power BI and Microsoft Fabric topics from the show. You’ll get a quick read on the episode’s biggest ideas, why they matter, and where to dig deeper in the full conversation.

Read full article →
Databricks 15 posts
▶ Featured Videos
What is Databricks SQL?

SQL Warehouses in Notebooks

Databricks | Notebook Development Overview

Tutorial - Notebook Basics | Databricks Academy

Databricks
Industries, Retail & Consumer Goods

How Databricks Genie improves retail personalization

✨ What this means for analysts

Databricks Genie is a tool that helps automate the creation of personalized customer experiences by connecting retail data to AI-driven recommendations without requiring manual query writing. For analysts, this means faster iteration on segmentation and targeting logic—you can ask questions about customer behavior in plain language and get results without building SQL from scratch, though you'd still own validation and refinement of the outputs.

Use case: Customer Intelligence & Loyalty Optimization. Retail personalization has come...

Read full article →
Databricks Community

The Four-Minute Investigation: How AI/BI Genie Agent mode closes the gap between "what" and "why"

✨ What this means for analysts

Databricks has added an "Agent mode" to its AI/BI Genie tool that can handle more complex analytical questions than the standard chat interface, moving beyond simple queries to investigative "why" questions. For analysts, this means you could potentially spend less time manually digging through dashboards and writing queries to explain unexpected patterns—Genie Agent can do some of that reasoning work for you, though you'd still need to validate its conclusions.

See how AI/BI Genie Agent can answer much more complex questions than standard chat. Dashboards, Genie Chat, and Genie Agent all come together in AI/BI to form a comprehensive analytics suite.

Read full article →
Advancing Analytics
Data, AI, Analytics, Generative AI

Genie Is a Semantic Layer Problem, Not a Chat Problem

✨ What this means for analysts

Genie (Databricks AI/BI's natural-language query tool) is getting attention for enterprise rollouts, but the article argues its real bottleneck isn't the chat interface but the semantic layer underneath, which needs to be properly designed first. For your workflow, this means that before pushing Genie to your business users, you'll need to invest time ensuring the underlying data and definitions are clean, well-documented, and aligned with how the organization actually asks questions, or you'll end up supporting countless confused queries and inconsistent results.

I've been setting up Genie spaces for clients, and the conversations always start in the same place. Someone from the business sees the demo and asks how quickly we can roll it out across the organisation.

Read full article →
Databricks
Product

Automate Data & KPI Monitoring with SQL Alerts

✨ What this means for analysts

Databricks now offers SQL-based alerting to monitor data quality and KPI thresholds automatically instead of requiring manual checks. For your daily work, this means you can set up notifications when metrics drift or data freshness issues occur—reducing time spent on repetitive manual monitoring and letting you focus on analysis instead of babysitting dashboards.

In many organizations, data monitoring is still a manual, repetitive routine: open...

Read full article →
Databricks Community

Agent Bricks | A Pilot to Production Series - Manufacturing

✨ What this means for analysts

Databricks released Agent Bricks, a framework for building specialized AI agents that automate manufacturing workflows like planning, forecasting, and quality control—cutting planning cycles from days to minutes. If you're building analytics pipelines or forecasting models for manufacturing clients, this means you'll likely be asked to integrate agentic AI into your existing Databricks workflows rather than just serving static dashboards or reports.

Manufacturing moves too fast for generic AI. In this episode, see how specialized AI agents built on Agent Bricks help manufacturers improve planning, forecasting, logistics, and quality at scale. The impact: Planning & inventory: Manufacturers cut planning time from days to minutes and use real-time data from 18,000+ …

Read full article →
Advancing Analytics
Generative AI, Insurance

Insurance Distribution Just Changed. Most Carriers Weren't in the Room.

✨ What this means for analysts

Consumers are already using AI chatbots like ChatGPT to research and compare insurance products in real time, shifting purchasing decisions away from traditional distribution channels. For data analysts, this means your organization's customer acquisition models, competitive positioning dashboards, and channel attribution analysis likely need urgent updates—the traffic patterns and conversion funnels you've been tracking are probably shifting faster than your current refresh cycles can detect.

A consumer is asking ChatGPT which home insurance to buy, where to get the cheapest car insurance... right now. Not next year. Not when the technology matures. Now.

Read full article →
Advancing Analytics
AI, Databricks

Support your local Prompt Engineers with Prompt Registry

✨ What this means for analysts

A Prompt Registry is a centralized system for storing, versioning, and managing AI prompts—similar to how feature stores organize ML features. If you're building AI applications across your organization, this helps prevent duplicate work, track prompt changes, and maintain consistency instead of having prompts scattered across notebooks and Slack messages.

Introduction A few years back, feature stores became the standard way to bring order to machine learning features by centralising, governing and tracking them. Now we are facing the same challenge with prompts. They multiply quickly, get tweaked without context and become difficult to manage.

Read full article →
Databricks
Engineering, Financial Services

How to Build Real-Time Fraud Detection using Spark Real-Time Mode and Lakebase

✨ What this means for analysts

Databricks has documented how to use Spark's Real-Time Mode alongside Lakebase (its managed Postgres database) to detect fraudulent transactions as they happen, rather than in batch windows. If you're building fraud models in Databricks, this shows you a concrete pattern for moving detection from hourly or daily jobs to sub-second latency, which matters because it lets you block or flag transactions before they complete instead of catching fraud after the fact.

Card fraud operates in seconds. A stolen credit card number can fuel dozens of purchases...

Read full article →
Databricks
Platform, Product

Governing AI agents at scale with Unity Catalog

✨ What this means for analysts

Databricks added Unity Catalog support for governing AI agents, letting you apply consistent access controls, audit trails, and permission management across large numbers of agents the same way you would for data assets. If you're building or managing multiple AI agents in Databricks, this means you can now enforce who can use which agents and track their actions without setting up separate governance systems for each one.

A year ago, your organization had a dozen AI agents. Today, there are thousands. Every...

Read full article →
Databricks
Company, Customers

Databricks for Good and Virtue Foundation: Partnering to Connect Medical Volunteers to Critical Health Services in 72 Countries

✨ What this means for analysts

Databricks is partnering with Virtue Foundation, a nonprofit connecting medical volunteers to health services globally, likely providing data infrastructure or analytics support for their operations. For most practicing analysts, this is a partnership announcement rather than a product update—it doesn't change your Power BI, Fabric, or Databricks tooling, but it's worth noting if you work in healthcare or nonprofits where similar use cases might inform your own data strategy.

Introduction: Virtue Foundation is a nonprofit focused on global health delivery and...

Read full article →
Databricks Community

Databricks Community Champion - May 2026 - Balaji J

✨ What this means for analysts

This article announces Balaji J as the Databricks Community Champion for May 2026, recognizing their contributions to helping other users in the community. For practicing analysts, it's worth knowing who the active experts are in your tools' communities—you might follow their answers to common questions or learn from how they troubleshoot problems.

Our Community Champion Program celebrates members who consistently contribute their expertise, support fellow practitioners, and help shape a stronger and more collaborative Databricks Community. Every month, we recognize individuals whose passion for learning and willingness to share knowledge create a meaningful impa…

Read full article →
Advancing Analytics
Data, Chief Data Officer, Engineering, Analytics

Built-On Databricks: Delivering Multi-Tenant Analytics

✨ What this means for analysts

The author built a working prototype with a customer showing how to deliver analytics to multiple tenants on Databricks, using the Databricks Partner Well Architected Framework (PWAF) and the Firefly Analytics example use case as references. If you're building or planning a multi-tenant analytics platform on Databricks, the prototype gives you a worked architectural pattern to start from rather than a blank page, which can cut design time and reduce the risk of architectural mistakes.

Recently, I've been working with a customer to flesh out what Built-On Databricks could look like for them. We used the Databricks Partner Well Architected Framework (PWAF) and the Firefly Analytics example use case as reference, and built a working prototype.

Read full article →
Databricks Community

Speed Up Data Warehouse Migration Validation

Read full article →
Databricks Community

Processing Unstructured Data in Volumes with Unity Catalog Open APIs

✨ What this means for analysts

Unity Catalog now offers APIs that let external tools (like Python scripts or third-party applications) access unstructured data files stored in Volumes using temporary, permission-based credentials instead of requiring you to manually set up cloud IAM roles. For your daily work, this means you can more easily integrate notebooks, ML pipelines, or external tools with your data lake without wrestling with access management—the system handles credential scoping automatically based on UC permissions you've already defined.

Learn how Unity Catalog Volumes and new credential-vending APIs let external tools securely access unstructured data with temporary, scoped credentials tied to UC permissions, eliminating manual IAM management. Govern tables, models, features, and unstructured data consistently across clouds and engines.

Read full article →
Advancing Analytics

From 100+ days to 17 days: Novel Enterprise PII Redaction at Scale with Azure

✨ What this means for analysts

An enterprise improved their PII redaction system to handle 5 million documents in 17 days instead of 100+, moving from local processing to an Azure-based scaled solution. For analysts working with sensitive data, this means compliance redaction tasks that previously blocked projects for months can now complete in weeks, letting you move protected datasets into Fabric or Databricks pipelines without lengthy delays or manual workarounds.

Introduction When the first version of our PII redaction system went live, we could process roughly 1,500 insurance documents per hour on a local development machine. Impressive enough for a proof of concept. But when your client drops this on you: "We have 5 million documents that need compliance redaction. How long w…
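
The "100+ days" figure follows directly from the numbers in the excerpt. The sketch below redoes that arithmetic and also derives the throughput a 17-day run would need; it assumes uninterrupted 24-hour processing, which the article does not state.

```python
DOCS_TOTAL = 5_000_000        # from the excerpt
DOCS_PER_HOUR_LOCAL = 1_500   # from the excerpt
TARGET_DAYS = 17              # from the headline
HOURS_PER_DAY = 24            # assumption: uninterrupted processing


def days_needed(total_docs, docs_per_hour, hours_per_day=HOURS_PER_DAY):
    """Days required to process total_docs at a fixed hourly rate."""
    return total_docs / (docs_per_hour * hours_per_day)


def required_docs_per_hour(total_docs, days, hours_per_day=HOURS_PER_DAY):
    """Hourly rate needed to finish total_docs within the given number of days."""
    return total_docs / (days * hours_per_day)


local_days = days_needed(DOCS_TOTAL, DOCS_PER_HOUR_LOCAL)
target_rate = required_docs_per_hour(DOCS_TOTAL, TARGET_DAYS)
print(f"{local_days:.1f} days locally; {target_rate:.0f} docs/hour needed for {TARGET_DAYS} days")
```

Under that assumption the local setup would need about 139 days, and hitting 17 days requires roughly 12,255 documents per hour, a bit more than eight times the original rate.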

Read full article →
ML & AI Tools for Data Analysts 15 posts
▶ Featured Videos
Xgboost Classification Indepth Maths Intuition- Machine Learning Algorithms🔥🔥🔥🔥

Kaggle's 30 Days Of ML (Day-14 Part-1): Intro to XGBoost

Why are Eigenvectors and Values Important

KDnuggets

Time-Series Feature Engineering with Python Itertools

✨ What this means for analysts

The article covers using Python's itertools module to generate time-series features (like lags, rolling statistics, or aggregations) more efficiently than typical pandas or manual loops. If you're building ML models on time-stamped data in Python, this could speed up your feature engineering and reduce memory overhead, which matters when you're working with large datasets that don't fit comfortably in memory or when you need to iterate quickly on feature experiments.

Learn how to use Python itertools to build efficient and scalable time series features.
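
To make the idea concrete, here is a minimal sketch of lag and rolling-mean features built with `itertools.tee` and `islice`. It is not taken from the article; the function names and the sample series are made up for illustration.

```python
from itertools import islice, tee


def sliding_windows(values, size):
    """Yield consecutive overlapping windows of length `size` as tuples."""
    iterators = tee(values, size)
    # Advance the i-th copy by i steps so zip lines them up as one window.
    shifted = [islice(it, i, None) for i, it in enumerate(iterators)]
    return zip(*shifted)


def lag_and_rolling_features(values, window=3):
    """Return one feature dict per time step that has a full window behind it."""
    rows = []
    for win in sliding_windows(values, window + 1):
        *history, current = win
        rows.append({
            "value": current,
            "lag_1": history[-1],                   # previous observation
            "rolling_mean": sum(history) / window,  # mean of the prior `window` values
        })
    return rows


daily_sales = [10, 12, 11, 15, 14, 18]  # made-up series
features = lag_and_rolling_features(daily_sales, window=3)
```

Each row uses only values that precede the current observation, so the features do not leak the target into its own inputs. The windows are produced lazily, which is the main appeal of itertools for long series.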

Read full article →
Towards Data Science
Agentic AI, Ai Agent, Data Science, Deep Dives

Optimizing AI Agent Planning with Operations Research and Data Science

✨ What this means for analysts

When you deploy AI agents for tasks like data processing or reporting, costs and resource use can spiral quickly unless you plan carefully—this article walks through using optimization techniques and data science methods to allocate agent skills and budgets efficiently. If you're building agent-based automation in Power BI, Fabric, or Python workflows, understanding how to frame these problems (which agents do what, who gets assigned where, what's your spending cap) directly affects whether your solution stays cost-effective or becomes a budget drain.

AI agents can quickly become expensive without a clear strategy for planning, skill coverage, and budgets. This article shows how to use operations research and data science to optimize AI agent cost and resource allocation. You will learn how to frame common agent problems—skill coverage, project assignment, and budge…
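
The skill-coverage problem mentioned above can be framed as picking the cheapest set of agents whose combined skills cover a requirement, subject to a budget. The sketch below uses plain brute-force enumeration as a stand-in for a real optimization solver; it is not the article's method, and the agents, costs, and skills are invented for illustration.

```python
from itertools import combinations

# Hypothetical agents: name -> (cost per run, skills covered).
AGENTS = {
    "extractor": (4, {"sql", "parsing"}),
    "reporter": (3, {"charts", "summaries"}),
    "generalist": (6, {"sql", "charts", "summaries"}),
    "parser": (2, {"parsing"}),
}


def cheapest_cover(agents, required_skills, budget):
    """Return (cost, names) of the cheapest agent set covering all skills within budget, or None."""
    best = None
    names = sorted(agents)
    for k in range(1, len(names) + 1):
        for team in combinations(names, k):
            cost = sum(agents[n][0] for n in team)
            covered = set().union(*(agents[n][1] for n in team))
            if cost <= budget and required_skills <= covered:
                if best is None or cost < best[0]:
                    best = (cost, team)
    return best


required = {"sql", "parsing", "charts", "summaries"}
plan = cheapest_cover(AGENTS, required, budget=10)
```

Enumeration is only practical for a handful of agents because the number of subsets doubles with each one added; beyond that, an integer-programming formulation is the usual next step.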

Read full article →
KDnuggets

Anonymizing Production Data for Data Science with Mimesis

✨ What this means for analysts

Mimesis is a Python library that generates fake but realistic data to replace sensitive information in production datasets. If you work with real customer or business data in development or testing, this tool lets you create safe anonymized copies for analysis and model training without exposing actual PII—useful when you need production-realistic data but can't use the real thing in non-production environments.

Learn how to utilize Python's Mimesis library for anonymizing sensitive production data, based on a step-by-step example to try yourself.

Read full article →
AWS Machine Learning
Amazon SageMaker AI, Announcements, Intermediate (200), Technical How-to

Announcing OpenAI-compatible API support for Amazon SageMaker AI endpoints

✨ What this means for analysts

SageMaker AI endpoints now accept OpenAI-compatible API calls, meaning you can swap in a SageMaker endpoint URL without rewriting code that uses OpenAI's SDK, LangChain, or similar tools. For analysts building LLM-powered features or notebooks, this eliminates friction around authentication and client libraries—you can test models on SageMaker with minimal code changes, making it easier to experiment with or migrate between cloud providers without disrupting your workflow.

Today, Amazon SageMaker AI introduces OpenAI-compatible API support for real-time inference endpoints. If you use the OpenAI SDK, LangChain, or Strands Agents, you can now invoke models on SageMaker AI by changing only your endpoint URL. You don’t need a custom client, a SigV4 wrapper, or code rewrites. Overview With t…

Read full article →
AWS Machine Learning
Amazon Bedrock, Announcements, Strands Agents

Multimodal evaluators: MLLM-as-a-judge for image-to-text tasks in Strands Evals

✨ What this means for analysts

AWS has released multimodal evaluators in Strands Evals that use AI models to judge whether image-based AI outputs are actually correct—for example, whether a generated caption accurately describes an image or whether extracted numbers from documents match reality. If you're building or validating models that work with images, documents, or charts, this gives you a programmatic way to automatically check quality instead of manually reviewing outputs or relying on text-only checks that can't see what the model is looking at.

If you’re building visual shopping, image or document understanding, or chart analysis, you need a way to verify whether your model’s response is actually grounded in the source image. A text-only evaluator cannot tell you whether a caption faithfully describes an image, whether an extracted invoice total matches the d…

Read full article →
AWS Machine Learning
Amazon SageMaker, Artificial Intelligence

Build real-time voice applications with Amazon SageMaker AI and vLLM

✨ What this means for analysts

AWS added streaming inference support to SageMaker for real-time voice applications, letting you send audio continuously and get transcriptions back on the same connection instead of waiting for batch responses. For most Power BI or Python analysts this won't affect your day-to-day work unless you're building voice-enabled dashboards or contact center analytics—but if you are, this removes a technical bottleneck that previously forced you to choose between latency and infrastructure complexity.

Voice agents, live captioning, contact center analytics, and accessibility tools all depend on real-time speech-to-text, where your application streams audio in and receives transcription back simultaneously over a single persistent connection. Traditional request-response inference falls short here because transcripti…

Read full article →
Towards Data Science
Programming, Artificial Intelligence, Deep Dives, Functional Programming

Introduction to Lean for Programmers

✨ What this means for analysts

This article introduces Lean, a programming language designed around mathematical proof and formal verification. For data analysts, this is mostly academic interest—Lean doesn't integrate with Power BI, Fabric, Databricks, or standard Python ML workflows, so it won't change your day-to-day work unless you're specifically interested in formally verifying complex statistical logic or mathematical correctness in your analyses.

The syntax and semantics of mathematics.

Read full article →
Towards Data Science
LLM Applications, Deep Dives, Entity Resolution, Knowledge Graph

Proxy-Pointer RAG: Solving Entity and Relationship Sprawl in Large Knowledge Graphs

✨ What this means for analysts

Proxy-Pointer RAG is a technique for cleaning up messy knowledge graphs by matching duplicate entities and relationships before feeding data into retrieval-augmented generation (RAG) systems. If you're building RAG pipelines on top of fragmented or poorly deduplicated reference data—common in enterprise environments—this approach could reduce hallucinations and improve answer quality without manual data cleanup.

A scalable semantic localization layer for entity and relationship reconciliation.

Read full article →
Analytics Vidhya
AI Agents, Artificial Intelligence, Beginner

Kimi WebBridge: Hands-on Guide to Kimi’s Browser Extension for AI Agents

✨ What this means for analysts

Kimi WebBridge is a browser extension that lets AI agents automate web interactions—opening pages, filling forms, clicking buttons, and extracting data—without leaving your browser. For data analysts, this could streamline repetitive tasks like pulling data from web portals or APIs, though it's worth testing whether it integrates smoothly with your existing Power BI or Python workflows before relying on it for production work.

AI agents are evolving from answering questions to taking actions inside browsers. They can now open pages, click buttons, fill forms, extract data, and automate multi-step workflows across websites. Moonshot AI’s Kimi WebBridge brings this capability to Chrome and Edge, allowing local AI agents to safely interact with…

Read full article →
AWS Machine Learning
Amazon Bedrock, Amazon Machine Learning, Amazon Nova, Artificial Intelligence

Scalable voice agent design with Amazon Nova Sonic: multi-agent, tools, and session segmentation

✨ What this means for analysts

Amazon Nova Sonic is a new voice AI model designed to handle real-time audio processing with lower latency and support for multi-agent coordination and tool integration. For data analysts, this mainly matters if your organization is building voice interfaces for data queries or dashboards—otherwise it's primarily relevant to ML engineers and backend teams building voice applications rather than someone working directly in Power BI, Fabric, or Python analytics.

Design patterns for scalable voice agents matter for organizations that need to deliver fast, natural, and reliable voice experiences. Many teams face challenges like high latency, managing real-time audio, and coordinating multiple agents in complex workflows. In this post, you’ll learn how to use Amazon Nova Sonic, …

Read full article →
AWS Machine Learning
Advanced (300), Amazon Bedrock Agents, Kiro, Technical How-to

Extending conversational memory in Kiro CLI using Amazon Bedrock AgentCore Memory

✨ What this means for analysts

This post shows how to extend Kiro CLI with Amazon Bedrock AgentCore Memory so that it retains conversational history and user preferences across sessions instead of resetting after each one. For data analysts, this means you can reference earlier questions, context about your data schema, or analytical approaches from previous days without re-explaining your setup, much as a persistent chatbot remembers your work style rather than treating each session as brand new.

Agentic IDEs that forget what you told them in previous sessions aren’t very helpful. You work on your large codebase with complex business requirements for days or weeks. However, your IDE only remembers you during your current session and can’t recall your conversational history, preferences derived from the conversa…

Read full article →
AWS Big Data
Advanced (300), Amazon Athena, Amazon EMR, Amazon Redshift

A systematic approach to benchmarking SQL processing engines on AWS

✨ What this means for analysts

AWS has published a benchmarking framework for comparing SQL processing engines on their platform, helping you evaluate which tool (Athena, Redshift, EMR, etc.) performs best for your specific workload patterns. This matters because choosing the wrong engine can mean slower queries and higher costs—this framework gives you concrete comparison criteria to test before committing to a solution for your analytics pipeline.

Selecting the right SQL processing solution for large-scale data analytics is a critical decision for organizations. As data volumes grow exponentially, the technology landscape has evolved to offer diverse options for processing and analyzing this information efficiently. This post presents a systematic framework for …

Read full article →
AWS Big Data
Advanced (300), Amazon EMR, Analytics, Technical How-to

Build petabyte-scale synthetic test data with Amazon EMR on EC2

✨ What this means for analysts

This post shows how to use Amazon EMR on EC2 to generate large volumes of synthetic test data that can stand in for production data in testing environments, avoiding the compliance and security risks of using real customer information. For analysts working with sensitive data in regulated industries, this means you can validate pipelines, test transformations, and troubleshoot with less risk of exposing PII in development and QA environments.

As you scale your data systems, you face a challenge: how to test thoroughly without putting customer data at risk. Using production data for testing can expose sensitive customer information to unauthorized access or breaches. For customers in regulated industries like finance and healthcare, this risk isn’t only a co…

Read full article →
AWS Big Data
Amazon Redshift, Analytics, Graviton

Meet Amazon Redshift RG – AWS Graviton-based instances with an integrated data lake query engine delivering up to 2.4x better performance at 30% lower price than RA3

✨ What this means for analysts

Amazon released new Redshift RG instances built on AWS Graviton processors. AWS reports that they are up to 2.2x as fast as the previous RA3 generation for data warehouse workloads and up to 2.4x as fast for data lake workloads, at a 30% lower price per vCPU. If you're running Redshift warehouses or querying data lakes through Redshift, you could reduce infrastructure costs and speed up queries, though the actual gains will depend on your workload.

On May 12, 2026, we announced the general availability of Amazon Redshift RG instances, powered by AWS Graviton processors. RG instances are up to 2.2x as fast for data warehouse workloads and up to 2.4x as fast for data lake workloads, all at 30% lower price per vCPU compared to RA3 instances. RG instances support al…
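As a rough way to combine the two headline figures, the sketch below computes a best-case price-performance ratio. It assumes the "up to" speedups are fully realized, that the vCPU count stays the same, and that cost per unit of work scales as price divided by speed; the function name is illustrative, and none of this comes from AWS's own methodology.

```python
def price_performance_gain(speedup: float, price_reduction: float) -> float:
    """Work done per dollar on the new instance relative to the old one.

    speedup: how many times faster the new instance is (e.g. 2.2).
    price_reduction: fractional price cut per vCPU (e.g. 0.30 for 30%).
    """
    if not 0 <= price_reduction < 1:
        raise ValueError("price_reduction must be in [0, 1)")
    return speedup / (1.0 - price_reduction)


# Best-case figures quoted in the announcement (assumed fully realized).
warehouse = price_performance_gain(2.2, 0.30)
data_lake = price_performance_gain(2.4, 0.30)

print(f"Data warehouse: up to {warehouse:.2f}x work per dollar")  # 3.14x
print(f"Data lake:      up to {data_lake:.2f}x work per dollar")  # 3.43x
```

Real workloads will rarely hit the best case, so treat these as upper bounds to test against your own queries.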

Read full article →
Google Research
General Science, Machine Intelligence, Natural Language Processing, Open Source Models & Datasets

Empirical Research Assistance (ERA): From Nature publication to catalyzing Computational Discovery

✨ What this means for analysts

Google has developed ERA, an AI system that helps researchers generate and test hypotheses by analyzing large datasets and scientific literature to suggest experimental directions. For data analysts, this signals growing integration of AI-assisted discovery into research workflows—potentially affecting how you scope exploratory analysis, validate findings, and communicate insights to domain experts, though it's not yet a tool you'd use directly in Power BI or Python.

Read full article →
🤖 Agentic AI & LLMs 15 posts
▶ Featured Videos
Deep Dive into LLMs like ChatGPT

Let's build GPT: from scratch, in code, spelled out.

[1hr Talk] Intro to Large Language Models

Let's build the GPT Tokenizer

OpenAI

How Ramp engineers accelerate code review with Codex

✨ What this means for analysts

Ramp engineers use OpenAI's Codex with GPT-5.5 for code review, getting substantive feedback in minutes instead of hours. For data analysts, AI-assisted code review could speed up your pull request cycles (useful if you're sharing Python scripts or DAX formulas with teammates), though the concrete impact depends on whether your organization adopts similar tooling.

How Ramp engineers use Codex with GPT-5.5 to review code and ship improvements, allowing them to get substantive feedback in minutes instead of hours.

Read full article →
Simon Willison
anthropic, grok, generative-ai, ai

Quoting SpaceX S-1

✨ What this means for analysts

SpaceX is building large-scale compute infrastructure (COLOSSUS II) to train AI models like Grok 5 while also selling spare capacity to other AI companies like Anthropic. For most data analysts working with BI and analytics tools, this doesn't directly affect your daily workflow—it's primarily relevant if your organization is considering alternative cloud compute providers or planning large-scale ML infrastructure investments.

We have the ability to use compute resources to support our proprietary AI applications (such as Grok 5, which is currently being trained at COLOSSUS II), while also providing access to select compute capacity to third-party customers. For example, in May 2026, we entered into Cloud Services Agreements with Anthropic P…

Read full article →
GitHub Copilot
AI & ML, GitHub Copilot, News & insights, Product

Take your local GitHub sessions anywhere

✨ What this means for analysts

GitHub Copilot sessions can now be accessed outside your local machine, letting you pick up multiple concurrent coding tasks (like refactoring, debugging, and building features) from different devices. For data analysts, this means you can start a Python script or dbt transformation on your laptop, then continue or monitor it from a tablet or secondary machine without losing context or having to restart your work.

The best GitHub Copilot workflows don’t happen one-thing-at-a-time. You might have an agent refactoring a module in VS Code, another debugging tests in the CLI, and a third scaffolding a new feature in the background. Managing all of that used to only be possible from your desk. The moment you stepped away from your la…

Read full article →
GitHub Copilot
AI & ML, Generative AI, GitHub Copilot, accessibility

Building a general-purpose accessibility agent—and what we learned in the process

✨ What this means for analysts

GitHub is piloting an experimental general-purpose AI agent to help with accessibility work in code. For data analysts, similar tooling could eventually reduce manual review work and help your Python scripts, Power BI custom visuals, or Databricks notebooks meet accessibility standards, but for now it is described as an experimental pilot.

It is an understatement to say agents have become a popular way of working with code. GitHub has adopted agent-based code creation and editing for many of its initiatives, including piloting an agent to help with our commitment to accessibility. GitHub is currently piloting an experimental general-purpose accessibilit…

Read full article →
GitHub Copilot
AI & ML, Application development, Gaming, GitHub Copilot

Dungeons & Desktops: Building a procedurally generated roguelike with GitHub Copilot CLI

✨ What this means for analysts

This article describes building a GitHub CLI extension in Go using GitHub Copilot, demonstrated through a fun project that generates procedurally created roguelike dungeons. For data analysts, it's not directly relevant to daily Power BI, Fabric, or Python ML work unless you're interested in how AI coding assistants can help you build custom CLI tools or automate repository-based workflows.

I got nerd-sniped into the GitHub Copilot CLI Challenge and made a questionable decision: I turned my codebase into a roguelike dungeon. It started with a simple prompt: Build a GitHub CLI extension in Go that takes the current repository and turns it into a playable roguelike dungeon, with dungeons generated with BSP …

Read full article →
Copilot Studio

New and improved: Agent governance, intelligent workflows, and connected app experiences

✨ What this means for analysts

Copilot Studio added governance controls for AI agents, improved workflow automation, and the ability to embed business applications directly into agents. For data analysts, this means you can now build self-service agent interfaces that connect to your BI tools and data platforms with better oversight, potentially reducing ad-hoc data requests and letting non-technical stakeholders access insights more independently.

As organizations scale their use of AI agent…

Read full article →
GitHub Copilot
AI & ML, Automation, CI/CD, Enterprise software

Improving token efficiency in GitHub Agentic Workflows

✨ What this means for analysts

GitHub is making their AI agent workflows use fewer tokens (and thus cost less) when they automatically run maintenance tasks on code repositories. For data analysts, this matters if you use GitHub for version control with automated CI jobs—lower token costs mean you can run more frequent code quality checks and automated data pipeline maintenance without watching your bill spike.

GitHub Agentic Workflows is like a team of street sweepers that clean up little messes in your repo. These teams significantly improve repo hygiene and quality, but as with all agentic work, cost is a growing concern for developers. And because CI jobs like agentic workflows are automatically scheduled and triggered, c…

Read full article →
GitHub Copilot
AI & ML, Generative AI, GitHub Copilot, AI agents

Agent pull requests are everywhere. Here’s how to review them.

✨ What this means for analysts

AI-generated code is becoming common in pull requests, but it often contains hidden technical debt and redundancy that passes review because it looks clean on the surface. As a data analyst, this matters because approving AI-generated ETL, transformation, or analysis code without scrutiny could leave you maintaining bloated, inefficient pipelines that are harder to debug and scale—particularly risky when you're working across Power BI, Fabric, or Databricks where performance directly impacts your reports and queries.

You’ve probably already approved one without realizing it. The tests passed. The code was clean. You merged it. But it was agent-generated, and that ease of approval is exactly the problem. A January 2026 study, “More Code, Less Reuse”, found that agent-generated code introduces more redundancy and more technical debt …

Read full article →
Azure AI Foundry
Security

Enforcing trust and transparency: Open-sourcing the Azure Integrated HSM

✨ What this means for analysts

Microsoft is open-sourcing the Azure Integrated HSM, a hardware security module that encrypts and protects sensitive data at the infrastructure level. For data analysts working with regulated or sensitive datasets in Azure environments, this means better visibility into how your data encryption actually works and potentially stronger compliance assurance—though the day-to-day impact on Power BI or Fabric workflows depends on your organization's security policies and whether they adopt this for your specific workloads.

As cloud workloads become more agentic and AI systems increasingly handle mission‑critical data, trust must be engineered into the infrastructure at every layer. At Microsoft, security is designed into the foundation of our cloud infrastructure, from silicon to services. With the Azure Integrated Hardware Security Modu…

Read full article →
Copilot Studio

Extend AI voice support: Introducing real-time voice agents in Microsoft Copilot Studio

✨ What this means for analysts

Microsoft Copilot Studio now supports real-time voice agents, letting you build AI assistants that can handle customer support conversations through voice instead of just text. For data analysts, this means you'll likely need to design new data pipelines and monitoring dashboards to track voice interaction metrics, compliance logs, and agent performance data that weren't part of text-only workflows.

Customers expect support that resolves issues quickly, delivers consistent answers, and works seamlessly across channels. For organizations, this creates a familiar tension: how do you deliver high‑quality service at scale without losing control over cost, compliance, or experience? That’s why we’re excited to announce…

Read full article →
Google Gemini
Search, Gemini, AI

100 things we announced at I/O 2026

✨ What this means for analysts

This is a roundup of Google I/O 2026 announcements, including Gemini Omni (likely an upgraded AI model), Antigravity, and Universal Cart, among 100 total updates. Without specifics on which tools integrate with your BI/analytics stack (Power BI, Fabric, Databricks, Python), it's unclear which announcements affect your daily workflow. You'd need to dig into whether any Google services you depend on got material upgrades or new API capabilities.

This year at Google I/O 2026, we announced Gemini Omni, Google Antigravity, Universal Cart and so much more. Here are the highlights.

Read full article →
OpenAI
Research

An OpenAI model has disproved a central conjecture in discrete geometry

✨ What this means for analysts

OpenAI's AI model solved a longstanding mathematical problem in discrete geometry by disproving an 80-year-old conjecture about unit distances. For data analysts, this demonstrates AI's emerging capability to solve novel mathematical problems, which could eventually inform how algorithmic optimization and computational geometry are applied to complex data modeling—though this particular breakthrough doesn't directly change your day-to-day work with BI tools or Python analytics right now.

An OpenAI model solved the 80-year-old unit distance problem, disproving a major conjecture in discrete geometry and marking a milestone in AI-driven mathematics.

Read full article →
Simon Willison
ai, generative-ai, llms

How fast is 10 tokens per second really?

✨ What this means for analysts

Mike Veerman built an interactive tool that lets you visually see what different LLM token speeds actually feel like in practice—so when a vendor says "30 tokens/second," you can watch it play out instead of guessing. If you're evaluating LLMs for reports, chatbots, or real-time analytics features, this helps you judge whether a model's speed will feel snappy or sluggish to your end users before you commit to it.

How fast is 10 tokens per second really? Neat little HTML app by Mike Veerman (source code here) which simulates LLM token output speeds from 5/second to 800/second. Useful if you see a model advertised as "30 tokens/second" and want to get a feel for what that actually looks like.
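For a quick numeric feel alongside the visual tool, the arithmetic is simple: time to finish equals tokens divided by tokens per second. The sketch below is not the linked app; the 400-token response length is a hypothetical example, and the function name is illustrative.

```python
def seconds_to_stream(num_tokens: int, tokens_per_second: float) -> float:
    """Time for a model to finish emitting num_tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


RESPONSE_TOKENS = 400  # hypothetical response length, roughly a few paragraphs

# Speeds chosen from the range mentioned above (5/second to 800/second).
for speed in (5, 30, 800):
    wait = seconds_to_stream(RESPONSE_TOKENS, speed)
    print(f"{speed:>3} tokens/s -> {wait:.1f} s for {RESPONSE_TOKENS} tokens")
```

At 5 tokens/second the 400-token answer takes 80 seconds, while at 800 tokens/second it takes half a second, which is the gap the simulator lets you see directly.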

Read full article →
Simon Willison
gemini, google, generative-ai, ai

Google I/O, Gemini Spark, Antigravity

✨ What this means for analysts

Simon Willison reviews Google I/O announcements but focuses only on features actually available to use now, rather than vaporware roadmap items. That is a practical stance, since many "coming soon" products never ship or change significantly before launch. For data analysts, this means if you're considering Google's new tools, you might want to wait for his deeper dives on production-ready features rather than getting excited about announcements that may not materialize or may differ from initial promises.

It's hard to find much to write about Google I/O this year because I have a policy of not writing about anything that I can't try out myself, and a lot of the big announcements are "coming soon". I actually prefer to write about things that are in general availability, because I've had instances in the past where the p…

Read full article →
Google Gemini
Google Cloud, AI Products, Gemini, Google DeepMind

Making it easier to understand how content was created and edited

✨ What this means for analysts

Google is adding tools to track and display the creation and editing history of web content. For data analysts, this could matter when you're evaluating source credibility or auditing how datasets and documentation have changed over time, though the excerpt doesn't specify whether these tools apply to data-specific platforms you'd use daily like Power BI or Databricks.

We're expanding our tools to help you understand how content was created and edited across the web.

Read full article →