2025-06-25 18:00:05
In this section, we elaborate on the design of the OS2A framework. First, inspired by the DCM principle [15], we present the idea behind OS2A. Then, we describe how the objective service quality and the subjective service experience are modeled.
The modeling of the quality and experience of AIGC services is intractable. For instance, Fig. 4 shows five AIGC images generated by the same prompt. Comparing Figs. 4(a) and (b), we can observe that the latter has higher “quality” since its cabin is located in the center and contains more details. However, from the client's perspective, the experience of Fig. 4(b) might be poor, since the client may strongly prefer photorealistic images like Figs. 4(a) and (c)-(e) over the cartoon-styled Fig. 4(b). Beyond preference, different clients may also hold different standards for AIGC services: some are lenient, while others are strict. Returning to the above example, even for Figs. 4(d) and (e), strict clients might complain that the cabin is too low or that an unexpected aurora is drawn, respectively. DCM, first presented by Daniel McFadden [15], explains the cause of such situations. This theory states that a client's utility is determined by two collaborating sources, namely objective factors and subjective factors. The former refers to the objective attributes that clients can enjoy, while the latter is affected by the specific environment and the subjectivity of the clients themselves. Inspired by DCM, we present the concept of Objective-Subjective Service Assessment (OS2A) for mobile AIGC, considering both the objective experience of the AIGC service process and the subjective experience of the AIGC outputs. OS2A is then defined by combining these two components.
:::info Authors:
(1) Yinqiu Liu, School of Computer Science and Engineering, Nanyang Technological University, Singapore ([email protected]);
(2) Hongyang Du, School of Computer Science and Engineering, Nanyang Technological University, Singapore ([email protected]);
(3) Dusit Niyato, School of Computer Science and Engineering, Nanyang Technological University, Singapore ([email protected]);
(4) Jiawen Kang, School of Automation, Guangdong University of Technology, China ([email protected]);
(5) Zehui Xiong, Pillar of Information Systems Technology and Design, Singapore University of Technology and Design, Singapore ([email protected]);
(6) Abbas Jamalipour, School of Electrical and Information Engineering, University of Sydney, Australia ([email protected]);
(7) Xuemin (Sherman) Shen, Department of Electrical and Computer Engineering, University of Waterloo, Canada ([email protected]).
:::
:::info This paper is available on arxiv under CC BY 4.0 DEED license.
:::
2025-06-25 17:00:02
Next, we present the layer-2 design of the anchor chain, including reputation roll-up and duplex transfer channels. Traditionally, all historical opinions would be saved on the ledger of each MASP. However, the explosive growth of such data consumes considerable storage resources on MASPs. Given that opinions only serve as evidence for reputation tracing, we offload them from the anchor chain and keep only the most critical bookkeeping messages. As shown in Algorithm 1, we develop a layer-2 reputation roll-up, which contains the following steps.
2) Reputation Compression: When the threshold is reached, RCOs take turns compressing the received transactions. Specifically, these transactions undergo the SHA256 operation sequentially in chronological order. Then, a roll-up block Br can be created that contains only the hashes, as shown in Fig. 2. Compared with one block containing 1,000 transactions, which typically occupies 500 KB [30], one Br containing 1,000 hashes occupies only about 32.5 KB, because each SHA256 output takes 256 bits (32 bytes) [44]. Consequently, the data volume consumed for saving historical reputation records can be effectively compressed.
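To make the compression step concrete, here is a minimal Python sketch of the roll-up idea, assuming opinions are JSON-serializable transactions with a timestamp field; the field names are illustrative, and whether hashes are chained or independent depends on the details of Algorithm 1.

```python
import hashlib
import json
import time

def build_rollup_block(transactions):
    """Compress a batch of opinion transactions into a roll-up block
    that stores only the SHA256 hash of each transaction."""
    # Hash transactions in chronological order; here each one is hashed
    # independently, though Algorithm 1 may chain them instead.
    ordered = sorted(transactions, key=lambda tx: tx["timestamp"])
    hashes = [
        hashlib.sha256(json.dumps(tx, sort_keys=True).encode()).hexdigest()
        for tx in ordered
    ]
    return {"type": "rollup", "created_at": time.time(), "tx_hashes": hashes}

# Example: 1,000 dummy opinion transactions shrink to 1,000 32-byte hashes.
txs = [{"timestamp": i, "opinion": f"score-{i % 5}"} for i in range(1000)]
block = build_rollup_block(txs)
print(len(block["tx_hashes"]))  # only hashes are kept on-chain
```

Only the resulting block of hashes would be stored on the anchor chain, while the full opinion records can be archived off-chain and verified against their hashes whenever reputation tracing is required.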
The second layer-2 design is a set of duplex transfer channels between each MASP-client pair, with which we realize atomic fee-ownership transfers. These channels are virtual and instantiated by a specific smart contract. Within a channel, the participants can conduct multiple rounds of atomic transfers protected by the Hash Lock (HL) protocol. Only the channel initialization and closing need to be recorded on the anchor chain. Since the transfers happen inside the channels, low latency can be guaranteed, and the workload of the anchor chain is also alleviated. Next, we introduce the procedure of an atomic fee-ownership transfer on the channel.
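As an illustration of how a Hash Lock protects one round of transfer inside the channel, here is a toy Python sketch; the message structure and roles are assumptions for readability, not Prosecutor's actual contract logic.

```python
import hashlib
import secrets

class HashLockTransfer:
    """Toy hash-lock exchange: the fee and the content ownership are released
    together only when the secret preimage is revealed."""

    def __init__(self):
        self.preimage = secrets.token_bytes(32)             # known only to the client at first
        self.lock = hashlib.sha256(self.preimage).digest()  # shared hash lock

    def masp_locks_ownership(self, lock):
        # The MASP locks the AIGC output's ownership under the hash lock.
        return {"asset": "aigc-output", "locked_under": lock}

    def client_redeems(self, locked_asset, preimage):
        # Revealing the preimage releases ownership to the client and, by the
        # same secret, lets the MASP claim the locked fee.
        if hashlib.sha256(preimage).digest() == locked_asset["locked_under"]:
            return {"ownership": "client", "fee_claimable_by": "masp"}
        raise ValueError("invalid preimage: transfer stays locked")

ht = HashLockTransfer()
locked = ht.masp_locks_ownership(ht.lock)
print(ht.client_redeems(locked, ht.preimage))
```

Because both the fee and the ownership are released by the same preimage, neither party can walk away with only one side of the exchange, which is what makes the transfer atomic.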
:::info Authors:
(1) Yinqiu Liu, School of Computer Science and Engineering, Nanyang Technological University, Singapore ([email protected]);
(2) Hongyang Du, School of Computer Science and Engineering, Nanyang Technological University, Singapore ([email protected]);
(3) Dusit Niyato, School of Computer Science and Engineering, Nanyang Technological University, Singapore ([email protected]);
(4) Jiawen Kang, School of Automation, Guangdong University of Technology, China ([email protected]);
(5) Zehui Xiong, Pillar of Information Systems Technology and Design, Singapore University of Technology and Design, Singapore ([email protected]);
(6) Abbas Jamalipour, School of Electrical and Information Engineering, University of Sydney, Australia ([email protected]);
(7) Xuemin (Sherman) Shen, Department of Electrical and Computer Engineering, University of Waterloo, Canada ([email protected]).
:::
:::info This paper is available on arxiv under CC BY 4.0 DEED license.
:::
2025-06-25 15:28:39
So here's the thing - Python is amazing, but it's painfully slow.
You know it, I know it, everyone knows it.
Enter Mojo, launched in May 2023 by the brilliant minds at Modular AI.
This isn't just another programming language - it's Python's superhero transformation.
Created by Chris Lattner (yes, the Swift and LLVM genius), Mojo was born from a simple frustration: why should we choose between Python's ease and C++'s speed?
Welcome to Mojo - a programming language that enables fast & portable CPU+GPU code on multiple platforms.
But wait, there's more.
Your existing Python code runs in Mojo without changing a single line.
Zero.
Nada.
Nothing changes!
:::tip Think of Mojo as Python that hit the gym, learned martial arts, and came back 1000x stronger while still being the same friendly person you know and love.
:::
The team at Modular didn't set out to build a language - they needed better tools for their AI platform, so they built the ultimate tool.
Not only does Mojo work with Python; you can also access low-level programming for GPUs, TPUs, and even ASIC units.
This means you will no longer need C, C++, CUDA, or Metal to optimize Generative AI and LLM workloads.
Adopt Mojo - and the CUDA moat is gone, and hardware-level programming is simplified.
How cool is that?
Let's start with something you already know:
```mojo
fn main():
    print("Hello, Mojo! 🔥")
```
Looks like Python, right?
That's because it follows Python's syntax, with `fn` standing in for `def`.
Your muscle memory is already trained.
Here's where it gets different - variables with superpowers:
```mojo
fn main():
    let name = "Mojo"     # This is immutable and blazing fast
    var count: Int = 42   # This is mutable with type safety
    let pi = 3.14159      # Smart enough to figure out the type
    print("Language:", name, "Count:", count, "Pi:", pi)
```
See that `let` keyword? It's telling the compiler "this never changes," which unlocks serious optimization magic. The `var` keyword says "this might change," but you can add types for extra speed when you need it.
Now here's where it gets interesting - dual function modes:
```mojo
fn multiply_fast(a: Int, b: Int) -> Int:
    return a * b  # Compiled, optimized, rocket-fast

def multiply_python(a, b):
    return a * b  # Good old Python flexibility

fn main():
    print("Fast:", multiply_fast(6, 7))
    print("Flexible:", multiply_python(6, 7))
```
Use `fn` when you want maximum speed with type safety. Use `def` when you want Python's flexibility. You can literally mix and match in the same program. Start with `def`, optimize with `fn` later.
Here's an interesting loop:
```mojo
fn main():
    let numbers = List[Int](1, 2, 3, 4, 5)
    var total = 0
    for num in numbers:
        total += num[]  # That [] tells Mojo to optimize aggressively
    print("Numbers:", numbers, "Sum:", total)

    # This loop processes a million items faster than Python can blink
    for i in range(1000000):
        pass  # Automatically vectorized by the compiler
```
That explicit `[]` syntax might look weird, but it's your secret weapon for telling the compiler exactly what you want optimized.
There are reasons that Mojo, when fully developed, could take over the entire world.
Remember all those Python libraries you love? They still work:
```mojo
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

fn main():
    let data = np.array([[1, 2], [3, 4], [5, 6]])
    let df = pd.DataFrame(data, columns=['x', 'y'])
    let model = LinearRegression()
    print("All your favorite libraries work instantly!")
```
This is huge.
No migration headaches, no rewriting millions of lines of code.
Your NumPy arrays, pandas DataFrames, and scikit-learn models work exactly like they always have.
The difference?
Now they can run alongside code that's 1000x faster when you need it.
Check this out - automatic parallel processing:
```mojo
from algorithm import vectorize
from sys.info import simdwidthof

fn vector_magic():
    alias size = 1000000
    var a = DTypePointer[DType.float32].alloc(size)
    var b = DTypePointer[DType.float32].alloc(size)
    var result = DTypePointer[DType.float32].alloc(size)

    @parameter
    fn vectorized_add[width: Int](i: Int):
        let a_vec = a.load[width=width](i)
        let b_vec = b.load[width=width](i)
        result.store[width=width](i, a_vec + b_vec)

    vectorize[vectorized_add, simdwidthof[DType.float32]()](size)
```
That `@parameter` decorator is doing compile-time magic - it creates specialized versions of your function for different CPU architectures.
Your code automatically uses all available CPU cores and SIMD instructions without you thinking about it.
This single function can be 8x to 128x faster than equivalent Python code.
And many other benchmarks are going through the roof!
Want to use your GPU?
Here's how simple it is:
```mojo
from gpu import GPU
from tensor import Tensor

fn gpu_power():
    @gpu.kernel
    fn matrix_multiply(a: Tensor[DType.float32], b: Tensor[DType.float32]) -> Tensor[DType.float32]:
        return a @ b  # Just matrix multiplication, but on GPU

    let big_matrix_a = Tensor[DType.float32](Shape(2048, 2048))
    let big_matrix_b = Tensor[DType.float32](Shape(2048, 2048))
    let result = matrix_multiply(big_matrix_a, big_matrix_b)
```
No CUDA programming, no memory management nightmares, no kernel configuration headaches.
The `@gpu.kernel` decorator automatically generates optimized GPU code for NVIDIA, AMD, and Apple GPUs.
The same code runs on any GPU without changes.
This is revolutionary and a huge improvement over existing tooling!
Now Mojo gets really clever:
```mojo
struct SmartMatrix[rows: Int, cols: Int, dtype: DType]:
    var data: DTypePointer[dtype]

    fn __init__(inout self):
        self.data = DTypePointer[dtype].alloc(rows * cols)

    fn get(self, row: Int, col: Int) -> SIMD[dtype, 1]:
        return self.data.load(row * cols + col)

fn show_parametric_power():
    let small_int_matrix = SmartMatrix[10, 10, DType.int32]()
    let big_float_matrix = SmartMatrix[1000, 500, DType.float64]()
    # Each gets its own optimized code generated at compile time
```
The compiler creates completely different optimized code for each combination of parameters.
Your 10x10 integer matrix gets different optimizations than your 1000x500 float matrix.
This is C++ template-level performance with much cleaner and more readable syntax.
Here's how Mojo prevents memory leaks and crashes:
```mojo
struct SafePointer[T: AnyType]:
    var data: Pointer[T]

    fn __init__(inout self, value: T):
        self.data = Pointer[T].alloc(1)
        self.data.store(value)

    fn __moveinit__(inout self, owned other: Self):
        self.data = other.data
        other.data = Pointer[T]()  # Original pointer is now empty

    fn __del__(owned self):
        if self.data:
            self.data.free()  # Automatic cleanup
```
This is Rust-style memory safety with Python-style ease of use.
No garbage collection pauses, no memory leaks, no use-after-free bugs.
Memory gets cleaned up exactly when you expect it to, not when some garbage collector feels like it.
This is serious innovation!
```mojo
@adaptive
fn smart_algorithm(data: List[Int]) -> Int:
    var sum = 0
    for item in data:
        sum += item[]
    return sum
```
The `@adaptive` decorator tells the compiler to generate multiple versions of your function.
The runtime system profiles your actual usage and picks the fastest version for your specific data patterns.
Your code gets smarter the more it runs!
Want to move work from runtime to compile time?
Easy:
```mojo
@parameter
fn compile_time_fibonacci(n: Int) -> Int:
    @parameter
    if n <= 1:
        return 1
    else:
        # Fibonacci recurrence, evaluated entirely at compile time
        return compile_time_fibonacci(n - 1) + compile_time_fibonacci(n - 2)

fn main():
    alias fib_result = compile_time_fibonacci(15)
    print("Fibonacci 15:", fib_result)  # Calculated while compiling
```
Complex calculations happen during compilation, not when your program runs.
This means zero runtime cost for things that can be figured out ahead of time.
This is a huge, forward-thinking leap in programming language design.
I expect other programming languages to follow suit!
Traits let you write code that works with many different types:
```mojo
trait Addable:
    fn __add__(self, other: Self) -> Self

struct Vector2D(Addable):
    var x: Float32
    var y: Float32

    fn __add__(self, other: Self) -> Self:
        return Vector2D(self.x + other.x, self.y + other.y)

fn add_anything[T: Addable](a: T, b: T) -> T:
    return a + b  # Works with any type that implements Addable
```
Write once, use with any compatible type, and get optimized code for each specific type.
Want to talk directly to your CPU's vector units?
```mojo
fn simd_playground():
    let data = SIMD[DType.float32, 8](1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0)
    let squared = data * data
    let fma_result = data.fma(data, data)  # Fused multiply-add
    let shuffled = data.shuffle[4, 5, 6, 7, 0, 1, 2, 3]()
```
Direct access to CPU vector instructions with type safety.
Operations that would take 8 CPU cycles now take 1!
The standard library includes features for all kinds of tasks.
`List[T]` gives you dynamic arrays that are both type-safe and lightning fast.
`Dict[K, V]` provides hash tables optimized for real-world usage patterns.
`String` handles both ASCII and Unicode efficiently without the usual performance penalties.
`Tensor[dtype]` is your gateway to GPU-accelerated numerical computing.
`DTypePointer[dtype]` gives you low-level control with high-level safety.
`Buffer[T]` provides automatic memory management for temporary data.
`Reference[T]` implements zero-copy borrowing for maximum efficiency.
`vectorize` automatically spreads your loops across your CPU's SIMD lanes.
`parallelize` distributes work across threads with smart load balancing.
`sort` provides specialized sorting algorithms for different data types and sizes.
Native support for complex numbers, arbitrary precision math, and linear algebra.
Automatic differentiation for machine learning without external dependencies.
Statistical functions that are both accurate and blazingly fast.
File I/O that automatically optimizes for SSD vs HDD vs network storage.
Network programming with async/await support for high-performance servers.
Cross-platform threading that actually works consistently.
Mojo currently works on Linux (Ubuntu 18.04+, CentOS 7+) and macOS (10.15+).
Windows support is coming soon - the team is working on it.
And when that happens - I see worldwide adoption.
And in the long term, I see mobile, edge, and IoT deployment as well!
You'll need 8 GB of RAM minimum, 16 GB recommended for smooth compilation.
Installation takes less than 5 minutes with the official installer.
```bash
# Install the Modular SDK
curl -fsSL https://get.modular.com | sh -
modular install mojo

# Check if everything works
mojo --version
mojo run --help
```
A fully featured LLDB debugger is included with Mojo, along with beautifully integrated code completion support with hover and doc hints.
The VS Code extension gives you syntax highlighting, error checking, and integrated debugging.
```bash
# Start a new project
mkdir awesome-mojo-project && cd awesome-mojo-project
mojo package init my-package

# Build and run
mojo build main.mojo
./main
```
The package system handles dependencies, versioning, and cross-platform distribution automatically.
```mojo
from testing import assert_equal

fn test_addition():
    assert_equal(2 + 3, 5)
    print("Math still works!")

fn main():
    test_addition()
```
Built-in testing framework includes performance benchmarking capabilities.
:::tip MAX, Modular's AI platform, is not just an architecture - it's a performance beast!
:::
Getting used to `let` vs `var` vs Python-style variables takes practice.
Python and Mojo remind me of C and C++, but for Generative AI instead of OOP.
Windows and mobile support will unlock enterprise and edge markets.
Universities will start teaching Mojo, creating a new generation of developers.
Major AI companies will replace Python bottlenecks with Mojo implementations.
The ecosystem will hit critical mass with hundreds of production-ready libraries.
Mojo aims to become a full superset of Python with its own dynamically growing tool ecosystem.
New AI/ML projects will default to Mojo for production performance.
Scientific computing will gradually migrate from Fortran and C++ to Mojo.
Cloud providers will offer Mojo-optimized instances with specialized acceleration.
Mojo could become the go-to language for performance-critical applications everywhere.
Hardware manufacturers will design chips with Mojo-specific features.
The language will influence next-generation programming language design.
Schools will teach Mojo as the primary computational language.
There is limited competition from Julia, Rust, Carbon, and other performance languages, and the reason I say limited is Mojo's support for Python.
But Mojo needs to balance Python compatibility with the needs of language evolution.
The open-source community and the commercial platform requirements also need to be balanced.
And diverse hardware architectures, along with their optimization strategies, must be supported.
Here's the bottom line: Mojo eliminates the false choice between system fragmentation and system performance.
Your Python skills remain valuable - they just become 10000x more powerful.
Performance improvements of 10-10000x open up applications that were previously impossible.
The unified CPU+GPU programming model simplifies modern AI and scientific computing.
Even in blockchain and crypto mining, direct access to GPUs and ASICs gives Mojo a huge advantage.
Chris Lattner's track record with Swift and LLVM gives confidence in Mojo's future.
The timing is perfect - AI demands, edge computing needs, and developer productivity requirements are converging.
:::tip And Generative AI eating the world is the perfect use-case for Mojo.
:::
I believe that developing countries such as India should adopt Mojo instead of CUDA to build their LLMs, LMMs, and SLMs.
Not only does it make us less reliant on Nvidia, the computational costs will also decrease because of higher performance.
The Rust memory-safety feature and the Python compatibility are the icing and the cherry on the cake.
Once Mojo is available for Windows, I see an accelerated takeover in the entire programming industry.
And the main reason for this is the 100% support for pure Python.
If Modular does things right and open-sources the entire codebase:
I see Mojo having a huge impact.
Worldwide.
If you haven’t started with Mojo, do so today!
:::tip The real question isn't whether Mojo will succeed.
It's whether you'll be ready when it transforms your industry.
And it’s no longer a question of if, but when.
:::
Unless attributed to other sources, images were generated by Leonardo.ai at this link: https://app.leonardo.ai/
Claude Sonnet 4 was used in this article with heavy editing; the model is available here: https://claude.ai/
2025-06-25 15:23:17
"Data-driven" has become a badge of honor in modern marketing and product development. Dashboards are filled with charts, click-through rates, heatmaps, and A/B results. But in the rush to optimize what we can measure, many brands have lost sight of what they can't: emotion, hesitation, intent, and trust.
In an age when customer behavior evolves by the hour, decisions made purely on metrics are often misguided. Numbers reveal what happened but not why. And in the gap between those two realities lies some of the most valuable insight a business can find.
What the Dashboards Miss
There’s a dangerous assumption that if something performs well numerically, it must be working holistically. But history has shown that data can deceive. High-performing pages in terms of clicks or engagement may still underdeliver in conversion, brand perception, or loyalty, not because they’re poorly designed but because they miss the emotional mark.
Take Amazon’s Fire Phone, for example. Early engagement and traffic were promising, but customer feedback revealed that the product felt gimmicky, lacked essential app support, and failed to connect emotionally with users. Despite strong initial visibility, it became one of Amazon’s most expensive product flops.
Similarly, Tesco’s Fresh & Easy chain launched in the U.S. with extensive data backing, modeled after successful UK stores. But the format (smaller stores with self-service checkout and ready-made meals) confused and frustrated American shoppers, who expected a more personal, larger-scale grocery experience. The stores underperformed dramatically and were eventually shuttered.
These are reminders that even when the numbers look good on paper, emotional disconnects can undermine the entire experience.
Quantitative testing methods, like A/B testing, were never built to capture how a user feels when navigating a product page. They can't tell when a shopper pauses before clicking "buy" or when an image triggers confusion, not confidence. These nuances don’t show up in analytics tools, but they influence behavior the same way.
The Human Layer: Where Emotion Shapes Action
Today’s most influential customers, especially Gen Z and mobile-first users, tend to make fast, emotionally driven choices. They respond instantly and intuitively to how a brand makes them feel. These emotional impressions form in seconds and often outweigh rational evaluation. A perfectly structured product page can still fall flat if it lacks authenticity, relatability, or emotional resonance—because what feels right often matters more than what looks right.
In mobile-first environments, users often decide within milliseconds whether a brand experience feels intuitive or off. Micro-interactions, like hover hesitation, scroll speed, or a quick swipe away, offer valuable signals about trust and clarity yet rarely surface in traditional analytics dashboards.
Emotion may be irrational, but it’s far from random. It reveals itself in the flicker of a facial expression, a pause in a voice response, or the way a user scrolls—slowly, quickly, or not at all—through a carousel of images. These subtle cues often guide decision-making, yet they’re exactly what most analytics tools fail to capture.
Tools Bridging the Gap
A new generation of platforms is stepping in to close this gap—tools designed to bring emotional intelligence into the optimization process. Instead of relying on metrics alone, they blend qualitative feedback, behavioral signals, and AI to reveal what users are actually experiencing in real time.
These tools enhance traditional analytics rather than replace them. They capture the emotional context and behavioral nuance that data alone often misses.
Why Real Feedback Feeds Statistical Significance
One of the core problems with traditional feedback systems like post-call NPS and CSAT surveys is their participation rate. According to CX Today, survey engagement has dropped to just 5%—meaning 95% of customers never respond. Even within that narrow segment of respondents, the data tends to lean toward emotional extremes: customers who are either very happy or very upset. This creates a "U-shaped" curve of opinions, missing the voices of the vast majority with moderate or nuanced experiences.
This isn’t just a data issue. It’s a misalignment with reality. When businesses rely heavily on skewed feedback, they end up shaping strategies around edge cases rather than the true center of customer sentiment. And even when testing isn't biased, it's often too slow to catch up. Traditional A/B testing requires time, traffic, and statistically significant results to act. But in fast-moving digital environments, waiting for confidence intervals often means missing the response window.
Traditional surveys also lack meaningful context. Creovai's research suggests that only about 20% of respondents leave open-text comments to explain their scores. That means most customer input boils down to a single digit—useful for trend spotting but useless for problem-solving. It's the difference between spotting smoke and finding the fire.
That’s where human feedback becomes invaluable. Data alone often delivers broad or ambiguous patterns. But real shopper feedback offers precision—and context. With behavior-driven platforms, brands can hear how customers express hesitation, trust, or objection in their own words and tone. That emotional clarity is nearly impossible to replicate through dashboards alone.
While data shows trends, real user opinions reveal the triggers behind them. Numbers might tell you a page has a high drop-off rate, but it takes a human voice to explain that the call-to-action felt too pushy or that the imagery lacked credibility. In a world of instant reactions and emotionally charged decision-making, understanding what people think isn’t just a nice-to-have—it’s how brands stay relevant.
As Andri Sadlak, founder of ProductPinion, puts it:
“Data tells you what happened. But real consumer reactions tell you why it happened. That emotional layer—the pause before a scroll, the hesitation in a voice, the face that lights up at a product image—can never be captured in a spreadsheet. With ProductPinion, we built a system that decodes emotional behavior at scale because optimization without empathy is optimization in the dark.”
Many teams are turning to AI to overcome these blind spots, as it clarifies complex, emotional user behavior.
What AI Actually Brings to Testing
AI’s real value in this space is not automation; it’s amplification. It helps extract deeper meaning from feedback, faster and at scale, and modern behavior-driven platforms use AI to do exactly that.
With these new possibilities, companies can expand the scope of their testing. With AI, teams can assess which image performs better and why users trust one message over another. They can also test text length, color, emotional tone, and perceived authenticity.
Importantly, this kind of testing goes beyond surface metrics. AI allows brands to optimize for trust, relatability, and emotional resonance—all of which are difficult to measure with clicks alone. It enables iterative, empathy-driven development without sacrificing speed or scale.
When used correctly, AI doesn't replace human insight. It sharpens it.
Rethinking Data-Driven
The problem isn’t outdated feedback methods but the illusion that more data equals deeper understanding. Being data-driven isn’t a flaw, but it becomes one when it replaces human experience instead of enhancing it. The future of optimization lies in blending the precision of data with the power of empathy.
Platforms like ProductPinion, Trymata, and UserTesting are helping brands shift from measuring performance to interpreting behavior. The next wave of competitive advantage won’t come from tracking what people click but from uncovering the emotions and intent behind each action.
Future-leading brands will move beyond behavior analysis to truly grasp what drives decisions. They’ll combine data with empathy, dashboards with dialogue, and optimization with intuition. Because, in the end, what customers remember most is the impression your product left on them.
2025-06-25 15:21:13
As AI agents become more autonomous and capable, their role is shifting from passive assistants to proactive actors. Today’s large language models (LLMs) don’t just generate text—they execute tasks, access APIs, modify databases, and even control infrastructure.
AI agents are taking actions that were once reserved strictly for human users, whether it’s scheduling a meeting, deploying a service, or accessing a sensitive document.
When agents operate without guardrails, they can inadvertently make harmful or unauthorized decisions. A single hallucinated command, misunderstood prompt, or overly broad permission can result in data leaks, compliance violations, or broken systems.
That’s why integrating human-in-the-loop (HITL) workflows is essential for agent safety and accountability.
Permit.io’s Access Request MCP is a framework designed to enable AI agents with the ability to request sensitive actions, while allowing humans to remain the final decision-makers.
Built on Permit.io and integrated into popular agent frameworks like LangChain and LangGraph, this system lets you insert approval workflows directly into your LLM-powered applications.
In this tutorial, you’ll learn how to add human-in-the-loop approval to an AI agent using Permit.io’s Access Request MCP, LangGraph, and LangGraph’s `interrupt()` feature.
Before we dive into our demo application and implementation steps, let’s briefly discuss the importance of delegating AI permissions to humans.
AI agents are powerful, but, as we all know, they’re not infallible.
They follow instructions, but they don’t understand context like humans do. They generate responses, but they can’t judge consequences. And when those agents are integrated into real systems—banking tools, internal dashboards, infrastructure controls—that’s a dangerous gap.
In this context, what can go wrong is pretty clear: a hallucinated command, a misunderstood prompt, or an overly broad permission acting on a real system.
Delegation is the solution.
Instead of giving agents unchecked power, we give them a protocol: “You may ask, but a human decides.”
By introducing human-in-the-loop (HITL) approval at key decision points, you get oversight, accountability, and an audit trail for the actions your agents take.
It’s the difference between an agent doing something and an agent requesting to do something.
And it’s exactly what Permit.io’s Access Request MCP enables.
The Access Request MCP is a core part of Permit.io’s Model Context Protocol (MCP)—a specification that gives AI agents safe, policy-aware access to tools and resources.
Think of it as a bridge between LLMs that want to act and humans who need control.
Permit’s Access Request MCP enables AI agents to request access to restricted resources and to defer sensitive operations to human approvers, pausing the workflow through LangGraph’s `interrupt()` mechanism.
Behind the scenes, it uses Permit.io’s authorization capabilities, built to support fine-grained models such as the relationship-based access control (ReBAC) used in this tutorial.
Permit’s MCP is integrated directly into the LangChain MCP Adapter and LangGraph ecosystem, so workflows can pause with `interrupt()` when sensitive actions occur.
It’s the easiest way to inject human judgment into AI behavior—no custom backend needed.
Understanding the implementation and its benefits, let’s get into our demo application.
In this tutorial, we’ll build a real-time approval workflow in which an AI agent can request access or perform sensitive actions, but only a human can approve them.
To see how Permit’s MCP can help enable an HITL workflow in a user application, we’ll model a food ordering system for a family, where a parent has full control and a child must request approval before ordering.
This use case reflects a common pattern: “Agents can help, but humans decide.”
We’ll build this HITL-enabled agent using Permit.io’s MCP server, the LangChain MCP Adapters, and LangGraph with its `interrupt()` support.
You’ll end up with a working system where agents can collaborate with humans to ensure safe, intentional behavior, using real policies, real tools, and real-time approvals.
A repository with the full code for this application is available here.
In this section, we’ll walk through how to implement a fully functional human-in-the-loop agent system using Permit.io and LangGraph.
We’ll cover how to model the approval policies in Permit, run the Permit MCP server, build a LangGraph + LangChain MCP client, and add human approval with `interrupt()`.
Let’s get into it.
We’ll start by defining your system’s access rules inside the Permit.io dashboard. This lets you model which users can do what, and what actions should trigger an approval flow.
Create a ReBAC Resource
Navigate to the Policy page from the sidebar, then:
Click the Resources tab
Click Create a Resource
Name the resource: restaurants
Under ReBAC Options, define two roles:
parent
child-can-order
Click Save
Now, go to the Policy Editor tab and assign permissions:
`parent`: full access (`create`, `read`, `update`, `delete`)
`child-can-order`: `read`
Set Up Permit Elements
Go to the Elements tab from the sidebar. In the User Management section, click Create Element.
Configure the element as follows:
Name: Restaurant Requests
Configure elements based on: ReBAC Resource Roles
Resource Type: restaurants
Role permission levels: `parent`, `child-can-order`
Click Create
In the newly created element card, click Get Code and take note of the config ID: `restaurant-requests`. We’ll use this later in the `.env` file.
Add Operation Approval Elements
Create a new Operation Approval element:
Name: Dish Approval
Resource Type: restaurants
Click Create
Then create an Approval Management element:
Name: Dish Requests
Click Get Code and copy the config ID: `dish-requests`.
Add Test Users & Resource Instances
Navigate to Directory > Instances
Click Add Instance
Resource Type: restaurants
Instance Key: pizza-palace
Tenant: Default Tenant (or your working tenant)
Switch to the Users tab
Click Add User
Key: joe
Instance Access: restaurants:pizza-palace#parent
Click Save
Create another user with the key henry
Don’t assign a role
Once Permit is configured, we’re ready to clone the MCP server and connect your policies to a working agent.
With your policies modeled in the Permit dashboard, it’s time to bring them to life by setting up the Permit MCP server—a local service that exposes your access request and approval flows as tools that an AI agent can use.
Clone and Install the MCP Server
Start by cloning the MCP server repository and setting up a virtual environment.
```bash
git clone https://github.com/permitio/permit-mcp
cd permit-mcp

# Create virtual environment, activate it and install dependencies
uv venv
source .venv/bin/activate # For Windows: .venv\Scripts\activate
uv pip install -e .
```
Add Environment Configuration
Create a `.env` file at the root of the project based on the provided `.env.example`, and populate it with the correct values from your Permit setup:
```bash
RESOURCE_KEY=restaurants
ACCESS_ELEMENTS_CONFIG_ID=restaurant-requests
OPERATION_ELEMENTS_CONFIG_ID=dish-requests
TENANT= # e.g. default
LOCAL_PDP_URL=
PERMIT_API_KEY=
PROJECT_ID=
ENV_ID=
```
You can retrieve the values for `LOCAL_PDP_URL`, `PERMIT_API_KEY`, `PROJECT_ID`, and `ENV_ID` from the Permit documentation and your Permit dashboard.
⚠️ Note: We are using Permit’s Local PDP (Policy Decision Point) for this tutorial to support ReBAC evaluation and low-latency, offline testing.
Start the Server
With everything in place, you can now run the MCP server locally:
```bash
uv run -m src.permit_mcp
```
Once the server is running, it will expose your configured Permit Elements (access request, approval management, etc.) as tools the agent can call through the MCP protocol.
Now that the Permit MCP server is up and running, we’ll build an AI agent client that can interact with it. This client will call the Permit MCP tools (`request_access`, `approve_operation_approval`, etc.) and, in the next section, pause for human approval using `interrupt()`.
Let’s connect the dots.
Install Required Dependencies
Inside your MCP project directory, install the necessary packages:
```bash
uv add langchain-mcp-adapters langgraph langchain-google-genai
```
This gives you:
`langchain-mcp-adapters`: automatically converts Permit MCP tools into LangGraph-compatible tools
`langgraph`: for orchestrating graph-based workflows
`langchain-google-genai`: for interacting with Gemini 2.0 Flash

Add Google API Key
You’ll need an API key from Google AI Studio to use Gemini.
Add the key to your `.env` file:
```bash
GOOGLE_API_KEY=your-key-here
```
Build the MCP Client
Create a file named `client.py` in your project root.
We’ll break this file down into logical blocks:
Imports and Setup
Start by importing dependencies and loading environment variables:
```python
import os
from typing_extensions import TypedDict, Literal, Annotated
from dotenv import load_dotenv
from langchain_google_genai import ChatGoogleGenerativeAI
from langgraph.graph import StateGraph, START, END
from langgraph.types import Command, interrupt
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import ToolNode
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from langchain_mcp_adapters.tools import load_mcp_tools
import asyncio
from langgraph.graph.message import add_messages
```
Then, load the environment and set up your Gemini LLM:
```python
load_dotenv()

global_llm_with_tools = None

llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",
    google_api_key=os.getenv('GOOGLE_API_KEY')
)

server_params = StdioServerParameters(
    command="python",
    args=["src/permit_mcp/server.py"],
)
```
Define the shared agent state:
```python
class State(TypedDict):
    messages: Annotated[list, add_messages]
```
Define Workflow Nodes and the graph builder:
Here’s the logic to route between calling the LLM and invoking tools:
```python
async def call_llm(state):
    response = await global_llm_with_tools.ainvoke(state["messages"])
    return {"messages": [response]}


def route_after_llm(state) -> Literal[END, "run_tool"]:
    return END if len(state["messages"][-1].tool_calls) == 0 else "run_tool"


async def setup_graph(tools):
    builder = StateGraph(State)
    run_tool = ToolNode(tools)
    builder.add_node(call_llm)
    builder.add_node('run_tool', run_tool)
    builder.add_edge(START, "call_llm")
    builder.add_conditional_edges("call_llm", route_after_llm)
    builder.add_edge("run_tool", "call_llm")

    memory = MemorySaver()
    return builder.compile(checkpointer=memory)
```
In the above code, we have defined an LLM node and its conditional edge, which routes to the `run_tool` node if there is a tool call in the state's message, or ends the graph. We have also defined a function to set up and compile the graph with an in-memory checkpointer.
Next, add the following code to stream responses from the graph and to run an interactive chat loop that keeps going until the user explicitly exits.

Stream Output and Handle Chat Input:
```python
async def stream_responses(graph, config, invokeWith):
    async for event in graph.astream(invokeWith, config, stream_mode='updates'):
        for key, value in event.items():
            if key == 'call_llm':
                content = value["messages"][-1].content
                if content:
                    print('\n' + ", ".join(content) if isinstance(content, list) else content)


async def chat_loop(graph):
    while True:
        try:
            user_input = input("\nQuery: ").strip()
            if user_input in ["quit", "exit", "q"]:
                print("Goodbye!")
                break

            sys_m = """
            Always provide the resource instance key during tool calls, as the ReBAC authorization model is being used. To obtain the resource instance key, use the list_resource_instances tool to view available resource instances.
            Always parse the provided data before displaying it.
            If the user has initially provided their ID, use that for subsequent tool calls without asking them again.
            """

            invokeWith = {"messages": [
                {"role": "user", "content": sys_m + '\n\n' + user_input}]}
            config = {"configurable": {"thread_id": "1"}}
            await stream_responses(graph, config, invokeWith)
        except Exception as e:
            print(f"Error: {e}")
```
Final Assembly
Add the main entry point where we will convert the Permit MCP server tool to LangGraph-compatible tools, bind our LLM to the resulting tools, set up the graph, draw it to a file, and fire up the chat loop:
```python
async def main():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await load_mcp_tools(session)
            llm_with_tools = llm.bind_tools(tools)
            graph = await setup_graph(tools)

            global global_llm_with_tools
            global_llm_with_tools = llm_with_tools

            with open("workflow_graph.png", "wb") as f:
                f.write(graph.get_graph().draw_mermaid_png())

            await chat_loop(graph)


if __name__ == "__main__":
    asyncio.run(main())
```
Once you’ve saved everything, start the client:
```bash
uv run client.py
```
After running, a new image file called workflow_graph.png will be created, which shows the graph.
With everything set up, we can now specify queries like this:
```
Query: My user id is henry, request access to pizza palace with the reason: I am now 18, and the role child-can-order
Query: My user id is joe, list all access requests
```
Your agent is now able to call MCP tools dynamically!
Adding Human Approval with `interrupt()`
With your LangGraph-powered MCP client up and running, Permit tools can now be invoked automatically. But what happens when the action is sensitive, like granting access to a restricted resource or approving a high-risk operation?
That’s where LangGraph’s `interrupt()` becomes useful.
We’ll now add a human approval node to intercept and pause the workflow whenever the agent tries to invoke critical tools like `approve_access_request` and `approve_operation_approval`. A human will be asked to manually approve or deny the tool call before the agent proceeds.
Define the Human Review Node
At the top of your `client.py` file (before `setup_graph`), add the following function:
```python
async def human_review_node(state) -> Command[Literal["call_llm", "run_tool"]]:
    """Handle human review process."""
    last_message = state["messages"][-1]
    tool_call = last_message.tool_calls[-1]

    high_risk_tools = ['approve_access_request', 'approve_operation_approval']
    if tool_call["name"] not in high_risk_tools:
        return Command(goto="run_tool")

    human_review = interrupt({
        "question": "Do you approve this tool call? (yes/no)",
        "tool_call": tool_call,
    })

    review_action = human_review["action"]
    if review_action == "yes":
        return Command(goto="run_tool")

    return Command(goto="call_llm", update={"messages": [{
        "role": "tool",
        "content": f"The user declined your request to execute the {tool_call.get('name', 'Unknown')} tool, with arguments {tool_call.get('args', 'N/A')}",
        "name": tool_call["name"],
        "tool_call_id": tool_call["id"],
    }]})
```
This node checks whether the tool being called is considered “high risk.” If it is, the graph is interrupted with a prompt asking for human confirmation.
Update Graph Routing
Modify the `route_after_llm` function so that tool calls are routed to the human review node instead of running immediately:
```python
def route_after_llm(state) -> Literal[END, "human_review_node"]:
    """Route logic after LLM processing."""
    return END if len(state["messages"][-1].tool_calls) == 0 else "human_review_node"
```
Wire in the HITL Node
Update the `setup_graph` function to add `human_review_node` as a node in the graph:
```python
async def setup_graph(tools):
    builder = StateGraph(State)
    run_tool = ToolNode(tools)
    builder.add_node(call_llm)
    builder.add_node('run_tool', run_tool)
    builder.add_node(human_review_node)  # Add the interrupt node here
    builder.add_edge(START, "call_llm")
    builder.add_conditional_edges("call_llm", route_after_llm)
    builder.add_edge("run_tool", "call_llm")

    memory = MemorySaver()
    return builder.compile(checkpointer=memory)
```
Handle Human Input During Runtime
Finally, let’s enhance the `stream_responses` function to detect when the graph is interrupted, prompt for a decision, and resume with human input using `Command(resume={"action": user_input})`.
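Here is one possible version of that enhancement, sketched against LangGraph's documented interrupt/resume pattern (the `__interrupt__` update key and `Command(resume=...)`); the repository's final implementation may differ in its details.

```python
async def stream_responses(graph, config, invokeWith):
    async for event in graph.astream(invokeWith, config, stream_mode='updates'):
        for key, value in event.items():
            if key == 'call_llm':
                content = value["messages"][-1].content
                if content:
                    print('\n' + ", ".join(content) if isinstance(content, list) else content)
            elif key == '__interrupt__':
                # The graph paused inside human_review_node: show the pending
                # tool call and ask the human operator for a decision.
                payload = value[0].value
                print('\n' + payload["question"])
                print("Tool call:", payload["tool_call"])
                decision = input("Your decision (yes/no): ").strip().lower()
                # Resume the paused graph with the human decision.
                await stream_responses(graph, config, Command(resume={"action": decision}))
```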
After running the client, your graph diagram (`workflow_graph.png`) will now include a human review node between the LLM and tool execution stages:
This ensures that you remain in control whenever the agent tries to make a decision that could alter permissions or bypass restrictions.
With this, you've successfully added human oversight to your AI agent, without rewriting your tools or backend logic.
In this tutorial, we built a secure, human-aware AI agent using Permit.io’s Access Request MCP, LangGraph, and LangChain MCP Adapters.
Instead of letting the agent operate unchecked, we gave it the power to request access and defer critical decisions to human users, just like a responsible team member would.
We covered how to model approval policies in Permit, expose them through the Permit MCP server, build a LangGraph-powered MCP client, and add human approval with `interrupt()`.
Want to see the full demo in action? Check out the GitHub Repo.
Further Reading: LangGraph’s `interrupt()` reference.
2025-06-25 15:12:07
AI Security Posture Management (AISPM) is an emerging discipline focused on securing AI agents, their memory, external interactions, and behavior in real-time.
As AI agents become deeply embedded in applications, traditional security models aren’t really up for the task. Unlike static systems, AI-driven environments introduce entirely new risks—hallucinated outputs, prompt injections, autonomous actions, and cascading interactions between agents.
These aren’t just extensions of existing problems—they’re entirely new challenges that legacy security posture tools like DSPM (Data Security Posture Management) or CSPM (Cloud Security Posture Management) were never designed to solve.
AISPM exists because AI systems don’t just store or transmit data—they generate new content, make decisions, and trigger real-world actions. Securing these systems requires rethinking how we monitor, enforce, and audit security, not at the infrastructure level, but at the level of AI reasoning and behavior.
If you’re looking for a deeper dive into what machine identities are and how AI agents fit into modern access control models, we cover that extensively in “What is a Machine Identity? Understanding AI Access Control”. This article, however, focuses on the next layer: securing how AI agents operate, not just who they are.
Join us as we explain what makes AISPM a distinct and necessary evolution, explore the four unique perimeters of AI security, and outline how organizations can start adapting their security posture for an AI-driven world.
Because the risks AI introduces are already here, and they’re growing fast.
Securing AI systems isn’t just about adapting existing tools; it’s about confronting entirely new risk categories that didn’t exist until now.
As mentioned above, AI agents don’t just execute code—they generate content, make decisions, and interact with other systems in unpredictable ways. That unpredictability introduces vulnerabilities that security teams are only beginning to understand.
AI hallucinations, for example—false or fabricated outputs—aren’t just inconvenient; they can corrupt data, expose sensitive information, or even trigger unsafe actions if not caught.
Combine that with the growing use of retrieval-augmented generation (RAG) pipelines, where AI systems pull information from vast memory stores, and the attack surface expands dramatically.
Beyond data risks, AI systems are uniquely susceptible to prompt injection attacks, where malicious actors craft inputs designed to hijack the AI’s behavior. Think of it as the SQL injection problem, but harder to detect and even harder to contain, as it operates within natural language.
Perhaps the most challenging part of this is that AI agents don’t operate in isolation. They trigger actions, call external APIs, and sometimes interact with other AI agents, creating complex, cascading chains of behavior that are difficult to predict, control, or audit.
Traditional security posture tools were never designed for this level of autonomy and dynamic behavior. That’s why AISPM is not DSPM or CSPM for AI—it’s a new model entirely, focused on securing AI behavior and decision-making.
Securing AI systems isn’t just about managing access to models—it requires controlling the entire flow of information and decisions as AI agents operate. From what they’re fed, to what they retrieve, to how they act, and what they output, each phase introduces unique risks.
As with any complex system, access control becomes an attack surface, and in the context of AI that surface is amplified. That’s why a complete AISPM strategy should consider these four distinct perimeters, each acting as a checkpoint for potential vulnerabilities:
Every AI interaction starts with a prompt, and prompts are now an attack surface. Whether from users, other systems, or upstream AI agents, unfiltered prompts can lead to manipulation, unintended behaviors, or AI "jailbreaks".
Prompt filtering ensures that only validated, authorized inputs reach the model. This includes, for example, restricting certain prompt types for non-admin users or requiring additional checks for prompts containing sensitive operations like database queries or financial transactions, as sketched below.
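As a minimal illustration of this perimeter, the sketch below gates prompts with a hypothetical `is_admin` flag and a keyword check; a production system would call a real policy engine (such as Permit.io) instead of hard-coded rules.

```python
SENSITIVE_MARKERS = ("drop table", "transfer funds", "delete records")

def filter_prompt(prompt: str, user: dict) -> str:
    """Validate a prompt before it reaches the model.

    Rejects sensitive operations for non-admin users; everything else
    proceeds to normal processing.
    """
    lowered = prompt.lower()
    is_sensitive = any(marker in lowered for marker in SENSITIVE_MARKERS)

    if is_sensitive and not user.get("is_admin", False):
        raise PermissionError("Sensitive operation requires an admin or an approval flow")

    return prompt  # validated input may proceed to the model

# Example: a non-admin agent trying to trigger a financial operation is blocked.
try:
    filter_prompt("Please transfer funds to account X", {"id": "agent-7", "is_admin": False})
except PermissionError as e:
    print("Blocked:", e)
```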
Retrieval-Augmented Generation (RAG) pipelines, where AI agents pull data from external knowledge bases or vector databases, add a powerful capability but also expand the attack surface. AISPM must control which sources an agent may query and which documents it is allowed to retrieve and reuse.
Without this perimeter, AI agents risk retrieving and leaking sensitive data or training themselves on information they shouldn’t have accessed in the first place.
“Building AI Applications with Enterprise-Grade Security Using RAG and FGA” provides a practical example of RAG data protection for healthcare.
AI agents aren’t confined to internal reasoning. Increasingly, they act—triggering API calls, executing transactions, modifying records, or chaining tasks across systems.
AISPM must enforce strict controls over these external actions:
Define exactly what operations each AI agent is authorized to perform
Track “on behalf of” chains to maintain accountability for actions initiated by users but executed by agents
Insert human approval steps where needed, especially for high-risk actions like purchases or data modifications
This prevents AI agents from acting outside of their intended scope or creating unintended downstream effects.
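To make this action perimeter concrete, here is a small Python sketch in which an agent's external call is gated by an authorization check and, for high-risk operations, a human approval step; `check_permission` and the role data are hypothetical stand-ins for a real policy service.

```python
from dataclasses import dataclass

@dataclass
class ActionRequest:
    agent_id: str
    on_behalf_of: str   # the human user who initiated the task
    action: str         # e.g. "create_payment"
    resource: str       # e.g. "payments"
    high_risk: bool = False

def check_permission(request: ActionRequest) -> bool:
    # Hypothetical policy lookup; a real system would call its authorization service.
    allowed_actions = {"agent-7": {"read_invoice", "create_payment"}}
    return request.action in allowed_actions.get(request.agent_id, set())

def execute_action(request: ActionRequest, approve_fn):
    if not check_permission(request):
        raise PermissionError(f"{request.agent_id} is not authorized to {request.action}")
    if request.high_risk and not approve_fn(request):
        raise PermissionError("Human approver rejected the action")
    # The "on behalf of" chain is logged for accountability.
    print(f"{request.agent_id} executed {request.action} on behalf of {request.on_behalf_of}")

# Example: a high-risk action requires explicit human approval before it runs.
req = ActionRequest("agent-7", "alice", "create_payment", "payments", high_risk=True)
execute_action(req, approve_fn=lambda r: input(f"Approve {r.action}? (y/n) ") == "y")
```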
Even if all inputs and actions are tightly controlled, AI responses themselves can still create risk, hallucinating facts, exposing sensitive information, or producing inappropriate content.
Response enforcement means:
Scanning outputs for compliance, sensitivity, and appropriateness before delivering them
Applying role-based output filters so that only authorized users see certain information
Ensuring AI doesn’t unintentionally leak internal knowledge, credentials, or PII in its final response
In AI systems, output is not just information—it’s the final, visible action. Securing it is non-negotiable.
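A toy sketch of this output perimeter is shown below, using simple regex redaction and a role check; real deployments would rely on proper DLP, classification, and policy tooling rather than keyword lists.

```python
import re

PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),                 # SSN-like numbers
    re.compile(r"\b[\w.%+-]+@[\w.-]+\.[A-Za-z]{2,}\b"),   # email addresses
]

INTERNAL_ONLY = ("internal roadmap", "api key")

def enforce_response(text: str, viewer_role: str) -> str:
    """Scan and filter a model response before it is shown to the user."""
    # Redact PII regardless of who is asking.
    for pattern in PII_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    # Role-based output filtering: hide internal topics from non-employees.
    if viewer_role != "employee" and any(term in text.lower() for term in INTERNAL_ONLY):
        return "This response contains internal information and cannot be shared."
    return text

print(enforce_response("Contact [email protected] about the internal roadmap.", "customer"))
```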
Together, these four perimeters form the foundation of AISPM. They ensure that every stage of the AI’s operation is monitored, governed, and secured—from input to output, from memory access to real-world action.
Treating AI security as an end-to-end flow—not just a static model check—is what sets AISPM apart from legacy posture management. Because when AI agents reason, act, and interact dynamically, security must follow them every step of the way.
As we can already see, securing AI systems demands a different mindset—one that treats AI reasoning and behavior as part of the attack surface, not just the infrastructure it runs on. AISPM is built on a few key principles designed to meet this challenge:
Effective AI security can’t be bolted on. It must be baked into the AI’s decision-making loop—filtering prompts, restricting memory access, validating external calls, and scanning responses in real-time. External wrappers like firewalls or static code scans don’t protect against AI agents reasoning their way into unintended actions.
The AI itself must operate inside secure boundaries.
AI decisions happen in real-time, which means continuous evaluation is critical.
AISPM systems must track agent behavior as it unfolds, recalculate risk based on new context or inputs, and adjust permissions or trigger interventions mid-execution if necessary.
Static posture reviews or periodic audits will not catch issues as they emerge. AI security is a live problem, so your posture management must be live, too.
AI agents have the ability to chain actions: calling APIs, triggering other agents, or interacting with users. All of this requires extremely granular auditing.
AISPM must log every step in those chains, including which agent acted, on whose behalf, and with what outcome.
This is the only way to maintain accountability and traceability when AI agents act autonomously.
AI systems don’t just act—they delegate tasks to other agents, services, or APIs. Without proper boundaries, trust can cascade unchecked, creating risks of uncontrolled AI-to-AI interactions.
AISPM should enforce strict scoping of delegated authority, apply time-to-live (TTL) limits on trust or delegated access to prevent long-lived permission chains that become impossible to revoke, and enable human review checkpoints for high-risk delegations.
Lastly, as AI ecosystems grow, agents will need to trust—but verify—other agents' claims. AISPM should prepare for this future by supporting cryptographic signatures on AI requests and responses as well as tamper-proof logs that allow agents—and humans—to verify the source and integrity of any action in the chain.
This is how AI systems will eventually audit and regulate themselves, especially in multi-agent environments.
While AISPM is still an emerging discipline, we’re starting to see practical tools and frameworks that help put its principles into action, enabling developers to build AI systems with security guardrails baked into the flow of AI decisions and actions.
Popular AI development frameworks like LangChain and LangFlow are beginning to support integrations that add identity verification and fine-grained policy enforcement directly into AI workflows. These integrations allow developers to:
Authenticate AI agents using identity tokens before allowing actions
Insert dynamic permission checks mid-workflow to stop unauthorized data access or unsafe operations
Apply fine-grained authorization to Retrieval-Augmented Generation (RAG) pipelines, filtering what the AI can retrieve based on real-time user or agent permissions.
These capabilities move beyond basic input validation, enabling secure, identity-aware pipelines in which AI agents must prove what they’re allowed to do at every critical step.
Frameworks designed for AI application development increasingly support structured data validation and access control enforcement. By combining input validation with authorization layers, developers can constrain both what an AI agent receives and what it is permitted to do with it.
This helps protect systems against accidental data leaks and intentional prompt manipulation by ensuring the AI operates strictly within its defined boundaries.
Emerging standards like the Model Context Protocol (MCP) propose structured ways for AI agents to interact with external tools, APIs, and systems. These protocols enable:
Explicit permission checks before AI agents can trigger external operations
Machine identity assignment to AI agents, scoping their capabilities
Real-time authorization rules at interaction points, ensuring actions remain controlled and traceable
This is crucial for keeping AI-driven actions—like API calls, database queries, or financial transactions—accountable and auditable.
The rapid evolution of AI agents is already pushing the boundaries of what traditional security models can handle. As AI systems grow more autonomous—capable of reasoning, chaining actions, and interacting with other agents—AISPM will become foundational, not optional.
One major shift on the horizon is the rise of risk scoring and trust propagation models for AI agents. Just as human users are assigned trust levels based on behavior and context, AI agents will need dynamic trust scores that influence what they’re allowed to access or trigger—especially in multi-agent environments where unchecked trust could escalate risks fast.
AISPM shifts security upstream into the AI’s decision-making process and controls behavior at every critical point.
As AI continues to drive the next wave of applications, AISPM will be critical to maintaining trust, compliance, and safety. The organizations that embrace it early will be able to innovate with AI without compromising security.
Read more about how Permit.io handles secure AI collaboration through a permissions gateway here.
If you have any questions, make sure to join our Slack community, where thousands of devs are building and implementing authorization.