
OS2A: Objective-Subjective Service Assessment for Mobile AIGC

2025-06-25 18:00:05

Table of Links

Abstract and 1. Introduction

1.1 Background

1.2 Motivation

1.3 Our Work and Contributions and 1.4 Organization

  2. Related Work

    2.1 Mobile AIGC and Its QoE Modeling

    2.2 Blockchain for Mobile Networks

  3. Preliminaries

  4. Prosecutor Design

    4.1 Architecture Overview

    4.2 Reputation Roll-up

    4.3 Duplex Transfer Channel

  5. OS2A: Objective-Subjective Service Assessment for Mobile AIGC

    5.1 Inspiration from DCM

    5.2 Objective Quality of the Service Process

    5.3 Subjective Experience of AIGC Outputs

  6. OS2A on Prosecutor: Two-Phase Interaction for Mobile AIGC

    6.1 MASP Selection by Reputation

    6.2 Contract Theoretic Payment Scheme

  7. Implementation and Evaluation

    7.1 Implementation and Experimental Setup

    7.2 Prosecutor Performance Evaluation

    7.3 Investigation of Functional Goals

    7.4 Security Analysis

  8. Conclusion and References

5 OS2A: OBJECTIVE-SUBJECTIVE SERVICE ASSESSMENT FOR MOBILE AIGC

In this section, we elaborate on the design of the OS2A framework. First, inspired by the DCM principle [15], we present the idea behind OS2A. Then, we demonstrate the modeling of the objective service quality and the subjective service experience, respectively.

Fig. 4: A series of AIGC images. Image (a) and the others are generated by Stable Diffusion 2.1 and Craiyon V3, respectively. The other configurations are default.

5.1 Inspiration from DCM

Modeling the quality and experience of AIGC services is intractable. For instance, Fig. 4 shows five AIGC images generated from the same prompt. Comparing Figs. 4(a) and (b), we can observe that the latter has higher “quality,” since its cabin is located in the center and contains more details. From the client’s perspective, however, the experience of Fig. 4(b) might be poor, since the client may strongly prefer photorealistic images like Figs. 4(a) and (c)-(e) over the cartoon-styled Fig. 4(b). Apart from preference, different clients may hold different standards for AIGC services: some are lenient, while others are strict. Returning to the above example, even for Figs. 4(d) and (e), strict clients might complain that the cabin is too low or that an unexpected aurora appears, respectively. Discrete Choice Modeling (DCM), first presented by Daniel McFadden [15], explains the cause of such situations. This theory states that the clients’ utility is determined by two collaborative sources, namely objective factors and subjective factors. The former covers the objective attributes that clients can enjoy, while the latter is shaped by the specific environment and the subjectivity of the clients themselves. Inspired by DCM, we present the concept of Objective-Subjective Service Assessment (OS2A) for mobile AIGC, considering both the objective quality of the AIGC service process and the subjective experience of the AIGC outputs. OS2A is defined as the combination of these two components.

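The exact formula does not survive in this text-only excerpt. Purely as an illustrative sketch of a DCM-style combination (the symbols below are our placeholders, not the authors' notation), one could write

$$ \mathrm{OS2A} = \alpha\, Q_{\mathrm{obj}} + (1 - \alpha)\, E_{\mathrm{sub}}, \qquad \alpha \in [0, 1], $$

where $Q_{\mathrm{obj}}$ denotes the objective quality of the service process (Section 5.2), $E_{\mathrm{sub}}$ the subjective experience of the AIGC outputs (Section 5.3), and $\alpha$ a weighting coefficient.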

:::info Authors:

(1) Yinqiu Liu, School of Computer Science and Engineering, Nanyang Technological University, Singapore ([email protected]);

(2) Hongyang Du, School of Computer Science and Engineering, Nanyang Technological University, Singapore ([email protected]);

(3) Dusit Niyato, School of Computer Science and Engineering, Nanyang Technological University, Singapore ([email protected]);

(4) Jiawen Kang, School of Automation, Guangdong University of Technology, China ([email protected]);

(5) Zehui Xiong, Pillar of Information Systems Technology and Design, Singapore University of Technology and Design, Singapore ([email protected]);

(6) Abbas Jamalipour, School of Electrical and Information Engineering, University of Sydney, Australia ([email protected]);

(7) Xuemin (Sherman) Shen, Department of Electrical and Computer Engineering, University of Waterloo, Canada ([email protected]).

:::


:::info This paper is available on arxiv under CC BY 4.0 DEED license.

:::


How Reputation Roll-Up and Duplex Transfer Channels Improve Blockchain Efficiency

2025-06-25 17:00:02

Table of Links

Abstract and 1. Introduction

1.1 Background

1.2 Motivation

1.3 Our Work and Contributions and 1.4 Organization

  2. Related Work

    2.1 Mobile AIGC and Its QoE Modeling

    2.2 Blockchain for Mobile Networks

  3. Preliminaries

  4. Prosecutor Design

    4.1 Architecture Overview

    4.2 Reputation Roll-up

    4.3 Duplex Transfer Channel

  5. OS2A: Objective-Subjective Service Assessment for Mobile AIGC

    5.1 Inspiration from DCM

    5.2 Objective Quality of the Service Process

    5.3 Subjective Experience of AIGC Outputs

  6. OS2A on Prosecutor: Two-Phase Interaction for Mobile AIGC

    6.1 MASP Selection by Reputation

    6.2 Contract Theoretic Payment Scheme

  7. Implementation and Evaluation

    7.1 Implementation and Experimental Setup

    7.2 Prosecutor Performance Evaluation

    7.3 Investigation of Functional Goals

    7.4 Security Analysis

  8. Conclusion and References

4.2 Reputation Roll-up

Next, we present the layer-2 design of the anchor chain, including the reputation roll-up and duplex transfer channels. Traditionally, all historical opinions are saved on the ledger of each MASP. However, this explosively growing data volume wastes considerable storage resources on MASPs. Given that opinions only serve as evidence for reputation tracing, we intend to offload them from the anchor chain and keep only the most critical bookkeeping messages. As shown in Algorithm 1, we develop the layer-2 reputation roll-up, which contains the following steps.


2) Reputation Compression: When the threshold is reached, RCOs take turns compressing the received transactions. Specifically, these transactions undergo the SHA-256 operation sequentially in chronological order. Then, a roll-up block Br can be created that contains only the hashes, as shown in Fig. 2. Compared with one block containing 1000 transactions, which typically occupies about 500 KB [30], one Br containing 1000 hashes occupies only about 32.5 KB, because each SHA-256 output takes just 256 bits (32 bytes) [44]. Consequently, the data volume consumed for saving historical reputation records can be effectively compressed.
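To make the compression step concrete, the following is a minimal sketch (our illustration, not the paper's implementation) of how an RCO could hash a batch of opinion transactions in chronological order and assemble a roll-up block Br that stores only the 32-byte digests; the transaction fields are hypothetical:

import hashlib
import json

def build_rollup_block(transactions):
    # Hash each serialized transaction with SHA-256, oldest first.
    ordered = sorted(transactions, key=lambda tx: tx["ts"])
    hashes = [
        hashlib.sha256(json.dumps(tx, sort_keys=True).encode()).digest()
        for tx in ordered
    ]  # each digest is 256 bits = 32 bytes
    return {"type": "roll-up", "tx_hashes": hashes}

txs = [{"client": "c1", "masp": "m7", "opinion": 0.92, "ts": i} for i in range(1000)]
block = build_rollup_block(txs)
print(len(block["tx_hashes"]) * 32, "bytes of hash payload")  # 32000 B, i.e. ~32 KB

At roughly 500 bytes per raw transaction, the same 1000 transactions would occupy about 500 KB, so keeping only the digests shrinks the stored volume by more than an order of magnitude.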


Fig. 3: Illustration of atomic fee-ownership transfers. The operations framed by red dotted lines construct one atomic operation that should be executed simultaneously.

4.3 Duplex Transfer Channel

The second layer-2 design is the duplex transfer channel between each MASP-client pair, with which we realize atomic fee-ownership transfers. These channels are virtual and instantiated by a specific smart contract. Within a channel, the participants can conduct multiple rounds of atomic transfers protected by the Hash Lock (HL) protocol. Only the channel initialization and closing need to be recorded on the anchor chain. Since the transfers happen inside channels, low latency can be guaranteed, and the workload of the anchor chain is also alleviated. Next, we introduce the procedure of an atomic fee-ownership transfer on the channel.
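To illustrate the Hash Lock idea (a rough sketch under our own assumptions, not the Prosecutor contract itself): one party commits to H = SHA-256(s) for a secret preimage s, and both the fee transfer and the output-ownership transfer unlock only when the same preimage is revealed, so neither leg of the exchange can complete without the other:

import hashlib
import os

secret = os.urandom(32)                     # preimage chosen by one party
lock = hashlib.sha256(secret).hexdigest()   # published hash lock H

def try_unlock(candidate: bytes) -> bool:
    # Both legs (fee and ownership) release only on a matching preimage.
    return hashlib.sha256(candidate).hexdigest() == lock

assert try_unlock(secret)              # revealing s settles fee and ownership together
assert not try_unlock(os.urandom(32))  # a wrong preimage releases nothing

In an actual channel, the lock would live inside the smart contract, typically with a timeout that refunds the payer if the preimage is never revealed.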


:::info Authors:

(1) Yinqiu Liu, School of Computer Science and Engineering, Nanyang Technological University, Singapore ([email protected]);

(2) Hongyang Du, School of Computer Science and Engineering, Nanyang Technological University, Singapore ([email protected]);

(3) Dusit Niyato, School of Computer Science and Engineering, Nanyang Technological University, Singapore ([email protected]);

(4) Jiawen Kang, School of Automation, Guangdong University of Technology, China ([email protected]);

(5) Zehui Xiong, Pillar of Information Systems Technology and Design, Singapore University of Technology and Design, Singapore ([email protected]);

(6) Abbas Jamalipour, School of Electrical and Information Engineering, University of Sydney, Australia ([email protected]);

(7) Xuemin (Sherman) Shen, Department of Electrical and Computer Engineering, University of Waterloo, Canada ([email protected]).

:::


:::info This paper is available on arxiv under CC BY 4.0 DEED license.

:::


Meet Mojo: The Language That Could Replace Python, C++, and CUDA

2025-06-25 15:28:39

Why Mojo Changes Everything

So here's the thing - Python is amazing, but it's painfully slow.

You know it, I know it, everyone knows it.

Enter Mojo, launched in May 2023 by the brilliant minds at Modular AI.

This isn't just another programming language - it's Python's superhero transformation.

Created by Chris Lattner (yes, the Swift and LLVM genius), Mojo was born from a simple frustration: why should we choose between Python's ease and C++'s speed?

Welcome to Mojo - a programming language that enables fast & portable CPU+GPU code on multiple platforms.

But wait, there's more.

Your existing Python code runs in Mojo without changing a single line.

Zero.

Nada.

Nothing changes!

:::tip Think of Mojo as Python that hit the gym, learned martial arts, and came back 1000x stronger while still being the same friendly person you know and love.

:::

The team at Modular didn't set out to build a language - they needed better tools for their AI platform, so they built the ultimate tool.

Not only does Mojo work with Python, you can also access low-level programming for GPUs, TPUs, and even ASICs.

This means you will no longer need C, C++, CUDA, or Metal to optimize Generative AI and LLM workloads.

Adopt Mojo - and the CUDA moat is gone, and hardware-level programming is simplified.

How cool is that?

Your First Taste of Mojo

Modular: Mojo🔥 - It's finally here!

Let's start with something you already know:

fn main():
    print("Hello, Mojo! 🔥")

Looks like Python, right?

That's because it literally is Python syntax.

Your muscle memory is already trained.

Here's where it gets different - variables with superpowers:

fn main():
    let name = "Mojo"        # This is immutable and blazing fast
    var count: Int = 42      # This is mutable with type safety
    let pi = 3.14159         # Smart enough to figure out the type
    print("Language:", name, "Count:", count, "Pi:", pi)

See that let keyword?

It's telling the compiler "this never changes," which unlocks serious optimization magic.

The var keyword says "this might change," but you can add types for extra speed when you need it.

Now here's where it gets interesting - dual function modes:

fn multiply_fast(a: Int, b: Int) -> Int:
    return a * b  # Compiled, optimized, rocket-fast

def multiply_python(a, b):
    return a * b  # Good old Python flexibility

fn main():
    print("Fast:", multiply_fast(6, 7))
    print("Flexible:", multiply_python(6, 7))

Use fn when you want maximum speed with type safety.

Use def when you want Python's flexibility.

You can literally mix and match in the same program.

Start with def, optimize with fn later.

Here's an interesting loop:

fn main():
    let numbers = List[Int](1, 2, 3, 4, 5)
    var total = 0

    for num in numbers:
        total += num[]  # [] dereferences the loop's reference to read the value

    print("Numbers:", numbers, "Sum:", total)

    # This loop processes a million items faster than Python can blink
    for i in range(1000000):
        pass  # Automatically vectorized by the compiler

That explicit [] syntax might look weird, but it simply dereferences the loop variable: iterating a List yields references, and [] reads the value behind each one.

The Game-Changing Features of Mojo

Mojo is a potential high-return investment!

There are reasons that Mojo, when fully developed, could take over the entire world.

Zero-Cost Python Compatibility (Your Programming Knowledge is Safe)

Remember all those Python libraries you love? They still work:

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

fn main():
    let data = np.array([[1, 2], [3, 4], [5, 6]])
    let df = pd.DataFrame(data, columns=['x', 'y'])
    let model = LinearRegression()
    print("All your favorite libraries work instantly!")

This is huge.

No migration headaches, no rewriting millions of lines of code.

Your NumPy arrays, pandas DataFrames, and scikit-learn models work exactly like they always have.

The difference?

Now they can run alongside code that's 1000x faster when you need it.

SIMD Vectorization Made Simple (Parallel Processing for Mortals)

Check this out - automatic parallel processing:

from algorithm import vectorize
from sys.info import simdwidthof

fn vector_magic():
    alias size = 1000000
    var a = DTypePointer[DType.float32].alloc(size)
    var b = DTypePointer[DType.float32].alloc(size)
    var result = DTypePointer[DType.float32].alloc(size)

    @parameter
    fn vectorized_add[width: Int](i: Int):
        let a_vec = a.load[width=width](i)
        let b_vec = b.load[width=width](i)
        result.store[width=width](i, a_vec + b_vec)

    vectorize[vectorized_add, simdwidthof[DType.float32]()](size)

That @parameter decorator is doing compile-time magic - it creates specialized versions of your function for different CPU architectures.

Your code automatically uses all available CPU cores and SIMD instructions without you thinking about it.

This single function can be 8x to 128x faster than equivalent Python code.

And many other benchmarks are going through the roof!

GPU Programming Without the Headache

Want to use your GPU?

Here's how simple it is:

from gpu import GPU
from tensor import Tensor

fn gpu_power():
    @gpu.kernel
    fn matrix_multiply(a: Tensor[DType.float32], b: Tensor[DType.float32]) -> Tensor[DType.float32]:
        return a @ b  # Just matrix multiplication, but on GPU

    let big_matrix_a = Tensor[DType.float32](Shape(2048, 2048))
    let big_matrix_b = Tensor[DType.float32](Shape(2048, 2048))
    let result = matrix_multiply(big_matrix_a, big_matrix_b)

No CUDA programming, no memory management nightmares, no kernel configuration headaches.

The @gpu.kernel decorator automatically generates optimized GPU code for NVIDIA, AMD, and Apple GPUs.

The same code runs on any GPU without changes.

This is revolutionary and a huge improvement over existing tooling!

Parametric Programming (Templates Done Right)

Now Mojo gets really clever:

struct SmartMatrix[rows: Int, cols: Int, dtype: DType]:
    var data: DTypePointer[dtype]

    fn __init__(inout self):
        self.data = DTypePointer[dtype].alloc(rows * cols)

    fn get(self, row: Int, col: Int) -> SIMD[dtype, 1]:
        return self.data.load(row * cols + col)

fn show_parametric_power():
    let small_int_matrix = SmartMatrix[10, 10, DType.int32]()
    let big_float_matrix = SmartMatrix[1000, 500, DType.float64]()
    # Each gets its own optimized code generated at compile time

The compiler creates completely different optimized code for each combination of parameters.

Your 10x10 integer matrix gets different optimizations than your 1000x500 float matrix.

This is C++ template-level performance with much cleaner and more readable syntax.

Memory Safety Without Garbage Collection

Here's how Mojo prevents memory leaks and crashes:

struct SafePointer[T: AnyType]:
    var data: Pointer[T]

    fn __init__(inout self, value: T):
        self.data = Pointer[T].alloc(1)
        self.data.store(value)

    fn __moveinit__(inout self, owned other: Self):
        self.data = other.data
        other.data = Pointer[T]()  # Original pointer is now empty

    fn __del__(owned self):
        if self.data:
            self.data.free()  # Automatic cleanup

This is Rust-style memory safety with Python-style ease of use.

No garbage collection pauses, no memory leaks, no use-after-free bugs.

Memory gets cleaned up exactly when you expect it to, not when some garbage collector feels like it.

Adaptive Compilation (The AI That Optimizes Your Code)

This is serious innovation!

@adaptive
fn smart_algorithm(data: List[Int]) -> Int:
    var sum = 0
    for item in data:
        sum += item[]
    return sum

The @adaptive decorator tells the compiler to generate multiple versions of your function.

The runtime system profiles your actual usage and picks the fastest version for your specific data patterns.

Your code gets smarter the more it runs!

Advanced Features That Make Mojo Unstoppable

The AI Art Generator has a good imagination!


Compile-Time Computation

Want to move work from runtime to compile time?

Easy:

@parameter
fn compile_time_fibonacci(n: Int) -> Int:
    @parameter
    if n <= 1:
        return n
    else:
        return compile_time_fibonacci(n - 1) + compile_time_fibonacci(n - 2)

fn main():
    alias fib_result = compile_time_fibonacci(15)
    print("Fibonacci 15:", fib_result)  # Calculated while compiling

Complex calculations happen during compilation, not when your program runs.

This means zero runtime cost for things that can be figured out ahead of time.

This is a huge, forward-thinking leap in programming language design.

I expect other programming languages to follow suit!

Trait System for Generic Programming

Traits let you write code that works with many different types:

trait Addable:
    fn __add__(self, other: Self) -> Self

struct Vector2D(Addable):
    var x: Float32
    var y: Float32

    fn __add__(self, other: Self) -> Self:
        return Vector2D(self.x + other.x, self.y + other.y)

fn add_anything[T: Addable](a: T, b: T) -> T:
    return a + b  # Works with any type that implements Addable

Write once:

Use with any compatible type:

Get optimized code for each specific type.

Direct SIMD Operations

Want to talk directly to your CPU's vector units?

fn simd_playground():
    let data = SIMD[DType.float32, 8](1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0)
    let squared = data * data
    let fma_result = data.fma(data, data)  # Fused multiply-add
    let shuffled = data.shuffle[4, 5, 6, 7, 0, 1, 2, 3]()

Direct access to CPU vector instructions with type safety.

Operations that would take 8 CPU cycles now take 1!

The Mojo Standard Library: Simplicity Meets Practicality

There's genuine change coming into this scenario!

The standard library includes features for all kinds of tasks.

List[T] gives you dynamic arrays that are both type-safe and lightning fast.

Dict[K, V] provides hash tables optimized for real-world usage patterns.

String handles both ASCII and Unicode efficiently without the usual performance penalties.

Tensor[dtype] is your gateway to GPU-accelerated numerical computing.

Memory Management Made Simple

DTypePointer[dtype] gives you low-level control with high-level safety.

Buffer[T] provides automatic memory management for temporary data.

Reference[T] implements zero-copy borrowing for maximum efficiency.

An Algorithm Library That Actually Helps

vectorize automatically spreads your loops across all available CPU cores.

parallelize distributes work across threads with smart load balancing.

sort provides specialized sorting algorithms for different data types and sizes.

Math and Numerics Built for Performance

Native support for complex numbers, arbitrary precision math, and linear algebra.

Automatic differentiation for machine learning without external dependencies.

Statistical functions that are both accurate and blazingly fast.

System Integration Without Compromise

File I/O that automatically optimizes for SSD vs HDD vs network storage.

Network programming with async/await support for high-performance servers.

Cross-platform threading that actually works consistently.

Use Cases Where Mojo Can Dominate

Mojo has more use cases than you might expect!

Machine Learning That Scales

  • Training models with 10-1000x faster data preprocessing (some sources claim 35000x).
  • You can now preprocess datasets that used to take hours in minutes.
  • Real-time inference systems handling millions of requests per second on regular hardware.
  • Computer vision processing 4K video streams in real-time on edge devices.
  • The performance gains mean you can do more with less expensive hardware.

Scientific Computing Revolution

  • Climate models that used to need supercomputers now run on workstations.
  • Protein folding simulations with unprecedented speed and accuracy.
  • Financial risk models with microsecond precision for high-frequency trading.
  • Quantum simulations that approach the performance of actual quantum computers (for the foreseeable future, at least).

High-Performance Web Services

  • API servers handling millions of concurrent connections without breaking a sweat.
  • Real-time analytics processing terabytes of data per hour.
  • Game servers supporting thousands of players with sub-millisecond latency.
  • Cryptocurrency mining and blockchain validation at maximum theoretical efficiency.

Edge Computing and IoT Magic

  • Smart cameras that perform real-time object detection and tracking.
  • Autonomous vehicle systems with safety-critical performance requirements.
  • Industrial automation with real-time sensor processing and control.
  • Medical devices that perform complex computations within strict power budgets.

Financial Technology Transformation

  • Algorithmic trading systems with nanosecond execution times.
  • Risk assessment models process market data as it arrives.
  • Fraud detection analyzes transaction patterns instantly.
  • DeFi protocols with optimized smart contract execution.

The Blockchain and Crypto Revolution

  • Blazing-fast performance allows developers to replace Golang with Mojo.
  • Crypto mining software gets a huge boost with the ability to manipulate ASICs directly.
  • Expect Mojo SDKs for all crypto mining frameworks.
  • The memory-safety of Mojo, borrowed from Rust, should accelerate adoption.

Quantum AI Adoption

  • The biggest revolution in quantum computing is Quantum AI, where Mojo is the perfect match.
  • Existing Python libraries, such as IBM Qiskit and Google Cirq, are fully compatible.
  • Quantum Computation can be simulated easily with GPUs, where Mojo is king.
  • Quantum Computing performance could see 100x-10000x performance boosts.

Generative AI Acceleration

  • DeepSeek was able to run cheaply because of low-level GPU optimization.
  • With Mojo, this low-level optimization is available to all.
  • The CUDA moat could disappear overnight.
  • The smartest thing Nvidia could do is to adopt Mojo and MAX themselves!

Getting Started: Your Journey Begins Now

No Time Like the Present!

Installation is Surprisingly Easy

Mojo currently works on Linux (Ubuntu 18.04+, CentOS 7+) and macOS (10.15+).

Windows support is coming soon - the team is working on it.

And when that happens - I see worldwide adoption.

And in the long term, I see mobile, edge, and IoT deployment as well!

You'll need 8 GB of RAM minimum, 16 GB recommended for smooth compilation.

Installation takes less than 5 minutes with the official installer.

Setting Up Your Development Environment

# Install the Modular SDK
curl -fsSL https://get.modular.com | sh -
modular install mojo

# Check if everything works
mojo --version
mojo run --help

A fully featured LLDB debugger is included with Mojo, along with beautifully integrated code completion support with hover and doc hints.

The VS Code extension gives you syntax highlighting, error checking, and integrated debugging.

Creating Your First Project

# Start a new project
mkdir awesome-mojo-project && cd awesome-mojo-project
mojo package init my-package

# Build and run
mojo build main.mojo
./main

The package system handles dependencies, versioning, and cross-platform distribution automatically.

Testing Your Code

from testing import assert_equal

fn test_addition():
    assert_equal(2 + 3, 5)
    print("Math still works!")

fn main():
    test_addition()

Built-in testing framework includes performance benchmarking capabilities.

The Mojo-Modular-MAX GitHub Ecosystem

From the Modular Website

Official Repositories

Open Source Components

  • As of February 2025, the Mojo compiler is closed-source with an open-source standard library.
  • The standard library uses Apache 2.0 license, so you can contribute and modify freely.
  • The company plans to open-source the entire language once a more mature version is ready.

MAX Platform: Enterprise AI Infrastructure

  • The MAX platform will completely revolutionize the current Gen AI infrastructure.
  • Costs will decrease, and hardware optimization can now be done by LLMs overseen by human experts.
  • The same language can be used for different hardware (see below).

Multi-Hardware Magic

  • The same code runs on CPUs, GPUs, TPUs, and custom AI chips without modification.
  • Automatic profiling finds the optimal hardware configuration for your workload.
  • Dynamic load balancing distributes work across mixed hardware environments.

Model Optimization Pipeline

  • Automatic quantization shrinks models by 75% while maintaining accuracy.
  • Graph optimization eliminates redundant operations and fuses them for speed.
  • Memory layout optimization reduces cache misses and improves data flow.

:::tip MAX is not just an architecture - it’s a performance beast!

:::

Production Deployment Tools

  • Kubernetes-native deployment is available with automatic scaling based on demand.
  • A/B testing framework is also provided for comparing model performance in production.
  • Real-time monitoring and alerting for performance issues.

Features Introduced in 2025

  • Enhanced large language model support with efficient attention mechanisms.
  • Edge computing optimizations for mobile and IoT devices.
  • Seamless integration with major cloud providers.
  • Multi-tenant support for serving multiple models from a single infrastructure.

The Reality Check: What Mojo Can't Do Yet - But Will With Time

Reality Check But Also Promise For the Future!

Platform Limitations

  • Windows support is still in development, which limits enterprise adoption.
  • In my opinion, once Windows support is available, Mojo adoption will explode.
  • And you can already run Mojo on Windows with the Windows Subsystem for Linux (WSL)!
  • Mobile platforms (iOS and Android) are not supported yet for edge deployment.
  • Some cloud providers don't have Mojo-optimized instances available.

Ecosystem Growing Pains

  • The third-party library ecosystem is tiny compared to Python's vast repository.
  • Documentation has gaps, especially for advanced features.
  • Stack Overflow has fewer Mojo answers than you'd like.

Tooling Limitations

  • IDE support is mainly VS Code with basic functionality.
  • Profiling and debugging tools are less mature than established languages.
  • Package management is newer and less feature-rich than pip or conda.

Learning Curve Challenges

  • Functions can be declared using either fn or def, with fn ensuring strong typing - this duality confuses newcomers.
  • Understanding when to use let vs var vs Python-style variables takes practice.
  • Memory ownership concepts are new for garbage-collected language developers.

Corporate Dependencies

  • Heavy reliance on Modular's roadmap for language evolution.
  • Uncertainty about long-term open-source commitment vs commercial interests.
  • Potential vendor lock-in for projects using MAX platform features heavily.

Performance Gotchas

  • Some Python libraries haven't been optimized for Mojo's characteristics yet.
  • JIT compilation can impact startup time for short-running scripts.
  • Memory usage can be higher than Python in certain scenarios.

The Future is Bright: What's Coming Next

More AI and fewer people, that’s the future according to the AI Agents hype…


Python and Mojo remind me of C and C++, but for Generative AI instead of OOP.

Short-Term Wins (2025-2027)

Windows and mobile support will unlock enterprise and edge markets.

Universities will start teaching Mojo, creating a new generation of developers.

Major AI companies will replace Python bottlenecks with Mojo implementations.

The ecosystem will hit critical mass with hundreds of production-ready libraries.

Medium-Term Transformation (2027-2030)

Mojo aims to become a full superset of Python with its own dynamically growing tool ecosystem.

New AI/ML projects will default to Mojo for production performance.

Scientific computing will gradually migrate from Fortran and C++ to Mojo.

Cloud providers will offer Mojo-optimized instances with specialized acceleration.

Long-Term Revolution (2030+)

Mojo could become the go-to language for performance-critical applications everywhere.

Hardware manufacturers will design chips with Mojo-specific features.

The language will influence next-generation programming language design.

Schools will teach Mojo as the primary computational language.

Potential Challenges Ahead

Competition from Julia, Rust, Carbon, and other performance languages is limited, and the reason I say limited is Mojo’s support for Python.

But, Mojo needs to balance Python compatibility with language evolution needs.

The open-source community and the commercial platform requirements need to be balanced.

Diverse hardware architectures should be supported as well as optimization strategies.

Conclusion: Why Mojo Changes Everything

Here's the bottom line: Mojo eliminates the false choice between system fragmentation and system performance.

Your Python skills remain valuable - they just become 10000x more powerful.

Performance improvements of 10-10000x open up applications that were previously impossible.

The unified CPU+GPU programming model simplifies modern AI and scientific computing.

Even in blockchain and crypto mining, direct access to GPUs and ASICs gives Mojo a huge advantage.

Chris Lattner's track record with Swift and LLVM gives confidence in Mojo's future.

The timing is perfect - AI demands, edge computing needs, and developer productivity requirements are converging.

:::tip And Generative AI eating the world is the perfect use-case for Mojo.

:::

I believe that developing countries such as India should adopt Mojo instead of CUDA to build their LLMs, LMMs, and SLMs.

Not only does it make us less reliant on Nvidia, but computational costs will also decrease because of the higher performance.

The Rust memory-safety feature and the Python compatibility are the icing and the cherry on the cake.

Once Mojo is available for Windows, I see an accelerated takeover in the entire programming industry.

And the main reason for this is the 100% support for pure Python.

If Modular does things right and open-sources the entire codebase:

I see Mojo having a huge impact.

Worldwide.

If you haven’t started with Mojo, do so today!

:::tip The real question isn't whether Mojo will succeed.

It's whether you'll be ready when it transforms your industry.

And it’s no longer a question of if, but when.

:::

Yes - Mojo has a very bright future!

Unless attributed to other sources, images were generated by Leonardo.ai at this link: https://app.leonardo.ai/

Claude Sonnet 4 was used in this article, with heavy editing; the model is available here: https://claude.ai/


The Data Delusion: Why Brands Trust Dashboards More Than People - And Why That’s a Mistake

2025-06-25 15:23:17

"Data-driven" has become a badge of honor in modern marketing and product development. Dashboards are filled with charts, click-through rates, heatmaps, and A/B results. But in the rush to optimize what we can measure, many brands have lost sight of what they can't: emotion, hesitation, intent, and trust.

In an age when customer behavior evolves by the hour, decisions made purely on metrics are often misguided. Numbers reveal what happened but not why. And in the gap between those two realities lies some of the most valuable insight a business can find.

What the Dashboards Miss


There’s a dangerous assumption that if something performs well numerically, it must be working holistically. But history has shown that data can deceive. High-performing pages in terms of clicks or engagement may still underdeliver in conversion, brand perception, or loyalty, not because they’re poorly designed but because they miss the emotional mark.

Take Amazon’s Fire Phone, for example. Early engagement and traffic were promising, but customer feedback revealed that the product felt gimmicky, lacked essential app support, and failed to connect emotionally with users. Despite strong initial visibility, it became one of Amazon’s most expensive product flops.

Similarly, Tesco’s Fresh & Easy chain launched in the U.S. with extensive data backing, modeled after successful UK stores. But the format—smaller stores with self-service checkout and ready-made meals—confused and frustrated American shoppers who expected a more personal, larger-scale grocery experience. The stores underperformed dramatically and were eventually shuttered.

These are reminders that even when the numbers look good on paper, emotional disconnects can undermine the entire experience.

Quantitative testing methods, like A/B testing, were never built to capture how a user feels when navigating a product page. They can't tell when a shopper pauses before clicking "buy" or when an image triggers confusion, not confidence. These nuances don’t show up in analytics tools, but they influence behavior all the same.

The Human Layer: Where Emotion Shapes Action


Today’s most influential customers, especially Gen Z and mobile-first users, tend to make fast, emotionally driven choices. They respond instantly and intuitively to how a brand makes them feel. These emotional impressions form in seconds and often outweigh rational evaluation. A perfectly structured product page can still fall flat if it lacks authenticity, relatability, or emotional resonance—because what feels right often matters more than what looks right.

In mobile-first environments, users often decide within milliseconds whether a brand experience feels intuitive or off. Micro-interactions, like hover hesitation, scroll speed, or a quick swipe away, offer valuable signals about trust and clarity yet rarely surface in traditional analytics dashboards.

Emotion may be irrational, but it’s far from random. It reveals itself in the flicker of a facial expression, a pause in a voice response, or the way a user scrolls—slowly, quickly, or not at all—through a carousel of images. These subtle cues often guide decision-making, yet they’re exactly what most analytics tools fail to capture.

Tools Bridging the Gap


A new generation of platforms is stepping in to close this gap—tools designed to bring emotional intelligence into the optimization process. Instead of relying on metrics alone, they blend qualitative feedback, behavioral signals, and AI to reveal what users are actually experiencing in real time.

  • ProductPinion helps brands gather real-time video reactions from shoppers. The system decodes emotional patterns and reveals why certain visuals or copy elements resonate (or don’t).
  • Trymata focuses on usability testing, capturing how users behave while completing tasks and recording their voice feedback as they do.
  • UserTesting scales human feedback by connecting brands with real people who narrate their thoughts while interacting with digital experiences.
  • Creovai focuses on real-time and post-call intelligence by analyzing every customer conversation, tracking sentiment, friction, and tone to guide support teams and improve customer experience at scale.

These tools enhance traditional analytics rather than replace them. They capture the emotional context and behavioral nuance that data alone often misses.

Why Real Feedback Feeds Statistical Significance


One of the core problems with traditional feedback systems like post-call NPS and CSAT surveys is their participation rate. According to CX Today, survey engagement has dropped to just 5%—meaning 95% of customers never respond. Even within that narrow segment of respondents, the data tends to lean toward emotional extremes: customers who are either very happy or very upset. This creates a "U-shaped" curve of opinions, missing the voices of the vast majority with moderate or nuanced experiences.
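To see how a U-shaped response pattern distorts the picture, consider this toy simulation (entirely illustrative; the response rates are invented to roughly match the 5% figure above):

import random

random.seed(0)
# True sentiment of 100,000 customers on a 0-10 scale.
population = [min(10, max(0, random.gauss(6.5, 2.0))) for _ in range(100_000)]

def responds(score):
    # Extreme experiences answer surveys far more often than moderate ones.
    return random.random() < (0.25 if score <= 2 or score >= 9 else 0.02)

respondents = [s for s in population if responds(s)]

print(f"response rate: {len(respondents) / len(population):.1%}")   # ~5%
print(f"true mean:     {sum(population) / len(population):.2f}")
print(f"survey mean:   {sum(respondents) / len(respondents):.2f}")  # pulled toward the tails

The few customers who answer cluster at the extremes, so the survey average drifts away from the true center of sentiment, which is exactly the misalignment described above.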

This isn’t just a data issue. It’s a misalignment with reality. When businesses rely heavily on skewed feedback, they end up shaping strategies around edge cases rather than the true center of customer sentiment. And even when testing isn't biased, it's often too slow to catch up. Traditional A/B testing requires time, traffic, and statistically significant results to act. But in fast-moving digital environments, waiting for confidence intervals often means missing the response window.

Traditional surveys also lack meaningful context. Creovai's research suggests that only about 20% of respondents leave open-text comments to explain their scores. That means most customer input boils down to a single digit—useful for trend spotting but useless for problem-solving. It's the difference between spotting smoke and finding the fire.

That’s where human feedback becomes invaluable. Data alone often delivers broad or ambiguous patterns. But real shopper feedback offers precision—and context. With behavior-driven platforms, brands can hear how customers express hesitation, trust, or objection in their own words and tone. That emotional clarity is nearly impossible to replicate through dashboards alone.

While data shows trends, real user opinions reveal the triggers behind them. Numbers might tell you a page has a high drop-off rate, but it takes a human voice to explain that the call-to-action felt too pushy or that the imagery lacked credibility. In a world of instant reactions and emotionally charged decision-making, understanding what people think isn’t just a nice-to-have—it’s how brands stay relevant.

As Andri Sadlak, founder of ProductPinion, puts it:


“Data tells you what happened. But real consumer reactions tell you why it happened. That emotional layer—the pause before a scroll, the hesitation in a voice, the face that lights up at a product image—can never be captured in a spreadsheet. With ProductPinion, we built a system that decodes emotional behavior at scale because optimization without empathy is optimization in the dark.”

Many teams are turning to AI to overcome these blind spots, as it clarifies complex, emotional user behavior.

What AI Actually Brings to Testing


AI’s real value in this space is not automation—it’s amplification. It helps extract deeper meaning from feedback faster and at scale. Modern behavior-driven platforms use AI to:

  • Aggregate and process large-scale feedback from video recordings, survey transcripts, voice inputs, and interaction logs.
  • Detect emotional and behavioral patterns across user segments, identifying repeated moments of hesitation, frustration, or delight.
  • Prioritize what matters most by weighing user reactions by frequency and emotional impact or proximity to key conversion steps.
  • Translate findings into actionable design changes, messaging shifts, or UX refinements—sometimes within the same sprint.

With these new possibilities, companies can expand the scope of their testing. With AI, teams can assess which image performs better and why users trust one message over another. They can also test text length, color, emotional tone, and perceived authenticity.

Importantly, this kind of testing goes beyond surface metrics. AI allows brands to optimize for trust, relatability, and emotional resonance—all of which are difficult to measure with clicks alone. It enables iterative, empathy-driven development without sacrificing speed or scale.

When used correctly, AI doesn't replace human insight. It sharpens it.

Rethinking Data-Driven


The problem isn’t outdated feedback methods but the illusion that more data equals deeper understanding. Being data-driven isn’t a flaw, but it becomes one when it replaces human experience instead of enhancing it. The future of optimization lies in blending the precision of data with the power of empathy.

Platforms like ProductPinion, Trymata, and UserTesting are helping brands shift from measuring performance to interpreting behavior. The next wave of competitive advantage won’t come from tracking what people click but from uncovering the emotions and intent behind each action.

Future-leading brands will move beyond behavior analysis to truly grasp what drives decisions. They’ll combine data with empathy, dashboards with dialogue, and optimization with intuition. Because, in the end, what customers remember most is the impression your product left on them.


Delegating AI Permissions to Human Users with Permit.io’s Access Request MCP

2025-06-25 15:21:13

As AI agents become more autonomous and capable, their role is shifting from passive assistants to proactive actors. Today’s large language models (LLMs) don’t just generate text—they execute tasks, access APIs, modify databases, and even control infrastructure.

AI agents are taking actions that were once reserved strictly for human users, whether it’s scheduling a meeting, deploying a service, or accessing a sensitive document.

When agents operate without guardrails, they can inadvertently make harmful or unauthorized decisions. A single hallucinated command, misunderstood prompt, or overly broad permission can result in data leaks, compliance violations, or broken systems.

That’s why integrating human-in-the-loop (HITL) workflows is essential for agent safety and accountability.

Permit.io’s Access Request MCP is a framework that lets AI agents request sensitive actions while humans remain the final decision-makers.

Built on Permit.io and integrated into popular agent frameworks like LangChain and LangGraph, this system lets you insert approval workflows directly into your LLM-powered applications.

In this tutorial, you’ll learn:

  • Why delegating sensitive permissions to humans is critical for trustworthy AI,
  • How Permit.io’s Model Context Protocol (MCP) enables access request workflows,
  • How to build a real-world system that blends LLM intelligence with human oversight—using LangGraph’s interrupt() feature.

Before we dive into our demo application and implementation steps, let’s briefly discuss the importance of delegating AI permissions to humans.

Why Delegating AI Permissions to Humans Is Critical

AI agents are powerful, but, as we all know, they’re not infallible.

They follow instructions, but they don’t understand context like humans do. They generate responses, but they can’t judge consequences. And when those agents are integrated into real systems—banking tools, internal dashboards, infrastructure controls—that’s a dangerous gap.

In this context, everything that can go wrong is pretty clear:

  • Over-permissive agents: LLMs may be granted access to tools they shouldn’t touch, either by design or accident.
  • Hallucinated tool calls: Agents can fabricate commands, arguments, or IDs that never existed.
  • Lack of auditability: Without human checkpoints, there’s no clear record of who approved what, or why.

Delegation is the solution.

Instead of giving agents unchecked power, we give them a protocol: “You may ask, but a human decides.”

By introducing human-in-the-loop (HITL) approval at key decision points, you get:

  • Safety: Prevent irreversible actions before they happen.
  • Accountability: Require explicit human sign-off for high-stakes operations.
  • Control: Let people set the rules—who can approve, what can be approved, and when.

It’s the difference between an agent doing something and an agent requesting to do something.

And it’s exactly what Permit.io’s Access Request MCP enables.

Permit.io’s Access Request MCP

The Access Request MCP is a core part of Permit.io’s Model Context Protocol (MCP)—a specification that gives AI agents safe, policy-aware access to tools and resources.

Think of it as a bridge between LLMs that want to act and humans who need control.

What it does

Permit’s Access Request MCP enables AI agents to:

  • Request access to restricted resources (e.g., "Can I access this restaurant?")
  • Request approval to perform sensitive operations (e.g., "Can I order this restricted dish?")
  • Wait for human input before proceeding—via LangGraph’s interrupt() mechanism
  • Log the request and decision for auditing and compliance

Behind the scenes, it builds on Permit.io’s authorization capabilities to support these request-and-approval flows.

Plug-and-play with LangChain and LangGraph

Permit’s MCP is integrated directly into the LangChain MCP Adapter and LangGraph ecosystem:

  • You can expose Permit Elements as LangGraph-compatible tools.
  • You can pause the agent with interrupt() when sensitive actions occur.
  • You can resume execution based on real human decisions.

It’s the easiest way to inject human judgment into AI behavior—no custom backend needed.

With the framework and its benefits covered, let’s get into our demo application.

What We’ll Build - Demo Application Overview

In this tutorial, we’ll build a real-time approval workflow in which an AI agent can request access or perform sensitive actions, but only a human can approve them.

Scenario: Family Food Ordering System

To see how Permit’s MCP can help enable an HITL workflow in a user application, we’ll model a food ordering system for a family:

  • Parents can access and manage all restaurants and dishes.
  • Children can view public items, but must request access to restricted restaurants or expensive dishes.
  • When a child submits a request, a parent receives it for review and must explicitly approve or deny it before the action proceeds.

This use case reflects a common pattern: “Agents can help, but humans decide.”

Tech Stack

We’ll build this HITL-enabled agent using:

  • Permit.io - Handles authorization, roles, policies, and approvals
  • Permit MCP Server - Exposes Permit workflows as tools that the agent can use
  • LangChain MCP Adapters - Bridges Permit’s MCP tools into LangGraph & LangChain
  • LangGraph - Orchestrates the agent’s workflow with interrupt() support
  • Gemini 2.0 Flash - Lightweight, multimodal LLM used as the agent’s reasoning engine
  • Python - The glue holding it all together

You’ll end up with a working system where agents can collaborate with humans to ensure safe, intentional behavior—using real policies, real tools, and real-time approvals.

A repository with the full code for this application is available here.

Step-by-Step Tutorial

In this section, we’ll walk through how to implement a fully functional human-in-the-loop agent system using Permit.io and LangGraph.

We’ll cover:

  • Modeling Permissions with Permit
  • Setting Up the Permit MCP Server
  • Creating a LangGraph + LangChain MCP Client
  • Adding Human-in-the-Loop with interrupt()
  • Running the Full Workflow

Let’s get into it.

Modeling Permissions with Permit

We’ll start by defining your system’s access rules inside the Permit.io dashboard. This lets you model which users can do what, and what actions should trigger an approval flow.

Create a ReBAC Resource

Navigate to the Policy page from the sidebar, then:

  • Click the Resources tab

  • Click Create a Resource

  • Name the resource: restaurants

  • Under ReBAC Options, define two roles:

  • parent

  • child-can-order

  • Click Save

Now, go to the Policy Editor tab and assign permissions:

  • parent: full access (create, read, update, delete)

  • child-can-order: read
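To see what this policy means at enforcement time, here is a hedged sketch using Permit's Python SDK against the local PDP configured later in this tutorial (the user and instance keys match the setup below; treat the exact values as placeholders):

import asyncio
from permit import Permit

permit = Permit(
    pdp="http://localhost:7766",      # local PDP URL, configured later in .env
    token="<your-permit-api-key>",    # placeholder
)

async def main():
    # joe will hold restaurants:pizza-palace#parent, so full access is expected.
    print(await permit.check("joe", "delete", "restaurants:pizza-palace"))
    # henry has no role on the instance, so this stays False until his
    # access request is approved.
    print(await permit.check("henry", "read", "restaurants:pizza-palace"))

asyncio.run(main())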

Set Up Permit Elements

Go to the Elements tab from the sidebar. In the User Management section, click Create Element.

  • Configure the element as follows:

  • Name: Restaurant Requests

  • Configure elements based on: ReBAC Resource Roles

  • Resource Type: restaurants

  • Role permission levels

    • Level 1 – Workspace Owner: parent
    • Assignable Roles: child-can-order
  • Click Create

  • In the newly created element card, click Get Code and take note of the config ID: restaurant-requests. We’ll use this later in the .env file.

Add Operation Approval Elements

  • Create a new Operation Approval element:

  • Name: Dish Approval

  • Resource Type: restaurants

  • Click Create

  • Then create an Approval Management element:

  • Name: Dish Requests

  • Click Get Code and copy the config ID: dish-requests.

Add Test Users & Resource Instances

  • Navigate to Directory > Instances

  • Click Add Instance

  • Resource Type: restaurants

  • Instance Key: pizza-palace

  • Tenant: Default Tenant (or your working tenant)

  • Switch to the Users tab

  • Click Add User

  • Key: joe

  • Instance Access: restaurants:pizza-palace#parent

  • Click Save

  • Create another user with the key henry

  • Don’t assign a role

Once Permit is configured, we’re ready to clone the MCP server and connect your policies to a working agent.

Setting Up the Permit MCP Server

With your policies modeled in the Permit dashboard, it’s time to bring them to life by setting up the Permit MCP server—a local service that exposes your access request and approval flows as tools that an AI agent can use.

Clone and Install the MCP Server

Start by cloning the MCP server repository and setting up a virtual environment.

git clone https://github.com/permitio/permit-mcp
cd permit-mcp

# Create virtual environment, activate it and install dependencies
uv venv
source .venv/bin/activate # For Windows: .venv\Scripts\activate
uv pip install -e .

Add Environment Configuration

Create a .env file at the root of the project based on the provided .env.example, and populate it with the correct values from your Permit setup:

RESOURCE_KEY=restaurants
ACCESS_ELEMENTS_CONFIG_ID=restaurant-requests
OPERATION_ELEMENTS_CONFIG_ID=dish-requests
TENANT= # e.g. default
LOCAL_PDP_URL=
PERMIT_API_KEY=
PROJECT_ID=
ENV_ID=

You can retrieve these values using the following resources:

  • LOCAL_PDP_URL
  • PERMIT_API_KEY
  • PROJECT_ID
  • ENV_ID

⚠️ Note: We are using Permit’s Local PDP (Policy Decision Point) for this tutorial to support ReBAC evaluation and low-latency, offline testing.

Start the Server

With everything in place, you can now run the MCP server locally:

uv run -m src.permit_mcp

Once the server is running, it will expose your configured Permit Elements (access request, approval management, etc.) as tools the agent can call through the MCP protocol.

Creating a LangGraph + LangChain MCP Client

Now that the Permit MCP server is up and running, we’ll build an AI agent client that can interact with it. This client will:

  • Use a Gemini-powered LLM to decide what actions to take
  • Dynamically invoke MCP tools like request_access, approve_operation_approval, etc.
  • Run entirely within a LangGraph workflow
  • Pause for human review using interrupt() (in the next section)

Let’s connect the dots.

Install Required Dependencies

Inside your MCP project directory, install the necessary packages:

uv add langchain-mcp-adapters langgraph langchain-google-genai

This gives you:

  • langchain-mcp-adapters: Automatically converts Permit MCP tools into LangGraph-compatible tools
  • langgraph: For orchestrating graph-based workflows
  • langchain-google-genai: For interacting with Gemini 2.0 Flash

Add Google API Key

You’ll need an API key from Google AI Studio to use Gemini.

Add the key to your .env file:

GOOGLE_API_KEY=your-key-here

Build the MCP Client

Create a file named client.py in your project root.

We’ll break this file down into logical blocks:

  • Imports and Setup

    Start by importing dependencies and loading environment variables:

  import os
  from typing_extensions import TypedDict, Literal, Annotated
  from dotenv import load_dotenv
  from langchain_google_genai import ChatGoogleGenerativeAI
  from langgraph.graph import StateGraph, START, END
  from langgraph.types import Command, interrupt
  from langgraph.checkpoint.memory import MemorySaver
  from langgraph.prebuilt import ToolNode
  from mcp import ClientSession, StdioServerParameters
  from mcp.client.stdio import stdio_client
  from langchain_mcp_adapters.tools import load_mcp_tools
  import asyncio
  from langgraph.graph.message import add_messages

Then, load the environment and set up your Gemini LLM:


load_dotenv()

global_llm_with_tools = None

llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",
    google_api_key=os.getenv('GOOGLE_API_KEY')
)


  • Configure MCP Server Parameters

    Tell LangGraph how to communicate with the running MCP server:

  server_params = StdioServerParameters(
      command="python",
      args=["src/permit_mcp/server.py"],
  )

Define the shared agent state:

class State(TypedDict):
    messages: Annotated[list, add_messages]


  • Define Workflow Nodes and the graph builder:

    Here’s the logic to route between calling the LLM and invoking tools:


  async def call_llm(state):
      response = await global_llm_with_tools.ainvoke(state["messages"])
      return {"messages": [response]}

  def route_after_llm(state) -> Literal[END, "run_tool"]:
      return END if len(state["messages"][-1].tool_calls) == 0 else "run_tool"

  async def setup_graph(tools):
      builder = StateGraph(State)
      run_tool = ToolNode(tools)
      builder.add_node(call_llm)
      builder.add_node('run_tool', run_tool)

      builder.add_edge(START, "call_llm")
      builder.add_conditional_edges("call_llm", route_after_llm)
      builder.add_edge("run_tool", "call_llm")

      memory = MemorySaver()
      return builder.compile(checkpointer=memory)

In the above code, we defined an LLM node and its conditional edge, which routes to the run_tool node if the state's last message contains a tool call and otherwise ends the graph. We also defined a function to set up and compile the graph with an in-memory checkpointer.

Next, add the following code to stream responses from the graph and to run an interactive chat loop that continues until it is explicitly exited.


  • Stream Output and Handle Chat Input, with an infinite loop for user interaction:


  async def stream_responses(graph, config, invokeWith):
      async for event in graph.astream(invokeWith, config, stream_mode='updates'):
          for key, value in event.items():
              if key == 'call_llm':
                  content = value["messages"][-1].content
                  if content:
                      print('\n' + (", ".join(content)
                            if isinstance(content, list) else content))

  async def chat_loop(graph):
      while True:
          try:
              user_input = input("\nQuery: ").strip()
              if user_input in ["quit", "exit", "q"]:
                  print("Goodbye!")
                  break

              sys_m = """
              Always provide the resource instance key during tool calls, as the ReBAC authorization model is being used. To obtain the resource instance key, use the list_resource_instances tool to view available resource instances.

              Always parse the provided data before displaying it.
              If the user has initially provided their ID, use that for subsequent tool calls without asking them again.
              """

              invokeWith = {"messages": [
                  {"role": "user", "content": sys_m + '\n\n' + user_input}]}
              config = {"configurable": {"thread_id": "1"}}

              await stream_responses(graph, config, invokeWith)

          except Exception as e:
              print(f"Error: {e}")
  • Final Assembly

    Add the main entry point, where we will convert the Permit MCP server tools to LangGraph-compatible tools, bind our LLM to the resulting tools, set up the graph, draw it to a file, and fire up the chat loop:


  async def main():
      async with stdio_client(server_params) as (read, write):
          async with ClientSession(read, write) as session:
              await session.initialize()

              tools = await load_mcp_tools(session)
              llm_with_tools = llm.bind_tools(tools)
              graph = await setup_graph(tools)

              global global_llm_with_tools
              global_llm_with_tools = llm_with_tools

              with open("workflow_graph.png", "wb") as f:
                  f.write(graph.get_graph().draw_mermaid_png())

              await chat_loop(graph)

  if __name__ == "__main__":
      asyncio.run(main())
  • Lastly, Run the Client

Once you’ve saved everything, start the client:

uv run client.py

After running, a new image file called workflow_graph.png will be created, which shows the graph.

With everything set up, we can now specify queries like this:

Query: My user id is henry, request access to pizza palace with the reason: I am now 18, and the role child-can-order
Query: My user id is joe, list all access requests

Your agent is now able to call MCP tools dynamically!

Adding Human-in-the-Loop with interrupt()

With your LangGraph-powered MCP client up and running, Permit tools can now be invoked automatically. But what happens when the action is sensitive, like granting access to a restricted resource or approving a high-risk operation?

That’s where LangGraph’s interrupt() becomes useful.

We’ll now add a human approval node to intercept and pause the workflow whenever the agent tries to invoke critical tools like:

  • approve_access_request
  • approve_operation_approval

A human will be asked to manually approve or deny the tool call before the agent proceeds.

Define the Human Review Node

At the top of your client.py file (before setup_graph), add the following function:


async def human_review_node(state) -> Command[Literal["call_llm", "run_tool"]]:
    """Handle human review process."""
    last_message = state["messages"][-1]
    tool_call = last_message.tool_calls[-1]

    high_risk_tools = ['approve_access_request', 'approve_operation_approval']
    if tool_call["name"] not in high_risk_tools:
        return Command(goto="run_tool")

    human_review = interrupt({
        "question": "Do you approve this tool call? (yes/no)",
        "tool_call": tool_call,
    })

    review_action = human_review["action"]

    if review_action == "yes":
        return Command(goto="run_tool")

    return Command(goto="call_llm", update={"messages": [{
        "role": "tool",
        "content": f"The user declined your request to execute the {tool_call.get('name', 'Unknown')} tool, with arguments {tool_call.get('args', 'N/A')}",
        "name": tool_call["name"],
        "tool_call_id": tool_call["id"],
    }]})

This node checks whether the tool being called is considered “high risk.” If it is, the graph is interrupted with a prompt asking for human confirmation.

Update Graph Routing

Modify the route_after_llm function so that tool calls are routed to the human review node instead of being executed immediately:

def route_after_llm(state) -> Literal[END, "human_review_node"]:
    """Route logic after LLM processing."""
    return END if len(state["messages"][-1].tool_calls) == 0 else "human_review_node"

Wire in the HITL Node

Update the setup_graph function to add the human_review_node as a node in the graph:

async def setup_graph(tools):
    builder = StateGraph(State)
    run_tool = ToolNode(tools)
    builder.add_node(call_llm)
    builder.add_node('run_tool', run_tool)
    builder.add_node(human_review_node)  # Add the interrupt node here

    builder.add_edge(START, "call_llm")
    builder.add_conditional_edges("call_llm", route_after_llm)
    builder.add_edge("run_tool", "call_llm")

    memory = MemorySaver()
    return builder.compile(checkpointer=memory)

Handle Human Input During Runtime

Finally, let’s enhance your stream_responses function to detect when the graph is interrupted, prompt for a decision, and resume with human input using Command(resume={"action": user_input}).
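Here is a minimal sketch of what that enhancement could look like. It assumes the pause surfaces as an __interrupt__ key in the update events, which is how recent LangGraph releases report interrupts under stream_mode='updates':

  from langgraph.types import Command

  async def stream_responses(graph, config, invokeWith):
      async for event in graph.astream(invokeWith, config, stream_mode='updates'):
          for key, value in event.items():
              if key == 'call_llm':
                  content = value["messages"][-1].content
                  if content:
                      print('\n' + (", ".join(content)
                            if isinstance(content, list) else content))
              elif key == '__interrupt__':
                  # The graph paused inside human_review_node; show the
                  # question and the pending tool call to the operator.
                  payload = value[0].value
                  print(f"\n{payload['question']}")
                  print(f"Tool call: {payload['tool_call']}")
                  decision = input("Your decision (yes/no): ").strip().lower()
                  # Resume the paused graph with the human's decision.
                  await stream_responses(graph, config,
                                         Command(resume={"action": decision}))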

After running the client, your graph diagram (workflow_graph.png) will now include a human review node between the LLM and tool execution stages:

This ensures that you remain in control whenever the agent tries to make a decision that could alter permissions or bypass restrictions.

With this, you've successfully added human oversight to your AI agent, without rewriting your tools or backend logic.

Conclusion

In this tutorial, we built a secure, human-aware AI agent using Permit.io’s Access Request MCP, LangGraph, and LangChain MCP Adapters.

Instead of letting the agent operate unchecked, we gave it the power to request access and defer critical decisions to human users, just like a responsible team member would.

We covered:

  • How to model permissions and approval flows using Permit Elements and ReBAC
  • How to expose those flows via the Permit MCP server
  • How to build a LangGraph-powered client that invokes these tools naturally
  • And how to insert real-time human-in-the-loop (HITL) checks using interrupt()

Want to see the full demo in action? Check out the GitHub Repo.

Further Reading

AI Security Posture Management (AISPM): How to Handle AI Agent Security

2025-06-25 15:12:07

AI Demands a New Security Posture

AI Security Posture Management (AISPM) is an emerging discipline focused on securing AI agents, their memory, external interactions, and behavior in real-time.

As AI agents become deeply embedded in applications, traditional security models aren't up to the task. Unlike static systems, AI-driven environments introduce entirely new risks: hallucinated outputs, prompt injections, autonomous actions, and cascading interactions between agents.

These aren’t just extensions of existing problems—they’re entirely new challenges that legacy security posture tools like DSPM (Data Security Posture Management) or CSPM (Cloud Security Posture Management) were never designed to solve.

AISPM exists because AI systems don’t just store or transmit data—they generate new content, make decisions, and trigger real-world actions. Securing these systems requires rethinking how we monitor, enforce, and audit security, not at the infrastructure level, but at the level of AI reasoning and behavior.

If you’re looking for a deeper dive into what machine identities are and how AI agents fit into modern access control models, we cover that extensively in “What is a Machine Identity? Understanding AI Access Control”. This article, however, focuses on the next layer: securing how AI agents operate, not just who they are.

Join us as we explain what makes AISPM a distinct and necessary evolution, explore the four unique perimeters of AI security, and outline how organizations can start adapting their security posture for an AI-driven world.

Because the risks AI introduces are already here, and they’re growing fast.

What Makes AI Security Unique?

Securing AI systems isn't just about adapting existing tools; it's about confronting entirely new risk categories that simply didn't exist before.

As mentioned above, AI agents don’t just execute code—they generate content, make decisions, and interact with other systems in unpredictable ways. That unpredictability introduces vulnerabilities that security teams are only beginning to understand.

AI hallucinations, for example—false or fabricated outputs—aren’t just inconvenient; they can corrupt data, expose sensitive information, or even trigger unsafe actions if not caught.

Combine that with the growing use of retrieval-augmented generation (RAG) pipelines, where AI systems pull information from vast memory stores, and the attack surface expands dramatically.

Beyond data risks, AI systems are uniquely susceptible to prompt injection attacks, where malicious actors craft inputs designed to hijack the AI’s behavior. Think of it as the SQL injection problem, but harder to detect and even harder to contain, as it operates within natural language.

Perhaps the most challenging part of this is that AI agents don’t operate in isolation. They trigger actions, call external APIs, and sometimes interact with other AI agents, creating complex, cascading chains of behavior that are difficult to predict, control, or audit.

Traditional security posture tools were never designed for this level of autonomy and dynamic behavior. That’s why AISPM is not DSPM or CSPM for AI—it’s a new model entirely, focused on securing AI behavior and decision-making.

The Four Access Control Perimeters of AI Agents

Securing AI systems isn’t just about managing access to models—it requires controlling the entire flow of information and decisions as AI agents operate. From what they’re fed, to what they retrieve, to how they act, and what they output, each phase introduces unique risks.

As with any complex system, access control becomes an attack surface amplified in the context of AI. That’s why a complete AISPM strategy should consider these four distinct perimeters—each acting as a checkpoint for potential vulnerabilities:

1. Prompt Filtering — Controlling What Enters the AI

Every AI interaction starts with a prompt, and prompts are now an attack surface. Whether from users, other systems, or upstream AI agents, unfiltered prompts can lead to manipulation, unintended behaviors, or AI "jailbreaks".

Prompt filtering ensures that only validated, authorized inputs reach the model. This includes:

  • Blocking malicious inputs designed to trigger unsafe behavior
  • Enforcing prompt-level policies based on roles, permissions, or user context
  • Dynamically validating inputs before execution

For example, restricting certain prompt types for non-admin users or requiring additional checks for prompts containing sensitive operations like database queries or financial transactions.
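To make this concrete, a prompt filter can be sketched as a policy gate that runs before any input reaches the model. The example below is illustrative only; SENSITIVE_PATTERNS and the is_admin flag are hypothetical stand-ins for a real policy engine:

  import re

  # Hypothetical patterns that should require elevated privileges.
  SENSITIVE_PATTERNS = [r"\bdrop\s+table\b", r"\btransfer\s+funds\b"]

  def filter_prompt(prompt: str, is_admin: bool) -> str:
      """Reject prompts that match sensitive operations for non-admin users."""
      for pattern in SENSITIVE_PATTERNS:
          if re.search(pattern, prompt, re.IGNORECASE) and not is_admin:
              raise PermissionError(f"Prompt blocked by policy: {pattern}")
      return prompt

  # Only validated prompts ever reach the model.
  safe_prompt = filter_prompt("Summarize last week's sales", is_admin=False)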

2. RAG Data Protection — Securing AI Memory and Knowledge Retrieval

Retrieval-Augmented Generation (RAG) pipelines—where AI agents pull data from external knowledge bases or vector databases—add a powerful capability but also expand the attack surface. AISPM must control:

  • Who or what can access specific data sources
  • What data is retrieved based on real-time access policies
  • Post-retrieval filtering to remove sensitive information before it reaches the model

Without this perimeter, AI agents risk retrieving and leaking sensitive data or training themselves on information they shouldn’t have accessed in the first place.
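As a rough illustration, post-retrieval filtering can be modeled as a permission check applied to every retrieved document before it reaches the model. Here, user_can_read is a hypothetical stand-in for a real-time authorization call (for example, against a ReBAC or FGA service):

  def filter_retrieved_docs(docs, user_id, user_can_read):
      """Drop retrieved documents the requesting user is not allowed to see."""
      allowed = []
      for doc in docs:
          # Real-time policy check against the authorization service.
          if user_can_read(user_id, doc["resource_id"]):
              allowed.append(doc)
      return allowed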

“Building AI Applications with Enterprise-Grade Security Using RAG and FGA” provides a practical example of RAG data protection for healthcare.

3. Secure External Access — Governing AI Actions Beyond the Model

AI agents aren’t confined to internal reasoning. Increasingly, they act—triggering API calls, executing transactions, modifying records, or chaining tasks across systems.

AISPM must enforce strict controls over these external actions:

  • Define exactly what operations each AI agent is authorized to perform

  • Track “on behalf of” chains to maintain accountability for actions initiated by users but executed by agents

  • Insert human approval steps where needed, especially for high-risk actions like purchases or data modifications

This prevents AI agents from acting outside of their intended scope or creating unintended downstream effects.
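A minimal sketch of this perimeter, with is_authorized and needs_human_approval as hypothetical stand-ins for a real policy engine, might look like this:

  import logging

  log = logging.getLogger("aispm.audit")

  def execute_action(agent_id, on_behalf_of, action, args,
                     is_authorized, needs_human_approval):
      """Run an external action only if the agent is authorized to perform it."""
      if not is_authorized(agent_id, action):
          raise PermissionError(f"{agent_id} is not allowed to perform {action}")
      if needs_human_approval(action):
          # High-risk actions (purchases, data modifications) pause here.
          if input(f"Approve {action} for {on_behalf_of}? (yes/no): ") != "yes":
              raise PermissionError("Denied by human reviewer")
      # Preserve the on-behalf-of chain for accountability.
      log.info("action=%s agent=%s on_behalf_of=%s args=%s",
               action, agent_id, on_behalf_of, args)
      # ... perform the actual API call here ...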

4. Response Enforcement — Monitoring What AI Outputs

Even if all inputs and actions are tightly controlled, AI responses themselves can still create risk by hallucinating facts, exposing sensitive information, or producing inappropriate content.

Response enforcement means:

  • Scanning outputs for compliance, sensitivity, and appropriateness before delivering them

  • Applying role-based output filters so that only authorized users see certain information

  • Ensuring AI doesn’t unintentionally leak internal knowledge, credentials, or PII in its final response

In AI systems, output is not just information—it’s the final, visible action. Securing it is non-negotiable.
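As an illustrative sketch (the patterns and role names are hypothetical), response enforcement can be as simple as a final scan-and-filter pass:

  import re

  # Hypothetical detectors for material that must never leave the system.
  PII_PATTERNS = [r"\b\d{3}-\d{2}-\d{4}\b",        # SSN-like numbers
                  r"\b(?:api|secret)[_-]?key\b"]   # credential mentions

  def enforce_response(response: str, viewer_role: str) -> str:
      """Scan an outbound response before it is delivered to the user."""
      for pattern in PII_PATTERNS:
          if re.search(pattern, response, re.IGNORECASE):
              return "[response withheld: contains sensitive content]"
      if viewer_role != "manager" and "[internal]" in response:
          # Role-based output filtering: strip internal-only sections.
          response = response.split("[internal]")[0]
      return response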

Why These Perimeters Matter

Together, these four perimeters form the foundation of AISPM. They ensure that every stage of the AI’s operation is monitored, governed, and secured—from input to output, from memory access to real-world action.

Treating AI security as an end-to-end flow—not just a static model check—is what sets AISPM apart from legacy posture management. Because when AI agents reason, act, and interact dynamically, security must follow them every step of the way.

Best Practices for Effective AISPM

As we can already see, securing AI systems demands a different mindset—one that treats AI reasoning and behavior as part of the attack surface, not just the infrastructure it runs on. AISPM is built on a few key principles designed to meet this challenge:

Intrinsic Security — Guardrails Inside the AI Flow

Effective AI security can’t be bolted on. It must be baked into the AI’s decision-making loop—filtering prompts, restricting memory access, validating external calls, and scanning responses in real-time. External wrappers like firewalls or static code scans don’t protect against AI agents reasoning their way into unintended actions.

The AI itself must operate inside secure boundaries.

Continuous Monitoring — Real-Time Risk Assessment

AI decisions happen in real-time, which means continuous evaluation is critical.

AISPM systems must track agent behavior as it unfolds, recalculate risk based on new context or inputs, and adjust permissions or trigger interventions mid-execution if necessary.

Static posture reviews or periodic audits will not catch issues as they emerge. AI security is a live problem, so your posture management must be live, too.

Chain of Custody and Auditing

AI agents can chain actions: calling APIs, triggering other agents, or interacting with users. All of this requires extremely granular auditing.

AISPM must:

  • Record what action was performed
  • Identify who or what triggered it
  • Preserve the full "on-behalf-of" trail back to the human or system that originated the action

This is the only way to maintain accountability and traceability when AI agents act autonomously.
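One way to picture such a record, sketched here with hypothetical field names, is a structure that always carries the full delegation trail:

  from dataclasses import dataclass, field
  from datetime import datetime, timezone

  @dataclass
  class AuditRecord:
      action: str                    # what was performed
      actor: str                     # the agent or service that executed it
      on_behalf_of: list = field(default_factory=list)  # full delegation trail
      timestamp: str = field(
          default_factory=lambda: datetime.now(timezone.utc).isoformat())

  # A tool call executed by agent-42, initiated by a human via an orchestrator:
  record = AuditRecord(action="approve_access_request",
                       actor="agent-42",
                       on_behalf_of=["orchestrator-1", "user:henry"])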

Delegation Boundaries and Trust TTLs

AI systems don’t just act—they delegate tasks to other agents, services, or APIs. Without proper boundaries, trust can cascade unchecked, creating risks of uncontrolled AI-to-AI interactions.

AISPM should enforce strict scoping of delegated authority and time-to-live (TTL) limits on trust or delegated access, preventing long-lived permission chains that become impossible to revoke, and it should enable human review checkpoints for high-risk delegations.
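A toy sketch of TTL-bounded delegation (the grant store and helper names are hypothetical) shows how expired trust revokes itself:

  import time

  def grant_delegation(grants, delegate, scope, ttl_seconds):
      """Record a scoped, time-limited delegation for another agent."""
      grants[delegate] = {"scope": set(scope),
                          "expires_at": time.time() + ttl_seconds}

  def delegation_is_valid(grants, delegate, action):
      grant = grants.get(delegate)
      # Both the scope and the TTL must hold; expired trust is implicitly revoked.
      return (grant is not None
              and action in grant["scope"]
              and time.time() < grant["expires_at"])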

Cryptographic Validation Between AI Agents

Lastly, as AI ecosystems grow, agents will need to trust—but verify—other agents' claims. AISPM should prepare for this future by supporting cryptographic signatures on AI requests and responses as well as tamper-proof logs that allow agents—and humans—to verify the source and integrity of any action in the chain.

This is how AI systems will eventually audit and regulate themselves, especially in multi-agent environments.
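For a feel of what this could look like, here is a minimal sketch using a shared-secret HMAC from the Python standard library; a production system would more likely use asymmetric signatures so that each agent holds its own private key:

  import hmac
  import hashlib

  def sign_message(shared_key: bytes, message: bytes) -> str:
      """Sign an agent-to-agent message so the receiver can verify its origin."""
      return hmac.new(shared_key, message, hashlib.sha256).hexdigest()

  def verify_message(shared_key: bytes, message: bytes, signature: str) -> bool:
      expected = sign_message(shared_key, message)
      # Constant-time comparison prevents timing attacks on the check.
      return hmac.compare_digest(expected, signature)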

Tooling and Emerging Standards for AISPM

While AISPM is still an emerging discipline, we’re starting to see practical tools and frameworks that help put its principles into action, enabling developers to build AI systems with security guardrails baked into the flow of AI decisions and actions.

AI Framework Integrations for Access Control

Popular AI development frameworks like LangChain and LangFlow are beginning to support integrations that add identity verification and fine-grained policy enforcement directly into AI workflows. These integrations allow developers to:

  • Authenticate AI agents using identity tokens before allowing actions

  • Insert dynamic permission checks mid-workflow to stop unauthorized data access or unsafe operations

  • Apply fine-grained authorization to Retrieval-Augmented Generation (RAG) pipelines, filtering what the AI can retrieve based on real-time user or agent permissions.

These capabilities move beyond basic input validation, enabling secure, identity-aware pipelines in which AI agents must prove what they’re allowed to do at every critical step.
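Framework details vary, but the underlying pattern is a guard around every tool invocation. Here is a framework-agnostic sketch, with is_authorized standing in for a call to a policy engine:

  def with_permission_check(tool_fn, is_authorized):
      """Wrap a tool so every invocation is gated by a real-time policy check."""
      def guarded(agent_id, *args, **kwargs):
          if not is_authorized(agent_id, tool_fn.__name__):
              raise PermissionError(f"{agent_id} may not call {tool_fn.__name__}")
          return tool_fn(*args, **kwargs)
      return guarded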

Secure Data Validation and Structured Access

Frameworks designed for AI application development increasingly support structured data validation and access control enforcement, combining input validation with authorization layers.

This helps protect systems against accidental data leaks and intentional prompt manipulation by ensuring the AI operates strictly within its defined boundaries.

Standardizing Secure AI-to-System Interactions

Emerging standards like the Model Context Protocol (MCP) propose structured ways for AI agents to interact with external tools, APIs, and systems. These protocols enable:

  • Explicit permission checks before AI agents can trigger external operations

  • Machine identity assignment to AI agents, scoping their capabilities

  • Real-time authorization rules at interaction points, ensuring actions remain controlled and traceable

This is crucial for keeping AI-driven actions—like API calls, database queries, or financial transactions—accountable and auditable.

Looking Ahead: The Future of AISPM

The rapid evolution of AI agents is already pushing the boundaries of what traditional security models can handle. As AI systems grow more autonomous—capable of reasoning, chaining actions, and interacting with other agents—AISPM will become foundational, not optional.

One major shift on the horizon is the rise of risk scoring and trust propagation models for AI agents. Just as human users are assigned trust levels based on behavior and context, AI agents will need dynamic trust scores that influence what they’re allowed to access or trigger—especially in multi-agent environments where unchecked trust could escalate risks fast.

AISPM shifts security upstream into the AI’s decision-making process and controls behavior at every critical point.

As AI continues to drive the next wave of applications, AISPM will be critical to maintaining trust, compliance, and safety. The organizations that embrace it early will be able to innovate with AI without compromising security.

Read more about how Permit.io handles secure AI collaboration through a permissions gateway here.

If you have any questions, make sure to join our Slack community, where thousands of devs are building and implementing authorization.