Run AI Security Testing Locally: Caido Shift + Ollama for Data-Sensitive Engagements

Why Cloud LLMs Are a Liability for Pentesters

Every HTTP request you send through a cloud LLM during a pentest or bug bounty is a potential liability. Auth tokens, session cookies, API keys, PII – all of it flowing to a third party. For anyone on the red team side doing security assessments, that’s a compliance nightmare waiting to happen.

Caido: https://caido.io

Enter Caido’s Shift plugin (see the Shift Acquisition page on Caido’s site).

Caido’s Shift plugin supports external LLMs, and most of the docs point it at OpenRouter, which lets a tester easily wire Caido up to public models from OpenAI, Google, Anthropic, etc. I wanted to see if I could keep everything local using Ollama. Turns out you can, and the setup is straightforward once you know the gotchas. This post walks through the architecture I used in the lab, the actual config that works, and a demo where a local model found and exploited a SQL injection in OWASP Juice Shop.

Another benefit: aside from hardware and electricity costs, the local LLM is essentially free. If it’s free, it’s for me.

Architecture Overview

The stack is simple:

Caido → Shift plugin → Ollama → qwen3-coder-32k

And, of course, good old dockerized Juice Shop for demo purposes.

docker run -d -p 1337:3000 bkimminich/juice-shop

Caido acts as your intercepting proxy. Shift is Caido’s AI plugin that sends HTTP requests to an LLM for analysis. Shift expects an OpenAI-compatible API endpoint, which Ollama provides natively at /v1/chat/completions.
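To make that concrete, here is a minimal sketch of the OpenAI-format request body that any client (Shift included) would POST to Ollama's `/v1/chat/completions` endpoint. The model name and system prompt are illustrative, not captured from Shift's actual traffic:

```python
import json

# Ollama's OpenAI-compatible endpoint on the default port.
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> bytes:
    """Build the JSON body an OpenAI-style client would POST to OLLAMA_URL."""
    body = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a web security assistant."},
            {"role": "user", "content": prompt},
        ],
    }
    return json.dumps(body).encode("utf-8")

# The request that would go to OLLAMA_URL via any OpenAI-compatible client.
payload = build_chat_request(
    "qwen3-coder-32k:latest",
    "Analyze this HTTP request for SQL injection.",
)
```

The point is that nothing in the body is Ollama-specific – swap the base URL and the same payload works against OpenAI, LM Studio, or vLLM.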

The key insight: skip any middleware like LiteLLM. I initially tried using LiteLLM as a translation layer and ran into tool calling issues – qwen3 models output tool calls as XML in the content field instead of the structured tool_calls array that Shift expects. Pointing Shift directly at Ollama’s native /v1 endpoint avoids this entirely.
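The failure mode is easy to see in the response shape. The sketch below uses hand-written sample messages (illustrative, not captured traffic) to show why a client that only reads the structured `tool_calls` array silently drops a tool call that was serialized as XML inside `content`:

```python
def has_structured_tool_calls(message: dict) -> bool:
    # OpenAI-style clients look only at the structured field, never at content.
    return bool(message.get("tool_calls"))

# What Shift expects: a structured tool_calls array on the assistant message.
structured = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "type": "function",
        "function": {"name": "send_request", "arguments": "{}"},
    }],
}

# What I saw through LiteLLM: the call serialized as XML inside content.
xml_in_content = {
    "role": "assistant",
    "content": "<tool_call><name>send_request</name></tool_call>",
}

assert has_structured_tool_calls(structured)          # parsed fine
assert not has_structured_tool_calls(xml_in_content)  # tool call is lost
```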

With this, all of my traffic stays local. If I am working on a bug bounty or client engagement, I don’t have to worry about accidentally shipping sensitive info (API keys, etc.) to a third party. The data never leaves my machine. Good stuff!

Setting Up Ollama for Local Inference

My earlier post walks through setting up Ollama with Qwen3-Coder:

Running Claude Code with Local Models via Ollama

It covers the detailed configuration steps and explains why I use a 32k context window for performance. On an RTX 3090 with 24GB of VRAM, this model is super fast and very powerful.
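For quick reference, the 32k variant can be built from a short Modelfile. This is a sketch – the base tag `qwen3-coder:30b` is an assumption; substitute whatever `ollama list` shows on your box:

```shell
# Modelfile: derive a fixed-32k-context variant of qwen3-coder
cat > Modelfile <<'EOF'
FROM qwen3-coder:30b
PARAMETER num_ctx 32768
EOF

# Build the custom model that Shift will reference by name
ollama create qwen3-coder-32k -f Modelfile
```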

For my lab, good to go!

Configuring Caido Shift for Local Models

Adding the Shift plugin to Caido is super easy.

Caido -> Plugins -> Official

Once that’s installed, you can find Shift at the bottom of your left-hand nav (may require a Caido restart).

Then it’s onto Settings:

Once we’re into settings, we head to AI and configure OpenAI. Why OpenAI?

Ollama exposes /v1/chat/completions specifically to be compatible with anything expecting an OpenAI endpoint. So when you configure Shift under “OpenAI” settings, you’re really just telling it to use the OpenAI API format – the actual backend can be OpenAI, Ollama, LM Studio, vLLM, or anything else that speaks the same protocol.

Port 11434 is where Ollama listens. The API key can be anything – it isn’t used, but the field does have to be populated.

Now, we need to add our model.

Main dashboard -> Plugins -> Shift -> Models -> Add Custom Model

Note that we configured it for OpenAI, gave it a name, and then set the Model ID. The Model ID has to match what Ollama reports with the list command:

Since I am using my 32k model, I entered qwen3-coder-32k:latest to ensure the intended model resolves. I believe (this is untested) that checking ‘Is Reasoning Model’ is harmless, as I think Ollama will ignore it if the model is not a reasoning one, but I didn’t exercise all of the options here. For now, I am leaving ‘Is Reasoning Model’ unchecked.

And that’s it!

Lab: Finding SQL Injection in Juice Shop

Now, let’s do a simple attack. If you have played with Juice Shop, you know it’s purposefully vulnerable to all kinds of OWASP (and other) attacks. There’s a known vulnerability with the login so let’s target that.
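For context, the classic bypass here injects into the email field of the login POST. A minimal sketch of the request body follows – `/rest/user/login` is Juice Shop's real login endpoint, and actually sending the request is what Replay (and later the model) will do:

```python
import json

# Juice Shop's login endpoint, proxied through Caido (container mapped to 1337).
LOGIN_URL = "http://localhost:1337/rest/user/login"

# The leading quote breaks out of the email string in the backend SQL query,
# OR 1=1 makes the WHERE clause match the first user, and -- comments out
# the rest of the statement, bypassing the password check.
sqli_body = json.dumps({
    "email": "' OR 1=1--",
    "password": "anything",
})
```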

Let’s launch Chrome via Caido and then navigate to our login:

Put whatever into the email and password and hit ‘Log in’. You get the ‘Invalid email or password.’ in the browser and you get the request/response in the Caido HTTP History:

Now, let’s right-click and send it to Replay:

We now have a Shift collection. Sending the request there gives us a new popup:

I chose to add the SQL Injection skill – I will show why in a bit. With a vuln this simple, it’s possible the model would find the issue regardless, but selecting the skill adds additional instructions that steer the model towards our goal:

Under Replay, there’s a flyout from the right where the underlying AI is doing the work:

In just a few minutes:

Great! And, we can still interact with the AI as well. There’s a chat prompt available. I can ask for additional context and everything.

This is great! I have an AI assistant that can help me test surfaces as well as give me additional contextual information without any of the data leaving my machine. I am no longer potentially exposing sensitive client information to public AI agents/cloud services. Instead, it’s all local.

What the Skill Actually Does

Circling back to why I selected the skill: I did an additional MITM and captured the prompts being sent both with and without the skill selected. The base prompt is already pretty substantial, and the skill simply adds to it. I should note that on the first attempt without the skill, the AI failed to identify the vulnerability. It did successfully ID this simple vuln on each subsequent attempt, though.

If you want to see what is added to the prompt, we can see that in Caido.

Dashboard -> Plugins -> Shift -> Skills

For each of the skills, you can open them up and see the additional prompt language and even edit them. The best part, you can create your own custom skills (button at the top). This is huge.
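To illustrate the shape of a custom skill, it is just extra prompt text layered on top of Shift’s base prompt – something along these lines (hypothetical wording, not one of the shipped skills):

```
Skill: IDOR Hunter
When analyzing a request, enumerate every ID-like parameter (numeric IDs,
UUIDs, usernames). For each one, propose substitutions that would belong to
a different user and explain what a successful response would prove.
Escalate from read access to write access before concluding.
```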

Summary

This works. A local model found and exploited a real SQL injection vulnerability without a single byte leaving my machine. The key is the skill – without it, qwen tried a few payloads and declared the endpoint safe. With the SQLi skill enabled, it got the structured methodology it needed to keep escalating until it found the auth bypass.

For anyone doing security assessments where client data matters – and it should always matter – this setup removes the compliance headache entirely. No API costs, no third-party data exposure, no awkward conversations about where that auth token ended up. You just need the hardware and a local LLM set up to assist.

Shift also supports custom skills, which means you can build your own testing playbooks and have the local model execute them systematically. I can’t wait to play with this more.

Happy hacking!