PDF JavaScript Security Bypass Flaw

Alex Johnson

Hey everyone! Today, we're diving deep into a rather concerning security vulnerability that's been found in PublicCMS versions up to and including V5.202506.d. This isn't just a minor bug; it's a loophole that could potentially lead to stored XSS (Cross-Site Scripting) attacks, and it all revolves around how the system handles PDF files. If you're using PublicCMS, or if you handle file uploads in your web applications, you'll definitely want to pay attention to this.

What's the Big Deal with this PDF Vulnerability?

So, the core issue here is that PublicCMS versions V5.202506.d and earlier are susceptible to stored XSS attacks through PDF file uploads. What does that mean in plain English? It means an attacker can upload a PDF file that's been specially crafted with malicious JavaScript code. The really sneaky part is that this malicious code can bypass the security checks that are supposed to be in place within the backend, specifically in a file called CmsFileUtils.java. This bypass then allows the JavaScript to be embedded and executed when a user views the uploaded PDF.

The implications are pretty serious. Imagine an attacker uploading a seemingly innocent PDF to your website. When a user clicks on it, that hidden JavaScript could steal their login credentials, send requests to your application's APIs on the victim's behalf, or perform other malicious actions, all without the user even realizing they've been compromised. This vulnerability affects all the usual file upload points, like /cmsTemplate/save, /file/doUpload, /cmsTemplate/doUpload, and others, making it a widespread threat across the application.

How Can an Attacker Exploit This?

Let's break down the steps an attacker would typically take to exploit this flaw. It’s a bit technical, but understanding the process helps us appreciate the cleverness (and maliciousness!) of the attack.

  1. Get the Malicious PDF: The first step for an attacker is to obtain or create a PDF file that contains the exploit. There are tools and techniques to craft these, and in this specific case, a PoC (Proof of Concept) PDF was shared, designed to get past the isSafe function validation in CmsFileUtils.java. You can find an example of such a payload at this GitHub link. It’s crucial to understand that this PDF isn't just a regular document; it’s a vessel for executing code.

  2. Upload the PDF: Once the attacker has the malicious PDF, they need to get it onto the target system. They do this by uploading it through any of the application's file upload endpoints. As mentioned earlier, these include paths like /cmsTemplate/save, /file/doUpload, and others. The system, thinking it's just another file, accepts it.

  3. Trigger the XSS: After the PDF is uploaded, it's typically stored in the website's static directory. The vulnerability lies in the fact that when a user accesses and attempts to view this PDF, the embedded JavaScript is executed. The images provided in the original report show the structure of the exploit and the results of the attack, illustrating how the malicious PDF is placed and how it could be triggered.

This process highlights a fundamental breakdown in the file validation and sanitization process. The system thinks it's safe, but the attacker found a way to hide the dangerous payload where the security checks couldn't find it. This is a classic example of how attackers exploit complex file formats to hide malicious code.

Digging into the Root Cause: Why Did This Happen?

To truly fix a problem, we need to understand why it happened in the first place. In this scenario, the root cause is a surprisingly simple oversight in the PDF validation logic within PublicCMS. The developers were checking for JavaScript, which is good, but they were only looking in the most obvious places.

The Vulnerable Code: A Shallow Check

The critical piece of code in question is the isSafe method within CmsFileUtils.java. Let's look at what it does:

private static boolean isSafe(List<COSObject> pdfObjects) {
    for (COSObject object : pdfObjects) {
        COSBase realObject = object.getObject();
        if (realObject instanceof COSDictionary) {
            COSDictionary dic = (COSDictionary) realObject;

            // ⚠️ VULNERABILITY: Only checks top-level keys
            if (null != dic.getDictionaryObject(COSName.JS)) {
                return false;
            }
        }
    }
    return true;
}

See that comment: // ⚠️ VULNERABILITY: Only checks top-level keys? That’s the smoking gun. This code iterates through PDF objects and checks if any of them directly contain a key named /JS. If it finds one, it flags the file as unsafe. However, it doesn't look any deeper. PDFs are complex documents, and malicious code can be hidden in nested structures, like dictionaries within dictionaries.

The Attack Vector: Hiding in Plain Sight (Sort Of)

Attackers are masters at exploiting such limitations. In this case, the JavaScript is embedded in a nested dictionary: the page's /AA (Additional Actions) dictionary holds an /O entry (the action that runs when the page is opened), and that action dictionary is what finally contains the /JS key. Note that /O here is a trigger inside /AA, not the document-level /OpenAction key. Here's how that structure looks:

Page Object (3 0 obj)
├── /Type: /Page
├── /AA: Dictionary                    ← Additional Actions
│   └── /O: Dictionary                 ← Page-open action
│       └── /JS: JavaScript Code       ← HIDDEN HERE!
├── /Contents: 4 0 R
└── /Resources: {...}

The vulnerable isSafe function looks at the Page Object and asks only one question: does this dictionary have a /JS entry of its own? The answer is no, because /JS is buried two levels down under /AA and /O. The function never opens the /AA dictionary at all, so it never reaches the nested /JS key and incorrectly determines that the file is safe. The check is simply too shallow.
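To make the miss concrete, here's a minimal sketch that models the page object with plain Java maps instead of PDFBox types. The names (ShallowCheckDemo, isSafeShallow, pageWithNestedScript) are made up for illustration and aren't part of PublicCMS; the only point is that a lookup on the outer dictionary can't see a key that sits two levels down.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a parsed PDF page object: nested maps, not PDFBox types.
final class ShallowCheckDemo {

    // Mirrors the vulnerable logic: only the object's own keys are examined.
    static boolean isSafeShallow(Map<String, Object> pdfObject) {
        return !pdfObject.containsKey("JS");
    }

    // Builds /Page -> /AA -> /O -> /JS, the nesting shown in the tree above.
    static Map<String, Object> pageWithNestedScript() {
        Map<String, Object> pageOpenAction = new LinkedHashMap<>();
        pageOpenAction.put("JS", "app.alert(\"XSS\")");

        Map<String, Object> additionalActions = new LinkedHashMap<>();
        additionalActions.put("O", pageOpenAction);

        Map<String, Object> page = new LinkedHashMap<>();
        page.put("Type", "Page");
        page.put("AA", additionalActions);
        return page;
    }

    public static void main(String[] args) {
        Map<String, Object> page = pageWithNestedScript();
        // The script is there, but the top-level lookup does not see it.
        System.out.println("shallow check says safe: " + isSafeShallow(page));
    }
}
```

Running main prints `shallow check says safe: true`, which is exactly the wrong answer for this object.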

Analyzing the Detection Flow

Let's map out how the current system handles this, and what it should be doing:

Step                     | Current Behavior | Expected Behavior
1. Load Object 3         | ✓ Loaded         | ✓ Loaded
2. Check top-level /JS   | Returns null     | Should recurse
3. Check /AA dictionary  | Not checked      | Should check
4. Check nested /JS      | Not reached      | Should detect
5. Final result          | true (Safe)      | false (Unsafe)

As you can see, the process stops prematurely. The system successfully loads the object, checks the immediate keys, finds no direct /JS entry, and declares victory, even though malicious code is present just a few levels down. The Proof of Concept (PoC) PDF provided demonstrates this exact technique, where the JavaScript is placed within the /AA/O/JS structure. The app.alert("XSS") function is a standard way to test for XSS, popping up a message box when executed.

Technical Limitations Exposed

This vulnerability highlights several critical limitations in the original security check:

  • Lack of Recursive Inspection: The most obvious flaw is that the code doesn't look inside nested structures. It only performs a superficial check. This also means it likely wouldn't handle indirect object references properly, where one object points to another for its content.
  • Incomplete Keyword Coverage: The check was too limited in what it looked for. It primarily focused on the direct /JS key. However, PDFs have many features that can trigger code execution or data exfiltration. Things like /AA (Additional Actions), /OpenAction (actions that run when the document is opened), /Launch (executing external programs), /SubmitForm (sending data from forms), and /ImportData (bringing data into the PDF) are all potential vectors that were missed.
  • No Action Chain Analysis: PDF actions can be chained together. For instance, an /OpenAction might lead to another action, which then leads to JavaScript execution. The current check doesn't follow these chains, stopping as soon as it doesn't find a direct /JS key at the top level.

Understanding these limitations is key to developing a robust solution. The fix needs to be much more thorough, looking deeper into the PDF structure and considering a wider range of potentially dangerous features.
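The action-chain point is easier to see with a small example. This is a sketch only: actions are modeled as plain maps, /Next holds either one action or a list of actions, and ActionChainDemo, isChainSafe, and action are hypothetical names. The dangerous-type list is taken from the bullets above and is not meant to be complete.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model: an action is a Map; /Next is one action or a List of actions.
final class ActionChainDemo {

    // Action types treated as dangerous in this sketch (assumption, not exhaustive).
    static final Set<String> DANGEROUS_TYPES =
            Set.of("JavaScript", "Launch", "SubmitForm", "ImportData");

    static boolean isChainSafe(Object action) {
        // Identity-based set so a chain that loops back on itself terminates.
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        return isChainSafe(action, seen);
    }

    private static boolean isChainSafe(Object action, Set<Object> seen) {
        if (action instanceof List<?>) {
            for (Object item : (List<?>) action) {
                if (!isChainSafe(item, seen)) {
                    return false;
                }
            }
            return true;
        }
        if (!(action instanceof Map<?, ?>) || !seen.add(action)) {
            return true; // not an action dictionary, or already checked
        }
        Map<?, ?> dict = (Map<?, ?>) action;
        Object type = dict.get("S");
        if (dict.containsKey("JS")
                || (type instanceof String && DANGEROUS_TYPES.contains(type))) {
            return false;
        }
        // Keep following the chain instead of stopping at the first action.
        return isChainSafe(dict.get("Next"), seen);
    }

    // Convenience builder for the examples: one action with an optional successor.
    static Map<String, Object> action(String type, Object next) {
        Map<String, Object> dict = new LinkedHashMap<>();
        dict.put("S", type);
        if (next != null) {
            dict.put("Next", next);
        }
        return dict;
    }

    public static void main(String[] args) {
        Map<String, Object> script = action("JavaScript", null);
        Map<String, Object> harmlessLooking = action("GoTo", script);
        System.out.println(isChainSafe(harmlessLooking));      // false: script is one hop away
        System.out.println(isChainSafe(action("GoTo", null))); // true: nothing chained
    }
}
```

A check that only looked at the first action would see a plain GoTo and wave it through; following /Next is what exposes the JavaScript action behind it.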

How to Fix This: Recommendations for a Stronger Defense

Now that we've understood the problem and its root cause, let's talk about solutions. The good news is that the developers have provided clear recommendations for how to patch this vulnerability, focusing on making the security checks much more comprehensive.

1. Implement Recursive Detection for Deeper Inspection

The most crucial fix is to move beyond shallow checks and implement a recursive inspection mechanism. This means the validation function needs to be able to dive into nested dictionaries, arrays, and streams to find potentially malicious content, no matter how deeply it's hidden.

Enhanced Detection Function

Instead of just looping through top-level objects, the new approach involves a function that can recursively traverse the entire PDF object structure. This function, let's call it isSafeObject, would take a PDF object and a set of already visited objects (to prevent infinite loops) and analyze it. If the object is a dictionary, array, or stream, it calls specialized functions to check those types.

/**
 * Comprehensive PDF safety check with recursive inspection
 * @param pdfObjects List of PDF objects to validate
 * @return true if safe, false if malicious content detected
 */
private static boolean isSafe(List<COSObject> pdfObjects) {
    for (COSObject object : pdfObjects) {
        if (!isSafeObject(object.getObject(), new HashSet<>())) {
            return false;
        }
    }
    return true;
}

/**
 * Recursively inspect PDF object for malicious content
 * @param base PDF object to inspect
 * @param visited Set of visited objects to prevent infinite loops
 * @return true if safe, false if malicious
 */
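private static boolean isSafeObject(COSBase base, Set<COSBase> visited) {
    // Resolve indirect references (e.g. "4 0 R") to the object they point to
    if (base instanceof COSObject) {
        base = ((COSObject) base).getObject();
    }
    // Nothing to inspect, or already inspected (prevents infinite recursion)
    if (base == null || !visited.add(base)) {
        return true;
    }

    // COSStream extends COSDictionary in PDFBox, so it has to be tested first;
    // otherwise the stream branch would never be reached
    if (base instanceof COSStream) {
        return isSafeStream((COSStream) base, visited);
    } else if (base instanceof COSDictionary) {
        return isSafeDictionary((COSDictionary) base, visited);
    } else if (base instanceof COSArray) {
        return isSafeArray((COSArray) base, visited);
    }

    return true; // Simple types (names, numbers, strings) have no nested content
}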

This isSafeObject function acts as the central router, delegating the actual inspection to type-specific methods.
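The visited set deserves a quick illustration, because PDF object graphs are cyclic by design: a page points to its /Parent, and the parent points back through /Kids. Here's a small sketch with plain maps and lists standing in for PDF objects (CycleSafeWalkDemo and pageTree are hypothetical names). It uses an identity-based set, which tracks "this exact object" rather than "an equal object". Whether a plain HashSet behaves the same way for PDFBox's COS classes depends on how they implement equals and hashCode in your version, so that's worth checking; the identity set sidesteps the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of a PDF object graph using maps (dictionaries) and lists (arrays).
final class CycleSafeWalkDemo {

    static boolean isSafe(Object root) {
        Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        return walk(root, visited);
    }

    private static boolean walk(Object node, Set<Object> visited) {
        if (node instanceof Map<?, ?>) {
            if (!visited.add(node)) {
                return true; // already inspected via another path
            }
            Map<?, ?> dict = (Map<?, ?>) node;
            if (dict.containsKey("JS")) {
                return false;
            }
            for (Object value : dict.values()) {
                if (!walk(value, visited)) {
                    return false;
                }
            }
        } else if (node instanceof List<?>) {
            if (!visited.add(node)) {
                return true;
            }
            for (Object item : (List<?>) node) {
                if (!walk(item, visited)) {
                    return false;
                }
            }
        }
        return true; // scalars have nothing nested to inspect
    }

    // Builds /Pages -> /Kids -> /Page -> /Parent -> /Pages, optionally with /AA/O/JS.
    static Map<String, Object> pageTree(boolean withScript) {
        Map<String, Object> pages = new LinkedHashMap<>();
        Map<String, Object> page = new LinkedHashMap<>();
        List<Object> kids = new ArrayList<>();
        kids.add(page);
        pages.put("Type", "Pages");
        pages.put("Kids", kids);
        page.put("Type", "Page");
        page.put("Parent", pages); // back-reference creates the cycle
        if (withScript) {
            Map<String, Object> pageOpenAction = new LinkedHashMap<>();
            pageOpenAction.put("JS", "app.alert(\"XSS\")");
            Map<String, Object> additionalActions = new LinkedHashMap<>();
            additionalActions.put("O", pageOpenAction);
            page.put("AA", additionalActions);
        }
        return pages;
    }

    public static void main(String[] args) {
        System.out.println(isSafe(pageTree(false))); // true, and it terminates despite the cycle
        System.out.println(isSafe(pageTree(true)));  // false, nested /JS is found
    }
}
```

Without the visited set, the /Parent and /Kids references would send the walk around in circles until the stack overflows.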

Dictionary Inspection (Going Deeper)

When the system encounters a dictionary, it needs to do more than just check for /JS. It should look for a wider range of dangerous keywords and also recursively check all the values within that dictionary. This ensures that even if /JS isn't directly present, a dangerous action or script could be hiding within its contents.

private static boolean isSafeDictionary(COSDictionary dict, Set<COSBase> visited) {
    // Define dangerous keywords - expanded list
    String[] dangerousKeys = {
        "JS", "JavaScript",           // Direct JavaScript
        "AA", "OpenAction",            // Action triggers
        "Launch", "SubmitForm",        // External interactions
        "ImportData", "GoToR",         // Remote references
        "Sound", "Movie",              // Media with potential exploits
        "RichMedia", "3D"              // Complex embedded content
    };

    // Check for dangerous keys at current level
    for (String key : dangerousKeys) {
        COSBase value = dict.getDictionaryObject(COSName.getPDFName(key));
        if (value != null) {
            // Specific checks for known dangerous types
            if (key.equals("JS") || key.equals("JavaScript")) {
                // Found direct JavaScript, clearly unsafe
                return false;
            }
            // For action-related keys, we need to inspect the action chain
            if (!isSafeAction(value, visited)) {
                return false;
            }
        }
    }

    // IMPORTANT: Recursively check all values in the dictionary
    // This ensures we don't miss nested dangerous items
    for (COSBase value : dict.getValues()) {
        if (!isSafeObject(value, visited)) {
            return false;
        }
    }

    return true; // If no dangers found after thorough check
}

This updated dictionary check now includes a broader list of potentially harmful keys and, crucially, iterates through all values within the dictionary, ensuring that any nested elements are also inspected recursively.

Action Chain Validation (Following the Path)

PDFs can link actions together. An OpenAction might trigger a Next action, which then executes JavaScript. The new validation needs to follow these chains. The isSafeAction function is designed for this:

private static boolean isSafeAction(COSBase action, Set<COSBase> visited) {
    // If it's not a dictionary, it's likely not an action structure we need to parse further
    if (!(action instanceof COSDictionary)) {
        return true;
    }

    COSDictionary actionDict = (COSDictionary) action;

    // Check the action type (e.g., /S /JavaScript)
    COSBase actionType = actionDict.getDictionaryObject(COSName.S);
    if (actionType instanceof COSName) {
        String type = ((COSName) actionType).getName();

        // Blacklist dangerous action types explicitly
        String[] dangerousActions = {
            "JavaScript", "Launch", "SubmitForm",
            "ImportData", "GoToR", "Sound", "Movie"
        };

        for (String dangerous : dangerousActions) {
            if (type.equals(dangerous)) {
                return false; // Found a dangerous action type
            }
        }
    }

    // Explicitly check for JavaScript embedded directly in the action dictionary
    if (actionDict.getDictionaryObject(COSName.JS) != null) {
        return false;
    }

    // Crucially, follow the action chain using the /Next key
    COSBase next = actionDict.getDictionaryObject(COSName.getPDFName("Next"));
    if (next != null && !isSafeAction(next, visited)) {
        return false; // If the next action in the chain is unsafe, the whole chain is unsafe
    }

    // Also, recursively check all other properties of the action dictionary
    return isSafeObject(actionDict, visited);
}

This function checks the action type, looks for direct JavaScript, and importantly, follows the /Next pointers to ensure no malicious actions are hidden further down the line. It also ensures that the action dictionary itself is safe by calling isSafeObject recursively.

Array and Stream Inspection

PDFs can also contain arrays and streams, which might embed executable code or other complex structures. These also need to be checked:

private static boolean isSafeArray(COSArray array, Set<COSBase> visited) {
    for (COSBase item : array) {
        // Recursively check each item in the array
        if (!isSafeObject(item, visited)) {
            return false;
        }
    }
    return true;
}

private static boolean isSafeStream(COSStream stream, Set<COSBase> visited) {
    // First, check the stream's dictionary for dangerous keys
    if (!isSafeDictionary(stream, visited)) {
        return false;
    }

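    // Optional but recommended: scan the stream's decoded content.
    // Assumes PDFBox's COSStream.createInputStream() and org.apache.pdfbox.io.IOUtils;
    // also needs java.io.InputStream and java.nio.charset.StandardCharsets.
    try (InputStream in = stream.createInputStream()) {
        // ISO-8859-1 maps every byte to one char, independent of the platform charset
        String content = new String(IOUtils.toByteArray(in), StandardCharsets.ISO_8859_1);
        if (containsObfuscatedJS(content)) {
            return false;
        }
    } catch (IOException e) {
        // If we can't read the stream, it's safer to treat it as suspicious
        // Log this error appropriately.
        return false;
    }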

    return true;
}

These functions ensure that elements within arrays and the content of streams are also scrutinized, preventing attacks that might hide within these structures.

2. Implement Additional Security Measures

Beyond the core recursive checking, there are other layers of defense that can be implemented to catch more sophisticated attacks.

Content Analysis for Obfuscated JavaScript

Attackers often try to hide their JavaScript code by obfuscating it, making it harder for simple string checks to detect. This involves techniques like using character codes, encoding, or complex function constructions. A function like containsObfuscatedJS can look for common patterns associated with obfuscated JavaScript:

/**
 * Detect obfuscated JavaScript patterns using regular expressions
 */
private static boolean containsObfuscatedJS(String content) {
    // A list of common patterns used in obfuscated JavaScript
    String[] jsPatterns = {
        "eval\\s*\\(",                    // eval() function, often used for dynamic code execution
        "Function\\s*\\(",                // Function constructor, another way to execute arbitrary code
        "app\\.(alert|launchURL)",        // Common Adobe JavaScript API calls that can be exploited
        "this\\.exportDataObject",        // Method used for data exfiltration
        "util\\.printf",                  // Potential for format string vulnerabilities
        "\\\\u[0-9a-fA-F]{4}",           // Unicode escapes, a common obfuscation technique
        "String\\.fromCharCode"
        // Add more patterns as needed for more comprehensive detection
    };

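    for (String pattern : jsPatterns) {
        // find() searches anywhere in the content, including across line breaks.
        // String.matches(".*" + pattern + ".*") would miss multi-line input because
        // '.' does not match line terminators by default (needs java.util.regex.Pattern).
        if (Pattern.compile(pattern).matcher(content).find()) {
            return true; // Obfuscated JS pattern found
        }
    }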

    return false; // No suspicious patterns detected
}

This function uses regular expressions to scan the content for known obfuscation techniques. While not foolproof, it significantly increases the chances of catching cleverly hidden scripts.
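One detail to watch when applying patterns like these to stream content: String.matches has to match the entire input, and by default `.` does not cross line breaks, so a `.*pattern.*` check can silently miss a script that spans several lines. A quick sketch of the difference (RegexScanDemo, viaMatches, and viaFind are illustrative names only):

```java
import java.util.regex.Pattern;

final class RegexScanDemo {

    private static final Pattern EVAL_CALL = Pattern.compile("eval\\s*\\(");

    // Whole-string match: '.' stops at line terminators unless DOTALL is set.
    static boolean viaMatches(String content) {
        return content.matches(".*eval\\s*\\(.*");
    }

    // Substring search: looks for the pattern anywhere in the input.
    static boolean viaFind(String content) {
        return EVAL_CALL.matcher(content).find();
    }

    public static void main(String[] args) {
        String multiLine = "var a = 1;\neval(payload);\n";
        System.out.println(viaMatches(multiLine)); // false: the newline blocks ".*"
        System.out.println(viaFind(multiLine));    // true
    }
}
```

Since decoded PDF streams are rarely a single line, Matcher.find() (or a pattern compiled with Pattern.DOTALL) is the safer way to run this kind of scan.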

Enhanced Logging and Monitoring

When a potential threat is detected, it's not enough to just reject the file. Effective logging and monitoring are crucial for understanding attack attempts, debugging issues, and improving defenses over time. The isSafeWithLogging function demonstrates this:

/**
 * Enhanced detection that also logs identified threats
 * @param pdfObjects List of PDF objects to validate
 * @return true if safe, false if malicious content detected
 */
private static boolean isSafeWithLogging(List<COSObject> pdfObjects) {
    List<String> threats = new ArrayList<>(); // To store details of detected threats

    // Start the recursive threat detection process
    for (COSObject object : pdfObjects) {
        // The detectThreats function will populate the 'threats' list
        detectThreats(object.getObject(), "", threats, new HashSet<>());
    }

    // If any threats were found, log them and return false
    if (!threats.isEmpty()) {
        // Assuming 'logger' is a properly configured logger instance
        // logger.warn("PDF Security Threats Detected:");
        for (String threat : threats) {
            // logger.warn("  - " + threat);
        }
        return false; // File is deemed unsafe
    }

    return true; // No threats found, file is safe
}

/**
 * Recursively detects and records threats within PDF objects.
 * @param base The current PDF object being inspected.
 * @param path The path/key leading to this object from the root.
 * @param threats A list to accumulate descriptions of detected threats.
 * @param visited A set to keep track of visited objects to prevent infinite loops.
 */
private static void detectThreats(COSBase base, String path, List<String> threats, Set<COSBase> visited) {
    // Prevent infinite loops by checking if we've already processed this object
    if (visited.contains(base)) {
        return;
    }
    visited.add(base); // Mark this object as visited

    if (base instanceof COSDictionary) {
        COSDictionary dict = (COSDictionary) base;

        // Check for specific dangerous keys directly within the dictionary
        if (dict.getDictionaryObject(COSName.JS) != null) {
            threats.add("JavaScript found at: " + path + "/JS");
        }
        // Check for Additional Actions, a common place to hide exploits
        if (dict.getDictionaryObject(COSName.getPDFName("AA")) != null) {
            threats.add("Additional Actions found at: " + path + "/AA");
        }

        // Recurse through all key-value pairs in the dictionary
        // This is essential for finding nested threats
        for (COSName key : dict.keySet()) {
            detectThreats(dict.getDictionaryObject(key), // Recursively call for the value
                         path + "/" + key.getName(), // Update the path
                         threats, visited); // Pass along the threats list and visited set
        }
    } else if (base instanceof COSArray) {
        // If it's an array, iterate through its elements and recurse
        COSArray array = (COSArray) base;
        for (int i = 0; i < array.size(); i++) {
            detectThreats(array.getObject(i),
                         path + "[" + i + "]", // Path includes array index
                         threats, visited);
        }
    }
    // Note: in PDFBox, COSStream extends COSDictionary, so a stream's dictionary
    // entries are already covered by the first branch above. A separate COSStream
    // branch placed here would never be reached.
    // Optionally, add stream content analysis (similar to containsObfuscatedJS)
    // inside the dictionary branch when the object is a COSStream.
}

This enhanced function not only identifies threats but also logs detailed information about where they were found. This is invaluable for incident response and security auditing. By implementing these recursive checks, broader keyword detection, action chain analysis, and content scanning, PublicCMS can significantly strengthen its defenses against malicious PDF uploads.

Conclusion: Staying Vigilant with File Uploads

This vulnerability in PublicCMS serves as a potent reminder that securing file uploads is a critical aspect of web application security. The bypass takes advantage of the complexity of the PDF format and a superficial security check to get script execution when the uploaded file is viewed. By understanding the root cause (the shallow inspection of PDF objects) and implementing the recommended fixes, developers can build more robust defenses. The key takeaways are the necessity of deep, recursive inspection of file structures, a comprehensive blacklist of dangerous keywords and actions, and proactive content analysis.

For those using PublicCMS, ensuring you're updated to a version that addresses this vulnerability is paramount. For developers building similar systems, remember that sanitizing and validating any user-uploaded content, especially complex file types like PDFs, requires a thorough and layered approach. Never trust user input, and always assume that attackers will find the easiest path to exploit any weakness.

If you want to learn more about general web security best practices and common vulnerabilities, I highly recommend checking out resources like the OWASP Foundation (https://owasp.org/). They provide a wealth of information, guides, and tools to help developers build more secure applications.
