har-sanitizer's Introduction

HAR Sanitizer

https://har-sanitizer.pages.dev/

Development

To start a hot-reload server at localhost:3001:

npm run dev

Example request for the API:

jq '{"har": .}' < /Users/jroyal/Downloads/test_har_sanitizer.har  | curl -X POST -H "Content-Type: application/json" --data @- localhost:3001/scrub > scrubbed.har
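The same request can be sketched in JavaScript. This is a minimal sketch, assuming Node 18+ (global fetch) and that the dev server accepts the same {"har": ...} JSON body as the curl command above. buildScrubRequest and scrubHar are hypothetical helper names, not part of the project.

```javascript
// Hypothetical helper: wraps a parsed HAR object the same way the jq filter
// '{"har": .}' does, and prepares the POST options for /scrub.
function buildScrubRequest(har, baseUrl = "http://localhost:3001") {
  return {
    url: `${baseUrl}/scrub`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ har }),
    },
  };
}

// Hypothetical helper: sends the request and returns the response body as text.
// The dev server (npm run dev) must be running for this call to succeed.
async function scrubHar(har, baseUrl) {
  const { url, init } = buildScrubRequest(har, baseUrl);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`scrub failed with HTTP ${response.status}`);
  }
  return response.text();
}
```

Reading the HAR file with fs.readFileSync, passing JSON.parse of its contents to scrubHar, and writing the result to scrubbed.har would mirror the shell pipeline.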

har-sanitizer's People

Contributors

247arjun, benpbolton, jroyal, qainsights, techbrunch, third774

har-sanitizer's Issues

Dockerfile - [ERROR] Error: Could not proxy request: TypeError: fetch failed

I am trying to dockerize the sanitizer. Here is my Dockerfile.

FROM ubuntu:latest

# Install required packages
RUN apt-get update && \
    apt-get install -y ca-certificates curl gnupg git && \
    mkdir -p /etc/apt/keyrings && \
    curl -fsSL https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key | gpg --dearmor -o /etc/apt/keyrings/nodesource.gpg

ENV NODE_MAJOR=20
RUN echo "deb [signed-by=/etc/apt/keyrings/nodesource.gpg] https://deb.nodesource.com/node_$NODE_MAJOR.x nodistro main" | tee /etc/apt/sources.list.d/nodesource.list

RUN apt-get update && \
    apt-get install nodejs -y

# Clone the har-sanitizer repository
RUN git clone https://github.com/cloudflare/har-sanitizer.git

# Change directory to the "har-sanitizer" directory
WORKDIR /har-sanitizer

# Install project dependencies
RUN npm install

EXPOSE 3001

CMD [ "npm", "run", "dev:wrangler"]

I am getting the error below for /, but /scrub works.

[ERROR] Error: Could not proxy request: TypeError: fetch failed

Any insights please?

Domain not directly associated with Cloudflare reduces trust

I appreciate your work on this project and it's really useful.

When I googled for Cloudflare HAR sanitizer I found the domain: https://har-sanitizer.pages.dev/

However, I was hesitant to use the page, as anyone could have made that website and put the Cloudflare name and logo on it.
I could only verify that it's actually a Cloudflare site by going to your blog post and following the link from there.

To improve trust in the tool, it would be useful to host it on a Cloudflare domain, e.g. https://har-sanitizer.cloudflare.com/

Request and Response Data is not equally sanitized

When sanitizing content with the application/json MIME type in a HAR file, the response body is properly sanitized.
However, a request body containing JSON is not sanitized. This is problematic if sensitive data, such as authentication credentials, is included in a JSON request body.
[Screenshot, 2024-01-22 1:55:57 PM]
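One way to treat request bodies like response bodies is to parse the JSON text and redact by key. The following is a minimal sketch, not the sanitizer's actual implementation: redactKeys and redactJsonPostData are hypothetical helpers, and the "[name redacted]" placeholder format is borrowed from the replacement strings quoted elsewhere in these issues.

```javascript
// Hypothetical helper: recursively replaces the values of the given keys.
function redactKeys(value, keys) {
  if (Array.isArray(value)) {
    return value.map((item) => redactKeys(item, keys));
  }
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) =>
        keys.has(k) ? [k, `[${k} redacted]`] : [k, redactKeys(v, keys)],
      ),
    );
  }
  return value;
}

// Hypothetical helper: redacts a JSON request body in a single HAR entry.
// Entries without a parseable application/json body are returned unchanged.
function redactJsonPostData(entry, keys) {
  const postData = entry.request && entry.request.postData;
  const isJson =
    postData &&
    typeof postData.text === "string" &&
    /^application\/json\b/.test(postData.mimeType || "");
  if (!isJson) return entry;

  let parsed;
  try {
    parsed = JSON.parse(postData.text);
  } catch {
    return entry; // body is not valid JSON; leave it untouched
  }
  const text = JSON.stringify(redactKeys(parsed, keys));
  return { ...entry, request: { ...entry.request, postData: { ...postData, text } } };
}
```

For example, calling redactJsonPostData(entry, new Set(["password"])) on an entry whose postData.text is '{"username":"alice","password":"hunter2"}' leaves username intact and replaces the password value.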

Query params not properly sanitized from urls located within request/response bodies.

As the title states, there is an edge case for the regex that covers [name]=[value] in which it doesn't hit a final query param in a JSON formatted request/response body. I believe this issue occurs because the given delimiters, (&|\\\\",|",|"\\s|"}}|;){1}, don't cover this case.

For an example, I've selected X-Amz-Signature to be sanitized.
[Screenshot, 2024-07-11 4:13:55 PM]

Seen here, the X-Amz-Signature is located within the response content and it was not properly sanitized.

[Screenshot, 2024-07-11 4:17:10 PM]

A possible solution I've come up with to fix this issue is to change the [name]=[value] regex to (?=&|\\\\",|",|"\\s|"}}|;|\\\\"){1}. Two things have been added: a positive lookahead to prevent accidental consuming of the & character and an additional delimiter \\\\" which will allow the sanitizer to be properly bounded.
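The difference between the two delimiter groups can be sketched as follows. This is an illustration under assumptions: VALUE is a simplified stand-in for the sanitizer's real value character class, buildOriginal and buildProposed are hypothetical names, the sample body is made up, and the redundant {1} quantifier is omitted from the lookahead.

```javascript
// Simplified stand-in for the sanitizer's value character class (an assumption).
const VALUE = "[\\w%.~+-]+?";

// Delimiters as quoted in the issue: the delimiter is consumed as group 3.
function buildOriginal(name) {
  return new RegExp(`(${name}=)(${VALUE})(&|\\\\",|",|"\\s|"}}|;){1}`, "g");
}

// Proposed form: a lookahead that consumes nothing, plus the extra \" delimiter.
function buildProposed(name) {
  return new RegExp(`(${name}=)(${VALUE})(?=&|\\\\",|",|"\\s|"}}|;|\\\\")`, "g");
}

// A JSON body embedded in a HAR "text" field, as it appears in the raw file.
// The final query param is followed by \" and then }, which no original
// delimiter covers.
const body = String.raw`"text": "{\"url\":\"https://example.com/file?X-Amz-Date=20240711&X-Amz-Signature=abc123\"}"`;

const withOriginal = body.replace(buildOriginal("X-Amz-Signature"), "$1[redacted]$3");
const withProposed = body.replace(buildProposed("X-Amz-Signature"), "$1[redacted]");

console.log(withOriginal === body); // the original pattern leaves the signature in place
console.log(withProposed);
```

Because the lookahead does not consume the delimiter, the replacement string no longer needs a $3 reference to put it back.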

HAR produced after sanitization is no longer valid json

This results in an error like SyntaxError: Expected ',' or ']' after array element in JSON at position 676059 (line 8188 column 9) when I try to download.

Here is a snippet of a request from my file that reproduces the problem. I've replaced some of the values manually so I could post it here, and stringified it for testing.

	let snippet = JSON.stringify(
		{
			request: {
				method: "POST",
				url: "https://www.example.com",
				queryString: [],
				cookies: [
					{
						name: "TS0143f862",
						value:
							"12345678900000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
						path: "/",
						domain: "www.example.net",
						expires: "1969-12-31T23:59:59.000Z",
						httpOnly: false,
						secure: false,
					},
					{
						name: "TS01ae66c9",
						value:
							"12345678900000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
						path: "/",
						domain: "www.example.net",
						expires: "1969-12-31T23:59:59.000Z",
						httpOnly: false,
						secure: false,
					},
				],
				headersSize: 1862,
				bodySize: 86,
				postData: {
					mimeType: "application/json",
					text: '{"pageIndex":0,"pageSize":10,"sortByColumn":"Date","sortByColumnDesc":true,"filter":1}',
				},
			},
		},
		null,
		2,
	);

From my original file I've selected just cookie TS0143f862 to be sanitized.

I've narrowed the issue down to wordSpecificScrubList[1] (from const wordSpecificScrubList = wordList.map((word) => buildRegex(word));), which for my original file was:

{
  regex: /("name": "TS0143f862",[\s\w+:"-\%!*()`~'.,#]*?"value": ")([\w+-_:&\+=#~/$()\.\,\*\!|%"'\s;{}]+?)("[\s]+){1}/g,
  replacement: "$1[TS0143f862 redacted]$3"
}

Calling .replace(regex, replacement) on the provided snippet outputs the string below. Note that the cookies array is no longer closed.

 "request": {
          "method": "POST",
          "url": "https://www.example.com",
          "queryString": [],
          "cookies": [
            {
              "name": "TS0143f862",
              "value": "[TS0143f862 redacted]"
          }
        }
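For comparison, redacting the cookie on the parsed structure rather than on the serialized text cannot unbalance brackets, because the output is produced by JSON.stringify. The following is a minimal sketch of that alternative, not the project's approach; redactCookies is a hypothetical helper and the sample input is a shortened version of the snippet above.

```javascript
// Hypothetical helper: walks parsed HAR data and redacts the value of any
// cookie whose name is in `names`, wherever a "cookies" array appears.
function redactCookies(node, names) {
  if (Array.isArray(node)) {
    return node.map((item) => redactCookies(item, names));
  }
  if (node === null || typeof node !== "object") {
    return node;
  }
  const out = {};
  for (const [key, value] of Object.entries(node)) {
    if (key === "cookies" && Array.isArray(value)) {
      out[key] = value.map((cookie) =>
        cookie && names.has(cookie.name)
          ? { ...cookie, value: `[${cookie.name} redacted]` }
          : cookie,
      );
    } else {
      out[key] = redactCookies(value, names);
    }
  }
  return out;
}

// Shortened version of the request from the issue.
const input = JSON.stringify(
  {
    request: {
      cookies: [
        { name: "TS0143f862", value: "1234567890", path: "/" },
        { name: "TS01ae66c9", value: "1234567890", path: "/" },
      ],
      postData: { mimeType: "application/json", text: '{"pageIndex":0}' },
    },
  },
  null,
  2,
);

const output = JSON.stringify(
  redactCookies(JSON.parse(input), new Set(["TS0143f862"])),
  null,
  2,
);
```

Here output still parses as JSON, the second cookie and postData are untouched, and only the selected cookie's value is replaced.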

Unable to sanitize mbox cookie

When trying to sanitize a HAR file generated from Microsoft Entra, it fails to sanitize mbox cookies. It presents an error message similar to:

SyntaxError: Expected ',' or ']' after array element in JSON at position ******* (line ***** column ******)

Send scrub options in curl

How can we include scrub options when using curl? I do not see any network calls when I customize the options.

Got error

I get an error when downloading the HAR file after importing it and selecting the elements to sanitize:

SyntaxError: Expected ',' or '}' after property value in JSON at position 51809 (line 1196 column 43)

Question: Why did Cloudflare take the regex approach?

I think there is likely a good reason for using regex to grab items across boundaries, but I can't figure out what it is. Any chance you can share why this architectural approach was taken? My guess is that this decision is what leads to issues like #22.

Thanks!

Options to redact fields in POST bodies

I could be mistaken, but it appears the HAR sanitizer doesn't redact any fields submitted via a POST body.

For example, I would expect an option to redact the password field as included here:

         "postData": {
            "mimeType": "application/x-www-form-urlencoded",
            "text": "password=fake-password&submit=",
            "params": [
              {
                "name": "password",
                "value": "fake-password"
              },
              {
                "name": "submit",
                "value": ""
              }
            ]
          }
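What such an option might do can be sketched as follows. This is a minimal sketch under assumptions, not an existing feature: redactFormPostData is a hypothetical helper, the "REDACTED" placeholder is arbitrary, and URLSearchParams is used to rewrite the text field so that it stays consistent with the params array.

```javascript
// Hypothetical helper: redacts named fields in a form-urlencoded postData
// object, updating both the raw "text" and the parsed "params" list.
function redactFormPostData(postData, names, placeholder = "REDACTED") {
  const mimeType = (postData && postData.mimeType) || "";
  if (!mimeType.startsWith("application/x-www-form-urlencoded")) {
    return postData; // not a form body; leave it unchanged
  }
  const result = { ...postData };
  if (typeof postData.text === "string") {
    const form = new URLSearchParams(postData.text);
    for (const name of names) {
      // set() keeps the first occurrence and drops any duplicates of the name.
      if (form.has(name)) form.set(name, placeholder);
    }
    result.text = form.toString();
  }
  if (Array.isArray(postData.params)) {
    result.params = postData.params.map((param) =>
      names.has(param.name) ? { ...param, value: placeholder } : param,
    );
  }
  return result;
}

const redacted = redactFormPostData(
  {
    mimeType: "application/x-www-form-urlencoded",
    text: "password=fake-password&submit=",
    params: [
      { name: "password", value: "fake-password" },
      { name: "submit", value: "" },
    ],
  },
  new Set(["password"]),
);
console.log(redacted.text);
```

One trade-off of this design is that re-serializing through URLSearchParams can normalize the percent-encoding of fields that were not redacted.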

JSON could not be parsed

I exported a .har file from Firefox 118.0.2, and while the file was correctly analysed for metadata, downloading it fails with:

SyntaxError: JSON.parse: expected ',' or ']' after array element at line 274 column 24 of the JSON data

The input HAR is valid JSON, so something goes wrong during the transformation. What could be wrong? Ironically, I can't provide the file here because it contains sensitive information.
