ryanmarcus / dirty-json
A parser for invalid JSON
License: GNU Affero General Public License v3.0
Hi
I was using v0.7, and this was in my unit tests
json: 'id: \"test\"\nlang: \"en\"\nresult {\n source: \"agent\"\n}',
expect: `{"id":"test","lang":"en","result": { "source":"agent"}}`
After upgrading to 0.8 or 0.9, this is broken and it returns the JSON below:
expect: `{"id":"test","lang":"en","result ": { "source":"agent"}}`
There is an extra white space after result
I have an environment where I cannot easily use npm, is there a way to use your library without it?
Awesome script.
I have a rather large chunk of JSON I'm trying to clean. When dirty-json hits an error I get "Error: Got a :value that can't be handled" or similar, and a stack trace that isn't very helpful.
It would be really handy if I could see a decent snippet of the offending JSON so I can manually try to fix it before re-running.
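One way to get that context today, without changes to the library, might be to wrap the call and pull the position out of the error message. This is only a sketch: parseWithContext is a hypothetical helper, not part of dirty-json, and it assumes the message contains "at line <line>:<column>" with a zero-based line index, as in the errors quoted in other reports on this page. When the position is missing or negative (for example -1:-1), no snippet is attached.

```javascript
// Hypothetical helper, not part of dirty-json.
// `parse` is any function that throws an Error on bad input,
// e.g. require('dirty-json').parse.
function parseWithContext(parse, text, radius = 40) {
  try {
    return parse(text);
  } catch (err) {
    // Assumed message format: "... at line <line>:<column>".
    const match = /at line (-?\d+):(-?\d+)/.exec(String(err.message));
    if (match) {
      const line = Number(match[1]);
      const column = Number(match[2]);
      const lines = text.split('\n');
      if (line >= 0 && line < lines.length && column >= 0) {
        const start = Math.max(0, column - radius);
        // Attach the text around the reported column for manual inspection.
        err.snippet = lines[line].slice(start, column + radius);
      }
    }
    throw err;
  }
}
```

The caller can then log err.snippet next to err.message before fixing the input by hand and re-running.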
{
sku: {
"price":"",
"retailPrice":"",
"canBookCount":"79779387",
"saleCount":"222153",
"priceRange":[[1, 1.05], [50, 0.95], [10000000, 0.55]],
"priceRangeOriginal":[[1, 1.05], [50, 0.95], [10000000, 0.55]],
"skuProps":[{"prop":"颜色","value":[{"imageUrl":"https://cbu01.alicdn.com/img/ibank/2018/982/811/9109118289_987926572.jpg","name":"S码黄色"},{"imageUrl":"https://cbu01.alicdn.com/img/ibank/2018/640/055/9068550046_987926572.jpg","name":"M码黄色"},{"imageUrl":"https://cbu01.alicdn.com/img/ibank/2018/973/925/9068529379_987926572.jpg","name":"L码黄色"},{"imageUrl":"https://cbu01.alicdn.com/img/ibank/2018/467/301/9109103764_987926572.jpg","name":"XL码黄色"},{"imageUrl":"https://cbu01.alicdn.com/img/ibank/2018/681/082/9090280186_987926572.jpg","name":"S码橘色"},{"imageUrl":"https://cbu01.alicdn.com/img/ibank/2018/446/715/9068517644_987926572.jpg","name":"M码橘色"},{"imageUrl":"https://cbu01.alicdn.com/img/ibank/2018/161/445/9068544161_987926572.jpg","name":"L码橘色"},{"imageUrl":"https://cbu01.alicdn.com/img/ibank/2018/401/631/9109136104_987926572.jpg","name":"XL码橘色"}]}],
"skuMap":{"S码黄色":{"specId":"a40a20fc34fbcb0424f31fc752744e8f","price":"0.96","saleCount":27847,"discountPrice":"0.96","canBookCount":9972150,"skuId":3531833413668},"M码黄色":{"specId":"9f62f460d40c76020bd2988c6c9a5e04","price":"0.96","saleCount":49396,"discountPrice":"0.96","canBookCount":9952303,"skuId":3531833413669},"L码黄色":{"specId":"a49f562f02242ed2792065cc31993c1c","price":"0.96","saleCount":64712,"discountPrice":"0.96","canBookCount":9935236,"skuId":3531833413670},"XL码黄色":{"specId":"cc20ec377117159c9e041f4fcae1d466","price":"0.96","saleCount":27314,"discountPrice":"0.96","canBookCount":9972640,"skuId":3531833413671},"M码橘色":{"specId":"7dbd5012fdca6243605e8eb96cb84517","price":"0.96","saleCount":15784,"discountPrice":"0.96","canBookCount":9984215,"skuId":3531833413673},"S码橘色":{"specId":"75a788ff47d799a867b6f4bd259b5f50","price":"0.96","saleCount":8306,"discountPrice":"0.96","canBookCount":9991684,"skuId":3531833413672},"L码橘色":{"specId":"77e3772af66028fa531f01dbb15b3e90","price":"0.96","saleCount":15277,"discountPrice":"0.96","canBookCount":9984723,"skuId":3531833413674},"XL码橘色":{"specId":"25d602eea57e3318b9ce34349bc18cfc","price":"0.96","saleCount":13515,"discountPrice":"0.96","canBookCount":9986436,"skuId":3531833413675}},
},
"end": 0
}
Hello, I have a 5MB JSON file that sometimes comes with problems such as a missing closing bracket, missing commas, or an unclosed array. Your script seems to get into an infinite loop when run against this JSON file.
I made a small JSON file with the same problems as the big one, and your script did not return the expected result.
const r = dJSON.parse(`[{"id":2222,"username":"a","maxViewers":927,"userId":2222,"countryCode":"br","viewers":290,"broadcastTime":148,"broadcastType":"solo","goldPartyWinner":false,"accessLevel":"general","dailyAward":true,"monthlyAward":false,"adminVerified":true,"hdStream":false,"vrStream":false,"showType":"NORMAL","source":"desktop","privateRoom":false,"newPerformer":false,"blockedRegions":["604"],"languages":["pt","en","es"]`);
console.log(JSON.stringify(r));
Result: "["
https://github.com/RyanMarcus/dirty-json/issues
return console.warn("dirty-json got valid JSON that failed with the custom parser. We're returning the valid JSON, but please file a bug report here: https://github.com/RyanMarcus/dirty-json/issues -- the JSON that caused the failure was: " + e),
@RyanMarcus
many npm packages will minify their project with a build that can be used w/out using a require statement
Thank you for the great project, it helped enormously with loosening the requirements around quotes and such. That said, the project completely breaks all for/in loops because it adds peek and last methods to all arrays, and the loop thinks each one is an index. These need to be non-enumerable; I will submit a PR if necessary.
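A minimal sketch of the non-enumerable approach, using Object.defineProperty. The method names peek and last come from the report; their bodies here are guesses for illustration and may not match what dirty-json actually defines.

```javascript
// Illustrative method bodies (assumed): both return the final element.
const helpers = {
  peek() { return this[this.length - 1]; },
  last() { return this[this.length - 1]; },
};

for (const [name, fn] of Object.entries(helpers)) {
  // enumerable: false keeps the method out of for/in iteration.
  Object.defineProperty(Array.prototype, name, {
    value: fn,
    enumerable: false,
    writable: true,
    configurable: true,
  });
}

const keys = [];
for (const k in ['a', 'b']) keys.push(k);
console.log(keys); // [ '0', '1' ]
```

By contrast, a plain assignment such as Array.prototype.peek = ... creates an enumerable property, which is why it shows up as an extra key in for/in loops.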
For example, support parsing:
{"ss":[["Thu","7:00","Final",,"BAL","19","ATL","20",,,"56808",,"PRE4","2015"],["Thu","7:00","Final",,"NO","10","GB","38",,,"56809",,"PRE4","2015"]]}
https://stackoverflow.com/questions/6886935/parsing-malformed-json-with-javascript
in this test I just removed second quote on coolCSS
{ "key": "<div class="coolCSS>some text</div>" }
when parsing it exports this :
"{"
it should only escape like this
{ "key": "<div class=\"coolCSS>some text</div>" }
I have such a json:
{
"some": [a,b,c,],
"b": a
}
but it doesn't convert to
{
"some": ["a","b","c"],
"b": "a"
}
I have a small problem with a specific instance. I tried it in the web demo and it concatenates the words that are inside the unescaped quotes (see "claimReviewed").
Input:
{
"@context": "http://schema.org",
"@type": [
"Review",
"ClaimReview"
],
"datePublished": "2016-03-31",
"url": "http://www.politifact.com/north-carolina/statements/2016/mar/30/pat-mccrory/pat-mccrory-wrong-when-he-says-north-carolinas-new/",
"author": {
"@type": "Organization",
"url": "https://www.politifact.com" "twitter": "@politifact"
},
"claimReviewed": ""We have not taken away any rights that have currently existed in any city in North Carolina" with the passage of HB2.",
"claimReviewSiteLogo": "http://static.politifact.com/mediapage/jpgs/politifact-logo-big.jpg",
"reviewRating": {
"@type": "Rating",
"ratingValue": "4",
"bestRating": "6",
"text": "False",
"image": "https://s3.amazonaws.com/share-the-facts/rating_images/politifact/tom-false.jpg"
},
"itemReviewed": {
"@type": "CreativeWork",
"author": {
"@type": "Person",
"name": "Pat McCrory",
"title": "Governor of North Carolina",
"image": "http://static.politifact.com.s3.amazonaws.com/politifact%2Fmugs%2FMcCrory_mug.jpg",
"sameAs": []
},
"datePublished": "2016-03-28",
"sourceName": "A speech in Clayton, NC"
}
}
Output:
{
"@context": "http://schema.org",
"@type": [
"Review",
"ClaimReview"
],
"datePublished": "2016-03-31",
"url": "http://www.politifact.com/north-carolina/statements/2016/mar/30/pat-mccrory/pat-mccrory-wrong-when-he-says-north-carolinas-new/",
"author": {
"@type": "Organization",
"url": "https://www.politifact.com",
"twitter": "@politifact"
},
"claimReviewed": "\"WehavenottakenawayanyrightsthathavecurrentlyexistedinanycityinNorthCarolina\" with the passage of HB2.",
"claimReviewSiteLogo": "http://static.politifact.com/mediapage/jpgs/politifact-logo-big.jpg",
"reviewRating": {
"@type": "Rating",
"ratingValue": "4",
"bestRating": "6",
"text": "False",
"image": "https://s3.amazonaws.com/share-the-facts/rating_images/politifact/tom-false.jpg"
},
"itemReviewed": {
"@type": "CreativeWork",
"author": {
"@type": "Person",
"name": "Pat McCrory",
"title": "Governor of North Carolina",
"image": "http://static.politifact.com.s3.amazonaws.com/politifact%2Fmugs%2FMcCrory_mug.jpg",
"sameAs": []
},
"datePublished": "2016-03-28",
"sourceName": "A speech in Clayton, NC"
}
}
First of all: this package saved me tons of hours! 🥇
Hi
I have a string like this
{
prop1: 'val1',
prop2: {
prop3: 'val3' ,
messages: {something:'val'},
messages: { something:'val2', x : {x : 1, y : 5} }
}
, prop4: 'val4'
}
Obviously, messages is duplicated, and when I try to use the library it takes the last one, which is good for most cases.
I was wondering if it's possible to have an option for how to treat duplicates; possible values could be (TakeLast, TakeFirst, ConvertToArray).
For ConvertToArray, the output would be something like below:
{
prop1: 'val1',
prop2: {
prop3: 'val3' ,
messages: [
{something:'val'},
{ something:'val2', x : {x : 1, y : 5} }
]
}
, prop4: 'val4'
}
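To make the three proposed policies concrete, here is a rough sketch of how duplicate keys could be merged while an object is built from parsed key/value pairs. buildObject is a hypothetical function, not an existing dirty-json option, and the pair-list input is an assumption made so that the example is self-contained.

```javascript
// Hypothetical sketch; dirty-json does not expose this option.
// `pairs` is a list of [key, value] entries in source order.
function buildObject(pairs, policy = 'TakeLast') {
  const out = {};
  const seen = new Map(); // key -> number of times seen so far
  for (const [key, value] of pairs) {
    const count = seen.get(key) || 0;
    seen.set(key, count + 1);
    if (count === 0 || policy === 'TakeLast') {
      out[key] = value; // first occurrence, or overwrite with the latest
    } else if (policy === 'ConvertToArray') {
      if (count === 1) out[key] = [out[key], value]; // second occurrence starts the array
      else out[key].push(value); // later occurrences append
    }
    // 'TakeFirst' (and any unknown policy) keeps the existing value.
  }
  return out;
}

const pairs = [
  ['prop3', 'val3'],
  ['messages', { something: 'val' }],
  ['messages', { something: 'val2', x: { x: 1, y: 5 } }],
];
console.log(JSON.stringify(buildObject(pairs, 'ConvertToArray')));
```

Counting occurrences, rather than checking Array.isArray on the stored value, keeps a key whose first value is itself an array from being mistaken for an already-converted duplicate.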
Handle unquoted keys in lists. Should expand unit tests to cover these, and look at ambiguous unquoted keys (containing commas, for example) as well.
I will transfer a lot of dirty json, but my js skill is poor~
Hello,
I have a problem with a JSON string that the lib seems unable to fix:
{"title": "Colorful", "desc": ""Lorem ipsum dolor sit amet, consectetur adipiscin"}
It is because the desc value starts with a "
Any idea why the quote is not escaped when it is at the start, and how to fix it?
This is not supported as it seems to have issue with '{}' inside the quoted string.
{
"action": "This text is not supported as there it doesn't seem to like curly braces "${blah}" "${blahblah}"."
}
If I enclose the value in single quotes, it still doesn't seem to like it. It doesn't seem to handle the quote inside the word doesn't.
{
"action": 'This text is not supported as there it doesn't seem to like curly braces "${blah}" "${blahblah}".'
}
Input:
{
"content": "<img src="img.jpg""
}
Output:
{
"content": "<img src=\"img\"NaN\"jpg\""
}
Example:
{ "key": "value", "anotherkey": \"value"value\", "number": 45., "number2": nan }
When the server tried to parse an object containing a multidimensional array property, dirty-json sent out this message:
dirty-json got valid JSON that failed with the custom parser. We're returning the valid JSON, but please file a bug report here: https://github.com/RyanMarcus/dirty-json/issues
Here is my data:
{ "rows": [["this", "is", "failing"]] }
However, the parser works fine when the data has another array inside "rows" array.
Here is an example:
{ "rows": [["it", "works", "fine"], ["now"]] }
Let me know if you need more information about the issue
Thank you :)
Hi, with v0.9.1 on a simple JSON:
Input: {"name": "White 17" Microwave", "price": "50.00"}
Expected output: {"name": "White 17\" Microwave", "price": "50.00"}
Actual output: Found } that I can't handle at line -1:-1
Interestingly, if I replace 50.00 with a non-number (e.g. Fifty), it doesn't fail but produces incorrect output (leaves only "name" and loses the space after "):
Input: {"name": "White 17" Microwave", "price": "Fifty"}
Expected output: {"name": "White 17\" Microwave", "price": "Fifty"}
Actual output: {"name":"White 17\"Microwave\", \"price\": \"Fifty\"}"}
Can something be done here?
You should mention at the top that by using this software, your program must also be open sourced to the AGPL specifications.
You should also include an additional option for a license available upon purchase. Unless this is agpl because of dependencies. Then this is purely academic. AGPL is not free. The cost is open sourcing anything that links to it. License trolls know this. I would remove that logo as well.
This example
require('dirty-json').parse(null)
Returns null as expected, but whines every time:
dirty-json got valid JSON that failed with the custom parser. We're returning the valid JSON, but please file a bug report here: https://github.com/RyanMarcus/dirty-json/issues -- the JSON that caused the failure was: null
Passing an empty string throws an exception:
require('dirty-json').parse("")
Uncaught TypeError: Cannot read property 'type' of undefined
It is negotiable though what is supposed to be returned instead of an exception. An empty string or an empty object?
Also, passing undefined (not sure who might want that) gives another exception:
require('dirty-json').parse(undefined)
Uncaught TypeError: Cannot read property 'length' of undefined
Not sure if you want to handle undefined at all; I just checked it for no apparent reason while creating this issue. Also, boolean values (I know, this module is supposed to process strings, what am I talking about) have the same behavior as null: the same value is returned, but with a warning.
Failed to parse:
const dJSON = require('dirty-json')
const x = dJSON.parse(`{
text: 'foo bar',
}`
)
Succeed:
const dJSON = require('dirty-json')
const x = dJSON.parse(`{
text: 'foo bar',
}`.replace(/,\s*\}/g, '')
)
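Note that the replace call above deletes the closing brace together with the comma, so it relies on the parser tolerating a missing brace. A slightly narrower pre-pass, sketched below, drops only the comma and keeps the closing } or ]. stripTrailingCommas is a hypothetical helper, not part of dirty-json, and being a plain regex it would also rewrite a matching sequence inside a string value.

```javascript
// Hypothetical pre-processing helper, not part of dirty-json.
// Removes a comma that is followed only by whitespace and a closing } or ].
// Caveat: this is a plain regex, so it also rewrites ",}" inside string values.
function stripTrailingCommas(text) {
  return text.replace(/,(\s*[}\]])/g, '$1');
}

const cleaned = stripTrailingCommas(`{
text: 'foo bar',
}`);
console.log(cleaned);
```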
For example, add null as a value if there is none:
{ "key": } -> { "key": null }
When there is a number in an html attribute it does not work
input :
{ "test": "<div test="44">test</div>" }
error:
Found } that I can't handle at line 0:39
Hi, when you have an unquoted string with leading zeros it removes zeros.
{ "id": 00000111 } -> { "id": 111 }
or
{ "id": A00000111 } -> { "id": "A111" }
I think it's related to Number conversion, but not sure.
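The suspicion about Number conversion is easy to illustrate in isolation: Number('00000111') evaluates to 111, so the zeros are lost as soon as a bare token is coerced. The sketch below shows one possible guard, which keeps a token as a string unless converting it to a number and back reproduces it exactly. coerceBareToken is a hypothetical name, and nothing here describes how dirty-json handles tokens internally.

```javascript
// Hypothetical guard, not dirty-json's actual logic.
// Only treat a bare token as a number when the conversion is lossless.
function coerceBareToken(token) {
  const n = Number(token);
  if (token.trim() !== '' && Number.isFinite(n) && String(n) === token) {
    return n;
  }
  return token; // keep leading zeros and non-numeric tokens as strings
}

console.log(Number('00000111'));           // 111 (zeros dropped)
console.log(coerceBareToken('00000111'));  // 00000111 (kept as a string)
console.log(coerceBareToken('111'));       // 111 (a number)
```

The trade-off is that this check is strict: tokens such as 1.50 or 1e3 would also stay strings, since their canonical number form differs from the source text.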
Passing this into dirty-json:
{
type: String,
value: 'something'
}
results in:
Error: Found } that I can't handle at line 0:45
at reduce (C:\Users\MrBartusek\Documents\Development\docs\node_modules\dirty-json\parser.js:517:15)
at Object.parse (C:\Users\MrBartusek\Documents\Development\docs\node_modules\dirty-json\parser.js:99:16)
at Object.parse (C:\Users\MrBartusek\Documents\Development\docs\node_modules\dirty-json\dirty-json.js:39:23)
at praseFileContent (C:\Users\MrBartusek\Documents\Development\docs\scripts\updateConfig.js:32:19)
at Object.<anonymous> (C:\Users\MrBartusek\Documents\Development\docs\scripts\updateConfig.js:15:21)
0:45
undefined:undefined
however, I cannot reproduce it.
Hi!
I have another problem with this JSON; I wonder if you can work out why it doesn't work. If I simply remove the ' character in the middle of the claimReviewed, the library manages to parse it.
{
"@context": "http://schema.org",
"@type": "ClaimReview",
"datePublished": "2019-05-08 10:01:22",
"url": "https://teyit.org/a-haberin-chpnin-akil-almaz-nanoteknolojik-hilesi-alt-bandi-kullandigi-iddiasi/",
"itemReviewed":
{
"@type": "CreativeWork",
"author":
{
"@type": "Organization",
"name": "Sosyal Medya",
"sameAs": "https://twitter.com/kacsaatolduson/status/1125728521110863873"
},
"datePublished": "05/07/2019"
},
"claimReviewed": "İDDİA: A Haber "CHP'nin akılalmaz nanoteknolojik hilesi" şeklinde bir alt bant kullandı.",
"author":
{
"@type": "Organization",
"name": "teyit.org",
"sameAs": "https://teyit.org/a-haberin-chpnin-akil-almaz-nanoteknolojik-hilesi-alt-bandi-kullandigi-iddiasi/"
},
"reviewRating":
{
"@type": "Rating",
"ratingValue": "-1",
"bestRating": "-1",
"worstRating": "-1",
"alternateName" : "YANLIŞ"
}
}
Thank you very much,
Martino
{
"key" : [{"a":"b"}]
}
The JSON above produces an error; the parser doesn't seem to like an array containing a single object.
It works after adding a second object in the array
{
"key" : [{"a":"b"}, {"c":"d"}]
}
How about this? :)
{ "space":73,2,"space_percent":94,9 }
Means:
{ "space":73.2,"space_percent":94.9 }
How it works in demo now:
angular.js:10734 Error: Found } that I can't handle at line 0:37
Hi
I already checked with the demo and it doesn't convert it, but after a few rounds of cleaning it starts working, so I'm wondering if it's possible to handle this worst-case scenario?
id: \"55b3a5c5-16e7-4ae6-91bc-7f08fb152dde-ee1dc704\"\nlang: \"en\"\nsession_id: \"6b1f9ba2-7c79-47bc-aadb-2c600b111836\"\ntimestamp: \"2019-09-04T17:13:43.374Z\"\nresult {\n source: \"agent\"\n resolved_query: \"welcome\"\n action: \"customSettingsAnswer\"\n score: 1.0\n parameters {\n fields {\n key: \"key\"\n value {\n string_value: \"welcome\"\n }\n }\n fields {\n key: \"default\"\n value {\n string_value: \"Hello, you called condo bot. Your virtual concierge. How can I help you today?\"\n }\n }\n }\n metadata {\n intent_id: \"040b9e41-d20e-4da8-9fff-d2c1f1f5812e\"\n webhook_response_time: 4992\n intent_name: \"Default Welcome Intent - custom\"\n webhook_used: \"true\"\n webhook_for_slot_filling_used: \"false\"\n is_fallback_intent: \"false\"\n }\n fulfillment {\n speech: \"Hello, you called condo bot. Your virtual concierge. How can I help you today?\"\n messages {\n lang: \"en\"\n type {\n number_value: 0.0\n }\n speech {\n string_value: \"Hello, you called condo bot. Your virtual concierge. How can I help you today?\"\n }\n }\n }\n}\nstatus {\n code: 200\n error_type: \"success\"\n}\n
This is my string. What I did to make it work:
Add { and } to the start and end of the string
Replace \n with ,
Find a key followed by { and add a : between them
Find } , } and replace it with } }
Find {, and replace it with {
After all of those modifications it starts working.
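The manual steps above can be sketched as a small pre-processing function. preCleanBlockText is a hypothetical name and the regular expressions are one reading of the steps, not something provided by dirty-json. The result is still not valid JSON (unquoted keys and trailing commas remain); it is meant to be handed to the parser afterwards.

```javascript
// Hypothetical pre-pass following the reporter's manual steps.
// Caveat: plain regexes, so braces or newlines inside string values are also rewritten.
function preCleanBlockText(text) {
  let s = '{' + text + '}';                     // wrap the whole input in braces
  s = s.replace(/\n/g, ',');                    // newlines become commas
  s = s.replace(/(\w) \{/g, '$1: {');           // "result {" becomes "result: {"
  s = s.replace(/\}\s*,\s*\}/g, '} }');         // "} , }" becomes "} }"
  s = s.replace(/\{\s*,/g, '{');                // "{," becomes "{"
  return s;
}

const input = 'id: "test"\nlang: "en"\nresult {\n source: "agent"\n}';
console.log(preCleanBlockText(input));
// {id: "test",lang: "en",result: { source: "agent",}}
```

Because each replace makes a single non-overlapping pass, deeply nested closings such as three consecutive } lines may need the fourth step applied more than once.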
Hey Ryan!
Thank you for this awesome package! I was wondering if, when you have the time, you could implement @types/dirty-json. It would be very useful!
Thank you!