I have JSON objects in this format:

 {
     "1f626": {
         "name": "frowning face with open mouth",
         "ascii": [],
         "code_points": {
             "base": "1f626",
             "default_matches": [
                 "1f626"
             ],
             "greedy_matches": [
                 "1f626"
             ],
             "decimal": ""
         }
     }
 }

I have to remove the code_points object using Regular Expressions.


I have tried using this RegEx:

(("code\w+)(.*)(}))

But it only selects the first line. I need the match to extend to the closing curly bracket so that the whole code_points object is removed.

How can I do that?


Note: I have to remove it using Regular Expressions and not JavaScript. Please don't post any JavaScript answers or mark this as a possible duplicate of a JavaScript-based question.

  • Just delete obj["1f626"]["code_points"] Commented Aug 26, 2018 at 4:03
  • @KaiserKatze Using JavaScript? Commented Aug 26, 2018 at 4:05
  • Yes. Just try delete obj["1f626"]["code_points"], with obj being the object in your code. Commented Aug 26, 2018 at 4:09
  • JSON isn't a regular language; it is actually awful to use regex on JSON and it is why we have JSON parsers. I dread to think who is forcing you to use regex :-( Commented Aug 26, 2018 at 4:18

2 Answers

Alternatively, if you can use jq at the command line:

jq "del(.[].code_points)" <monster.json >smaller_monster.json

This deletes the code_points key inside each 2nd-level object.

It took my machine about 5 seconds on a 60MB document.

It's not a regular expression but it's not JavaScript, either. So, it meets half of your non-functional requirements.
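If jq is not available, the same parser-based deletion can be sketched in Python. This is only an illustration of the idea behind the jq filter above, not a replacement for it: it assumes every top-level value is an object, as in the question, and the function name drop_code_points is made up for this example.

```python
import json


def drop_code_points(doc: dict) -> dict:
    """Return a copy of doc without "code_points" in each second-level object."""
    # Assumption: every top-level value is itself a JSON object (dict),
    # as in the sample from the question.
    return {
        key: {k: v for k, v in entry.items() if k != "code_points"}
        for key, entry in doc.items()
    }


# Shortened version of the object from the question.
sample = json.loads("""
{
    "1f626": {
        "name": "frowning face with open mouth",
        "ascii": [],
        "code_points": {"base": "1f626", "decimal": ""}
    }
}
""")

smaller = drop_code_points(sample)
print(json.dumps(smaller, indent=4))
```

Because the document is parsed rather than pattern-matched, nested braces inside code_points and the comma before it need no special handling.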

3 Comments

Thanks for your answer. This code also removes the object key and comma: prntscr.com/knh22y. Here is the code snippet: jqplay.org/s/ohqeX8OnG_
Can you please help me with this?
@Mina. Weird. I fixed the query.
("code_points")([\s\S]*?)(})

The problem is that . matches any character except \n, so in cases like this I usually use [\s\S], which matches any whitespace or non-whitespace character (that is, any character at all). You should also make the * quantifier lazy by adding ?, so the match stops at the first closing bracket instead of the last one.

Remember that this regular expression won't work properly if code_points contains an inner object (another pair of {}), because the lazy match stops at the first } it finds.
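To see what the pattern does on the object from the question, here is a small check written in Python purely as a regex test harness (the pattern itself is engine-agnostic). It also tries a variant, which is my own assumption rather than part of the pattern above, that swallows the comma in front of "code_points" so the result is still valid JSON. That variant assumes code_points is not the first key and contains no nested braces. The names LAZY, WITH_COMMA and is_valid_json are made up for this sketch.

```python
import json
import re

SAMPLE = """{
    "1f626": {
        "name": "frowning face with open mouth",
        "ascii": [],
        "code_points": {
            "base": "1f626",
            "default_matches": ["1f626"],
            "greedy_matches": ["1f626"],
            "decimal": ""
        }
    }
}"""

# The pattern from this answer: lazy match up to the first closing brace.
LAZY = re.compile(r'("code_points")([\s\S]*?)(})')

# Hypothetical variant: also consume the preceding comma. Assumes the key
# is not the first one in its object and its value has no nested braces.
WITH_COMMA = re.compile(r',\s*"code_points"\s*:\s*\{[^{}]*\}')


def is_valid_json(text: str) -> bool:
    """Report whether text parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


stripped = LAZY.sub("", SAMPLE)        # object gone, comma after "ascii" stays
cleaned = WITH_COMMA.sub("", SAMPLE)   # object and its leading comma gone

print(is_valid_json(stripped), is_valid_json(cleaned))
```

The first substitution removes the object but leaves a trailing comma behind, which a strict JSON parser rejects; the second leaves a parseable document for this particular sample.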
