I have a legacy CLI tool that outputs a structured list with sub-items indented by a tab (Stack Overflow won't let me put tabs here, so I replaced them with 4 spaces in this example).

Heading One:
    Sub One: 'Value 1'
    Sub Two: 'Value 2'
Heading Two:
    Sub Three: 'Value 3'
    Sub Four: 'Value 4'
Key One: 'This key has no heading' 

I am trying to get JSON output like this:

{
  "Heading One": {
    "Sub One": "Value 1",
    "Sub Two": "Value 2"
  },
  "Heading Two": {
    "Sub Three": "Value 3",
    "Sub Four": "Value 4"
  },
  "Key One": "This key has no heading"
}

Is this possible with jq, or do I need to write a more complex Python script?

  • The input looks close to YAML syntax. Can that assumption be made? Commented Jul 12, 2022 at 15:16
  • @Inian sadly, no. The headings sometimes include special characters like quotes and parentheses. AFAIK, YAML does not allow that. Commented Jul 12, 2022 at 15:20
  • Can you provide an input that is close to your actual input? From what I was going to post as an answer: github.com/mikefarah/yq does understand such special characters and can transform them to JSON. If you can update the question with a real-world example, I can try to post an answer. Commented Jul 12, 2022 at 15:22
  • Too bad you cannot use tabs in YAML, though: a YAML file cannot contain tabs as indentation. Commented Jul 12, 2022 at 15:35

1 Answer

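This approach also handles deeply nested input. It splits the input into top-level items using a regex with a negative look-ahead (a newline that is not followed by a tab), then separates the heading from the rest and "unindents" the rest by removing one tab after each newline. If the unindented rest still contains a newline, it serves as input for a recursive call; otherwise it is a leaf, and the text between the first and last single quote becomes the value.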

jq -Rs '
  def comp:
    reduce (splits("\n(?!\\t)") | select(length > 0)) as $item ({};
      ($item | index(":")) as $hpos | .[$item[:$hpos]] = (
        $item[$hpos + 1:] | gsub("\n\t"; "\n")
        | if test("\n") then comp else .[index("'\''") + 1: rindex("'\''")] end
      )
    );
  comp
'

Output:

{
  "Heading One": {
    "Sub One": "Value 1",
    "Sub Two": "Value 2"
  },
  "Heading Two": {
    "Sub Three": "Value 3",
    "Sub Four": "Value 4"
  },
  "Key One": "This key has no heading"
}
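
Since the question also asks about a Python script, here is a minimal sketch of the same recursive idea in Python rather than jq. It is not a drop-in replacement for the filter above: the function name `parse_block` and the `SAMPLE` text are made up for illustration, and it assumes, as the jq version does, that every leaf value is wrapped in single quotes and that the first colon on a line ends the key.

```python
import json
import re


def parse_block(text: str) -> dict:
    """Parse tab-indented "Key: 'value'" text into a nested dict (hypothetical helper)."""
    result = {}
    # Split at every newline that is NOT followed by a tab, i.e. at top-level items.
    for item in re.split(r"\n(?!\t)", text):
        if not item.strip():
            continue  # skip empty pieces, e.g. after a trailing newline
        # The first colon separates the key from the rest of the item.
        head, _, rest = item.partition(":")
        # "Unindent" the body by removing one tab after each newline.
        rest = rest.replace("\n\t", "\n")
        if "\n" in rest:
            # Nested lines remain: recurse on the unindented body.
            result[head] = parse_block(rest)
        else:
            # Leaf (assumed single-quoted): keep the text between the
            # first and the last single quote.
            result[head] = rest[rest.index("'") + 1 : rest.rindex("'")]
    return result


# Sample input with real tab characters, mirroring the question.
SAMPLE = (
    "Heading One:\n"
    "\tSub One: 'Value 1'\n"
    "\tSub Two: 'Value 2'\n"
    "Heading Two:\n"
    "\tSub Three: 'Value 3'\n"
    "\tSub Four: 'Value 4'\n"
    "Key One: 'This key has no heading' \n"
)

if __name__ == "__main__":
    print(json.dumps(parse_block(SAMPLE), indent=2))
```

For real use, the sample text could be replaced with the tool's output read from standard input. An unquoted leaf value would raise a `ValueError` in this sketch, so the leaf branch would need adjusting if the tool emits such lines.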

1 Comment

Oh wow, I did not expect the expression to be this extensive. Thanks for not only answering the question, but also providing a perfectly fine working solution!
