0

I'm hoping someone could help, I'm trying to merge two json files. Here is my bash script:

script_directory="/home/joey/scripts/scripts/delete"
region_file="US.en.json"
cnmts_file="cnmts.json"

wget https://github.com/blawar/titledb/raw/master/$region_file -O $script_directory/$region_file
wget https://github.com/blawar/titledb/raw/master/$cnmts_file -O $script_directory/$cnmts_file

#This is here just to simplify the json files
cat $script_directory/$region_file | jq '.[] | {id: .id}' > $script_directory/region_file_id.txt
cat $script_directory/$cnmts_file | jq '.[] | .[] | {titleId: .titleId, otherApplicationId: .otherApplicationId}' > $script_directory/cnmts_titleId_otherApplicationId.txt

Essentially, I'm given two files:

region_file_id.txt:

{
  "id": "01007EF00011E000"
}
{
  "id": "0100225000FEE000"
}
{
  "id": "0100BCE000598000"
}
{
  "id": "0100B42001DB4000"
}
{
  "id": "01008A9001DC2000"
}

and cnmts_titleId_otherApplicationId.txt:

{
  "titleId": "0100000000010000",
  "otherApplicationId": "0100000000010800"
}
{
  "titleId": "0100000000010800",
  "otherApplicationId": "0100000000010000"
}
{
  "titleId": "010000000e5ee000",
  "otherApplicationId": "010000000e5ee800"
}
{
  "titleId": "010000000eef0000",
  "otherApplicationId": "010000000eef0800"
}
{
  "titleId": "010000000eef0800",
  "otherApplicationId": "010000000eef0000"
}
{
  "titleId": "0100000011d90000",
  "otherApplicationId": "0100000011d90800"
}
{
  "titleId": "0100000011d90800",
  "otherApplicationId": "0100000011d90000"
}
{
  "titleId": "0100000011d90800",
  "otherApplicationId": "0100000011d90000"
}

Please note, this is only a snippet of the files, feel free to run the bash script to get a more accurate file.

All the "id" in the 'region_file_id' equal to a "titleId" somewhere in 'cnmts_titleId_otherApplicationId' (the reverse is not true though as it included id from different regions). I'm trying to grab the "otherApplicationId" values for each "id" in 'region_file_id' by cross referencing them and creating a json like: (repeated for every 'id' in region_file_id)

{
 "id": "111000"
 "titleId": "111000" (this one is optional as it is a duplicate from 'id')
 "otherApplicationId": 111800"
}

I've tried searching and tried different snippets:

jq -s '.[0] * .[1]' $script_directory/region_file_id.txt cnmts_titleId_otherApplicationId.txt (only returned 1 object for some reason)
jq -s '{ .[0] as $u | .[1] | select(.id == $u.titleId) |= $u }' $script_directory/region_file_id.txt cnmts_titleId_otherApplicationId.txt

Update:

As peak pointed out:

jq -n --slurpfile ids region_file_id.txt  '
  INDEX(inputs; .titleId | ascii_upcase) as $dict
  | $ids[].id as $id
  | {$id} + $dict[$id]
' cnmts_titleId_otherApplicationId.txt > merged.txt

This seems to work until I hit "null" values where my file doesn't include the correct id, which is another problem all together!

1 Answer 1

2

All the "id" in the 'region_file_id' equal to a "titleId" somewhere in 'cnmts_titleId_otherApplicationId'

If that really is the case, then you could proceed as follows:

< cnmts_titleId_otherApplicationId.txt jq -n --slurpfile ids region_file_id.txt  '
  INDEX(inputs; .titleId) as $dict
  | $ids[].id as $id
  | {$id} + $dict[$id]
' 

Sign up to request clarification or add additional context in comments.

2 Comments

That worked sometimes, please see my edit in original post
Scratch my last comment, I didn't realize the second file was all lower case. Updated the fix. Thanks for the help!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.