I am comparing two files and want to ignore the alphanumeric characters after @ and before [. A line looks like this:
model.Field@d6b0d6b[fieldName
I would use process substitutions here:
diff <(sed 's/@[^[]*/@/' old) <(sed 's/@[^[]*/@/' new)
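To see what the sed expression does on its own, here is a minimal sketch run on two sample lines that differ only in the hash. The second hash value and the variable names are made up for illustration. The [^[]* part matches everything up to, but not including, the next [, so both lines should come out identical:

```shell
# Two sample lines; the second hash value is hypothetical.
old_line='model.Field@d6b0d6b[fieldName'
new_line='model.Field@1a2b3c4[fieldName'

# Replace "@" plus everything before the next "[" with a bare "@".
old_stripped=$(printf '%s\n' "$old_line" | sed 's/@[^[]*/@/')
new_stripped=$(printf '%s\n' "$new_line" | sed 's/@[^[]*/@/')

printf '%s\n' "$old_stripped" "$new_stripped"
# Both lines print as: model.Field@[fieldName
```

Since the filtered lines are equal, diff has nothing to report for them.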
I assume you are using Bash.
If v="model.Field@d6b0d6b[fieldName", then you can do the following:
# Extract the right side of "$v"
r="${v#*[}"
# Extract the left side of "$v"
l="${v%@*}"
# Combine
new_v="$l@[$r"; new_v1="$l$r"
You can use "$new_v" or "$new_v1", depending on whether you want to keep the @ and [ or not.
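As a quick check of what each expansion yields, here is a small sketch. The second string, w, is a hypothetical line with two @ characters, added only to show that % cuts at the last @ while %% cuts at the first. The [ is escaped here so the pattern does not depend on how the shell treats an unclosed bracket:

```shell
v='model.Field@d6b0d6b[fieldName'
r="${v#*\[}"     # remove the shortest prefix ending in "[" -> fieldName
l="${v%@*}"      # remove the shortest suffix starting at "@" -> model.Field
new_v="$l@[$r"   # model.Field@[fieldName
new_v1="$l$r"    # model.FieldfieldName
printf '%s\n' "l=$l" "r=$r" "new_v=$new_v" "new_v1=$new_v1"

# Hypothetical line with more than one "@" to compare % and %%.
w='a@b@c[d'
last_at="${w%@*}"     # cuts at the last "@"  -> a@b
first_at="${w%%@*}"   # cuts at the first "@" -> a
printf '%s\n' "last_at=$last_at" "first_at=$first_at"
```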
As Mr. Wijsman commented, my answer doesn't answer the question. Correct, I did not pay much attention to the title. Let's fix that by wrapping the code above in the following function, which prints a single file's data in the required form:
pf()
{
    # IFS= keeps leading and trailing whitespace of each line intact.
    while IFS= read -r line; do
        # This is a bit fancy but does the same thing as the code above.
        printf '%s\n' "${line%@*}${line#*[}"
    done < "$1"
}
Now, we can diff the two files by using the following command:
diff <(pf file1.txt) <(pf file2.txt)
Here is some sample output:
rany$ cat file1.txt
model.Field1@__A__[fieldName
model.FieldIAMDIFFERENT@__B__[fieldName
model.Field1@__C__[fieldName
rany$ cat file2.txt
model.Field1@__C__[fieldName
model.Field1@__D__[fieldName
model.Field1@__E__[fieldName
rany$ diff <(pf file1.txt) <(pf file2.txt)
2c2
< model.FieldIAMDIFFERENTfieldName
---
> model.Field1fieldName
rany$
As you can see, the differences between @ and [ are ignored, and the only line reported as different between the files is this one:
model.FieldIAMDIFFERENTfieldName
I'm sorry for not paying careful attention to your title as a part of the question.
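One caveat: for a line that contains neither @ nor [, both expansions in pf return the whole line, so that line is printed twice over. Below is a minimal sketch of a guarded variant. The name pf_safe and the sample file contents are my own, not part of the original answer. It rewrites only lines that have an @ followed later by a [, and passes everything else through unchanged:

```shell
# Hypothetical variant of pf: leave lines without "@...[" untouched.
pf_safe() {
    # The "|| [ -n ... ]" part also handles a last line with no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            *@*\[*) printf '%s\n' "${line%@*}${line#*\[}" ;;
            *)      printf '%s\n' "$line" ;;
        esac
    done < "$1"
}

# Small demonstration on a throwaway file.
sample=$(mktemp)
printf '%s\n' 'no markers here' 'model.Field1@__A__[fieldName' > "$sample"
out=$(pf_safe "$sample")
printf '%s\n' "$out"
rm -f "$sample"
```

It can be used the same way as pf, for example: diff <(pf_safe file1.txt) <(pf_safe file2.txt)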
Filter the data files, then perform the diff:
sed 's/\@.*\[/@[/' file1 > file1.filt
sed 's/\@.*\[/@[/' file2 > file2.filt
diff file1.filt file2.filt
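One thing to watch with a .* pattern: it is greedy, so on a line with more than one [ it removes everything up to the last one. The sketch below compares it with a narrower pattern that stops at the first [. The sample line with two brackets is hypothetical, and the variable names are mine:

```shell
# Hypothetical line containing two "[" characters.
line='model.Field@d6b0d6b[fieldName=a[0]'

# Greedy: ".*" runs to the last "[" on the line.
greedy=$(printf '%s\n' "$line" | sed 's/@.*\[/@[/')

# Narrow: "[^[]*" stops before the first "[".
narrow=$(printf '%s\n' "$line" | sed 's/@[^[]*\[/@[/')

printf 'greedy: %s\nnarrow: %s\n' "$greedy" "$narrow"
# greedy: model.Field@[0]
# narrow: model.Field@[fieldName=a[0]
```

If every line has exactly one [, as in the question's sample, both patterns give the same result.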
Alternatively, diff has an -I option. With GNU diff, a change is ignored when all of its lines match the pattern. Choose a pattern that uniquely selects the lines which are not to be compared.
For example:
diff -I 'dataexplorer.bigindex' file1 file2
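A small sketch of how -I behaves, assuming GNU diff. The file names and contents are invented for the demonstration. Note that -I works on whole lines: it can hide a change in which every line matches, but it cannot ignore only part of a line, such as the text between @ and [.

```shell
dir=$(mktemp -d)
printf '%s\n' '# generated 2020-01-01' 'alpha' 'beta'  > "$dir/a.txt"
printf '%s\n' '# generated 2021-06-30' 'alpha' 'beta'  > "$dir/b.txt"
printf '%s\n' '# generated 2021-06-30' 'alpha' 'gamma' > "$dir/c.txt"

# Only the "# generated" line differs and it matches the pattern,
# so nothing is reported.
only_stamp=$(diff -I '^# generated' "$dir/a.txt" "$dir/b.txt" || true)

# The beta/gamma change does not match the pattern, so it is still reported.
real_change=$(diff -I '^# generated' "$dir/a.txt" "$dir/c.txt" || true)

printf 'a vs b: [%s]\n' "$only_stamp"
printf 'a vs c:\n%s\n' "$real_change"
rm -r "$dir"
```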
diff -I ... help?
Can you use sed to remove everything between the @ and [ characters? If that's the case, you can pipe the output to temporary files, use diff, and then know where your changes are. Kind of roundabout, but it works.
Alternatively, you could use Perl: use the s/// operator to replace everything between @ and [ with nothing, and compare the two lines with the m// operator. This becomes very complicated very quickly if your data is not VERY similar (for example, if you had to re-synchronize lines). It's probably easiest for a one-shot deal to just do what I suggested above or what @suspectus suggested below (they are the same thing).
Your sed command is interpreting your unescaped square bracket as the start of a character class. Put \[ instead of [. Take a look at gnu.org/software/sed/manual/html_node/Regular-Expressions.html