Linked Questions
22 questions linked to/from How does awk '!a[$0]++' work?
4 votes, 1 answer, 11k views
how awk seen option works [duplicate]
I have recently come across an awk seen option. I can see it is removing duplicates in files. I could use some clarification on how it works.
cat tes
1
2
3
1
1
1
3
4
with awk seen output
cat tes | ...
0 votes, 1 answer, 103 views
Please explain this awk statement [duplicate]
Yesterday I was googling how to merge two files and came across an awk snippet.
I need a simple merge, so sort -u is not the way to go, but the code below works.
Could someone please explain what ...
252 votes, 12 answers, 372k views
How to remove duplicate lines inside a text file?
A huge (up to 2 GiB) text file of mine contains about 100 exact duplicates of every line in it (useless in my case, as the file is a CSV-like data table).
What I need is to remove all the repetitions ...
6 votes, 4 answers, 2k views
How to remove unique strings from a textfile?
Sorry guys I had to edit my example, because I didn't express my query properly.
Let's say I have the .txt file:
Happy sad
Happy sad
Happy sad
Sad happy
Happy sad
Happy sad
Mad sad
Mad happy
Mad happy
...
9 votes, 2 answers, 17k views
Remove entire row in a file if first column is repeated
I have a file containing two columns and 10 million rows. The first column contains many repeated values, but there is a distinct value in column 2. I want to remove the repeated rows and want to keep ...
6 votes, 2 answers, 6k views
Why doesn't "uniq --unique" remove all duplicate lines?
Running
printf "lol\nlol\nfoo\n\n\n\n\nbar\nlol\nlol\nfoo\nlol\nfoo" | uniq --unique
prints
foo
bar
foo
lol
foo
Why is foo printed three times? Shouldn't uniq --unique remove them?
Also, ...
4 votes, 3 answers, 4k views
Print line if value in column changes
I know this should be easy to find by googling, but I was not successful. Sorry for that.
I would like to print the first line of each group defined by the value in the first column. The delimiter is a tab.
Input:
...
6 votes, 1 answer, 13k views
In awk, how can I make a boolean value that I can toggle it?
Other programming languages often have a bool or boolean type. I can create a boolean variable and use the not operator to toggle it. I can toggle it many times and get a series of true, ...
5 votes, 2 answers, 2k views
delete a line if same line exists in the previous line
I want to remove, from the command line in UNIX, any line that is identical to the line before it. I have the following data in a file.
<xref id="gi_525506931_ref_NP_001266519.1__brain_aromatase"/>...
3 votes, 3 answers, 3k views
Removing Duplicates from a CSV based on specified columns
I am working with a CSV data set which looks like the below:
year,manufacturer,brand,series,variation,card_number,card_title,sport,team
2015,Leaf,Trinity,Printing Plates,Magenta,TS-JH2,John Amoth,...
6 votes, 2 answers, 1k views
Concatenate files that overlap, avoiding repetition
Assume we have two text files and we want to combine them into one.
The second file starts with lines that are also in the first file, so it repeats part of it. There is a redundant overlap.
How can I ...
0 votes, 4 answers, 2k views
read phone numbers from file and store them in other file uniquely
I have an input text file, e.g. myfile.txt, which contains data like
WO_ID
------------------------------------------------------------------------
moveover_virus_8493020020_virus.final
...
0 votes, 2 answers, 2k views
Retrieve first occurrence of record, where matching pattern is taken from input
I have a list like this:
2017-12-11 AAOI 40.33
2017-11-15 AAOI 44.3492
2017-12-15 AEIS 70.98
2017-11-15 AEIS 80.137
2017-10-23 AIEQ 25.1601
2017-11-15 AMBA 52.6501
2017-12-05 ...
1 vote, 2 answers, 945 views
Print only one value from duplicates [duplicate]
I have the following content in a file.
$ cat file.txt
code-coverage-api
jsch
cloudbees-folder
apache-httpcomponents-client-4-api
apache-httpcomponents-client-4-api
jsch
apache-httpcomponents-client-...
3 votes, 3 answers, 2k views
Delete duplicate lines in file without creating new file in ubuntu
I can't seem to find a command that lets me delete duplicates in my file without creating a new file and also preserving the order of the contents in my file.
Would there be another command besides ...