
Suppose I have a file that contains a bunch of lines, some repeating:

line1
line1
line1
line2
line3
line3
line3

What Linux command(s) should I use to generate a list of unique lines:

line1
line2
line3

Does this change if the file is unsorted, i.e. repeating lines may not be in blocks?

4 Answers


If you don't mind the output being sorted, use

sort -u

This sorts the file and removes duplicate lines in a single step.
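A quick sketch with the sample lines from the question, saved under the hypothetical name test1.txt:

```shell
# Recreate the sample file from the question (hypothetical name)
printf 'line1\nline1\nline1\nline2\nline3\nline3\nline3\n' > test1.txt

# sort -u sorts the lines and drops duplicates in one pass
sort -u test1.txt
# line1
# line2
# line3
```

Because sort reorders the input anyway, this also works when the repeating lines are not in blocks.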

11

Use cat to output the contents, pipe that to sort to sort the lines, then pipe the result to uniq to print only the unique values:

cat test1.txt | sort | uniq

You can skip the sort step if the file contents are already sorted.

7

Create a new sorted file with unique lines:

sort -u file >> unique_file

Create a new file with unique lines without sorting (note that uniq only removes adjacent duplicates):

cat file | uniq >> unique_file
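A minimal check of the first variant, using the sample lines from the question (note that >> appends to an existing file; > starts it fresh):

```shell
# Recreate the sample input (hypothetical file names)
printf 'line1\nline1\nline1\nline2\nline3\nline3\nline3\n' > file

# Using > instead of >> so unique_file does not keep stale contents
sort -u file > unique_file
cat unique_file
# line1
# line2
# line3
```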

1

If we do not care about the order, then the best solution is actually:

sort -u file

If we also want to ignore case, we can use the following (-f folds lowercase to uppercase when comparing, so lines differing only in case are treated as duplicates):

sort -fu file

It might seem that an even better idea would be to use the command:

uniq file

and if we also want to ignore case (the first line of each run of duplicates is kept, with its case unchanged):

uniq -i file

However, this may return a completely different result than the sort command, because uniq does not detect repeated lines unless they are adjacent.
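A short demonstration of that pitfall (the file name is hypothetical):

```shell
# 'line1' repeats, but the copies are not adjacent
printf 'line1\nline2\nline1\n' > gapped.txt

uniq gapped.txt     # prints all three lines: line1, line2, line1
sort -u gapped.txt  # prints two lines: line1, line2
```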

