
Suppose I have a file that contains a bunch of lines, some repeating:

line1
line1
line1
line2
line3
line3
line3

What Linux command(s) should I use to generate a list of unique lines:

line1
line2
line3

Does this change if the file is unsorted, i.e. repeating lines may not be in blocks?

4 Answers


If you don't mind the output being sorted, use

sort -u

This sorts the file and removes duplicate lines in a single step.
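A quick sketch with the sample lines from the question, saved under the hypothetical name test1.txt:

```shell
# Recreate the sample file from the question (hypothetical name)
printf 'line1\nline1\nline1\nline2\nline3\nline3\nline3\n' > test1.txt

# sort -u sorts the lines and drops duplicates in one pass
sort -u test1.txt
# line1
# line2
# line3
```

Because sort reorders the input anyway, this also works when the repeating lines are not in blocks.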

11

Use cat to output the contents, pipe that to sort to sort the lines, then pipe the result to uniq to print only the unique values:

cat test1.txt | sort | uniq

You can skip the sort step if the file contents are already sorted.

7

Create a new sorted file with unique lines:

sort -u file >> unique_file

Create a new file with unique lines without sorting (note that uniq only removes adjacent duplicates):

cat file | uniq >> unique_file
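A minimal check of the first variant, using the sample lines from the question (note that >> appends to an existing file; > starts it fresh):

```shell
# Recreate the sample input (hypothetical file names)
printf 'line1\nline1\nline1\nline2\nline3\nline3\nline3\n' > file

# Using > instead of >> so unique_file does not keep stale contents
sort -u file > unique_file
cat unique_file
# line1
# line2
# line3
```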

1

If we do not care about the order, then the best solution is actually:

sort -u file

If we also want to ignore case, we can use the following (-f folds lowercase to uppercase when comparing, so lines differing only in case are treated as duplicates):

sort -fu file

It might seem that an even better idea would be to use the command:

uniq file

and if we also want to ignore case (the first line of each run of duplicates is kept, with its case unchanged):

uniq -i file

However, this may return a completely different result than the sort command, because uniq does not detect repeated lines unless they are adjacent.
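A short demonstration of that pitfall (the file name is hypothetical):

```shell
# 'line1' repeats, but the copies are not adjacent
printf 'line1\nline2\nline1\n' > gapped.txt

uniq gapped.txt     # prints all three lines: line1, line2, line1
sort -u gapped.txt  # prints two lines: line1, line2
```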

