How do I find all files containing a specific string of text within their file contents?
The following doesn't work. It seems to display every single file in the system.
find / -type f -exec grep -H 'text-to-find-here' {} \;
I wrote a Python script which does something similar. Here is how to use it:
./sniff.py path pattern_to_search [file_pattern]
The first argument, path, is the directory to search recursively. The second argument, pattern_to_search, is the regular expression to look for inside each file; it uses the format defined by Python's re library, and in this script . also matches newlines.
The third argument, file_pattern, is optional. It is another regular expression, applied to filenames; only files whose names match it are considered.
For example, to search Python files (extension py) containing Pool( followed at some point by the word Adaptor, I do the following:
./sniff.py . "Pool(.*?Adaptor" .*py
./Demos/snippets/cubeMeshSigNeur.py:146
./Demos/snippets/testSigNeur.py:259
./python/moose/multiscale/core/mumbl.py:206
./Demos/snippets/multiComptSigNeur.py:268
And voilà, it prints the path of each matched file and the line number at which the match was found. If more than one match is found in a file, the line number of each match is appended to the filename.
All previous answers suggest grep and find. But there is another way: use Midnight Commander.
It is a free utility (30 years old, proven by time) which is visual without being a GUI. It has tons of functions, and finding files is just one of them.
The below command will work fine for this approach (-r is unnecessary here, since grep receives regular files from find):
find ./ -name "file_pattern_name" -exec grep "pattern" {} \;
I tried the grep command below. It helps to search contents within my repository at /etc/yum.repos.d:
grep -Ril -e 'texttoSearch' /etc/yum.repos.d
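To see what those flags do (-R recurse, -i ignore case, -l list only file names), here is a small sketch using a throwaway directory instead of /etc/yum.repos.d; the paths are hypothetical:

```shell
# Scratch directory with one matching and one non-matching file (hypothetical paths).
mkdir -p /tmp/griltest/sub
echo "Some TextToSearch here" > /tmp/griltest/a.conf
echo "nothing relevant"       > /tmp/griltest/sub/b.conf

# -R recurses, -i ignores case, -l prints only the names of matching files:
grep -Ril 'texttosearch' /tmp/griltest
# → /tmp/griltest/a.conf
```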
Try this:
find . | xargs grep 'word' -sl
Try this:
find / -type f -name "*" -exec grep -il "String_to_search" {} \;
Or
for i in /*;do grep -Ril "String_to_search" $i;done 2> /dev/null
You can use the command below if you don't want the file names, only the matching text. Here I am capturing "TEXT" from all the log files while making sure the file name is not printed:
grep -e TEXT *.log | cut -d':' --complement -s -f1
grep with the -e option is quite quick compared to other options, as -e specifies the PATTERN to match.
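As a sketch (file names hypothetical): grep prepends `filename:` when given several files, which the cut then strips; note that grep's -h option does the same job directly:

```shell
mkdir -p /tmp/cutdemo
echo "first TEXT line" > /tmp/cutdemo/one.log
echo "no match here"   > /tmp/cutdemo/two.log
cd /tmp/cutdemo

# Strip the "filename:" prefix that grep adds when searching several files:
grep -e TEXT *.log | cut -d':' --complement -s -f1
# Simpler: -h tells grep to omit the file names entirely:
grep -h -e TEXT *.log
# Both print: first TEXT line
```

cut's --complement is a GNU coreutils extension; -h is widely supported.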
Avoid the hassle and install ack-grep. It eliminates a lot of permission and quotation issues.
apt-get install ack-grep
Then go to the directory you want to search and run the command below
cd /
ack-grep "find my keyword"
find with xargs is preferred when there are many potential matches to sift through. It runs more slowly than other options, but it always works. As some have discovered, xargs does not handle files with embedded spaces by default. You can overcome this by specifying the -d option.
Here is @RobEarl's answer, enhanced so it handles files with spaces:
find / -type f | xargs -d '\n' grep 'text-to-find-here'
Here is @venkat's answer, similarly enhanced:
find . -name "*.txt" | xargs -d '\n' grep -i "text_pattern"
Here is @Gert van Biljon's answer, similarly enhanced:
find . -type f -name "*.*" -print0 | xargs --null grep --with-filename --line-number --no-messages --color --ignore-case "searchtext"
Here is @LetalProgrammer's answer, similarly enhanced:
alias ffind="find / -type f | xargs -d '\n' grep"
Here is @Tayab Hussain's answer, similarly enhanced:
find . | xargs -d '\n' grep 'word' -sl
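A small demonstration of why the -d '\n' matters, with a hypothetical scratch directory: without it, xargs splits the path at the space and grep looks for two files that don't exist.

```shell
mkdir -p /tmp/xargsdemo
echo "text-to-find-here" > "/tmp/xargsdemo/with space.txt"

# Default xargs splits on whitespace, so the file name breaks in two (errors hidden):
find /tmp/xargsdemo -type f | xargs grep -l 'text-to-find-here' 2>/dev/null || true

# With -d '\n', each input line is one argument and the file is found:
find /tmp/xargsdemo -type f | xargs -d '\n' grep -l 'text-to-find-here'
# → /tmp/xargsdemo/with space.txt
```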
Use:
grep -Erni "text you wanna search" .
The command will search recursively in all files and directories of the current directory and print the result.
Note: if your grep output isn't colored, you can change that by adding alias grep='grep --color=always' to your shell source file.
My use case was to find Python code I had written way back that wrote jsonlines a particular way. I knew that jsonl would be part of the function name and to_json would appear in the body, but not much else.
Despite 50 answers, finding more than one string in the same file (whether or not in the same line) hasn't been answered.
The -q option makes grep quiet: nothing is printed, only the exit status is set, hence the -print at the end. Each -exec runs only if the previous one succeeded, so if you have many files it pays to put first the pattern that eliminates the most files you aren't interested in.
find . -type f -name "*.py" \
-exec grep -q -e 'to_json' {} \; \
-exec grep -q -e 'def\s.*jsonl' {} \; \
-print
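A quick way to convince yourself that only files matching both patterns survive to -print (scratch files hypothetical):

```shell
mkdir -p /tmp/multidemo
printf 'def write_jsonl():\n    return to_json(x)\n' > /tmp/multidemo/good.py
printf 'print("to_json only")\n'                      > /tmp/multidemo/partial.py

# partial.py passes the first grep but fails the second, so only good.py prints:
find /tmp/multidemo -type f -name "*.py" \
    -exec grep -q -e 'to_json' {} \; \
    -exec grep -q -e 'def\s.*jsonl' {} \; \
    -print
# → /tmp/multidemo/good.py
```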
You can use ripgrep which will respect the default project's .gitignore file.
$ rg fast README.md
75: faster than both. (N.B. It is not, strictly speaking, a "drop-in" replacement
88: color and full Unicode support. Unlike GNU grep, ripgrep stays fast while
119:### Is it really faster than everything else?
124:Summarizing, ripgrep is `fast` because:
129: optimizations to make searching very fast.
where fast keyword is highlighted in the terminal.
To suppress Permission denied errors:
$ rg -i rustacean 2> /dev/null
Which will redirect the standard error (stderr) output to /dev/null.
If you have a set of files that you will always be checking you can alias their paths, for example:
alias fd='find . -type f -regex ".*\.\(inc\|info\|module\|php\|test\|install\|uninstall\)"'
Then you can simply filter the list like this:
grep -U -l $'\015' $(fd)
which filters the list produced by fd down to files that contain a CR character.
I find that aliasing the files I am interested in helps me write simpler scripts, rather than always trying to remember how to gather all those files. The recursive approaches work as well, but sooner or later you will have to contend with weeding out specific file types, which is why I just find all the file types I'm interested in to begin with.
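To sketch the CR filter on its own (hypothetical scratch files; $'\015' is a literal carriage return in bash):

```shell
mkdir -p /tmp/crdemo
printf 'dos line\r\n' > /tmp/crdemo/dos.inc
printf 'unix line\n'  > /tmp/crdemo/unix.inc

# -l lists only files containing a CR; -U keeps grep from munging line endings:
grep -U -l $'\015' /tmp/crdemo/*.inc
# → /tmp/crdemo/dos.inc
```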
Try this
find . -type f -name some_file_name.xml -exec grep -H PUT_YOUR_STRING_HERE {} \;
As Peter in the previous answer mentioned, all previous answers suggest grep and find.
But there is a more sophisticated way, using Gnome Commander, which has a proper GUI and tons of options, has been around since 2001, and finding files is just one of its functions. It is a free utility as well, proven by time.
GUI search alternative, for desktop use (the question does not precisely ask for commands):
Searchmonkey: an advanced file-search tool using regular expressions, without having to index your system. A graphical equivalent of find/grep, available for Linux (GNOME/KDE/Java) and Windows (Java); open source, GPL v3.
You can use the following command to find particular text in a file (the cut keeps only what follows the first colon on each matching line):
grep 'abc' file | cut -d':' -f2
See also The Platinum Searcher, which is similar to The Silver Searcher and is written in Go.
Example:
pt -e 'text to search'
I'm trying to find a way to scan my entire Linux system for all files containing a specific string of text. ... Is this close to the proper way to do it? If not, how should I? ... This ability to find text strings in files would be extraordinarily useful for some programming projects I'm doing.
While you should never replace (or alias) a system command with a different program, because of the risk of mysteriously breaking scripts or other utilities, if you run text searches manually or from your own scripts or programs, you should consider the fastest suitable program when searching a large number of files a number of times. The ten minutes to half an hour spent installing and familiarizing yourself with a better utility can be recovered after a few uses for the use case you described.
A webpage offering a "Feature comparison of ack, ag, git-grep, GNU grep and ripgrep" can assist you to decide which program offers the features you need.
Andrew Gallant's Blog claims: "ripgrep is faster than {grep, ag, git grep, ucg, pt, sift}" (a claim shared by some of the others, this is why a feature comparison is helpful). Of particular interest is his section on regex implementations and pitfalls.
The following command searches all files, including hidden and binary files:
$ rg -uuu foobar
The Silver Searcher (ag) claims to be 5-10x faster than ack, and is suggested in some other answers. Its GitHub doesn't appear as recent as ripgrep's, and with noticeably more commits and branches but fewer releases it's hard to draw an absolute conclusion from those stats. The short version: ripgrep is faster, but there's a tiny learning curve to avoid being caught out by the differences.
So what could be next? You guessed it: The Platinum Searcher. The claims: it searches code about 3-5x faster than ack, but at the same speed as The Silver Searcher. It's written in Go and searches UTF-8, EUC-JP and Shift_JIS files, if that is of greater interest. Its GitHub is neither particularly recent nor active. Go itself has a fast and robust regex engine, but The Platinum Searcher would be easier to recommend if it had a larger user base.
For a combination of speed and power indexed query languages such as ElasticSearch or Solr can be a long term investment that pays off, but not if you want a quick and simple replacement for grep. OTOH both have an API which can be called from any program you write, adding powerful searches to your program.
While it's possible to spawn an external program, execute a search, intercept its output and process it, calling an API is the way to go for power and performance.
This question was protected Aug 6 '15 at 19:34 with this caution:
We're looking for long answers that provide some explanation and context. Don't just give a one-line answer; explain why your answer is right, ideally with citations.
While some answers suggest alternative ways to accomplish a search they don't explain why other than it's "free", "faster", "more sophisticated", "tons of features", etc. Don't try to sell it, just tell us "why your answer is right". I've attempted to teach how to choose what's best for the user, and why. This is why I offer yet another answer, when there are already so many. Otherwise I'd agree that there are already quite a few answers; I hope I've brought a lot new to the table.
Customize the command below to your needs, and it will find any string recursively in files:
grep -i hack $(find /etc/ -type f)
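Note that command substitution word-splits the output of find, so this breaks on paths with spaces. A space-safe sketch of the same idea (scratch paths hypothetical):

```shell
mkdir -p "/tmp/safedemo/has space"
echo "hack the planet" > "/tmp/safedemo/has space/notes.txt"

# -exec … {} + hands the file names to grep intact, spaces and all:
find /tmp/safedemo -type f -exec grep -il hack {} +
# → /tmp/safedemo/has space/notes.txt
```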
I think it is worth mentioning how you can find:
All files containing at least one text, among a big set of texts:
grep -rlf ../patternsFile.txt .
Output:
./file1
./file2
./file4
The above, grouped by each text:
cat ../patternsFile.txt | xargs -I{} sh -c "echo {}; grep -rl \"{}\" ."
Output:
pattern1
./file1
./file2
pattern2
./file1
./file4
pattern3
./file1
./file2
./file4
Note that in order not to match patternsFile.txt itself, you need to add it one directory up (as shown in the above examples).
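A self-contained sketch of both commands (file names and patterns hypothetical):

```shell
mkdir -p /tmp/patdemo/work && cd /tmp/patdemo/work
printf 'pattern1\npattern2\n' > ../patternsFile.txt
printf 'pattern1 here\npattern2 too\n' > file1
printf 'pattern1 only\n' > file2

# -f reads one pattern per line; -l lists files matching ANY of them:
grep -rlf ../patternsFile.txt . | sort
# → ./file1 and ./file2

# Grouped by pattern, as shown above:
cat ../patternsFile.txt | xargs -I{} sh -c "echo {}; grep -rl \"{}\" ."
```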
Go to the directory.
Then search your text recursively → grep -r "yoursearchtext"
Now you should see all the files that contain the matching text.
Then open a file → less fileName
Then jump to the end of the file → Shift + G
Then search backwards for the text → ?yoursearchtext
Then jump to the next match → n
You can also use awk:
awk '/^(pattern)/{print}' /path/to/find/*
pattern is the string you want to match in the files.
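A minimal check (scratch file hypothetical) showing that the ^ anchor restricts the match to the start of the line:

```shell
mkdir -p /tmp/awkdemo
printf 'getCookie();\n  getCookie();\n' > /tmp/awkdemo/a.js

# Only the line that begins with getCookie prints; the indented one does not:
awk '/^(getCookie)/{print}' /tmp/awkdemo/*
# → getCookie();
```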
Note that awk takes file arguments, not a directory: running awk '/^(getCookie)/{print}' . only yields awk: warning: command line argument '.' is a directory: skipped.
After checking the alternatives for a desktop, I wrote an open-source GUI program for full-text search. It's blazing fast in comparison with the grep flavors.
You can check it at Missing Linux GUI app to full-text search files. It can search for both content and file names via an intuitive search string, so you can look for "license mit *.md" and get few false positives ;)
The offered solutions leave a number of issues unsolved:
If you don't know where to look, searching the entire filesystem is not as trivial as specifying the root directory on Linux: there is some stuff to exclude. What, exactly?
Following symbolic links can lead to loops which means the search never terminates and never investigates some of the files on the disk.
In most cases, you do not want to search inside virtual directories, like /dev/, /proc/ and /sys/, because that will spew errors, and not search actual files on disk, instead searching program memory and raw device data, causing the search to take very, very long.
You probably also don't want to search in /tmp/, which is usually a memory-mounted filesystem that is purged upon reboot and automatically cleaned on modern Linuxes.
The terminal has a limited scrollback capacity; if it is exceeded, results are lost. Results should be written to a file.
If the terminal connection drops at any point in the search, results are lost and everything has to be restarted. Running in the background would be much preferred.
Searching for code, as in all the examples, is still very tricky on the command line, mainly because of escaping:
Various special characters in bash have to be escaped.
Grep searches for a regex which has to be escaped.
If commands are put into other commands, that leads to more things being escaped.
All three combined when searching code is a literal nightmare to figure out: the user should have an input for what to search for that does not require any escaping.
Filenames can have special characters in them, mucking with your search. The command should be able to deal with evil filenames with quotes and spaces and newlines and other shenanigans in them.
Files can be removed or changed while you're searching, leading to 'File not found' errors cluttering the output. You may also lack permission for some files, cluttering it further. Including an option to suppress errors helps.
Most of the examples use only a single thread, making them unnecessarily dreadfully slow on modern many-core servers, even though the task is embarrassingly parallel. The search command should start one thread per CPU core to keep it busy.
The following should be a big improvement:
# Note: Change search string below here.
nCores=$(nproc --all)
read -r -d '' sSearch <<'EOF'
echo $locale["timezone"]." ".$locale['offset'].PHP_EOL;
EOF
find . \( -type f \) -and \( -not -type l \) -and \( -not \( -path "./proc/*" -o -path "./sys/*" -o -path "./tmp/*" -o -path "./dev/*" \) \) -print0 | xargs -P "$nCores" -0 grep -Fs "$sSearch" | tee /home/npr/results.txt &
If you do not want to suppress grep errors, use this:
# Note: Change search string below here.
nCores=$(nproc --all)
read -r -d '' sSearch <<'EOF'
echo $locale["timezone"]." ".$locale['offset'].PHP_EOL;
EOF
find . \( -type f \) -and \( -not -type l \) -and \( -not \( -path "./proc/*" -o -path "./sys/*" -o -path "./tmp/*" -o -path "./dev/*" \) \) -print0 | xargs -P "$nCores" -0 grep -F "$sSearch" | tee /home/npr/results.txt &
Change EOF to any other A-Za-z variable if it's desired to search for the literal text EOF.
With this, I reduced a day-long search that had produced thousands of errors with several of the top answers here to an easy sub-1-minute command.
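A scaled-down sketch of the same pipeline on a scratch directory (paths hypothetical), showing that -print0/-0 survive awkward file names while -P fans the greps out across cores:

```shell
mkdir -p /tmp/pardemo
echo 'needle' > "/tmp/pardemo/a file.txt"
echo 'hay'    > /tmp/pardemo/b.txt

# NUL-delimited pipeline, one grep batch per core; -F literal, -l names, -s quiet:
find /tmp/pardemo -type f -not -type l -print0 \
    | xargs -P "$(nproc --all)" -0 grep -Fls 'needle'
# → /tmp/pardemo/a file.txt
```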
Also see these answers:
Nested quotes nightmare : sending an e-mail from a remote host
How do I exclude a directory when using `find`? (most answers were wrong and I had to fix it for modern find).
Some follow-up remarks from the comments:
grep treats . as a single-character wildcard, among others, so for literal text my advice is to always use either fgrep or egrep.
To print only file names, replace -H with -l (and grep with fgrep, if appropriate). To exclude files with certain name patterns you would use find in a more advanced way; it's worthwhile to learn to use find, though. Just man find.
find … -exec <cmd> {} + is easier to type and faster than find … -exec <cmd> {} \;. It works only if <cmd> accepts any number of file name arguments; the saving in execution time is especially big if <cmd> is slow to start, like Python or Ruby scripts.
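To illustrate the -exec … + batching remark: both forms below print the same matches, but + passes many file names to a single grep invocation (a sketch with a hypothetical scratch file):

```shell
mkdir -p /tmp/plusdemo
echo hello > /tmp/plusdemo/x.txt

# One grep process per file:
find /tmp/plusdemo -type f -exec grep -H hello {} \;
# One grep process per batch of files (faster for many files):
find /tmp/plusdemo -type f -exec grep -H hello {} +
# Both print: /tmp/plusdemo/x.txt:hello
```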