I am trying to call some Unix tools from a Perl driver script because I know little about writing shell scripts. My goal is simply to chain a few Unix commands together so that I can process 100 directories with a single Perl command.

I have more than 100 folders, each containing some number of files. I want to do the same thing in every folder: concatenate its files, sort the combined file, and use bedtools to merge overlapping regions (a common practice in bioinformatics).

Here is what I have:

#!/usr/bin/perl -w
use strict;

my $usage ="
This is a driver script to merge files in each folder into one combined file
";
die $usage unless @ARGV;

my ($in)=@ARGV;
open (IN,$in)|| die "cannot open $in";

my %hash;
my $final;

while(<IN>){
    chomp;
    my $tf = $_;
    my @array =`ls $tf'/.'`;
    my $tmp;
    my $tmp2;
    foreach my $i (@array){
        $tmp = `cut -f 1-3 $tf'/'$i`;
        $tmp2 = `cat $tmp`;
    }
    my $tmp3;
    $tmp3=`sort -k1,1 -k2,2n $tmp2`;
    $final = `bedtools merge -i $tmp3`;
}
print $final,"\n";

I know that this line isn't working at all:

$tmp2 = `cat $tmp`;

The issue is how to capture the output of one command in a Perl variable and then use that variable in a later Unix command.

Please let me know what I should change to make it work. Greatly appreciated.

  • You might know shell scripting, but not its good practices: never parsing the output of ls is one of the first things one should learn! You're mixing a beautiful language (Perl) with the worst practices of shell scripting (no offence). Commented Oct 28, 2013 at 21:05
  • Could you explain what your script is supposed to do? Commented Oct 28, 2013 at 21:08
  • I have more than 100 folders, in each folder, there are n number of files. I want to do the same thing on each folder, which is to combine the files in them and sort the combined file and use bedtools to merge overlapping regions (quite common practice in bioinformatics) Commented Oct 28, 2013 at 21:10

3 Answers

The output from backticks usually ends with a newline, which usually has to be removed before the output is used downstream. Add some chomp calls to your code:

chomp( my @array =`ls $tf'/.'` );

my $tmp;
my $tmp2;
foreach my $i (@array){
    chomp( $tmp = `cut -f 1-3 $tf'/'$i` );
    chomp( $tmp2 = `cat $tmp` );
}
my $tmp3;
chomp( $tmp3=`sort -k1,1 -k2,2n $tmp2` );
$final = `bedtools merge -i $tmp3`;
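
Note that chomp only strips the trailing newline. The variable still holds the command's output text, not a file name, so a later command that expects a file argument will go looking for files named after the data. A minimal shell sketch of the alternative, feeding captured text to the next command through a pipe (sort_rows is a hypothetical helper name, not something from the question):

```shell
# sort_rows TEXT: sort captured rows by column 1, then numerically by column 2.
# (Hypothetical helper, for illustration only.)
sort_rows() {
    # printf writes the text to sort's standard input. Passing "$1" as an
    # argument instead would make sort look for a file with that name.
    printf '%s\n' "$1" | sort -k1,1 -k2,2n
}
```

For example, rows=$(cut -f 1-3 some.bed) followed by sort_rows "$rows" prints the sorted rows, where some.bed stands for any tab-separated input file. The same idea carries over to Perl: a whole pipeline such as cut ... | sort ... can go inside one pair of backticks, so that no intermediate variable is needed.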

Here is an example of using a Perl variable in a shell command:

#!/usr/bin/env perl

my $var = "/etc/passwd";

my $out = qx(file $var);

print "$out\n";

For the rest, it's very messy. You should take the time to learn Perl rather than mixing coreutils commands with Perl, since Perl itself is a better tool for the whole job.

OK. I gave up on Perl and decided to try a shell script instead. It worked! Thanks for the answers above, though.

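for dir in */
do
    name=${dir%/}
    cd "$dir" || continue
    # Keep only the first three columns of each file.
    for file in *
    do
        cut -f 1-3 "$file" > "$file.tmp"
    done
    # Concatenate the trimmed files, then sort by column 1 and numeric column 2.
    cat ./*.tmp > "$name.tmp1"
    sort -k1,1 -k2,2n "$name.tmp1" > "$name.tmp2"
    bedtools merge -i "$name.tmp2" > "$name.combined"
    # Return to the parent so the next relative directory name resolves.
    cd ..
done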
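
The temporary files can probably be avoided by piping each step into the next. The following is only a sketch under a few assumptions: merge_dir, merge_all and the MERGE_CMD variable are hypothetical names introduced here, the output directory is assumed to lie outside the parent folder, and it relies on bedtools reading piped input when given -i stdin.

```shell
# merge_dir DIR: print the merged intervals for all files in DIR.
# MERGE_CMD is a hypothetical override, added so the pipeline can be
# tried without bedtools installed (e.g. MERGE_CMD=cat).
merge_dir() {
    # Left unquoted on purpose so the default splits into command and arguments.
    cut -f 1-3 "$1"/* | sort -k1,1 -k2,2n | ${MERGE_CMD:-bedtools merge -i stdin}
}

# merge_all PARENT OUTDIR: write OUTDIR/<name>.combined for each folder in PARENT.
merge_all() {
    mkdir -p "$2"
    for dir in "$1"/*/
    do
        # Skip the unexpanded pattern when PARENT has no subfolders.
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        merge_dir "$dir" > "$2/$name.combined"
    done
}
```

Calling merge_all . ../combined from the folder that holds the 100 directories would then write one .combined file per directory into ../combined, without changing the working directory or leaving .tmp files behind.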
