I've pinged lists of computers many times before, but this is my first time doing it in Python. With a list of about 4,000 computer names, my script is very slow. How can I make it much faster and write the output to a comma-delimited text file?

import pandas as pd
import os
import sys
import subprocess
import datetime

#Get current date and time
now = datetime.datetime.now()
dt = now.strftime("%Y-%m-%d")
dtnow = now.strftime("%Y-%m-%d %H:%M")

#Open the file and read into memory
fh = pd.read_csv('list.csv')

#Fix column headers by replacing the spaces with an underscore
fh.columns = fh.columns.str.strip().str.replace(' ', '_')

#Read the computer names into a variable called "computers"
computers = fh.Machine_Name
#Debug - Uncomment line below to see a list of computer names from csv file
#print(computers)

def ping(comp):
    args = ["ping", "-n", "2", comp]
    p = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    output, error = p.communicate()
    if b'bytes=32' in output:  # communicate() returns bytes on Python 3
        writetofile(comp, ',online')
    else:
        writetofile(comp, ',offline')
    #endIf
#endDef

def writetofile(compname, data):
    with open('DLPProv_' + dt + '.txt', 'a') as f:
        f.write(compname + data + '\n')
    #endWith
#endDef

for i in computers:
    ping(i)
#endFor

#The file handle from writetofile() is already closed here, so reuse the helper
writetofile('END: ', dtnow)

I've tried using the code that @rolandsmith posted, but I'm getting errors:


import concurrent.futures as cf
import os
import pandas as pd
from pythonping import ping

#Open the file and read into memory
fh = pd.read_csv('list.csv')

#Fix column headers by replacing the spaces with an underscore
fh.columns = fh.columns.str.strip().str.replace(' ', '_')

#Read the computer names into a variable called "computers"
computers = fh.Machine_Name

def pingworker(address):
    rv = ping(address, count=4)
    if rv.success():
        return address, True
    return address,False

with cf.ThreadPoolExecutor() as tp:
    res = tp.map(pingworker, computers)

  • I think the ping function takes the most time. Could you add a timing statement before and after it to measure how long it takes? Commented Sep 23, 2019 at 18:34
  • Use concurrent.futures.ThreadPoolExecutor's map or multiprocessing.dummy.Pool's imap/imap_unordered to get thread-based parallelism? That's the usual solution for latency-bound problems like this. Commented Sep 23, 2019 at 18:40

1 Answer

When you think a script is slow, you should measure what causes it to be slow. Use e.g. line-profiler.
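Before reaching for a full profiler, a rough wall-clock measurement is often enough to confirm where the time goes. The following is a minimal sketch using only the standard library; the names timed and slow_step are hypothetical, and slow_step merely sleeps as a stand-in for one ping call.

```python
import time

def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start

def slow_step():
    # Hypothetical stand-in for a single ping; it only sleeps.
    time.sleep(0.05)
    return "done"

result, elapsed = timed(slow_step)
print(f"slow_step took {elapsed:.3f} s")
```

Wrapping the real ping call the same way would show how much of the total run time each host costs.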

My guess would be that in this case it is the subprocess that takes up most of the time. Ideally, you want to remove the overhead of starting a process for each ping. So instead of calling the ping program, install the pythonping module. This allows you to perform an ICMP echo request from Python. Note that this uses raw sockets, so depending on the OS you might need to run the script as root or make it capable of using raw sockets. Using this module removes the overhead of using subprocess.

Next, while your script is pinging, it spends most of its time waiting for a reply from the network. So we use a concurrent.futures.ThreadPoolExecutor to run more than one ping in parallel:

import concurrent.futures as cf
import os
from pythonping import ping

def pingworker(address):
    rv = ping(address)
    if rv.success():
        return address, True
    return address,False

with cf.ThreadPoolExecutor() as tp:
    res = tp.map(pingworker, list_of_addresses)

After this, res is an iterator of 2-tuples (wrap it in list() if you need a list), each containing the address and a boolean indicating whether the ping succeeded.

Note that the default number of threads depends on the Python version. In Python 3.5 through 3.7, a ThreadPoolExecutor uses up to 5*N threads, where N is the number of cores on your machine, so a four-core machine would run up to 20 ping calls at once. From Python 3.8 onward, the default is min(32, N + 4). You can experiment with the max_workers parameter when creating a ThreadPoolExecutor, but at a certain point you're going to saturate your network connection with ping calls.
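The effect of max_workers can be tried out without touching the network. This is a minimal sketch under stated assumptions: fake_ping is a hypothetical stand-in that only sleeps for 0.1 s, and the addresses come from the 192.0.2.0/24 documentation range rather than real hosts.

```python
import concurrent.futures as cf
import time

def fake_ping(address):
    # Hypothetical stand-in for a real ping: waits like a network call would.
    time.sleep(0.1)
    return address, True

# Documentation-only addresses; replace with real ones when using a real worker.
addresses = [f"192.0.2.{i}" for i in range(1, 41)]

start = time.perf_counter()
with cf.ThreadPoolExecutor(max_workers=40) as tp:
    # list() consumes the iterator returned by map(), preserving input order.
    results = list(tp.map(fake_ping, addresses))
elapsed = time.perf_counter() - start
print(f"{len(results)} simulated pings in {elapsed:.2f} s")
```

Run serially, the 40 simulated pings would take about 4 seconds; with 40 worker threads they overlap almost completely. With real pings, the useful upper limit depends on your network, so it is worth trying a few values.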

Edit

The pythonping.ping function requires an IP address, not a name. So you would have to do name lookup first. Luckily, this is built into the socket module. You can use e.g. socket.gethostbyname_ex for IPv4 address lookup, or socket.getaddrinfo to get both IPv4 and IPv6 addresses.
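For the socket.getaddrinfo route, a lookup helper could look roughly like this. It is a sketch, not part of pythonping: the name resolve_all is hypothetical, and the demo resolves the numeric address 127.0.0.1 so that it runs without a DNS server.

```python
import socket

def resolve_all(name):
    """Return a sorted list of unique IPv4/IPv6 addresses for name.

    An empty list means the lookup failed.
    """
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})

print(resolve_all("127.0.0.1"))
```

The set comprehension removes duplicates, since getaddrinfo usually returns one entry per socket type for the same address.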

If you have a list of names, presuming you're using IPv4, you could change the worker like this:

import concurrent.futures as cf
import socket
import os
from pythonping import ping

def pingworker(name):
    """
    Ping a hostname.

    Arguments:
        name (str): hostname.

    Returns:
        a 3-tuple (hostname, IP-address, ping-result)
        where hostname and IP-address are strings and
        ping-result is a bool.
    """
    try:
        _, _, IPs = socket.gethostbyname_ex(name)
        address = IPs[0]
    except socket.gaierror:
        return name, None, False  # Name lookup failed.
    rv = ping(address)
    if rv.success():
        return name, address, True
    return name, address, False   # Host doesn't respond.

with cf.ThreadPoolExecutor() as tp:
    res = tp.map(pingworker, list_of_names)

I also modified the worker function to return the IP address. That way you can distinguish a host that doesn't return pings from a host whose name cannot be resolved.
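The question also asks for comma-delimited output. Here is a minimal sketch using the standard csv module, assuming the results are 3-tuples shaped like the ones pingworker returns. The write_results name, the column headers, and the sample host names are hypothetical.

```python
import csv
import io

def write_results(results, fileobj):
    """Write (name, address, ok) tuples as comma-delimited rows."""
    writer = csv.writer(fileobj)
    writer.writerow(["name", "address", "status"])
    for name, address, ok in results:
        # address is None when the name lookup failed; write an empty field.
        writer.writerow([name, address or "", "online" if ok else "offline"])

# Hypothetical sample data standing in for the output of tp.map(pingworker, ...).
sample = [
    ("pc-001", "192.0.2.10", True),
    ("pc-002", "192.0.2.11", False),
    ("pc-003", None, False),
]
buf = io.StringIO()
write_results(sample, buf)
print(buf.getvalue())
```

To write a real file instead of the in-memory buffer, open it with open(path, "w", newline="") and pass the file object to write_results. Writing once from the main thread after the pool finishes also avoids reopening the file for every host.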


6 Comments

Requires raw sockets? That's... suboptimal. Third party packages that require admin/root to run are giant security holes; if the owner's creds get compromised (or the owner decides to be malicious themselves), the package can silently be updated to hijack machines. Clearly ping itself runs without admin/root, so I'm somewhat skeptical of the need for it in the package.
It doesn't actually require root privileges, just the cap_net_raw capability that can be set on the file by root. This is what ping does on my box.
@ShadowRanger If this was a server application, I would agree. Since it is not, the danger is much less. Second, your ping binary also runs setuid root.
@randomusername I suspect that cap_net_raw is Linux-specific. And the OP doesn't mention which OS he is using.
@RolandSmith Thanks for the detailed response. I'm not sure if I'm adding the computer names correctly, because I'm getting a lot of errors when running your script.