1

I am trying to loop through a large 8 GB database with psycopg2 and Python. I have followed the documentation and I'm getting an error. I am trying to loop through each row without using .fetchall(), because the table is just too big to fetch entirely into memory. I can't use fetchone() either, because it fetches each row individually.

Note that the first time through the loop it returns a value; on the second time through it gives the error.

The documentation reads:

Note cursor objects are iterable, so, instead of calling explicitly fetchone() in a loop, the object itself can be used:
>>> cur.execute("SELECT * FROM test;")
>>> for record in cur:
...     print record
...
(1, 100, "abc'def")
(2, None, 'dada')
(3, 42, 'bar')

My code reads:

statement = ("select source_ip,dest_ip,bytes,datetime from IPS")
cursor.execute(statement)

for sip,dip,bytes,datetime in cursor:
    if sip in cidr:
        ip = sip
        in_bytes = bytes
        out_bytes = 0
        time = datetime
    else:
        ip = dip
        out_bytes = bytes
        in_bytes = 0
        time = datetime    
    cursor.execute("INSERT INTO presum (ip, in_bytes, out_bytes, datetime) VALUES (%s,%s,%s,%s);", (ip, in_bytes, out_bytes, time,))
    conn.commit()
    print "writing to presum"

and i get the following error:

for sip,dip,bytes,datetime in cursor: psycopg2.ProgrammingError: no results to fetch

2 Comments
  • How are you creating the cursor? Also, you can't use the same cursor for two different purposes at the same time; use a second cursor for that. Commented May 27, 2014 at 20:16
  • @jjanes cursor = conn.cursor(). It won't let me do it with just the cursor; I have to use fetchall(). It will pull the first few values with just the cursor, but then it says no results to fetch, though obviously there are some. Commented May 28, 2014 at 15:36
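The two-cursor pattern jjanes is pointing at can be sketched as below. This is a minimal, self-contained demo using sqlite3 (which ships with Python) so it runs anywhere; the pattern is identical with psycopg2: one cursor iterates the SELECT while a second, separate cursor performs the INSERTs, so the pending result set is never discarded. The table and column names here are placeholders, not the ones from the question's schema.

```python
import sqlite3

# Stand-in database: with psycopg2 you would use psycopg2.connect(...) instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ips (source_ip TEXT, bytes INTEGER)")
conn.executemany("INSERT INTO ips VALUES (?, ?)",
                 [("10.0.0.1", 100), ("10.0.0.2", 200)])
conn.execute("CREATE TABLE presum (ip TEXT, bytes INTEGER)")

read_cur = conn.cursor()    # iterates the large SELECT
write_cur = conn.cursor()   # used only for the INSERTs

read_cur.execute("SELECT source_ip, bytes FROM ips")
for ip, nbytes in read_cur:
    # Writing through a second cursor leaves read_cur's result set intact.
    write_cur.execute("INSERT INTO presum VALUES (?, ?)", (ip, nbytes))
conn.commit()

print(conn.execute("SELECT COUNT(*) FROM presum").fetchone()[0])
```

With psycopg2 the only differences are the connection call and `%s` placeholders instead of `?`.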

3 Answers

1

Looks like you're passing a tuple to cursor.execute. Try passing just the SQL string you want to run.

statement = "select source_ip,dest_ip,bytes,datetime from IPS"
cursor.execute(statement)

1 Comment

OK, I did that, and then it gets through about half of the test data and gives me no results to fetch; but if I use fetchall() it will loop through them all.
1

You are changing the result set inside the loop here (the INSERT on the same cursor discards the pending SELECT results):

cursor.execute("INSERT INTO presum (ip, in_bytes, out_bytes, datetime) VALUES (%s,%s,%s,%s);", (ip, in_bytes, out_bytes, time,))

Instead, do it all in SQL:

statement = """
    insert into presum (ip, in_bytes, out_bytes, datetime)

    select source_ip, bytes, 0, datetime
    from IPS
    where source_ip << %(cidr)s

    union all

    select dest_ip, 0, bytes, datetime
    from IPS
    where not source_ip << %(cidr)s
"""

cidr = IP('200.90.230/23')

cursor.execute(statement, {'cidr': cidr.strNormal()})
conn.commit()

I'm assuming source_ip is of type inet. The << operator checks whether an inet address is contained within a subnet.

3 Comments

You can't use fetchall() on an 8 GB database because it can't all be held in memory.
@PythonDevOps A better approach
The cidr is an IPy network CIDR range, defined as a global variable: cidr = IPy.IP('200.90.230/23'). I changed the numbers, but you get the idea.
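As an aside, if you'd rather not depend on IPy, the standard-library ipaddress module (Python 3.3+) can do the same containment check. This sketch uses the illustrative network from the comment above, written with an explicit last octet as ipaddress requires:

```python
import ipaddress

# Containment check equivalent to the "sip in cidr" test from the question,
# assuming the addresses arrive as plain strings.
cidr = ipaddress.ip_network("200.90.230.0/23")

def in_cidr(ip_string, network=cidr):
    """Return True when the address falls inside the network."""
    return ipaddress.ip_address(ip_string) in network

print(in_cidr("200.90.230.7"))   # inside the /23
print(in_cidr("10.0.0.1"))       # outside
```

Either way, strNormal() (IPy) or str(network) (ipaddress) gives you the text form to bind as the %(cidr)s parameter.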
0

I was interested in this question. Perhaps what you could do is use cursor.fetchmany(size). For example:

cursor.execute("select * from large_table")

# Set the max number of rows to fetch at each iteration
max_rows = 100
while True:
    rows = cursor.fetchmany(max_rows)
    if not rows:
        break
    for arow in rows:
        # do some processing of the row here
        pass

Maybe that would work for you?
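To show the batching behaviour concretely, here is the fetchmany() loop in runnable form, using sqlite3 from the standard library as a stand-in for psycopg2 (the loop shape is unchanged either way); the table name and row counts are made up for the demo:

```python
import sqlite3

# Populate a throwaway table with 250 rows so the batching is visible.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE large_table (n INTEGER)")
conn.executemany("INSERT INTO large_table VALUES (?)",
                 [(i,) for i in range(250)])

cur = conn.cursor()
cur.execute("SELECT n FROM large_table")

max_rows = 100
batches = 0
total = 0
while True:
    rows = cur.fetchmany(max_rows)
    if not rows:          # fetchmany returns an empty list when exhausted
        break
    batches += 1
    total += len(rows)

print("%d batches, %d rows" % (batches, total))  # 100 + 100 + 50
```

Only max_rows rows are held in memory at a time, which is the point for an 8 GB table.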
