0

I want to take all the records from my MySQL table and check if there are duplicates. I had the idea of storing them all in an array and then checking the array for duplicates. The problem is, I have about 1.5 million rows in my MySQL table.

This is my code so far:

<?php

$con = mysql_connect('localhost', 'root', '');
$sel = mysql_select_db('usraccts', $con);

$users = array();

$q = "SELECT usrname FROM `users`";
$r = mysql_query($q, $con);

while($row = mysql_fetch_assoc($r))
{
 $users[] = $row['usrname'];
}

print_r($users);

?>

I'm not sure how I can adapt this to check for duplicates in the array entries, especially with 1.5 million of them :|

Thanks for any help.

1
  • The best bet is to handle it in the SQL query rather than in PHP. That way the usernames are processed once, on the database, instead of once on the database and a second time in PHP. Commented Jan 4, 2010 at 22:33

7 Answers

1

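You can do it in MySQL with something like

SELECT usrname, COUNT(usrname) AS duplicates FROM `users` GROUP BY usrname HAVING duplicates > 1

The count has to be filtered with HAVING after the GROUP BY, because an aggregate alias cannot be used in a WHERE clause. Every usrname this returns has at least one duplicate.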


1

$q = "SELECT distinct usrname FROM users";

With this query you get all unique usernames.

1

Maybe you could try a SQL query like:

SELECT usrname, 
COUNT(usrname) AS NumOccurrences
FROM users
GROUP BY usrname
HAVING ( COUNT(usrname) > 1 )

This should return all usernames that exist more than once, along with how many times each one occurs.
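
To make the GROUP BY / HAVING approach concrete, here is a minimal sketch of running such a query from PHP through PDO. It is an illustration under stated assumptions, not the asker's setup: an in-memory SQLite database stands in for the MySQL server so the snippet runs on its own (it needs the pdo_sqlite extension), and the sample names are made up. Against the real table you would presumably connect with a MySQL DSN for the usraccts database instead.

```php
<?php
// Assumption: in-memory SQLite stands in for MySQL so the sketch is self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical sample data; the real table holds about 1.5 million rows.
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, usrname TEXT)');
$insert = $pdo->prepare('INSERT INTO users (usrname) VALUES (?)');
foreach (['alice', 'bob', 'alice', 'carol', 'bob', 'alice'] as $name) {
    $insert->execute([$name]);
}

// The database groups and counts; only duplicated names are sent back to PHP.
$sql = 'SELECT usrname, COUNT(usrname) AS NumOccurrences
        FROM users
        GROUP BY usrname
        HAVING COUNT(usrname) > 1
        ORDER BY usrname';

$duplicates = [];
foreach ($pdo->query($sql) as $row) {
    $duplicates[$row['usrname']] = (int) $row['NumOccurrences'];
}

print_r($duplicates);
```

Only the duplicated names cross the connection, so PHP never has to hold all 1.5 million rows in memory.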

0
$q = "SELECT count(*),usrname FROM `users` group by usrname having count(*)>1";

0

A few comments:

One, you can use the DISTINCT keyword in your SQL to return each value only once (no dupes).

Two, why are you inserting duplicates in the db in the first place? You might want to fix that.

Three, you could select all the rows (not a good idea) and just stick them in the array like you're doing, except make this change so each username becomes its own array key:

$users[$row['usrname']] = $row['usrname'];

No dupes in that logic! heh
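
If the goal is to see which names are duplicated rather than just to drop the repeats, the same keyed-array idea can be extended slightly. The following is only a sketch: the $rows array is hypothetical sample data standing in for the fetch loop, and $seen and $duplicates are names introduced here for illustration.

```php
<?php
// Hypothetical sample rows standing in for the rows fetched from the users table.
$rows = [
    ['usrname' => 'alice'],
    ['usrname' => 'bob'],
    ['usrname' => 'alice'],
    ['usrname' => 'carol'],
    ['usrname' => 'alice'],
];

$seen = [];       // usrname => true for every name encountered so far
$duplicates = []; // usrname => number of extra occurrences after the first

foreach ($rows as $row) {
    $name = $row['usrname'];
    if (isset($seen[$name])) {
        // Already seen: record one more repeat for this name.
        $duplicates[$name] = ($duplicates[$name] ?? 0) + 1;
    } else {
        $seen[$name] = true;
    }
}

print_r($duplicates);
```

The isset() lookup on an array key is a hash lookup, so each row is checked in roughly constant time, but every distinct name still has to fit in memory.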

1 Comment

Thanks, I'm actually fixing up a friend's website. He has his users separated by their IDs, and they can have the same username, hence why I'm trying to fix it.
0

You could use MySQL's GROUP BY clause to find out which usernames exist twice or more. This puts a heavy load on the MySQL server, though.

SELECT usrname, count(*)
FROM `users`
GROUP BY `usrname`
HAVING count(*) > 1;

0

array_unique() will return only the unique array values. In all honesty, I wouldn't delegate this task to PHP, I'd handle it during my query to the database.
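
For completeness, if the names are already in a PHP array, array_unique() only strips the repeats and does not say which values were duplicated. A small sketch of one way to list them, using array_count_values() and array_filter(); the sample data is made up for illustration.

```php
<?php
// Hypothetical sample data standing in for the $users array built from the table.
$users = ['alice', 'bob', 'alice', 'carol', 'bob', 'alice'];

// Count how many times each value occurs: value => count.
$counts = array_count_values($users);

// Keep only the values that occur more than once (keys are preserved).
$duplicates = array_filter($counts, function ($count) {
    return $count > 1;
});

print_r($duplicates);
```

This still requires loading every row into PHP first, which is why doing the counting in the query is the better fit for 1.5 million rows.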
