Why is Python showing 'ValueError: could not convert string to float'?

Question

I have a CSV containing numbers which I am trying to convert to floats.

filename = "filename.csv"
enclosed_folder = "path/to/Folder"
full_path = os.path.join(enclosed_folder,filename)

with open(full_path) as input_data:
    temp = input_data.readlines()
    n = len(temp) #int(temp.pop(0))
    matrix = [x.split(" ") for x in temp]
    for i in range(n):
        for j in range(n):
            matrix[i][j] = float(matrix[i][j])
    input_data.close()

When I open the file in any text editor, it does not show the \n at the end of each row.

But running the Python code shows 'ValueError: could not convert string to float' because of '\n' being present at the end of each row.

Traceback (most recent call last):
  File "hierarchical-clustering.py", line 37, in <module>
    matrix[i][j] = float(matrix[i][j])
ValueError: could not convert string to float: '1,0.058824,0.076923,0.066667,0.055556,0.058824,0.071429,0.052632,0.076923,0.0625,0.0625,0.055556,0.055556,0.05,0.066667,0,0,0.055556,0.0625,0.058824,0.058824,0.047619,0.055556,0.0625,0,0.052632,0.066667,0.055556,0.0625,0.058824,0.041667,0.066667,0.058824,0.071429,0.066667,0.076923,0,0.083333,0.052632,0.071429,0.076923,0,0.0625,0.076923,0.058824,0.076923,0.055556,0,0.0625,0.071429,0.0625,0.0625,0.083333,0,0,0,0.058824,0.0625,0,0.058824,0.0625,0.0625,0.066667,0.0625,0.052632,0.066667,0.076923,0.058824,0.071429,0.066667,0.058824,0.071429,0.058824,0.071429,0.058824,0.071429,0.071429\n'

So, how do I fix this error?

EDIT: I used strip() as well as rstrip(), as suggested in some of the answers, to remove whitespace, but the error still does not go away:

Traceback (most recent call last):
  File "hierarchical-clustering.py", line 37, in <module>
    matrix[i][j] = float(matrix[i][j].rstrip())
ValueError: could not convert string to float: '1,0.058824,0.076923,0.066667,0.055556,0.058824,0.071429,0.052632,0.076923,0.0625,0.0625,0.055556,0.055556,0.05,0.066667,0,0,0.055556,0.0625,0.058824,0.058824,0.047619,0.055556,0.0625,0,0.052632,0.066667,0.055556,0.0625,0.058824,0.041667,0.066667,0.058824,0.071429,0.066667,0.076923,0,0.083333,0.052632,0.071429,0.076923,0,0.0625,0.076923,0.058824,0.076923,0.055556,0,0.0625,0.071429,0.0625,0.0625,0.083333,0,0,0,0.058824,0.0625,0,0.058824,0.0625,0.0625,0.066667,0.0625,0.052632,0.066667,0.076923,0.058824,0.071429,0.066667,0.058824,0.071429,0.058824,0.071429,0.058824,0.071429,0.071429'

I don't think float cares about newlines. I just tried float("1.0\n") on my machine and it happily gives me 1.0. I think the problem is your commas. float("1,2") does not work, for instance.
– Kevin, Commented Jun 29, 2017 at 13:19
Have you considered using the csv module to read your csv file? If you use that instead of trying to parse the file manually, IIRC it will perform rudimentary type conversion on your behalf. Then you don't need to call float at all.
– Kevin, Commented Jun 29, 2017 at 13:20
@Kevin - No, Python's csv will not assume any types. It deliberately considers everything a string. (This is both more Pythonic (explicit is better than implicit) and avoids one of the things that programmers hate most about Excel.)
– John Y, Commented Jun 29, 2017 at 13:30
Oops. Perhaps I was thinking of a third-party csv parser, then. Still, the module is useful even without providing type conversion.
– Kevin, Commented Jun 29, 2017 at 13:36
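
For reference, here is a minimal sketch of the csv-module route discussed in these comments. csv.reader yields every field as a string, so the float conversion stays explicit. The in-memory sample data is made up and stands in for the real file.

```python
import csv
import io

# Stand-in for open(full_path, newline=""); the contents are invented sample data.
sample_file = io.StringIO("1,0.058824,0.076923\n0.058824,1,0.0625\n")

reader = csv.reader(sample_file)
# Each row arrives as a list of strings; convert explicitly and skip empty rows.
matrix = [[float(value) for value in row] for row in reader if row]
print(matrix)
```

With a real file, the same comprehension would run inside a `with open(...)` block in place of the StringIO object.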

Robert · Accepted Answer · 2017-06-29 13:24:02Z

6

The error is due to your line parsing. You are splitting on spaces, but according to your screenshot the values are separated by commas. The key is to look at the error message: it shows that float() is being asked to convert the entire line, not a single value.

Change:

matrix = [x.split(" ") for x in temp]

To:

matrix = [x.split(",") for x in temp]
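
The difference is easy to see on a single line. A minimal sketch, using a shortened, made-up row rather than the real file:

```python
# Shortened sample row in the same shape as the line in the traceback (hypothetical data).
line = "1,0.058824,0.076923\n"

# Splitting on a space finds no separator, so the whole line comes back as one item.
by_space = line.split(" ")
print(len(by_space))

# Splitting on a comma yields one string per value; float() tolerates the trailing newline.
by_comma = line.split(",")
row = [float(value) for value in by_comma]
print(row)
```

Because the space-split list holds only one element, float() receives the full row, which matches the string shown in the traceback.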


1 Comment

Robert Over a year ago

@Kristada673, it happens to us all. The best thing to do is read the error messages very carefully to determine the root cause. Otherwise, you're likely to go down a rabbit hole and waste a ton of time before realizing how simple the mistake was.

Shai · Answer · 2017-06-29 13:19:20Z

2

You can use strip() to remove whitespace from the string.

matrix[i][j] = float(matrix[i][j].strip())

If the commas are troubling you, you might want to split on commas instead of spaces:

matrix = [x.strip().split(",") for x in temp]
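
Combined with the float conversion, this might look like the sketch below. The helper name parse_matrix and the sample rows are assumptions for illustration only; they are not from the question's file.

```python
def parse_matrix(lines):
    """Convert comma-separated text lines into a list of rows of floats."""
    matrix = []
    for line in lines:
        stripped = line.strip()      # drop the trailing newline and stray spaces
        if not stripped:             # skip blank lines, e.g. a final empty line
            continue
        matrix.append([float(value) for value in stripped.split(",")])
    return matrix

# Made-up sample in the same format as the file in the question.
sample = ["1,0.5,0.25\n", "0.5,1,0.125\n", "0.25,0.125,1\n"]
matrix = parse_matrix(sample)
print(matrix[1])
```

Iterating over the rows directly, instead of indexing with range(n), also avoids assuming that the matrix is square.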


Ofer Sadan · Answer · 2017-06-29 13:19:33Z

1

Remove the newline char with rstrip() like this:

matrix[i][j] = float(matrix[i][j].rstrip())
