Extract numbers from strings in Python

Question

I want to extract numbers from strings like. They appear in many columns, so what is the most efficient way to strip the text and keep only the numbers? Is there a way other than using regex?

It's always important to include three things in your question: first, samples of input; second, samples of expected output; and third, your own attempts in the form of code. Please add these to make the question clear. Thank you.
– RavinderSingh13, Commented Apr 5, 2021 at 23:47
@RavinderSingh13 Thanks for letting me know. I've added sample input and outputs.
– Ilovenoodles, Commented Apr 6, 2021 at 1:04
@AmitVikramSingh Thanks for letting me know. I've added them.
– Ilovenoodles, Commented Apr 6, 2021 at 1:04

Tim Biegeleisen · Accepted Answer · 2021-04-05 23:46:43Z

4

Assuming you expect only one number per value, you could try using str.extract here, which captures the first match in each string:

df["some_col"] = df["some_col"].str.extract(r'(\d+(?:\.\d+)?)')
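To see what that one-liner does end to end, here is a minimal sketch on made-up data. The DataFrame contents are hypothetical, and the final pd.to_numeric step is an addition, since str.extract returns the matched text as strings rather than numbers.

```python
import pandas as pd

# Hypothetical sample data; the column name and values are assumptions.
df = pd.DataFrame({"some_col": ["weight 12.5 kg", "7 items", "no number"]})

# expand=False returns a Series holding the first match from each string
# (NaN where the pattern does not match).
extracted = df["some_col"].str.extract(r'(\d+(?:\.\d+)?)', expand=False)

# The captured values are still strings, so convert them to a numeric dtype.
df["some_col"] = pd.to_numeric(extracted)

print(df)
```

Rows without any digits end up as NaN, so the column comes out as floats.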


2 Comments

Ilovenoodles Over a year ago

Hi Tim, what if I have 32 columns that need to be modified like this? Is there a more efficient way to do this?

Tim Biegeleisen Over a year ago

@Ilovenoodles Reference this accepted answer. You may use str.extract on multiple columns, by passing a list of columns.
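The linked answer is not reproduced here, so the following is only one possible reading of "passing a list of columns": select the columns with a list and apply the same extraction to each. The column names and data are hypothetical.

```python
import pandas as pd

# Hypothetical data: two columns to clean and one to leave untouched.
df = pd.DataFrame({
    "col_a": ["12.5 kg", "3 kg"],
    "col_b": ["height 180 cm", "n/a"],
    "label": ["x", "y"],
})

# List of columns that need the same treatment (could hold all 32 names).
cols = ["col_a", "col_b"]
pattern = r'(\d+(?:\.\d+)?)'

# DataFrame.apply passes each selected column to the lambda as a Series.
df[cols] = df[cols].apply(
    lambda s: pd.to_numeric(s.str.extract(pattern, expand=False))
)

print(df)
```

This keeps the work in pandas' vectorized string methods and avoids writing one assignment per column.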

git_rekt · Accepted Answer · 2021-04-06 00:28:42Z

I would use a function with a regex that matches the pattern of what you are seeing. Since you tagged pandas and dataframe, I am assuming you are working with a DataFrame, but a sample output would certainly help. Here is how I would tackle it:

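import pandas as pd
import numpy as np
import re

def extract_numbers(column1: str) -> float:
  # Return the first integer or decimal found in the string, or NaN if none.
  for x in column1.split():
    # \d+(?:\.\d+)? also matches single-digit numbers, unlike \d+\.?\d+
    match = re.search(r'\d+(?:\.\d+)?', x)
    if match:
      return float(match.group())
  return np.nan

# df is assumed to be an existing DataFrame with a string column 'YourColumn'
df['Numbers'] = df['YourColumn'].apply(extract_numbers)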

The result is a new column called "Numbers" that contains the number extracted from each string, or NaN when no number is found. Once you have that numeric column, you can work with it however you please.
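A quick check of that behaviour on made-up data might look like the sketch below. The helper is restated in compact form so the snippet runs on its own, and the guard for non-string cells (such as None or NaN) is an added assumption, not part of the answer above.

```python
import re

import numpy as np
import pandas as pd

def extract_numbers(text):
  # Assumption: cells that are not strings are treated as "no number".
  if not isinstance(text, str):
    return np.nan
  match = re.search(r'\d+(?:\.\d+)?', text)
  return float(match.group()) if match else np.nan

# Hypothetical input covering a decimal, an integer, no number, and a missing cell.
df = pd.DataFrame({"YourColumn": ["price 4.99 USD", "qty 3", "unknown", None]})
df["Numbers"] = df["YourColumn"].apply(extract_numbers)

print(df)
```

The first two rows yield 4.99 and 3.0, and the last two yield NaN.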
