1

I'm working with a large set of CSV files (tables), and I need to remove the cells that contain letters and keep the numeric cells.

For example:

   p1     p2      p3       p4      p5
 dcf23e   2322   acc41   4212     cdefd

In this case, I only want to remove dcf23e, acc41 and cdefd. After removing those strings, I want to leave their cells empty.

How would I do this? Thanks in advance.

The code I've tried is below. It removes the non-numeric characters from a string, but the problem is that a string like 23cdgf2 becomes 232, which is not what I want. Also, after stripping the characters, when I try to convert the strings to int for calculations, some of them have turned into decimals, because a string like 123def.24 becomes 123.24.

temp = ''.join(c for c in temp if c in '1234567890.')  # Strip all non-numeric characters
# Convert the strings to integers for calculations. int() is wrapped in a
# function because blank strings cannot be converted to int.
def mk_int(s):
    s = s.strip()
    return int(s) if s else 0
mk_int(temp)
print(temp)
1
  • I read the csv link and it doesn't really say anything about sorting out character containing strings.. Commented Jun 7, 2015 at 3:46

4 Answers

3

Compile the regex for performance, and split the string for correctness:

import re
regex = re.compile(r'.*\D+.*')
def my_parse_fun(line):
    return [regex.sub('', emt) for emt in line.split()]

Following AbhiP's answer, you can also do:

[val if val.isdigit() else '' for val in line.split()]
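
Since the question is about CSV files rather than a single line, here is a minimal sketch of the same isdigit check applied row by row with the standard csv module. It assumes comma-separated data with a header row; the helper name blank_non_numeric and the in-memory sample are hypothetical, used only for illustration.

```python
import csv
import io

def blank_non_numeric(rows):
    """Yield each row with non-digit cells replaced by an empty string."""
    for row in rows:
        yield [cell if cell.strip().isdigit() else '' for cell in row]

# In-memory stand-in for a real file; the contents are illustrative only.
source = io.StringIO("p1,p2,p3,p4,p5\ndcf23e,2322,acc41,4212,cdefd\n")
reader = csv.reader(source)
header = next(reader)  # keep the header row untouched
cleaned = list(blank_non_numeric(reader))

# Write the result back out as CSV (here to a string instead of a file).
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(header)
writer.writerows(cleaned)
print(out.getvalue())
```

For a real file, the two StringIO objects would be replaced by open(..., newline='') handles. Note that isdigit() also blanks decimals and negative numbers, which may or may not be what you want.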

2

I would use a simple setup for doing quick tests.

a = 'dcf23e   2322   acc41   4212     cdefd'
cleaned_val = lambda v: v if v.isdigit() else ''
[cleaned_val(val) for val in a.split()]

This returns each string unchanged if it consists only of digits, and an empty string in its place otherwise:

['', '2322', '', '4212', '']

However, this gives you strings only. If you want to convert the values to integers (replacing the invalid ones with 0), change your lambda:

convert_to_int = lambda v: int(v) if v.isdigit() else 0

[convert_to_int(val) for val in a.split()]

Your new results will be all valid integers:

[0, 2322, 0, 4212, 0]
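
One caveat: isdigit() is False for decimals such as 123.24 and for negative numbers, so those would also become 0. If they should be kept, a possible sketch is to attempt the conversion and fall back to a default. The helper name to_number is hypothetical. Be aware that float() also accepts exponent notation and words like 'inf' or 'nan', so a cell such as 23e1 would be parsed as 230.0; whether that is acceptable depends on your data.

```python
def to_number(cell, default=0):
    """Return cell as int or float, or default if it cannot be parsed."""
    text = cell.strip()
    try:
        return int(text)
    except ValueError:
        pass  # not a whole number, try a float next
    try:
        return float(text)
    except ValueError:
        return default

a = 'dcf23e   2322   acc41   4212     cdefd   123.24   -7'
print([to_number(val) for val in a.split()])
```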

2

Use a regex:

import re
def convert_string_to_blank(_str):
    # Blank any token that contains at least one ASCII letter
    return ['' if re.search("[a-zA-Z]", c) else c for c in _str.split()]

Or use isalpha:

def convert_string_to_blank(_str):
    return ['' if any(c.isalpha() for c in s) else s for s in _str.split()]
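
Both versions only reject tokens that contain letters, so something like 12-34 or $5 would pass through unchanged. If that matters, an alternative sketch is to whitelist what a number looks like instead. The pattern below (optional sign, digits, optional fractional part) and the name keep_numeric are assumptions for illustration, not part of the question.

```python
import re

# Optional sign, digits, optional fractional part; nothing else allowed.
NUMERIC = re.compile(r'[+-]?\d+(\.\d+)?')

def keep_numeric(line):
    """Blank every whitespace-separated token that is not a plain number."""
    return [tok if NUMERIC.fullmatch(tok) else '' for tok in line.split()]

print(keep_numeric('dcf23e 2322 acc41 4212 cdefd 123.24 123def.24 12-34'))
```

Using fullmatch means the whole token has to fit the pattern, so 123def.24 is blanked rather than collapsed to 123.24.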

1 Comment

Your findall function call will not produce correct result. Please test your code.
0

Have you tried a for loop with a try statement?

temp = ['dcf23e', '2322', 'acc41', '4212', 'cdefd']
kept = []
for element in temp:
    try:
        int(element)  # raises ValueError for non-numeric strings
    except ValueError:
        continue  # drop the element
    kept.append(element)
print(kept)

Or, if you want to convert each value to an int, you can write this:

temp = ['dcf23e', '2322', 'acc41', '4212', 'cdefd']
for index, element in enumerate(temp):
    try:
        temp[index] = int(element)  # convert valid numbers in place
    except ValueError:
        temp[index] = 0  # replace non-numeric strings with 0
print(temp)
