How to properly split a specific string in Python

Question

I have a list of strings, each representing the id.age of a user:

users = ["1.20", "2.35", "3", "4", "5.", "6.30", "7."]

How can I properly split it to get the id and age separately?

Note that we have some data with the age information missing (e.g. "3" and "4"), and even worse, we have some data only with an id and a point (e.g. "5." and "7.").

Sure, I can use the split function, for example:

>>> "1.2".split('.')
['1', '2']
>>> "2".split('.')
['2']
>>> "3.".split('.')
['3', '']

But then I would need to check each result, maybe with something like this:

res = "3.".split('.')
id = int(res[0])
if len(res) > 1:
    if res[1] != "":
        age = int(res[1])

Another option is to use the rpartition function, for example:

>>> "1.2".rpartition('.')
('1', '.', '2')
>>> "2".rpartition('.')
('', '', '2')
>>> "3.".rpartition('.')
('3', '.', '')

But I still need to check the results manually, and in the second example the value that should be the id ends up in the age position (e.g. ('', '', '2')).

Is there a built-in function that gives me a result like this?

>>> "1.2".some_split_function('.')
('1', '.', '2')
>>> "2".some_split_function('.')
('2', None, None)
>>> "3.".some_split_function('.')
('3', '.', None)

So I can just call it in a loop like this:

for user_info in users:
    id, _, age = user_info.some_split_function('.')
    print int(id)
    if age is not None:
        print int(age)

Slater Victoroff · Accepted Answer · 2017-07-22 22:55:36Z

2

Yup, you just use partition instead of rpartition.

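for user_info in users:
    # partition always returns a 3-tuple: (head, separator, tail)
    id, _, age = user_info.partition('.')
    # age is '' when the dot or the age is missing, and ''.isdigit() is False
    if age.isdigit():
        print(int(age))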

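You'll want to change that conditional from a None check to a check that you actually pulled out a number. partition returns an empty string, never None, for a missing part, and an empty string fails isdigit(), so this covers both the missing dot and the trailing dot.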
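If you really want the exact tuple shape from the question, a thin wrapper around partition is enough. This is only a sketch: split_id_age is a made-up name, not a built-in, and it simply turns the empty strings that partition returns into None.

```python
def split_id_age(text, sep="."):
    """Hypothetical helper: return (id, sep or None, age or None)."""
    head, found_sep, tail = text.partition(sep)
    # partition yields '' for a missing separator or tail; map those to None
    return head, found_sep or None, tail or None


users = ["1.20", "2.35", "3", "4", "5.", "6.30", "7."]

for user_info in users:
    user_id, _, age = split_id_age(user_info)
    print(int(user_id))
    if age is not None:
        print(int(age))
```

The loop from the question then works unchanged apart from the function name. The sketch uses user_id rather than id only to avoid shadowing the built-in id().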

In general though, the way to avoid this problem is to not structure your data like that in the first place.

Having seen some of the other answers, I see no reason to do anything so complex. If you want a functional solution that maps id to age, then I would advocate for something like this:

>>> {id: age or None for id, _, age in [user.partition(".") for user in users]}
{'1': '20', '3': None, '2': '35', '5': None, '4': None, '7': None, '6': '30'}
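If you also want integers rather than strings, the same partition idea extends naturally. The following is a minimal sketch, assuming every id is a valid integer string; parse_users is a hypothetical name chosen for illustration.

```python
def parse_users(entries):
    """Hypothetical helper: map int id -> int age, or None when age is missing."""
    ages = {}
    for entry in entries:
        user_id, _, age = entry.partition(".")
        # isdigit() is False for '', so "3" and "5." both end up with None
        ages[int(user_id)] = int(age) if age.isdigit() else None
    return ages


users = ["1.20", "2.35", "3", "4", "5.", "6.30", "7."]
ages_by_id = parse_users(users)
print(ages_by_id)
```

On Python 3.7 and later the dict keeps insertion order, so the ids print in the order they appear in users instead of the shuffled order shown above.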

edited Jul 22, 2017 at 22:55

answered Jul 22, 2017 at 16:08

Slater Victoroff


3 Comments

Francisco Over a year ago

Have you tested your code? Because I'm pretty sure there's no such thing as lpartition.

KelvinS Over a year ago

I'm getting the following error when trying to use lpartition: AttributeError: 'str' object has no attribute 'lpartition'. Just to clarify, the data is not 'correctly' structured because it is coming from a text file manually edited.

Slater Victoroff Over a year ago

Apologies, I'm on a strange Python version; just use partition.

ettanany · Accepted Answer · 2017-07-22 16:22:02Z

0

Try the following. We split u only if it contains a dot; otherwise u is the id and age is set to None.

users = ["1.20", "2.35", "3", "4", "5.", "6.30", "7."]
data = []

for u in users:
    id, age = u.split('.') if '.' in u else [u, None]
    age = None if age == '' else age
    data.append({id: age})

If you want your ids to be integers, just call int() on id like this:

data.append({int(id): age})

Output:

>>> data
[{'1': '20'}, {'2': '35'}, {'3': None}, {'4': None}, {'5': None}, {'6': '30'}, {'7': None}]
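One caveat with the loop above: it unpacks u.split('.') into exactly two names, so an entry containing more than one dot (a hypothetical "1.2.3", which does not occur in the sample data) would raise ValueError. A possible guard, sketched here with a made-up helper named split_user, is to pass maxsplit=1 so that split never returns more than two pieces.

```python
def split_user(u):
    """Hypothetical helper: return (id, age), with age None when missing."""
    # maxsplit=1 caps the result at two pieces; extra dots stay in the age part
    parts = u.split(".", 1)
    user_id = parts[0]
    age = parts[1] if len(parts) > 1 and parts[1] != "" else None
    return user_id, age


users = ["1.20", "2.35", "3", "4", "5.", "6.30", "7."]
# Same list-of-dicts output shape as in the answer above
data = [{user_id: age} for user_id, age in map(split_user, users)]
print(data)
```

For the sample list this produces the same data as before; only the behavior on malformed entries differs.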

edited Jul 22, 2017 at 16:22

answered Jul 22, 2017 at 16:17

ettanany


3 Comments

KelvinS Over a year ago

Thanks a lot @ettanany, this is a great example and fits very well in my case.

ettanany Over a year ago

@KelvinS Great! Happy to help :)

Slater Victoroff Over a year ago

@ettanany Code is overly complex. Injecting a lot of useless logic just to compensate for the fact that you're using split instead of partition, which is a better solution. List of dicts as an output format also doesn't make a lot of sense for this problem.
