I am trying to split CSV data, in which one column holds arrays, into multiple columns. This works perfectly for most arrays because they contain whole numbers, but splitting the following array (which contains decimal values) causes a problem.

Here is an example. Suppose the following array data is stored in a column called "Array":

{58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5}

If I apply the following

```python
splitted_data=raw_data["Array"].str.split("\D",expand=True).add_prefix("setrwds_x")
```

I get the following result:

   setrwds_x1  setrwds_x2  setrwds_x4  setrwds_x5  setrwds_x7  setrwds_x8  \
0             58           5          58           5          58           5   
1             58           5          58           5          58           5   
2             58           5          58           5          58           5   
3             58           5          58           5          58           5   
4             58           5          58           5          58           5   
5             58           5          58           5          58           5   
6             58           5          58           5          58           5   
7             58           5          58           5          58           5   
8             58           5          58           5          58           5   
9             58           5          58           5          58           5   
10            58           5          58           5          58           5   
11            58           5          58           5          58           5   
12            58           5          58           5          58           5   
13            58           5          58           5          58           5   
14            58           5          58           5          58           5   
15            58           5          58           5          58           5   
16            58           5          58           5          58           5   
17            58           5          58           5          58           5   

It splits 58.5 into two columns, which is wrong. I need to keep 58.5 intact.

Does anyone have advice on how to solve this?

  • Why aren't you splitting by the | delimiter? (Commented Jan 22, 2021 at 10:28)

1 Answer

Try this. In a regex, \D matches any non-digit character, which includes both | and ., so the decimal point is treated as a separator too. You want to split explicitly on | only. You also need to drop the opening and closing braces using str[1:-1].

```python
raw_data["Array"].str[1:-1].str.split("|",expand=True).add_prefix("setrwds_x")
```
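To make the difference concrete, here is a small sketch using only the standard `re` module on a single cell. The shortened `cell` string and the variable names are made up for illustration; only the `{a |b |c}` layout comes from the question.

```python
import re

# Shortened, made-up sample in the same layout as the question's data.
cell = "{58.5 |58.5 |58.5}"

# \D matches every non-digit, so '{', '.', ' ', '|' and '}' all act as separators.
by_non_digit = re.split(r"\D", cell)

# Dropping the braces and splitting on the literal pipe keeps each number whole;
# strip() removes the space that precedes each '|'.
by_pipe = [part.strip() for part in cell[1:-1].split("|")]

print(by_non_digit)
print(by_pipe)
```

The first list contains `'58'` and `'5'` as separate items with empty strings in between, which seems consistent with the question's output showing columns x1, x2, x4, x5 and so on. The second list holds the intact `'58.5'` values.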

I tested this with a dummy series:

```python
import pandas as pd

# Dummy series in the same format as the "Array" column
d = ['{58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5}',
     '{58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5}',
     '{58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5 |58.5}']
dd = pd.Series(d)

# Drop the braces, then split on the literal pipe character
out = dd.str[1:-1].str.split("|",expand=True).add_prefix("setrwds_x")
print(out)
```

  setrwds_x0 setrwds_x1 setrwds_x2 setrwds_x3 setrwds_x4 setrwds_x5  \
0      58.5       58.5       58.5       58.5       58.5       58.5    
1      58.5       58.5       58.5       58.5       58.5       58.5    
2      58.5       58.5       58.5       58.5       58.5       58.5    

  setrwds_x6 setrwds_x7 setrwds_x8 setrwds_x9 setrwds_x10 setrwds_x11  
0      58.5       58.5       58.5       58.5        58.5         58.5  
1      58.5       58.5       58.5       58.5        58.5         58.5  
2      58.5       58.5       58.5       58.5        58.5         58.5
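
The columns produced above are still strings, and most of them carry the trailing space that sits before each `|`. If numeric columns are needed, one possible follow-up is sketched below. It assumes every cell follows the `{a |b |c}` layout; the two sample rows and the name `numeric` are made up for illustration.

```python
import pandas as pd

# Made-up sample rows in the same "{a |b |c}" layout as the question.
raw_data = pd.DataFrame({"Array": ["{58.5 |58.5 |58.5}", "{1 |2.25 |3}"]})

numeric = (
    raw_data["Array"]
    .str.strip("{}")                # drop the surrounding braces
    .str.split("|", expand=True)    # single-character pattern, split literally
    .apply(lambda col: pd.to_numeric(col.str.strip()))  # trim spaces, then parse
    .add_prefix("setrwds_x")
)
print(numeric)
```

Stripping whitespace before calling `pd.to_numeric` keeps the parsing step independent of whether the delimiter in the file is `|` or ` |`.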

9 Comments

I already tried that. The problem is that the whole array is saved as a string. Applying your line of code gives me the following output: ('Unable to parse string "{58 " at position 0', 'occurred at index realwds_x0')
Your data seems to have some typos in that case. It looks like either a . or a | is missing from where it should be.
I edited the array. There is a whitespace before the "|" sign. I hope that makes it clearer.
Try now. I just added a whitespace as well.
Thank you so much. It works perfectly :) I owe you one.