
I'm very new to Python but have an analysis to do.

I created a new variable from each of the data sets data_2015 and data_2016:

rfd_per_district_15 = data_2015[['Nom_Districte', 'Índex RFD Barcelona = 100']].groupby(['Nom_Districte'], as_index=False).sum()['Índex RFD Barcelona = 100']
rfd_per_district_16 = data_2016[['Nom_Districte', 'Índex RFD Barcelona = 100']].groupby(['Nom_Districte'], as_index=False).sum()['Índex RFD Barcelona = 100']

But the results are very different. 'rfd_per_district_16' came out exactly the way I wanted, in float format, so I can carry on working with it:

0     367.7
1     719.4
2     523.3
3     921.2
4     476.6
5     676.0
6     494.4
7     952.6
8     604.0
9    1054.8
Name: Índex RFD Barcelona = 100, dtype: float64

But 'rfd_per_district_15' came out very strangely, as if the values from multiple rows were stuck together:

0                                     75.8108.576.696.4
1                          104.895.8165.8128.9103.898.6
2                              111.289.0114.4106.3103.0
3          90.488.182.795.256.974.593.572.392.689.380.9
4                                       124.5109.9250.5
6     65.061.448.851.155.955.647.855.454.035.647.134...
7                          43.160.258.075.676.870.185.8
8           78.784.796.8150.295.6162.554.4102.868.357.5
9                      74.336.970.483.282.274.377.088.2
10                       151.7199.1214.1188.9205.1141.0
Name: Índex RFD Barcelona = 100, dtype: object

I can see that the difference is that 'rfd_per_district_15' came out as object type, but why? I had to delete index [5] in 'rfd_per_district_15' because it held a weird value, but even after that the data came out strangely (not like rfd_per_district_16). I know just the basics of Python, so I really don't know how to figure this out.

2 Answers


Compare the last line of the two outputs:

  1. Name: Índex RFD Barcelona = 100, dtype: float64

  2. Name: Índex RFD Barcelona = 100, dtype: object

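The problem is the data type (dtype) of the column. In data_2015 it is stored as object, most likely as strings, so sum() joins the values end to end instead of adding them.

Convert the column in data_2015 to the same dtype it has in data_2016.

Use data_2015.dtypes to check the dtypes of the columns.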
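As a rough sketch of that check, the snippet below builds a small made-up frame (the district rows and the "-" placeholder are assumptions for illustration, not the real data_2015) and shows how the dtype, the string joining, and the rows that cannot be parsed as numbers could be inspected:

```python
import pandas as pd

col = "Índex RFD Barcelona = 100"

# Hypothetical stand-in for data_2015: the index column was read as text,
# and one row holds a non-numeric placeholder ("-", an assumption).
data_2015 = pd.DataFrame({
    "Nom_Districte": ["Ciutat Vella", "Ciutat Vella", "Eixample", "Eixample"],
    col: ["75.8", "108.5", "104.8", "-"],
})

# Numeric columns report float64 or int64; a text column does not.
print(data_2015.dtypes)
is_numeric = pd.api.types.is_numeric_dtype(data_2015[col])

# Adding two strings joins them, which matches the strange output above.
joined = data_2015[col].iloc[0] + data_2015[col].iloc[1]
print(joined)

# Rows whose value cannot be parsed as a number.
parsed = pd.to_numeric(data_2015[col], errors="coerce")
bad_rows = data_2015[parsed.isna() & data_2015[col].notna()]
print(bad_rows)
```

If bad_rows is not empty, those entries are the likely reason the column was read as object in the first place.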


1 Comment

Hi Sam, yes, you are right: for some reason data_15[Índex RFD Barcelona] is stored as object type; I realised that too. I tried rfd_per_district_15 = list(np.float_(rfd_per_district_15)), as I did with other variables from the same dataset, but it says it couldn't convert string to float.
1
data_2015["Índex RFD Barcelona = 100"] = data_2015['Índex RFD Barcelona = 100'].astype('float')

With this code, I managed to convert the column's type. Thank you @sam
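Note that astype('float') raises an error if any entry cannot be parsed as a number. A possible alternative, sketched here on made-up sample values (the rows and the "-" placeholder are assumptions, not the real data set), is pd.to_numeric with errors="coerce", which turns such entries into NaN so that the grouped sum still works:

```python
import pandas as pd

col = "Índex RFD Barcelona = 100"

# Hypothetical sample, not the real data_2015.
data_2015 = pd.DataFrame({
    "Nom_Districte": ["Ciutat Vella", "Ciutat Vella", "Eixample", "Eixample"],
    col: ["75.5", "108.5", "104.0", "-"],
})

# Unparseable entries become NaN instead of raising an error.
data_2015[col] = pd.to_numeric(data_2015[col], errors="coerce")

# Same grouping as in the question; sum() skips NaN by default.
rfd_per_district_15 = (
    data_2015.groupby("Nom_Districte", as_index=False)[col].sum()[col]
)
print(rfd_per_district_15)
```

Coercing hides the bad entries rather than fixing them, so it is worth checking which rows became NaN before relying on the totals.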
