1

I'm working on a machine learning assignment in which I go over a bug database, do a multi-class classification, and then insert a new column with the classified text. While debugging, when I run that particular cell again, it says the column already exists. Is there a way around this (other than the usual exception handling)?

The piece of code that I have written is as follows:

trigger_dict = {
    'Config-Change': ['change', 'changing', 'changed'],
    'Upgrade-Downgrade': ['Upgrade', 'Downgrade', 'ISSU'],
    'VPC-Related': ['MCT', 'MCEC', 'VPC'],
    'CLI-Related': ['CC', 'Consistency', 'Checker', 'Show', 'Debug', 'Clear'],
    'Interface-Flap': ['Flap', 'Shut'],
    'Reload-Related': ['reload', 'reboot', 'ASCII', 'Replay'],
    'Process-Related': ['Restart', 'Kill', 'Process'],
    'ACL-Related': ['RACL', 'PACL', 'IFACL'],
    'Config-Unconfig': ['config', 'remove', 'removal', 'Unconfig', 'reconfig'],
    'HA-Related': ['SSO', 'LC', 'Switchover'],
}


cat_1 = pd.Series([])
flag = 0

for index in range(df['Headline'].shape[0]):
    text = df['Headline'][index]
    for key, value in trigger_dict.items():
        for val in value:
            if re.search(val, text, re.I):
                if not flag:
                    cat_1[index] = key
                    flag = 1
    flag = 0
        
df.insert(len(df.columns),"Trigger_Type", cat_1)


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-45-d23348f7bbac> in <module>
     12     flag = 0
     13 
---> 14 df.insert(len(df.columns),"Trigger_Type", cat_1)

~/Desktop/Anaconda/anaconda3/envs/nlp_course/lib/python3.7/site-packages/pandas/core/frame.py in insert(self, loc, column, value, allow_duplicates)
   3220         value = self._sanitize_column(column, value, broadcast=False)
   3221         self._data.insert(loc, column, value,
-> 3222                           allow_duplicates=allow_duplicates)
   3223 
   3224     def assign(self, **kwargs):

~/Desktop/Anaconda/anaconda3/envs/nlp_course/lib/python3.7/site-packages/pandas/core/internals.py in insert(self, loc, item, value, allow_duplicates)
   4336         if not allow_duplicates and item in self.items:
   4337             # Should this be a different kind of error??
-> 4338             raise ValueError('cannot insert {}, already exists'.format(item))
   4339 
   4340         if not isinstance(loc, int):

ValueError: cannot insert Trigger_Type, already exists

2 Answers

1

It's not working because you already have a column with that name. If you are OK with having duplicate columns, you can pass allow_duplicates=True:

df.insert(len(df.columns),"Trigger_Type", cat_1, allow_duplicates=True)
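
As a rough illustration of what allow_duplicates=True leads to, the sketch below uses a small made-up frame (the Headline values and the contents of cat_1 are placeholders, not the asker's data). Each re-run of the cell appends another column with the same name, and selecting that name then gives back a DataFrame instead of a Series.

```python
import pandas as pd

# Hypothetical stand-in data; the real df comes from the bug database.
df = pd.DataFrame({"Headline": ["VPC flap after reload", "ACL config change"]})
cat_1 = pd.Series(["VPC-Related", "Config-Change"])

# Simulate running the notebook cell twice.
df.insert(len(df.columns), "Trigger_Type", cat_1)
df.insert(len(df.columns), "Trigger_Type", cat_1, allow_duplicates=True)

# The label now appears twice in the column index.
print(list(df.columns))

# Selecting a duplicated label returns a DataFrame, not a Series.
print(type(df["Trigger_Type"]).__name__)
```

This is usually not what you want in a notebook that gets re-run, because downstream code that expects a single Trigger_Type column will start to misbehave.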

Otherwise, you will have to rename the column to something else.

If you want to completely replace the column, you can also use:

df['Trigger_Type'] = cat_1
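
A minimal sketch of why plain assignment is safe to re-run: it creates the column if it is missing and overwrites it otherwise, so the number of columns stays the same. The sample frame and the run_cell helper are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Hypothetical stand-in data; the real df comes from the bug database.
df = pd.DataFrame({"Headline": ["VPC flap after reload", "ACL config change"]})

def run_cell(frame, labels):
    """Stand-in for re-running the notebook cell (hypothetical helper)."""
    cat_1 = pd.Series(labels, index=frame.index)
    frame["Trigger_Type"] = cat_1  # creates or overwrites, never duplicates
    return frame

run_cell(df, ["VPC-Related", "Config-Change"])
run_cell(df, ["Reload-Related", "ACL-Related"])  # second run overwrites the first

print(list(df.columns))
print(df["Trigger_Type"].tolist())
```

Assignment aligns the Series on the index, so rows that have no entry in cat_1 (headlines that matched no trigger word) end up as NaN in the new column.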

1 Comment

Having duplicates might not be a good option. I was wondering whether there is something like an overwrite option.
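
DataFrame.insert has no overwrite parameter, but a small wrapper can emulate one. The sketch below uses a hypothetical helper named insert_or_replace (not a pandas API) that removes any existing column of that name before inserting, which also helps when the column has to land at a specific position. The sample data is made up.

```python
import pandas as pd

def insert_or_replace(frame, loc, column, value):
    """Insert `column` at position `loc`, replacing an existing column of that name.

    Hypothetical helper, not part of pandas. Modifies `frame` in place.
    """
    if column in frame.columns:
        del frame[column]  # remove the stale copy first
    # Clamp loc in case removing the old column shortened the frame.
    loc = min(loc, len(frame.columns))
    frame.insert(loc, column, value)

# Hypothetical stand-in data; the real df comes from the bug database.
df = pd.DataFrame({"Headline": ["VPC flap after reload", "ACL config change"]})

# Safe to call repeatedly, e.g. when re-running a notebook cell.
insert_or_replace(df, len(df.columns), "Trigger_Type",
                  pd.Series(["VPC-Related", "Config-Change"]))
insert_or_replace(df, len(df.columns), "Trigger_Type",
                  pd.Series(["Reload-Related", "ACL-Related"]))

print(list(df.columns))
```

If the position does not matter, plain assignment with df['Trigger_Type'] = ... is simpler and has the same overwrite behaviour.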
0

Here I'm providing code with output:

Code:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emily'],
    'Age': [25, 30, 22, 35, 28],
    'Salary': [50000, 60000, 45000, 70000, 55000]
}

df = pd.DataFrame(data)

df['Bonus'] = [1000, 1500, 800, 2000, 1200]  # Add a new column 'Bonus' with example bonus values
print("Updated DataFrame:")
print(df)

Output:

Updated DataFrame:
      Name  Age  Salary  Bonus
0    Alice   25   50000   1000
1      Bob   30   60000   1500
2  Charlie   22   45000    800
3    David   35   70000   2000
4    Emily   28   55000   1200
