I am using a for loop to read multiple CSV files and create a dataframe from each one. I would like to access these dataframes outside the for loop as well. I tried the global keyword for this, but it doesn't work.

for file in os.listdir('C:\\Users\\ABCDE\\Desktop\\Measurement'):
  if file.endswith('.csv'):
     print(file)
     name = file[3:6]
     global df_name   # this is the line 
     df_name = 'df' + name  
     print(df_name)
     df_name = pd.read_csv('C:\\Users\\ABCDE\\Desktop\\Measurement\\' + str(file),low_memory = False)
     df_name.rename(columns={0:'values'}, 
             inplace=True)       
     g = df_name.level_1.str[-2:] # Extracting column names
     df_name['lvl'] = df_name.level_1.apply(lambda x: int(''.join(filter(str.isdigit, x))))

As you can see above, I would like to access these dataframes (df_name; three dataframes, since I have three files) outside the for loop as well.

How do I use the global keyword to make these dataframes accessible outside the for loop?

  • Is your for loop inside a function? If not, you don't even need to use global. You can just define a variable before your loop and then modify it inside the loop. Commented Jul 29, 2019 at 6:25
  • @BerkayÖz - I am reading all the files from a directory, so my aim is to have a unique variable name for each dataframe. Each file will have a different dataframe name, not the same one. In this case, should I still declare a variable outside the loop? Is that recommended? Commented Jul 29, 2019 at 6:28
  • You are trying to do 2 actions in one line, that is why it gives an error. Also, it is a must, not a recommendation: you can define the variable outside of the scope. Commented Jul 29, 2019 at 6:31
  • @AVLES In that case you should declare a list or a dictionary. Create a local variable in your loop, use that local variable for dataframe purposes and then add that variable to your list or your dictionary. Commented Jul 29, 2019 at 6:33
  • @AVLES As I mentioned before, you don't need to create a variable for each of your files. It is also not recommended and not good practice. Just add them to a list or a dictionary and access them from there. Commented Jul 29, 2019 at 6:35
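
The point made in these comments can be sketched as follows. A for loop at module level does not create a new scope, so a name assigned inside it is still visible afterwards, but a single name can only hold the value from the last iteration. The file names below are hypothetical stand-ins, and nothing is read from disk.

```python
# Hypothetical file names, used only for illustration; no files are opened.
files = ["xx_abc.csv", "xx_def.csv", "xx_ghi.csv"]

for file in files:
    name = file[3:6]
    df_name = "df" + name  # rebinds the same name on every pass

# No `global` needed: the loop body runs in the enclosing (module) scope.
print(df_name)  # dfghi, only the last assignment survives
```

This is why a container such as a list or a dictionary is suggested: it keeps one entry per file instead of one overwritten name.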

4 Answers

After your clarification in the comments: you can achieve what you want using a list or a dictionary.

import os
import pandas as pd

folder = 'C:\\Users\\ABCDE\\Desktop\\Measurement'
dataFrames = list()
dataFrameDict = dict()

for file in os.listdir(folder):
  if file.endswith('.csv'):
     print(file)
     name = file[3:6]  # characters 3-5 of the file name identify the dataset
     df_name = pd.read_csv(os.path.join(folder, file), low_memory=False)
     df_name.rename(columns={0:'values'}, inplace=True)
     g = df_name.level_1.str[-2:] # Extracting column names
     df_name['lvl'] = df_name.level_1.apply(lambda x: int(''.join(filter(str.isdigit, x))))
     # ADD TO A LIST
     dataFrames.append(df_name)
     # OR TO A DICT (keyed by the name taken from the file name)
     dataFrameDict[name] = df_name


# How to Access

# Index for 10 files would be 0-9
index = 0
dataFrames[index]

# Name of the dataset you want to access
name = "..."
dataFrameDict[name]
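
The dictionary approach above can also be wrapped in a small helper. This is only a sketch under stated assumptions: `load_by_name`, `reader`, `suffix` and `key` are hypothetical names that do not appear in the original code, and the reader is passed in as a parameter so the helper itself does not depend on pandas.

```python
import os

def load_by_name(directory, reader, suffix=".csv", key=lambda fname: fname[3:6]):
    """Return {key(file name): reader(full path)} for each matching file.

    `reader` is any callable that takes a path, for example pandas.read_csv.
    `key` defaults to the same file[3:6] slice used in the question.
    """
    loaded = {}
    for fname in sorted(os.listdir(directory)):
        if fname.endswith(suffix):
            loaded[key(fname)] = reader(os.path.join(directory, fname))
    return loaded
```

With pandas imported as pd, `frames = load_by_name(folder, pd.read_csv)` would give one dataframe per file, and every entry stays accessible after the loop has finished. Note that two files sharing the same characters at positions 3 to 5 would overwrite each other in the dictionary.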

4 Comments

Oz - Just curious to know: other than using lists and dicts, isn't there any way to use the global keyword just in this for loop and still access the dataframes outside it? I don't have any functions either, just a plain for loop.
@AVLES The global keyword lets a function assign to a variable defined outside its body, and that is just a single variable. In your case you need multiple variables, but how would you know how many you need if you work with a dynamic range of files? You can't write out a variable for each file, so only a list or a dictionary can handle this. If you had a single file, you could just define a variable and assign your dataframe to it, and you wouldn't need global.
Oz - Can you help me with this post ? stackoverflow.com/questions/57250943/…
Can you help me with this? stackoverflow.com/questions/57307386/…

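You need to define the variable at module level, above the function, then declare it global inside the function before assigning to it:

import pandas as pd

a = None  # module-level name, defined before the function

def func():
    global a  # assignments below rebind the module-level name
    a = pd.DataFrame()  # placeholder for your dataframe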

You need to declare the variable global on a separate line, before assigning to it (Python 3.6+ raises a SyntaxError if the global statement comes after the assignment). Something like this:

global df_name
df_name = 'df' + name

I can understand what you're trying to achieve, but not why you expect your code to work. 'df' + name is a string, not a variable; plus, you don't declare an external variable like that. The syntax is much simpler and has nothing to do with pandas. Here's an example of the usage:

a = 'foo'

def get_a():
    global a
    return a

def set_a(b):
    global a
    a = b

if __name__ == '__main__':  # Just defining the entry point of the python script
    print(get_a())
    set_a(2)
    print(get_a())
    print(a)

And here is what you should expect as output of the script:

foo
2
2

6 Comments

Yes, I am using that string as a dataframe name.
Ok, then you have three main options: using globals()['df' + name], using getattr(module_where_the_variable_is, 'df' + name) or eval('df' + name). Watch out for this last one, because eval can execute any code and is therefore risky if exposed to untrusted input.
I tried globals. It doesn't work. I just have a simple for loop and don't really have any functions or multiple modules. I just want that dataframe to be accessible outside the for loop. Thanks for your response and help.
Then the answer given by Berkay Öz is the most suitable for your needs, although the question was very misleading :)
Can you help me with this ? stackoverflow.com/questions/57250943/…
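
For completeness, the globals() option mentioned in these comments can be sketched like this. At module level, globals() returns the module's namespace dictionary, so assigning into it creates real module-level names. The dataset names and placeholder values below are hypothetical, and this pattern is generally discouraged in favour of a plain dictionary, as the accepted suggestion in this thread does.

```python
names = ["abc", "def"]  # hypothetical dataset names

for name in names:
    # Placeholder value; in the question this would be the dataframe.
    globals()["df" + name] = {"source": name}

# The dynamically created names can be looked up the same way after the loop.
created = {n: globals()["df" + n] for n in names}
print(created["abc"])
```

The lookup still goes through a dictionary keyed by a string, which is the same thing a dictionary of your own provides without touching the module namespace.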