-1

I have a workspace with different projects as shown below.

[image: screenshot of the workspace folder structure]

The code in my main_script.py, which is in the main_scripts subfolder, needs to call a function in config_reader.py, which is in the user_functions folder.

testing_framework is my current working directory with pyspark_training as my root project.

My main_script.py looks like this:

[image: screenshot of main_script.py]

and my config_reader.py file looks like this:

[image: screenshot of config_reader.py]

I tried creating a dev.env file in the main pyspark_training folder:

[image: screenshot of dev.env]

I also tried modifying the settings for pyspark_training, but I am not sure whether this is correct:

[image: screenshot of the pyspark_training settings]

But I am still getting: ModuleNotFoundError: No module named 'user_functions'

Can anyone help me solve this? I have gone through a number of Stack Overflow topics covering this issue, but to no avail. I am still getting the same error.

4
  • 3
    Please do not upload images of code/data/errors. Commented May 28 at 15:52
  • You may need to add /full/path/to/testing_framework to sys.path before importing anything from user_functions. Commented May 29 at 15:47
  • 1
    Alternatively, in the main script you can use BASE = os.path.dirname(os.path.abspath(__file__)) to get the folder that contains the script, then build the path to its parent with parent_folder = os.path.join(BASE, "..") (optionally wrapped in os.path.abspath()), and add parent_folder to sys.path. This is needed because an IDE such as VS Code may run your script with a different current working directory, so a bare .. may point somewhere other than you expect. You can use print(os.getcwd()) to check the current working directory. Commented May 29 at 15:51
  • 1
    Thank you @furas. This worked. I built the parent folder path from abspath(), added it to sys.path, and it worked. I will paste the working version below in case anyone else has a similar problem. Thank you!!! Commented Jun 1 at 15:50
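
The approach described in the comments above can be sketched as two small helpers. This is only an illustration: the function names project_root and add_to_sys_path are hypothetical and do not come from the original scripts. It assumes the script sits exactly one folder below the project root, as main_scripts/main_script.py does here.

```python
import os
import sys


def project_root(script_path: str) -> str:
    """Return the absolute path of the folder one level above the script's folder.

    The result depends only on where the script file lives, not on the
    current working directory the IDE or shell happens to use.
    """
    base = os.path.dirname(os.path.abspath(script_path))
    return os.path.abspath(os.path.join(base, ".."))


def add_to_sys_path(folder: str) -> bool:
    """Append folder to sys.path if it is not already there.

    Returns True if the folder was added, False if it was already present.
    """
    if folder in sys.path:
        return False
    sys.path.append(folder)
    return True
```

In main_script.py this would be called as add_to_sys_path(project_root(__file__)) before the line from user_functions import config_reader. By contrast, a bare sys.path.append("..") is interpreted relative to whatever os.getcwd() returns when the import runs, which is why it can fail when the script is started from a different folder.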

3 Answers

1

Your user_functions folder sits next to the main_scripts folder that contains main_script.py, so you need to tell Python to also look for modules in the parent (`..`) folder.

import sys
sys.path.append("..")
from user_functions import config_reader

3 Comments

I tried this, but I still get the same error: "ModuleNotFoundError: No module named 'user_functions'"
Did you change the import in main_script.py to the suggested lines? This should work; I just tested it. What version of Python are you using?
Yes, I changed it to the lines from the answer to append the path, but I still get the same error. I am using Python 3.11.
1

This is the working solution. The problem was the current working directory (cwd), as furas pointed out in the comments. I am pasting the working version for future reference.

Short summary:
want to access: testing_framework > user_functions > config_reader.py

from: testing_framework > main_scripts > main_script.py

(Note: Folder structure is given in the question above )

main_script.py

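import os
import sys

from pyspark.sql import SparkSession

# Folder that contains this script (testing_framework/main_scripts),
# independent of the current working directory.
BASE = os.path.dirname(os.path.abspath(__file__))
# Its parent (testing_framework), which holds the user_functions folder.
parent_folder = os.path.abspath(os.path.join(BASE, ".."))

# Put the parent folder on the module search path before importing from it.
sys.path.append(parent_folder)
from user_functions import config_reader

spark = SparkSession.builder.appName('validation').master("local").getOrCreate()

# config_folder and config_file are not defined in this snippet; set them to
# the folder and file name of your config file before this call.
configs = config_reader.read_config(spark, config_folder, config_file)

print(configs)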

config_reader.py

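import os

from pyspark.sql import SparkSession


def read_config(spark: SparkSession, config_folder, config_file):
    """Read a CSV config file and return its rows as a list of Row objects."""
    full_path = os.path.join(config_folder, config_file)

    # Only CSV is handled; fail clearly instead of hitting an unbound df_config.
    if not config_file.endswith('.csv'):
        raise ValueError(f"Unsupported config file type: {config_file}")

    df_config = (
        spark.read.format('csv')
        .option('header', True)
        .load(full_path)
    )
    return df_config.collect()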

Comments

0

It's not an issue with VS Code but rather with how you're importing your packages.

An import like from testing_framework.user_functions.config_reader import read_config should work, but only if the folder that contains testing_framework is on sys.path. The underscores in your file and module names are not the problem: they are valid in Python module and package names.

This answer could be useful, and others in the thread will give you more information on how imports work in python.

1 Comment

I tried this as well, but again I get the "ModuleNotFoundError". The only thing I can think of that works is moving main_script.py out of the main_scripts folder and then running it.
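
Whether from user_functions import config_reader resolves depends on which folders are on sys.path when the import runs, not on the editor. The sketch below rebuilds a throwaway copy of the layout in a temporary directory and tries the import before and after the project folder is added to sys.path. The helper names (build_layout, can_import, demo) and the stub file contents are hypothetical and exist only for this illustration; they are not the original code.

```python
import importlib
import os
import sys
import tempfile


def build_layout(root: str) -> str:
    """Create root/testing_framework/user_functions/config_reader.py (a stub)."""
    project = os.path.join(root, "testing_framework")
    package = os.path.join(project, "user_functions")
    os.makedirs(package)
    with open(os.path.join(package, "config_reader.py"), "w") as fh:
        fh.write("def read_config():\n    return 'stub config'\n")
    return project


def can_import(module_name: str) -> bool:
    """Return True if module_name can be imported with the current sys.path."""
    importlib.invalidate_caches()
    try:
        importlib.import_module(module_name)
        return True
    except ModuleNotFoundError:
        return False


def demo() -> tuple:
    """Try the import before and after the project folder is on sys.path."""
    with tempfile.TemporaryDirectory() as tmp:
        project = build_layout(tmp)
        before = can_import("user_functions.config_reader")
        sys.path.append(project)
        try:
            after = can_import("user_functions.config_reader")
        finally:
            # Undo the changes so the demo leaves the interpreter as it found it.
            sys.path.remove(project)
            sys.modules.pop("user_functions.config_reader", None)
            sys.modules.pop("user_functions", None)
    return before, after
```

Under these assumptions, demo() should report that the import fails before the project folder is on sys.path and succeeds afterwards. Moving main_script.py up into testing_framework works for the same reason: Python puts the folder of the script being run on sys.path, so user_functions then sits directly inside a searched folder.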
