
I'm trying to train a Keras model on structured input data stored in CSV files. I'm reading the files as follows:

import tensorflow as tf
from tensorflow.keras import layers
from tensorflow import feature_column

import pathlib

csvs = sorted(str(p) for p in pathlib.Path('.').glob("My_Dataset/*/*/*.csv"))

data_set = tf.data.experimental.CsvDataset(
    csvs, record_defaults=defaults, compression_type=None, buffer_size=None,
    header=True, field_delim=',', use_quote_delim=True, na_value=""
)
print(type(data_set))

#Output: <class 'tensorflow.python.data.experimental.ops.readers.CsvDatasetV2'>

data_set.take(1)

#Output: <TakeDataset shapes: ((), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), (), ()), types: (tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32, tf.float32)>

validate_ds = data_set.batch(1000).take(20).repeat()
train_ds = data_set.batch(1000).skip(20).take(80).repeat()

model = tf.keras.Sequential([
    layers.Dense(49,activation='elu'),  
    layers.Dense(49,activation='elu'),  
    layers.Dense(49,activation='elu'),  
    layers.Dense(1,activation='sigmoid') 
])


model.compile(optimizer='adam',
            loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
            metrics=['accuracy'])    #have to find the related evaluation metrics


model.fit(train_ds,
        validation_data=validate_ds,
        validation_steps=5,
        steps_per_epoch= 5,
        epochs=20,
        verbose=1
        )

But when I run model.fit, I get this error:

ValueError: in user code: ValueError: Data is expected to be in format x, (x,), (x, y), or (x, y, sample_weight), found: (<tf.Tensor 'ExpandDims:0' shape=(None, 1) dtype=float32>, <tf.Tensor 'ExpandDims_1:0' shape=(None, 1) dtype=float32>, <tf.Tensor 'ExpandDims_2:0' shape=(None, 1) dtype=float32>, ..... <tf.Tensor 'ExpandDims_49:0' shape=(None, 1) dtype=float32>)

I'm just stuck... Please help!

Edit

Following Nikaido's answer, I managed to fix the syntax errors, but now I'm getting zero accuracy during training, which is very unlikely. I know there is no problem with the data in the CSV files: I have checked the same model using a DataFrame. But my dataset is large, so I have to set up an input pipeline that loads it from disk.


1 Answer

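You didn't specify which column of your CSV is the target. Model.fit expects (x, y) pairs, but CsvDataset yields each row as a flat tuple of column values, which is why the error lists every column as a separate tensor.

If the target is the last column of your CSV, you can write a function like this one: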

def preprocess(*fields):
    return tf.stack(fields[:-1]), tf.stack(fields[-1:]) # x, y

and map it over the dataset to split each row into features and labels:

validate_ds = data_set.map(preprocess).take(1000).batch(32).repeat()
train_ds = data_set.map(preprocess).skip(1000).take(1000).batch(32).repeat()
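To make the split above concrete, here is a minimal, self-contained sketch. It assumes a hypothetical toy CSV (three feature columns followed by a 0/1 label column) written to a temporary directory; the file name, column names and sizes are made up for illustration and are not taken from the question's data.

```python
import csv
import os
import tempfile

import tensorflow as tf

# Hypothetical toy file: 3 feature columns followed by a 0/1 label column.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "toy.csv")
with open(csv_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["f0", "f1", "f2", "label"])
    for i in range(10):
        writer.writerow([i * 0.1, i * 0.2, i * 0.3, i % 2])

# One entry per column; a bare dtype marks the column as required.
defaults = [tf.float32] * 4
dataset = tf.data.experimental.CsvDataset(
    csv_path, record_defaults=defaults, header=True
)

def preprocess(*fields):
    # All columns except the last are features; the last column is the label.
    return tf.stack(fields[:-1]), tf.stack(fields[-1:])

# Each element is now an (x, y) pair, which is the format Model.fit accepts.
batched = dataset.map(preprocess).batch(4)
features, labels = next(iter(batched))
print(features.shape, labels.shape)
```

With these toy sizes, each batch carries a features tensor of shape (4, 3) and a labels tensor of shape (4, 1). With the real data, the number of feature columns would be whatever the CSV has minus the label column.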

Regarding the results, I suppose the problem is that you are trying to do classification on a target value that is a float (all the values in your CSV are read as floats).
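One way to test that guess is to count how many labels are something other than exactly 0 or 1. The sketch below is only an illustration: the tensors are made-up stand-ins for the mapped dataset, and is_binary_label is a hypothetical helper, not part of TensorFlow.

```python
import tensorflow as tf

# Made-up stand-in for data_set.map(preprocess): (features, label) pairs.
features = tf.random.uniform((6, 3))
labels = tf.constant([[0.0], [1.0], [0.37], [1.0], [0.0], [2.5]])
ds = tf.data.Dataset.from_tensor_slices((features, labels))

def is_binary_label(x, y):
    # True only when every element of the label is exactly 0 or 1.
    return tf.reduce_all(tf.logical_or(tf.equal(y, 0.0), tf.equal(y, 1.0)))

# Keep only the rows whose label is NOT a clean 0/1 value, then count them.
bad = ds.filter(lambda x, y: tf.logical_not(is_binary_label(x, y)))
n_bad = sum(1 for _ in bad)
print("non-binary labels:", n_bad)
```

If this count is non-zero on the real data, a binary cross-entropy setup with an accuracy metric is probably not measuring what you expect. Separately, the compile call in the question passes from_logits=True while the last layer already applies a sigmoid. BinaryCrossentropy with from_logits=True expects raw, un-activated outputs, so one of the two likely needs to change, although that alone may not explain the zero accuracy.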

What kind of dataset do you have, and what do you need to do?

Anyway, for this specific update I suggest opening a new question, because it is a totally different problem.


4 Comments

Thanks Dear! You solved one problem. Please check the Edit part of my question
@DevLoverUmar check my answer again with the updates
I added another question stackoverflow.com/q/64838581/7344164
@DevLoverUmar, I suggest you to share also the kind of data and a subset of your data as example
