
In the function read_train_sets() an empty class is created called DataSets. It has no methods or variables. An object called data_sets is then created.

My question is: is data_sets.train an object of the class DataSet?

Or does that line create a method called train() and set it equal to an object of the DataSet class?

Note that there are two classes called DataSet and DataSets in the code.

import cv2
import os
import glob
from sklearn.utils import shuffle
import numpy as np


def load_train(train_path, image_size, classes):
    images = []
    labels = []
    img_names = []
    cls = []

    print('Going to read training images')
    for fields in classes:   
        index = classes.index(fields)
        print('Now going to read {} files (Index: {})'.format(fields, index))
        path = os.path.join(train_path, fields, '*g')
        files = glob.glob(path)
        for fl in files:
            image = cv2.imread(fl)
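            # Pass interpolation by keyword; positionally, the extra arguments would land on dst, fx and fy.
            image = cv2.resize(image, (image_size, image_size), interpolation=cv2.INTER_LINEAR)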
            image = image.astype(np.float32)
            image = np.multiply(image, 1.0 / 255.0)
            images.append(image)
            label = np.zeros(len(classes))
            label[index] = 1.0
            labels.append(label)
            flbase = os.path.basename(fl)
            img_names.append(flbase)
            cls.append(fields)
    images = np.array(images)
    labels = np.array(labels)
    img_names = np.array(img_names)
    cls = np.array(cls)

    return images, labels, img_names, cls


class DataSet(object):

  def __init__(self, images, labels, img_names, cls):
    self._num_examples = images.shape[0]

    self._images = images
    self._labels = labels
    self._img_names = img_names
    self._cls = cls
    self._epochs_done = 0
    self._index_in_epoch = 0

  @property
  def images(self):
    return self._images

  @property
  def labels(self):
    return self._labels

  @property
  def img_names(self):
    return self._img_names

  @property
  def cls(self):
    return self._cls

  @property
  def num_examples(self):
    return self._num_examples

  @property
  def epochs_done(self):
    return self._epochs_done

  def next_batch(self, batch_size):
    """Return the next `batch_size` examples from this data set."""
    start = self._index_in_epoch
    self._index_in_epoch += batch_size

    if self._index_in_epoch > self._num_examples:
      # After each epoch we update this
      self._epochs_done += 1
      start = 0
      self._index_in_epoch = batch_size
      assert batch_size <= self._num_examples
    end = self._index_in_epoch

    return self._images[start:end], self._labels[start:end], self._img_names[start:end], self._cls[start:end]


def read_train_sets(train_path, image_size, classes, validation_size):
  class DataSets(object):
    pass
  data_sets = DataSets()

  images, labels, img_names, cls = load_train(train_path, image_size, classes)
  images, labels, img_names, cls = shuffle(images, labels, img_names, cls)  

  if isinstance(validation_size, float):
    validation_size = int(validation_size * images.shape[0])

  validation_images = images[:validation_size]
  validation_labels = labels[:validation_size]
  validation_img_names = img_names[:validation_size]
  validation_cls = cls[:validation_size]

  train_images = images[validation_size:]
  train_labels = labels[validation_size:]
  train_img_names = img_names[validation_size:]
  train_cls = cls[validation_size:]

  data_sets.train = DataSet(train_images, train_labels, train_img_names, train_cls)
  data_sets.valid = DataSet(validation_images, validation_labels, validation_img_names, validation_cls)

  return data_sets
  • data_sets.train = DataSet(... tells you that data_sets.train is a DataSet object. Why are you uncertain? Commented Jul 9, 2018 at 16:47
  • @JohnColeman I'm new to Python, coming from C and some C++. The .train suggests you're calling a method called train on the data_sets object, but there isn't a method called train defined in the DataSets class. Commented Jul 9, 2018 at 16:57
  • Python allows you to dynamically add properties to an object. In that sense it is closer to JavaScript than C++. Even though it is possible to do so, it isn't used heavily in Python. Personally, I don't like this idiom of creating an empty class and then adding properties to its objects. Commented Jul 9, 2018 at 17:00

2 Answers

You can dynamically assign attributes to your objects in Python. Try inserting hasattr(data_sets, 'train') after the assignment; it asks whether data_sets has an attribute named train. You can also call type(data_sets.train) to convince yourself that it is indeed of type DataSet.
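
A minimal sketch of those two checks, using stripped-down stand-ins for the classes in the question. The one-argument DataSet constructor here is a simplification for illustration, not the original signature:

```python
# Stand-ins for the classes in the question; DataSet is simplified
# to take a single argument so the example runs without OpenCV.
class DataSet(object):
    def __init__(self, images):
        self._images = images


class DataSets(object):
    pass


data_sets = DataSets()
print(hasattr(data_sets, 'train'))            # False: nothing assigned yet

# Assignment creates the attribute on this one instance.
data_sets.train = DataSet(images=[1, 2, 3])

print(hasattr(data_sets, 'train'))            # True
print(type(data_sets.train).__name__)         # DataSet
print(isinstance(data_sets.train, DataSet))   # True
```

Note that the attribute exists only on this particular data_sets object; another DataSets() instance would not have it until something assigns it.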

2 Comments

Thanks. Is the reason or benefit for defining the DataSets class inside the read_train_sets() function purely to limit its scope? I don't see the advantage of it. Could you comment?
I don't see much benefit in defining it inside your function, and it does seem a bit unusual, but I may be wrong. I almost exclusively encounter (and practice) class definitions in the global scope, that is, at the top level of the module, just like your DataSet class.
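
One concrete consequence of defining the class inside the function is that the class statement runs on every call, so each call produces a distinct class object. The sketch below shows that, alongside types.SimpleNamespace from the standard library as one possible alternative container. The function names make_local and make_namespace are made up for illustration:

```python
from types import SimpleNamespace


def make_local():
    # The class statement executes on every call, creating a new class.
    class DataSets(object):
        pass
    return DataSets()


def make_namespace(train, valid):
    # SimpleNamespace is a ready-made attribute container.
    return SimpleNamespace(train=train, valid=valid)


a = make_local()
b = make_local()
print(type(a) is type(b))   # False: two separate DataSets classes

ns = make_namespace(train='train data', valid='valid data')
print(ns.train)             # train data
```

Whether this matters depends on the code: it only shows up if something compares types or calls isinstance across separate calls.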
data_sets.train = DataSet(train_images, train_labels, train_img_names, train_cls)

This is quite clear, since the line assigns an instance of the DataSet class to data_sets.train. From the point of view of the data_sets object, train and valid are simply two attributes. Hope this helps.
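
For readers coming from C++, a small sketch of the attribute-versus-method distinction may help. The dot in data_sets.train is plain attribute lookup; a call happens only when parentheses follow. The simplified DataSet class and its size() method are hypothetical, chosen only to keep the example short:

```python
class DataSet(object):
    def __init__(self, images):
        self._images = images

    def size(self):
        return len(self._images)


class DataSets(object):
    pass


data_sets = DataSets()
data_sets.train = DataSet([10, 20, 30])
data_sets.valid = DataSet([40])

# Instance attributes live in the object's __dict__.
print(sorted(vars(data_sets)))      # ['train', 'valid']

# data_sets.train is an object stored in an attribute, not something you call.
print(callable(data_sets.train))    # False

# Methods belong to the DataSet object stored in the attribute.
print(data_sets.train.size())       # 3
```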
