I have many records of the following type, whose attribute values are all strings:

{
  "City": "Pune",
  "Temperature": "32",
  "Unit": "C",
  "Date": "22-11-2012"
}

and an associated record descriptor that defines datatype and other attribute properties

{
  "City": {
    "datatype": "string"
  },
  "Temperature": {
    "datatype": "double"
  },
  "Unit": {
    "datatype": "string"
  },
  "Date": {
    "datatype": "datetime",
    "dateformat": "%d-%m-%Y",
    "timezone": "UTC"
  }
}

I need to convert each record's attribute values from strings to the datatype specified in the record descriptor.

I have a function dispatch dictionary

{
   "int" : string_to_int,
   "double": string_to_double,
   "bool": string_to_bool,
   "datetime": string_to_datetime
}

def string_to_int(value):
    <<convert to integer>>

def string_to_double(value):
    <<convert to double>>

def string_to_bool(value):
    <<convert to bool>>

def string_to_datetime(value, date_format, time_zone):
    <<convert to datetime>>

While looping through each attribute, how can I do a function dispatch in Python to convert the attribute values to the appropriate data types? What is the right way to pass the additional arguments for the datetime conversion without using any if..else logic within the loop?

  • Would you be open to using a library like marshmallow to solve this? -- marshmallow.readthedocs.io/en/latest Commented Jul 3, 2018 at 16:23
  • What have you tried? You are expected to show some research effort. Commented Jul 3, 2018 at 16:28
  • At the very least, store a reference to each converter function, not its name, in your dispatch dictionary. { "int": string_to_int, "double": string_to_double, ... }. Commented Jul 3, 2018 at 16:31
  • @chepner sorry I made a mistake while entering Commented Jul 3, 2018 at 16:33
  • This doesn't seem like a good case for actual argument-dispatch, because you want to dispatch on return type and Python's (optional) static typing system does not support that (few languages actually do). Moreover, argument-dispatch is quite an overkill in your case: you already have a well-defined mapping between different data fields and their types. Commented Jul 3, 2018 at 16:39

1 Answer

To answer your specific question: first, make sure your type-name-to-function map stores the functions themselves rather than their names:

type_to_function_map = {
   # The descriptor also uses "string", so it needs an entry here too.
   "string": str,
   "int" : string_to_int,
   "double": string_to_double,
   "bool": string_to_bool,
   "datetime": string_to_datetime
}

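Then change the additional arguments of functions like string_to_datetime to keyword arguments, named after the corresponding descriptor keys so that they can be passed straight through with **:

def string_to_datetime(value, dateformat=None, timezone=None):
    # Keyword names match the descriptor keys "dateformat" and "timezone".
    pass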
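
As a rough illustration, the body of such a converter might look like the sketch below. It assumes the keyword names mirror the descriptor keys (`dateformat`, `timezone`), and it only understands the `"UTC"` timezone; the default format and the error for other zones are assumptions of this sketch, not part of the question.

```python
from datetime import datetime, timezone as dt_timezone


def string_to_datetime(value, dateformat="%Y-%m-%d", timezone=None):
    """Parse `value` using `dateformat`; optionally mark the result as UTC."""
    parsed = datetime.strptime(value, dateformat)
    if timezone is None:
        # No timezone in the descriptor: return a naive datetime.
        return parsed
    if timezone.upper() == "UTC":
        # Assumption: the string already represents a UTC wall-clock time.
        return parsed.replace(tzinfo=dt_timezone.utc)
    # Other zones are out of scope for this sketch.
    raise ValueError(f"unsupported timezone: {timezone!r}")


result = string_to_datetime("22-11-2012", dateformat="%d-%m-%Y", timezone="UTC")
```

Because the extra parameters are keywords with defaults, the function can still be called with only the value.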

You can easily write a function like:

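def load(payload, schema):
    output = {}
    for k, v in payload.items():
        # Copy so that pop() does not mutate the schema itself.
        field_info = schema[k].copy()
        # Whatever remains after popping "datatype" is passed as keyword arguments.
        output[k] = type_to_function_map[field_info.pop("datatype")](v, **field_info)

    return output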
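
Putting the pieces together, here is a self-contained sketch run against the record and descriptor from the question. Several details are assumptions made for illustration: a `"string"` entry mapped to `str`, converter keyword names that match the descriptor keys, a `string_to_bool` that treats only a few spellings as true, and timezone handling limited to `"UTC"`.

```python
from datetime import datetime, timezone as dt_timezone


def string_to_bool(value):
    # Assumption: only these spellings count as True.
    return value.strip().lower() in {"true", "1", "yes"}


def string_to_datetime(value, dateformat="%Y-%m-%d", timezone=None):
    parsed = datetime.strptime(value, dateformat)
    # Only "UTC" is recognised in this sketch; anything else stays naive.
    return parsed.replace(tzinfo=dt_timezone.utc) if timezone == "UTC" else parsed


type_to_function_map = {
    "string": str,
    "int": int,
    "double": float,
    "bool": string_to_bool,
    "datetime": string_to_datetime,
}


def load(payload, schema):
    output = {}
    for key, raw in payload.items():
        options = dict(schema[key])  # copy, so the schema is left untouched
        convert = type_to_function_map[options.pop("datatype")]
        # Remaining descriptor entries become keyword arguments.
        output[key] = convert(raw, **options)
    return output


record = {"City": "Pune", "Temperature": "32", "Unit": "C", "Date": "22-11-2012"}
descriptor = {
    "City": {"datatype": "string"},
    "Temperature": {"datatype": "double"},
    "Unit": {"datatype": "string"},
    "Date": {"datatype": "datetime", "dateformat": "%d-%m-%Y", "timezone": "UTC"},
}

converted = load(record, descriptor)
```

The loop itself contains no branching on the datatype: the descriptor decides both which converter runs and which extra arguments it receives.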

But having said that, there are a number of libraries that do what you're trying to do far better. My personal favorite of these is marshmallow:

from marshmallow import fields, Schema

class MySchema(Schema):
    City = fields.String()
    Temperature = fields.Float()
    Unit = fields.String()
    Date = fields.DateTime("%d-%m-%Y")

Which you can then use like this (loads expects a JSON string; use load if you already have a dict):

MySchema().loads(data)