
Due to annoying overflow problems in C++, I want to use Python to precompute some values instead. I have a function f(a, b) that produces a value. I want to output all the values I need, over given ranges of a and b, into a file, then read that file back in C++ and populate a vector or array (or whatever is better).

  1. What is a good format to output f(a,b) in?
  2. What's the best way to read this back into C++?
  3. Vector or multidim array?
3 Comments

  • "Due to annoying overflow problems with C++" What?
  • @ildjarn I took it to mean that there are intermediate results of a calculation that overflow a given integer size. Since Python automatically switches to a bigint when necessary, it doesn't have that problem.
  • @Mark: Ah, that makes sense. I took it to mean buffer overflow, which would be a silly reason to switch languages.

3 Answers


You can use Python to write out a .h file that is compatible with C++ source syntax.

h_file.write('{')
for a in range(a_size):
    h_file.write('{' + ','.join(str(f(a, b)) for b in range(b_size)) + '},\n')
h_file.write('}')

You will probably want to modify that code to throw some extra newlines in, and in fact I have such code that I can show later (don't have access to it now).
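Filled out into a self-contained sketch (f, the sizes, and the file name here are placeholders for your own):

```python
def f(a, b):
    # stand-in for the real precomputed function; replace with your own
    return a * b

a_size, b_size = 3, 4

# write the table as a brace-enclosed initializer, one row per line
with open('data.h', 'w') as h_file:
    h_file.write('{\n')
    for a in range(a_size):
        h_file.write('{' + ','.join(str(f(a, b)) for b in range(b_size)) + '},\n')
    h_file.write('}\n')

# The C++ side then pulls the file in as an array initializer:
#   static const int my_data[3][4] =
#   #include "data.h"
#   ;
```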


7 Comments

The question asks about run-time loading of a file, not static generation?
@MahmoudAl-Qudsi, the question is unspecific enough that I felt a compile-time solution would be best. The OP is free to come back and tell me I'm wrong.
How does this output get used though? The eventual "array" will have millions of values.
It's source code for a C array. So in your C/C++ source code, do static const int my_data[]= then #include that header. Then recompile your code. Your C++ code can just use the values from the my_data array. (Personally I'd make the Python script write out a full .cpp file, but that's a style issue. The idea is the same).
Your comment was edited to add "millions" while I was writing my comment... Your OS should cope with a huge program file, and with virtual memory it should "just work". But there is a good chance that your compiler will choke on the huge .cpp file. And even if it accepts it, you may find your compile/link is unacceptably slow.

You can use Python to write out C++ source code that contains your data. E.g:

def f(a, b):
    # Your function here, e.g:
    return pow(a, b, 65537)
num_a_values = 50
num_b_values = 50
# Write source file
with open('data.cpp', 'wt') as cpp_file:
    cpp_file.write('/* Automatically generated file, do not hand edit */\n\n')
    cpp_file.write('#include "data.hpp"\n')
    cpp_file.write('const int f_data[%d][%d] =\n'
                       % (num_a_values, num_b_values))
    cpp_file.write('{\n')
    for a in range(num_a_values):
        values = [f(a, b) for b in range(num_b_values)]
        cpp_file.write('  {' + ','.join(map(str, values)) + '},\n')
    cpp_file.write('};\n')
# Write corresponding header file
with open('data.hpp', 'wt') as hpp_file:
    hpp_file.write('/* Automatically generated file, do not hand edit */\n\n')
    hpp_file.write('#ifndef DATA_HPP_INCLUDED\n')
    hpp_file.write('#define DATA_HPP_INCLUDED\n')
    hpp_file.write('#define NUM_A_VALUES %d\n' % num_a_values)
    hpp_file.write('#define NUM_B_VALUES %d\n' % num_b_values)
    hpp_file.write('extern const int f_data[%d][%d];\n'
                              % (num_a_values, num_b_values))
    hpp_file.write('#endif\n')

You then compile the generated source code as part of your project. You can then use it by #including the header and accessing the f_data[] array directly.

This works really well for small to medium size data tables, e.g. icons. For larger data tables (millions of entries) some C compilers will fail, and you may find that the compile/link is unacceptably slow.

If your data is more complicated, you can use this same method to define structures.

[Based on Mark Ransom's answer, but with some style differences and more explanation].



If there is megabytes of data, then I would read the data in by memory mapping the data file, read-only. I would arrange things so I can use the data file directly, without having to read it all in at startup.

The reason for doing it this way is that you don't want to read megabytes of data at startup if you're only going to use some of the values. By using memory mapping, your OS will automatically read just the parts of the file that you need. And if you run low on RAM, your OS can reuse the memory allocated for that file without having to waste time writing it to the swap file.

If the output of your function is a single number, you probably just want an array of ints. You'll probably want a 2D array, e.g.:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>

#define DATA_SIZE (25 * 50 * sizeof(int))  /* length in bytes, not elements */
typedef const int (*data_table_type)[50];

int fd = open("my_data_file.dat", O_RDONLY);
data_table_type data_table = (data_table_type)mmap(0, DATA_SIZE,
                                  PROT_READ, MAP_SHARED, fd, 0);
printf("f(5, 11) = %d\n", data_table[5][11]);

For more info on memory mapped files, see Wikipedia, or the UNIX mmap() function, or the Windows CreateFileMapping() function.

If you need more complicated data structures, you can put C/C++ structures and arrays into the file. But you can't embed pointers or any C++ class that has a virtual anything.

Once you've decided how you want to read the data, the next question is how to generate it. Python's struct.pack() is very useful for this: it converts Python values into a bytes object with a specified binary layout, which you can then write to a file.
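A sketch of the generator side using struct.pack (f, the file name, and the 25x50 table shape are placeholders chosen to match the mmap example above; '<i' assumes a little-endian 32-bit int on the target platform):

```python
import struct

def f(a, b):
    # stand-in for the real function; replace with your own
    return a * b

NUM_A, NUM_B = 25, 50  # must match the C side's array dimensions

with open('my_data_file.dat', 'wb') as out:
    for a in range(NUM_A):
        for b in range(NUM_B):
            # '<i' = little-endian 32-bit int; must match the layout
            # and endianness the C code expects when it mmaps the file
            out.write(struct.pack('<i', f(a, b)))
```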

