20

I have a fairly large dataset stored in an HDF5 file, and I am reading it with the h5py library for Python. Everything works properly except for values that print as:

[<HDF5 object reference>]

I have no idea how to convert this into something more readable. Is that possible at all? I find the documentation on this topic hard to follow. A solution in a language other than Python would also be fine. I appreciate any help.

Ideally, the value should be the path to a file.

This is part of my code:

import numpy as np
import h5py 
import time

f = h5py.File('myfile1.mat','r') 
#print f.keys()
test = f['db/path']
st = test[3]
print(  st )

st output is [<HDF5 object reference>]

test output is <HDF5 dataset "path": shape (73583, 1), type "|O8">

Instead of [<HDF5 object reference>] I expect something like /home/directory/file1.jpg, if that is possible.

My question isn't only about the format; the data representation matters more. Maybe I didn't express that clearly in my post, but unfortunately these answers don't really address my question. Commented Feb 16, 2015 at 14:03

3 Answers

38

A friend answered my question, and the solution turned out to be easy, although I had spent more than four hours on this small problem. The solution is:

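import h5py

f = h5py.File('myfile1.mat', 'r')
test = f['db/path']
st = test[0][0]   # the HDF5 object reference stored in the first row
obj = f[st]       # dereference it through the file to get the target dataset
# the target holds character codes; flatten in case it is stored as an (N, 1) array
str1 = ''.join(chr(i) for i in obj[:].flatten())
print(str1)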

I'm sorry if I didn't specify my problem accurately, but this is the solution I was trying to find.
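
The same steps could be wrapped up to decode every reference in the dataset rather than only the first one. The following is a minimal sketch, not a tested solution for the original file: the helper names ref_to_str and read_paths are hypothetical, and it assumes that every reference points at a dataset of integer character codes, as in the example above.

```python
import numpy as np


def ref_to_str(h5file, ref):
    """Follow one object reference and decode the character codes it points to."""
    # Indexing the file with a reference returns the referenced dataset.
    # ravel() copes with codes stored as an (N, 1) or (1, N) array.
    codes = np.asarray(h5file[ref]).ravel()
    return ''.join(chr(int(c)) for c in codes)


def read_paths(h5file, dataset_name):
    """Decode every reference stored in the named dataset into a string."""
    refs = np.asarray(h5file[dataset_name]).ravel()
    return [ref_to_str(h5file, ref) for ref in refs]
```

With the file from the question, this would be called as read_paths(f, 'db/path') after opening f with h5py.File('myfile1.mat', 'r'). Reading all references at once keeps the code short; for a very large dataset it may be preferable to loop over rows instead.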


5 Comments

Can you explain what this means?
@Dims If I understand correctly, the trouble is that we have an <HDF5 object reference>: a reference, not the object itself (this is what st is in the answer's code). The object it points to holds our string. Since the reference refers to an object in the file we opened (f), we do f[st], which returns the actual object (obj). To convert this HDF5 object into a string, we iterate over it, convert each integer i to a character with chr(i), and join the characters together.
This question and answer are similar: stackoverflow.com/a/12048685/6952495
@RyanQuey The questions are siblings, true, but they are not the same (that is, not duplicates).
@DmytroChasovskyi Definitely, I agree. I wasn't trying to say they were duplicates; I just wanted to tag them as similar for anyone trying to solve something that the other question addresses.
3

Solution

Derive a class from the HDF5 class and override its __repr__ method.

Explanation

When you print an object, the interpreter calls its __str__ method, which by default falls back to __repr__; the default __repr__ returns the class name and the memory location of the instance.

class Person: 
    def __init__(self, name):
        self.name = name

p = Person("John Doe")
print(p)

>>> <__main__.Person object at 0x00000000022CE940>

In your case, you have a list with just one instance of an HDF5 object. The equivalent would be:

print([p])
>>> [<__main__.Person object at 0x000000000236E940>]

Now, you can change how objects are printed by overriding the __repr__ function of that class.

Note: You could override __str__ as well; see Difference between str and repr in Python for more detail.

class MyReadablePerson(Person):
    def __init__(self, name):
        super(MyReadablePerson, self).__init__(name)
    def __repr__(self):
        return "A person whose name is: {0}".format(self.name)

p1 = MyReadablePerson("John Doe")
print(p1)

>>> A person whose name is: John Doe
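
To illustrate the note about __str__, here is a small sketch using a toy Person class (not the h5py types) that defines both methods. It shows that printing an object directly uses __str__, while printing a list that contains the object uses __repr__ for each element, which is why the list in the question shows the reference's repr.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        # Used for elements inside containers and in the interactive prompt.
        return "Person({0!r})".format(self.name)

    def __str__(self):
        # Used by print() and str() on the object itself.
        return self.name


p = Person("John Doe")
as_str = str(p)      # the text print(p) would show
as_list = str([p])   # the text print([p]) would show
print(as_str)
print(as_list)
```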


3

You can define your own __str__() or __repr__() method for this class, or create a simple wrapper that formats a string with the information you want to see. Based on a quick look at the documentation, you could do something like:

from h5py import File

class MyHDF5File(File):
    def __repr__(self):
        return '<HDF5File({0})>'.format(self.filename)
