I'm following the instructions to read data from InfluxDB into pandas, and I'm getting the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-13-1e63a2e6d3db> in <module>()
----> 1 df = pd.DataFrame(AandCStation)
      2 
      3 #AandCStation['time'] # gets the name
      4 
      5 #AandCStation.values

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
    328                                  dtype=dtype, copy=copy)
    329         elif isinstance(data, dict):
--> 330             mgr = self._init_dict(data, index, columns, dtype=dtype)
    331         elif isinstance(data, ma.MaskedArray):
    332             import numpy.ma.mrecords as mrecords

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in _init_dict(self, data, index, columns, dtype)
    459             arrays = [data[k] for k in keys]
    460 
--> 461         return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
    462 
    463     def _init_ndarray(self, values, index, columns, dtype=None, copy=False):

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in _arrays_to_mgr(arrays, arr_names, index, columns, dtype)
   6161     # figure out the index, if necessary
   6162     if index is None:
-> 6163         index = extract_index(arrays)
   6164     else:
   6165         index = _ensure_index(index)

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in extract_index(data)
   6200 
   6201         if not indexes and not raw_lengths:
-> 6202             raise ValueError('If using all scalar values, you must pass'
   6203                              ' an index')
   6204 

ValueError: If using all scalar values, you must pass an index

Read DataFrame defaultdict(<class 'list'>, {'NoT/machinename':         MachineName  MachineType SensorWorking  \

This is the code I'm running:

client = DataFrameClient(host, port, user, password, dbname)

print("Read DataFrame")
AandCStation = client.query("""SELECT * FROM "NoT/machinename" WHERE time >= now() - 12h""")
print(AandCStation)

print(type(AandCStation))

df = pd.DataFrame(AandCStation)

This is the data:

Read DataFrame
defaultdict(<class 'list'>, {'NoT/sensor':                                       MachineName  MachineType SensorWorking  \
2018-07-16 04:11:19.912895848+00:00  Quench tank          Yes   
2018-07-16 04:11:22.961838564+00:00  Quench tank          Yes   
2018-07-16 04:11:25.872680626+00:00  Quench tank          Yes   
2018-07-16 04:11:28.850205591+00:00  Quench tank          Yes   
...                                           ...          ...           ...   
2018-07-16 16:08:05.188868516+00:00  Quench tank          Yes   
2018-07-16 16:08:08.169862344+00:00  Quench tank          Yes   
2018-07-16 16:08:11.144413930+00:00  Quench tank          Yes   
2018-07-16 16:08:14.126290232+00:00  Quench tank          Yes   
2018-07-16 16:08:17.107127232+00:00  Quench tank          Yes   
2018-07-16 16:08:20.079248843+00:00  Quench tank          Yes   

                                     TempValue  
2018-07-16 04:09:50.467145647+00:00      32.69  
2018-07-16 04:09:53.888973858+00:00      32.69  
2018-07-16 04:09:55.879811649+00:00      32.69  
2018-07-16 04:09:58.818001127+00:00      32.69  
...                                        ...  
2018-07-16 16:08:05.188868516+00:00      34.19  
2018-07-16 16:08:08.169862344+00:00      34.19  
2018-07-16 16:09:43.209347998+00:00      34.19  
2018-07-16 16:09:46.187872612+00:00      34.19  

[12233 rows x 4 columns]})
<class 'collections.defaultdict'>

Any ideas why I'm getting the error?

2 Answers

I came across this same issue today.

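It turns out that client.query does not return a single DataFrame. It returns a dictionary of DataFrames keyed by measurement name, as the defaultdict in your printed output shows. Passing that dictionary to pd.DataFrame takes the dict code path seen in the traceback, where each value is expected to be a column rather than a whole DataFrame. You can instead combine the DataFrames with pd.concat and then call droplevel on the columns to get the desired column names.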

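import pandas as pd
from influxdb import DataFrameClient

client = DataFrameClient(host, port, user, password, dbname)

print("Read DataFrame")
# query() returns a dict mapping each measurement name to a DataFrame
AandCStation = client.query("""SELECT * FROM "NoT/machinename" WHERE time >= now() - 12h""")
# Concatenate the dict's DataFrames side by side; the keys become an outer column level
AandCStation = pd.concat(AandCStation, axis=1)
# Drop that outer level so only the field names remain as columns
AandCStation.columns = AandCStation.columns.droplevel()

print(AandCStation.head())

print(type(AandCStation))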
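
If you want to see what the concat and droplevel steps do without a running InfluxDB server, here is a minimal sketch. The dictionary below is a hand-built stand-in for the query result; the two rows and their values are made up for illustration, and only the shape (a defaultdict holding one DataFrame per measurement) is taken from the output in the question.

```python
from collections import defaultdict

import pandas as pd

# Hypothetical stand-in for the value returned by client.query(...).
index = pd.to_datetime(
    ["2018-07-16 04:11:19", "2018-07-16 04:11:22"], utc=True
)
result = defaultdict(list)
result["NoT/machinename"] = pd.DataFrame(
    {"MachineName": ["Quench tank", "Quench tank"], "TempValue": [32.69, 32.69]},
    index=index,
)

# Concatenating the dict gives two-level columns: (measurement, field).
combined = pd.concat(result, axis=1)
levels_before = combined.columns.nlevels

# Dropping the outer level leaves only the field names.
combined.columns = combined.columns.droplevel()
levels_after = combined.columns.nlevels

print(levels_before, levels_after, list(combined.columns))
```

With a single measurement in the dictionary, the outer level carries no extra information, so dropping it is safe. If a query ever returned several measurements with overlapping field names, dropping the level would leave duplicate column names.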

Hope this helps!

1 Comment

Thanks, that helped! Much appreciated! If you have any other tips on analyzing InfluxDB time series data through pandas, they would be appreciated.

Alternatively, you can index the returned dict with the measurement's name to get the query result as a DataFrame:

client.query("""SELECT * FROM "NoT/machinename" WHERE time >= now() - 12h""")["NoT/machinename"]
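
One thing to watch for: the printed output in the question shows the result is a defaultdict(list), so indexing with a name that is not present returns an empty list instead of raising KeyError. A small sketch of a guarded lookup follows; frame_for is a hypothetical helper name, and the dictionary is a hand-built stand-in with made-up values rather than a real query result.

```python
from collections import defaultdict

import pandas as pd


def frame_for(result, measurement):
    """Return the DataFrame stored under `measurement`, or an empty DataFrame."""
    # The membership test avoids the defaultdict silently creating an empty list entry.
    if measurement in result:
        return result[measurement]
    return pd.DataFrame()


# Hypothetical stand-in for the value returned by client.query(...).
result = defaultdict(list)
result["NoT/machinename"] = pd.DataFrame({"TempValue": [32.69, 34.19]})

df = frame_for(result, "NoT/machinename")
missing = frame_for(result, "NoT/other")
print(type(df).__name__, len(df), missing.empty)
```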
