@bgarcial wrote:
I have the following pandas DataFrames:

    phreatic_level_l2n1_28w_df.head()
            Fecha      Hora  PORVL2N1   # the PORVLxNx column name changes in each data frame
    0  2012-01-12  01:37:47      0.65
    1  2012-01-12  02:37:45      0.65
    2  2012-01-12  03:37:50      0.64
    3  2012-01-12  04:37:44      0.63
    4  2012-01-12  05:37:45      0.61

    phreatic_level_l2n2_28w_df.head()
            Fecha      Hora  PORVL2N2   # the PORVLxNx column name changes in each data frame
    0  2018-01-12  01:58:22      0.71
    1  2018-01-12  02:58:22      0.71
    2  2018-01-12  03:58:23      0.71
    3  2018-01-12  04:58:23      0.71
    4  2018-01-12  05:58:24      0.71

    phreatic_level_l4n1_28w_df.head()
            Fecha      Hora  PORVL4N1   # the PORVLxNx column name changes in each data frame
    0  2018-01-12  01:28:49      0.96
    1  2018-01-12  02:28:49      0.96
    2  2018-01-12  03:28:50      0.96
    3  2018-01-12  04:28:52      0.95
    4  2018-01-12  05:28:48      0.94

And so on, up to 25 data frames of this type, ending with phreatic_level_l24n2_28w_df:

    . . .
    phreatic_level_l24n2_28w_df.head()
            Fecha      Hora  PORVL24N2   # the PORVLxNx column name changes in each data frame
    0  2018-01-12  01:07:28       1.31
    1  2018-01-12  02:07:28       1.31
    2  2018-01-12  03:07:29       1.31
    3  2018-01-12  04:07:27       1.31
    4  2018-01-12  05:07:27       1.31

In each of these data frames, the PORVLxNx column holds many readings per day over the date range (Fecha column) from 2018-01-12 to 2018-08-03:

    phreatic_level_l24n2_28w_df.tail()
               Fecha      Hora  PORVL24N2
    4875  2018-08-03  20:31:01       1.15
    4876  2018-08-03  21:31:00       1.15
    4877  2018-08-03  22:31:01       1.16
    4878  2018-08-03  23:31:02       1.17
    4879         NaN       NaN        NaN

My objective is to take each data frame and compute the daily average of its PORVLxNx column, something like this:

            Fecha  PORVL2N1
    0  2018-01-12  0.519130
    1  2018-01-13  0.138750
    2  2018-01-14  0.175417
    3  2018-01-15  0.111667
    4  2018-01-16  0.291250

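To make the goal concrete, here is a minimal sketch of the daily-average step on a toy frame. The values are invented for illustration and only the column layout (Fecha, Hora, one PORVLxNx column) mirrors the data frames above.

```python
import pandas as pd

# Toy data with made-up readings; only the layout matches the real frames.
toy = pd.DataFrame({
    "Fecha": ["2018-01-12", "2018-01-12", "2018-01-13", "2018-01-13"],
    "Hora": ["01:37:47", "02:37:45", "01:37:50", "02:37:44"],
    "PORVL2N1": [0.60, 0.70, 0.20, 0.40],
})

# Grouping by calendar day needs a real datetime column, not strings.
toy["Fecha"] = pd.to_datetime(toy["Fecha"])

# One row per day, holding the mean of all readings taken on that day.
daily_mean = (
    toy.groupby(pd.Grouper(key="Fecha", freq="D"))["PORVL2N1"]
    .mean()
    .reset_index()
)
print(daily_mean)
```

The result has one row per calendar day, with Fecha as a regular column again because of reset_index().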
My approach is as follows. I place my DataFrames in a dict and reference them by string:

    dfs = {
        'phreatic_level_l2n1_28w_df': phreatic_level_l2n1_28w_df,
        # FOR THE MOMENT I ONLY TEST with the first dataframe
        # 'phreatic_level_l2n2_28w_df': phreatic_level_l2n2_28w_df,
        # 'phreatic_level_l4n1_28w_df': phreatic_level_l4n1_28w_df,
        # 'phreatic_level_l5n1_28w_df': phreatic_level_l5n1_28w_df,
        # 'phreatic_level_l6n1_28w_df': phreatic_level_l6n1_28w_df,
        # 'phreatic_level_l7n1_28w_df': phreatic_level_l7n1_28w_df,
        # 'phreatic_level_l8n1_28w_df': phreatic_level_l8n1_28w_df,
        # 'phreatic_level_l9n1_28w_df': phreatic_level_l9n1_28w_df,
        # 'phreatic_level_l10n1_28w_df': phreatic_level_l10n1_28w_df,
        # 'phreatic_level_l13n1_28w_df': phreatic_level_l13n1_28w_df,
        # 'phreatic_level_l14n1_28w_df': phreatic_level_l14n1_28w_df,
        # 'phreatic_level_l15n1_28w_df': phreatic_level_l15n1_28w_df,
        # 'phreatic_level_l16n1_28w_df': phreatic_level_l16n1_28w_df,
        # 'phreatic_level_l16n2_28w_df': phreatic_level_l16n2_28w_df,
        # 'phreatic_level_l18n1_28w_df': phreatic_level_l18n1_28w_df,
        # 'phreatic_level_l18n2_28w_df': phreatic_level_l18n2_28w_df,
        # 'phreatic_level_l18n3_28w_df': phreatic_level_l18n3_28w_df,
        # 'phreatic_level_l18n4_28w_df': phreatic_level_l18n4_28w_df,
        # 'phreatic_level_l21n1_28w_df': phreatic_level_l21n1_28w_df,
        # 'phreatic_level_l21n2_28w_df': phreatic_level_l21n2_28w_df,
        # 'phreatic_level_l21n3_28w_df': phreatic_level_l21n3_28w_df,
        # 'phreatic_level_l21n4_28w_df': phreatic_level_l21n4_28w_df,
        # 'phreatic_level_l21n5_28w_df': phreatic_level_l21n5_28w_df,
        # 'phreatic_level_l24n1_28w_df': phreatic_level_l24n1_28w_df,
        # 'phreatic_level_l24n2_28w_df': phreatic_level_l24n2_28w_df
    }

I iterate over the data frames (for now, only over phreatic_level_l2n1_28w_df):

    for name, df in dfs.items():
        # Convert the Fecha column values to datetime
        df['Fecha'] = pd.to_datetime(df['Fecha'])
        # I am iterating over each PORVLxNx column
        for i in range(1,24):
            if(i==2):
                # To N1
                l2_n1_average_per_day = (df.groupby(pd.Grouper(key='Fecha', freq='D'))['PORVL{}N{}'.format(i,i-1)].mean().reset_index())
                l2_n1_average_per_day.to_csv('L{}N{}_average_per-day.csv'.format(i,i-1), sep=',', header=True, index=False)
                print(l2_n1_average_per_day.head())

The output of l2_n1_average_per_day.head() is:

            Fecha  PORVL2N1
    0  2018-01-12  0.519130
    1  2018-01-13  0.138750
    2  2018-01-14  0.175417
    3  2018-01-15  0.111667
    4  2018-01-16  0.291250

    l2_n1_average_per_day.tail()
              Fecha  PORVL2N1
    199  2018-07-30  0.630417
    200  2018-07-31  0.609583
    201  2018-08-01  0.533333
    202  2018-08-02  0.470833
    203  2018-08-03  0.713333

Up to this point, my idea works.
The problem appears when I try to apply this solution (which quite possibly is not the optimal one) to the other data frames in my dfs dictionary:

    dfs = {
        'phreatic_level_l2n1_28w_df': phreatic_level_l2n1_28w_df,
        'phreatic_level_l2n2_28w_df': phreatic_level_l2n2_28w_df,
        # I've added the L2N2 phreatic_level_l2n2_28w_df dataframe item
    }

I iterate again:

    for name, df in dfs.items():
        df['Fecha'] = pd.to_datetime(df['Fecha'])
        for i in range(1,24):
            if(i==2):
                # To N1
                l2_n1_average_per_day = (df.groupby(pd.Grouper(key='Fecha', freq='D'))['PORVL{}N{}'.format(i,i-1)].mean().reset_index())
                l2_n1_average_per_day.to_csv('L{}N{}_average_per-day.csv'.format(i,i-1), sep=',', header=True, index=False)
                # To N2. I generate the average per day for L2N2
                l2_n2_average_per_day = (df.groupby(pd.Grouper(key='Fecha', freq='D'))['PORVL{}N{}'.format(i,i)].mean().reset_index())
                l2_n2_average_per_day.to_csv('L{}N{}_average_per-day.csv'.format(i,i), sep=',', header=True, index=False)

In my output, the PORVL2N2 column is not found:

    ---------------------------------------------------------------------------
    KeyError                                  Traceback (most recent call last)
    <ipython-input-161-fbe6eaf8a824> in <module>()
         11 print(phreatic_level_l2_n1_average_per_day.tail())
         12 # To N2
    ---> 13 phreatic_level_l2_n2_average_per_day = (df.groupby(pd.Grouper(key='Fecha', freq='D'))['PORVL{}N{}'.format(i,i)].mean().reset_index())
         14 phreatic_level_l2_n2_average_per_day.to_csv('L{}N{}_average_per-day.csv'.format(i,i), sep=',', header=True, index=False)
         15

    ~/anaconda3/envs/sioma/lib/python3.6/site-packages/pandas/core/base.py in __getitem__(self, key)
        265 else:
        266 if key not in self.obj:
    --> 267 raise KeyError("Column not found: {key}".format(key=key))
        268 return self._gotitem(key, ndim=1)
        269

    KeyError: 'Column not found: PORVL2N2'

This is strange, because the data frame in the dictionary I am iterating over does have the PORVL2N2 column:

    phreatic_level_l2n2_28w_df.head()
            Fecha      Hora  PORVL2N2
    0  2018-01-12  01:58:22      0.71
    1  2018-01-12  02:58:22      0.71
    2  2018-01-12  03:58:23      0.71
    3  2018-01-12  04:58:23      0.71
    4  2018-01-12  05:58:24      0.71

Is it possible that I have overwritten the data frame during my iteration, or is something else happening?
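One possible reading of the traceback, offered as a sketch and not a definitive answer: the loop body runs once per data frame in dfs, and on every pass it looks up both PORVL2N1 and PORVL2N2. The first frame in the dict only has PORVL2N1, so the PORVL2N2 lookup on that frame would raise the KeyError before the second frame is ever reached. The example below picks the value column from each frame itself. The helper name daily_means, the toy values, and the assumption that each frame has exactly one column starting with "PORVL" are all hypothetical.

```python
import pandas as pd


def daily_means(dfs):
    """Return {value_column_name: daily-mean DataFrame} for each frame.

    Assumes (hypothetically) that every frame has a 'Fecha' column and
    exactly one column whose name starts with 'PORVL'.
    """
    results = {}
    for name, df in dfs.items():
        # Find the value column in this frame instead of building its name.
        value_cols = [c for c in df.columns if str(c).startswith("PORVL")]
        if len(value_cols) != 1:
            raise ValueError(
                "expected one PORVL column in {}, found {}".format(name, value_cols)
            )
        col = value_cols[0]

        # Work on a copy and drop rows without a date (e.g. a trailing NaN row).
        tmp = df.dropna(subset=["Fecha"]).copy()
        tmp["Fecha"] = pd.to_datetime(tmp["Fecha"])

        results[col] = (
            tmp.groupby(pd.Grouper(key="Fecha", freq="D"))[col]
            .mean()
            .reset_index()
        )
    return results


# Toy stand-ins for two of the real frames; the readings are invented.
toy_dfs = {
    "phreatic_level_l2n1_28w_df": pd.DataFrame({
        "Fecha": ["2018-01-12", "2018-01-12", "2018-01-13"],
        "Hora": ["01:37:47", "02:37:45", "01:37:50"],
        "PORVL2N1": [0.60, 0.70, 0.20],
    }),
    "phreatic_level_l2n2_28w_df": pd.DataFrame({
        "Fecha": ["2018-01-12", "2018-01-12", "2018-01-13", None],
        "Hora": ["01:58:22", "02:58:22", "01:58:23", None],
        "PORVL2N2": [0.71, 0.73, 0.70, None],
    }),
}

out = daily_means(toy_dfs)
for col, frame in out.items():
    print(col)
    print(frame)
```

Each resulting frame could then be written out with to_csv, as in the loop above, using the column name to build the file name.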