[reportlab-users] pandas dataframe in a reportlab table

Robin Becker robin at reportlab.com
Thu Feb 21 04:45:13 EST 2019


Hi,

I'm not a great pandas user, but I have been looking at it in connection with reportlab for another project. As Andy's
post suggests, for really simple data you need not use pandas to create a reportlab table.

However, I suspect the more common usage would be to obtain some data in pandas and perform some manipulation on that data and 
then output a computed dataframe as a reportlab table.

As you observe, pandas dataframes appear to be column-organized; to get the row-oriented data that a reportlab table
expects, you would need a construct like this

rlab_table_data = [['Mean','Max','Min','TestA','TestB']]+ [list(row[1].values) for row in df.iterrows()]

Alternatively, you can stay with the column orientation and transpose using Python's zip(*...) idiom

rlab_table_data = [['Mean','Max','Min','TestA','TestB']]+[list(x) for x in map(list,zip(*[df[i].values for i in df]))]
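
For anyone unfamiliar with the zip(*...) idiom, here is a minimal sketch using plain lists in place of the dataframe
columns. The column values are made up for illustration and simply mirror the 3x5 example data.

```python
# Plain lists standing in for the columns df[0], df[1], ... (illustrative values only)
columns = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4], [5, 5, 5]]
header = ['Mean', 'Max', 'Min', 'TestA', 'TestB']

# zip(*columns) groups the i-th item of every column together,
# so each tuple it yields is one row of the table
rows = [list(row) for row in zip(*columns)]

rlab_table_data = [header] + rows
print(rlab_table_data)
```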

On the other hand, since pandas is based on numpy, I supposed there must be some simple way to transpose, and apparently
example_df.transpose() or example_df.T will produce a transposed data frame, so then

dft = df.T
rlab_table_data = [['Mean','Max','Min','TestA','TestB']]+[list(dft[i].values) for i in dft]


I timed these using timeit, e.g.

> $ python -mtimeit -s'import pandas as pd;df=pd.DataFrame([[1,2,3,4,5],[1,2,3,4,5],[1,2,3,4,5]])' "rlab_table_data = [['Mean','Max','Min','TestA','TestB']]+ [list(row[1].values) for row in df.iterrows()]"
> 10000 loops, best of 3: 159 usec per loop
> (ifa) rptlab at denali:~/devel/ifa
> $ python -mtimeit -s'import pandas as pd;df=pd.DataFrame([[1,2,3,4,5],[1,2,3,4,5],[1,2,3,4,5]])' "[['Mean','Max','Min','TestA','TestB']]+[list(x) for x in map(list,zip(*[df[i].values for i in df]))]"
> 10000 loops, best of 3: 24.2 usec per loop
> (ifa) rptlab at denali:~/devel/ifa
> $ python -mtimeit -s'import pandas as pd;df=pd.DataFrame([[1,2,3,4,5],[1,2,3,4,5],[1,2,3,4,5]])' "dft = df.T;rlab_table_data = [['Mean','Max','Min','TestA','TestB']]+[list(dft[i].values) for i in dft]"
> 1000 loops, best of 3: 257 usec per loop
> (ifa) rptlab at denali:~/devel/ifa
> $

and, surprisingly, the second is by far the fastest (24.2 usec) while the transpose is the slowest (257 usec) :( I'm not
sure exactly how this would scale, and I suspect #2 might break for extremely large datasets.
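
A further option, not among the three timed above, is to let numpy build the nested list directly with
.values.tolist(). This is only a sketch, assuming the same 3x5 frame used in the timings and that the header labels are
set as the column names; I have not measured how it compares for speed.

```python
import pandas as pd

header = ['Mean', 'Max', 'Min', 'TestA', 'TestB']

# Same shape of data as in the timings, with the header used as column labels (an assumption)
df = pd.DataFrame([[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5]], columns=header)

# df.values is a 2-D numpy array in row order; tolist() turns it into a list of row lists
rlab_table_data = [list(df.columns)] + df.values.tolist()
print(rlab_table_data)
```

The resulting list of lists has the same row-oriented shape as the constructs above, with the header row taken from the
frame rather than typed out by hand.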




On 20/02/2019 21:49, Pizzolato, Larissa (EC) wrote:
> Hi there,
> 
> I am seeking a way to input a pandas dataframe into an existing reportlab table.
> 
> So far I have tried numerous different things, with this being the most recent:
> 
> example_df = inputPandasDataframe.values.astype(str) # the data looks like this:[[1,2,3,4,5],[1,2,3,4,5],[1,2,3,4,5]]
> 
> 
> rlab_table_data = [['Mean','Max','Min','TestA','TestB'], example_df]]
> 
> 
> 
>...........


-- 
Robin Becker

