
Select Only One Value In Df Col Rows In Same Df For Calc Results From Different Val, And Calc Df Only On One Ticker At A Time

I am trying to calculate some KPIs for different companies/tickers. My stock data resides in a DataFrame with these columns: Ticker, Open, High, Low, Adj Close

Solution 1:

You can use transform on your groupby object to maintain a column with the same shape:

Here, for example, is the 3-day moving average of Adj_Close, computed separately for each ticker (pandas < 0.18.0).

df['MA3'] = df.groupby('Ticker').Adj_Close.transform(lambda group: pd.rolling_mean(group, window=3))

>>> df
          Date  Ticker  Open  High  Low  Adj_Close   Volume  MA3
0   2015-04-09  vws.co   315   316  312        312  1686800  NaN
1   2015-04-10  vws.co   317   320  316        313  1396500  NaN
2   2015-04-13  vws.co   318   322  315        316  1564500  313
3   2015-04-14  vws.co   320   322  319        315  1370600  314
4   2015-04-15  vws.co   320   322  319        316   945000  316
5   2015-04-16  vws.co   319   320  310        308  2236100  313
6   2015-04-17  vws.co   310   310  302        299  2711900  308
7   2015-04-20  vws.co   303   312  303        306  1629700  304
8   2016-03-31     mmm   167   168  166        167  1762800  NaN
9   2016-04-01     mmm   166   168  165        168  1993700  NaN
10  2016-04-04     mmm   167   167  166        166  2022800  167
11  2016-04-05     mmm   165   167  165        166  1610300  167
12  2016-04-06     mmm   165   167  165        167  2092200  166
13  2016-04-07     mmm   166   167  165        167  2721900  167
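pd.rolling_mean was deprecated in pandas 0.18.0 and later removed, so the line above will not run on a current install. On newer versions, the same per-ticker moving average can probably be written with the .rolling() method. This is a minimal sketch, not the original answer's code; the small frame is an illustrative subset of the Adj_Close values from this post.

```python
import pandas as pd

# Illustrative subset of the post's data: four rows per ticker.
df = pd.DataFrame({
    "Ticker": ["vws.co"] * 4 + ["mmm"] * 4,
    "Adj_Close": [311.52, 312.70, 315.85, 314.87,
                  166.630005, 167.529999, 166.399994, 165.809998],
})

# Rolling 3-row mean computed within each ticker separately.
# transform returns a Series aligned with the original rows, so the
# first two rows of every ticker are NaN (window not yet full).
df["MA3"] = df.groupby("Ticker")["Adj_Close"].transform(
    lambda s: s.rolling(window=3).mean()
)
print(df)
```

Because the window restarts for each group, values from one ticker never leak into the moving average of the next.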

Solution 2:

Use groupby

Setup

import pandas as pd
from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO

text = """Date   Ticker        Open        High         Low   Adj_Close   Volume
2015-04-09  vws.co  315.000000  316.100000  312.500000  311.520000  1686800
2015-04-10  vws.co  317.000000  319.700000  316.400000  312.700000  1396500
2015-04-13  vws.co  317.900000  321.500000  315.200000  315.850000  1564500
2015-04-14  vws.co  320.000000  322.400000  318.700000  314.870000  1370600
2015-04-15  vws.co  320.000000  321.500000  319.200000  316.150000   945000
2015-04-16  vws.co  319.000000  320.200000  310.400000  307.870000  2236100
2015-04-17  vws.co  309.900000  310.000000  302.500000  299.100000  2711900
2015-04-20  vws.co  303.000000  312.000000  303.000000  306.490000  1629700
2016-03-31     mmm  166.750000  167.500000  166.500000  166.630005  1762800
2016-04-01     mmm  165.630005  167.740005  164.789993  167.529999  1993700
2016-04-04     mmm  167.110001  167.490005  165.919998  166.399994  2022800
2016-04-05     mmm  165.179993  166.550003  164.649994  165.809998  1610300
2016-04-06     mmm  165.339996  167.080002  164.839996  166.809998  2092200
2016-04-07     mmm  165.880005  167.229996  165.250000  167.160004  2721900"""

df = pd.read_csv(StringIO(text), sep=r'\s+', parse_dates=[0], index_col=0)

Looks like:

print(df)

            Ticker        Open        High         Low   Adj_Close   Volume
Date
2015-04-09  vws.co  315.000000  316.100000  312.500000  311.520000  1686800
2015-04-10  vws.co  317.000000  319.700000  316.400000  312.700000  1396500
2015-04-13  vws.co  317.900000  321.500000  315.200000  315.850000  1564500
2015-04-14  vws.co  320.000000  322.400000  318.700000  314.870000  1370600
2015-04-15  vws.co  320.000000  321.500000  319.200000  316.150000   945000
2015-04-16  vws.co  319.000000  320.200000  310.400000  307.870000  2236100
2015-04-17  vws.co  309.900000  310.000000  302.500000  299.100000  2711900
2015-04-20  vws.co  303.000000  312.000000  303.000000  306.490000  1629700
2016-03-31     mmm  166.750000  167.500000  166.500000  166.630005  1762800
2016-04-01     mmm  165.630005  167.740005  164.789993  167.529999  1993700
2016-04-04     mmm  167.110001  167.490005  165.919998  166.399994  2022800
2016-04-05     mmm  165.179993  166.550003  164.649994  165.809998  1610300
2016-04-06     mmm  165.339996  167.080002  164.839996  166.809998  2092200
2016-04-07     mmm  165.880005  167.229996  165.250000  167.160004  2721900

Solution

df.groupby('Ticker').sum()

           Open         High          Low    Adj_Close    Volume
Ticker
mmm      995.89  1003.590011   991.949981  1000.339998  12203700
vws.co  2521.80  2543.400000  2497.900000  2484.550000  13541100

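The groupby object supports many other operations: aggregations such as mean, min and max, several aggregations at once with agg, and per-group transformations with transform.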
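As a rough illustration of working on one ticker at a time, the sketch below applies a different aggregation to each column, pulls out a single ticker with get_group, and loops over the groups. The frame is a small subset of the post's data, and the names summary, mmm and ranges are hypothetical, chosen only for this example.

```python
import pandas as pd

# Illustrative subset of the post's data: three rows per ticker.
df = pd.DataFrame({
    "Ticker": ["vws.co", "vws.co", "vws.co", "mmm", "mmm", "mmm"],
    "High": [316.1, 319.7, 321.5, 167.5, 167.740005, 167.490005],
    "Low": [312.5, 316.4, 315.2, 166.5, 164.789993, 165.919998],
    "Volume": [1686800, 1396500, 1564500, 1762800, 1993700, 2022800],
})

grouped = df.groupby("Ticker")

# A different aggregation per column; the result has one row per ticker.
summary = grouped.agg({"High": "max", "Low": "min", "Volume": "sum"})
print(summary)

# Extract the rows of a single ticker to compute on it in isolation.
mmm = grouped.get_group("mmm")
print(mmm["Volume"].mean())

# Or iterate over the tickers one at a time, here collecting the
# average daily High-Low range for each.
ranges = {}
for ticker, rows in grouped:
    ranges[ticker] = (rows["High"] - rows["Low"]).mean()
print(ranges)
```

The loop is the most flexible form but usually the slowest; when a KPI can be expressed through agg or transform, those tend to be the better choice.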
