Follow along at:

mattgidden.com/presentations/pyam-iamc2017

Find us on GitHub:

github.com/IAMConsortium/pyam-analysis

Diagnostics, analysis and visualization tools
for Integrated Assessment timeseries data

First steps with the `pyam_analysis` package

The pyam-analysis package provides a range of diagnostic tools and functions
for analyzing and working with IAMC-style timeseries data.

The package can be used with data that follows the data template convention of the Integrated Assessment Modeling Consortium (IAMC). An illustrative example is shown below; see data.ene.iiasa.ac.at/database for more information.

model	scenario	region	variable	unit	2005	2010	2015
MESSAGE V.4	AMPERE3-Base	World	Primary Energy	EJ/y	454.5	479.6	...
...	...	...	...	...	...	...	...
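
The wide layout above, with one column per year, corresponds to a long layout with explicit year and value columns. The following is a minimal sketch of that reshaping in plain pandas, using only the two values visible in the illustrative row. The names `wide`, `long`, and `id_cols` are chosen for illustration, and this is not a claim about how the package reads data internally.

```python
import pandas as pd

# One row in IAMC wide format, using only the values visible in the example above.
id_cols = ["model", "scenario", "region", "variable", "unit"]
wide = pd.DataFrame(
    [["MESSAGE V.4", "AMPERE3-Base", "World", "Primary Energy", "EJ/y", 454.5, 479.6]],
    columns=id_cols + [2005, 2010],
)

# Turn the year columns into rows: one (year, value) pair per line.
long = wide.melt(id_vars=id_cols, var_name="year", value_name="value")
long["year"] = long["year"].astype(int)

print(long)
```

Each year column becomes one row, so the single wide row above yields two long rows.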

Import package and load data from the AR5 tutorial csv snapshot file

First, we import the snapshot timeseries data from the file tutorial_AR5_data.csv in the tutorial folder.

We then list all models, scenarios, regions, and variables (with units) included in the snapshot.

In [1]:

import pyam_analysis as iam

In [2]:

# Path to the tutorial snapshot; adjust it to your local checkout of the repository
data = '/home/gidden/work/iiasa/message/pyam-analysis/tutorial/tutorial_AR5_data.csv'
df = iam.IamDataFrame(data=data)

What's in our dataset?

In [3]:

df.models()

Out[3]:

['AIM-Enduse 12.1',
 'GCAM 3.0',
 'IMAGE 2.4',
 'MERGE_EMF27',
 'MESSAGE V.4',
 'REMIND 1.5',
 'WITCH_EMF27']

In [4]:

df.scenarios()

Out[4]:

['EMF27-450-Conv',
 'EMF27-450-NoCCS',
 'EMF27-550-LimBio',
 'EMF27-Base-FullTech',
 'EMF27-G8-EERE',
 'AMPERE3-450',
 'AMPERE3-450P-CE',
 'AMPERE3-450P-EU',
 'AMPERE3-550',
 'AMPERE3-Base-EUback',
 'AMPERE3-CF450P-EU',
 'AMPERE3-RefPol',
 'AMPERE3-550P-EU']

In [5]:

df.regions()

Out[5]:

['ASIA', 'LAM', 'MAF', 'OECD90', 'REF', 'World']

In [6]:

df.variables(include_units=True)

Out[6]:

	variable	unit
0	Emissions|CO2	Mt CO2/yr
1	Emissions|CO2|Fossil Fuels and Industry	Mt CO2/yr
2	Primary Energy	EJ/yr
3	Emissions|CO2|Fossil Fuels and Industry|Energy...	Mt CO2/yr
4	Emissions|CO2|Fossil Fuels and Industry|Energy...	Mt CO2/yr
5	Price|Carbon	US$2005/t CO2
6	Primary Energy|Coal	EJ/yr
7	Primary Energy|Fossil|w/ CCS	EJ/yr
8	Temperature|Global Mean|MAGICC6|MED	deg C

Filtering Data

In [7]:

df.scenarios({'model': 'MESSAGE'})

Out[7]:

['AMPERE3-450',
 'AMPERE3-450P-EU',
 'AMPERE3-550',
 'AMPERE3-RefPol',
 'EMF27-550-LimBio',
 'EMF27-Base-FullTech']

In [8]:

df.scenarios({'model': 'ESSAGE'})

Out[8]:

[]
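
The two results above are consistent with filter strings being matched from the start of a name: 'MESSAGE' selects 'MESSAGE V.4', while 'ESSAGE' selects nothing. The following is a sketch of that behavior in plain Python. `match_from_start` is a hypothetical helper written for this illustration, not a function of the package, and the package's actual matching rule may differ.

```python
# Model names as listed by df.models() above.
MODELS = [
    "AIM-Enduse 12.1",
    "GCAM 3.0",
    "IMAGE 2.4",
    "MERGE_EMF27",
    "MESSAGE V.4",
    "REMIND 1.5",
    "WITCH_EMF27",
]


def match_from_start(names, pattern):
    """Hypothetical helper: keep the names that begin with `pattern`."""
    return [name for name in names if name.startswith(pattern)]


print(match_from_start(MODELS, "MESSAGE"))  # ['MESSAGE V.4']
print(match_from_start(MODELS, "ESSAGE"))   # []
```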

In [9]:

df.variables(filters={'variable': 'Emissions|*'})

Out[9]:

['Emissions|CO2',
 'Emissions|CO2|Fossil Fuels and Industry',
 'Emissions|CO2|Fossil Fuels and Industry|Energy Supply',
 'Emissions|CO2|Fossil Fuels and Industry|Energy Supply|Electricity']

In [10]:

df.variables(filters={'variable': 'Emissions|*', 'level': 2})

Out[10]:

['Emissions|CO2',
 'Emissions|CO2|Fossil Fuels and Industry',
 'Emissions|CO2|Fossil Fuels and Industry|Energy Supply']

In [11]:

df.variables(filters={'level': 1})

Out[11]:

['Emissions|CO2', 'Price|Carbon', 'Primary Energy', 'Primary Energy|Coal']
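
The outputs above are consistent with `level` capping how many `|` separators may follow the matched prefix, or appear in the whole name when no variable pattern is given. The following sketch reproduces those three results under this assumption. `select_variables` is a hypothetical helper that only understands a trailing `*`; it is not the package's implementation.

```python
# Variable names contained in the tutorial snapshot, as listed above.
VARIABLES = [
    "Emissions|CO2",
    "Emissions|CO2|Fossil Fuels and Industry",
    "Emissions|CO2|Fossil Fuels and Industry|Energy Supply",
    "Emissions|CO2|Fossil Fuels and Industry|Energy Supply|Electricity",
    "Price|Carbon",
    "Primary Energy",
    "Primary Energy|Coal",
    "Primary Energy|Fossil|w/ CCS",
    "Temperature|Global Mean|MAGICC6|MED",
]


def select_variables(names, pattern="*", level=None):
    """Hypothetical helper: match `pattern`, then cap the separator count."""
    # Only a trailing '*' wildcard is supported in this sketch.
    prefix = pattern[:-1] if pattern.endswith("*") else None
    selected = []
    for name in names:
        if prefix is None:
            if name != pattern:
                continue
            tail = ""
        else:
            if not name.startswith(prefix):
                continue
            tail = name[len(prefix):]
        # Assumption: `level` is the maximum number of '|' after the prefix.
        if level is not None and tail.count("|") > level:
            continue
        selected.append(name)
    return selected


print(select_variables(VARIABLES, "Emissions|*"))
print(select_variables(VARIABLES, "Emissions|*", level=2))
print(select_variables(VARIABLES, level=1))
```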

Working with Timeseries

In [13]:

df.timeseries(filters={
    'scenario': 'AMPERE3-450', 
    'variable': 'Primary Energy|Coal', 
    'region': 'World'
}).head()

Out[13]:

				year	2005	2010	2020	2030	2040	2050	2060	2070	2080	2090	2100
model	scenario	region	variable	unit
GCAM 3.0	AMPERE3-450	World	Primary Energy|Coal	EJ/yr	120.76	144.95	176.44	204.42	212.84	186.02	138.23	106.98	82.44	36.55	14.89
	AMPERE3-450P-CE	World	Primary Energy|Coal	EJ/yr	120.76	144.95	178.98	218.24	213.35	192.45	142.64	108.72	82.73	36.89	15.22
	AMPERE3-450P-EU	World	Primary Energy|Coal	EJ/yr	120.76	144.95	189.86	241.25	224.25	191.70	136.72	102.51	80.72	35.70	14.45
IMAGE 2.4	AMPERE3-450	World	Primary Energy|Coal	EJ/yr	111.62	138.69	148.60	121.24	102.62	101.41	111.41	138.40	181.03	224.03	264.77
IMAGE 2.4	AMPERE3-450P-CE	World	Primary Energy|Coal	EJ/yr	111.62	138.69	161.72	154.18	125.14	105.32	120.83	151.58	192.67	249.50	294.40

In [14]:

df.pivot_table(
    index=['year'], 
    columns=['scenario'], 
    values='value', 
    aggfunc='sum',
    filters={'variable': 'Primary Energy', 'region': 'World'}
).head()

Out[14]:

scenario	AMPERE3-450	AMPERE3-450P-CE	AMPERE3-450P-EU	AMPERE3-550	AMPERE3-550P-EU	AMPERE3-Base-EUback	AMPERE3-CF450P-EU	AMPERE3-RefPol	EMF27-450-Conv	EMF27-450-NoCCS	EMF27-550-LimBio	EMF27-Base-FullTech	EMF27-G8-EERE
year
2005	1821.09	1366.48	1821.09	1818.71	464.82	922.58	925.23	1818.44	2234.35	1381.84	3130.81	3130.60	868.79
2010	1972.13	1492.28	1972.02	1969.57	514.07	1015.78	1018.44	1969.50	2504.99	1542.90	3457.08	3459.28	985.16
2020	2253.49	1787.40	2399.41	2322.23	611.34	1258.24	1262.07	2401.37	2428.61	1424.26	3781.16	4135.65	947.08
2030	2530.95	2101.60	2863.85	2670.22	734.11	1532.12	1536.54	2869.96	2545.94	1470.64	4057.28	4846.37	933.08
2040	2795.47	2206.09	2940.96	3000.37	789.70	1802.62	1574.14	3305.70	2698.99	1670.51	4355.16	5588.19	1007.83

If you are familiar with the Python package pandas, you can access the underlying pd.DataFrame directly through the data attribute.

In [15]:

df.data.head()

Out[15]:

	model	scenario	region	variable	unit	year	value
0	AIM-Enduse 12.1	EMF27-450-Conv	ASIA	Emissions|CO2	Mt CO2/yr	2005	10540.74
1	AIM-Enduse 12.1	EMF27-450-Conv	ASIA	Emissions|CO2|Fossil Fuels and Industry	Mt CO2/yr	2005	9126.18
2	AIM-Enduse 12.1	EMF27-450-Conv	ASIA	Primary Energy	EJ/yr	2005	133.56
3	AIM-Enduse 12.1	EMF27-450-Conv	LAM	Emissions|CO2	Mt CO2/yr	2005	3285.00
4	AIM-Enduse 12.1	EMF27-450-Conv	LAM	Emissions|CO2|Fossil Fuels and Industry	Mt CO2/yr	2005	1422.06
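
Because the data are held in an ordinary long-format pd.DataFrame, the usual pandas idioms apply. The sketch below rebuilds the five rows shown above as a standalone frame, so that it runs without the tutorial file, and pivots the Emissions|CO2 rows by region. The names `data`, `co2`, and `by_region` are illustrative.

```python
import pandas as pd

columns = ["model", "scenario", "region", "variable", "unit", "year", "value"]
# The five rows displayed by df.data.head() above.
rows = [
    ("AIM-Enduse 12.1", "EMF27-450-Conv", "ASIA", "Emissions|CO2", "Mt CO2/yr", 2005, 10540.74),
    ("AIM-Enduse 12.1", "EMF27-450-Conv", "ASIA", "Emissions|CO2|Fossil Fuels and Industry", "Mt CO2/yr", 2005, 9126.18),
    ("AIM-Enduse 12.1", "EMF27-450-Conv", "ASIA", "Primary Energy", "EJ/yr", 2005, 133.56),
    ("AIM-Enduse 12.1", "EMF27-450-Conv", "LAM", "Emissions|CO2", "Mt CO2/yr", 2005, 3285.00),
    ("AIM-Enduse 12.1", "EMF27-450-Conv", "LAM", "Emissions|CO2|Fossil Fuels and Industry", "Mt CO2/yr", 2005, 1422.06),
]
data = pd.DataFrame(rows, columns=columns)

# Boolean-mask filtering, then a year-by-region pivot of the selected variable.
co2 = data[data["variable"] == "Emissions|CO2"]
by_region = co2.pivot_table(index="year", columns="region", values="value", aggfunc="sum")

print(by_region)
```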

Plotting Timeseries

In [16]:

df.plot_lines({'variable': 'Emissions|CO2', 'region': 'World'})

Validating and querying timeseries data

In [18]:

df.validate('Primary Energy')

INFO:root:48 scenarios satisfy the criteria

In [19]:

df.validate({'Primary Energy': {'up': 515, 'year': 2010}})

INFO:root:9 data points do not satisfy the criteria (out of 48 scenarios)

Out[19]:

						value
model	scenario	region	variable	unit	year
AIM-Enduse 12.1	EMF27-450-Conv	World	Primary Energy	EJ/yr	2010	518.89
	EMF27-450-NoCCS	World	Primary Energy	EJ/yr	2010	518.81
	EMF27-550-LimBio	World	Primary Energy	EJ/yr	2010	518.81
	EMF27-Base-FullTech	World	Primary Energy	EJ/yr	2010	518.81
	EMF27-G8-EERE	World	Primary Energy	EJ/yr	2010	518.64
REMIND 1.5	EMF27-450-Conv	World	Primary Energy	EJ/yr	2010	519.64
	EMF27-450-NoCCS	World	Primary Energy	EJ/yr	2010	519.64
	EMF27-550-LimBio	World	Primary Energy	EJ/yr	2010	519.64
	EMF27-Base-FullTech	World	Primary Energy	EJ/yr	2010	519.64

In [20]:

df.validate(
    {'Primary Energy|Coal': {'up': 400, 'year': 2050}}, 
    filters={'region': 'World'}, 
    exclude=False
)

INFO:root:2 data points do not satisfy the criteria (out of 48 scenarios)

Out[20]:

						value
model	scenario	region	variable	unit	year
GCAM 3.0	AMPERE3-Base-EUback	World	Primary Energy|Coal	EJ/yr	2050	424.09
MERGE_EMF27	EMF27-Base-FullTech	World	Primary Energy|Coal	EJ/yr	2050	605.76
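
Judging from the output above, the 'up' criterion reports data points whose value exceeds the given bound in the given year. The following is a sketch in plain Python under that assumption, using World Primary Energy|Coal values for 2050 taken from the tables in this tutorial. `exceeds_upper_bound` is a hypothetical helper, not part of the package, and whether a value exactly equal to the bound passes is an assumption here.

```python
# (model, scenario, value in EJ/yr) for Primary Energy|Coal, World, 2050,
# copied from the timeseries and validation outputs shown in this tutorial.
COAL_2050 = [
    ("GCAM 3.0", "AMPERE3-450", 186.02),
    ("IMAGE 2.4", "AMPERE3-450", 101.41),
    ("GCAM 3.0", "AMPERE3-Base-EUback", 424.09),
    ("MERGE_EMF27", "EMF27-Base-FullTech", 605.76),
]


def exceeds_upper_bound(points, up):
    """Hypothetical helper: return the data points with a value above `up`."""
    return [point for point in points if point[2] > up]


violations = exceeds_upper_bound(COAL_2050, up=400)
for model, scenario, value in violations:
    print(model, scenario, value)
```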

Categorization of scenarios by timeseries characteristics

In [21]:

df.plot_lines({'variable': 'Temperature*'})

We now use the categorization feature of the pyam-analysis package. By default, each model/scenario combination is assigned the category "uncategorized".

The next function resets all scenarios back to "uncategorized". This may be helpful in this tutorial if you are going back and forth between cells.

In [22]:

df.reset_category()

In [23]:

df.category(
    'Below 1.6C',
    {'Temperature|Global Mean|MAGICC6|MED': {'up': 1.6, 'year': 2100}},
    color='cornflowerblue',
    display='list'
)

INFO:root:4 scenarios categorized as 'Below 1.6C'

Out[23]:


model	scenario
GCAM 3.0	EMF27-450-Conv
GCAM 3.0	EMF27-450-NoCCS
REMIND 1.5	EMF27-450-Conv
REMIND 1.5	EMF27-450-NoCCS

In [24]:

df.category(
    'Below 2.0C',
    {'Temperature|Global Mean|MAGICC6|MED': {'up': 2.0, 'year': 2100}},
    filters={'category': 'uncategorized'}, 
    color='forestgreen'
)

INFO:root:8 scenarios categorized as 'Below 2.0C'

In [25]:

df.category(
    'Below 2.5C',
    {'Temperature|Global Mean|MAGICC6|MED': {'up': 2.5, 'year': 2100}},
    filters={'category': 'uncategorized'}, 
    color='gold'
)

INFO:root:16 scenarios categorized as 'Below 2.5C'

In [26]:

df.category(
    'Below 3.5C',
    {'Temperature|Global Mean|MAGICC6|MED': {'up': 3.5, 'year': 2100}},
    filters={'category': 'uncategorized'}, 
    color='firebrick'
)

INFO:root:3 scenarios categorized as 'Below 3.5C'

In [27]:

df.category(
    'Above 3.5C',
    {'Temperature|Global Mean|MAGICC6|MED': {}},
    filters={'category': 'uncategorized'}, 
    color='magenta'
)

INFO:root:9 scenarios categorized as 'Above 3.5C'

In [28]:

df.category('uncategorized', display='list')

Out[28]:


category	model	scenario
uncategorized	AIM-Enduse 12.1	EMF27-450-Conv
		EMF27-450-NoCCS
		EMF27-550-LimBio
		EMF27-Base-FullTech
		EMF27-G8-EERE
	WITCH_EMF27	EMF27-450-Conv
		EMF27-550-LimBio
		EMF27-Base-FullTech
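
The cells above form a cascade: each call only considers scenarios that are still "uncategorized", so a scenario ends up in the first category whose upper bound it satisfies. The following is a sketch of this logic in plain Python, assuming that the bound is inclusive and that scenarios without a temperature value stay uncategorized. `categorize` and the temperatures passed to it are hypothetical; they are not part of the package or the dataset.

```python
# Upper bounds on the year-2100 temperature, in the order the cells apply them.
THRESHOLDS = [
    ("Below 1.6C", 1.6),
    ("Below 2.0C", 2.0),
    ("Below 2.5C", 2.5),
    ("Below 3.5C", 3.5),
]


def categorize(temperature_2100):
    """Hypothetical helper: return the first category whose bound is met."""
    if temperature_2100 is None:
        # Assumption: without a temperature value there is nothing to test.
        return "uncategorized"
    for name, upper_bound in THRESHOLDS:
        if temperature_2100 <= upper_bound:
            return name
    # Mirrors the final cell, which sets no bound and catches the rest.
    return "Above 3.5C"


# Made-up temperatures, for illustration only.
for value in (1.5, 1.9, 3.0, 4.2, None):
    print(value, "->", categorize(value))
```

Applying the bounds in ascending order is what makes the `filters={'category': 'uncategorized'}` argument matter: without it, a later, looser bound would overwrite an earlier, stricter category.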

Now, we again display the median global temperature increase for all scenarios, but this time we color the lines by category to illustrate the common characteristics across scenarios.

In [29]:

df.plot_lines({'variable': 'Temperature*'}, color_by_cat=True)

As a last step, we display the aggregate CO2 emissions by category. This highlights alternative pathways within the same category.

In [30]:

df.plot_lines(
    {'variable': 'Emissions|CO2', 'region': 'World'}, 
    color_by_cat=True
)