Using the IMF Data API: Retrieving an IFS series with Python

April 5, 2017 Update: I’ve put together a newer version of the code/guide located here.

The following tutorial shows how to use Python’s requests package to get data directly from the International Monetary Fund (IMF). The IMF’s application programming interface (API) provides access to economic and financial data for more than 180 countries over more than 60 years.

More information on the IMF Data API

The IMF offers guidance on using their data services. The API is not stable, so check the IMF data services news if you receive error messages.
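
Because the API can change or return errors, it may help to wrap requests in a small helper that retries and reports failures. The sketch below is one hypothetical way to do that. The name fetch_json and its parameters are illustrative only and are not part of the IMF service or of the requests package.

```python
import time

def fetch_json(url, fetch=None, retries=3, pause=1.0):
    """Return parsed JSON from `url`, retrying a few times on failure.

    `fetch` is any callable that takes a URL and returns parsed JSON. When it
    is omitted, requests is used, as in the rest of this tutorial.
    """
    if fetch is None:
        import requests

        def fetch(address):
            response = requests.get(address, timeout=30)
            response.raise_for_status()  # raise on HTTP 4xx/5xx responses
            return response.json()       # raises ValueError if the body is not JSON

    last_error = None
    for attempt in range(retries):
        try:
            return fetch(url)
        except Exception as err:
            last_error = err
            if attempt < retries - 1:
                time.sleep(pause)  # wait before trying again
    raise RuntimeError('No valid response after {0} attempts: {1}'.format(retries, last_error))
```

Calling fetch_json(url) in place of requests.get(url).json() would then either return the parsed data or raise an error that says how many attempts were made.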

More information on Python

This example works with the Anaconda distribution of Python 2.7.

A short example: Loading IMF data into pandas

Below is a short working example of loading the Australian export price index time-series from International Financial Statistics (IFS) into a pandas dataframe directly from the IMF API.

In [1]:

# Example: loading IMF data into pandas

# Import libraries
import requests
import pandas as pd

# URL for the IMF JSON Restful Web Service, 
# IFS database, and Australian export prices series
url = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/IFS/Q.AU.PXP_IX.?startPeriod=1957&endPeriod=2016'

# Get data from the above URL using the requests package
data = requests.get(url).json()

# Load data into a pandas dataframe
auxp = pd.DataFrame(data['CompactData']['DataSet']['Series']['Obs'])

# Show the last five observations
auxp.tail()

Out [1]:

@BASE_YEAR @OBS_VALUE @TIME_PERIOD
232 2010 94.6044171093095 2015-Q1
233 2010 90.4668716801789 2015-Q2
234 2010 90.4668716801789 2015-Q3
235 2010 85.5465473860777 2015-Q4
236 2010 81.5208275090858 2016-Q1
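
To make the nesting in In [1] easier to see, here is a small sketch that unpacks a trimmed copy of the response by hand. The sample keeps only two of the observations shown above and only the keys the tutorial code uses; a real response probably carries more keys than this. The function name observations is illustrative.

```python
# A trimmed copy of the response, keeping two of the observations shown above.
sample = {
    'CompactData': {
        'DataSet': {
            'Series': {
                'Obs': [
                    {'@TIME_PERIOD': '2015-Q4', '@OBS_VALUE': '85.5465473860777', '@BASE_YEAR': '2010'},
                    {'@TIME_PERIOD': '2016-Q1', '@OBS_VALUE': '81.5208275090858', '@BASE_YEAR': '2010'},
                ]
            }
        }
    }
}

def observations(response):
    """Return (period, value) pairs from a single-series CompactData response."""
    obs = response['CompactData']['DataSet']['Series']['Obs']
    # Values arrive as strings, so convert them to floats here
    return [(o['@TIME_PERIOD'], float(o['@OBS_VALUE'])) for o in obs]

print(observations(sample))
```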

Breaking down the URL

The ‘key’ in our request is the URL, which contains instructions about which data we want.

The URL has three parts:

  1. http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/ is the base for requests using the JSON RESTful data service;
  2. IFS/ is the database ID, IFS, for International Financial Statistics;
  3. Q.AU.PXP_IX.?startPeriod=1957&endPeriod=2016 is the data dimension and time period information.

The third part, data dimension and time period information, is broken down on the IMF Web Service knowledge base as:

{item1 from dimension1}+{item2 from dimension1}+{item N from dimension1}.{item1 from dimension2}+{item2 from dimension2}+{item M from dimension2}?startPeriod={start date}&endPeriod={end date}

For guidance on finding dimension information and building your request, see my previous example of using the IMF API to retrieve Direction of Trade Statistics (DOTS) data. 
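
As a rough illustration of the breakdown above, the request URL can be taken apart with plain string operations. This is only a sketch: the variable names are illustrative, and the trailing dot in the key is simply dropped rather than interpreted.

```python
url = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/IFS/Q.AU.PXP_IX.?startPeriod=1957&endPeriod=2016'
base = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/'

# Parts 2 and 3 are what remains after the base
database, request = url[len(base):].split('/', 1)
key, period = request.split('?', 1)

# Dimensions are separated by '.', items within a dimension by '+'
dimensions = [d.split('+') for d in key.rstrip('.').split('.')]
params = dict(p.split('=') for p in period.split('&'))

print(database)    # IFS
print(dimensions)  # [['Q'], ['AU'], ['PXP_IX']]
print(params)
```

Here the three dimensions are the frequency (Q), the country (AU), and the indicator (PXP_IX).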

Cleaning up the dataframe

Let’s add more meaningful headers and set the date as our index.

In [2]:

# Rename columns
auxp.columns = ['baseyear','auxp','date']

# Set the price index series as a float (rather than string)
auxp.auxp = auxp.auxp.astype(float)

# Read the dates as quarters and set as the dataframe index
rng = pd.date_range(pd.to_datetime(auxp.date[0]), periods=len(auxp.index), freq='QS')
auxp = auxp.set_index(pd.DatetimeIndex(rng))
del auxp['date']

# Show last five rows
auxp.tail()

Out [2]:

baseyear auxp
2015-01-01 2010 94.604417
2015-04-01 2010 90.466872
2015-07-01 2010 90.466872
2015-10-01 2010 85.546547
2016-01-01 2010 81.520828
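
The cell above builds the index from the first date and the number of rows, which assumes that no quarter is missing from the series. If that assumption is a concern, each label can be converted on its own. A minimal sketch using only the standard library follows; quarter_start is an illustrative name.

```python
from datetime import date

def quarter_start(label):
    """Convert a label such as '2015-Q2' to the first day of that quarter."""
    year, quarter = label.split('-Q')
    month = 3 * (int(quarter) - 1) + 1   # Q1 -> 1, Q2 -> 4, Q3 -> 7, Q4 -> 10
    return date(int(year), month, 1)

labels = ['2015-Q1', '2015-Q2', '2015-Q3', '2015-Q4', '2016-Q1']
print([quarter_start(x).isoformat() for x in labels])
# ['2015-01-01', '2015-04-01', '2015-07-01', '2015-10-01', '2016-01-01']
```

The resulting dates match the index shown in Out [2], and a list of them could be passed to pd.to_datetime to build the dataframe index.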

Plot the data

Now we can use matplotlib to create a line plot showing the history of the Australian export price index.

In [3]:

# import matplotlib and pyplot
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline

# Create line plot and add labels and title
auxp.auxp.plot(grid=True, figsize=(9, 5), color="orange", linewidth=2,)
plt.ylabel('Index')
plt.xlabel('Year')
plt.title('Australia: Export Price Index (baseyear=' + str(auxp.baseyear.iloc[0]) + ')');

Out [3]:

[Figure: line plot of the Australian export price index]

Save as a csv file

To save the data for use in another program, you can easily create a csv file.

In [4]:

auxp.to_csv('auxp.csv')

Machine Reading IMF Data: Data Retrieval with Python 

Updated November 16, 2016

Introduction

The International Monetary Fund (IMF) Statistics Department (STA) allows API access to their economic time series. Well-known datasets such as International Financial Statistics (IFS) can be machine read through the API. This example will use Python to retrieve Direction of Trade Statistics (DOTS) data from STA’s JSON RESTful Web Service so that we can determine the United States’ share of world exports over the past 50 years.

The IMF knowledge base provides more information on the three available API formats and IMF data services. For more information on the work of STA, see their annual report (PDF), STA at a Glance 2015.

Gathering series and dimension information

First, we will need to import the requests and pandas libraries. These will allow us to open URLs, request information from the web, and read JSON data.

Libraries

In [1]:
# Import libraries
import requests
import pandas as pd

Since we are using the JSON RESTful API, we start by using the ‘Dataflow’ endpoint URL to look at what series are available and find the series id of interest. The full output is long, so I’ve removed the data unrelated to this example. The IMF has many more series than what is shown below.

Find Series Name

In [2]:
# Find the series id and text name.
url = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/Dataflow/'
seriesids = requests.get(url).json()
df = pd.DataFrame(seriesids['Structure']['Dataflows']['Dataflow'])
for x in range(6, 13):
     items = (str(df['@id'][x]), str(df['Name'][x]['#text']))
     print(': '.join(items))
DOT: Direction of Trade Statistics (DOTS)
FSIREM: Financial Soundness Indicators (FSI), Reporting Entities - Multidimensional
CDIS: Coordinated Direct Investment Survey (CDIS)
GFS01M: Government Finance Statistics (GFS 2001) - Multidimensional
GFS01: Government Finance Statistics (GFS 2001)
BOP: Balance of Payments (BOP)
BOPAGG: Balance of Payments (BOP), World and Regional Aggregates
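
The loop above relies on knowing that the series of interest sit at positions 6 to 12, which may change as the IMF adds or removes datasets. As a hedged alternative, the list can be searched by name. The sketch below uses two entries copied from the output above, and find_dataflows is an illustrative name.

```python
# Two entries copied from the output above, in the shape the loop reads them.
dataflows = [
    {'@id': 'DOT', 'Name': {'#text': 'Direction of Trade Statistics (DOTS)'}},
    {'@id': 'BOP', 'Name': {'#text': 'Balance of Payments (BOP)'}},
]

def find_dataflows(flows, keyword):
    """Return (id, name) pairs whose name contains keyword, ignoring case."""
    keyword = keyword.lower()
    return [(f['@id'], f['Name']['#text'])
            for f in flows if keyword in f['Name']['#text'].lower()]

print(find_dataflows(dataflows, 'trade'))
# [('DOT', 'Direction of Trade Statistics (DOTS)')]
```

With a real response, the first argument would be seriesids['Structure']['Dataflows']['Dataflow'].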

We found above that the id for Direction of Trade Statistics is DOT. We can use this id to read notes about the series. Next, we need to identify the dimensions of the data. For example, direction of trade data is based on a home country, a flow and measure, and a counterpart country. The data also has multiple frequencies and units of measurement. All of this information will be needed later to make our data request.

Find Series Details and Description

In [3]:
# Annotations for the series
url = "http://dataservices.imf.org/REST/SDMX_JSON.svc/DataStructure/DOT"
dotstruct = requests.get(url).json()
df = pd.DataFrame(dotstruct['Structure']['KeyFamilies']\
     ['KeyFamily']['Annotations'])
for x in range(0, 7): 
     items = (str(df['Annotation'][x]['AnnotationTitle']), \
     str(df['Annotation'][x]['AnnotationText']['#text']))
     print(': '.join(items))
Latest Update Date: 04/26/2016
Name: Direction of Trade Statistics (DOTS)
Temporal Coverage: Monthly and quarterly data are available starting 1960. Annual data are available starting 1947.
Geographic Coverage: DOTS covers 184 countries, the world, and major areas.
Methodology: Guide to Direction of Trade Statistics, 1993. See Documents tab.
Definition: The <B>Direction of Trade Statistics (DOTS)</B> presents current figures on the value of merchandise exports and imports disaggregated according to a country's primary trading partners. Area and world aggregates are included in the display of trade flows between major areas of the world. Reported data is supplemented by estimates whenever such data is not available or current. Imports are reported on a cost, insurance and freight (CIF) basis and exports are reported on a free on board (FOB) basis, with the exception of a few countries for which imports are also available FOB. Time series data includes estimates derived from reports of partner countries for non-reporting and slow-reporting countries.
Code: DOT
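
The Definition text above comes back with HTML tags in it. If plain text is preferred, a simple regular expression is usually enough for tags this simple. This is a sketch only, and strip_tags is an illustrative name.

```python
import re

def strip_tags(text):
    """Remove simple HTML tags such as <B> and </B> from annotation text."""
    return re.sub(r'</?[A-Za-z][^>]*>', '', text)

definition = 'The <B>Direction of Trade Statistics (DOTS)</B> presents current figures'
print(strip_tags(definition))
# The Direction of Trade Statistics (DOTS) presents current figures
```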

Find Series Dimensions

In [4]:
# Look at structure of DOTS data to find the dimensions for our data request
url = "http://dataservices.imf.org/REST/SDMX_JSON.svc/DataStructure/DOT"
dotstruct = requests.get(url).json()
df = pd.DataFrame(dotstruct['Structure']['KeyFamilies']['KeyFamily']\
          ['Components']['Dimension'])
for x in range(0, 4): 
     items = ("Dimension", str(x+1), str(df['@codelist'][x]))
     print(': '.join(items))
Dimension: 1: CL_FREQ
Dimension: 2: CL_AREA_DOT
Dimension: 3: CL_INDICATOR_DOT
Dimension: 4: CL_COUNTERPART_AREA_DOT

We can now copy the code for each dimension into the CodeList Method to get the list of possible values. For example, we will need to identify the value of the second dimension, CL_AREA_DOT, for the United States. Below, we show that the code is US. I’ve manually placed the index numbers for the U.S. and World codes (again to save space); however, you can replace [200, 248] with range(len(df)) to get the full list of country/area codes.

Find Country Codes

In [5]:
# Obtain country codes
url = "http://dataservices.imf.org/REST/SDMX_JSON.svc/CodeList/CL_AREA_DOT"
country = requests.get(url).json()
df = pd.DataFrame(country['Structure']['CodeLists']['CodeList']['Code'])
for x in [200, 248]: 
     items = (str(df['@value'][x]), str(df['Description'][x]['#text']))
     print(': '.join(items))
US: United States
W00: All Countries, excluding the IO
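
The positions 200 and 248 are specific to the code list at the time of writing. A more durable approach, sketched below, is to build a lookup from description to code. The two entries are copied from the output above, and code_by_name is an illustrative name.

```python
# Two entries copied from the output above, in the shape the loop reads them.
codes = [
    {'@value': 'US', 'Description': {'#text': 'United States'}},
    {'@value': 'W00', 'Description': {'#text': 'All Countries, excluding the IO'}},
]

# Map each description to its code so that codes can be looked up by name.
code_by_name = {c['Description']['#text']: c['@value'] for c in codes}

print(code_by_name['United States'])                    # US
print(code_by_name['All Countries, excluding the IO'])  # W00
```

With a real response, the list would be country['Structure']['CodeLists']['CodeList']['Code'].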

The series ID is DOT, and the country codes of interest are US for the United States and W00 for the world. We will use these codes for both the exporting country (CL_AREA_DOT) and the counterpart (CL_COUNTERPART_AREA_DOT). We see below that the indicator of interest is TXG_FOB_USD, Goods, Value of Exports, Free on board (FOB), US Dollars.

Find Column IDs

In [6]:
# Obtain series info and ids
url = "http://dataservices.imf.org/REST/SDMX_JSON.svc/CodeList/CL_INDICATOR_DOT"
series = requests.get(url).json()
df = pd.DataFrame(series['Structure']['CodeLists']['CodeList']['Code'])
for x in range(0, 4): 
     items = (str(df['@value'][x]), str(df['Description'][x]['#text']))
     print(': '.join(items))
TXG_FOB_USD: Goods, Value of Exports, Free on board (FOB), US Dollars
TMG_CIF_USD: Goods, Value of Imports, Cost, Insurance, Freight (CIF), US Dollars
TMG_FOB_USD: Goods, Value of Imports, Free on board (FOB), US Dollars
All Indicators: All Indicators

We repeat the above steps for each dimension and record which series values are of interest to us.

Retrieving Data

The guide to STA’s API shows how we can combine information from the previous steps to call and retrieve data. For direction of trade statistics, we see that the dimensions are as follows:

  • Dimension 1: CL_FREQ (the frequency of the data–we want to use monthly data) – M
  • Dimension 2: CL_AREA_DOT (the primary country) – US
  • Dimension 3: CL_INDICATOR_DOT (the measure–we want to look at exports free on board) – TXG_FOB_USD
  • Dimension 4: CL_COUNTERPART_AREA_DOT (the counterpart country) – W00

The JSON RESTful API method for requesting the data is the CompactData Method. The format for putting together dimension and time period information is shown on the Web Service knowledge base as:

http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/{database ID}/{item1 from dimension1}+{item2 from dimension1}+{item N from dimension1}.{item1 from dimension2}+{item2 from dimension2}+{item M from dimension2}?startPeriod={start date}&endPeriod={end date}

Putting all of this information together, the URL to retrieve a JSON dictionary for 1981-2016 US exports to the world data is:

http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/DOT/M.US.TXG_FOB_USD.W00.?startPeriod=1981&endPeriod=2016
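
The same URL can be assembled from the dimension order and the codes chosen for each dimension, which makes it harder to put a code in the wrong position. The following is a sketch under the assumption that the dimension order matches the DataStructure output above; the variable names are illustrative.

```python
BASE = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/'

# Dimension order as reported by the DataStructure request above.
order = ['CL_FREQ', 'CL_AREA_DOT', 'CL_INDICATOR_DOT', 'CL_COUNTERPART_AREA_DOT']

# Codes chosen for each dimension; several codes would be joined with '+'.
choices = {
    'CL_FREQ': ['M'],
    'CL_AREA_DOT': ['US'],
    'CL_INDICATOR_DOT': ['TXG_FOB_USD'],
    'CL_COUNTERPART_AREA_DOT': ['W00'],
}

key = '.'.join('+'.join(choices[name]) for name in order)
url = BASE + 'DOT/' + key + '.?startPeriod=1981&endPeriod=2016'
print(url)
```

Changing choices['CL_AREA_DOT'] to ['US', 'CN', 'JP', 'W00'] would produce a key that requests several countries at once.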

The python code which gets the data and saves it as a dictionary is as follows:

Request data from IMF API

In [7]:
url = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/DOT/M.US.TXG_FOB_USD.W00.?startPeriod=1981&endPeriod=2016'
data = requests.get(url).json()
usexp = pd.DataFrame(data['CompactData']['DataSet']['Series']['Obs'])
usexp.columns = ['status', 'usexports','date'];
usexp.tail()
Out[7]:
date usexports
415 2015-08 123065777734
416 2015-09 125394024247
417 2015-10 130599515853
418 2015-11 120731632371
419 2015-12 119907169367

We can repeat the above code with a different URL to obtain data on total world exports and the exports of other countries that we may want to compare with the United States. We can also combine the requests for several series into one URL by adding ‘+code2+code3’ to a dimension. For example, ‘US+JP+CN.TXG..’

Example of request with multiple columns

In [8]:
ourl = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/DOT/M.US+CN+JP+W00.TXG_FOB_USD.W00.?startPeriod=1972&endPeriod=2016'
odata = requests.get(ourl).json();

Cleaning the dataframe and naming columns

In [9]:
wexp = pd.DataFrame(odata['CompactData']['DataSet']['Series'][0]['Obs'])
wexp.columns = ['status','wexports','date']
del wexp['date']
del wexp['status']
chexp = pd.DataFrame(odata['CompactData']['DataSet']['Series'][1]['Obs'])
chexp.columns = ['status', 'chexports','date']
del chexp['date']
del chexp['status']
jpexp = pd.DataFrame(odata['CompactData']['DataSet']['Series'][2]['Obs'])
jpexp.columns = ['jpexports','date']
del jpexp['date']
usexp = pd.DataFrame(odata['CompactData']['DataSet']['Series'][3]['Obs'])
usexp.columns = ['status', 'usexports','date']
del usexp['status'];
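
The cell above depends on the order in which the series come back and on how many columns each one has, and the date column is kept for only one of them. As a hedged alternative, each series can be keyed by its country code and each value by its period, so that the series line up by date. The sketch assumes that every series carries an '@REF_AREA' attribute, which should be checked against a real response, and it treats a lone observation as an unwrapped dictionary as a precaution. The name series_by_area is illustrative.

```python
def series_by_area(response):
    """Return {area code: {period: value}} from a CompactData response.

    Assumes each series carries an '@REF_AREA' attribute naming its country;
    inspect the keys of a real response before relying on this.
    """
    series = response['CompactData']['DataSet']['Series']
    if isinstance(series, dict):      # a single series is not wrapped in a list
        series = [series]
    result = {}
    for s in series:
        obs = s.get('Obs', [])
        if isinstance(obs, dict):     # assumed: a lone observation is not wrapped either
            obs = [obs]
        result[s['@REF_AREA']] = {
            o['@TIME_PERIOD']: float(o['@OBS_VALUE'])
            for o in obs if '@OBS_VALUE' in o
        }
    return result

# Made-up response for illustration only; the numbers are not IMF data.
sample = {'CompactData': {'DataSet': {'Series': [
    {'@REF_AREA': 'W00', 'Obs': [
        {'@TIME_PERIOD': '2015-11', '@OBS_VALUE': '200', '@OBS_STATUS': 'E'},
        {'@TIME_PERIOD': '2015-12', '@OBS_VALUE': '400'}]},
    {'@REF_AREA': 'US', 'Obs': {'@TIME_PERIOD': '2015-12', '@OBS_VALUE': '50'}},
]}}}

print(series_by_area(sample))
```

Passing the result to pd.DataFrame should give one column per country, indexed by period, with missing values where a country has no observation.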

Now we combine the four series into one dataframe and tell our script to read the export value columns as numbers.

Read as numeric

In [10]:
combined = pd.concat([usexp, wexp, chexp, jpexp], axis=1)
# Use the dates as the dataframe index
combined = combined.set_index(pd.DatetimeIndex(combined['date']))
usexports = pd.to_numeric(combined.usexports)
wexports = pd.to_numeric(combined.wexports)
cexports = pd.to_numeric(combined.chexports)
jexports = pd.to_numeric(combined.jpexports)

Finally, we can calculate the U.S. percentage share of world exports. We simply divide U.S. exports by world exports and multiply by 100. If using the data for economic research, we would likely take the log forms and apply some filters.
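
The share calculation itself is simple arithmetic. The sketch below shows it on plain dictionaries keyed by period, skipping any period that is missing from either series or has a zero total. The numbers are made up for illustration and percent_share is an illustrative name.

```python
def percent_share(part, total):
    """Return {period: share in percent} for periods present in both series."""
    return {period: part[period] / float(total[period]) * 100
            for period in part
            if period in total and total[period] != 0}

# Made-up numbers for illustration only; they are not IMF data.
us = {'2015-11': 25.0, '2015-12': 50.0}
world = {'2015-11': 200.0, '2015-12': 400.0}
print(percent_share(us, world))
```

Both periods give 12.5 here, since 25 / 200 and 50 / 400 are each one eighth.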


Calculate share of world exports for each country

In [11]:
combined['usshare'] = usexports / wexports * 100
combined['chinashare'] = cexports / wexports * 100
combined['japanshare'] = jexports / wexports * 100
combined.tail()

Out[11]:

date usshare chinashare japanshare
2015-08 9.460 15.121 3.668
2015-09 8.830 14.455 3.754
2015-10 9.330 13.737 3.892
2015-11 9.018 14.663 3.645
2015-12 8.776 16.362 3.804

Graphing the data

Let’s use matplotlib to view the result of our work.

Graph of US share of world exports

In [12]:
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
txt = '''Source: International Monetary Fund.'''

# Plot US share of world exports
combined.usshare.plot(grid=True, figsize=(9, 5), color="blue", linewidth=2,)
plt.ylabel('percentage of world exports')
plt.xlabel('Year')
plt.text(20,4.5,txt)
plt.title('U.S. Share of World Exports');

The graph shows a decrease in the U.S. share of exports from nearly 20 percent in 1966 to roughly 9 percent in 2015. We can also easily examine how changes in the U.S. share of exports compare with changes in the share of Japan and China.

Graph of moving average of US, China, and Japan shares of world exports

In [13]:
# Calculate moving average for each share, to reduce noise
combined['ussharema'] = combined['usshare'].rolling(12,12).mean()
combined['chsharema'] = combined['chinashare'].rolling(12,12).mean()
combined['jpsharema'] = combined['japanshare'].rolling(12,12).mean()

combshares = combined[['ussharema', 'chsharema', 'jpsharema']]
shares = list(combshares);
# Plot various shares of world exports
combined[shares][120:].plot(grid=True, figsize=(9, 5), linewidth=2)
plt.ylabel('percentage of world exports')
plt.xlabel('Year')
plt.text(150,-2,txt)
plt.title('Share of World Exports', );
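
In the cell above, rolling(12,12) asks pandas for a window of 12 months with a minimum of 12 observations, so the first 11 rows of each moving average are empty. The plain Python sketch below shows the same idea on a short list with a window of 3; trailing_mean is an illustrative name and is not how pandas implements it.

```python
def trailing_mean(values, window=12):
    """Mean of the current and previous window-1 values; None until a full window exists."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)   # not enough observations yet
        else:
            chunk = values[i + 1 - window:i + 1]
            out.append(sum(chunk) / float(window))
    return out

print(trailing_mean([1, 2, 3, 4, 5], window=3))
# [None, None, 2.0, 3.0, 4.0]
```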

Export dataset to .csv

Let’s save the dataset in a portable format that can be read by any statistical software. My preference is to create a .csv file, which I will use for my U.S. Macroeconomic and Markets Dashboard.

Create csv file

In [14]:
combined.to_csv('us_share_exports.csv')

 

Machine reading IMF data with Python: an example

UPDATE June 16, 2016: The IMF API has changed. My updated tutorial is here:

https://briandew.wordpress.com/2016/05/01/machine-reading-imf-data-data-retrieval-with-python/


Download PDF Example – Machine Reading IMF Data with Python

During the past 18 months, the International Monetary Fund has made all of its data available free of charge and developed an API to allow access to the data. This means data from sources such as International Financial Statistics (IFS) can be collected without the need for manual download.

I’ve written a crude example using Python. The example retrieves IMF Direction of Trade Statistics to show the declining U.S. share of world exports and the rapid rise of China’s export-led economy.


If you are getting started with Python for economic research, I recommend the free version of the Enthought Canopy deployment, which integrates several useful tools.