Using the IMF Data API: Retrieving an IFS series with Python

April 5, 2017 Update: I’ve put together a newer version of the code/guide located here.

The following tutorial shows how to use Python's requests package to get data directly from the International Monetary Fund (IMF). The IMF's application programming interface (API) provides access to economic and financial data for more than 180 countries over more than 60 years.

More information on the IMF Data API

The IMF offers guidance on using their data services. The API is not stable, so check the IMF data services news if you receive error messages.

More information on Python

This example works with the Anaconda distribution of Python 2.7.

A short example: Loading IMF data into pandas

Below is a short working example of loading the Australian export price index time-series from International Financial Statistics (IFS) into a pandas dataframe directly from the IMF API.

In [1]:

# Example: loading IMF data into pandas

# Import libraries
import requests
import pandas as pd

# URL for the IMF JSON Restful Web Service, 
# IFS database, and Australian export prices series
url = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/IFS/Q.AU.PXP_IX.?startPeriod=1957&endPeriod=2016'

# Get data from the above URL using the requests package
data = requests.get(url).json()

# Load data into a pandas dataframe
auxp = pd.DataFrame(data['CompactData']['DataSet']['Series']['Obs'])

# Show the last five observations
auxp.tail()

Out [1]:

@BASE_YEAR @OBS_VALUE @TIME_PERIOD
232 2010 94.6044171093095 2015-Q1
233 2010 90.4668716801789 2015-Q2
234 2010 90.4668716801789 2015-Q3
235 2010 85.5465473860777 2015-Q4
236 2010 81.5208275090858 2016-Q1

Breaking down the URL

The ‘key’ in our request is the URL, which contains instructions about which data we want.

The URL has three parts,

  1. http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/ The base for requests using the JSON restful data service;
  2. IFS/ the database ID, IFS, for International Financial Statistics;
  3. Q.AU.PXP_IX.?startPeriod=1957&endPeriod=2016 the data dimension and time period information.

The third part, data dimension and time period information, is broken down on the IMF Web Service knowledge base as:

{item1 from dimension1}+{item2 from dimension1}+...+{itemN from dimension1}.{item1 from dimension2}+{item2 from dimension2}+...+{itemM from dimension2}?startPeriod={start date}&endPeriod={end date}

For guidance on finding dimension information and building your request, see my previous example of using the IMF API to retrieve Direction of Trade Statistics (DOTS) data. 
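
As a quick sketch, the three URL parts can be pieced together with basic string formatting; the variable names below are my own:

```python
# Assemble the request URL from its three parts (values from the example above)
base = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/'
database = 'IFS'      # database ID: International Financial Statistics
key = 'Q.AU.PXP_IX'   # frequency.area.indicator
period = '?startPeriod=1957&endPeriod=2016'

url = '{}{}/{}.{}'.format(base, database, key, period)
print(url)
```

Changing any one part (for example, swapping the indicator code) yields a request for a different series without touching the rest of the URL.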

Cleaning up the dataframe

Let’s add more meaningful headers and set the date as our index

In [2]:

# Rename columns
auxp.columns = ['baseyear','auxp','date']

# Set the price index series as a float (rather than string)
auxp.auxp = auxp.auxp.astype(float)

# Read the dates as quarters and set as the dataframe index
rng = pd.date_range(pd.to_datetime(auxp.date[0]), periods=len(auxp.index), freq='QS')
auxp = auxp.set_index(pd.DatetimeIndex(rng))
del auxp['date']

# Show last five rows
auxp.tail()

Out [2]:

baseyear auxp
2015-01-01 2010 94.604417
2015-04-01 2010 90.466872
2015-07-01 2010 90.466872
2015-10-01 2010 85.546547
2016-01-01 2010 81.520828
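
With a proper DatetimeIndex in place, pandas time-series methods apply directly. A minimal sketch with a toy frame shaped like auxp (the values are illustrative, not IMF data) computes the year-over-year change:

```python
import pandas as pd

# Toy quarterly series standing in for auxp (illustrative values)
idx = pd.date_range('2015-01-01', periods=5, freq='QS')
auxp = pd.DataFrame({'auxp': [94.6, 90.5, 90.5, 85.5, 81.5]}, index=idx)

# Year-over-year percent change: compare each quarter with four quarters prior
auxp['yoy'] = auxp['auxp'].pct_change(4) * 100
print(auxp['yoy'].iloc[-1])  # roughly -13.8 percent
```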

Plot the data

Now we can use matplotlib to create a line plot showing the history of the Australian export price index.

In [3]:

# import matplotlib and pyplot
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline

# Create line plot and add labels and title
auxp.auxp.plot(grid=True, figsize=(9, 5), color="orange", linewidth=2,)
plt.ylabel('Index')
plt.xlabel('Year')
plt.title('Australia: Export Price Index (baseyear=' + str(auxp.baseyear[0]) + ')');

Out [3]:

Output from script

Save as a csv file

To save the data for use in another program, you can easily create a csv file.

In [4]:

auxp.to_csv('auxp.csv')
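
To confirm the round trip, the file can be read back with the dates restored to the index; a small self-contained sketch (toy values and a hypothetical filename):

```python
import pandas as pd

# Toy frame standing in for auxp (illustrative values, hypothetical filename)
idx = pd.date_range('2015-01-01', periods=2, freq='QS')
demo = pd.DataFrame({'auxp': [94.6, 90.5]}, index=idx)
demo.to_csv('auxp_demo.csv')

# Reload, restoring the first column as a DatetimeIndex
back = pd.read_csv('auxp_demo.csv', index_col=0, parse_dates=True)
print(back.index[0])
```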

Global trade network analysis with Python: central players in bluefin tuna and large aircraft

Network analysis provides useful insights into complex bilateral trade data. Two methods are presented for calculating with Python each country’s influence in the global trade network for individual goods. Related concepts in graph and international trade theories are discussed.

Modern goods have complex trade networks

The things we buy increasingly travel long distances and from scattered origins before they reach us. Take one man’s repeatedly failed attempt to build a toaster (from scratch). Over several decades companies have changed their production techniques, relying on global value chains, for example, to keep costs low. These changes have gradually contributed to long-term growth in global trade.

The more deeply a country becomes involved in global trade, and in global supply chains in particular, the more subject its economy becomes to changes abroad. This can be good; historically, many powerful cities began as ports. However, the potential for higher returns from servicing foreign demand carries with it an increased risk of economic contagion.

It is not hard to imagine how global supply chain connections transmit effects from other countries. If a strike in France delays the delivery of a crucial intermediate good, it may cause an assembly line stoppage in Taiwan. On an aggregate level, the recent slowdown in global trade results in vast shifts of wealth.

It may therefore prove useful to examine the complex networks of global trade using the tools provided largely by graph theory. As an example, let’s start with a graph of the global trade of tires in 2012.

[Figure: global trade network for automobile tires (HS 401110)]
Trade networks are complex and large. Data source: UN Comtrade

Each country that exported or imported automobile tires in 2012 is represented above by one node labeled with its three letter country code (for example Germany is DEU). The precise location of a node on the graph is not critical (it is often arbitrary), but generally countries more central to the trade of tires are closer to the center of the network. Likewise, countries are generally graphed near their largest trading partners.

Each trading relationship is shown on the graph as an edge (a line connecting two nodes). If France exports tires to Aruba, the graph will include an edge connecting the two nodes labeled FRA and ABW. Trade network edges are considered directed, as the flow of goods has a direction (either imports or exports).
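
A minimal sketch of how one such directed, weighted edge looks in code, using NetworkX (the weight here is made up):

```python
import networkx as nx

# Directed trade graph: each edge points from exporter to importer
G = nx.DiGraph()
G.add_edge('FRA', 'ABW', weight=1000.0)  # France exports tires to Aruba

print(G.has_edge('FRA', 'ABW'))  # True: goods flow FRA -> ABW
print(G.has_edge('ABW', 'FRA'))  # False: the reverse edge is a separate flow
```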

Rat’s nest or Rorschach?

You may look at the above 'visualization' and simply see a rat's nest. This is a correct interpretation. The graph shows the overall complexity of the trade network, not individual bilateral relationships (there are more than 4,400 edges in this network). Indeed, the automobile tire trade network is particularly large and dense: many countries produce internationally competitive tires, and every country uses them and imports at least some. In fact, the average country imports tires from many other countries. The resulting graph is reminiscent of a gray blob, and practically as useful.

More useful, however, are individual metrics of network structure. For example, which countries tend to trade only with a select subgroup of other countries? Which goods are traded in networks where one country dominates trade? These questions relate theoretically to the respective graph theory concepts of clustering and centrality.

Let’s take a look at how the Python programming language can be used to measure centrality in trade networks, and discuss two specific measures of centrality.

Python for trade network analysis

What follows is a more technical segment with sample code for trade network analysis using Python 2.7.

Let’s start by importing the packages

In [1]:

import networkx as nx
import csv
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.axes_grid1 import make_axes_locatable
%matplotlib inline

We will rely heavily on NetworkX, imported under the short name nx. NumPy is used for certain calculations, and matplotlib helps with the visualizations.

Load the data and build the network

The example uses cleaned bilateral UN Comtrade trade data for scrap aluminum exports in 2012. The data follow the HS2002 classification system at the six-digit level of aggregation, and are sourced from WITS (subscription required for bulk download). Data are read from a csv file with the equivalent of three ‘columns’: the exporting country code, the importing country code, and the inflation-adjusted US Dollar value of exports in the one year period.

Data from the csv file are read line by line to build the network quickly. NetworkX is used to build the network, called G by convention, as a series of edges.

Read the data and build a network called G

In [2]:

G = nx.DiGraph() # create a directed graph called G
 
# Loop reads a csv file with scrap aluminum bilateral trade data
with open('760200_2012.csv', 'r') as csvfile:

     csv_f = csv.reader(csvfile)
     csv_f.next()  # skip the header row (Python 2 syntax)

# Now we build the network by adding each row of data 
# as an edge between two nodes (columns 1 and 2).
     for row in csv_f:
          G.add_edge(row[0],row[1],weight=row[2])

Let’s look at a specific bilateral trade relationship to verify that the new network, G, is correct. Exports of scrap aluminum from the U.S. to China should be quite large in 2012.

Check individual trade flow (edge)

In [3]:

usachnexp = G.edge['USA']['CHN']['weight']
print 'USA 2012 scrap aluminum exports to China, in USD: ' + str(usachnexp)
USA 2012 scrap aluminum exports to China, in USD: 1199682944
Central players can affect the market

Now that the network has been built, we can use indicators from graph theory to identify potential weaknesses and risks in the network’s structure. In this example, we will look for the presence of dominant countries in the trade network. Dominant importers or exporters have the ability to influence supply and demand and therefore price. These dominant countries are highly influential players in the trade network, a characteristic measured in graph theory as centrality.

There are several measures of centrality and two are discussed briefly in this post. The first is eigenvector centrality, which iteratively computes the weighted and directed centrality of a node based on the centrality scores of its connections. The example below scores each importer country as a function of the import-value-weighted scores of its trading partners. That is, an importer is considered influential to a trade network (receives a high eigenvector centrality score) if it imports a lot from countries that are also influential. Mathematically, eigenvector centrality computes the left or right (left is import centrality, right is export centrality) principal eigenvector for the network matrix.

See the NetworkX documentation for questions on the code, or this for more details on the math.
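
To see the left/right (import/export) distinction concretely, here is a toy sketch with made-up weights. It uses nx.eigenvector_centrality, the power-iteration sibling of the numpy-based function used in this post; reversing the graph swaps import centrality for export centrality:

```python
import networkx as nx

# Toy directed trade network: edges point exporter -> importer (made-up weights)
G = nx.DiGraph()
G.add_edge('A', 'C', weight=5.0)
G.add_edge('B', 'C', weight=3.0)
G.add_edge('C', 'A', weight=1.0)
G.add_edge('A', 'B', weight=0.5)

# For directed graphs, NetworkX scores nodes on in-edges: import centrality
imp = nx.eigenvector_centrality(G, weight='weight', max_iter=1000)

# Reversing every edge scores nodes on out-edges instead: export centrality
exp = nx.eigenvector_centrality(G.reverse(), weight='weight', max_iter=1000)

print(max(imp, key=imp.get))  # 'C' receives the most weighted imports
print(max(exp, key=exp.get))  # 'A' ships the most weighted exports
```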

Calculate eigenvector centrality of imports

In [4]:

# Calculate eigenvector centrality of matrix G 
# with the exports value as weights
ec = nx.eigenvector_centrality_numpy(G, weight='weight')

# Set this as a node attribute for each node
nx.set_node_attributes(G, 'cent', ec)

# Use this measure to determine the node color in viz
node_color = [float(G.node[v]['cent']) for v in G]
Calculate total exports

Next we calculate each country’s total exports of scrap aluminum in 2012 as the sum total of its individual exports (edges) to other nodes. In the script, total export data is assigned as a node attribute and set aside to be used as the node size in the visualization.

Calculate each country’s total exports

In [5]:

# Blank dictionary to store total exports
totexp = {}

# Calculate total exports of each country in the network
for exp in G.nodes(): 
     tx = sum([float(g) for e, f, g in G.out_edges_iter(exp, 'weight')])
     totexp[exp] = tx

# Average total exports across countries, computed once after the loop
# (placing it inside the loop would leave only the last country's value)
avgexp = np.mean(totexp.values())
nx.set_node_attributes(G, 'totexp', totexp)

# Use the results later for the node's size in the graph
node_size = [float(G.node[v]['totexp']) / avgexp for v in G]
Visualization of the scrap aluminum network

NetworkX works well with matplotlib to produce the spring layout visualization. It is another rat’s nest, but you may notice a different color on one of the medium-sized nodes.

Create graph using NetworkX and matplotlib

In [6]:

# Visualization
# Calculate position of each node in G using networkx spring layout
pos = nx.spring_layout(G,k=30,iterations=8) 

# Draw nodes
nodes = nx.draw_networkx_nodes(G,pos, node_size=node_size, \
                               node_color=node_color, alpha=0.5) 
# Draw edges
edges = nx.draw_networkx_edges(G, pos, edge_color='lightgray', \
                               arrows=False, width=0.05,)

# Add labels
nx.draw_networkx_labels(G,pos,font_size=5)
nodes.set_edgecolor('gray')

# Add labels and title
plt.text(0,-0.1, \
         'Node color is eigenvector centrality; \
         Node size is value of global exports', \
         fontsize=7)
plt.title('Scrap Aluminum trade network, 2012', fontsize=12)

# Bar with color scale for eigenvalues
cbar = plt.colorbar(mappable=nodes, cax=None, ax=None, fraction=0.015, pad=0.04)
cbar.set_clim(0, 1)

# Plot options
plt.margins(0,0)
plt.axis('off')

# Save as high quality png
plt.savefig('760200.png', dpi=1000)
[Figure: scrap aluminum trade network (HS 760200)]
China (CHN) is influential to scrap aluminum trade in 2012. Data source: UN Comtrade
Central players on the demand side: scrap aluminum and bluefin tuna

The graph above shows plenty of large exporters (the large nodes) of scrap aluminum in 2012, including the US, Hong Kong (HKG), and Germany (DEU). The demand of one country in the network, however, actually dominates the market. In 2012, the booming Chinese economy was purchasing large quantities of industrial metals, including scrap metals. The surge in demand from China was enough to cause global price increases and lead to increased levels of recycling. Since 2012, however, Chinese imports of scrap aluminum have nearly halved, as has the market price of aluminum. The recent boom-and-bust cycle in scrap aluminum prices has a single country of origin but global ripples; the downturn generates domestic consequences for the large exporters and reduces the financial incentives for recycling.

The central influence of China in the 2012 scrap aluminum trade network is captured by its high eigenvector centrality score (node color in the graph above). We can also easily infer the outsized influence of China from a simpler measure: the high value of its imports relative to other countries. Centrality metrics, of which there are many, often prove useful in nuanced cases.

[Figure: scrap aluminum import centrality (HS 760200)]
Chinese demand for scrap aluminum dominates the market in 2012. Data source: UN Comtrade

Another example of a central influence on a trade network can be found in Japanese demand for bluefin tuna. As shown below, Japan has very high eigenvector centrality for imports of this key ingredient in many sushi dishes.

[Figure: bluefin tuna trade network (HS 030346)]

Australia (AUS) dominates bluefin tuna exports, but by eigenvector import centrality Japan (JPN) is the influential player in the market. The first Tokyo tuna auction of 2013 saw one fish fetch a record 1.76 million USD. Indeed in 2012 Japan imported more than 100 times as much bluefin tuna as the second largest importer, Korea.

Like scrap aluminum, the story here follows the familiar boom-and-bust cycle; prices for bluefin tuna have returned to lower levels since 2012. The structure of the trade network, with one central player, introduces a higher level of price volatility. During a downturn in prices, this transmits financial consequences to fishermen throughout the world.

Supply-side influential players: large aircraft production

Trade network analysis can also help to identify influential exporters of goods. Cases that come to mind are rare earth minerals found only in certain countries, or large and complex transportation equipment. The number of commercial aircraft manufacturers, for example, is limited (though this may have more to do with subsidies than with any scarcity of technological prowess). Very large aircraft production is dominated by two firms: Airbus, with production sites primarily in France and Germany, and its U.S. competitor, Boeing.

Instead of using eigenvector centrality to measure the influence of each exporting country in the large aircraft global trade network, let's use a simpler method called outdegree centrality. We compute outdegree centrality for each country, i, as its number of outgoing (exporting) connections, k^{out}_i, divided by the total number of possible importers, (v - 1):

c^{out}_D(i) = k^{out}_i / (v - 1)

You can think of this measure as the share of importers that are serviced by each exporter. Nodes with a high outdegree centrality are considered influential exporters in the network.
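
A toy check of that formula (a made-up four-country network; NetworkX's out_degree_centrality performs the division by v - 1 for us):

```python
import networkx as nx

# Four countries; 'A' exports to two of the three possible importers
G = nx.DiGraph()
G.add_edges_from([('A', 'B'), ('A', 'C'), ('B', 'C'), ('D', 'A')])

oc = nx.out_degree_centrality(G)
print(oc['A'])  # k_out / (v - 1) = 2 / 3
```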

Calculate outdegree centrality

In [7]:

oc = nx.out_degree_centrality(G) # replaces ec in the above
[Figure: large aircraft trade network (HS 880240)]
France, Germany, and the US dominate exports of large aircraft. Data source: UN Comtrade

As expected, France, Germany, and the U.S. receive high outdegree centrality scores. There simply aren’t many alternative countries from which to buy your large aircraft. Beyond lack of choice for buyers, central exporters in a trade network may introduce (or represent) vulnerability and barriers to competition.

[Figure: large aircraft outdegree centrality (HS 880240)]
Large aircraft suppliers are limited. Data for 2012, source: UN Comtrade
Network structure and (preventing) domestic consequences

Global trade is increasingly complex. Open economies are vulnerable to supply and demand shocks from the other countries in their trade network. The structure of the trade network itself determines in part the level of vulnerability and how and where supply and demand shocks may be transmitted. Certain trade networks, such as those for scrap aluminum or bluefin tuna, face dominant consumers and additional price volatility. Networks can also be subject to supply-side market structure issues, such as the virtual duopoly with very large aircraft.

Hindsight makes bubbles more visible; we easily find the previously missed warning signs once we know where to look. Decision makers aim for early detection of vulnerabilities, but face a geographically growing set of possible sources. Network analysis tools, such as centrality, can be applied to existing sets of complex bilateral trade data to provide new insight in the search for today’s warning signs. Such nontraditional tools may prove increasingly useful in a world where an individual is not capable of building a toaster from scratch, yet they sell down the street for $11.99.

Additional resources and reading
For fun:

The Toaster Project

Some references and further reading on networks and graph theory:

Easley and Kleinberg (2010) Networks, Crowds, and Markets: Reasoning about a Highly Connected World

Jackson (2010) Social and Economic Networks

Reading on trade and trade networks:

De Benedictis (2013) Network Analysis of World Trade using the BACI-CEPII dataset

Feenstra (2005) World Trade Flows

Nemeth (1985) International Trade and World-System Structure: A Multiple Network Analysis

Trade data source:

World Integrated Trade Solution (WITS)

Some python related resources:

Anaconda distribution for python

NetworkX

ComputSimu: Networks 1: Scraping + Data visualization + Graph stats (more useful code here)

Machine Reading IMF Data: Data Retrieval with Python 

Updated November 16, 2016

Introduction

The International Monetary Fund (IMF) Statistics Department (STA) allows API access to their economic time series. Well-known datasets such as International Financial Statistics (IFS) can be machine read through the API. This example will use Python to retrieve Direction of Trade Statistics (DOTS) data from STA’s JSON RESTful Web Service so that we can determine the United States’ share of world exports over the past 50 years.

The IMF knowledge base provides more information on the three available API formats and IMF data services. For more information on the work of STA, see their annual report (PDF), STA at a glance 2015.

Gathering series and dimension information

First, we will need to import the requests and pandas libraries. These will allow us to read JSON data, open URLs, and request information from the web.

Libraries

In [1]:
# Import libraries
import requests
import pandas as pd

Since we are using the JSON RESTful API, we start by using the ‘Dataflow’ endpoint URL to look at what series are available and find the series id of interest. The full output is long, so I’ve removed the data unrelated to this example. The IMF has many more series than what is shown below.

Find Series Name

In [2]:
# Find the series id and text name.
url = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/Dataflow/'
seriesids = requests.get(url).json()
df = pd.DataFrame(seriesids['Structure']['Dataflows']['Dataflow'])
for x in range(6, 13):
     items = (str(df['@id'][x]), str(df['Name'][x]['#text']))
     print ': '.join(items)
DOT: Direction of Trade Statistics (DOTS)
FSIREM: Financial Soundness Indicators (FSI), Reporting Entities - Multidimensional
CDIS: Coordinated Direct Investment Survey (CDIS)
GFS01M: Government Finance Statistics (GFS 2001) - Multidimensional
GFS01: Government Finance Statistics (GFS 2001)
BOP: Balance of Payments (BOP)
BOPAGG: Balance of Payments (BOP), World and Regional Aggregates

We found above that the id for Direction of Trade Statistics is DOT. We can use this id to read notes about the series. We will next need to identify the dimensions of the data. For example, direction of trade data is based on a home country, a flow and measure, and a counterpart country. The data also have multiple frequencies and units of measurement. All of this information will be needed later to make our data request.

Find Series Details and Description

In [3]:
# Annotations for the series
url = "http://dataservices.imf.org/REST/SDMX_JSON.svc/DataStructure/DOT"
dotstruct = requests.get(url).json()
df = pd.DataFrame(dotstruct['Structure']['KeyFamilies']\
     ['KeyFamily']['Annotations'])
for x in range(0, 7): 
     items = (str(df['Annotation'][x]['AnnotationTitle']), \
     str(df['Annotation'][x]['AnnotationText']['#text']))
     print ': '.join(items)
Latest Update Date: 04/26/2016
Name: Direction of Trade Statistics (DOTS)
Temporal Coverage: Monthly and quarterly data are available starting 1960. Annual data are available starting 1947.
Geographic Coverage: DOTS covers 184 countries, the world, and major areas.
Methodology: Guide to Direction of Trade Statistics, 1993. See Documents tab.
Definition: The <B>Direction of Trade Statistics (DOTS)</B> presents current figures on the value of merchandise exports and imports disaggregated according to a country's primary trading partners. Area and world aggregates are included in the display of trade flows between major areas of the world. Reported data is supplemented by estimates whenever such data is not available or current. Imports are reported on a cost, insurance and freight (CIF) basis and exports are reported on a free on board (FOB) basis, with the exception of a few countries for which imports are also available FOB. Time series data includes estimates derived from reports of partner countries for non-reporting and slow-reporting countries.
Code: DOT

Find Series Dimensions

In [4]:
# Look at structure of DOTS data to find the dimensions for our data request
url = "http://dataservices.imf.org/REST/SDMX_JSON.svc/DataStructure/DOT"
dotstruct = requests.get(url).json()
df = pd.DataFrame(dotstruct['Structure']['KeyFamilies']['KeyFamily']\
          ['Components']['Dimension'])
for x in range(0, 4): 
     items = ("Dimension", str(x+1), str(df['@codelist'][x]))
     print ': '.join(items)
Dimension: 1: CL_FREQ
Dimension: 2: CL_AREA_DOT
Dimension: 3: CL_INDICATOR_DOT
Dimension: 4: CL_COUNTERPART_AREA_DOT

We can now copy the code for each dimension into the CodeList Method to get the list of possible values. For example, we will need to identify the value of the second dimension, CL_AREA_DOT, for the United States. Below, we show that the code is US. I've manually placed the index numbers for the U.S. and World codes (again to save space); replace the list [200, 248] with the full index range to print every country/area code.

Find Country Codes

In [5]:
# Obtain country codes
url = "http://dataservices.imf.org/REST/SDMX_JSON.svc/CodeList/CL_AREA_DOT"
country = requests.get(url).json()
df = pd.DataFrame(country['Structure']['CodeLists']['CodeList']['Code'])
for x in [200, 248]: 
     items = (str(df['@value'][x]), str(df['Description'][x]['#text']))
     print ': '.join(items)
US: United States
W00: All Countries, excluding the IO

The series ID is DOT and the country codes (we will use this with the exporting country, CL_AREA_DOT, and the counterpart, CL_COUNTERPART_AREA_DOT) of interest are W00 for world and US for the US. We see below that the indicator of interest is TXG_FOB_USD, Goods, Value of Exports, Free on board (FOB), US Dollars.

Find Column IDs

In [6]:
# Obtain series info and ids
url = "http://dataservices.imf.org/REST/SDMX_JSON.svc/CodeList/CL_INDICATOR_DOT"
series = requests.get(url).json()
df = pd.DataFrame(series['Structure']['CodeLists']['CodeList']['Code'])
for x in range(0, 4): 
     items = (str(df['@value'][x]), str(df['Description'][x]['#text']))
     print ': '.join(items)
TXG_FOB_USD: Goods, Value of Exports, Free on board (FOB), US Dollars
TMG_CIF_USD: Goods, Value of Imports, Cost, Insurance, Freight (CIF), US Dollars
TMG_FOB_USD: Goods, Value of Imports, Free on board (FOB), US Dollars
All Indicators: All Indicators

We repeat the above steps for each dimension and record which series values are of interest to us.
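
As a sketch, the recorded codes can then be assembled into the key portion of a request; build_key is my own helper name, not part of the API:

```python
# Hypothetical helper: join one code per dimension, in order, with periods
def build_key(freq, area, indicator, counterpart):
    return '.'.join([freq, area, indicator, counterpart])

key = build_key('M', 'US', 'TXG_FOB_USD', 'W00')
print(key)  # M.US.TXG_FOB_USD.W00
```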

Retrieving Data

The guide to STA’s API shows how we can combine information from the previous steps to call and retrieve data. For direction of trade statistics, we see that the dimensions are as follows:

  • Dimension 1: CL_FREQ (the frequency of the data–we want to use monthly data) – M
  • Dimension 2: CL_AREA_DOT (the primary country) – US
  • Dimension 3: CL_INDICATOR_DOT (the measure–we want to look at exports free on board) – TXG_FOB_USD
  • Dimension 4: CL_COUNTERPART_AREA_DOT (the counterpart country) – W00

The JSON RESTful API method for requesting the data is the CompactData Method. The format for putting together dimension and time period information is shown on the Web Service knowledge base as:

http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/{database ID}/{item1 from dimension1}+{item2 from dimension1}+...+{itemN from dimension1}.{item1 from dimension2}+{item2 from dimension2}+...+{itemM from dimension2}?startPeriod={start date}&endPeriod={end date}

Putting all of this information together, the URL to retrieve a JSON dictionary of 1981-2016 US exports to the world data is:

http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/DOT/M.US.TXG_FOB_USD.W00.?startPeriod=1981&endPeriod=2016

The python code which gets the data and saves it as a dictionary is as follows:

Request data from IMF API

In [7]:
url = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/DOT/M.US.TXG_FOB_USD.W00.?startPeriod=1981&endPeriod=2016'
data = requests.get(url).json()
usexp = pd.DataFrame(data['CompactData']['DataSet']['Series']['Obs'])
usexp.columns = ['status', 'usexports','date'];
usexp.tail()
Out[7]:
date usexports
415 2015-08 123065777734
416 2015-09 125394024247
417 2015-10 130599515853
418 2015-11 120731632371
419 2015-12 119907169367

We can repeat the above code with a different URL to obtain data on total world exports and the exports of other countries we may want to compare with the United States. We combine the request for several series into one URL by adding '+code2+code3'. For example, 'US+JP+CN.TXG..'
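
A quick sketch of that '+' joining for the area dimension of the key:

```python
# Join several area codes with '+' inside one dimension of the request key
countries = ['US', 'CN', 'JP', 'W00']
key = 'M.{}.TXG_FOB_USD.W00'.format('+'.join(countries))
print(key)  # M.US+CN+JP+W00.TXG_FOB_USD.W00
```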

Example of request with multiple columns

In [8]:
ourl = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/DOT/M.US+CN+JP+W00.TXG_FOB_USD.W00.?startPeriod=1972&endPeriod=2016'
odata = requests.get(ourl).json();

Cleaning the dataframe and naming rows

In [9]:
wexp = pd.DataFrame(odata['CompactData']['DataSet']['Series'][0]['Obs'])
wexp.columns = ['status','wexports','date']
del wexp['date']
del wexp['status']
chexp = pd.DataFrame(odata['CompactData']['DataSet']['Series'][1]['Obs'])
chexp.columns = ['status', 'chexports','date']
del chexp['date']
del chexp['status']
jpexp = pd.DataFrame(odata['CompactData']['DataSet']['Series'][2]['Obs'])
jpexp.columns = ['jpexports','date']
del jpexp['date']
usexp = pd.DataFrame(odata['CompactData']['DataSet']['Series'][3]['Obs'])
usexp.columns = ['status', 'usexports','date']
del usexp['status'];

Now we combine the four series into one dataframe and tell our script to read the export value columns as numbers.

Read as numeric

In [10]:
combined = pd.concat([usexp, wexp, chexp, jpexp], axis=1)
pd.to_datetime(combined.date)
combined = combined.set_index(pd.DatetimeIndex(combined['date']))
usexports = pd.to_numeric(combined.usexports)
wexports = pd.to_numeric(combined.wexports)
cexports = pd.to_numeric(combined.chexports)
jexports = pd.to_numeric(combined.jpexports)

Finally, we can calculate the U.S. percentage share of world exports. We simply divide U.S. exports by world exports and multiply by 100. If using the data for economic research, we would likely take logs and apply some filters.


Calculate share of world exports for each country

In [11]:
combined['usshare'] = usexports / wexports * 100
combined['chinashare'] = cexports / wexports * 100
combined['japanshare'] = jexports / wexports * 100
combined.tail()

Out[11]:

date usshare chinashare japanshare
2015-08 9.460 15.121 3.668
2015-09 8.830 14.455 3.754
2015-10 9.330 13.737 3.892
2015-11 9.018 14.663 3.645
2015-12 8.776 16.362 3.804

Graphing the data

Let’s use matplotlib to view the result of our work.

Graph of US share of world exports

In [12]:
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
txt = '''Source: International Monetary Fund.'''

# Plot US share of world exports
combined.usshare.plot(grid=True, figsize=(9, 5), color="blue", linewidth=2,)
plt.ylabel('percentage of world exports')
plt.xlabel('Year')
plt.text(20,4.5,txt)
plt.title('U.S. Share of World Exports');

The graph shows a decrease in the U.S. share of exports from nearly 20 percent in 1966 to roughly 9 percent in 2015. We can also easily examine how changes in the U.S. share of exports compare with changes in the share of Japan and China.

Graph of moving average of US, China, and Japan shares of world exports

In [13]:
# Calculate moving average for each share, to reduce noise
combined['ussharema'] = combined['usshare'].rolling(12,12).mean()
combined['chsharema'] = combined['chinashare'].rolling(12,12).mean()
combined['jpsharema'] = combined['japanshare'].rolling(12,12).mean()

combshares = combined[['ussharema', 'chsharema', 'jpsharema']]
shares = list(combshares);
# Plot various shares of world exports
combined[shares][120:].plot(grid=True, figsize=(9, 5), linewidth=2)
plt.ylabel('percentage of world exports')
plt.xlabel('Year')
plt.text(150,-2,txt)
plt.title('Share of World Exports', );

Export dataset to .csv

Let’s save the dataset in a portable format that can be read by any statistical software. My preference is to create a .csv file, which I will use for my U.S. Macroeconomic and Markets Dashboard.

Create csv file

In [14]:
combined.to_csv('us_share_exports.csv')

 

Dollar-indices: hitting recent lows (and how to plot with Python)

While the dollar has depreciated against most major currencies over the past month (notably 4.4% against the Canadian dollar, 3.3% against the Yen, 4.2% against the Real, and 8.6% against the Rand), the Fed H.10 release measures exactly how much the value of the greenback has fallen relative to a weighted set of its trading partners' currencies. The latest data show that on April 12 the U.S. dollar fell to its lowest level since June 2015 against other major currencies, and its lowest level since October 2015 against a broad index of currencies.

This post covers the trade-weighted dollar and is split into two segments: a description of the index and its recent behavior, and a short python script showing how you can use the pandas and matplotlib libraries to retrieve the time series and plot it.

Two trade-weighted dollar indices, described

The Fed’s two most commonly cited trade-weighted indices of the foreign exchange value of the U.S. dollar are the major currencies index and the broad index. Both measure the value of the dollar relative to other countries’ currencies, with weights intended to reflect U.S. trade patterns. The broad index, however, covers more currencies, particularly emerging-market currencies.

Discrepancies between even the broad index and the way the dollar is actually traded are inevitable. For example, the 2016 broad index weight for Canada is 12.664 percent (broad index weights are updated yearly), whereas year-to-date, 15.2 percent of total U.S. trade has been with Canada. From a share-of-total-U.S.-trade perspective, the broad index is somewhat overweight the yuan, euro, and yen, and underweight the Canadian dollar and Mexican peso.
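The share-of-trade comparison above is just each partner's trade (exports plus imports) as a fraction of the total. A quick sketch with illustrative numbers (not actual Census trade figures):

```python
# Hypothetical year-to-date U.S. trade (exports + imports, $bn) by partner
trade = {'Canada': 160.0, 'China': 140.0, 'Mexico': 130.0, 'Japan': 50.0}

# Each partner's share of total trade, in percent
total = sum(trade.values())
shares = {k: round(100 * v / total, 1) for k, v in trade.items()}

print(shares['Canada'])  # → 33.3
```

Comparing these shares against the published index weights shows which currencies the index over- or under-weights relative to actual trade flows.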

The major currencies index contains fewer currencies than the broad index, and has fallen to a 22-month low. (plot from dashboard)

From 2005 through 2015, the major currencies index remained largely within a range of 70 to 85. After a steep climb through 2015, the measure peaked at 95.8 in January 2016, to the frustration of U.S. exporters, whose customers essentially pay more for the same goods (and, as a result, buy fewer). On April 12, the major currencies index hit its lowest level since June 2015.

Meanwhile, the broad index has moved in the same direction, hitting a five-month low on April 12. A plot of the broad index, along with the code used to obtain it, is below. You can substitute DTWEXM for DTWEXB if you are interested in the major currencies index.

Python: Retrieve and plot the trade-weighted dollar

Next, we show how Python can be used to gather and plot data on the Fed’s broad index of the foreign exchange value of the dollar. The script gathers data from FRED and plots each business day’s index value since the start of 2014.

Gathering data

First, we import pandas, numpy, and matplotlib and give them conventional short names. We will also use datetime and date.

In [1]:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
import datetime
from datetime import date

Next, we use the pandas-datareader package to request the data from FRED. I found the ID for our series of interest, DTWEXB, by searching, but you can also find it on the FRED site by source (Board of Governors of the Federal Reserve System) or by release (H.10). We pass the series ID and a start date to the DataReader, and pandas retrieves the data into a dataframe.

In [2]:
import pandas_datareader.data as webdata
tstart = datetime.datetime(2014, 1, 1)
#retrieve trade-weighted dollar data from fred
dtwexb = webdata.DataReader("DTWEXB", "fred", tstart);
#display five most recent observations
dtwexb.tail(5)
Out[2]:
DTWEXB
DATE
2016-04-11 119.4527
2016-04-12 119.2860
2016-04-13 119.6278
2016-04-14 119.6472
2016-04-15 119.6701

When was the last time the measure was as low as its April 12 value?

In [3]:
print dtwexb[119.29>=dtwexb].dropna().tail(2)
              DTWEXB
DATE                
2015-10-21  119.1786
2016-04-12  119.2860
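The boolean filter above keeps only observations at or below the threshold, and dropna removes the masked rows. The same pattern on a synthetic series (the values and dates here are made up):

```python
import pandas as pd

# Made-up index values for a few business days
idx = pd.to_datetime(['2016-04-08', '2016-04-11', '2016-04-12', '2016-04-13'])
s = pd.Series([120.1, 119.5, 119.29, 119.6], index=idx)

# Keep only days at or below the threshold, then take the most recent two
lows = s[s <= 119.29].tail(2)
print(lows.index[-1].strftime('%Y-%m-%d'))  # → 2016-04-12
```

With a Series, boolean indexing drops the masked rows directly, so no dropna is needed; indexing a DataFrame with a boolean condition, as in the cell above, masks non-matching values with NaN instead, which is why dropna follows.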

Line plot of data

Lastly, we can use matplotlib to plot the data.

In [4]:
#Create figure and plot dtwexb
fig = plt.figure(figsize=[7,5])
ax1 = plt.subplot(111)
line = dtwexb.DTWEXB.plot(color='blue',linewidth=2)

#Add a title
ax1.set_title('Trade-Weighted U.S. Dollar, Broad Index (1997=100)', fontsize=15)

#Add y label and no x-label since it is dates
ax1.set_ylabel('Index')
ax1.set_xlabel('')

#Axis options
ax1.spines["top"].set_visible(False)  
ax1.spines["right"].set_visible(False)  
ax1.get_xaxis().tick_bottom()
ax1.get_yaxis().tick_left()
ax1.tick_params(axis='x', which='major', labelsize=8)
ax1.yaxis.grid(True)

#Annotate with text
fig.text(0.15, 0.85,'Last: ' + str(dtwexb.DTWEXB[-1]) \
         + ' (as of: ' + str(dtwexb.index[-1].strftime('%Y-%m-%d'))\
         + ')');
url = 'https://research.stlouisfed.org/fred2/series/TWEXB'
fig.text(0.05, 0.025, 'Source: ' + url)
fig.text(0.65, 0.16, 'briandew.wordpress.com')

#Save as png
plt.savefig('dtwexb.png', dpi=1000)
The broad index hit a five-month low on April 12.

Oil Price Data with Python

This example shows how Python can be used to take a look at oil prices. The script gathers daily oil price data from Quandl and plots how the price has changed over the past few months.

Gathering data

First, we import pandas, numpy, and matplotlib and give them conventional short names.

In [1]:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline

Next, we identify the URL for our data. In this case, the data is provided by Quandl, and the URL can be obtained by clicking ‘csv’ under ‘API’ on the right-hand side of any series page. We read the CHRIS/CME_CL1 csv file provided by Quandl into a pandas dataframe.

In [2]:
# Import from Quandl WTI crude oil price data
url = "https://www.quandl.com/api/v3/datasets/CHRIS/CME_CL1.csv"
wticl1 = pd.read_csv(url, index_col=0, parse_dates=True)
wticl1.sort_index(inplace=True)
wticl1_last = wticl1['Last']
wticl1['PctCh'] = wticl1.Last.pct_change()
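The pct_change call above computes the period-over-period fractional change, leaving NaN for the first observation. A quick sketch on made-up closing prices:

```python
import pandas as pd

# Hypothetical closing prices over three days
last = pd.Series([40.0, 41.0, 39.975])

# Fractional change from the prior observation
pctch = last.pct_change()

print(pctch.round(4).tolist())  # → [nan, 0.025, -0.025]
```

Multiplying by 100 and rounding, as done for the chart annotation below, converts the fraction into a percentage.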

Line plot of oil price

Lastly, we can use matplotlib to generate a line plot showing the most recent 68 days’ worth of closing prices for the WTI crude front-month contract. The past week has seen this measure of oil prices reach nearly $40 per barrel.

In [3]:
fig = plt.figure(figsize=[7,5])
ax1 = plt.subplot(111)
line = wticl1_last.tail(68).plot(color='red',linewidth=3)
ax1.set_ylabel('USD per barrel')
ax1.set_xlabel('')
ax1.set_title('WTI Crude Oil Price', fontsize=18)
ax1.spines["top"].set_visible(False)  
ax1.spines["right"].set_visible(False)  
ax1.get_xaxis().tick_bottom()
ax1.get_yaxis().tick_left()
ax1.tick_params(axis='x', which='major', labelsize=8)
fig.text(0.15, 0.85,'Last: $' + str(wticl1.Last[-1])\
         + ' (as of: ' \
         + str(wticl1.index[-1].strftime('%Y-%m-%d'))\
         + ')');
fig.text(0.15, 0.80,'Change: $' + str(wticl1.Change[-1])\
         + '; ' \
         + str((np.round((wticl1.PctCh[-1] * 100), \
         decimals=2))) + '%')
fig.text(0.1, 0.06, 'Source: ' + url)
fig.text(0.1, 0.02, 'briandew.wordpress.com')
plt.savefig('oil.png', dpi=1000)

Machine reading IMF data with Python: an example

UPDATE June 16, 2016: The IMF API has changed. My updated tutorial is here:

https://briandew.wordpress.com/2016/05/01/machine-reading-imf-data-data-retrieval-with-python/


Download PDF Example – Machine Reading IMF Data with Python

During the past 18 months, the International Monetary Fund has made all of its data available free of charge and has developed an API for accessing it. This means data from sources such as International Financial Statistics (IFS) can be collected without manual downloads.

I’ve written a crude example using Python. The example retrieves IMF Direction of Trade Statistics to show the declining U.S. share of world exports and the rapid rise of China’s export-led economy.
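The retrieval follows the same URL pattern as the IFS example at the top of this post: base URL, database ID, then a frequency.area.indicator key. A sketch that assembles a CompactData request URL (the DOT indicator code shown is an assumption; check the IMF data services documentation for the exact key):

```python
# Base for requests using the IMF JSON RESTful data service
BASE = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/CompactData/'

def compact_url(database, freq, area, indicator, start, end):
    """Assemble a CompactData request URL for the IMF JSON service."""
    key = '.'.join([freq, area, indicator])  # e.g. 'Q.US.TXG_FOB_USD'
    return '{}{}/{}?startPeriod={}&endPeriod={}'.format(
        BASE, database, key, start, end)

# Hypothetical request: quarterly U.S. goods exports from the
# Direction of Trade Statistics (indicator code is an assumption)
url = compact_url('DOT', 'Q', 'US', 'TXG_FOB_USD', 1966, 2015)
print(url)
```

The resulting URL can then be passed to requests.get, exactly as in the IFS example above, and the observations pulled out of the returned JSON.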


If you are getting started with Python for economic research, I recommend the free version of the Enthought Canopy deployment, which integrates several useful tools.