What is the EIA API?
The EIA API allows user to connect and import historical publicly available data from the EIA website. There are many data regarding oil prices, utility prices, oil production rates, oil consumption rates, etc.
This post is gathering data using the EIA API, and we will explore gathering data through the US EIA website using an API key.
I am interested in the different trends of commodity such as crude oil prices and production rates. These are some of the important indicators for the CPI Python Dashboard I am building.
Check out my previous post for the CPI Dashboard project:
Previous post showing how to gather data from the US BLS website using the BLS API.
Where to Find the Data for the EIA API?
First, let’s take a look at how to find the data from the US EIA website.
Finding the Series ID
To use the API to gather our data, we need to find the series ID for the data series in interest. This can be done by navigating to the US EIA website’s open data section.

From there, there is a BROWSE THE API
button, and we can click on that which brings us to the section that allows us to select the type of data we need.
In this case, I would like to find out the spot prices for petroleum, so I would select those and submit my query.
For the data I am gathering, I am looking for monthly data, and I am also interested in UK Brent Crude Oil and WTI Crude Oil Prices. However, I am only going to select one of the series to generate my request link, and I want the data retrieved to be sorted in descending order.

According to the US EIA website, this is the request URL that we should use. Now the reason why I only selected one series is because if I select more than one series at a time, the API will return the values for all the series together.
If I put all the series data into one data frame, then the data frame would have all the series data in one column. This is something I want to avoid since it makes it very difficult to work with when I need to merge this data frame with the BLS data frame.
After choosing your option above, the website will provide an URL. This is the URL that is required to get data using the EIA API.

To use my API code, you want to make sure you copy the URL highlighted in bold the following:
https://api.eia.gov/v2/petroleum/pri/spt/data/?frequency=monthly&data[0]=value&facets[product][]=EPCBRENT&sort[0][column]=period&sort[0][direction]=desc&offset=0&length=5000
How to Use EIA API to Retrieve Data?
Now, let’s actually and try to retrieve the data from the EIA website. This function takes series ID and series name as input. The setting for the API looks very similar to the API URL above because this is what it looks like after it gets broken down. This makes it easier for me to make any changes without having the go through the whole API URL to search what I need to change.
I realized after previous post of retrieving data from BLS API, that it might be better to use a input into the function to get extract the data from the series I need. What I was doing before did not allow much flexibility regarding on what series I can pull.
First, I need some settings for the EIA API such as the URL link, starting year, and ending year. You might be asking why the starting year is offset by 19 years; that is because the BLS API for my CPI Dashboard can only retrieve up to 20 years of data per request.
config.py
# Settings for API
from datetime import datetime
# Start year is current year - 19 years
start_year = datetime.now().year - 19
# End year is current year
end_year = datetime.now().year
# URL for the API
url_eia = 'https://api.eia.gov/v2'
Output:
url_eia: https://api.eia.gov/v2
Start Year: 2004
End Year: 2023
Now, the data retrieved from the API is a nested list of dictionaries. There might be a better way of tackling this problem, but this is the solution I have come up with — Please feel free to let me know if there is a better way.
I am using some helper functions to help me retrieve the data from the nested dictionaries. Basically what this function does is it goes through each list of dictionaries and takes the values in the dictionaries and adds it to an empty list.
misc_func
"""
Helper function for both the EIA and BLS API
"""
import pandas as pd
# Function that converts dictionaries to list
def dict_to_list(dicts, key):
"""
Function that takes in list of dictionaries and return the values in each of the dictionaries with the specified
key
:param key: List of keys
:param dicts: List of Dict
:return: Nested list of items
"""
list_array = []
list_item = [item.get(key) for item in dicts]
list_array.append(list_item)
return list_array
fetch_data.py
Lastly, I going to create an Updater class to house this API, this class also contains the API code for retrieving data from the BLS API in my previous post.
The API key can be obtained by registering on the EIA website, and just paste in your API key once you have obtained one.
# Importing libraries
import requests
import json
import api_keys
import config
import data.misc_func as misc_func
class Updater:
def __init__(self):
self.eia_url = config.url_eia
self.start_year = config.start_year
self.end_year = config.end_year
# API to retrieve data from the US Energy Information Administration
def retrieve_data_eia(self, eia_series, eia_series_name):
"""
Used to retrieve data from the US EIA website using an API key
:return:
"""
# Setting for the EIA API
sort_value = '&sort[0][column]=period&sort[0][direction]=desc&offset=0&length=5000' # Sort by descending order
frequency = '&frequency=monthly' # Pull data in monthly interval
start_date = '&start=' + str(self.start_year) + '-01'
api_key = '&api_key=' + api_keys.eia_api_key # API key
# Temporary list to store all the values
temp_list = []
# Loops through all the series needed, and request the data for each series separately
for series in eia_series:
api_request = self.eia_url + series + sort_value + frequency + start_date + api_key
r = requests.get(api_request)
json_data = r.json()
temp_list += misc_func.dict_to_list(json_data['response']['data'], 'value')
# Converts the list of values into Pandas Dataframe
eia_df = pd.DataFrame(temp_list).transpose()
# Setting column name for the dataframe
eia_df.columns = eia_series_name
# Get the date period for all the values in the list
eia_df['year_month'] = pd.DataFrame(misc_func.dict_to_list(json_data['response']['data'], 'period')).transpose()
return eia_df
Testing the EIA API
To test the API, I used the UK Brent Crude Oil and WTI Crude Oil Prices series to test out the API. This is what the data frame looks like. It contains all the data for each series up until the previous month.
# EIA series ID
eia_petroleum_price = ['/petroleum/pri/spt/data/?&data[0]=value&facets[product][]=EPCBRENT',
'/petroleum/pri/spt/data/?&data[0]=value&facets[product][]=EPCWTI']
# EIA series name
eia_petroleum_name = ['uk_brent_prices', 'wti_prices', 'period']
data = Updater()
eia_api = data.retrieve_data_eia(config.eia_petroleum_price, config.eia_petroleum_name)
Output:

The full code of the Updater class used to extract data using the API from US EIA website.
"""This file is used to retrieve data from the US BLS and US EIA website using their public API"""
# Importing libraries
import requests
import json
import api_keys
import config
import data.misc_func as misc_func
class Updater:
def __init__(self):
self.bls_url = config.url_bls
self.eia_url = config.url_eia
self.start_year = config.start_year
self.end_year = config.end_year
# API to retrieve data from the US Energy Information Administration
def retrieve_data_eia(self, eia_series, eia_series_name):
"""
Used to retrieve data from the US EIA website using an API key
:return:
"""
# Setting for the EIA API
sort_value = '&sort[0][column]=period&sort[0][direction]=desc&offset=0&length=5000' # Sort by descending order
frequency = '&frequency=monthly' # Pull data in monthly interval
start_date = '&start=' + str(self.start_year) + '-01'
api_key = '&api_key=' + api_keys.eia_api_key # API key
# Temporary list to store all the values
temp_list = []
# Loops through all the series needed, and request the data for each series separately
for series in eia_series:
api_request = self.eia_url + series + sort_value + frequency + start_date + api_key
r = requests.get(api_request)
json_data = r.json()
temp_list += misc_func.dict_to_list(json_data['response']['data'], 'value')
# Converts the list of values into Pandas Dataframe
eia_df = pd.DataFrame(temp_list).transpose()
# Setting column name for the dataframe
eia_df.columns = eia_series_name
# Get the date period for all the values in the list
eia_df['year_month'] = pd.DataFrame(misc_func.dict_to_list(json_data['response']['data'], 'period')).transpose()
return eia_df