Use this document to gain a deeper understanding of the TEJ REST API.
TEJ provides official packages for R and Python users that make data collection more convenient. (Python API document)
TEJ also cares about programmers who work in other languages (C, C#, Java), so we have developed a REST API for extracting data from our databases.
Using the REST API is similar to web crawling: we request a Uniform Resource Locator (URL) and get the information back. In this episode, we will show you how to use the REST API to collect data.
Before using the TEJ API you need an API key. If you don't have one, you can apply for a trial through the link or buy a product package directly in the E-Shop.
import requests
import pandas as pd
import json
# Enter your api_key
api_key = 'your key'
url = 'https://api.tej.com.tw/api/datatables/TWN/APRCD.json?api_key='+api_key
rq = requests.get(url)
rq.content
After converting the JSON to a DataFrame, the data looks much tidier 😎😎
data = json.loads(rq.content)['datatable']['data']
columns = pd.DataFrame(json.loads(rq.content)['datatable']['columns'])['cname'].to_list()
stock_price = pd.DataFrame(data,columns=columns)
stock_price
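To make the two dictionary lookups above easier to follow, here is a small offline sketch of the response layout they rely on. Only the keys datatable, data, columns, cname, and meta / next_cursor_id come from the code in this article; the name key, the column labels, and the sample values are made up for illustration and are not real TEJ output.

```python
import json

# Hand-written sample payload; every value here is illustrative, not real TEJ data.
sample = '''
{
  "datatable": {
    "data": [["Y9999", "2020-01-02", 100.0],
             ["Y9999", "2020-01-03", 101.5]],
    "columns": [{"name": "coid", "cname": "Code"},
                {"name": "mdate", "cname": "Date"},
                {"name": "open_d", "cname": "Open"}]
  },
  "meta": {"next_cursor_id": null}
}
'''

payload = json.loads(sample)
rows = payload['datatable']['data']
headers = [col['cname'] for col in payload['datatable']['columns']]

# Pair each row with the column labels, the same pairing that
# pd.DataFrame(rows, columns=headers) performs.
records = [dict(zip(headers, row)) for row in rows]
print(records[0])
print(payload['meta']['next_cursor_id'])
```

Each inner list in data is one row, and the cname entries supply the column labels in the same order.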
Looking at the URL more closely, we can see that it follows a pattern! The first part (https://api.tej.com.tw/api/datatables/) connects to the TEJ databases, and the rest selects the database, the table, the output format, and any custom criteria.
TWN: Taiwan database.
APRCD: Stock data table.
json: Output format.
api_key: Key (like the password you use to sign in to Facebook or Google).
https://api.tej.com.tw/api/datatables/{datatable_code}/{table_code}.{format}?<row_filter_criteria>
https://api.tej.com.tw/api/datatables/TWN/APRCD.json?api_key=<YOURAPIKEY>
⚠️ When the URL carries more than one parameter, separate the parameters with &, for example &coid=... and &api_key=... ⚠️
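As an alternative to concatenating the & separators by hand, the query string can be assembled with the standard library. This is only a sketch: build_url is a hypothetical helper, not part of any TEJ package, and YOURAPIKEY is a placeholder.

```python
from urllib.parse import urlencode

BASE = 'https://api.tej.com.tw/api/datatables/TWN/APRCD.json'

def build_url(base, params):
    """Join the base URL and a dict of query parameters with '?' and '&'."""
    # safe=',' keeps commas readable in lists such as 'Y9999,2330'
    return base + '?' + urlencode(params, safe=',')

url = build_url(BASE, {'coid': 'Y9999,2330', 'api_key': 'YOURAPIKEY'})
print(url)
```

urlencode inserts the & between parameters and escapes any characters that are not allowed in a URL, so a forgotten separator cannot slip through.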
🔽 Individual Stock 🔽
Take the Taiwan Weighted Index (code: Y9999) as an example.
coid = 'Y9999'
url = 'https://api.tej.com.tw/api/datatables/TWN/APRCD.json?'+'&coid='+coid+'&api_key='+api_key
rq = requests.get(url)
data = json.loads(rq.content)['datatable']['data']
columns = pd.DataFrame(json.loads(rq.content)['datatable']['columns'])['cname'].to_list()
stock_price = pd.DataFrame(data,columns=columns)
stock_price
🔽 Multi-stocks 🔽
Take the Taiwan Weighted Index (code: Y9999), Taiwan Semiconductor (code: 2330), MediaTek (code: 2454), and Yang Ming Marine (code: 2609) as examples.
coid = 'Y9999,2330,2454,2609'
url = 'https://api.tej.com.tw/api/datatables/TWN/APRCD.json?'+'&coid='+coid+'&api_key='+api_key
rq = requests.get(url)
data = json.loads(rq.content)['datatable']['data']
columns = pd.DataFrame(json.loads(rq.content)['datatable']['columns'])['cname'].to_list()
stock_price = pd.DataFrame(data,columns=columns)
stock_price
👺 Devil in details 👺
In the multi-stock example, the result contains exactly 10,000 rows. However, a single stock alone has more than 5,000 rows, so four codes should in theory give more than 20,000 rows. Why the difference?
To keep the host stable, TEJ limits each response to a maximum of 10,000 rows. When the data exceeds 10,000 rows, we read next_cursor_id from the response and add opts.cursor_id=next_cursor_id to the next request to get the rest of the data.
💻 next_cursor_id 💻
rq.json()['meta']['next_cursor_id']
Following this logic, 30,000 rows come back in batches of 10,000, and each response carries a next_cursor_id pointing to the next batch until no data remains. This is where a while loop comes in ❗️
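The cursor logic can be tried out without touching the network. The sketch below replaces the API with a hypothetical fake_fetch function that serves three tiny pages; fetch_all, fake_fetch, and the cursor values c1 and c2 are all invented for illustration, and the only assumption about TEJ is that the last page reports a next_cursor_id of None.

```python
def fetch_all(fetch_page):
    """Collect rows from every page.

    fetch_page(cursor_id) must return (rows, next_cursor_id);
    next_cursor_id is None on the last page.
    """
    rows, cursor = fetch_page(None)      # first request has no cursor
    all_rows = list(rows)
    while cursor is not None:            # keep going until no cursor is returned
        rows, cursor = fetch_page(cursor)
        all_rows.extend(rows)
    return all_rows

# Stand-in for the API: three pages of two rows each (made-up data).
PAGES = {
    None: ([1, 2], 'c1'),
    'c1': ([3, 4], 'c2'),
    'c2': ([5, 6], None),
}

def fake_fetch(cursor_id):
    return PAGES[cursor_id]

print(fetch_all(fake_fetch))  # [1, 2, 3, 4, 5, 6]
```

Against the real API, fetch_page would issue the request with opts.cursor_id set to the cursor it receives.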
💻 Solution to next_cursor_id 💻
We have written a function for you; use it to fetch the complete data easily 😎😎.
def tej_get_data(db_code, api_key, coid=None, columns=None):
    # Requires: import requests, import pandas as pd
    # Build the base URL, then append the optional filters
    tej_url = 'https://api.tej.com.tw/api/datatables/' + db_code + '.json?api_key=' + api_key
    if coid is not None:
        tej_url += '&coid=' + coid
    if columns is not None:
        tej_url += '&opts.columns=' + columns
    # First request
    result = requests.get(tej_url).json()
    id_ = result['meta']['next_cursor_id']
    names = pd.DataFrame(result['datatable']['columns'])['cname'].to_list()
    stock_price = pd.DataFrame(result['datatable']['data'], columns=names)
    # Keep requesting the next batch until next_cursor_id is None
    while id_ is not None:
        urls = tej_url + '&opts.cursor_id=' + str(id_)
        result = requests.get(urls).json()
        id_ = result['meta']['next_cursor_id']
        names = pd.DataFrame(result['datatable']['columns'])['cname'].to_list()
        temp = pd.DataFrame(result['datatable']['data'], columns=names)
        # DataFrame.append was removed in pandas 2.0, so use pd.concat
        stock_price = pd.concat([stock_price, temp]).reset_index(drop=True)
    return stock_price
# multi-stocks
coid = 'Y9999,2330,2454,2609'
stock_price = tej_get_data(
    db_code='TWN/APRCD',
    api_key=api_key,
    coid=coid)
stock_price
Take the Taiwan Weighted Index (code: Y9999) as an example.
🔽 Single Column 🔽
Add the parameter &opts.columns=open_d to select only the opening price column.
# Single column
coid = 'Y9999'
url = 'https://api.tej.com.tw/api/datatables/TWN/APRCD.json?&opts.columns=open_d'+'&api_key='+api_key+'&coid='+coid
rq = requests.get(url)
data = rq.json()['datatable']['data']
columns = pd.DataFrame(rq.json()['datatable']['columns'])['cname'].to_list()
stock_price = pd.DataFrame(data,columns=columns)
stock_price
🔽 Multi-columns 🔽
Columns: stock code, date, open, high, low, close.
# Multiple columns
coid = 'Y9999'
columns = 'coid,mdate,open_d,high_d,low_d,close_d'
url = 'https://api.tej.com.tw/api/datatables/TWN/APRCD.json?'+'&opts.columns='+columns+'&api_key='+api_key+'&coid='+coid
rq = requests.get(url)
data = rq.json()['datatable']['data']
columns = pd.DataFrame(rq.json()['datatable']['columns'])['cname'].to_list()
stock_price = pd.DataFrame(data,columns=columns)
stock_price
Take the Taiwan Weighted Index (code: Y9999) as an example.
Start: 2020-01-01; End: 2020-12-31.
Parameter setting: mdate.gte sets the start date (on or after) and mdate.lte sets the end date (on or before).
# Date filter
coid = 'Y9999'
start = '2020-01-01'
end = '2020-12-31'
url = 'https://api.tej.com.tw/api/datatables/TWN/APRCD.json?'+'&mdate.gte='+start+'&mdate.lte='+end+'&api_key='+api_key+'&coid='+coid
rq = requests.get(url)
data = rq.json()['datatable']['data']
columns = pd.DataFrame(rq.json()['datatable']['columns'])['cname'].to_list()
stock_price = pd.DataFrame(data,columns=columns)
stock_price
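When the dates come from Python date objects rather than hand-typed strings, the range filter can be built as below. This is a sketch under assumptions: date_filter_url is a hypothetical helper, YOURAPIKEY is a placeholder, and the check that start is not after end is a local convenience rather than documented TEJ behavior.

```python
from datetime import date
from urllib.parse import urlencode

BASE = 'https://api.tej.com.tw/api/datatables/TWN/APRCD.json'

def date_filter_url(base, api_key, coid, start, end):
    """Build a URL that keeps rows with start <= mdate <= end."""
    if start > end:
        raise ValueError('start must not be after end')
    params = {
        'coid': coid,
        'mdate.gte': start.isoformat(),  # on or after the start date
        'mdate.lte': end.isoformat(),    # on or before the end date
        'api_key': api_key,
    }
    return base + '?' + urlencode(params)

url = date_filter_url(BASE, 'YOURAPIKEY', 'Y9999',
                      date(2020, 1, 1), date(2020, 12, 31))
print(url)
```

date.isoformat() produces the same YYYY-MM-DD strings used in the example above, so the resulting query matches the hand-written one apart from parameter order.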
datatable_code = 'TWN/APRCD'
url = 'https://api.tej.com.tw/api/datatables/'+datatable_code+'/metadata?api_key='+api_key
rq = requests.get(url)
rq.json()
Searching for EPS (每股盈餘) returns many results. (For details, see TEJ API.)
matchType:
key_word = '每股盈餘'  # "earnings per share" (EPS)
url = 'https://api.tej.com.tw/api/search/table/'+key_word+'?api_key='+api_key
rq = requests.get(url)
rq.json()
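The requests library generally percent-encodes non-ASCII characters in a URL before sending it, but HTTP clients in other languages may not. As a sketch of what that encoding looks like, the keyword can be escaped explicitly with the standard library; YOURAPIKEY is a placeholder.

```python
from urllib.parse import quote, unquote

key_word = '每股盈餘'  # "earnings per share" (EPS)

# Percent-encode the keyword as UTF-8 so the URL contains only ASCII characters
encoded = quote(key_word, safe='')
url = 'https://api.tej.com.tw/api/search/table/' + encoded + '?api_key=' + 'YOURAPIKEY'
print(url)

# Decoding gives back the original keyword
print(unquote(encoded) == key_word)
```

The same percent-encoded URL can then be passed unchanged to a client written in C, C#, or Java.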
api_key = 'your key'
url = 'https://api.tej.com.tw/api/apiKeyInfo/'+api_key
rq = requests.get(url)
rq.json()
Today's article is meant to give everyone a deeper understanding of the TEJ REST API. The API itself is simple, and the code above shows how these functions can help users pull the data they want from TEJ's huge database more conveniently.