Building your own database through TEJ API

– how to grab data from TEJ API?

What is TEJ API?

Would you find it very difficult to find a high-quality and stable source of data before doing an analysis on it? Or is it time-consuming to maintain your own web-crawling program after spending a lot of time creating it? Now, TEJ (Taiwan Economic Journal) has launched the TEJ API tool for you to solve this problem! You are very welcome to apply for a trial.

TEJ API will be mainly used as the source of data, and we would like to share some financial data analysis and applications in Python with you in our Medium. For rookies, you could get a preliminary understanding of how to analyze data through programming, and for experts, you can access a stable and fast source of data!

Highlights of this article

  • TEJ API Data Collecting
  • Saving/Reading the data into/from CSV, Xlsx, and SQLite.

Links related to this article

* TEJ API Data Collecting-Using stock data as an example

It is very simple to grab the data from TEJ API.💪
First we have to install tejapi package in Python.

pip install tejapi

The next step is to import this package and set up your own api_key. If you don’t have one, you can apply for a trial on the TEJ API official website. But the trial version has limits on the range of data that could be obtained!

import tejapi
tejapi.ApiConfig.api_key = “你的api_key”

Now we can get the data through this code👀

TSMC = tejapi.get(
'TWN/EWPRCD', 
coid = '2330',
mdate={'gte':'2020-01-01', 'lte':'2020-12-31'}, 
opts={'columns': ['mdate','open_d','high_d','low_d','close_d']}, 
paginate=True
)
TSMC

Parameters:

  1. Name of the database-TWN/EWPRCD
    ❗❗if you are a trial account user, then the name of the database may diverse for the trial version.
  2. Stock/Company symbol-2330
  3. Range of the data-from 2020–01–01 to 2020–12–31
  4. Optional-only select OHLC columns
  5. paginate -data can be obtained in stages through this parameter

** What we used in this example is from the product package in TEJ E SHOP. You can use the E Shop website to understand more information about TEJ’s current products.👍👍 Then we can see the output below:

2020台積電(2330)股市日資料-開高低收
2020 TSMC(2330) Stock price data — OHLC

Saving/Reading the data

📓Numpy, pandas, and datetime are very important packages in Python data analysis. We can do most operations, just like what we are going to do, through them.

import pandas as pd
import numpy as np
import datetime
  • Excel CSV
    We only need to use these lines of code to save or read our data into CSV file:
#save
path =’path you want to save/{name}.csv’.format(name = “TSMC”)
TSMC.to_csv(path , index=False )
#read
TSMC = pd.read_csv('TSMC.csv')
  • Excel Xlsx
    In addition to a CSV file, Excel has another storage file format called Xlsx
    . If there are many data tables you want to save in the same file, we can store them through Xlsx. The only thing you may have to pay attention to is that because the date format of the TEJ API is different from this to_excel() function. Therefore, we will have to process the date format before saving the data.
#path 
path =’path you want to save/{name}.xlsx’.format(name = “Stock_List”)
writer = pd.ExcelWriter(PATH , engine='xlsxwriter')
#data type process
TSMC['mdate'] =
TSMC['mdate'].apply(lambda x: pd.to_datetime(x).date())
#save
TSMC.to_excel(writer, sheet_name='2330' ,index = False)
writer.save()
#read
TSMC = pd.read_excel('Stock_List.xlsx', sheet_name = '2330')
#there are more functions in read_excel(),you can do through its Document to get the parameters you need。
  • SQLite
    What if the amount of data is getting larger and larger and it will take you much time when opening the excel? Now you may need SQLite at this time! SQLite is very convenient to use in Python due to its characteristics. We will briefly introduce how to connect to the database and save/read the data. For detailed instructions, please refer to the SQLite official website.
#import module
import sqlite3
#connect the sqlite db
conn = sqlite3.connect('mydatabase.db')
#save
TSMC.to_sql(name='TSMC', con=conn, if_exists="replace", index=False)
#read
sql_query = "SELECT * FROM TSMC"
TSMC = pd.read_sql(sql_query, con=conn, coerce_float=False)
#close the db
conn.close()

** Using the database management software(we use DBeaver) to use whether we have successfully imported:

After importing into the SQLite

Conclusion

What we share with you this time is the introduction to get the data through TEJ API and how to save/read it. Surely, we will write more articles related to data analysis and applications through Medium in the future!
Finally, if you have any questions or suggestions after reading this article, please leave a message or email us, we will try our best to reply to you.

Links related to this article again

Back
Procesing