[TQuant from 0 to 1 - Day 3] Building a Comprehensive Investment Data Perspective: Stock Pool Screening and Data Retrieval with TejToolAPI

Photo by Carlos Muza on Unsplash

Preface

In financial investment, working from accurate and comprehensive data is indispensable, and managing stock pools and acquiring stock price data effectively is the key to doing so. With the tools and APIs provided by TQuant Lab, we can define screening criteria, quickly establish stock pools that meet specific requirements, and retrieve the relevant historical stock price data.

This article will guide you step by step through setting up the environment and configuring your API Key, as well as explain how to use the get_universe function to obtain stock samples based on specific conditions. Whether you are a beginner in investing or a data analysis enthusiast, this guide will help you master the automation of data retrieval, laying a solid foundation for future quantitative analysis. Through examples and hands-on operations, you can quickly get started and apply these techniques to your investment strategies.


What is a Stock Pool?

A stock pool is like a shopping list in our daily lives. Imagine grocery shopping in a supermarket with thousands of products on the shelves. If you don’t filter your choices, you may spend a long time selecting items or buying things you don’t need. However, if you prepare a shopping list in advance, such as “breakfast ingredients” or “fitness diet,” you can quickly locate the relevant items in the store, saving time and improving efficiency.

Similarly, a stock pool aims to help investors filter out stocks that meet specific criteria from the vast stock market. For instance, you might set a condition like:

“I want to find common stocks of semiconductor companies listed in Taiwan.”

With a stock pool, you get a curated list of stocks that meet this condition, just like a shopping list helps you focus on relevant grocery items. This allows you to concentrate on analyzing only the relevant stocks without wasting time on unrelated companies.

In short, a stock pool acts as a “compass” in your investment journey, helping you navigate the complex market and providing a reliable foundation for decision-making.


Benefits of a Well-Designed Stock Pool

1. Improved Analysis Efficiency

With thousands of stocks in the market, it is easy to get lost in a sea of data if you lack clear selection criteria. A well-defined stock pool lets you focus on relevant stocks quickly, saving time.

2. Reducing the Impact of Survivorship Bias

A stock pool defined by screening criteria over a historical date range, rather than by today’s listings, can include companies that met the criteria at the time but were later delisted or reclassified. Backtests then consider both the companies that survived and those that did not, which gives a more complete view of how an investment strategy would have performed.
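
To make the survivorship point concrete, here is a minimal sketch using made-up listing records (the tickers and dates are hypothetical, not TEJ data). It contrasts a pool built from today's listings with a pool of every security that was listed at some point during the backtest window.

```python
from datetime import date

# Hypothetical listing records: (ticker, listed_on, delisted_on or None).
LISTINGS = [
    ("AAA", date(2010, 1, 4), None),
    ("BBB", date(2015, 6, 1), date(2024, 5, 31)),   # delisted mid-window
    ("CCC", date(2024, 9, 2), None),                # listed mid-window
    ("DDD", date(2005, 3, 1), date(2020, 12, 31)),  # gone before the window
]


def listed_today(listings):
    """Tickers that are still listed (no delisting date)."""
    return sorted(t for t, _, delisted in listings if delisted is None)


def listed_during(listings, start, end):
    """Tickers whose listing period overlaps [start, end]."""
    return sorted(
        t for t, listed, delisted in listings
        if listed <= end and (delisted is None or delisted >= start)
    )


start, end = date(2024, 1, 1), date(2025, 1, 1)
survivors_only = listed_today(LISTINGS)
point_in_time = listed_during(LISTINGS, start, end)

print(survivors_only)  # ['AAA', 'CCC']
print(point_in_time)   # ['AAA', 'BBB', 'CCC']
```

A backtest over the survivors-only list would never see BBB, even though it was tradable for part of the window.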

3. Automation and Quantitative Applications

Stock pools can be integrated with quantitative strategies, aiding in backtesting and data analysis. This ensures that your investment targets are based on a filtered and reliable dataset, improving the accuracy and effectiveness of your analysis.


Introduction to the get_universe Function

In quantitative analysis and investment strategy design, quickly and accurately retrieving stock pool data is critical. With the tools provided by TQuant Lab, we can conveniently use the get_universe function to create custom stock pools based on different screening criteria, laying the groundwork for subsequent analysis.

The get_universe function is primarily used to screen and return a list of stock symbols that meet specific conditions. It allows users to filter stocks within a particular time range based on criteria such as market type, industry, and security type. The resulting stock list can seamlessly integrate with quantitative strategies and backtesting tools.

Key Features of get_universe

✅ Stock Pool Creation – Automatically filters stocks that meet the specified criteria, eliminating the need for manual selection.

 ✅ Conditional Filtering – Supports various screening conditions such as market type, sector, industry, index inclusion, etc.

 ✅ Dynamic Range Selection – Returns stocks that meet the criteria within a specified date range.

get_universe(start,
             end = datetime.datetime.now().date().strftime('%Y-%m-%d'),
             trading_calendar = get_calendar('TEJ_XTAI'),
             **kwargs)


Leveraging the get_universe function can significantly enhance your stock screening process, ensuring your investment analysis is based on structured and high-quality data.

Figure: Relationships among the contents of the stock pool.

Parameters for Defining Screening Time Range and Trading Calendar

The following parameters are responsible for setting the time range and trading calendar when filtering the stock pool:

1. start (datetime or str)

  • Purpose: Specify the start date for stock pool filtering.
  • Meaning: Returns stocks that meet the conditions from start to end.

2. end (datetime or str, optional, defaults to today’s date)

  • Purpose: Specify the end date for stock pool filtering.
  • Description: If not provided, the system will use the execution date as the default end date.

3. trading_calendar (TradingCalendar, optional, defaults to TEJ_XTAI)

  • Purpose: Defines the trading calendar, ensuring that only valid trading days within the specified date range are considered.
  • Default Value: TEJ_XTAI (The Taiwan Stock Exchange trading calendar).
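
One Python detail is worth keeping in mind when relying on the default end date. In the signature shown earlier, the default is an expression in the function header, and Python evaluates such defaults once, when the function is defined, not on every call. In a notebook or process that stays open across midnight, the default may therefore lag behind the actual date. The sketch below demonstrates this language behavior with two hypothetical stand-in functions and a fake clock (none of them are part of TQuant Lab).

```python
import datetime
import itertools

# A fake clock that advances one day every time it is read, so the effect
# is visible without waiting for midnight. Purely illustrative.
_days = itertools.count()


def fake_today():
    return datetime.date(2025, 1, 1) + datetime.timedelta(days=next(_days))


def end_fixed_at_definition(end=fake_today().strftime("%Y-%m-%d")):
    """The default is computed once, when this 'def' statement runs."""
    return end


def end_resolved_per_call(end=None):
    """The default is computed on every call that omits 'end'."""
    if end is None:
        end = fake_today().strftime("%Y-%m-%d")
    return end


first = end_fixed_at_definition()
second = end_fixed_at_definition()
print(first, second)  # 2025-01-01 2025-01-01

third = end_resolved_per_call()
fourth = end_resolved_per_call()
print(third, fourth)  # 2025-01-02 2025-01-03
```

Passing end explicitly, as the examples in this article do, sidesteps the question entirely.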

Available kwargs Properties (At Least One Must Be Specified)

The function will return all available stocks if no filtering criteria are specified.

These parameters determine the content of the returned stock pool, including:

  • Market Type
  • Sector (Chinese & English)
  • Security Type (Chinese & English)
  • Primary Industry (Chinese & English)
  • Sub-industry (Chinese & English)
  • Index Name

These criteria can be customized flexibly based on your specific needs. For more details, please refer to the relevant documentation link.
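
Since the screening criteria are passed as keyword arguments, a misspelled name is easy to miss. The sketch below checks names before a query is sent. The build_filters helper is a hypothetical name introduced here, and its allowed set contains only the keyword names used in this article's examples, so treat it as a starting point rather than the full list that get_universe accepts.

```python
# Keyword names that appear in this article's examples. This is NOT an
# exhaustive list of what get_universe accepts; check the documentation.
KNOWN_FILTERS = {
    "mkt", "mkt_bd_e", "stktp_c", "main_ind_c",
    "sub_ind_c", "sub_ind_e", "idx_id",
}


def build_filters(**criteria):
    """Reject unrecognized filter names; return the criteria unchanged."""
    unknown = sorted(set(criteria) - KNOWN_FILTERS)
    if unknown:
        raise ValueError(f"Unrecognized filter(s): {unknown}")
    return dict(criteria)


filters = build_filters(mkt=["TWSE", "OTC"], stktp_c=["普通股"])
print(filters)  # {'mkt': ['TWSE', 'OTC'], 'stktp_c': ['普通股']}

try:
    build_filters(market=["TWSE"])  # typo: 'market' instead of 'mkt'
except ValueError as err:
    print(err)  # Unrecognized filter(s): ['market']
```

The validated dictionary could then be unpacked into the call, for example get_universe(start, end, **filters).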

Code Examples

We will walk through nine examples to help you better understand how to utilize the get_universe function.


1. Import Libraries and Set Up

First, we need to import the necessary libraries and set up the API configuration:

import pandas as pd
import datetime
import tejapi
import os
import numpy as np

# Set TEJ API Key and Base URL
os.environ['TEJAPI_KEY'] = "your key" 
os.environ['TEJAPI_BASE'] = "https://api.tej.com.tw"

# Define Start and End Dates for Screening
start = '2024-01-01'
end = '2025-01-01'

# Import get_universe function
from zipline.sources.TEJ_Api_Data import get_universe
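
Hard-coding a key in a script makes it easy to leak through version control. A common alternative, sketched below, is to set TEJAPI_KEY in the shell beforehand and have the script only verify that it is present. The require_env helper is a hypothetical name introduced for this example, not part of tejapi or TQuant Lab.

```python
import os


def require_env(name):
    """Return an environment variable's value, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value or value == "your key":  # unset, blank, or still the placeholder
        raise RuntimeError(f"{name} is not set; export it before running this script")
    return value


# The base URL is not secret, so giving it a default in code is harmless.
os.environ.setdefault("TEJAPI_BASE", "https://api.tej.com.tw")

try:
    require_env("TEJAPI_KEY")
    print("TEJ API key found")
except RuntimeError as err:
    print(err)
```

Running the check at the top of a script surfaces a missing key immediately instead of as a failed request later on.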

2. Nine Stock Pool Screening Examples Using get_universe

Case 1: Retrieve Securities Previously Listed on TWSE General Board & OTC General Board

get_universe(start, end, mkt_bd_e=['TSE', 'OTC'])

Case 2: Retrieve Securities Previously Listed on the Taiwan OTC Market

get_universe(start, end, mkt=['OTC'])

Case 3: Retrieve Common Stocks Previously Listed on TWSE & OTC

get_universe(start, end, mkt=['TWSE', 'OTC'], stktp_c=['普通股'])

Case 4: Retrieve Securities Previously Classified as “M2324 半導體業” (Semiconductor Industry)

get_universe(start, end, sub_ind_c=['M2324 半導體業'])

Case 5: Retrieve Securities Previously Classified as “M2324 Semiconductor” & “M2325 Computer and Peripheral Equipment”

get_universe(start, end, sub_ind_e=['M2324 Semiconductor',
                                    'M2325 Computer and Peripheral Equipment'])

Case 6: Retrieve Securities of Type “ETF” and “普通股” (Common Stocks) Listed on TWSE

get_universe(start, end, stktp_c=['ETF', '普通股'], mkt=['TWSE'])

Case 7: Retrieve Securities Previously Included in “IX0002” (Taiwan 50 Index)

get_universe(start, end, idx_id='IX0002')

Case 8: Retrieve Securities Previously in the “M1100 水泥工業” (Cement Industry) and Part of “IX0006” (Taiwan High Dividend Index)

get_universe(start, end, main_ind_c='M1100 水泥工業', idx_id='IX0006')

Case 9: Retrieve Common Stocks in Non-Financial Industries

get_universe(start, end, main_ind_c=['一般產業'], stktp_c='普通股')
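
Because get_universe returns a list of symbols, pools from separate calls can also be combined with ordinary set operations when a single call does not express the condition you want. In the sketch below, the two lists are hand-written placeholders standing in for get_universe results; their membership is illustrative, not actual query output, and the helper names are hypothetical.

```python
# Placeholder results standing in for two get_universe(...) calls.
# The membership shown here is illustrative only, not real query output.
pool_a = ["2330", "3711"]
pool_b = ["2330", "1101"]


def intersect(*pools):
    """Symbols present in every pool, sorted for stable output."""
    common = set(pools[0])
    for pool in pools[1:]:
        common &= set(pool)
    return sorted(common)


def union(*pools):
    """Symbols present in at least one pool, without duplicates."""
    return sorted(set().union(*pools))


def exclude(pool, unwanted):
    """Symbols in 'pool' that do not appear in 'unwanted'."""
    return sorted(set(pool) - set(unwanted))


print(intersect(pool_a, pool_b))  # ['2330']
print(union(pool_a, pool_b))      # ['1101', '2330', '3711']
print(exclude(pool_a, pool_b))    # ['3711']
```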

TejToolAPI

After creating a stock pool that meets our criteria, we have a filtered target list. However, the stock pool is only the starting point for investment data analysis; in-depth analysis or strategy development requires additional, detailed financial data.

This is where TejToolAPI comes into play. With TejToolAPI, we can efficiently retrieve historical stock-related data, including financial information such as:

 ✅ Earnings per Share (EPS)

 ✅ Growth Rate

 ✅ Dividends

 ✅ Other key financial indicators

By integrating TejToolAPI, we enhance the analytical value of every stock in our stock pool.


TejToolAPI.get_history_data Function

The get_history_data function allows users to retrieve historical data related to securities and transactions. Below are its key parameters:

1. ticker (iterable[str])

  • Purpose: Specifies the stock ticker(s) to query.
  • Example: Supports multiple tickers, e.g., ["2330", "1101", "3711"].
  • Tip: You can combine it with get_universe to dynamically select stocks.

2. columns (iterable[str])

  • Purpose: Specifies the data fields to retrieve.
  • Reference: The TEJ API documentation provides the list of available columns.

3. transfer_to_chinese (boolean, default: False)

  • Purpose: Determines whether to translate column names into Chinese.
  • Options:
    • True → Convert column names to Chinese
    • False → Keep column names in English

4. start (pd.Timestamp or str)

  • Purpose: Set the start date for data retrieval.
  • Format: "YYYY-MM-DD" or pd.Timestamp("YYYY-MM-DD")
  • Meaning: Returns data from the specified start date to the end date.

5. end (pd.Timestamp or str, optional, defaults to execution date)

  • Purpose: Set the end date for data retrieval.
  • Format: "YYYY-MM-DD" or pd.Timestamp("YYYY-MM-DD")

6. fin_type (iterable[str])

  • Purpose: Specify the financial data type.
  • Valid Options:
    • 'A' → Cumulative Data
    • 'F' → Single Quarter Data
    • 'TTM' → Trailing Twelve Months Data
  • Example: Supports multiple types, e.g., ["A", "TTM"].

7. include_self_acc (str, default: 'N')

  • Purpose: Determines whether to include self-reported and board-approved financial data.
  • Valid Options:
    • 'Y' → Include self-reported and board-approved financial data.
    • 'N' → Include only official investment financial data.
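
The three fin_type options are related by simple arithmetic. The sketch below uses made-up EPS figures to show the usual meaning of the terms: cumulative (A) is the running total within a fiscal year, single quarter (F) is the difference between consecutive cumulative values, and trailing twelve months (TTM) is the sum of the latest four single quarters. It illustrates the concepts only; how TEJ derives each series internally is not covered in this article.

```python
# Made-up cumulative (year-to-date) EPS keyed by (fiscal year, quarter).
# These numbers are not real company data.
cumulative = {
    (2021, 1): 1.0, (2021, 2): 2.5, (2021, 3): 4.5, (2021, 4): 7.0,
    (2022, 1): 2.0, (2022, 2): 4.5,
}


def single_quarter(cum):
    """Turn year-to-date values into per-quarter values."""
    out = {}
    for (year, q), value in cum.items():
        previous = cum[(year, q - 1)] if q > 1 else 0.0
        out[(year, q)] = value - previous
    return out


def trailing_twelve_months(single, year, q):
    """Sum the four single-quarter values ending at (year, q)."""
    total = 0.0
    for _ in range(4):
        total += single[(year, q)]
        year, q = (year, q - 1) if q > 1 else (year - 1, 4)
    return total


single = single_quarter(cumulative)
print(single[(2021, 4)])                        # 2.5
print(single[(2022, 2)])                        # 2.5
print(trailing_twelve_months(single, 2022, 2))  # 9.0
```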

Example: Retrieve Historical EPS Data for Selected Stocks

import pandas as pd
import TejToolAPI

ticker = ["2330", "1101", "3711"]  # Define Stock Codes
columns = ["eps"]  # Specify Data Fields

data = TejToolAPI.get_history_data(
    ticker = ticker,
    columns = columns,
    transfer_to_chinese = True,  # Convert column names to Chinese
    start = pd.Timestamp("2021-07-02"),  # Set Start Date
    end = pd.Timestamp("2022-07-02"),  # Set End Date
    fin_type = ["A", "F", "TTM"],  # Retrieve multiple financial data types
    include_self_acc = "Y"  # Include self-reported data
)
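
The tip about combining get_history_data with get_universe can be sketched as a small pipeline. To keep the example runnable without an API key, the two TEJ functions are passed in as arguments and replaced here by stubs. The fetch_pool_history name, both stubs, and the shape of the stub's output are assumptions made for illustration; in real use you would pass get_universe and TejToolAPI.get_history_data instead.

```python
def fetch_pool_history(universe_fn, history_fn, start, end, columns, **filters):
    """Screen a stock pool, then request history for exactly those tickers."""
    tickers = universe_fn(start, end, **filters)
    if not tickers:
        raise ValueError("The screening criteria matched no securities")
    return history_fn(ticker=tickers, columns=columns, start=start, end=end)


# --- Stubs standing in for the real TEJ calls (illustrative only) ---
def stub_universe(start, end, **filters):
    # A real call would query TEJ; this returns a fixed placeholder pool.
    return ["2330", "1101"]


def stub_history(ticker, columns, start, end):
    # A real call would return TEJ data; this returns empty placeholder rows.
    return [{"ticker": t, **{c: None for c in columns}} for t in ticker]


rows = fetch_pool_history(
    stub_universe, stub_history,
    start="2024-01-01", end="2025-01-01",
    columns=["eps"], idx_id="IX0002",
)
print([r["ticker"] for r in rows])  # ['2330', '1101']
```

Passing the functions in as arguments also makes the pipeline easy to test offline before spending API quota on real queries.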

Conclusion

By combining get_universe for stock screening and TejToolAPI.get_history_data for retrieving financial metrics, we can automate the stock selection and data collection process. This enables efficient quantitative analysis and enhances investment decision-making.


Unlocking the Power of Data with TejToolAPI

With TejToolAPI, we can effortlessly and efficiently extract rich datasets from the TEJ database, including:

 ✅ Financial indicators

 ✅ Growth rates

 ✅ Various multi-dimensional data

These datasets deepen our understanding of each stock in the stock pool and serve as a critical foundation for investment analysis and strategy design.

Important Reminder: This analysis is for reference only and does not constitute any product or investment advice.

We welcome readers interested in various trading strategies to consider purchasing relevant solutions from Quantitative Finance Solution. With our high-quality databases, you can construct a trading strategy that suits your needs.

“Taiwan stock market data, TEJ collect it all.”

The Taiwan stock market has characteristics that differ from those of European and American markets. In the first quarter of 2024 in particular, the Taiwan Stock Exchange index reached a new high of 20,000 points on the rise in TSMC’s stock price, and global institutional investors have been paying closer attention to the Taiwan market’s performance.

Taiwan Economic Journal (TEJ), a financial database provider established in Taiwan more than 30 years ago, serves local financial and academic institutions and maintains long-term partnerships with internationally renowned data providers, offering high-quality financial data for five financial markets in Asia.

  • Complete Coverage: Includes all listed companies on stock markets in Taiwan, China, Hong Kong, Japan, Korea, etc. 
  • Comprehensive Analysis of Enterprises: Operational aspects, financial aspects, securities market performance, ESG sustainability, etc. 
  • High-Quality Database: TEJ data is cleaned, checked, enhanced, and integrated to ensure it meets the information needs of financial and market analysis. 

With TEJ’s assistance, you can access information on major Asian stock markets, including securities market data, financial data, enterprise operations, boards of directors, and sustainability data, giving investors timely, high-quality content. TEJ also offers advisory services to help solve theoretical and practical problems in financial management.

Why Use TejToolAPI?

🔹 Flexible parameter settings – Customize data queries to suit different investment needs.

 🔹 Comprehensive field selection – Extract precise financial and market data for analysis.

 🔹 Seamless integration – Quickly integrate retrieved data into the quantitative analysis process.


Next Steps

Now that you have a basic understanding of retrieving the data you need using TejToolAPI, you can start incorporating these datasets into your investment workflow.

For more advanced insights, check out the links below, where we provide deeper explanations and case studies on using TejToolAPI to refine your investment strategies.


We will continue to introduce how to use the TEJ database to construct various indicators and backtest their performance. Readers interested in trading backtests are welcome to explore TQuant Lab’s related plans and build trading strategies that suit them using high-quality databases.

【TQuant: From 0 to 1 – Day 3】(Link to be added)
