Analysis of the chip concentration

Photo by Maxim Hopman on Unsplash




Use the top 15 trading net buy/sell, total trading volume, number of participating brokers, and brokers ranking to construct a chip concentration indicator.


This article constructs chip concentration indicators through chip data and presents them in a data visualization way, requiring readers to know about data visualization and chip data.


Taiwan’s stock market is a shallow dish market. A shallow dish means it has no depth and is susceptible to external shocks. In contrast, the stock market is easily disturbed by external forces. It has significant ups and downs, such as being influenced by news or large purchases of a single stock by large investors, which can significantly affect stock price performance. This article uses several chip data to explore the impact on stock prices when chips are concentrated in stock and presents them in an interactive chart.

The Editing Environment and Module Required

This article uses Mac OS as a system and jupyter as an editor.

import pandas as pd
import numpy as np
import tejapi
import matplotlib.pyplot as plt
import matplotlib.transforms as transforms
from matplotlib.font_manager import FontProperties

plt.rcParams['font.sans-serif'] = ['Arial Unicode MS'] # 解決MAC電腦 plot中文問題
plt.rcParams['axes.unicode_minus'] = False

tejapi.ApiConfig.api_key ="Your Key"
tejapi.ApiConfig.ignoretz = True

import as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots


Data Import

The data period was taken from 2017 to 2022, and TSMC was used as an example to obtain information on the number of shares bought, shares sold, volume, and closing price.

df = tejapi.get('TWN/AMTOP1',  #從TEJ api撈取所需要的資料
                chinese_column_name = True,
                paginate = True,
                mdate = {'gt':'2017-01-01'},
                opts={'columns':['coid','mdate', 'key3','buy', 'sell', 'total']})
df1 = tejapi.get('TWN/APRCD1',  #從TEJ api撈取所需要的資料
                chinese_column_name = True,
                paginate = True,
                mdate = {'gt':'2017-01-01'},
                opts={'columns':['coid','mdate', 'volume', 'close_adj']})

Delete the NA value of the data, sort the number of shares bought from the largest to the smallest, and take out the top 15 largest daily purchases.

df.sort_values(['日期','買進股數'], ascending=[1,0], inplace=True)#根據買進股數由大到小排序
df_buy = df.groupby(['日期']).head(15)#取出每個日期的前15筆

Create a new Dataframe and use the date as the index to sum up, the daily buy shares and put them into a new table.

result = pd.DataFrame(index=df_buy['日期'].unique())#創一個新的表,日期當index
result['買進股數'] = df_buy.groupby(['日期'])['買進股數'].sum()# 把每個日期的前15筆買進股數加總

The part of the shares sold is processed similarly to the number of shares bought.

df.sort_values(['日期','賣出股數'], ascending=[1,0], inplace=True)#根據賣出股數由大到小排序
df_sell  = df.groupby(['日期']).head(15)#取出每個日期的前15筆
result['賣出股數'] = df_sell.groupby(['日期'])['賣出股數'].sum()

The number of shares bought and sold is summed according to the 60-day and 120-day rolling window.

result_60 = result.rolling(60).sum() #根據n日做買進/賣出rolling 總和
result_60.rename(columns = {'買進股數':'買進股數_60', '賣出股數':"賣出股數_60"}, inplace=True)
result_120 = result.rolling(120).sum() #根據n日做買進/賣出rolling 總和
result_120.rename(columns = {'買進股數':'買進股數_120', '賣出股數':"賣出股數_120"}, inplace=True)

The volume of the 60-day and 120-day rolling window is also summed and combined with the table of the number of shares.

df1['成交量(千股)_60'] = df1['成交量(千股)'].rolling(60).sum()
df1['成交量(千股)_120'] = df1['成交量(千股)'].rolling(120).sum()
result2 = pd.merge(result_60, df1[['成交量(千股)_60', '成交量(千股)_120','收盤價(元)']], left_index=True, right_index=True) #把成交量跟買進/賣出合起來
result3 = pd.merge(result2, result_120, left_index=True, right_index=True) #把成交量跟買進/賣出合起來

Then, according to the formula of chip concentration: ((Buyer’s top 15 total overbought volume in the last n days — Seller’s top 15 total overbought volume in the last n days)/total volume in the last n days) *100, to calculate the chip concentration.

result3['60日籌碼集中度'] = ((result3['買進股數_60'] - result3['賣出股數_60'])/result2['成交量(千股)_60'])*100 
result3['120日籌碼集中度'] = ((result3['買進股數_120'] - result3['賣出股數_120'])/result2['成交量(千股)_120'])*100 
# ((買方前15名近n日總買超量 - 賣方前15名近n日總賣超量) / 近n日總成交量)*100

The results are made into an interactive chart via the plotly package, making it easy to observe trends at specific times. As you can see, TSMC’s 60-day chip concentration hit a new low in 2017 during the epidemic panic, showing massive chip dispersion and a drop in the stock price to the lowest point of the range.


Data Import

The data period is from mid-2021 to 2022, and TSMC is used as an example to obtain the broker’s name, the broker code, the number of shares bought, and the number of shares sold.

df2 = tejapi.get('TWN/ABSR', #從TEJ api撈取所需要的資料
                  chinese_column_name = True,
                  paginate = True,
                  mdate = {'gt':'2021-06-30', 'lt':'2022-11-22'},

Sort the number of shares bought and sold from the largest to the smallest, and use the rank function to rank the different brokers for each date.

df2_buy = df2.sort_values(['年月日','買進股數(股)'], ascending=[1,0]).reset_index(drop=True)
df2_sell = df2.sort_values(['年月日','賣出股數(股)'], ascending=[1,0]).reset_index(drop=True)
df2_buy['排名'] = df2_buy.groupby('年月日')['買進股數(股)'].apply(lambda x: x.rank(method='dense' ,ascending=False))
df2_sell['排名'] = df2_sell.groupby('年月日')['賣出股數(股)'].apply(lambda x: x.rank(method='dense' ,ascending=False))

Calculate the number of buy orders, sell orders, ranking* orders, and the total number of participating brokers per day for the subsequent indicator calculation.

df2_buy['買進張數'] = df2_buy['買進股數(股)']/1000
df2_buy['名次_張數'] = df2_buy['排名']*df2_buy['買進張數']
attend = df2_buy.groupby('年月日')['買進股數(股)'].count() = '參與交易券商數'
df2_buy = df2_buy.merge(attend, on='年月日')
df2_buy.set_index('年月日', inplace=True)

df2_sell['賣出張數'] = df2_sell['賣出股數(股)']/1000
df2_sell['名次_張數'] = df2_sell['排名']*df2_sell['賣出張數']
attend1 = df2_sell.groupby('年月日')['賣出股數(股)'].count() = '參與交易券商數'
df2_sell = df2_sell.merge(attend, on='年月日')
df2_sell.set_index('年月日', inplace=True)

Buy concentration formula:
(((SUM (number of buy sheets) *number of participating brokers)/2) -(SUM (number of buy sheets)/2) +(SUM (number of buy sheets) *ranking))/ (SUM (number of buy sheets) *ranking)/2

Sell concentration formula:
(((SUM (number of sell sheets) *number of participating brokers)/2) -(SUM (number of sell sheets)/2) +(SUM (number of sell sheets) *ranking))/ (SUM (number of sell sheets) *ranking)/2

Difference = Buy concentration — Sell concentration.

Because the formula is more complicated, I divided it into three parts to calculate, and finally calculated the buy concentration, sell concentration, and the difference.

result = pd.DataFrame(index=(df2_buy.index).unique())
result['a1'] = ((df2_buy.groupby('年月日')['買進張數'].sum())*(df2_buy.groupby('年月日')['參與交易券商數'].head(1)))/2
result['b1'] = (df2_buy.groupby('年月日')['買進張數'].sum())/2
result['c1'] = df2_buy.groupby('年月日')['名次_張數'].sum()
result['a2'] = ((df2_sell.groupby('年月日')['賣出張數'].sum())*(df2_buy.groupby('年月日')['參與交易券商數'].head(1)))/2
result['b2'] = (df2_sell.groupby('年月日')['賣出張數'].sum())/2
result['c2'] = df2_sell.groupby('年月日')['名次_張數'].sum()

result['買進集中度'] = (result['a1'] - (result['b1']+result['c1']))/result['a1']
result['賣出集中度'] = (result['a2'] - (result['b2']+result['c2']))/result['a2']
result['差值'] = result['買進集中度'] - result['賣出集中度']

Integrate the buy concentration, sell concentration, difference, and TSMC’s closing price into the same table to facilitate subsequent charting.

final = result[['買進集中度','賣出集中度','差值']].merge(result3.loc['2021-07-01':]['收盤價(元)'], left_index=True,right_index=True)

First, hide the difference to better observe this chart. You can find that no matter the buy concentration or sell concentration, fluctuations are very large. It is difficult to observe whether the trend is consistent with the stock price and then observe the relationship between the difference and the stock price.


It is clear from the difference and TSMC’s closing price that when the difference is more prominent, TSMC’s closing price performs better on that day, and conversely, when the difference shows a downward trend, TSMC’s closing price also shows a downward trend. It can be seen that the concentration of chips, even for companies with large market capitalization and turnover like TSMC, still has a specific effect.



We can see from the results that both the first 60/120-day chip concentration or the second buy/sell indicators and differences show the impact of chip concentration on the stock price. The stock price tends to fall back when the concentration is lower, or the spread is smaller. It should be noted that if the reader wants to use it for program trading or academic research, further smoothing may need to be considered; otherwise, the data with such high volatility may easily have over-generated signals or problems with stationary.

We will also introduce the use of the TEJ database to construct various indicators and backtest the performance of the indicators; welcome to purchase the plans offered in TEJ E Shop and use the well-complete database to find the potential event.

Source Code

Extended Reading

Related Link