{"id":16377,"date":"2021-10-05T02:34:07","date_gmt":"2021-10-04T18:34:07","guid":{"rendered":"https:\/\/www.tejwin.com\/?post_type=insight&#038;p=16377"},"modified":"2024-07-11T08:53:01","modified_gmt":"2024-07-11T00:53:01","slug":"xgboost-algorithm-predicts-returns-part-1","status":"publish","type":"insight","link":"https:\/\/www.tejwin.com\/en\/insight\/xgboost-algorithm-predicts-returns-part-1\/","title":{"rendered":"XGBoost Algorithm Predicts Returns (Part 1)"},"content":{"rendered":"\n<p>Use an algorithm to learn investment factors and predict returns.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter caption-align-center\"><img decoding=\"async\" src=\"https:\/\/www.tejwin.com\/wp-content\/uploads\/1_1eitpp_Cg1UOHp6hnAUp6LQ.jpg\" alt=\"\"\/><figcaption class=\"wp-element-caption\">Photo Creds:&nbsp;<a href=\"https:\/\/unsplash.com\/photos\/NDfqqq_7QWM\" rel=\"noreferrer noopener\" target=\"_blank\">Unsplash<\/a><\/figcaption><\/figure>\n\n\n\n<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_81 counter-hierarchy ez-toc-counter ez-toc-grey ez-toc-container-direction\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<label for=\"ez-toc-cssicon-toggle-item-69f5f54dec3b5\" class=\"ez-toc-cssicon-toggle-label\"><span class=\"ez-toc-cssicon\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 
.5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/label><input type=\"checkbox\"  id=\"ez-toc-cssicon-toggle-item-69f5f54dec3b5\"  aria-label=\"Toggle\" \/><nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/www.tejwin.com\/en\/insight\/xgboost-algorithm-predicts-returns-part-1\/#Highlights\" >Highlights<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/www.tejwin.com\/en\/insight\/xgboost-algorithm-predicts-returns-part-1\/#Preface\" >Preface<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/www.tejwin.com\/en\/insight\/xgboost-algorithm-predicts-returns-part-1\/#XGBoost_Introduction\" >XGBoost Introduction<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/www.tejwin.com\/en\/insight\/xgboost-algorithm-predicts-returns-part-1\/#The_Editing_Environment_and_Modules_Required\" >The Editing Environment and Modules Required<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/www.tejwin.com\/en\/insight\/xgboost-algorithm-predicts-returns-part-1\/#Virtual_Environment\" >Virtual Environment<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-6\" href=\"https:\/\/www.tejwin.com\/en\/insight\/xgboost-algorithm-predicts-returns-part-1\/#Install_XGBoost\" >Install XGBoost<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/www.tejwin.com\/en\/insight\/xgboost-algorithm-predicts-returns-part-1\/#Install_XGBoost_visualization_module_graphviz\" >Install XGBoost 
visualization module graphviz<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-8\" href=\"https:\/\/www.tejwin.com\/en\/insight\/xgboost-algorithm-predicts-returns-part-1\/#Install_jupyter_notebook\" >Install jupyter notebook<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-9\" href=\"https:\/\/www.tejwin.com\/en\/insight\/xgboost-algorithm-predicts-returns-part-1\/#Final_Result\" >Final Result<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-10\" href=\"https:\/\/www.tejwin.com\/en\/insight\/xgboost-algorithm-predicts-returns-part-1\/#Database\" >Database<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-11\" href=\"https:\/\/www.tejwin.com\/en\/insight\/xgboost-algorithm-predicts-returns-part-1\/#Conclusion\" >Conclusion<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-12\" href=\"https:\/\/www.tejwin.com\/en\/insight\/xgboost-algorithm-predicts-returns-part-1\/#Extended_Reading\" >Extended Reading<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-13\" href=\"https:\/\/www.tejwin.com\/en\/insight\/xgboost-algorithm-predicts-returns-part-1\/#Related_Link\" >Related Link<\/a><\/li><\/ul><\/nav><\/div>\n<h2 class=\"wp-block-heading\" id=\"9657\"><span class=\"ez-toc-section\" id=\"Highlights\"><\/span><strong>Highlights<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Difficulty\uff1a\u2605\u2605\u2605\u2606\u2606<\/li>\n\n\n\n<li>Setting Virtual Environment<\/li>\n\n\n\n<li>XGBoost Introduction and Installation<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"faad\"><span class=\"ez-toc-section\" id=\"Preface\"><\/span><strong>Preface<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p id=\"1208\">Recently, a 
lot of algorithms have emerged, and various mathematical models have been developed to solve problems. The classic model is \u201cregression\u201d. With the advancement of technology, algorithms have now been developed that can improve and learn by themselves (Machine Learning). Nowadays, this has developed into the most popular type of model, the neural network (Deep Learning).<\/p>\n\n\n\n<p id=\"4b3f\">This article introduces the tree model XGBoost and is divided into two parts. The first part covers setting up the environment and installing the modules. The second part covers data preprocessing, training, prediction, and visualization.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"14d4\"><span class=\"ez-toc-section\" id=\"XGBoost_Introduction\"><\/span><strong>XGBoost Introduction<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p id=\"88ff\">First, let\u2019s introduce the popular algorithm XGBoost. Boosting is a technique that aggregates many weak learners into a more powerful learner, which gives higher accuracy for the final prediction.<\/p>\n\n\n\n<p id=\"51c3\">XGBoost (Extreme Gradient Boosting) is an implementation of gradient boosted decision trees (GBDT). Each step of learning is based on the previous errors: the existing model is retained, and a new function (tree) is added to correct the remaining error, so the final model is a collection of multiple weak learners. 
It is mainly applied to supervised learning and can handle both classification and regression problems.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"032d\"><span class=\"ez-toc-section\" id=\"The_Editing_Environment_and_Modules_Required\"><\/span><strong>The Editing Environment and Modules Required<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p id=\"7570\">macOS and Jupyter Notebook<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"01f4\"><span class=\"ez-toc-section\" id=\"Virtual_Environment\"><\/span><strong>Virtual Environment<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p id=\"0048\">XGBoost depends on many modules, and inconsistent versions among them can cause endless errors. Therefore, we create a new environment in which to install these modules. There are many ways to install them; this tutorial uses a relatively simple and easy-to-understand approach that minimizes errors.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"d3e6\"><strong>Step 1.&nbsp;Install Anaconda<\/strong><\/h4>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"https:\/\/www.tejwin.com\/wp-content\/uploads\/1kp4-bkLFk2LHhGvJpKQug.png\" alt=\"\"\/><\/figure>\n\n\n\n<p id=\"d6fa\">Anaconda is an all-in-one distribution that is convenient for beginners. It addresses the installation difficulties caused by differences between systems. It bundles more than 1000 installable packages for Windows, Linux and macOS. 
It also includes a virtual environment manager, which makes installing and running a machine learning environment simple and fast.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"6237\"><strong>Step 2.&nbsp;Open the terminal<\/strong><\/h4>\n\n\n\n<p id=\"f1f7\">On Windows, use Anaconda Prompt.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter caption-align-center\"><img decoding=\"async\" src=\"https:\/\/www.tejwin.com\/wp-content\/uploads\/12TwBrhOD97JDsGUe2lWgqQ.png\" alt=\"\"\/><figcaption class=\"wp-element-caption\">Windows<\/figcaption><\/figure>\n\n\n\n<p id=\"e240\">Enter the following command:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">conda create -n new_env_name python==3.8<\/pre>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"https:\/\/www.tejwin.com\/wp-content\/uploads\/13A1mqW_k_92V5gjUpCtjLw.png\" alt=\"\"\/><\/figure>\n\n\n\n<p id=\"e269\">Conda will ask whether you want to proceed with the installation. Type&nbsp;<code>y<\/code>&nbsp;and press&nbsp;<code>Enter<\/code>. In our example, the new environment is named&nbsp;<code>test<\/code>; replace&nbsp;<code>new_env_name<\/code>&nbsp;with any name you like.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">conda env list<\/pre>\n\n\n\n<p id=\"9880\">This command lists all of the environments we have created.<\/p>\n\n\n\n<p id=\"b1fc\"><strong>Step 3.&nbsp;<\/strong>Activate the environment<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">conda activate new_env_name<\/pre>\n\n\n\n<p id=\"3b76\">The (base) prefix at the start of the terminal prompt will now change to the environment name, (test) in our example. This means the environment was activated successfully. A later installation step may fail and require a fresh start. 
In that case, simply remove the environment with the command below.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">conda env remove -n new_env_name<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"a538\"><span class=\"ez-toc-section\" id=\"Install_XGBoost\"><\/span><strong>Install XGBoost<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"c7ae\"><strong>Step 1.&nbsp;Activate the environment<\/strong><\/h4>\n\n\n\n<pre class=\"wp-block-preformatted\">conda activate new_env_name<\/pre>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"4bdf\"><strong>Step 2.&nbsp;Enter the command<\/strong><\/h4>\n\n\n\n<pre class=\"wp-block-preformatted\">conda install py-xgboost<\/pre>\n\n\n\n<p id=\"1ddf\">Conda will again ask whether you want to install these modules. Type&nbsp;<code>y<\/code>&nbsp;and press&nbsp;<code>Enter<\/code>&nbsp;to start the installation; once it finishes, XGBoost is installed. Simple, right?<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"https:\/\/www.tejwin.com\/wp-content\/uploads\/1yZd7GANPZhTQBNsOxcRAsg.png\" alt=\"\"\/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"54fe\"><span class=\"ez-toc-section\" id=\"Install_XGBoost_visualization_module_graphviz\"><\/span><strong>Install XGBoost visualization module graphviz<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"4769\"><strong>Step 1.&nbsp;Install Homebrew (under our new environment)<\/strong><\/h4>\n\n\n\n<p id=\"4079\"><a href=\"https:\/\/brew.sh\/\" target=\"_blank\" aria-label=\" (opens in a new tab)\" rel=\"noreferrer noopener\" class=\"ek-link\">Homebrew<\/a>\u00a0is a package manager; think of it as the system-level counterpart of using\u00a0<code>pip<\/code>\u00a0to install Python modules. 
On macOS, Homebrew is the most widely used package management tool.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">\/bin\/bash -c \"$(curl -fsSL https:\/\/raw.githubusercontent.com\/Homebrew\/install\/HEAD\/install.sh)\"<\/pre>\n\n\n\n<p id=\"4930\">Enter the command above in the terminal to install Homebrew.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"6de8\"><strong>Step 2.&nbsp;Install graphviz<\/strong><\/h4>\n\n\n\n<pre class=\"wp-block-preformatted\">brew install graphviz<\/pre>\n\n\n\n<p id=\"0f32\">Those are the main modules used in this article. However, installing XGBoost in the new environment does not bring in every module we need, so we install the rest separately (pandas, matplotlib, tejapi). Separate the package names with spaces.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">pip install pandas matplotlib tejapi<\/pre>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"cc41\"><span class=\"ez-toc-section\" id=\"Install_jupyter_notebook\"><\/span><strong>Install jupyter notebook<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"3404\"><strong>Step 1.&nbsp;Open Anaconda and select the environment we just created<\/strong><\/h4>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"https:\/\/www.tejwin.com\/wp-content\/uploads\/1_1jgJbRM-r7zW4ZatkRgsQlA.png\" alt=\"\"\/><\/figure>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"3d67\"><strong>Step 2.&nbsp;Under Jupyter Notebook, click Install<\/strong><\/h4>\n\n\n\n<figure class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"https:\/\/www.tejwin.com\/wp-content\/uploads\/1_1clr9rbb4UG5cjoSTgSeUkw.png\" alt=\"\"\/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"51a0\"><span class=\"ez-toc-section\" id=\"Final_Result\"><\/span><strong>Final Result<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p id=\"8f83\">Finally, check in Jupyter whether the installation was successful!<\/p>\n\n\n\n<figure 
class=\"wp-block-image aligncenter\"><img decoding=\"async\" src=\"https:\/\/www.tejwin.com\/wp-content\/uploads\/1_3YBgENgDJbt_eCQL4qlug.png\" alt=\"\"\/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"a8eb\"><span class=\"ez-toc-section\" id=\"Database\"><\/span><strong>Database<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p id=\"d5ac\">We use&nbsp;<a href=\"https:\/\/api.tej.com.tw\/columndoc.html?subId=119\" rel=\"noreferrer noopener\" target=\"_blank\"><strong>TWN\/AFF_RAW<\/strong><\/a>&nbsp;in this article. It provides trading factors for the algorithm to learn from. The factor definitions follow Kenneth R. French and the top three finance journals (JF, RFS, JFE). The indicators are calculated from Taiwan market data, and all of them are organized at a monthly frequency.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">import tejapi<br><br># Monthly factor data for company 9921, 2015 to 2020<br>df = tejapi.get('TWN\/AFF_RAW',<br>                coid = '9921',<br>                mdate={'gte': '2015-01-01', 'lte':'2020-12-31'},<br>                chinese_column_name = True,<br>                paginate = True)<\/pre>\n\n\n\n<figure class=\"wp-block-image aligncenter caption-align-center\"><img decoding=\"async\" src=\"https:\/\/www.tejwin.com\/wp-content\/uploads\/1_1y87GS2ffbPLkR-F4ztgk4w.png\" alt=\"\"\/><figcaption class=\"wp-element-caption\">Database columns<\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"e70d\"><span class=\"ez-toc-section\" id=\"Conclusion\"><\/span><strong>Conclusion<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<p id=\"8298\">Part 1 of this article covers module installation. Most people run into installation problems when they first start programming, and setting up the environment is a programmer\u2019s first lesson. Once everything is installed, Part 2 will start using the database. 
We will process the data, feed it to the model, and predict returns as a reference for investment decisions.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"0db9\"><span class=\"ez-toc-section\" id=\"Extended_Reading\"><\/span><strong>Extended Reading<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/www.tejwin.com\/en\/insight\/martingale-strategy\/\" class=\"ek-link\">Martingale Strategy<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/www.tejwin.com\/en\/insight\/efficient-frontier\/\" class=\"ek-link\">Efficient Frontier<\/a><\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"5358\"><span class=\"ez-toc-section\" id=\"Related_Link\"><\/span><strong>Related Link<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/api.tej.com.tw\/index.html\" rel=\"noreferrer noopener\" target=\"_blank\">TEJ API<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/eshop.tej.com.tw\/E-Shop\/Edata_intro\" rel=\"noreferrer noopener\" target=\"_blank\">TEJ E-Shop<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Use an algorithm to learn investment factors and predict returns. Highlights Preface Recently, a lot of algorithms have emerged, and various mathematical models have been developed to solve problems. The classic model is \u201cregression\u201d. With the advancement of technology, algorithms have now been developed that can improve and learn by themselves (Machine Learning). 
Nowaday has developed [&hellip;]<\/p>\n","protected":false},"featured_media":16378,"template":"","tags":[2583,2371,3007,2646],"insight-category":[690,50],"class_list":["post-16377","insight","type-insight","status-publish","has-post-thumbnail","hentry","tag-finance","tag-python","tag-tejapi-data-analysis","tag-xgboost","insight-category-data-analysis","insight-category-fintech"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.tejwin.com\/en\/wp-json\/wp\/v2\/insight\/16377","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.tejwin.com\/en\/wp-json\/wp\/v2\/insight"}],"about":[{"href":"https:\/\/www.tejwin.com\/en\/wp-json\/wp\/v2\/types\/insight"}],"version-history":[{"count":2,"href":"https:\/\/www.tejwin.com\/en\/wp-json\/wp\/v2\/insight\/16377\/revisions"}],"predecessor-version":[{"id":23301,"href":"https:\/\/www.tejwin.com\/en\/wp-json\/wp\/v2\/insight\/16377\/revisions\/23301"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.tejwin.com\/en\/wp-json\/wp\/v2\/media\/16378"}],"wp:attachment":[{"href":"https:\/\/www.tejwin.com\/en\/wp-json\/wp\/v2\/media?parent=16377"}],"wp:term":[{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.tejwin.com\/en\/wp-json\/wp\/v2\/tags?post=16377"},{"taxonomy":"insight-category","embeddable":true,"href":"https:\/\/www.tejwin.com\/en\/wp-json\/wp\/v2\/insight-category?post=16377"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}